Commit Graph

3995 Commits

Author SHA1 Message Date
YundongYe 31b3604843
[kubespray] log level to 2 (#4111) 2020-01-09 15:51:13 +08:00
yiyione a8b675e233
[VS Code] Team share storage explorer (#4104)
* update

* update

* update

* change source code folders

* fix submit job in cluster explorer

* update nfs treeview

* add list blob

* add personal storage root item

* add azure blob upload delete create folder ...

* add download and open

* update open storage

* update

* add personal storage config

* refine storage tree view use page and load more item in azure blob

* add refresh after azure blob delete and upload

* update NFS tree view

* refine code and update NFS commands

* fix hdfs explorer

* add refresh to PAI cluster root

* update personal storage config

* update personal storage

* add remote file editor

* remove some todo comment

* fix tslint error

* remove unused data tree type

* update for samba

* update
2020-01-09 11:02:52 +08:00
Qinzheng Sun c9f10bc8f6
[Rest Server] fix k8s api request options (#4114)
* 443

* 520

* 651

* 656

* 803

* fix test

* timeout

* 858
2020-01-08 14:04:40 +08:00
Yifan Xiong 640bc35fbd
[Rest Server] Remove job name length limitation (#4106)
* Remove job name length limitation

Move jobname from framework labels to annotations.

Fixes #4101.

* Rename frameworkLabels

Rename frameworkLabels.
2020-01-08 10:49:36 +08:00
Yifan Xiong c93899c296
[Plugin] Remove tensorboard from submission plugin (#4110)
* Remove tensorboard from submission plugin

Tensorboard has been split out into a runtime plugin,
so remove it from the submission plugin on webportal.

* Show uploaded file name

Show uploaded file name.
2020-01-07 16:05:52 +08:00
Binyang2014 ba4e479fa0
[Dashboard] bug fix (#4113) 2020-01-07 14:26:02 +08:00
Binyang2014 4507cfc6ad
[k8s-dashboard] fix k8s-dashboard deploy issue (#4109)
* fix deploy issue

* fix review comments

* minor fix
2020-01-07 12:14:04 +08:00
Mingliang Tao 08a3c7223c
Hide warning icon if retry is 0 (#4012) 2020-01-06 15:19:06 +08:00
Mingliang Tao 774f0c2d2c
Add logger in job attempt api (#4085) 2020-01-06 13:21:16 +08:00
Yuqi Wang 8590afd521
[Hived]: Refine config quick start docs (#4102) 2020-01-03 17:41:50 +08:00
Mingliang Tao 864236db5d
Refine home page and sign-in page (#4084) 2020-01-03 13:33:14 +08:00
Yuqi Wang 513d0104cb
[Hived]: Add config quick start docs (#4099) 2020-01-03 12:48:29 +08:00
YundongYe d659ffd97d
[Cluster Object Model] Generate necessary service config based on cluster type (#4036)
* init commit

* [Cluster Object Model] Check launcher-type and log-type based on cluster-type. (#4038)

* Remove .k8s.yaml logic. And remove launcher-type & log-type

* [Cluster Object Model] Don't check hadoop rm uri when in pure k8s cluster. (#4052)

* [Rest-server] Add env, volume based on the cluster-type (#4058)

* Remove yarn config from pylon when cluster-type is k8s (#4063)

* [cluster object model] Document update. (#4066)

* From openpai_parser_type to service_type
2020-01-02 11:00:14 +08:00
Binyang2014 1b588a72b1
[Storage Manager] persist storage logs (#4088)
* persist storage log
2019-12-31 17:27:07 +08:00
Binyang2014 1201799509
[Storage-Manager] move apt-get to docker image (#4086)
* move apt-get to docker image
2019-12-30 19:23:12 +08:00
Binyang2014 b3f71ac6e9
[Webportal] Remove storage plugin command inject (#4016)
When using the storage plugin, commands are no longer injected into the user command. Instead, only the extras field in the protocol is changed.
Sample:
```yaml
extras:
  com.microsoft.pai.runtimeplugin:
    - plugin: teamwise_storage
      parameters:
        storageConfigNames:
          - PAI_SHARE  # storage config name provided by admin
```

#### Other changes
- Fix bug where a plugin could not be removed
2019-12-30 15:59:49 +08:00
Qinzheng Sun ee21409834
[Web Portal] fix job submission wizard's svg icons (#4080) 2019-12-30 12:23:58 +08:00
Binyang2014 e31ca14670
[Runtime] add failurePolicy to runtime plugin (#4048)
Refer to issue #4014: add failurePolicy for runtime plugins.
Users can prevent a runtime plugin failure from failing the entire runtime by setting failurePolicy: ignore

The default failurePolicy is fail
2019-12-29 20:06:23 +08:00
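As a rough illustration of the failurePolicy behavior described in #4048 above, the sketch below runs a list of plugins and swallows failures only for plugins marked `ignore`. It is not the runtime's actual code: `run_plugins`, the `runners` mapping, and the placement of `failurePolicy` next to `plugin` and `parameters` are assumptions made for this example. Only the two policy values and the default of `fail` come from the commit message.

```python
from typing import Callable, Dict, List


def run_plugins(
    plugins: List[Dict],
    runners: Dict[str, Callable[[Dict], None]],
) -> List[str]:
    """Run each plugin in order; return the names whose failure was ignored.

    `plugins` mirrors the list under com.microsoft.pai.runtimeplugin.
    `runners` is a hypothetical mapping from plugin name to a callable.
    """
    ignored = []
    for plugin in plugins:
        name = plugin["plugin"]
        # The commit message states that the default failurePolicy is "fail".
        policy = plugin.get("failurePolicy", "fail")
        try:
            runners[name](plugin.get("parameters", {}))
        except Exception:
            if policy == "ignore":
                # Record the failure and keep going with the next plugin.
                ignored.append(name)
                continue
            # Policy "fail": let the error abort the whole run.
            raise
    return ignored
```

With this shape, a plugin entry that sets `failurePolicy: ignore` can fail without stopping the plugins after it, while any other failing plugin still aborts the run.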
Yifan Xiong 207ca6cde8
[Rest Server] Update job name encoding method (#4069)
* Update job name encoding method

Update job name encoding method, use md5 hash instead.
Query job by k8s label selector.

Resolve job name related issues in #3935.

* Drop legacy jobs compatibility

Drop legacy jobs compatibility.
2019-12-28 02:14:13 +08:00
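The md5-based job name encoding mentioned in #4069 above could look roughly like the sketch below. The function name `encode_job_name` and the `user~job` input format are assumptions made for illustration; the commit message only says that an md5 hash is used and that jobs are queried by k8s label selector.

```python
import hashlib


def encode_job_name(user_name: str, job_name: str) -> str:
    """Return a fixed-length, k8s-safe identifier for a job.

    The "user~job" separator is a hypothetical choice for this sketch.
    """
    full_name = f"{user_name}~{job_name}"
    # An md5 hex digest is always 32 lowercase hex characters, so it fits
    # the 63-character limit Kubernetes places on label values no matter
    # how long the original job name is.
    return hashlib.md5(full_name.encode("utf-8")).hexdigest()
```

Because the hash is one-way, the original name cannot be recovered from the encoded one, so it has to be stored separately (for example as a label or annotation) if it needs to be shown to users.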
Qinzheng Sun 00fd00e29a
[Web Portal] Tweak home page's fonts (#4070)
* 652

* 245
2019-12-27 16:41:26 +08:00
Qinzheng Sun 97ca383c17
[Web Portal] fix navbar button's hover/pressed color (#4077) 2019-12-27 14:39:15 +08:00
Qinzheng Sun 8434590b18
[Rest Server] fix k8s api axios request (#4076) 2019-12-26 16:03:44 +08:00
Qinzheng Sun 5227c93f44
typo (#4067) 2019-12-26 09:54:55 +08:00
yiyione 817bd75ed7
Update version and change log (#4064) 2019-12-25 18:59:33 +08:00
Qinzheng Sun 1b68786f25
[Web Portal] fix dashboard menu (#4065) 2019-12-25 18:23:24 +08:00
Yifan Xiong 1d050e585f
[CI/CD] Add Jenkins stages for pure k8s version (#4059)
* Add Jenkins stages for pure k8s version

Add Jenkins stages for pure k8s version.

* Add protocol test case

Add protocol test case.
2019-12-25 17:59:10 +08:00
Qinzheng Sun caebbb30d4
[Web Portal] Refine key value list component (#3961)
* 1126

* 231

* 327
2019-12-25 15:54:19 +08:00
yiyione 11ca52c5ed
[VS Code] Fix submit job to https cluster (#4061)
* fix https

* update

* fix cert not safe error
2019-12-25 15:04:38 +08:00
Qinzheng Sun 0bea9d1261
[Rest Server] kubernetes pods/nodes list api (#4037)
* 420

* doc

* 431

* 434

* 458

* 833

* hived

* 551
2019-12-25 14:50:13 +08:00
Scarlett Li b0d2e09110
Update README.md 2019-12-25 12:18:19 +08:00
Scarlett Li fedc424a2a
Update README.md 2019-12-25 12:16:18 +08:00
Binyang2014 a80763ff3a
[k8s-dashboard] Enable https for k8s-dashboard (#4032)
* enable https

* fix ut

* remove dashboard roles

* bug fix
2019-12-25 10:32:30 +08:00
Binyang2014 44506d372a
[Grafana] remove grafana chart (#4055)
* remove grafana chart

* fix

* move dashboard to admin section
2019-12-25 10:23:14 +08:00
Qinzheng Sun af98e312f3
[Web Portal] Remove admin-lte layout dependencies (#4039)
* miscs

* layout

* responsive layout & sidebar items

* plugins

* fix

* 709

* notification button font size

* 235

* 253

* coin

* spelling
2019-12-23 18:56:10 +08:00
Zhiyuan He 56a6f6bbab
fix (#4049) 2019-12-23 17:18:15 +08:00
Yifan Xiong 927cda4886
[CI/CD] Update Jenkins (#4035)
Update Jenkins.
2019-12-23 16:50:52 +08:00
Binyang2014 4fc30d798b
[ErrorSpec] make pattern case insensitive (#4047) 2019-12-20 19:25:45 +08:00
Binyang2014 2aa4ec6da6
[runtime] fix image check (#4015) (#4017) 2019-12-20 18:28:38 +08:00
Yuqi Wang 44f27bb666
[Hived]: Per VC queuing to avoid cross VC starvation (#4041) 2019-12-20 12:22:14 +08:00
Binyang2014 317d0c4dfe
[Prometheus] fix prometheus high io issue (#4019) (#4033) 2019-12-20 09:56:19 +08:00
Yifan Xiong 01facedc05
[Rest Server] Expose preemption status in scheduler (#4007)
* Expose preemption status in scheduler

Expose preemption status in scheduler.

* Add error handling for hived webservice call

Add error handling for hived webservice call.

* Update hived scheduler service configuration

Update hived scheduler service configuration.

* Update

Update.

* Update

Update.
2019-12-19 19:42:37 +08:00
yiyione e64da1d439
update yaml job config schema (#4011) 2019-12-19 14:45:20 +08:00
Hanyu Zhao dbb55ae270
[HiveD] fix pod hanging after reconfiguration (#4003) 2019-12-19 09:47:28 +08:00
Yifan Xiong ae388103b0
[Docs] OpenAPI docs for RESTful API (#3996)
Add OpenAPI docs for RESTful API.
2019-12-17 15:43:00 +08:00
Mingliang Tao 439d4e6965
Fix clone loading misbehavior (#3923) 2019-12-16 11:10:38 +08:00
yiyione 51e72bb030
[VS Code] use bash in simulation (#3991)
* use /bin/bash in simulation

* update
2019-12-12 15:06:38 +08:00
Binyang2014 013eecd116
[runtime] Change to call k8s api to get storage config (#3944)
Change to use k8s api to finish runtime storage plugin
2019-12-10 16:38:43 +08:00
Yifan Xiong 141301d4ad
Fix typos (#3980)
* Fix typos

Fix typos by [misspell](https://github.com/client9/misspell).

* Fix typos in api

Fix typos in api.

* Add spelling check in GitHub Actions

Add spelling check using misspell in GitHub Actions.
2019-12-10 11:37:07 +08:00
Binyang2014 f45849f50e
Binyli/job ssh (#3986) (#3992)
Remove ssh plugin when gangAllocation is set to false.
Tested in the int bed with the following job config:
```yaml
extras:
  gangAllocation: false
  com.microsoft.pai.runtimeplugin:
    - plugin: ssh
      parameters:
        jobssh: true
```
2019-12-10 09:52:42 +08:00
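The rule from #3992 above (drop the ssh plugin when gangAllocation is false) can be sketched as a small filter over the job config. This is only an illustration: the helper name `drop_ssh_without_gang` is hypothetical, and treating a missing `gangAllocation` as enabled is an assumption of this sketch, not something the commit message states.

```python
import copy
from typing import Dict

PLUGIN_KEY = "com.microsoft.pai.runtimeplugin"


def drop_ssh_without_gang(job_config: Dict) -> Dict:
    """Return a copy of the job config without the ssh plugin
    when extras.gangAllocation is explicitly false."""
    config = copy.deepcopy(job_config)  # leave the caller's config untouched
    extras = config.get("extras", {})
    # Assumption: a missing gangAllocation counts as enabled.
    if extras.get("gangAllocation", True) is False and PLUGIN_KEY in extras:
        extras[PLUGIN_KEY] = [
            p for p in extras[PLUGIN_KEY] if p.get("plugin") != "ssh"
        ]
    return config
```

Applied to the job config shown in the commit message, this would leave an empty plugin list under `com.microsoft.pai.runtimeplugin`.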
Binyang2014 b0ff180888
[runtime] add docker image checker (#3974) (#3993)
* add docker image checker
2019-12-10 09:51:34 +08:00