Yuge Zhang
1bc5c348f4
Fix typo in history API error message ( #4401 )
2020-04-27 14:39:44 +08:00
YundongYe
972109f418
[Doc] Remove yarn content from deployment doc ( #4447 )
2020-04-26 10:22:48 +08:00
Zhiyuan He
72268d9328
fix job retry url ( #4442 )
* fix
* fix
2020-04-26 10:06:21 +08:00
YundongYe
14016f8be1
[cluster-object-model] Disable yarn value in cluster-type ( #4445 )
2020-04-24 17:07:59 +08:00
YundongYe
3ab8ae0479
Fix swagger issue. ( #4430 )
2020-04-24 15:24:52 +08:00
YundongYe
54453e4f45
[build] Disable yarn option in pai_build.py. ( #4443 )
* Disable Yarn build option and remove hadoop binary building source code
* Update doc
2020-04-24 13:08:01 +08:00
YundongYe
db46dc0e2e
[Jenkins] Remove Yarn Version's Test ( #4440 )
2020-04-24 10:37:13 +08:00
Yifan Xiong
aaf3ac80d4
Fix Azure File issues in storage ( #4438 )
Fix Azure File issues in storage.
2020-04-24 10:34:14 +08:00
Yifan Xiong
4e4124c1f0
[Docs] Update nullable fields in job api ( #4437 )
Update nullable fields in job api.
2020-04-23 16:23:51 +08:00
Yifan Xiong
9951624c73
[Rest Server] Add rate limit for RESTful API ( #4418 )
Add rate limit for RESTful API, defaults to:
* 600 requests per ip every 1 min
* 60 list job requests per user every 1 min
* 60 submit job requests per user every 1 hour
2020-04-22 13:07:13 +08:00
Binyang2014
ae23350d7b
[watchdog] change default mem limit ( #4413 )
2020-04-22 10:02:19 +08:00
Yuqing Yang
8b9632926d
validate rest server swagger ( #4404 )
* add swagger validation to travis
* fix validation error in swagger (#4405 )
2020-04-20 10:47:59 +08:00
Yifan Xiong
7b1247d087
[Rest Server] Fix log path length ( #4400 )
Fix log path length, set job name length limit to 255.
Closes #4391 .
2020-04-17 16:40:56 +08:00
Yuqing Yang
13bea1e062
refine rest swagger and codes ( #4355 )
* fix empty error
* hosted on https://swagger
* remove /api/v2/jobs/{}/ssh
* update GET token and POST basic/login
* [Docs] Update job and virtual cluster api docs (#4363 )
* Update job and virtual cluster api docs
Update job and virtual cluster api docs.
* Update security option
Update security option.
* Update operation id
Update operation id.
* bind authn api to v2 and update doc (#4368 )
* Add document of auth api.
* Fix description
* update swagger
* Fix api's method
* Fix api/v2/auth to api/v2/authn
* [Restserver] Refine group api (#4371 )
* Update Group API
* Fix issue
* Change update api from put to patch
* Add patchExtension into put api.
* issue fix.
* typo fix
* Fix travis
* fix typo
* Update token APIs and k8s proxy APIs (#4375 )
* update example for ForbiddenUserError
* add get /api/v2/info
* add v2 router
* fix
* update
Co-authored-by: yiyione <yiyi_one@foxmail.com>
Co-authored-by: Yi Yi <yiyi@microsoft.com>
* update group api doc (#4374 )
* update group api doc
* Set patch's default value to false
* Pack group entry into
* Pack group entry into
* Add job attempt schema in api doc (#4376 )
* Refine user api (#4379 )
* Update user api doc. (#4383 )
* Check parameter in joi (#4388 )
* update token to tokens (#4389 )
Co-authored-by: yiyione <yiyi_one@foxmail.com>
* fix typo
* update (#4398 )
Co-authored-by: yiyione <yiyi_one@foxmail.com>
* basic logout api (#4399 )
* Add length limit for job name
Add length limit for job name.
Co-authored-by: Yifan Xiong <yifan.xiong@microsoft.com>
Co-authored-by: YundongYe <8675883+ydye@users.noreply.github.com>
Co-authored-by: yiyione <307402594@qq.com>
Co-authored-by: yiyione <yiyi_one@foxmail.com>
Co-authored-by: Yi Yi <yiyi@microsoft.com>
Co-authored-by: Mingliang Tao <mintao@microsoft.com>
2020-04-17 15:42:12 +08:00
Mingliang Tao
e7db965ae2
Fix abnormal job list bug in home page ( #4397 )
2020-04-17 14:03:59 +08:00
Yifan Xiong
8063bf6509
[Rest Server] Update job and virtual cluster api ( #4381 )
* Add bearer auth for job and vc api
* Migrate sku types api in vc
* Add auth token in webportal
2020-04-17 13:08:06 +08:00
Yifan Xiong
5165b5cbf9
[Rest Server] Add minimal priority when priority class disabled ( #4396 )
* Add minimal priority when priority class disabled
Add minimal priority when priority class disabled.
* Skip yarn version
Skip yarn version.
2020-04-17 10:41:37 +08:00
Binyang2014
6d31509563
[deploy] update doc about removing weave-net ( #4390 )
2020-04-17 09:06:35 +08:00
Mingliang Tao
f403cdf8f3
Refine gpu chart ( #4386 )
2020-04-16 17:32:04 +08:00
Mingliang Tao
30d6eb73af
Change title from Utilization to Allocation ( #4382 )
2020-04-14 12:32:09 +08:00
Mingliang Tao
4a7ea4697a
Move chevron arrow to left ( #4380 )
2020-04-14 12:31:53 +08:00
Binyang2014
4e0121f8e3
Update runtime and errorSpec ( #4378 )
Bump runtime version to v0.1.1
Update errorSpec to handle ecc error
2020-04-10 15:37:58 +08:00
Yifan Xiong
0b8c35896d
Change gpuType(s) to skuType(s) ( #4362 )
Change gpuType(s) to skuType(s).
2020-04-09 17:12:20 +08:00
Zhiyuan He
763a3b498f
Disable `default` vc option in edit user dialog ( #4360 )
2020-04-08 13:08:14 +08:00
YundongYe
06f8c51969
[pylon] Update openssl-1.1.0f's url and fix GitPython version in dev-box ( #4367 )
* Update openssl-1.1.0f's url
* Fix GitPython to 2.x.x
2020-04-07 15:45:38 +08:00
Yuqi Wang
ed66e057a6
Update HiveD Image ( #4366 )
2020-04-07 11:07:04 +08:00
Binyang2014
4fb34fe5cc
[log-manager] Fix return too much log issue ( #4358 )
2020-04-03 13:25:37 +08:00
fan yang
9838419c6c
add a link to architecture in Readme ( #4353 )
2020-04-02 14:53:12 +08:00
Yifan Xiong
19b8ef3bca
[Rest Server] Support CPU cell in hived scheduler ( #4342 )
* Support CPU cell in hived scheduler
* Update resource pre-check
2020-04-02 14:40:07 +08:00
Yifan Xiong
4a2aceadbe
[Rest Server] Update virtual cluster metrics using scheduler api ( #4329 )
* Update virtual cluster metrics using scheduler api
* Update node gpu metrics using scheduler api
* Check virtual cluster quota using hived scheduler api during job submission
* Update virtual cluster metrics on webportal
2020-04-01 19:08:14 +08:00
Zhiyuan He
c0ef775d2a
Upgrade Twisted to 20.3.0 ( #4347 )
2020-04-01 11:22:36 +08:00
yiyione
e178ea6e99
[CI] Update python to 3.8 for lint document ( #4345 )
* Update python to 3.8 for lint document
* fix
Co-authored-by: Yi Yi <yiyi@microsoft.com>
2020-04-01 10:59:15 +08:00
Zhiyuan He
51a6e1fa16
Add a notice if user still wants to use yarn version ( #4340 )
* init
* fix
* fix
2020-03-31 10:10:40 +08:00
Binyang2014
a0b18ff94b
Bump node-exporter version to 0.18.1 ( #4341 )
2020-03-31 09:36:27 +08:00
Yuqi Wang
165142afed
Update HiveD Image and add exitspec for ContainerGpuDeviceNotFoundError ( #4343 )
2020-03-30 21:29:28 +08:00
Yifan Xiong
0631768ebb
Enable priority class by default ( #4334 )
Enable priority class by default.
2020-03-27 18:58:16 +08:00
Binyang2014
561149b8fe
[Rest Server] fix fetch job status failed ( #4333 )
2020-03-27 16:13:57 +08:00
Zhiyuan He
f8365cd1d3
Update default docker image list ( #4328 )
* fix
* fix
* lint
* fix
2020-03-26 17:02:49 +08:00
Mingliang Tao
f33c4c03b7
Remove out-dated marketplace ( #4324 )
2020-03-26 15:12:53 +08:00
yiyione
4742bd3fd1
[VS Code] Remove VS Code extension code ( #4317 )
* Remove vscode extension code
* fix broken links
* update broken links
* update
Co-authored-by: Yi Yi <yiyi@microsoft.com>
2020-03-26 14:50:02 +08:00
Yuqi Wang
abb68009d4
Update HiveD Image ( #4320 )
2020-03-24 12:22:21 +08:00
Binyang2014
dfc7b86390
[kube-runtime] remove kube-runtime ( #4311 )
* remove kube-runtime
* add runtime tag
* add doc
* fix doc
* more changes
* rename to openpai-runtime
* fix build issue
* change version
2020-03-23 20:48:37 +08:00
Yuqi Wang
3a91564935
[Hived]: Move Hived from OpenPAI to dedicated repo ( #4319 )
* [Hived]: Move Hived from OpenPAI to dedicated repo
Moved to https://github.com/microsoft/hivedscheduler
2020-03-23 20:17:53 +08:00
Zhiyuan He
1b9635b2f3
fix ( #4318 )
2020-03-23 19:46:21 +08:00
Zhiyuan He
62106ff7b5
init ( #4313 )
2020-03-23 12:45:58 +08:00
Yuqi Wang
e30100559a
[Hived]: Add Feature Demo Doc ( #4235 )
* [Hived]: Add Feature Demo Doc
* [Hived]: Add Feature Demo Doc
* [HiveD] refine doc (#4241 )
* refine readme.md
* refine readme.md
* minor
* resolve comments
* refine feature demo
* Refine Feature Demo Doc (#4309 )
* Refine
Co-authored-by: Hanyu Zhao <zhaohanyu@pku.edu.cn>
2020-03-20 20:23:34 +08:00
Yifan Xiong
cb851a5ec6
Migrate protocol spec ( #4307 )
* Migrate protocol spec
Migrate pai protocol spec to https://github.com/microsoft/openpai-protocol .
* Fix broken links
Fix broken links.
2020-03-20 17:05:42 +08:00
Yuqi Wang
c83c7fcc71
Merge pull request #4304 from microsoft/del-es
Remove Elasticsearch
2020-03-20 15:10:33 +08:00
Binyang2014
bdb37c77e2
[webportal] Change storage representation ( #4300 )
* Change storage representation
* fix comments
2020-03-20 14:11:34 +08:00
Zhiyuan He
29f5ce61f1
fix CI in rest-server after removal of elasticsearch ( #4305 )
2020-03-20 13:01:59 +08:00