Commit Graph

37 Commits

Author SHA1 Message Date
Yi Yi 529db900c3
Add release note for v1.8.1 (#5656)
* add release note for v1.8.1

* update the version number in docs

* update release month
2021-12-27 14:09:57 +08:00
Zhiyuan He 19e11d88e5
fix doc related to china deployment (#5593)
* fix

* fix
2021-08-10 16:02:12 +08:00
Yi Yi 68d29eff32
Add release note for v1.8.0 (#5556)
* Add release note for v1.8.0

* update

* update installation-guide

* update image tag in kubespray
2021-07-14 17:22:12 +08:00
Starmie@Choice Specs 34fc8f600d
Autoscaler in the main docs (#5523)
* Autoscaler in the main doc

* autoscaler in the catalogue

* reindex

Co-authored-by: Chengruidong Zhang (FA Talent) <v-chenzhang@microsoft.com>
2021-06-21 10:55:01 +08:00
Guoxin c6e44085d0
add username label to PAI job low gpu utilization alert (#5499) 2021-06-03 14:06:48 +08:00
Binyang2014 f53167fb2f
update remove weave cni doc (#5488) 2021-05-11 15:40:33 +08:00
Binyang2014 894defb281
[cherry-pick] add release note (#5438) (#5454)
Add release note

Co-authored-by: Mingliang Tao <mintao@microsoft.com>
2021-04-28 12:13:03 +08:00
Yi Yi 2b7e3bdcbb
[Document] Add docs for renew k8s API server cert (#5434)
* Add docs for renew k8s API server cert

* fix kubernetes typo

* skip the drain step in master node

* change !prodheads000001 to !master-node

* delete token step should be run from master node

* fix chown commands run by other user

* update docs
2021-04-26 15:06:11 +08:00
Starmie@Choice Specs 32898e6858
Update docs about config.yaml (#5421)
* warn if config.yaml not exists on the cluster
* remind the user to check config.yaml when add / remove nodes
2021-04-13 16:05:46 +08:00
Starmie@Choice Specs 3ca79e2d9f
Add / remove remove nodes using paictl.py (#5400) 2021-04-01 10:30:48 +08:00
Zhiyuan He f8fc15d50e
Add "Solve Unmounted Database Problem" doc (#5390) 2021-03-25 12:11:06 +08:00
Zhiyuan He fe18fd9fce
Create v1.6.0 release (#5355)
Co-authored-by: Yi Yi <yiyi@microsoft.com>
2021-03-18 16:48:05 +08:00
Zhiyuan He 3eb3ddbeb2
Update doc about nvidia-docker2 (#5365) 2021-03-11 09:44:18 +08:00
Xiang Long 724961c538
Add docker cache doc (#5349) 2021-03-09 13:19:59 +08:00
yiyione 2b13ed7125
[CI] Use nightly-build status in README.md (#5330)
* Add nightly build status to README.md

* update the build status in docs
2021-03-04 11:09:24 +08:00
Zhiyuan He 5b9c68dd63
Add zh_CN prerequisite doc (#5333) 2021-03-02 15:36:09 +08:00
Guoxin 3a34fa30be
sync auth zh doc (#5332) 2021-03-02 13:06:28 +08:00
Guoxin 23d41e37c9
add cluster-utilization report doc (#5331) 2021-03-02 10:45:10 +08:00
Guoxin 1d62a1fa2b
update ssh doc: use pre-saved ssh keys (#5329) 2021-03-01 18:09:33 +08:00
Guoxin 59146fa6fa
[alert-manager] move pai-bearer-token config field under alert-manager (#5324) 2021-02-28 20:17:25 +08:00
Starmie@Choice Specs 0a984bced0
Reduce ansible logs when deploy (#5305)
* -v: display more ansible logs

* Introduced -v option in the installation doc

* If not in verbose mode, clear the callback whitelist

* In verbose mode, callback_whitelist = profile_tasks

Co-authored-by: Chengruidong Zhang (FA Talent) <v-chenzhang@microsoft.com>
2021-02-24 16:52:01 +08:00
Starmie@Choice Specs ce378418ad
Docs modification (admin) (#5299)
* Readme.md and some grammarly errors due to proper nouns

* Some missing grammarly errors due to proper nouns

* Some missing grammarly errors
2021-02-19 10:23:20 +08:00
Starmie@Choice Specs be386f0a8c
fix grammar issues in cluster-user manual (#5287) 2021-02-05 17:45:31 +08:00
yiyione c26313e937
[CI] Remove travis and update GitHub Actions to replace it (#5276)
* Remove .travis file

* Add frameworklauncher test to git action

* Add run coveralls in GitHub action

* Fix coverall in GitHub Action

* Fix coverall in GitHub Action

* Move coverall outside rest-server test to avoid post coverage twice

* Fix coverall

* change build-status badge from travis to GitHub Action

* update name of swagger-validate
2021-02-01 14:37:54 +08:00
Zhiyuan He 67f5593772
fix installation compatibility of ubuntu 20.04 (#5255) 2021-01-18 17:11:51 +08:00
Zhiyuan He 3800494a48
zh_CN doc update (#5249)
* cn_doc_update

* fix

* fix
2021-01-14 10:13:36 +08:00
Guoxin 67f8a8c177
release v1.5.1 (#5192) 2020-12-24 15:34:21 +08:00
Guoxin 178868bec7
release v1.4.0 (#5133)
* release v1.4.0

* update branch name and image tag to v1.4 in installation doc
2020-12-09 12:59:11 +08:00
Zhiyuan He 3e55d307b2
Align zh_CN doc (#5097) 2020-11-19 14:06:07 +08:00
Binyang2014 0d1574d973
fix doc typo (#5090) 2020-11-17 11:25:25 +08:00
vvfreesoul 392c2d8ea5
HTTPS eng doc (#5078) 2020-11-16 11:43:50 +08:00
Guoxin 97641692c2
Alert Email Template Refine (#5064)
* add kill user job alert template
* add troubleshooting link in general email template
* allow customized templates
* redefine actions schema in customized-receivers to facilitate parameters passing
2020-11-12 20:21:10 +08:00
Guoxin 49c8b0f026
deployment process refine (#5077)
- move requirement check before configuration setting;
- move quick-start-config folder setting to quick-start-services.sh; 
- fix installation document branch checkout issue; change suggested version to v1.3.y;
- refine uninstallation doc
- fix git clone issue: clone to an existing folder may raise error, so the code pai/kubespray may be unsuccessfully cloned.
2020-11-12 13:15:35 +08:00
vvfreesoul 8cda19243a
HTTPS (#5076)
* fix "how to set up HTTPS access"

* lint error fix

* https test

* complete Chinese version of the https configuration

* complete Chinese version of the https configuration fix

* complete Chinese version of the https configuration fix
2020-11-12 10:36:26 +08:00
Guoxin 6cb7f8d242
Alert Severity (#5055)
* define alerts severity

* show severity in email

* introduce alert severity in doc and examples

* show severity in web-portal
2020-11-09 15:33:05 +08:00
Zhiyuan He 3a6ead7031
[Doc] minor issue (#5049) 2020-11-03 16:46:12 +08:00
shaiic-pai d434a3c096
Add zh_CN docs (#5037)
* add zh cn docs

* fix link
2020-11-03 10:33:22 +08:00