Commit graph

427 commits

Author SHA1 Message Date
microsoft-github-policy-service[bot] 23734a65f9 Microsoft mandatory file 2023-09-09 13:10:36 +00:00
Zheyu Shen ac297cc72d Merge pull request #33 from Azure/arsdragonfly/2019u2: Arsdragonfly/2019u2 2023-09-09 21:10:24 +08:00
Zheyu Shen 5699aebf98 fix 2023-09-09 21:09:21 +08:00
Zheyu Shen 6a8ba48bc3 add nuget publishing 2023-09-09 20:55:55 +08:00
Zheyu Shen a703aa8d7c add nuget 2023-09-09 18:04:28 +08:00
Zheyu Shen 07bf4f1a3f make vmextension zip 2023-09-09 15:24:14 +08:00
Zheyu Shen 2ac1afbd15 build hpcnodeagent.tar.gz 2023-09-08 21:21:13 +08:00
Zheyu Shen a683ad9f11 add previously missing files 2023-09-08 17:47:43 +08:00
Zheyu Shen b9470fcf76 improve building 2023-09-08 15:49:19 +08:00
Zheyu Shen f362653cbc initial vcpkg build 2023-09-07 17:36:50 +08:00
Zheyu Shen 6260c2c153 WIP 2023-09-07 15:54:02 +08:00
Zheyu Shen d0b01012b1 fix compilation on 22.04 2023-09-07 15:53:08 +08:00
Zihao Chen 03e3aeb523 Merge pull request #32 from AaronYll/v2: Cherry-pick commit 94e8108 and commit 9e3d9a5 from master 2023-09-07 15:42:16 +08:00
liayuan 2e924a88df Add Dockerfile that generates the image to build nodemanager. 2023-09-07 15:39:44 +08:00
liayuan 276755116a Change Readme.txt to README.md 2023-09-07 15:29:57 +08:00
Liangliang Yuan b8cbbaafb8 Support build from docker 2023-09-07 15:28:14 +08:00
Liangliang Yuan eadc90db9d fix race condition when get stale request to StartTask 2023-09-07 15:19:34 +08:00
Sunbin Zhu 70554a6a76 support both python2/3 2021-12-23 19:08:41 +08:00
chezhang 832d81dc64 Update version to 2.5.1.0 2021-08-24 12:18:19 +08:00
chezhang b45dfc8e97 change the content in default execution filters (cherry picked from commit 98c508be5f) 2021-08-24 12:16:43 +08:00
chezhang 25d03b41a1 Some changes about log 2021-06-30 18:07:34 +08:00
chezhang 632440943e Fix a bug that user/kernel time in statistics maybe abnormal due to reading value from empty stream 2021-06-30 17:23:31 +08:00
chezhang 15b7529103 Fix a bug that processes in statistics may be empty by retrieving tasks from cpuset subsystem instead of cpuacct subsystem in cgroup 2021-06-30 17:20:36 +08:00
chezhang 0731b03962 Fix a bug that heartbeat thread may be stuck due to deadlock 2020-07-27 16:48:13 +08:00
chezhang 6130db6eb7 Revise some log messages 2020-07-27 16:28:25 +08:00
chezhang 772b2f4012 Update version to 2.5.0.0 2020-07-17 21:51:02 +08:00
zclok010 cce1f68b9f Improve GPU instance name readability in metric info by adding GPU name 2019-08-30 14:18:33 +08:00
zclok010 9695b14042 Fix a issue that node with FQDN host name may not be recognized by scheduler 2019-08-20 11:44:21 +08:00
zclok010 3dbf5d342d Fix build issue 2019-08-20 11:38:18 +08:00
zclok010 b369a14ff3 Move build-in execution filters to https://github.com/Azure-Samples/hpcpack-samples 2019-08-01 23:35:55 +08:00
zclok010 5ae6582161 add a missing build-in execution filter in OnTaskStart.sh 2019-07-31 15:33:59 +08:00
zclok010 1a575bccb3 suppress frequent warning log message when counter file of a mellanox network driver is invalid 2019-07-31 15:32:31 +08:00
zclok010 dc73a1fb4d git ignore VMExtension/*.zip 2019-07-26 00:04:41 +08:00
zclok010 ba4ca371d7 git ignore VMExtension\*.zip 2019-07-26 00:01:58 +08:00
zclok010 f91b5bae05 update config sample 2019-07-25 23:58:02 +08:00
Sunbin Zhu 6abe5e32d4 Merge branch 'v2' of https://github.com/Azure/hpcpack-linux-agent into v2 2019-07-25 23:37:09 +08:00
Sunbin Zhu a544594e1e Update hpcnodemanager.py: Add firewall rule to allow port 40002 2019-07-25 23:37:02 +08:00
zclok010 302720eb73 revise some version info 2019-07-23 12:03:32 +08:00
zclok010 0de36a6ff9 mitigate scheduler pressure when connection is poor by decreasing HTTP reporter retry frequency 2019-07-23 12:03:32 +08:00
zclok010 19fcca1a97 Support multiple instances monitoring with instance filter; Seperate network usage monitoring from total usage to usage of individual network instances 2019-07-23 12:03:32 +08:00
Sunbin Zhu c7c071ffa0 Include VM extension code 2019-07-19 15:03:46 +08:00
zclok010 a2ff32a012 add comments of known issue of run-away processes 2019-07-04 00:14:05 +08:00
zclok010 5760fb798b Merge branch 'dockerTask' into v2 2019-07-03 14:49:55 +08:00
zclok010 16c5aaaecb HpcData client location change in execution filter 2019-06-19 15:11:19 +08:00
zclok010 183c4d8ef2 update version info 2019-06-17 16:01:59 +08:00
zclok010 9d827bcac1 Fix node manager crash issue due to out-of-bound array writing when constructing monitoring packet with too many data values 2019-06-13 19:32:41 +08:00
zclok010 9ff4b44466 docker task improvement 2019-06-13 15:31:44 +08:00
zclok010 a1dd952348 Add IB network usage factor 2019-05-31 16:59:55 +08:00
zclok010 ae73ce7624 Merge branch 'config' into v2 2019-05-27 16:24:51 +08:00
zclok010 29b2baa192 Merge branch 'ibNetwork' into v2 2019-05-27 16:22:24 +08:00