nidhi0622 2023-02-06 13:25:58 -06:00
Parent 3056f9184a c8be905c44
Commit dfb2dde747
13 changed files with 380 additions and 144 deletions

50
.github/workflows/release.yml Vendored Normal file

@ -0,0 +1,50 @@
name: Upload Release Asset
on:
  push:
    # Sequence of patterns matched against refs/tags
    tags:
      - 'v*'
jobs:
  build:
    name: Upload Release Asset
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Build project # This would actually build your project, using zip for an example artifact
        run: |
          cd ./hcheck/hcheck/
          dotnet build -r linux-x64 --self-contained
      - name: Publish
        run: dotnet publish ./hcheck/hcheck/hcheck.csproj -c Release -o release -r linux-x64 --self-contained
      - name: copy send_log file
        run: cp ./hcheck/hcheck/src/send_log /home/runner/work/cyclecloud-nodehealth/cyclecloud-nodehealth/hcheck/hcheck/bin/Release/net6.0/linux-x64/
      - name: Get the version
        id: get_version
        run:
          echo ::set-output name=VERSION::${GITHUB_REF#refs/tags/}
      - name: tar files
        run: |
          echo ${{ steps.get_version.outputs.version }}
          cd /home/runner/work/cyclecloud-nodehealth/cyclecloud-nodehealth/hcheck/hcheck/bin/Release/net6.0/
          tar czf hcheck-linux-${{ steps.get_version.outputs.version }}.tgz ./linux-x64
      - name: Create Release
        id: create_release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ github.ref }}
          release_name: Release ${{ github.ref }}
          draft: false
          prerelease: true
      - name: Upload Release Asset
        id: upload-release-asset
        uses: actions/upload-release-asset@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          upload_url: ${{ steps.create_release.outputs.upload_url }} # This pulls from the CREATE RELEASE step above, referencing its ID to get its outputs object, which includes an `upload_url`. See this blog post for more info: https://jasonet.co/posts/new-features-of-github-actions/#passing-data-to-future-steps
          asset_path: /home/runner/work/cyclecloud-nodehealth/cyclecloud-nodehealth/hcheck/hcheck/bin/Release/net6.0/hcheck-linux-${{ steps.get_version.outputs.version }}.tgz
          asset_name: hcheck-linux-${{ steps.get_version.outputs.version }}.tgz
          asset_content_type: application/tgz


@ -4,9 +4,30 @@ Azure-healthcheck project is a helper that is capable of running custom healthch
This project supports [NHC](https://github.com/mej/nhc) healthcheck scripts and allows the addition of custom scripts. This was achieved with the help of work by Cormac Garvey, [cc_slurm_nhc](https://github.com/Azure/azurehpc/tree/master/experimental/cc_slurm_nhc). To learn more about this project and the advantages of running GPU healthchecks, refer to [this article](https://techcommunity.microsoft.com/t5/azure-global/automated-hpc-ai-compute-node-health-checks-integrated-with-the/ba-p/3113454).
## Table of Contents (by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc))
<!--ts-->
* [Installation](#installation)
* [Prerequisites](#prerequisites)
* [Building the project](#building-the-project)
* [Uploading the executable files into the blobs storage](#uploading-the-executable-files-into-the-blobs-storage)
* [Uploading the project to the Azure locker](#uploading-the-project-to-the-azure-locker)
* [Customizing healtchecks](#customizing-healtchecks)
* [Importing the cluster template into CycleCloud](#importing-the-cluster-template-into-cyclecloud)
* [Running NHC healthcheck](#running-nhc-healthcheck)
* [Designing custom NHC tests](#designing-custom-nhc-tests)
* [Running custom test scripts](#running-custom-test-scripts)
* [Designing custom test scripts](#designing-custom-test-scripts)
* [Running the hcheck binary](#running-the-hcheck-binary)
* [Changing the script for reporting errors](#changing-the-script-for-reporting-errors)
* [Testing the project](#testing-the-project)
* [Sample healthcheck report](#sample-healthcheck-report)
* [Contributing](#contributing)
* [Trademarks](#trademarks)
## Installation
### Pre-requisites
### Prerequisites
The instructions below assume that:
* you have a valid CycleCloud subscription
@ -16,11 +37,27 @@ The instructions below assume that:
### Building the project
The project comes with a pre-built binary used to run the test scripts and build reports compatible with linux-x64. If you wish to build the source yourself, you will need to install .NET Core. Please refer to the deploy.sh for an example of steps you need to take
The project comes with a pre-built binary used to run the test scripts and build reports compatible with linux-x64. If you wish to build the source yourself, you will need to install .NET Core. Please refer to the deploy.sh for an example of steps you need to take.
```bash
cd ./hcheck/hcheck/
dotnet build -r linux-x64 --self-contained
```
### Uploading the executable files into the blobs storage
All the executable files used by the project (including the external script for sending logs) need to be archived and stored in the blobs folder. You can reference deploy.sh to see how this is achieved:
```bash
VERSION=$(cyclecloud project info | grep Version | cut -d: -f2 | cut -d" " -f2)
DEST_FILE=$(pwd)/blobs/hcheck-linux-$VERSION.tgz
cp ../../../src/send_log ./linux-x64
tar czf $DEST_FILE ./linux-x64
```
### Uploading the project to the Azure locker
In order for you to be able to add the project to your CycleCloud cluster, you will first need to upload it to your Azure Locker.
In order for you to be able to add the project to your CycleCloud cluster, you will first need to upload it to your Azure Locker. The easiest way to do this is by editing deploy.sh.
```bash
cyclecloud project upload your-locker-name
@ -45,7 +82,6 @@ Most of them can be configured from the "Advanced Settings" tab in CycleCloud Se
![Alt](/images/advanced_settings.png "Advanced Settings")
### Importing the cluster template into CycleCloud
With CycleCloud CLI, upload the cluster template. Run the commands below to save your cluster settings (such as the region and configuration), and then import the cluster template along with those settings.
@ -55,7 +91,7 @@ cyclecloud export_parameters MyClusterName > param.json
cyclecloud import_cluster --force -f slurm.txt -c Slurm MyClusterName -p param.json
```
## Running NHC healthcheck:
## Running NHC healthcheck
The NHC checks that run are determined by the .conf file. By default, this project includes a set of cluster-specific configuration files. If you want to use a custom configuration instead, put your .conf file into the nhc-config subfolder within your project's files directory and edit the parameter to reflect that name:
@ -68,19 +104,13 @@ Alternatively, you can change the cluster template directly. This can be useful
config = YOUR_CUSTOM_NAME.conf
```
### Designing custom tests:
### Designing custom NHC tests
You can write your own test scripts to be run by the healthcheck tool.
1) NHC-based tests (.nhc files) have to be placed in the nhc-tests folder. In order for NHC to actually use them, you will need to create your own configuration files. Just place them in nhc-config folder and pass the name to the NHC config name parameter in the settings
NHC-based tests (.nhc files) have to be placed in the nhc-tests folder. In order for NHC to actually use them, you will need to create your own configuration files. Just place them in the nhc-config folder and pass the name to the NHC config name parameter in the settings.
2) Custom test scripts. Whether it is a bash or a python script, anything executable can be a test, as long as it adheres to the following rules:
- Exit code for a passing test is 0. Any non-zero exit code is considered a failure and will be reported
- To receive a meaningful report on the error, you need to output the message into the stdout
- If you want the report to contain more information than a single message can convey, you can make your script output a json string - just make sure it has a field "message" that would be used to log the error. If you do this, everything but the message field will end up in the "extra-info" part of the report as a valid json (please refer to the [Sample healthcheck report](##Sample-healthcheck-report) section for an example). If there are any formatting issues or you fail to include the "message" field, the whole json construction will become the reported message instead
## Running custom test scripts:
## Running custom test scripts
Put the custom scripts you want the healthcheck tool to run into the custom-tests directory. Update healthchecks.custom.pattern in the cluster-ini template to a pattern that the healthcheck will use to determine which test scripts to run.
@ -93,8 +123,16 @@ Alternatively, you can change the cluster template directly. This can be useful
pattern = *.sh
```
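As a rough illustration of how a pattern such as `*.sh` or `even*.sh` narrows down the scripts in custom-tests, here is a minimal Python sketch. It assumes shell-style glob matching, which is an assumption about the tool's behavior rather than something this README states, and `select_scripts` is a hypothetical helper that is not part of the project.

```python
from fnmatch import fnmatchcase


def select_scripts(filenames, pattern):
    """Return the file names matching a shell-style glob pattern, sorted.

    Assumption: the healthcheck tool applies glob-style matching; the real
    matching rules live in the hcheck binary and may differ.
    """
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


if __name__ == "__main__":
    candidates = ["evenfail.sh", "gpu.py", "disk.sh"]
    print(select_scripts(candidates, "*.sh"))
    print(select_scripts(candidates, "even*.sh"))
```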
All your healthchecks should exit with code 0 when the check passes, and with a non-zero code otherwise.
All your healthchecks should exit with code 0 when the check passes, and with a non-zero code otherwise.
### Designing custom test scripts
Whether it is a bash or a python script, anything executable can be a test, as long as it adheres to the following rules:
- Your script should contain a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))
- Exit code for a passing test is 0. Any non-zero exit code is considered a failure and will be reported
- To receive a meaningful report on the error, you need to output the message into the stdout
- If you want the report to contain more information than a single message can convey, you can make your script output a json string - just make sure it has a field "message" that would be used to log the error. If you do this, everything but the message field will end up in the "extra-info" part of the report as a valid json (please refer to the [Sample healthcheck report](#sample-healthcheck-report) section for an example). If there are any formatting issues or you fail to include the "message" field, the whole json construction will become the reported message instead
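The rules above can be sketched as a small Python test script. This is only an illustration: the CPU-count check, the `MIN_CPUS` threshold, and the `found_cpus` / `required_cpus` field names are assumptions made for the example, not checks or fields shipped with the project.

```python
#!/usr/bin/env python3
"""Hypothetical custom healthcheck: fail when the node has too few CPUs."""
import json
import os
import sys

MIN_CPUS = 1  # assumed threshold, chosen only for illustration


def check_cpu_count(min_cpus=MIN_CPUS):
    """Return (exit_code, payload); payload always carries a "message" field."""
    found = os.cpu_count() or 0
    if found >= min_cpus:
        return 0, {"message": "cpu count ok"}
    # Everything except "message" should end up in "extra-info" in the report.
    return 1, {
        "message": "too few CPUs",
        "found_cpus": found,
        "required_cpus": min_cpus,
    }


if __name__ == "__main__":
    exit_code, payload = check_cpu_count()
    print(json.dumps(payload))  # the tool reads the message from stdout
    if exit_code != 0:
        sys.exit(exit_code)  # non-zero exit code marks the test as failed
```

Saved into the custom-tests directory and made executable, a script like this would report a plain message on success and a JSON object with extra fields on failure.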
## Running the hcheck binary
@ -110,9 +148,24 @@ You should never have to run the tool manually, but in the case you want to do s
| --nr | Number of reruns for the set of scripts | --nr 3 |
| --pt | Pattern for custom script detection | --pt .sh |
| --rpath | Path to where the report would be generated | --rpath /tmp/log/report.json |
| --rscript | Path to the script reporting the results back to the portal | --rscript ./send_logs |
## Changing the script for reporting errors
## Testing the project:
Currently, the script that reports errors back to the portal is CycleCloud-specific and uses a custom version of the jetpack log command to send detailed information. If you wish to use another script to report the errors, it will be called with the following command-line parameters:
| Flag | Use |
| --- | --- |
| -m | Short message that shows up in CycleCloud logs |
| --level error | Level of the message |
| --info | Extra information about the tests in json format |
| --code | Exit code of the test |
| --testname | Name of the test |
| --nodeid | Id of the vm the tests were run on |
| --time | The time it took to run the test in ms |
| --error | Error message returned by the test script |
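A replacement reporting script only needs to accept the flags listed in the table. The sketch below is one possible shape, written in Python under a few assumptions: `parse_report` and `build_parser` are hypothetical names, the script simply prints one JSON line instead of talking to any portal, and unknown flags are ignored so that the sketch keeps working if more parameters are passed.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for send_log: turns the flags into one JSON line."""
import argparse
import json
import sys


def build_parser():
    parser = argparse.ArgumentParser(description="Record a healthcheck result")
    parser.add_argument("-m", "--message", help="short message for the logs")
    parser.add_argument("-l", "--level", help="level of the message")
    parser.add_argument("--info", help="extra information in json format")
    parser.add_argument("--code", help="exit code of the test")
    parser.add_argument("--testname", help="name of the test")
    parser.add_argument("--nodeid", help="id of the vm the tests ran on")
    parser.add_argument("--time", help="test duration in ms")
    parser.add_argument("--error", help="error message returned by the test")
    return parser


def parse_report(argv):
    """Parse the flags into a dict; --info is decoded when it is valid JSON."""
    args, _unknown = build_parser().parse_known_args(argv)
    entry = vars(args)
    try:
        entry["info"] = json.loads(entry["info"])
    except (TypeError, ValueError):
        pass  # keep the raw value when --info is absent or not valid JSON
    return entry


if __name__ == "__main__":
    # Swap this print for whatever transport your own portal requires.
    print(json.dumps(parse_report(sys.argv[1:])))
```

The path to such a script would then be passed to the tool through the `--rscript` parameter.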
## Testing the project
You can test the project by putting your custom scripts returning fixed results into the custom-test folder and setting the healthchecks.custom.pattern to the pattern that would detect them.
@ -122,6 +175,8 @@ C# tool itself also comes with unit tests that you can run yourself by going int
dotnet test
```
If you want to test how healthchecks work on a real cluster, you can use the provided evenfail.sh test located in the sample-healthchecks subfolder. Copy it to the ./specs/default/cluster-init/files/custom-tests directory, import the slurm.txt template into your cluster (whose name should contain a single dash, for example "cycleslurm-demo"), and set "even*.sh" as the custom script pattern parameter. After this, you can run deploy.sh and start the cluster.
## Sample healthcheck report
All healthcheck scripts run by the tool must exit with a non-zero code when they encounter an error. To store extra information in the report as a proper JSON field, make sure your script outputs valid JSON that contains a "message" field; that field is removed from the extra information and used as the main output of the script. If the "message" field is missing or the JSON is malformed, the whole JSON string is used as the message.
@ -159,8 +214,6 @@ All healthcheck scripts run by the tool are required to exit with a non-zero cod
}
```
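The way a script's output is split between the message and the extra information can be paraphrased in a few lines of Python. This is a sketch of the rule described in this section, not the tool's own code (which is written in C#); `split_output` is a hypothetical name, and unlike the tool's parser, Python's `json` module does not accept trailing commas.

```python
import json


def split_output(stdout):
    """Return (message, extra_info) for the text a test script printed.

    Valid JSON objects with a "message" field are split; anything else is
    reported whole as the message, with "None" as the extra information.
    """
    try:
        parsed = json.loads(stdout)
    except ValueError:
        return stdout, "None"
    if not isinstance(parsed, dict) or "message" not in parsed:
        return stdout, "None"
    extra_info = {key: value for key, value in parsed.items() if key != "message"}
    return parsed["message"], extra_info


if __name__ == "__main__":
    print(split_output('{"message": "gpu down", "gpu": 3}'))
    print(split_output("plain text failure"))
```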
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a

Binary data
blobs/hcheck-linux-1.0.9.tgz Normal file

Binary file not shown.


@ -17,6 +17,6 @@ cp ../../../src/send_log /linux-x64
tar czf hcheck-linux-$VERSION.tgz linux-x64
cp hcheck-linux-$VERSION.tgz ../../../../../blobs/
cd $( dirname $0 )/
echo \#!/usr/bin/env bash > ../../../../../specs/default/cluster-init/files/version.sh
echo export HEALTHCHECK_VERSION=$VERSION >> ../../../../../specs/default/cluster-init/files/version.sh
echo \#!/usr/bin/env bash > ./specs/default/cluster-init/files/version.sh
#echo export HEALTHCHECK_VERSION=$VERSION >> ./specs/default/cluster-init/files/version.sh
#cyclecloud project upload azure-storage


@ -26,7 +26,7 @@ public class HealthcheckTest
{
using (TestScriptGenerator tsg = new TestScriptGenerator("print(\"Hello, python world\")\nexit(0)", true))
{
string[] args = { "-k", tsg.Path, "--rpath", "./report.json", "--python", "python3"};
string[] args = { "-k", tsg.Path, "--rpath", "./report.json"};
Healthcheck.Main(args);
ArgumentProcessor argus = new ArgumentProcessor(args);
ReportBuilder builder = new ReportBuilder(argus, argus.FilePath);


@ -23,7 +23,7 @@ namespace hcheck
public bool isSuccess = false;
public DateTime startTime;
public DateTime exitTime;
public virtual void RunProcess(string filePath, string[] args = null, int timeout = 1000)
public virtual void RunProcess(string filePath, string[]? args = null, int timeout = 1000)
{
using (System.Diagnostics.Process pProcess = new System.Diagnostics.Process())
{


@ -1,118 +1,118 @@
using System.Text.Json;
using System.Xml;
namespace hcheck
{
public class TestRunner
{
private HealthReport report;
public ProcessRunner? pr = null;
public TestRunner(HealthReport header)
{
report = header;
}
public HealthReport getReport()
{
return report;
}
private void AddRepeatInfo(string testPath, Dictionary<string, object> testResults, ProcessRunner pr)
{
if (report.testresults[testPath].ContainsKey("repeat-history"))
{
LinkedList<object> list= JsonSerializer.Deserialize< LinkedList<object>>( report.testresults[testPath]["repeat-history"].ToString());
list.AddLast(testResults);
report.testresults[testPath]["repeat-history"] = list;
//LinkedList<object> list = (LinkedList<object>)report.testresults[testPath]["repeat-history"];
//list.AddLast(testResults);
}
else
{
LinkedList<object> list = new LinkedList<object>();
list.AddLast(new Dictionary<string, object>(report.testresults[testPath]));
list.AddLast(testResults);
report.testresults[testPath]["repeat-history"] = list;
}
//first test run that returned an error should be reported
if (report.testresults[testPath]["exit-code"].ToString() == "0" && pr.exitCode != 0)
{
report.testresults[testPath]["exit-code"] = pr.exitCode;
report.testresults[testPath]["extra-info"] = testResults["extra-info"];
report.testresults[testPath][key: "message"] = testResults["message"];
}
}
public void RunTest(string testPath, ArgumentProcessor args)
{
//concern: properly initialize the test, success or fail
//how long took, collect results, write the report
//contract: if the external process is running, exit 0
//invoke nvidia-smi
if (pr == null) pr = new ProcessRunner();
//python scripts need to be run with a python installation
if (args.ReframePath != "")
{
string? reportPath = Path.GetDirectoryName(args.FilePath);
// string actualReportPath = (reportPath == null) ? "/var/log/reframe_results.json" : reportPath + "reframe_results.json";
string actualReportPath = "/var/log/reframe_results.json";
pr.RunProcess(args.ReframePath, new string[] { "-C",args.ReframeConfigPath,"--force-local","--report-file", actualReportPath,"-c", testPath, "-R","-r" }, 10000);
if (!pr.isSuccess)
{
Console.WriteLine("There was an error in launching the script: " + pr.stderr);
return;
}
else
Console.WriteLine("reframe ran with output " + pr.stdout);
string reframeErrorMessage = ReframeWorker.ReadReframeReport(actualReportPath);
pr.stdout = (reframeErrorMessage == "") ? "No message" : reframeErrorMessage;
}
else
{
pr.RunProcess(testPath);
}
var options = new JsonSerializerOptions()
{
AllowTrailingCommas = true
};
if (!pr.isSuccess)
{
Console.WriteLine("There was an error in launching the script: " + pr.stderr);
return;
}
Dictionary<string, object> testResults = new Dictionary<string, object>();
try
{
testResults.Add("exit-code", pr.exitCode);
testResults.Add("test-time", (pr.exitTime - pr.startTime).TotalMilliseconds);
testResults.Add("extra-info", "None");
Dictionary<string, object>? deserializedResult = JsonSerializer.Deserialize<Dictionary<string, object>>(pr.stdout, options);
if (deserializedResult == null) throw new System.Text.Json.JsonException();
Dictionary<string, object> extraInfo = new Dictionary<string, object>();
foreach (KeyValuePair<string, object> record in deserializedResult)
{
if (record.Key == "message") testResults.Add("message", record.Value);
else extraInfo.Add(record.Key, record.Value);
}
//if no "message" tag, treat the whole thing as a message
if (!testResults.ContainsKey("message")) throw new System.Text.Json.JsonException("No message set");
testResults["extra-info"] = extraInfo;
}
catch (System.Text.Json.JsonException ex) when (ex.Data != null) //if not parse-able, result was a simple message
{
testResults.Add("message", pr.stdout);
}
if (!report.testresults.ContainsKey(testPath))
report.testresults.Add(testPath, value: testResults);
else
{
AddRepeatInfo(testPath, testResults, pr);
}
}
}
using System.Text.Json;
using System.Xml;
namespace hcheck
{
public class TestRunner
{
private HealthReport report;
public ProcessRunner? pr = null;
public TestRunner(HealthReport header)
{
report = header;
}
public HealthReport getReport()
{
return report;
}
private void AddRepeatInfo(string testPath, Dictionary<string, object> testResults, ProcessRunner pr)
{
if (report.testresults[testPath].ContainsKey("repeat-history"))
{
LinkedList<object> list= JsonSerializer.Deserialize< LinkedList<object>>( report.testresults[testPath]["repeat-history"].ToString());
list.AddLast(testResults);
report.testresults[testPath]["repeat-history"] = list;
//LinkedList<object> list = (LinkedList<object>)report.testresults[testPath]["repeat-history"];
//list.AddLast(testResults);
}
else
{
LinkedList<object> list = new LinkedList<object>();
list.AddLast(new Dictionary<string, object>(report.testresults[testPath]));
list.AddLast(testResults);
report.testresults[testPath]["repeat-history"] = list;
}
//first test run that returned an error should be reported
if (report.testresults[testPath]["exit-code"].ToString() == "0" && pr.exitCode != 0)
{
report.testresults[testPath]["exit-code"] = pr.exitCode;
report.testresults[testPath]["extra-info"] = testResults["extra-info"];
report.testresults[testPath][key: "message"] = testResults["message"];
}
}
public void RunTest(string testPath, ArgumentProcessor args)
{
//concern: properly initialize the test, success or fail
//how long took, collect results, write the report
//contract: if the external process is running, exit 0
//invoke nvidia-smi
if (pr == null) pr = new ProcessRunner();
//python scripts need to be run with a python installation
if (args.ReframePath != "")
{
string? reportPath = Path.GetDirectoryName(args.FilePath);
// string actualReportPath = (reportPath == null) ? "/var/log/reframe_results.json" : reportPath + "reframe_results.json";
string actualReportPath = "/var/log/reframe_results.json";
pr.RunProcess(args.ReframePath, new string[] { "-C",args.ReframeConfigPath,"--force-local","--report-file", actualReportPath,"-c", testPath, "-R","-r" }, 10000);
if (!pr.isSuccess)
{
Console.WriteLine("There was an error in launching the script: " + pr.stderr);
return;
}
else
Console.WriteLine("reframe ran with output " + pr.stdout);
string reframeErrorMessage = ReframeWorker.ReadReframeReport(actualReportPath);
pr.stdout = (reframeErrorMessage == "") ? "No message" : reframeErrorMessage;
}
else
{
pr.RunProcess(testPath);
}
var options = new JsonSerializerOptions()
{
AllowTrailingCommas = true
};
if (!pr.isSuccess)
{
Console.WriteLine("There was an error in launching the script: " + pr.stderr);
return;
}
Dictionary<string, object> testResults = new Dictionary<string, object>();
try
{
testResults.Add("exit-code", pr.exitCode);
testResults.Add("test-time", (pr.exitTime - pr.startTime).TotalMilliseconds);
testResults.Add("extra-info", "None");
Dictionary<string, object>? deserializedResult = JsonSerializer.Deserialize<Dictionary<string, object>>(pr.stdout, options);
if (deserializedResult == null) throw new System.Text.Json.JsonException();
Dictionary<string, object> extraInfo = new Dictionary<string, object>();
foreach (KeyValuePair<string, object> record in deserializedResult)
{
if (record.Key == "message") testResults.Add("message", record.Value);
else extraInfo.Add(record.Key, record.Value);
}
//if no "message" tag, treat the whole thing as a message
if (!testResults.ContainsKey("message")) throw new System.Text.Json.JsonException("No message set");
testResults["extra-info"] = extraInfo;
}
catch (System.Text.Json.JsonException ex) when (ex.Data != null) //if not parse-able, result was a simple message
{
testResults.Add("message", pr.stdout);
}
if (!report.testresults.ContainsKey(testPath))
report.testresults.Add(testPath, value: testResults);
else
{
AddRepeatInfo(testPath, testResults, pr);
}
}
}
}


@ -16,11 +16,18 @@ namespace hcheck
public TestScriptGenerator(string script, bool isPython = false)
{
_path = System.IO.Path.GetTempFileName();
if (isPython) _path += ".py"; //extension is used to detect python scripts
else File.WriteAllText(_path, "#!/usr/bin/env bash\n");
if (isPython)
{
_path += ".py"; //extension is used to detect python scripts
File.WriteAllText(_path, "#!/usr/bin/env python3\n");
}
else
{
File.WriteAllText(_path, "#!/usr/bin/env bash\n");
}
File.AppendAllText(_path, script);
ProcessRunner pr = new ProcessRunner();
pr.RunProcess("chmod", new string[]{"+x", _path});
pr.RunProcess("chmod", new string[] { "+x", _path });
}
public void Dispose()

117
hcheck/hcheck/src/send_log Executable file

@ -0,0 +1,117 @@
#!/opt/cycle/jetpack/system/embedded/bin/python3
from jetpack import util
import time
import json
import sys
import argparse

class SendError(Exception):
    pass

class LogError(Exception):
    pass

def detailed_log(message, exit_code, extra_info, node_id, test_name, error_message, test_time = 0, level = "error", priority=None):
    if level not in ["info", "warn", "error"]:
        raise LogError("Invalid level: %s" % level)
    priority = priority or _get_priority(level)
    message_data = {
        "level": level,
        "message": message,
        "priority": priority,
        "exit_code": exit_code,
        "extra_info": extra_info,
        "node_id": node_id,
        "test_name": test_name,
        "test_time": test_time,
        "error_message": error_message
    }
    send_internal_message(message_data, "log")

def send_internal_message(message_data, message_type):
    '''
    Sends a system message to CycleCloud
    parameters:
    message_data - this is a python dictionary
    message_type - type of message, examples are test, log, installation
    '''
    if not isinstance(message_data, dict):
        raise SendError("message_data parameter must be a dictionary")
    config = util.parse_config(None)
    try:
        identity = config["identity"]
        cluster_session_id = identity.get('cluster_session_id')
        cluster_name = identity["cluster_name"]
        instance_id = identity["instance_id"]
        cycle_server_config = config['cycle_server']
    except KeyError as e:
        raise SendError("Unable to find '%s' in config" % str(e))
    message_obj = {
        "cluster_name": cluster_name,
        "instance_id": instance_id,
        "timestamp": util.iso_8601_timestamp(),
        "cluster_session_id": cluster_session_id,
        "type": message_type,
        "data": message_data
    }
    def func():
        r = _post_message(message_obj)
        if r.status != 202:
            raise Exception("Failed to send message: %d" % r.status)
    return _retry_func(func)

def _post_message(message_obj):
    return util.query_cyclecloud("/clusterlink/messages", body=json.dumps({"messages": [message_obj]}), method="POST")

def _retry_func(func):
    wait_length = 5
    while True:
        try:
            return func()
        except Exception as e:
            # retry up to 4 times, waiting 5, 10, 20, then 40 seconds
            if wait_length < 41:
                time.sleep(wait_length)
                wait_length *= 2
            else:
                raise e

def _get_priority(level):
    if level == 'info':
        log_priority = 'medium'
    elif level == 'warn':
        log_priority = 'medium'
    elif level == 'error':
        log_priority = 'high'
    else:
        raise LogError("Invalid log level")
    return log_priority

parser = argparse.ArgumentParser(description='Send detailed healthcheck logs to CycleCloud')
parser.add_argument("-m", "--message", help="message displayed in CycleCloud logs")
parser.add_argument("--info", help="extra information about the tests in json format")
parser.add_argument("--code", help="exit code of the test")
parser.add_argument("--testname", help="name of the test")
parser.add_argument("--nodeid", help="the id of the vm the tests were run on")
parser.add_argument("--time", help="the time it took to run the test in ms")
parser.add_argument("-l", "--level", help="the level the log should be submitted at")
parser.add_argument("--error", help="the error message returned by the test script")
args = parser.parse_args()
detailed_log(args.message, args.code, args.info, args.nodeid, args.testname, args.error, args.time)


@ -1,9 +1,9 @@
[project]
name = healthcheck
type = application
version = 1.0.9
version = 0.0.9
[blobs]
Files = hcheck-linux-1.0.9.tgz
Files = hcheck-linux-0.0.9.tgz


@ -0,0 +1,7 @@
#!/usr/bin/env bash
#this is set to work on HPC nodes of a cluster that has a single dash in its name.
#you might need to change the number in -f5 parameter of cut to adapt it to a different number of dashes in the cluster name
node_index=$(jetpack config cyclecloud.node.name | cut -d- -f5)
if [[ $(expr $node_index % 2) == 0 ]]; then
echo failed; exit 1;
fi


@ -0,0 +1 @@
Use this folder to upload your custom test scripts


@ -0,0 +1 @@
Use this folder to store NHC configuration files