IBM Spectrum Symphony
This project installs and configures IBM Spectrum Symphony.
Use of IBM Spectrum Symphony requires a license agreement and Symphony binaries obtained directly from IBM Spectrum Analytics.
NOTE: This project currently supports only Linux Symphony clusters. Windows workers are not yet supported.
Pre-Requisites
This project requires running Azure CycleCloud version 7.7.1 and Symphony 7.2.0 or later.
This project requires the following:
-
A license to use IBM Spectrum Symphony from IBM Spectrum Analytics.
-
The IBM Spectrum Symphony installation binaries.
a. Download the binaries from IBM and place them in the ./blobs/symphony/ directory.
b. If the version is not 7.3.0.0 (the project default), then update the version number in the Files list in ./project.ini and in the cluster template: ./templates/symphony.txt
-
CycleCloud must be installed and running.
a. If this is not the case, see the CycleCloud QuickStart Guide for assistance.
-
The CycleCloud CLI must be installed and configured for use.
a. If this is not the case, see the CycleCloud CLI Install Guide for assistance.
Configuring the Project
The first step is to configure the project for use with your storage locker:
-
Open a terminal or Azure Cloud Shell session with the CycleCloud CLI enabled.
-
Clone this repo into a new directory and change to the new directory:
$ git clone https://github.com/Azure/cyclecloud-symphony.git
$ cd ./cyclecloud-symphony
-
Download the IBM Spectrum Symphony installation binaries and license entitlement file from IBM and place them in the ./blobs/symphony directory:
* ./blobs/symphony/sym-7.3.0.0.exe
* ./blobs/symphony/sym-7.3.0.0_x86_64.bin
* ./blobs/symphony/sym_adv_entitlement.dat
Or, if using an eval edition:
* ./blobs/symphony/sym_adv_ev_entitlement.dat
* ./blobs/symphony/symeval-7.3.0.0_x86_64.bin
* ./blobs/symphony/symeval-7.3.0.0.exe
- If the version number is not 7.3.0.0, update the version numbers in project.ini and templates/symphony.txt.
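Before uploading, it can help to confirm that the expected files are actually present. The following is a minimal sketch, not part of this project: the helper names `expected_blobs` and `missing_blobs` are hypothetical, and it assumes that file names for other versions follow the same pattern as the 7.3.0.0 names listed above.

```python
from pathlib import Path


def expected_blobs(version="7.3.0.0", eval_edition=False):
    """Return the blob file names listed above for the given version.

    Assumption: other versions use the same naming pattern as 7.3.0.0.
    """
    if eval_edition:
        return [
            "sym_adv_ev_entitlement.dat",
            f"symeval-{version}_x86_64.bin",
            f"symeval-{version}.exe",
        ]
    return [
        f"sym-{version}.exe",
        f"sym-{version}_x86_64.bin",
        "sym_adv_entitlement.dat",
    ]


def missing_blobs(blob_dir, version="7.3.0.0", eval_edition=False):
    """List the expected files that are not present in blob_dir."""
    root = Path(blob_dir)
    return [
        name
        for name in expected_blobs(version, eval_edition)
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    # Report anything missing from the directory used in the steps above.
    for name in missing_blobs("./blobs/symphony"):
        print(f"missing: ./blobs/symphony/{name}")
```

Run it from the project directory; no output means all three files for the selected edition were found.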
Deploying the Project
To upload the project (including any local changes) to your target locker, run the
cyclecloud project upload
command from the project directory. The expected output looks like
this:
$ cyclecloud project upload my_locker
Sync completed!
IMPORTANT
For the upload to succeed, you must have a valid Pogo configuration for your target Locker.
Importing the Cluster Template
To import the cluster:
-
Open a terminal session with the CycleCloud CLI enabled.
-
Switch to the Symphony directory.
-
Run cyclecloud import_template symphony -f templates/symphony.txt. The expected output looks like this:
$ cyclecloud import_template symphony -f templates/symphony.txt
Importing template symphony....
----------------
symphony : *template*
----------------
Keypair: $Keypair
Cluster nodes:
    master: off
Total nodes: 1
Host Factory Provider for Azure CycleCloud
This project extends the Symphony Host Factory with an Azure CycleCloud resource provider: azurecc.
The Host Factory will be configured as the default autoscaler for the cluster.
Installing the azurecc HostFactory
It is also possible to configure an existing Symphony installation to use the azurecc HostFactory to burst into Azure.
Please contact Azure support for help with this configuration.
Guide for using Static Templates with HostFactory
Follow the steps below if you are not using the Chef recipes included in this project:
- Copy cyclecloud-symphony-pkg-{version}.zip to the /tmp directory on the master node.
- Unzip the file in the /tmp directory.
- Ensure python3 is installed.
- Set the following environment variables to match your environment. If they are not set, the install.sh script under the hostfactory/host_provider directory uses default values:
EGO_TOP
HF_TOP
HF_VERSION
HF_CONFDIR
HF_WORKDIR
HF_LOGDIR
Run install.sh from within that folder. It installs a Python virtual environment at the plugin path:
$HF_TOP/$HF_VERSION/providerplugins/azurecc/scripts/venv
To also generate the Symphony configuration, run install.sh with the generate_config argument and the other required arguments. This sets all the configuration on the assumption that azurecc is the only provider, and enables it.
Example:
./install.sh generate_config --cluster <cluster_name> --username --password --web_server
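Because install.sh falls back to defaults for any variable that is not set, it can be useful to see which ones are missing before running it. The sketch below is illustrative only: the helper names `unset_hf_vars` and `venv_path` are hypothetical, and the default values that install.sh would apply are not reproduced here.

```python
import os

# Variable names taken from the list above.
HF_ENV_VARS = [
    "EGO_TOP",
    "HF_TOP",
    "HF_VERSION",
    "HF_CONFDIR",
    "HF_WORKDIR",
    "HF_LOGDIR",
]


def unset_hf_vars(env=None):
    """Return the variables that are missing or empty in env."""
    env = os.environ if env is None else env
    return [name for name in HF_ENV_VARS if not env.get(name)]


def venv_path(env=None):
    """Build $HF_TOP/$HF_VERSION/providerplugins/azurecc/scripts/venv.

    Returns None when HF_TOP or HF_VERSION is not set, since the
    script's defaults are not known to this sketch.
    """
    env = os.environ if env is None else env
    if not env.get("HF_TOP") or not env.get("HF_VERSION"):
        return None
    return "/".join(
        [
            env["HF_TOP"].rstrip("/"),
            env["HF_VERSION"],
            "providerplugins/azurecc/scripts/venv",
        ]
    )


if __name__ == "__main__":
    for name in unset_hf_vars():
        print(f"{name} is not set; install.sh will use its default")
    print("virtual environment path:", venv_path())
```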
Guide to using scripts
These scripts can be found under $HF_TOP/$HF_VERSION/providerplugins/azurecc/scripts.
- generateWeightedTemplates.sh
This script generates a weighted template. Run it as root:
./generateWeightedTemplates.sh
It creates a template based on the current CycleCloud template selections and prints it. If there are errors, check /tmp/template_generate.out. You must then store the template in the $HF_TOP/$HF_VERSION/hostfactory/providers/conf/azurecc_templates.json file. The change takes effect only after you stop and start HostFactory.
- testDryRunWeightedTemplates.sh
This test script has two options:
a. validate_templates: checks the template in the default path where azurecc_templates.json is stored.
./testDryRunWeightedTemplates.sh validate_templates
b. create_machines: requires an input JSON file, for example input.json:
{"template": {"machineCount": 205, "templateId": "execute"}, "dry-run": true}
./testDryRunWeightedTemplates.sh create_machines input.json
This does not create machines; it shows which machines would have been created for the given input. Make sure "dry-run" is true.
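The create_machines input can also be generated rather than typed by hand, which makes it harder to forget the "dry-run" flag. This is a small sketch under that assumption; `make_dry_run_request` and `is_dry_run` are hypothetical helpers, and only the keys shown in the input.json example above are used.

```python
import json


def make_dry_run_request(template_id, machine_count):
    """Build a create_machines request that always has "dry-run" set to true."""
    if machine_count < 1:
        raise ValueError("machine_count must be at least 1")
    return {
        "template": {"machineCount": machine_count, "templateId": template_id},
        "dry-run": True,
    }


def is_dry_run(request):
    """True only when "dry-run" is the JSON boolean true, not a string."""
    return request.get("dry-run") is True


if __name__ == "__main__":
    # Prints the same JSON as the input.json example above.
    print(json.dumps(make_dry_run_request("execute", 205)))
```

Redirect the printed JSON into input.json and pass that file to testDryRunWeightedTemplates.sh create_machines.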
Testing capacity issues
You can trigger a capacity issue using the following Python script. Run LastFailureTime.py under hostfactory/host_provider/test/ with the following arguments:
- Number of seconds before current time
- Account name
- Region
- Machine name
Update the UTC offset to match the time zone that cycle_server runs in. The example below uses -05:00 (5 hours behind UTC) for US Central time:
def format_time(timestamp):
    return datetime.datetime.strftime(timestamp, "%Y-%m-%d %H:%M:%S.%f-05:00")
python LastFailureTime.py 1 westus2 Standard_D8_v5
The output looks like this:
AdType = "Cloud.Capacity"
ExpirationTime = 2024-03-01 16:34:54.045927-06:00
AccountName = ""
StatusUpdatedTime = 2024-03-01 15:34:53.045927-06:00
Region = "westus2"
HasCapacity = False
Provider = "azure"
Name = "region//westus2/Standard_D8_v5"
MachineType = "Standard_D8_v5"
Store the above output in the /tmp/test.dat file, then run the following command to update the Cloud.Capacity record:
curl -k -u ${CC_USER} -X POST -H "Content-Type:application/text" --data-binary "@/tmp/test.dat" localhost:8080/db/Cloud.Capacity?format=text
You can also verify in the CycleCloud GUI that HasCapacity is false.
Note: Run this script from the machine where CycleCloud is installed.
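The hard-coded offset in format_time has to be edited by hand for each time zone. As an alternative, the offset can be passed in as a parameter. The following is only a sketch of that idea and is not the code in LastFailureTime.py: the `utc_offset_hours` parameter is an assumption added here, and the function expects a timezone-aware timestamp.

```python
import datetime


def format_time(timestamp, utc_offset_hours):
    """Format an aware timestamp as 'YYYY-MM-DD HH:MM:SS.ffffff+HH:MM'.

    The timestamp is converted to the fixed offset given by
    utc_offset_hours, so the wall-clock time and the suffix agree.
    """
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    tz = datetime.timezone(datetime.timedelta(hours=utc_offset_hours))
    return timestamp.astimezone(tz).isoformat(sep=" ", timespec="microseconds")


if __name__ == "__main__":
    now = datetime.datetime.now(datetime.timezone.utc)
    print(format_time(now, -5))
```

With this approach the same UTC instant can be rendered for whatever offset cycle_server uses, without changing the format string.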
Additional configs for symphony templates
The following configuration options have been added:
- enable_weighted_templates -> default value is true. When enabled, templates are generated with templateId corresponding to the nodearray and vmTypes in dictionary format with weights. You can change this under the symphony configuration for the master node. If you disable it, you need to use the default templateId format of nodearray + SKU name.
- ncpus_use_vcpus -> default value is true. It assumes you want the slot ncpu attribute to be based on the vCPU count. You can change this in the symphony configuration for the master node.
- capacity-failure-backoff -> default value is 300 (in seconds). This is how long scalelib waits after a failure before making the next allocation attempt.
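To illustrate what a backoff of this kind means in practice, the sketch below checks whether enough time has passed since the last failure. It is not scalelib's implementation; the function name `can_retry_allocation` and its arguments are hypothetical, and only the 300-second default comes from the description above.

```python
import datetime

# Default taken from the capacity-failure-backoff description above.
DEFAULT_BACKOFF_SECONDS = 300


def can_retry_allocation(last_failure, now, backoff_seconds=DEFAULT_BACKOFF_SECONDS):
    """Return True when the backoff window since last_failure has elapsed.

    last_failure is None when no failure has been recorded.
    """
    if last_failure is None:
        return True
    return (now - last_failure).total_seconds() >= backoff_seconds


if __name__ == "__main__":
    now = datetime.datetime.now(datetime.timezone.utc)
    failed = now - datetime.timedelta(seconds=120)
    # Only 120 seconds have passed, so the default 300-second window is still open.
    print(can_retry_allocation(failed, now))
```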
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.