Experiment config reference
To create a new NNI experiment, prepare a config file on your local machine and pass the path of this file to nnictl. The config file is written in YAML format and must be well-formed. This document describes the rules for writing the config file and provides some examples and templates.
Template
- Lightweight (without Annotation and Assessor)
authorName:
experimentName:
trialConcurrency:
maxExecDuration:
maxTrialNum:
#choice: local, remote
trainingServicePlatform:
searchSpacePath:
#choice: true, false
useAnnotation:
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum:
trial:
  command:
  codeDir:
  gpuNum:
#machineList can be empty if the platform is local
machineList:
  - ip:
    port:
    username:
    passwd:
- Use Assessor
authorName:
experimentName:
trialConcurrency:
maxExecDuration:
maxTrialNum:
#choice: local, remote
trainingServicePlatform:
searchSpacePath:
#choice: true, false
useAnnotation:
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum:
assessor:
  #choice: Medianstop
  builtinAssessorName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum:
trial:
  command:
  codeDir:
  gpuNum:
#machineList can be empty if the platform is local
machineList:
  - ip:
    port:
    username:
    passwd:
- Use Annotation
authorName:
experimentName:
trialConcurrency:
maxExecDuration:
maxTrialNum:
#choice: local, remote
trainingServicePlatform:
#choice: true, false
useAnnotation:
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum:
assessor:
  #choice: Medianstop
  builtinAssessorName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum:
trial:
  command:
  codeDir:
  gpuNum:
#machineList can be empty if the platform is local
machineList:
  - ip:
    port:
    username:
    passwd:
Configuration
-
authorName
-
Description
authorName is the name of the author who creates the experiment. TBD: add default value
-
-
experimentName
-
Description
experimentName is the name of the experiment you created.
TBD: add default value
-
-
trialConcurrency
-
Description
trialConcurrency specifies the maximum number of trial jobs that run simultaneously.
Note: if the gpuNum you set for trials is larger than the number of free GPUs on your machine, so that the number of simultaneously running trial jobs cannot reach trialConcurrency, some trial jobs will be put into a queue to wait for GPU allocation.
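The queueing behaviour in the note can be illustrated with a little arithmetic. The sketch below is a simplified model, not NNI's scheduler: it assumes each trial claims its GPUs exclusively, and the function name `split_running_and_queued` is hypothetical.

```python
def split_running_and_queued(trial_concurrency, gpus_per_trial, free_gpus):
    """Return (running, queued) trial counts under a simplified GPU model.

    Assumption: every running trial holds `gpus_per_trial` GPUs exclusively,
    and a trial that needs no GPU is never blocked.
    """
    if gpus_per_trial == 0:
        running = trial_concurrency
    else:
        # Only as many trials can start as the free GPUs can serve.
        running = min(trial_concurrency, free_gpus // gpus_per_trial)
    return running, trial_concurrency - running


# Three concurrent trials requested, two GPUs each, four GPUs free:
# two trials run and one waits for GPU allocation.
running, queued = split_running_and_queued(3, 2, 4)
print(running, queued)
```

With no GPU requested per trial, all trials run at once in this model; with more GPUs requested per trial than are free, every trial waits.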
-
-
maxExecDuration
-
Description
maxExecDuration specifies the maximum duration of an experiment. The time unit is one of {s, m, h, d}, which mean {seconds, minutes, hours, days}.
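As an illustration of the unit suffixes, the following sketch converts a duration string such as `1h` into seconds. It assumes the value is a whole number immediately followed by a single unit letter, as in the examples in this document; `parse_duration` is a hypothetical helper, not part of nnictl.

```python
import re

# Seconds per unit, following the {s, m, h, d} units described above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a string like '1h' or '30m' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


print(parse_duration("1h"))   # 3600
print(parse_duration("2d"))   # 172800
```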
-
-
maxTrialNum
-
Description
maxTrialNum specifies the maximum number of trial jobs created by NNI, including both succeeded and failed jobs.
-
-
trainingServicePlatform
-
Description
trainingServicePlatform specifies the platform on which the experiment runs. The choices are {local, remote}.
-
local mode means you run the experiment on your local Linux machine.
-
remote mode means you submit trial jobs to remote Linux machines. If you set the platform to remote, you must fill in the machineList field.
-
-
-
searchSpacePath
-
Description
searchSpacePath specifies the path of the search space file you want to use, which should be a valid path on your local Linux machine.
Note: if you set useAnnotation=True, you should remove the searchSpacePath field or leave it empty.
-
-
useAnnotation
-
Description
useAnnotation specifies whether annotation is used to analyze your code and generate the search space.
Note: if you set useAnnotation=True, you should not set searchSpacePath.
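The rule that useAnnotation and searchSpacePath should not be combined can be checked before launching an experiment. This is a minimal sketch that assumes the config has already been loaded into a Python dict; `search_space_source` is a hypothetical helper, not an nnictl function.

```python
def search_space_source(config):
    """Report where the search space comes from, rejecting the conflicting case.

    `config` is assumed to be the experiment config loaded into a dict.
    """
    use_annotation = bool(config.get("useAnnotation", False))
    search_space_path = config.get("searchSpacePath")
    if use_annotation and search_space_path:
        raise ValueError(
            "useAnnotation is true, so searchSpacePath must be removed or left empty"
        )
    return "annotation" if use_annotation else search_space_path


print(search_space_source({"useAnnotation": True}))
print(search_space_source({"useAnnotation": False,
                           "searchSpacePath": "/nni/search_space.json"}))
```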
-
-
tuner
-
Description
tuner specifies the tuner algorithm used to run the experiment. There are two ways to set the tuner. One is to use a tuner provided by the NNI SDK, in which case you only need to set builtinTunerName and classArgs. The other is to use your own tuner file, in which case you need to set codeDir, classFileName, className and classArgs.
-
builtinTunerName and classArgs
-
builtinTunerName
builtinTunerName specifies the name of the built-in tuner you want to use. The NNI SDK provides four tuners: {TPE, Random, Anneal, Evolution}.
-
classArgs
classArgs specifies the arguments of the tuner algorithm.
-
-
codeDir, classFileName, className and classArgs
-
codeDir
codeDir specifies the directory of tuner code.
- classFileName
classFileName specifies the name of tuner file.
- className
className specifies the name of tuner class.
- classArgs
classArgs specifies the arguments of tuner algorithm.
-
-
gpuNum
gpuNum specifies the number of GPUs used to run the tuner process. The value of this field should be a positive number.
Note: you can specify only one way to set the tuner. For example, you can set {builtinTunerName, classArgs} or {codeDir, classFileName, className, classArgs}, but not both.
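The "one way only" rule for the tuner section can be expressed as a small check. The sketch below assumes the tuner section has been loaded into a dict and uses the field names from this document; `tuner_kind` is a hypothetical helper, not NNI's own validation.

```python
BUILTIN_TUNERS = ("TPE", "Random", "Anneal", "Evolution")
CUSTOM_KEYS = ("codeDir", "classFileName", "className")


def tuner_kind(tuner):
    """Return 'builtin' or 'custom' for a tuner section, or raise ValueError."""
    has_builtin = "builtinTunerName" in tuner
    custom_present = [key for key in CUSTOM_KEYS if key in tuner]
    if has_builtin and custom_present:
        raise ValueError("set either builtinTunerName or a custom tuner, not both")
    if has_builtin:
        if tuner["builtinTunerName"] not in BUILTIN_TUNERS:
            raise ValueError(f"unknown built-in tuner: {tuner['builtinTunerName']!r}")
        return "builtin"
    if len(custom_present) == len(CUSTOM_KEYS):
        return "custom"
    raise ValueError("a custom tuner needs codeDir, classFileName and className")


print(tuner_kind({"builtinTunerName": "TPE",
                  "classArgs": {"optimize_mode": "maximize"}}))
print(tuner_kind({"codeDir": "/nni/tuner", "classFileName": "mytuner.py",
                  "className": "MyTuner"}))
```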
-
-
assessor
-
Description
assessor specifies the assessor algorithm used to run the experiment. There are two ways to set the assessor. One is to use an assessor provided by the NNI SDK, in which case you only need to set builtinAssessorName and classArgs. The other is to use your own assessor file, in which case you need to set codeDir, classFileName, className and classArgs.
-
builtinAssessorName and classArgs
-
builtinAssessorName
builtinAssessorName specifies the name of the built-in assessor you want to use. The choice listed in the templates above is {Medianstop}.
-
classArgs
classArgs specifies the arguments of the assessor algorithm.
-
-
codeDir, classFileName, className and classArgs
-
codeDir
codeDir specifies the directory of the assessor code.
- classFileName
classFileName specifies the name of the assessor file.
- className
className specifies the name of the assessor class.
- classArgs
classArgs specifies the arguments of the assessor algorithm.
-
-
gpuNum
gpuNum specifies the number of GPUs used to run the assessor process. The value of this field should be a positive number.
Note: you can specify only one way to set the assessor. For example, you can set {builtinAssessorName, classArgs} or {codeDir, classFileName, className, classArgs}, but not both. If you do not want to use an assessor, leave assessor empty or remove it from your config file. Default value is 0.
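One way to read the custom fields is: load the class named className from the file classFileName inside codeDir, then construct it with classArgs. The sketch below illustrates that reading with the standard importlib module. It is not NNI's actual loader; `load_custom_class` and the throwaway `myassessor.py` file it creates are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_custom_class(code_dir, class_file_name, class_name, class_args=None):
    """Import `class_name` from `code_dir/class_file_name` and instantiate it."""
    path = os.path.join(code_dir, class_file_name)
    module_name = os.path.splitext(class_file_name)[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    cls = getattr(module, class_name)
    # classArgs become keyword arguments of the constructor.
    return cls(**(class_args or {}))


# Demo: write a tiny stand-in assessor file and load it the same way.
with tempfile.TemporaryDirectory() as code_dir:
    with open(os.path.join(code_dir, "myassessor.py"), "w") as handle:
        handle.write(
            "class MyAssessor:\n"
            "    def __init__(self, optimize_mode):\n"
            "        self.optimize_mode = optimize_mode\n"
        )
    assessor = load_custom_class(
        code_dir, "myassessor.py", "MyAssessor", {"optimize_mode": "maximize"}
    )

print(type(assessor).__name__, assessor.optimize_mode)
```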
-
-
trial
-
command
command specifies the command used to run the trial process.
-
codeDir
codeDir specifies the directory of your own trial file.
-
gpuNum
gpuNum specifies the number of GPUs used to run your trial process. Default value is 0.
-
-
machineList
machineList must be set if trainingServicePlatform is remote; otherwise it can be empty.
-
ip
ip is the ip address of your remote machine.
-
port
port is the SSH port used to connect to the machine.
Note: if you leave port empty, the default value is 22.
-
username
username is the account you use.
-
passwd
passwd specifies the password of your account.
-
sshKeyPath
If you want to use an SSH key to log in to the remote machine, set sshKeyPath in the config file. sshKeyPath is the path of the SSH key file, which should be valid.
Note: if you set passwd and sshKeyPath simultaneously, nni will try passwd.
-
passphrase
passphrase is used to protect the SSH key. It can be empty if your key has no passphrase.
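The machineList rules above (required for remote, port defaulting to 22) can be applied in one pass over the entries. This is a minimal sketch on plain dicts; `normalize_machine_list` is a hypothetical helper, and requiring either passwd or sshKeyPath on each entry is an assumption that this document does not state explicitly.

```python
DEFAULT_SSH_PORT = 22


def normalize_machine_list(platform, machine_list):
    """Validate machineList entries and fill in the default SSH port."""
    machines = list(machine_list or [])
    if platform != "remote":
        # machineList may be empty when the platform is local.
        return machines
    if not machines:
        raise ValueError("machineList must be set when the platform is remote")
    normalized = []
    for machine in machines:
        entry = dict(machine)
        for key in ("ip", "username"):
            if not entry.get(key):
                raise ValueError(f"machine entry is missing {key!r}")
        if not entry.get("port"):
            entry["port"] = DEFAULT_SSH_PORT
        # Assumption: some credential is needed to log in.
        if not entry.get("passwd") and not entry.get("sshKeyPath"):
            raise ValueError("machine entry needs passwd or sshKeyPath")
        normalized.append(entry)
    return normalized


machines = normalize_machine_list(
    "remote",
    [{"ip": "10.10.10.10", "username": "test", "passwd": "test"}],
)
print(machines[0]["port"])
```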
-
Examples
-
local mode
If you want to run your trial jobs on your local machine and use annotation to generate the search space, you can use the following config:
authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
#choice: true, false
useAnnotation: true
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0
If you want to use an assessor, add the assessor configuration to your file:
authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
assessor:
  #choice: Medianstop
  builtinAssessorName: Medianstop
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0
Or you can specify your own tuner and assessor files as follows:
authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  codeDir: /nni/tuner
  classFileName: mytuner.py
  className: MyTuner
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
assessor:
  codeDir: /nni/assessor
  classFileName: myassessor.py
  className: MyAssessor
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0
- remote mode
If you want to run trial jobs on remote machines, specify the remote machine information in the following format:
authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: remote
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0
#machineList can be empty if the platform is local
machineList:
  - ip: 10.10.10.10
    port: 22
    username: test
    passwd: test
  - ip: 10.10.10.11
    port: 22
    username: test
    passwd: test
  - ip: 10.10.10.12
    port: 22
    username: test
    sshKeyPath: /nni/sshkey
    passphrase: qwert