
Experiment config reference

To create a new NNI experiment, prepare a config file on your local machine and pass its path to nnictl. The config file is written in YAML and must be well-formed. This document describes the rules for writing the config file and provides some examples and templates.

Template

Lightweight (without Annotation and Assessor)

authorName: 
experimentName: 
trialConcurrency: 
maxExecDuration: 
maxTrialNum: 
#choice: local, remote
trainingServicePlatform: 
searchSpacePath: 
#choice: true, false
useAnnotation: 
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum: 
trial:
  command: 
  codeDir: 
  gpuNum: 
#machineList can be empty if the platform is local
machineList:
  - ip: 
    port: 
    username: 
    passwd:

Use Assessor

authorName: 
experimentName: 
trialConcurrency: 
maxExecDuration: 
maxTrialNum: 
#choice: local, remote
trainingServicePlatform: 
searchSpacePath: 
#choice: true, false
useAnnotation: 
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum: 
assessor:
  #choice: Medianstop
  builtinAssessorName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum: 
trial:
  command: 
  codeDir: 
  gpuNum: 
#machineList can be empty if the platform is local
machineList:
  - ip: 
    port: 
    username: 
    passwd:

Use Annotation

authorName: 
experimentName: 
trialConcurrency: 
maxExecDuration: 
maxTrialNum: 
#choice: local, remote
trainingServicePlatform: 
#choice: true, false
useAnnotation: 
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum: 
assessor:
  #choice: Medianstop
  builtinAssessorName:
  classArgs:
    #choice: maximize, minimize
    optimize_mode:
  gpuNum: 
trial:
  command: 
  codeDir: 
  gpuNum: 
#machineList can be empty if the platform is local
machineList:
  - ip: 
    port: 
    username: 
    passwd:

Configuration

authorName
- Description
  
  authorName is the name of the author who creates the experiment. TBD: add default value
experimentName
- Description
  
  experimentName is the name of the experiment you created.
  TBD: add default value

trialConcurrency
- Description
  
  trialConcurrency specifies the maximum number of trial jobs that run simultaneously.

  Note: if the trial jobs request more GPUs (through the trial gpuNum field) than are free on your machine, so that the number of simultaneously running trial jobs cannot reach trialConcurrency, the remaining trial jobs are put into a queue to wait for GPU allocation.
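The queueing note can be illustrated with a little arithmetic. The sketch below is a simplified model and not NNI's actual scheduler: it assumes each running trial holds a whole number of GPUs, and the function name `split_running_and_queued` is hypothetical.

```python
def split_running_and_queued(trial_concurrency: int, free_gpus: int, trial_gpu_num: int) -> tuple:
    """Return (running, queued) trial counts under a simplified GPU model."""
    if trial_gpu_num <= 0:
        # Trials that request no GPU are limited only by trialConcurrency.
        running = trial_concurrency
    else:
        # Assumption: each running trial holds trial_gpu_num whole GPUs.
        running = min(trial_concurrency, free_gpus // trial_gpu_num)
    return running, trial_concurrency - running


# trialConcurrency: 3, two free GPUs, one GPU per trial.
print(split_running_and_queued(3, 2, 1))  # (2, 1)
```

Under this model, two trials run and one waits in the queue until a GPU is released.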

maxExecDuration
- Description
  
  maxExecDuration specifies the maximum duration of an experiment. The time unit is one of {s, m, h, d}, meaning {seconds, minutes, hours, days}.
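To make the unit suffixes concrete, here is a minimal sketch that converts a value such as `1h` into seconds. It assumes the value is a whole number immediately followed by one unit letter, as in the examples in this document. The helper `duration_to_seconds` is hypothetical and not part of NNI.

```python
import re

# Seconds per unit, following the {s, m, h, d} units described above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(value: str) -> int:
    """Convert a duration such as '1h' or '30m' to seconds (hypothetical helper)."""
    # Assumption: a whole number followed by exactly one unit letter.
    match = re.fullmatch(r"(\d+)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"invalid duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


print(duration_to_seconds("1h"))  # 3600
```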
maxTrialNum
- Description
  
  maxTrialNum specifies the maximum number of trial jobs created by NNI, including succeeded and failed jobs.
trainingServicePlatform
- Description
  
  trainingServicePlatform specifies the platform on which to run the experiment; the choices are {local, remote}.
  - local mode means you run the experiment on your local Linux machine.
  - remote mode means you submit trial jobs to remote Linux machines. If you set the platform to remote, you must also fill in the machineList field.
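The rule linking the platform to machineList can be written as a small pre-flight check. This is only a sketch: it assumes the YAML file has already been loaded into a Python dict, and `check_platform` is a hypothetical helper, not something nnictl provides.

```python
def check_platform(config: dict) -> list:
    """Return a list of problems with the platform settings (hypothetical helper)."""
    errors = []
    platform = config.get("trainingServicePlatform")
    if platform not in ("local", "remote"):
        errors.append(f"trainingServicePlatform must be local or remote, got {platform!r}")
    elif platform == "remote" and not config.get("machineList"):
        # remote mode needs at least one machine to submit trial jobs to.
        errors.append("machineList must be set when trainingServicePlatform is remote")
    return errors


print(check_platform({"trainingServicePlatform": "local"}))  # []
```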
searchSpacePath
- Description
  
  searchSpacePath specifies the path of the search space file you want to use, which should be a valid path on your local Linux machine.
```
Note: if you set useAnnotation to true, remove the searchSpacePath field or leave it empty.
```
useAnnotation
- Description
  
  useAnnotation specifies whether to use annotation to analyze your code and generate the search space.
```
Note: if you set useAnnotation to true, do not set searchSpacePath.
```
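The interaction between useAnnotation and searchSpacePath can be checked before an experiment is submitted. The sketch below assumes the config has been loaded into a dict, and `check_search_space` is a hypothetical helper. The first rule comes from the notes above. The second rule (a search space file is needed when annotation is off) is an assumption inferred from the templates and is not stated in this document.

```python
def check_search_space(config: dict) -> list:
    """Return a list of problems with the search space settings (hypothetical helper)."""
    errors = []
    use_annotation = bool(config.get("useAnnotation", False))
    search_space = config.get("searchSpacePath")
    if use_annotation and search_space:
        # Documented rule: annotation generates the search space itself.
        errors.append("searchSpacePath must be empty or removed when useAnnotation is true")
    if not use_annotation and not search_space:
        # Assumption: without annotation, a search space file is needed.
        errors.append("searchSpacePath should be set when useAnnotation is false")
    return errors


print(check_search_space({"useAnnotation": True}))  # []
```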
tuner
- Description
  
  tuner specifies the tuner algorithm used to run an experiment. There are two ways to set the tuner. One way is to use a tuner provided by the NNI SDK, in which case you only need to set builtinTunerName and classArgs. The other way is to use your own tuner file, in which case you need to set codeDir, classFileName, className and classArgs.
- builtinTunerName and classArgs
  - builtinTunerName
    
    builtinTunerName specifies the name of the built-in tuner you want to use. The NNI SDK provides four tuners: {TPE, Random, Anneal, Evolution}.
  - classArgs
    
    classArgs specifies the arguments of the tuner algorithm.
- codeDir, classFileName, className and classArgs
  - codeDir
    
    codeDir specifies the directory of the tuner code.
  - classFileName
    
    classFileName specifies the name of the tuner file.
  - className
    
    className specifies the name of the tuner class.
  - classArgs
    
    classArgs specifies the arguments of the tuner algorithm.
- gpuNum
  
  gpuNum specifies the number of GPUs used to run the tuner process. The value of this field should be a non-negative integer; the examples in this document use 0.
```
Note: you can set the tuner in only one way: set either {builtinTunerName, classArgs} or {codeDir, classFileName, className, classArgs}, but not both.
```
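The "only one way" rule can be sketched as a check on the tuner section. The code assumes the section has been loaded into a dict and uses the field names from the templates in this document. `tuner_mode` is a hypothetical helper, not an NNI function.

```python
# Fields that identify a user-provided tuner, as listed above.
CUSTOM_KEYS = ("codeDir", "classFileName", "className")


def tuner_mode(tuner: dict) -> str:
    """Return 'builtin' or 'custom', or raise if the section mixes both (hypothetical helper)."""
    builtin = bool(tuner.get("builtinTunerName"))
    custom = {key for key in CUSTOM_KEYS if tuner.get(key)}
    if builtin and custom:
        raise ValueError("set either builtinTunerName or the custom tuner fields, not both")
    if builtin:
        return "builtin"
    if custom == set(CUSTOM_KEYS):
        return "custom"
    raise ValueError("tuner needs builtinTunerName, or codeDir, classFileName and className")


print(tuner_mode({"builtinTunerName": "TPE", "classArgs": {"optimize_mode": "maximize"}}))  # builtin
```

The same check would apply to the assessor section with builtinAssessorName in place of builtinTunerName.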
assessor
- Description
  
  assessor specifies the assessor algorithm used to run an experiment. There are two ways to set the assessor. One way is to use an assessor provided by the NNI SDK, in which case you only need to set builtinAssessorName and classArgs. The other way is to use your own assessor file, in which case you need to set codeDir, classFileName, className and classArgs.
- builtinAssessorName and classArgs
  - builtinAssessorName
    
    builtinAssessorName specifies the name of the built-in assessor you want to use. The NNI SDK provides the {Medianstop} assessor.
  - classArgs
    
    classArgs specifies the arguments of the assessor algorithm.
- codeDir, classFileName, className and classArgs
  - codeDir
    
    codeDir specifies the directory of the assessor code.
  - classFileName
    
    classFileName specifies the name of the assessor file.
  - className
    
    className specifies the name of the assessor class.
  - classArgs
    
    classArgs specifies the arguments of the assessor algorithm.
- gpuNum
  
  gpuNum specifies the number of GPUs used to run the assessor process. The value of this field should be a non-negative integer; the examples in this document use 0.
```
Note: you can set the assessor in only one way: set either {builtinAssessorName, classArgs} or {codeDir, classFileName, className, classArgs}, but not both. If you do not want to use an assessor, leave the assessor field empty or remove it from your config file. Default value is 0.
```
trial
- command
  
  command specifies the command that runs the trial process.
- codeDir
  
  codeDir specifies the directory of your own trial file.
- gpuNum
  
  gpuNum specifies the number of GPUs used to run your trial process. Default value is 0.
machineList

machineList must be set when trainingServicePlatform is remote; otherwise it can be left empty.
- ip
  
  ip is the ip address of your remote machine.
- port
  
  port is the SSH port used to connect to the machine.
```
Note: if you leave port empty, the default value 22 is used.
```
- username
  
  username is the account you use.
- passwd
  
  passwd specifies the password of your account.
- sshKeyPath
  
  If you want to use an SSH key to log in to the remote machine, set sshKeyPath in the config file. sshKeyPath is the path of the SSH key file, which must be valid.
```
Note: if you set both passwd and sshKeyPath, NNI will try passwd.
```
- passphrase
  
  passphrase protects the SSH key; it can be left empty if your key has no passphrase.
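The machineList rules (default port 22, passwd chosen when both passwd and sshKeyPath are set) can be summarized in a short sketch. It assumes each entry has been loaded into a dict. `resolve_machine` and the `method` key in its result are hypothetical and only illustrate the documented behaviour.

```python
def resolve_machine(entry: dict) -> dict:
    """Apply the documented defaults to one machineList entry (hypothetical helper)."""
    resolved = {
        "ip": entry["ip"],
        "username": entry["username"],
        # An empty port falls back to the default SSH port 22.
        "port": entry.get("port") or 22,
    }
    if entry.get("passwd"):
        # When both passwd and sshKeyPath are set, passwd is the one tried.
        resolved["method"] = "passwd"
        resolved["passwd"] = entry["passwd"]
    elif entry.get("sshKeyPath"):
        resolved["method"] = "sshKey"
        resolved["sshKeyPath"] = entry["sshKeyPath"]
        # passphrase may be empty if the key is not protected.
        resolved["passphrase"] = entry.get("passphrase")
    else:
        raise ValueError("set passwd or sshKeyPath for each machine")
    return resolved


machine = resolve_machine({"ip": "10.10.10.12", "username": "test", "sshKeyPath": "/nni/sshkey"})
print(machine["port"], machine["method"])  # 22 sshKey
```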

Examples

local mode

If you want to run your trial jobs on your local machine and use annotation to generate the search space, you could use the following config:

authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
#choice: true, false
useAnnotation: true
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0

If you want to use an assessor, add the assessor configuration to your file:

authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
assessor:
  #choice: Medianstop
  builtinAssessorName: Medianstop
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0

Or you could specify your own tuner and assessor files as follows:

authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: local
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  codeDir: /nni/tuner
  classFileName: mytuner.py
  className: MyTuner
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
assessor:
  codeDir: /nni/assessor
  classFileName: myassessor.py
  className: MyAssessor
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0

remote mode

If you want to run trial jobs on remote machines, specify the remote machine information in the following format:

authorName: test
experimentName: test_experiment
trialConcurrency: 3
maxExecDuration: 1h
maxTrialNum: 10
#choice: local, remote
trainingServicePlatform: remote
searchSpacePath: /nni/search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
  gpuNum: 0
trial:
  command: python3 mnist.py
  codeDir: /nni/mnist
  gpuNum: 0
#machineList can be empty if the platform is local
machineList:
  - ip: 10.10.10.10
    port: 22
    username: test
    passwd: test
  - ip: 10.10.10.11
    port: 22
    username: test
    passwd: test
  - ip: 10.10.10.12
    port: 22
    username: test
    sshKeyPath: /nni/sshkey
    passphrase: qwert
