OpenPAI Job Examples
Table of Contents
- Quick start: how to write and submit a CIFAR-10 job
- List of off-the-shelf examples
- List of customized job templates
- Contributing
Quick start: how to write and submit a CIFAR-10 job
(1) Prepare a job json file
In this section, we use a CIFAR-10 training job as an example to explain how to write and submit a job in OpenPAI.
CIFAR-10 is an established computer-vision dataset used for image classification.
- Full example of a TensorFlow CIFAR-10 image classification training job on OpenPAI:
{
  // Name for the job, must be unique
  "jobName": "tensorflow-cifar10",
  // URL pointing to the Docker image for all tasks in the job
  "image": "openpai/pai.example.tensorflow",
  // Data directory existing on HDFS
  "dataDir": "/tmp/data",
  // Output directory on HDFS
  "outputDir": "/tmp/output",
  // List of task roles, at least one is required
  "taskRoles": [
    {
      // Name for the task role
      "name": "cifar_train",
      // Number of tasks for the task role, no less than 1
      "taskNumber": 1,
      // CPU number for one task in the task role, no less than 1
      "cpuNumber": 8,
      // Memory in MB for one task in the task role, no less than 100
      "memoryMB": 32768,
      // GPU number for one task in the task role, no less than 0
      "gpuNumber": 1,
      // Executable command for tasks in the task role, cannot be empty
      "command": "git clone https://github.com/tensorflow/models && cd models/research/slim && python download_and_convert_data.py --dataset_name=cifar10 --dataset_dir=$PAI_DATA_DIR && python train_image_classifier.py --batch_size=64 --model_name=inception_v3 --dataset_name=cifar10 --dataset_split_name=train --dataset_dir=$PAI_DATA_DIR --train_dir=$PAI_OUTPUT_DIR"
    }
  ]
}
- Save the content to a file named cifar10.json.
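The example above carries `//` comments, so a strict JSON parser such as Python's `json` module would reject it as written. The following is a minimal sketch, not part of OpenPAI, of one way to strip the whole-line comments and check the limits stated in those comments before saving the file. The helper names `strip_line_comments` and `check_job` are hypothetical, the job text is a shortened copy with an abbreviated command, and the checks cover only the constraints written in the comments above.

```python
import json


def strip_line_comments(text):
    """Drop lines that consist only of a // comment.

    Only whole-line comments are removed, so the "//" inside the
    https:// URL of the command string is left untouched.
    """
    kept = [line for line in text.splitlines()
            if not line.strip().startswith("//")]
    return "\n".join(kept)


def check_job(job):
    """Return a list of problems, using the limits from the job comments."""
    problems = []
    if not job.get("jobName"):
        problems.append("jobName is missing or empty")
    roles = job.get("taskRoles", [])
    if not roles:
        problems.append("at least one task role is required")
    minimums = {"taskNumber": 1, "cpuNumber": 1, "memoryMB": 100, "gpuNumber": 0}
    for role in roles:
        name = role.get("name", "<unnamed>")
        for field, minimum in minimums.items():
            value = role.get(field)
            if not isinstance(value, int) or value < minimum:
                problems.append(f"{name}: {field} must be an integer >= {minimum}")
        if not role.get("command"):
            problems.append(f"{name}: command cannot be empty")
    return problems


# Shortened copy of the job above; the command is abbreviated for illustration.
commented = """
{
// Name for the job, must be unique
"jobName": "tensorflow-cifar10",
"image": "openpai/pai.example.tensorflow",
"taskRoles": [
{
"name": "cifar_train",
"taskNumber": 1,
"cpuNumber": 8,
"memoryMB": 32768,
"gpuNumber": 1,
// Executable command, cannot be empty
"command": "git clone https://github.com/tensorflow/models"
}
]
}
"""

job = json.loads(strip_line_comments(commented))
problems = check_job(job)
print(problems)
```

An empty list means none of the stated limits is violated; it does not guarantee that OpenPAI will accept the job.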
(2) Submit job json file from OpenPAI webportal
To submit the job from the OpenPAI web portal, refer to the tutorial "submit a job in web portal".
List of off-the-shelf examples
These examples can be run by submitting the json file directly, without any modification.
- tensorflow.cifar10.json: Single GPU training on CIFAR-10 using TensorFlow.
- pytorch.mnist.json: Single GPU training on MNIST using PyTorch.
- pytorch.regression.json: Regression using PyTorch.
- mxnet.autoencoder.json: Autoencoder using MXNet.
- mxnet.image-classification.json: Image classification on MNIST using MXNet.
- serving.tensorflow.json: TensorFlow model serving.
List of customized job templates
Users can customize these job templates and run them on OpenPAI.
-
CNTK:
Contributing
If you want to contribute a job example that can be run on PAI, please open a new pull request.
- Prepare a folder under the pai/examples folder, for example pai/examples/caffe2/.
- Prepare the example files. Under the Caffe2 example directory, prepare these files for the contribution PR:
  - README.md: Introduction to the example
  - Dockerfile: The example's dependencies
  - PAI job json file: The example's OpenPAI job json template
  - [Optional] Code file: The example's code
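Before opening the pull request, the file list above can be checked with a short script. This is a minimal sketch under stated assumptions: the helper `missing_example_files` is hypothetical, any `*.json` file is treated as the job template because no fixed file name is given, and a temporary folder stands in for pai/examples/caffe2/.

```python
from pathlib import Path
import tempfile


def missing_example_files(example_dir):
    """Return the required items that are absent from an example folder."""
    folder = Path(example_dir)
    missing = [name for name in ("README.md", "Dockerfile")
               if not (folder / name).is_file()]
    # The job template has no fixed name here, so any *.json file counts.
    if not any(folder.glob("*.json")):
        missing.append("job json file")
    return missing


# Demonstration on a throwaway folder that stands in for pai/examples/caffe2/.
with tempfile.TemporaryDirectory() as tmp:
    example = Path(tmp) / "caffe2"
    example.mkdir()
    (example / "README.md").write_text("Example introduction\n")
    (example / "Dockerfile").write_text("# example dependencies\n")
    before = missing_example_files(example)
    (example / "caffe2.example.json").write_text("{}\n")  # hypothetical file name
    after = missing_example_files(example)

print(before)
print(after)
```

The optional code file is not checked, since the list above marks it as optional.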