scikit-learn on OpenPAI

This guide shows how to run scikit-learn jobs on OpenPAI. It walks through some basic scikit-learn examples; other customized scikit-learn code can be run in a similar way.

Contents

  1. scikit-learn MNIST digit recognition example
  2. scikit-learn text-vectorizers example

scikit-learn MNIST digit recognition example

To run scikit-learn examples on OpenPAI, you need to prepare a job configuration file and submit it through the web portal.

OpenPAI provides a pre-built Docker image with the environment this job requires. Refer to DOCKER.md to customize this example Docker environment. If you have built a customized image and pushed it to Docker Hub, replace the pre-built image openpai/pai.example.sklearn with your own.

Here are some example configuration files:

mnist

{
  "jobName": "sklearn-mnist",
  "image": "openpai/pai.example.sklearn",
  "taskRoles": [
    {
      "name": "main",
      "taskNumber": 1,
      "cpuNumber": 4,
      "memoryMB": 8192,
      "gpuNumber": 0,
      "command": "cd scikit-learn/benchmarks && python bench_mnist.py"
    }
  ]
}

scikit-learn text-vectorizers example

text-vectorizers

{
  "jobName": "sklearn-text-vectorizers",
  "image": "openpai/pai.example.sklearn",
  "taskRoles": [
    {
      "name": "main",
      "taskNumber": 1,
      "cpuNumber": 4,
      "memoryMB": 8192,
      "gpuNumber": 0,
      "command": "pip install memory_profiler && cd scikit-learn/benchmarks && python bench_text_vectorizers.py"
    }
  ]
}
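
The two configurations above differ only in the job name and the command. As a minimal sketch, they could be generated from one helper instead of being written by hand. The helper name make_sklearn_job is hypothetical and not part of OpenPAI; it only reproduces the fields shown in the examples above and does not claim to cover the full job configuration schema.

```python
import json


def make_sklearn_job(job_name, command, image="openpai/pai.example.sklearn"):
    """Build a single-task, CPU-only job config (hypothetical helper).

    Only the fields that appear in the examples above are emitted.
    """
    return {
        "jobName": job_name,
        "image": image,
        "taskRoles": [
            {
                "name": "main",
                "taskNumber": 1,
                "cpuNumber": 4,
                "memoryMB": 8192,
                "gpuNumber": 0,
                "command": command,
            }
        ],
    }


mnist_job = make_sklearn_job(
    "sklearn-mnist",
    "cd scikit-learn/benchmarks && python bench_mnist.py",
)
text_job = make_sklearn_job(
    "sklearn-text-vectorizers",
    "pip install memory_profiler && cd scikit-learn/benchmarks "
    "&& python bench_text_vectorizers.py",
)

# Serialize to JSON text that can be saved to a file and submitted.
mnist_json = json.dumps(mnist_job, indent=2)
text_json = json.dumps(text_job, indent=2)
print(mnist_json)
```

To use a customized image, pass it as the image argument; the rest of the configuration stays the same.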

For more details on how to write a job configuration file, refer to the job tutorial.

Note:

Since PAI runs scikit-learn jobs in Docker, the training speed on PAI should be similar to the speed on the host.
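
One way to check this for your own workload is to time the same command on the host and inside the job, then compare the two numbers. The sketch below is only an illustration: time_command is a hypothetical helper, and the command it runs here is a placeholder that you would replace with the actual benchmark invocation.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a subprocess; return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv, check=False)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Placeholder workload (assumption): swap in the real command,
# for example ["python", "bench_mnist.py"], when measuring.
exit_code, seconds = time_command(
    [sys.executable, "-c", "print('placeholder workload')"]
)
print(f"exit code {exit_code}, wall-clock time {seconds:.3f} s")
```

Run it once on the host and once as the job command, and compare the reported wall-clock times.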