remove deprecated projects (#5489)
remove notebook-extension, python-sdk, samba-aad-server, storage-plugin, submit-simple-job, cleaner, driver, end-to-end-test, job-exit-spec, and tools
@@ -178,11 +178,6 @@ jobs:
       uses: actions/setup-node@v1
       with:
         node-version: ${{ matrix.node }}
-    - name: Test contrib/submit-simple-job
-      run: |
-        cd contrib/submit-simple-job
-        npm install
-        npm test
     - name: Test contrib/submit-job-v2
       run: |
         cd contrib/submit-job-v2

@@ -1,155 +0,0 @@
# OpenPAI Submitter

***Note: OpenPAI Submitter is deprecated. New plugin support for Jupyter Notebook is under development.***

***OpenPAI Submitter*** is a Jupyter Notebook extension, created for easy-to-use job submission and management on OpenPAI clusters. Users can submit Jupyter jobs in one click and manage recent jobs in a flexible dialog.

![](docs_img/submitter-1.gif)

## How to Install

This extension requires **Python 3+** and Jupyter Notebook to work. Make sure you are using Jupyter Notebook with a Python 3 kernel.

Please use the following commands to install this extension (make sure you are in the correct `python` environment):

```bash
pip install --upgrade pip
git clone https://github.com/Microsoft/pai
cd pai/contrib/notebook-extension
python setup.py # add --user to avoid permission issues if necessary
```

This extension leverages the [Python SDK](https://github.com/microsoft/pai/tree/master/contrib/python-sdk) as its low-level implementation; the SDK is also installed by the commands above (pass `-i` to `setup.py` to skip installing the SDK).

Before starting, you need to provide basic information about your clusters; `<cluster-alias>` is a cluster name of your choosing. If you log in to your cluster with a username and password, use the following command to add it:

```bash
# for user/password authentication
opai cluster add --cluster-alias <cluster-alias> --pai-uri <pai-uri> --user <user> --password <password>
```

If you log in to your cluster by Azure AD authentication, the following command adds the cluster:

```bash
# for Azure AD authentication
opai cluster add --cluster-alias <cluster-alias> --pai-uri <pai-uri> --user <user> --token <token>
```

Now you can use the command `opai cluster list` to list all clusters.
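
For example:

```bash
# list all clusters that have been added
opai cluster list
```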

The following command deletes one of your clusters:

```bash
# delete a cluster by its alias
opai cluster delete <cluster-alias>
```

If you want to update some settings of a cluster (e.g. its alias, username, or password), it is recommended to delete the old cluster with `opai cluster delete <cluster-alias>`, then use `opai cluster add` to re-add it with the new settings. A more involved way is to edit the [YAML file](../python-sdk/#define-your-clusters) directly.

There are other ways to manage clusters; see the [SDK documentation](../python-sdk).
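
For reference, here is a minimal sketch of what a cluster entry in that YAML file may look like (the path and field names are assumed from the `opai cluster add` options above; consult the linked SDK documentation for the authoritative schema):

```yaml
# ~/.openpai/clusters.yaml (assumed location and layout)
- cluster_alias: my-cluster       # the alias you chose
  pai_uri: http://<pai-uri>
  user: <user>
  password: <password>            # or a token, for Azure AD authentication
```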

## Quick Start

Once installed, the extension adds two buttons to the notebook page, namely <img src="./docs_img/submit-button.png" height="25" width="25"> and <img src="./docs_img/job-button.png" height="25" width="25">.

Button <img src="./docs_img/submit-button.png" height="25" width="25"> is designed for job submission. Click it and the detailed cluster information will be loaded. Then click ***Quick Submit***. The extension will do the following work for you:

- Pack all files in the current folder into a `.zip` file and upload it to the cluster via WebHDFS.
- Generate job settings automatically, then submit the job.
- Wait until the notebook is ready.

The picture below shows the submission process:

![](docs_img/submitter-1.gif)

You can safely close the page while the extension is waiting. Once the notebook is ready, the submitter will show up and give you the notebook URL:

![](docs_img/submitter-2.gif)

**Note: The waiting process will take 5 to 10 minutes.** If you are not willing to wait, you can click the bottom link on the submitter to start a new session. The submitted job will not be lost; you can click <img src="./docs_img/job-button.png" height="25" width="25"> to find it.

### Submit as Interactive Notebook vs. Python Script vs. Silent Notebook

You can submit jobs in three ways:

- as an ***interactive notebook***
- as a ***Python script (.py file)***
- as a ***silent notebook***

The interactive mode is a quick way to submit the notebook you work on locally to the cluster. The notebook stays the same but gains access to GPU resources on the cluster. This mode is mainly designed for experimenting and debugging.

On the other hand, submitting the job as a `.py` file first converts the notebook to a Python script, then executes the script directly. This mode is well suited to deployment and batch submission.

If you submit a notebook as a silent notebook, you won't get an interactive notebook as in the interactive mode. Your notebook will be executed in the background, and once it is finished you can retrieve the result as a file. The difference from the Python script mode is that you cannot see the output while the silent notebook is running, but you can get `matplotlib` plots or other graphs from your notebook.

<img src="docs_img/submit-form.png" width="65%;" />

### Advanced job configuration

#### Set up frequently used `docker-images` and `resources`

As shown in the example figure above, users can specify resources and a docker image by selection in the panel. Furthermore, you can add your frequently used docker images or resource combinations with:

```bash
opai set -g image-list+=<image-1> image-list+=<image-2> ...
opai set -g resource-list+="<#gpu>,<#cpu>,<#mem>" resource-list+="<#gpu>,<#cpu>,<#mem>" ...
```

Here `<#mem>` can be a number in units of MB, or a string like `32GB` (or `32g`).

For example, you can add `your.docker.image` and the resource spec `1 GPU, 4 vCores CPU, 3GB` by:

```bash
opai set -g image-list+=your.docker.image
opai set -g resource-list+="1,4,3gb"
```

After running the commands, restart the notebook kernel for them to take effect:

<img src="docs_img/restart-kernel.png" width="50%;" />

These settings are permanent since they are saved on disk. If you want to update, delete, or reorder them, you can edit the file `~/.openpai/defaults.yaml` (on Windows, the path is `C:\Users\<Username>\.openpai\defaults.yaml`) directly. Remember to restart the notebook kernel after editing as well.
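
As an illustration, the relevant entries in that file might look like the sketch below (the key names follow the `opai set` commands above; the actual contents depend on your settings):

```yaml
# ~/.openpai/defaults.yaml (illustrative sketch)
image-list:
  - your.docker.image
resource-list:
  - "1,4,3gb"
```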

#### Advanced configuration by `NotebookConfiguration`

In the submission panel, users can change the basic configuration of the job. Users who want to change the advanced configuration can use `NotebookConfiguration` in the notebook, from which the extension reads its settings.

For example, after executing the code below in a notebook cell, the extension will set the job's memory specification to 512 GB:

```python
from openpaisdk.notebook import NotebookConfiguration

NotebookConfiguration.set("mem", "512GB")
```

Execute the code below to get a quick look at all supported items in `NotebookConfiguration`:

```python
# print supported configuration items
NotebookConfiguration.print_supported_items()
```
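
A fuller sketch of a multi-item setup follows; the `"gpu"` and `"cpu"` item names are assumptions, so verify them against the output of `print_supported_items()` before relying on them:

```python
from openpaisdk.notebook import NotebookConfiguration

# assumed item names; confirm via NotebookConfiguration.print_supported_items()
NotebookConfiguration.set("gpu", 2)       # 2 GPUs
NotebookConfiguration.set("cpu", 16)      # 16 CPU cores
NotebookConfiguration.set("mem", "32GB")  # 32 GB memory
```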

### Quick Submit vs. Download Config

Only the pre-defined resource and docker image settings are available when you use the *Quick Submit* button to submit jobs. If you need different settings, you can click the *Download Config* button to get the job configuration file, then import it on the web portal for further configuration.

## Job Management

![](docs_img/recent-jobs.gif)

Clicking <img src="./docs_img/job-button.png" height="20" width="25"> will open the *Recent Jobs* panel. **This panel records all jobs submitted by this extension on this machine** (if a job is submitted in a different way, it won't show up). The panel shows some basic information about your jobs. It also shows the notebook URL **when the job is submitted as an interactive notebook and the notebook is ready.** The panel does not show completed jobs by default, but you can use the upper-right toggle to see all jobs.

## How to Update or Uninstall

To update this extension, please use the following commands:

```bash
git clone https://github.com/Microsoft/pai
cd pai/contrib/notebook-extension
jupyter nbextension install openpai_submitter
jupyter nbextension enable openpai_submitter/main
```

To disable this extension, please use the following command:

```bash
jupyter nbextension disable openpai_submitter/main
```

## Known Issues

- This extension is not compatible with *Variable Inspector*.
- This extension is not compatible with AdBlock.

## Feedback

Please use this [link](https://github.com/microsoft/pai/issues/new?title=[Jupyter%20Extension%20Feedback]) for feedback.

Binary file contrib/notebook-extension/docs_img/job-button.png removed (was 2.7 KiB)
Binary file contrib/notebook-extension/docs_img/recent-jobs.gif removed (was 387 KiB)
Binary file contrib/notebook-extension/docs_img/restart-kernel.png removed (was 45 KiB)
Binary file contrib/notebook-extension/docs_img/submit-button.png removed (was 3.5 KiB)
Binary file contrib/notebook-extension/docs_img/submit-form.png removed (was 41 KiB)
Binary file contrib/notebook-extension/docs_img/submitter-1.gif removed (was 328 KiB)
Binary file contrib/notebook-extension/docs_img/submitter-2.gif removed (was 327 KiB)

@@ -1,71 +0,0 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\"Hello, OpenPAI\"\n"
     ]
    }
   ],
   "source": [
    "! echo \"Hello, OpenPAI\""
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}

@@ -1 +0,0 @@
standard --env amd --env browser --env es6 --fix

@@ -1,3 +0,0 @@
# OpenPAI Submitter

A Jupyter Notebook plugin for quick job submission to an OpenPAI cluster.

@@ -1,117 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import json as openpai_ext_json
import threading as openpai_ext_threading

# guard against re-creating the lock when this code is executed again in the same kernel
# (the original checked the name 'openpai_ext_lock', which never matched the variable below)
if 'openpai_ext_buffer_lock' not in vars():
    openpai_ext_buffer_lock = openpai_ext_threading.Lock()


class openpai_ext_Storage(object):
    '''
    This class will not be run in multiple threads,
    but it may be run in multiple processes.
    It uses the file system to store information and sync between processes.
    '''

    def use_output(func):
        # decorator: print the wrapped method's return value as a token-tagged
        # JSON message on stdout, so the JavaScript callback can pick it up

        def func_wrapper(*args, **kwargs):
            token = args[1]
            args = args[0:1] + args[2:]
            ret = func(*args, **kwargs)
            openpai_ext_buffer_lock.acquire()
            print("__openpai${}__".format(token) + openpai_ext_json.dumps(
                {
                    'code': 0,
                    'message': ret,
                }
            ), flush=True)
            openpai_ext_buffer_lock.release()

        return func_wrapper

    def __init__(self, max_length=100):
        import os
        from openpaisdk import __flags__
        self.os = os
        self.max_length = max_length
        self.dirname = os.path.join(os.path.expanduser('~'), __flags__.cache)
        self.lock_path = os.path.join(self.dirname, "data.lock")
        self.data_path = os.path.join(self.dirname, "data")
        if not(os.path.exists(self.data_path)):
            self.data = []
            self.write_to_file()
        else:
            self.read_file()

    def acquire_lock(self):
        if self.os.path.exists(self.lock_path):
            raise Exception(
                'Unexpected lock file: {}! Please refresh the page or remove it manually!'.format(self.lock_path))
        with open(self.lock_path, 'w'):
            pass

    def release_lock(self):
        if not(self.os.path.exists(self.lock_path)):
            raise Exception('Missing lock file: {}! Please refresh the page.'.format(self.lock_path))
        self.os.remove(self.lock_path)

    def write_to_file(self):
        self.acquire_lock()
        try:
            with open(self.data_path, 'w') as f:
                openpai_ext_json.dump(self.data, f)
        except Exception:
            pass
        finally:
            self.release_lock()

    def read_file(self):
        with open(self.data_path) as f:
            self.data = openpai_ext_json.load(f)

    @use_output
    def get(self):
        self.read_file()
        return self.data

    @use_output
    def add(self, record):
        self.read_file()
        if len(self.data) == self.max_length:
            self.data = self.data[1:]
        self.data.append(record)
        self.write_to_file()
        return record

    @use_output
    def clear(self):
        self.data = []
        self.write_to_file()
        return ""

    @use_output
    def save(self, data):
        self.data = data
        self.write_to_file()
        return ""


openpai_ext_storage = openpai_ext_Storage()

@@ -1,6 +0,0 @@
Type: IPython Notebook Extension
Name: openpai_submitter
Description: A Jupyter Notebook plugin for quick job submission to an OpenPAI cluster.
Link: README.md
Main: main.js
Compatibility: 3.x, 4.x, 5.x, 6.x

@@ -1,81 +0,0 @@
// Copyright (c) Microsoft Corporation
// All rights reserved.
//
// MIT License
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

define([
  'require',
  'jquery',
  'base/js/namespace',
  'base/js/events',
  '//cdn.datatables.net/1.10.19/js/jquery.dataTables.min.js',
  'nbextensions/openpai_submitter/scripts/panel',
  'nbextensions/openpai_submitter/scripts/panel_recent'
], function (requirejs, $, Jupyter, events, _, Panel, PanelRecent) {
  function loadCss (filename) {
    var cssUrl = requirejs.toUrl(filename)
    $('head').append(
      $('<link rel="stylesheet" type="text/css" />')
        .attr('href', cssUrl)
    )
  }

  function registerButtonPanel () {
    var handler = function () {
      panel.send(panel.MSG.CLICK_BUTTON)
    }
    var action = {
      icon: 'fa-rocket', // a font-awesome class used on buttons, etc
      help: 'openpai-submitter',
      help_index: 'zz',
      handler: handler
    }
    var prefix = 'my_extension'
    var actionName = 'show-panel'
    var fullActionName = Jupyter.actions.register(action, actionName, prefix)
    Jupyter.toolbar.add_buttons_group([fullActionName])
  }

  function registerButtonPanelRecent () {
    var handler = function () {
      panelRecent.send(panelRecent.MSG.CLICK_BUTTON)
    }
    var action = {
      icon: 'fa-list-alt', // a font-awesome class used on buttons, etc
      help: 'openpai-submitter',
      help_index: 'zz',
      handler: handler
    }
    var prefix = 'my_extension'
    var actionName = 'show-panel-recent'
    var fullActionName = Jupyter.actions.register(action, actionName, prefix)
    Jupyter.toolbar.add_buttons_group([fullActionName])
  }

  var panel = Panel()
  var panelRecent = PanelRecent()

  function loadIPythonExtension () {
    loadCss('./misc/style.css')
    loadCss('//cdn.datatables.net/1.10.19/css/jquery.dataTables.min.css')
    panel.send(panel.MSG.PLEASE_INIT)
    panelRecent.send(panelRecent.MSG.PLEASE_INIT)
    registerButtonPanel()
    registerButtonPanelRecent()
    panel.bindPanelRecent(panelRecent)
    panelRecent.bindPanel(panel)
  }
  return {
    load_ipython_extension: loadIPythonExtension
  }
})

@@ -1,251 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import threading as openpai_ext_threading
import json as openpai_ext_json
from openpaisdk import __flags__ as openpai_ext_flags

openpai_ext_flags.disable_to_screen = True

# guard against re-creating the lock when this code is executed again in the same kernel
# (the original checked the name 'openpai_ext_lock', which never matched the variable below)
if 'openpai_ext_buffer_lock' not in vars():
    openpai_ext_buffer_lock = openpai_ext_threading.Lock()


class openpai_ext_Thread(openpai_ext_threading.Thread):
    '''
    In Javascript:
        Each time the code executed by Jupyter.notebook.kernel.execute gives output,
        the callback function in callbacks.iopub.output will receive a message.

    In Python:
        We run python code in a new thread to avoid blocking the notebook.
        The handler is set to print json messages,
        thus the callback in javascript will get notified.
    '''

    def success_handler(self, ret):
        openpai_ext_buffer_lock.acquire()
        print("__openpai${}__".format(self.token) + openpai_ext_json.dumps(
            {
                'code': 0,
                'message': ret,
            }
        ), flush=True)
        openpai_ext_buffer_lock.release()

    def err_handler(self, e):
        openpai_ext_buffer_lock.acquire()
        print("__openpai${}__".format(self.token) + openpai_ext_json.dumps(
            {
                'code': -1,
                'message': str(e),
            }
        ), flush=True)
        openpai_ext_buffer_lock.release()

    def __init__(self, target, token, args=[], kwargs={}):
        super(openpai_ext_Thread, self).__init__()
        self.target = target
        self.token = token
        self.args = args
        self.kwargs = kwargs

    def run(self):
        try:
            ret = self.target(*self.args, **self.kwargs)
            self.success_handler(ret)
        except Exception:
            import traceback
            self.err_handler(traceback.format_exc())


class openpai_ext_Interface(object):

    def __init__(self):
        from openpaisdk import LayeredSettings, ClusterList
        if LayeredSettings.get('container-sdk-branch') != 'master':
            LayeredSettings.update('user_basic', 'container-sdk-branch', 'master')
        self.cll = ClusterList().load()

    def execute(self, target, token, args=[], kwargs={}):
        t = openpai_ext_Thread(target, token, args, kwargs)
        t.start()

    def tell_resources(self, token):
        self.execute(self.cll.tell, token)

    def available_resources(self, token):
        self.execute(self.cll.available_resources, token)

    def read_defaults(self, token):
        def _read_defaults_helper():
            from openpaisdk import LayeredSettings
            from openpaisdk.job import JobResource
            # add default settings
            image_list = LayeredSettings.get('image-list')
            if image_list is None or len(image_list) == 0:
                # add default images here
                default_images = [
                    'openpai/pytorch-py36-cu90',
                    'openpai/pytorch-py36-cpu',
                    'openpai/tensorflow-py36-cu90',
                    'openpai/tensorflow-py36-cpu',
                ]
                for image in default_images:
                    LayeredSettings.update('global_default', 'image-list', image)
                image_list = LayeredSettings.get('image-list')
            resource_list = JobResource.parse_list(LayeredSettings.get('resource-list'))
            if resource_list is None or len(resource_list) == 0:
                # add default resources here
                default_resources = [
                    '1,4,8g',
                    '1,8,16g',
                    '0,4,8g',
                    '2,8,16g',
                    '4,16,32g',
                ]
                for resource in default_resources:
                    LayeredSettings.update('global_default', 'resource-list', resource)
                resource_list = JobResource.parse_list(LayeredSettings.get('resource-list'))
            return {
                'image-list': image_list,
                'resource-list': resource_list,
                'web-default-form': LayeredSettings.get('web-default-form'),
                'web-default-image': LayeredSettings.get('web-default-image'),
                'web-default-resource': LayeredSettings.get('web-default-resource'),
            }
        self.execute(_read_defaults_helper, token)

    def __set_selected(self, ctx):
        from openpaisdk import LayeredSettings
        LayeredSettings.update('global_default', 'web-default-form', ctx['form'])
        LayeredSettings.update('global_default', 'web-default-image', ctx['docker_image'])
        LayeredSettings.update('global_default', 'web-default-resource', ','.join([str(ctx['gpu']), str(ctx['cpu']), str(ctx['memoryMB'])]))

    def __submit_job_helper(self, ctx):
        import tempfile
        import os
        import yaml
        from openpaisdk import Job
        from openpaisdk import LayeredSettings
        from openpaisdk.notebook import get_notebook_path

        # save settings
        self.__set_selected(ctx)

        # setting layers description
        # layer name     | from                                  : priority
        # user_advanced  | NotebookConfiguration.set             : 0
        # user_basic     | extension panel selection             : 1
        # local_default  | defaults in .openpai/defaults.yaml    : 2
        # global_default | defaults in ~/.openpai/defaults.yaml  : 3
        # -              | predefined in flags.py                : 4
        LayeredSettings.update("user_basic", "cluster-alias", ctx['cluster'])
        LayeredSettings.update("user_basic", "virtual-cluster", ctx['vc'])
        LayeredSettings.update("user_basic", "image", ctx['docker_image'])
        LayeredSettings.update("user_basic", "cpu", ctx['cpu'])
        LayeredSettings.update("user_basic", "gpu", ctx['gpu'])
        LayeredSettings.update("user_basic", "memoryMB", ctx['memoryMB'])

        cfgs = LayeredSettings.as_dict()

        notebook_path = get_notebook_path()
        _, _, sources = next(os.walk('.'))

        if ctx['form'] == 'file':
            jobname = 'python_' + tempfile.mkdtemp()[-8:]
            mode = 'script'
        elif ctx['form'] == 'notebook':
            jobname = 'jupyter_' + tempfile.mkdtemp()[-8:]
            mode = 'interactive'
        else:
            jobname = 'silent_' + tempfile.mkdtemp()[-8:]
            mode = 'silent'

        job = Job(jobname)\
            .from_notebook(
                nb_file=get_notebook_path(),
                cluster={
                    'cluster_alias': cfgs['cluster-alias'],
                    'virtual_cluster': cfgs['virtual-cluster'],
                    'workspace': cfgs['workspace'],
                },
                mode=mode,
                **{
                    'token': '',
                    'image': cfgs["image"],
                    'resources': {
                        'cpu': cfgs["cpu"],
                        'gpu': cfgs["gpu"],
                        'memoryMB': cfgs["memoryMB"],
                        'mem': cfgs['mem']
                    },
                    'sources': sources + cfgs["sources"],
                    'pip_installs': cfgs["pip-installs"],
                }
            )
        ctx['job_config'] = yaml.dump(job.get_config(), default_flow_style=False)
        ctx['jobname'] = job.name
        if ctx['type'] == 'quick':
            ret = job.submit()
            ctx['joblink'] = ret['job_link']
            ctx['jobname'] = ret['job_name']
        return ctx

    def submit_job(self, token, ctx):
        self.execute(self.__submit_job_helper, token, args=[ctx])

    def __wait_jupyter_helper(self, ctx):
        from openpaisdk import Job
        job = Job(ctx['jobname']).load(cluster_alias=ctx['cluster'])
        ret = job.wait()
        ret = job.connect_jupyter()  # ret will be None if run in silent mode and without this
        ctx['state'] = ret['state']
        if ret['notebook'] is None:
            ctx['notebook_url'] = '-'
        else:
            ctx['notebook_url'] = ret['notebook']
        return ctx

    def wait_jupyter(self, token, ctx):
        self.execute(self.__wait_jupyter_helper, token, args=[ctx])

    def __detect_jobs_helper(self, jobs_ctx):
        from openpaisdk import Job
        ret = []
        for ctx in jobs_ctx:
            try:
                job = Job(ctx['jobname']).load(cluster_alias=ctx['cluster'])
                job_info = job.connect_jupyter()
                ctx['state'] = job_info['state']
                ctx['notebook_url'] = job_info['notebook']
                if ctx['notebook_url'] is None:
                    ctx['notebook_url'] = '-'
            except Exception as e:
                ctx['state'] = '<span title="{}">UNKNOWN</span>'.format(e)
                ctx['notebook_url'] = '-'
            finally:
                ret.append(ctx)
        return ret

    def detect_jobs(self, token, jobs_ctx):
        self.execute(self.__detect_jobs_helper, token, args=[jobs_ctx])


openpai_ext_interface = openpai_ext_Interface()

Binary file contrib/notebook-extension/openpai_submitter/misc/loading.gif removed (was 50 KiB)
Binary file contrib/notebook-extension/openpai_submitter/misc/pailogo.jpg removed (was 26 KiB)

@@ -1,161 +0,0 @@
.openpai-wrapper{
  position: fixed !important; /* removing !important will cause problems */
  border: thin solid rgba(0, 0, 0, 0.38);
  border-radius: 5px;
  padding: 10px;
  background-color: #fff;
  opacity: .95;
  z-index: 100;
  overflow: hidden;
}

#openpai-panel-wrapper{
  width: 629px;
  height: 566px;
  top: 10%;
  left: 50%;
}

#openpai-panel-recent-wrapper{
  width: 720px;
  height: 400px;
  top: 10%;
  left: 30%;
}

#openpai-panel, #openpai-panel-recent{
  margin-left: 10px;
  margin-right: 10px;
  margin-top: 5px;
}

.openpai-float-right{
  float: right;
}

.openpai-inline{
  display: inline;
}

.openpai-header-text{
  vertical-align: middle;
  padding-left: 5px;
}

.openpai-button{
  padding-top: 8px;
}

.openpai-panel-header{
  margin-bottom: 10px;
  margin-right: 5px;
  margin-left: 5px;
  margin-top: 2px;
}

table, th{
  text-align: center;
}

.openpai-fieldset{
  border: 1px solid #c0c0c0;
  margin: 0 2px;
  padding: 0.35em 0.625em 0.75em;
}

.openpai-legend{
  font-size: 17px;
  line-height: inherit;
  border: 0;
  padding: 2px;
  width: auto;
  margin-bottom: 0px;
}

#basic-setting-fieldset{
  margin-bottom: 10px;
  padding-bottom: 0.3em;
  padding-top: 0.8em;
  padding-left: 1em;
  padding-right: 0.35em;
}

.loading-img{
  width: 45px;
  height: 45px;
}

.loading-img-small{
  width: 19px;
  height: 19px;
}

.switch {
  position: relative;
  display: inline-block;
  width: 36px;
  height: 20px;
  margin-bottom: 0px;
}

.switch input {
  opacity: 0;
  width: 0;
  height: 0;
}

.slider {
  position: absolute;
  cursor: pointer;
  top: 0;
  left: 0;
  right: 0;
  bottom: 0;
  background-color: #ccc;
  -webkit-transition: .3s;
  transition: .3s;
}

.slider:before {
  position: absolute;
  content: "";
  height: 15px;
  width: 15px;
  left: 4px;
  bottom: 3px;
  background-color: white;
  -webkit-transition: .3s;
  transition: .3s;
}

input:checked + .slider {
  background-color: #337ab7;
}

input:focus + .slider {
  box-shadow: 0 0 1px #2196F3;
}

input:checked + .slider:before {
  -webkit-transform: translateX(14px);
  -ms-transform: translateX(14px);
  transform: translateX(14px);
}

.slider.round {
  border-radius: 30px;
}

.slider.round:before {
  border-radius: 50%;
}

#openpai-hide-jobs-toggle{
  margin-right: 1em;
  margin-top: 0.2em;
  margin-bottom: 0.2em;
}

.openpai-fieldset .openpai-table-button{
  color: #337ab7;
  line-height: 1;
}

@@ -1,23 +0,0 @@
// Copyright (c) Microsoft Corporation
// All rights reserved.
//
// MIT License
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

define([], function () {
  return {
    plugin_name: 'openpai_submitter',
    panel_toggle_speed: 400
  }
})

@@ -1,175 +0,0 @@
// Copyright (c) Microsoft Corporation
// All rights reserved.
//
// MIT License
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

define([
  'require',
  'jquery',
  'base/js/namespace',
  'base/js/events',
  'nbextensions/openpai_submitter/scripts/config'
],
function (requirejs, $, Jupyter, events, config) {
  var panel
  var codeMain
  var codeStorage
  var pool = [] // {token: "token", resolveFunc: resolveFunc, rejectFunc: rejectFunc}

  function getToken () {
    return Math.random().toString(36).substring(2, 6) + Math.random().toString(36).substring(2, 6)
  }

  function initiate (panelInstance, resolve, reject) {
    /* save the python code to codeMain */
    panel = panelInstance
    var mainUrl = requirejs.toUrl('../main.py')
    var storageUrl = requirejs.toUrl('../data.py')
    var loadMain = new Promise(
      function (resolve, reject) {
        $.get(mainUrl).done(function (data) {
          codeMain = data
          resolve()
        })
      })
    var loadStorage = new Promise(
      function (resolve, reject) {
        $.get(storageUrl).done(function (data) {
          codeStorage = data
          resolve()
        })
      })
    Promise.all([loadMain, loadStorage]).then(
      () => resolve()
    ).catch((e) => reject(e))
  }

  var getIOPub = function (resolve, reject) {
    return {
      output: function (msg) {
        /*
          A callback to handle python execution.
          Note: This function will be executed multiple times,
          if any stdout/stderr comes out.
        */
        function parseSingleOutput (token, msgContent) {
          /*
            msgContent: parsed JSON, such as: {"code": 0, "message": ""}
          */
          for (var pooledToken in pool) {
            if (pooledToken === token) {
              if (msgContent['code'] !== 0) { pool[token]['rejectFunc'](msgContent['message']) } else { pool[token]['resolveFunc'](msgContent['message']) }
              delete pool[token]
              return
            }
          }
          console.error('[openpai submitter] Unknown token', token)
        }
        console.log('[openpai submitter] [code return]:', msg)
        if (msg.msg_type === 'error') {
          reject(msg.content.evalue)
        } else if (msg.content.name !== 'stdout') {
          // ignore any info which is not stdout
          console.error(msg.content.text)
        } else {
          try {
            var m = msg.content.text
            var tokens = m.match(/__openpai\$(.{8})__/g)
            if (tokens === null || tokens.length === 0) {
              console.error(m)
              return
            }
            var splittedMSG = m.split(/__openpai\$.{8}__/)
            var i = 0
            for (var item of splittedMSG) {
              item = $.trim(item)
              if (item === '') continue
              var jsonMSG = JSON.parse(item)
              parseSingleOutput(tokens[i].substr(10, 8), jsonMSG)
              i += 1
            }
          } catch (e) {
            console.error(e)
          }
        }
      }
    }
  }

  // return a promise
  function executePromise (initCode, code) {
    return new Promise(
      function (resolve, reject) {
        if (!(Jupyter.notebook.kernel.is_connected())) {
          console.error('Cannot find active kernel.')
          throw new Error('Cannot find active kernel. Please wait until the kernel is ready and refresh.')
        }
        resolve()
      }
    ).then(
      function () {
        console.log('[openpai submitter] [code executed]:' + code)
        return new Promise(
          function (resolve, reject) {
            /* replace <openpai_token> with real token */
            var token = getToken()
            code = code.replace('<openpai_token>', token)
            var codeMerged = initCode + '\n' + code
            /* register final resolve / reject */
            pool[token] = {
              resolveFunc: resolve,
              rejectFunc: reject
            }
            /* execute */
            Jupyter.notebook.kernel.execute(
              codeMerged, {
                iopub: getIOPub(resolve, reject)
              }
            )
          })
      }
    )
  }

  return {
    initiate: initiate,

    // main api
    read_defaults:
      () => executePromise(codeMain, 'openpai_ext_interface.read_defaults("<openpai_token>")'),
    tell_resources:
      () => executePromise(codeMain, 'openpai_ext_interface.tell_resources("<openpai_token>")'),
    available_resources:
      () => executePromise(codeMain, 'openpai_ext_interface.available_resources("<openpai_token>")'),
    zip_and_upload:
      (ctx) => executePromise(codeMain, 'openpai_ext_interface.zip_and_upload("<openpai_token>",' + JSON.stringify(ctx) + ')'),
    submit_job:
      (ctx) => executePromise(codeMain, 'openpai_ext_interface.submit_job("<openpai_token>",' + JSON.stringify(ctx) + ')'),
    wait_jupyter:
      (ctx) => executePromise(codeMain, 'openpai_ext_interface.wait_jupyter("<openpai_token>",' + JSON.stringify(ctx) + ')'),
    detect_jobs:
      (jobsCtx) => executePromise(codeMain, 'openpai_ext_interface.detect_jobs("<openpai_token>",' + JSON.stringify(jobsCtx) + ')'),

    // storage api
    add_job:
      (record) => executePromise(codeStorage, 'openpai_ext_storage.add("<openpai_token>",' + JSON.stringify(record) + ')'),
    get_jobs:
      () => executePromise(codeStorage, 'openpai_ext_storage.get("<openpai_token>")'),
    save_jobs:
      (data) => executePromise(codeStorage, 'openpai_ext_storage.save("<openpai_token>", ' + JSON.stringify(data) + ')')

  }
}
)

@ -1,539 +0,0 @@
|
|||
// Copyright (c) Microsoft Corporation
|
||||
// All rights reserved.
|
||||
//
|
||||
// MIT License
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
define([
|
||||
'require',
|
||||
'jquery',
|
||||
'base/js/namespace',
|
||||
'base/js/events',
|
||||
'nbextensions/openpai_submitter/scripts/config',
|
||||
'nbextensions/openpai_submitter/scripts/interface',
|
||||
'nbextensions/openpai_submitter/scripts/utils'
|
||||
],
|
||||
function (requirejs, $, Jupyter, events, config, Interface, Utils) {
|
||||
function Panel () {
|
||||
var STATUS_R = [
|
||||
'NOT_READY',
|
||||
'READY_NOT_LOADING',
|
||||
'READY_LOADING',
|
||||
'SHOWING_INFO',
|
||||
'SUBMITTING_1',
|
||||
'SUBMITTING_2',
|
||||
'SUBMITTING_3',
|
||||
'SUBMITTING_OK',
|
||||
'CANCELLING',
|
||||
'ERROR',
|
||||
'FATAL'
|
||||
]
|
||||
var MSG_R = [
|
||||
'PLEASE_INIT',
|
||||
'INIT_OK',
|
||||
'CLICK_BUTTON',
|
||||
'CLICK_CLOSE',
|
||||
'CLICK_REFRESH',
|
||||
'SUBMIT_START_1',
|
||||
'SUBMIT_START_2',
|
||||
'SUBMIT_START_3',
|
||||
'SUBMIT_OK',
|
||||
'CANCEL',
|
||||
'ERROR',
|
||||
'FATAL_ERROR'
|
||||
]
|
||||
|
||||
var STATUS = {}
|
||||
for (var i = 0; i < STATUS_R.length; i += 1) { STATUS[STATUS_R[i]] = i }
|
||||
var MSG = {}
|
||||
for (var j = 0; j < MSG_R.length; j += 1) { MSG[MSG_R[j]] = j }
|
||||
|
||||
var set = function (s) {
|
||||
// console.log('[openpai submitter] set status', STATUS_R[s])
|
||||
status = s
|
||||
}
|
||||
|
||||
var status
|
||||
var panelRecent
|
||||
|
||||
set(STATUS.NOT_READY)
|
||||
|
||||
var speed = config.panel_toggle_speed
|
||||
|
||||
var showInformation = function (info) {
|
||||
/* this function will hide table and show information for users. */
|
||||
$('#panel-table-wrapper').hide()
|
||||
$('#panel-information').html(info)
|
||||
$('#panel-information-wrapper').show()
|
||||
}
|
||||
|
||||
var appendInformation = function (info) {
|
||||
$('#panel-information').append(info)
|
||||
}
|
||||
|
||||
var send = function (msg, value) {
|
||||
// console.log('[openpai submitter]', 'status:', STATUS_R[status], 'msg', MSG_R[msg], 'value', value)
|
||||
switch (msg) {
|
||||
case MSG.PLEASE_INIT:
|
||||
handleInit()
|
||||
break
|
||||
case MSG.INIT_OK:
|
||||
handleInitOK()
|
||||
break
|
||||
case MSG.CLICK_BUTTON:
|
||||
if (!($('#openpai-panel-wrapper').is(':visible'))) {
|
||||
if ((status !== STATUS.READY_LOADING) && (status !== STATUS.SUBMITTING_1) &&
|
||||
(status !== STATUS.SUBMITTING_2) && (status !== STATUS.SUBMITTING_3) &&
|
||||
(status !== STATUS.SUBMITTING_4) && (status !== STATUS.SUBMITTING_OK) &&
|
||||
(status !== STATUS.FATAL)) { send(MSG.CLICK_REFRESH) }
|
||||
}
|
||||
togglePanel()
|
||||
break
|
||||
case MSG.CLICK_CLOSE:
|
||||
closePanel()
|
||||
break
|
||||
case MSG.CLICK_REFRESH:
|
||||
handleRefresh()
|
||||
break
|
||||
case MSG.SUBMIT_START_1:
|
||||
handleSubmitStart1(value)
|
||||
break
|
||||
case MSG.ERROR:
|
||||
handleError(value)
|
||||
break
|
||||
case MSG.FATAL_ERROR:
|
||||
handleFatalError(value)
|
||||
break
|
||||
default:
|
||||
send(MSG.ERROR, 'unknown message received by panel!')
|
||||
}
|
||||
}
|
||||
|
||||
var handleInit = function () {
|
||||
var panelUrl = requirejs.toUrl('../templates/panel.html')
|
||||
var panel = $('<div id="openpai-panel-wrapper" class="openpai-wrapper"></div>').load(panelUrl)
|
||||
|
||||
Promise.all([
|
||||
/* Promise 1: add panel to html body and bind functions */
|
||||
panel.promise()
|
||||
.then(
|
||||
function () {
|
||||
panel.draggable()
|
||||
panel.toggle()
|
||||
$('body').append(panel)
|
||||
$('body').on('click', '#close-panel-button', function () {
|
||||
send(MSG.CLICK_CLOSE)
|
||||
})
|
||||
$('body').on('click', '#refresh-panel-button', function () {
|
||||
send(MSG.CLICK_REFRESH)
|
||||
})
|
||||
}
|
||||
)
|
||||
.then(
|
||||
() => Utils.set_timeout(600)
|
||||
).then(function () {
|
||||
panel.resizable()
|
||||
$('.openpai-logo').attr('src', requirejs.toUrl('../misc/pailogo.jpg'))
|
||||
$('#cluster-data')
|
||||
.DataTable({
|
||||
dom: 'rtip',
|
||||
order: [
|
||||
[2, 'desc']
|
||||
]
|
||||
})
|
||||
}),
|
||||
/* Promise 2: load python script */
|
||||
new Promise(function (resolve, reject) {
|
||||
Interface.initiate(panel, resolve, reject)
|
||||
})
|
||||
]).then(function (value) {
|
||||
send(MSG.INIT_OK, value)
|
||||
})
|
||||
.catch(function (err) {
|
||||
send(MSG.FATAL_ERROR, err)
|
||||
})
|
||||
}
|
||||
|
||||
var handleInitOK = function () {
|
||||
if (status === STATUS.NOT_READY) {
|
||||
if ($('#openpai-panel-wrapper').is(':visible')) {
|
||||
/* if the panel has been shown, then load the cluster info */
|
||||
set(STATUS.READY_NOT_LOADING)
|
||||
send(MSG.CLICK_REFRESH)
|
||||
} else {
|
||||
/* if the panel has not been shown, change the status to READY_NOT_LOADING and wait */
|
||||
showInformation('')
|
||||
set(STATUS.READY_NOT_LOADING)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
var handleRefresh = function () {
|
||||
if (status === STATUS.NOT_READY || status === STATUS.READY_LOADING) { return }
|
||||
if (status === STATUS.SUBMITTING_1 || status === STATUS.SUBMITTING_2 ||
|
||||
status === STATUS.SUBMITTING_3 || status === STATUS.SUBMITTING_4) {
|
||||
alert('Please do not refresh during submission.')
|
||||
return
|
||||
}
|
||||
if (status === STATUS.FATAL) {
|
||||
alert('Please refresh the whole page to reload this extension.')
|
||||
return
|
||||
}
|
||||
if (status === STATUS.SUBMIT_OK) {
|
||||
if (confirm('Are you sure to refresh? This will clear the current job!') === false) {
|
||||
return
|
||||
}
|
||||
}
|
||||
set(STATUS.READY_LOADING)
|
||||
showInformation('Loading the cluster information, please wait...' + Utils.getLoadingImg('loading-cluster-info'))
|
||||
Interface.read_defaults().then(function (data) {
|
||||
var resourceMenu = ''
|
||||
for (var item of data['resource-list']) {
|
||||
var memoryGB = parseInt(item['memoryMB'] / 1024)
|
||||
var optionValue = item['gpu'] + ',' + item['cpu'] + ',' + item['memoryMB']
|
||||
resourceMenu += '<option data-gpu="' + item['gpu'] + '" data-cpu="' + item['cpu'] + '" data-memory="' + item['memoryMB'] + '" value="' + optionValue + '">' + item['gpu'] + ' GPU, ' + item['cpu'] + ' vCores CPU, ' + memoryGB + ' GB memory</option>\n'
|
||||
}
|
||||
resourceMenu = $('<select name="resource-menu" id="resource-menu">' + resourceMenu + '</select>')
|
||||
var imageAliasDict = {
|
||||
'openpai/pytorch-py36-cu90': 'PyTorch + Python3.6 with GPU, CUDA 9.0',
|
||||
'openpai/pytorch-py36-cpu': 'PyTorch + Python3.6 with CPU',
|
||||
'openpai/tensorflow-py36-cu90': 'TensorFlow + Python3.6 with GPU, CUDA 9.0',
|
||||
'openpai/tensorflow-py36-cpu': 'TensorFlow + Python3.6 with CPU'
|
||||
}
|
||||
var imageMenu = ''
|
||||
var imageAlias
|
||||
for (var image of data['image-list']) {
|
||||
if (image in imageAliasDict) { imageAlias = imageAliasDict[image] } else { imageAlias = image }
|
||||
imageMenu += '<option value="' + image + '">' + imageAlias + '</option>'
|
||||
}
|
||||
imageMenu = $('<select name="docker-image-menu" id="docker-image-menu">' + imageMenu + '</select>')
|
||||
// select the first option
|
||||
|
||||
// append to html
|
||||
$('#resource-menu').remove()
|
||||
$('#docker-image-menu').remove()
|
||||
$('#resouce-menu-label').after(resourceMenu)
|
||||
$('#docker-image-menu-label').after(imageMenu)
|
||||
// select option
|
||||
var formMenu = $('#submit-form-menu')
|
||||
formMenu.find('option').removeAttr('selected')
|
||||
resourceMenu.find('option').removeAttr('selected')
|
||||
imageMenu.find('option').removeAttr('selected')
|
||||
function selectOption (menu, value) {
|
||||
if (value) {
|
||||
var option = menu.find('option[value="' + value + '"]')
|
||||
if (option.length > 0) { option.attr('selected', 'selected') } else { $(menu.find('option')[0]).attr('selected', 'selected') }
|
||||
} else { $(menu.find('option')[0]).attr('selected', 'selected') }
|
||||
}
|
||||
selectOption(formMenu, data['web-default-form'])
|
||||
selectOption(resourceMenu, data['web-default-resource'])
|
||||
selectOption(imageMenu, data['web-default-image'])
|
||||
}).then(
|
||||
() =>
|
||||
Interface.tell_resources().then(function (data) {
|
||||
var ret = []
|
||||
for (var cluster in data) {
|
||||
for (var vc in data[cluster]) {
|
||||
ret.push({
|
||||
cluster: cluster,
|
||||
vc: vc,
|
||||
gpu: {
|
||||
display: Utils.getLoadingImgSmall(),
|
||||
gpu_value: 0
|
||||
},
|
||||
button_sub: `<button class="submit_button" data-type="quick" data-cluster="${cluster}" data-vc="${vc}" >Quick Submit</button>`,
|
||||
button_edit: `<button class="submit_button" data-type="edit" data-cluster="${cluster}" data-vc="${vc}" >Download Config</button>`
|
||||
})
|
||||
}
|
||||
}
|
||||
$('#cluster-data')
|
||||
.DataTable({
|
||||
dom: 'rtip',
|
||||
order: [
|
||||
[2, 'desc']
|
||||
],
|
||||
destroy: true,
|
||||
data: ret,
|
||||
columns: [{
|
||||
data: 'cluster'
|
||||
}, {
|
||||
data: 'vc'
|
||||
}, {
|
||||
data: 'gpu',
|
||||
type: 'num',
|
||||
render: {
|
||||
_: 'display',
|
||||
sort: 'gpu_value'
|
||||
}
|
||||
}, {
|
||||
data: 'button_sub'
|
||||
}, {
|
||||
data: 'button_edit'
|
||||
}],
|
||||
initComplete: function () {
|
||||
set(STATUS.SHOWING_INFO)
|
||||
Interface.available_resources().then(function (clusterData) {
|
||||
var table = $('#cluster-data').DataTable()
|
||||
table.rows().every(function (rowIdx, tableLoop, rowLoop) {
|
||||
var tableData = this.data()
|
||||
var info = clusterData[tableData['cluster']][tableData['vc']]
|
||||
if (info === undefined) {
|
||||
tableData['gpu']['gpu_value'] = -2
|
||||
tableData['gpu']['display'] = '<a class="openpai-tooltip" href="#" title="Can\'t find this vc on cluster. Please use `opai cluster update` to update your cluster settings.">?</a>'
|
||||
} else
|
||||
if (info['GPUs'] === -1) {
|
||||
tableData['gpu']['gpu_value'] = info['GPUs']
|
||||
tableData['gpu']['display'] = '<a class="openpai-tooltip" href="#" title="Fetching resource of this version of PAI is not supported. Please update it to >= 0.14.0">?</a>'
|
||||
} else {
|
||||
tableData['gpu']['gpu_value'] = info['GPUs']
|
||||
tableData['gpu']['display'] = info['GPUs']
|
||||
}
|
||||
this.data(tableData)
|
||||
})
|
||||
table.draw()
|
||||
})
|
||||
},
|
||||
fnDrawCallback: function () {
|
||||
$('.openpai-tooltip').tooltip({
|
||||
classes: {
|
||||
'ui-tooltip': 'highlight'
|
||||
}
|
||||
}
|
||||
)
|
||||
$('.submit_button').on('click', function () {
|
||||
var cluster = $(this).data('cluster')
|
||||
var vc = $(this).data('vc')
|
||||
var type = $(this).data('type')
|
||||
send(MSG.SUBMIT_START_1, {
|
||||
cluster: cluster,
|
||||
vc: vc,
|
||||
type: type
|
||||
})
|
||||
})
|
||||
}
|
||||
})
|
||||
$('#panel-information-wrapper').hide()
|
||||
$('#panel-table-wrapper').show()
|
||||
})
|
||||
)
|
||||
.catch(function (e) {
|
||||
send(MSG.ERROR, e)
|
||||
})
|
||||
}
|
||||
|
||||
var handleSubmitStart1 = function (info) {
|
||||
if (status !== STATUS.SHOWING_INFO) {
|
||||
return
|
||||
}
|
||||
set(STATUS.SUBMITTING_1)
|
||||
/* get some basic */
|
||||
var submittingCtx = {
|
||||
form: $('#submit-form-menu').val(), // file | notebook | silent
|
||||
type: info['type'], // quick | edit
|
||||
cluster: info['cluster'],
|
||||
vc: info['vc'],
|
||||
gpu: parseInt($('#resource-menu option:selected').data('gpu')),
|
||||
cpu: parseInt($('#resource-menu option:selected').data('cpu')),
|
||||
memoryMB: parseInt($('#resource-menu option:selected').data('memory')),
|
||||
docker_image: $('#docker-image-menu').val(),
|
||||
notebook_name: Jupyter.notebook.notebook_name
|
||||
}
|
||||
if (submittingCtx['type'] === 'edit') { submittingCtx['stage_num'] = 1 } else {
|
||||
if (submittingCtx['form'] === 'file') { submittingCtx['stage_num'] = 1 } else { submittingCtx['stage_num'] = 2 }
|
||||
}
|
||||
|
||||
console.log('[openpai submitter] submitting ctx:', submittingCtx)
|
||||
showInformation('')
|
||||
if (submittingCtx['type'] === 'edit') { appendInformation('Uploading files and generating config...' + Utils.getLoadingImg('loading-stage-1')) } else {
|
||||
if (submittingCtx['stage_num'] === 1) { appendInformation('Uploading files and submitting the job...' + Utils.getLoadingImg('loading-stage-1')) } else { appendInformation('Stage 1 / 2 : Uploading files and submitting the job...' + Utils.getLoadingImg('loading-stage-1')) }
|
||||
}
|
||||
var promiseSubmitting = Jupyter.notebook.save_notebook()
|
||||
.then(
|
||||
function () {
|
||||
appendInformation('<p id="text-clear-info-force"><br><br> Click <a href="#" id="openpai-clear-info-force">[here]</a> to cancel this job.</p>')
|
||||
var cancelThis
|
||||
var promise = Promise.race([
|
||||
Interface.submit_job(submittingCtx),
|
||||
new Promise(function (resolve, reject) {
|
||||
cancelThis = reject
|
||||
})
|
||||
])
|
||||
$('body').off('click', '#openpai-clear-info-force').on('click', '#openpai-clear-info-force', function () {
|
||||
if (confirm('Are you sure to start a new OpenPAI Submitter job (Your previous job will be saved in the Recent Jobs panel)?')) {
|
||||
$('#openpai-clear-info-force').remove()
|
||||
cancelThis('cancelled')
|
||||
set(STATUS.NOT_READY)
|
||||
send(MSG.INIT_OK)
|
||||
}
|
||||
})
|
||||
return promise
|
||||
}
|
||||
)
|
||||
.then(
|
||||
function (ctx) {
|
||||
set(STATUS.SUBMITTING_2)
|
||||
$('#text-clear-info-force').remove()
|
||||
$('#loading-stage-1').remove()
|
||||
appendInformation('<br>')
|
||||
submittingCtx = ctx
|
||||
if (ctx['type'] === 'quick') {
|
||||
var submissionTime = (function () {
|
||||
var ts = new Date()
|
||||
var mm = ts.getMonth() + 1
|
||||
var dd = ts.getDate()
|
||||
var HH = ts.getHours()
|
||||
var MM = ts.getMinutes()
|
||||
var SS = ts.getSeconds()
|
||||
if (mm < 10) mm = '0' + mm
|
||||
if (dd < 10) dd = '0' + dd
|
||||
if (HH < 10) HH = '0' + HH
|
||||
if (MM < 10) MM = '0' + MM
|
||||
if (SS < 10) SS = '0' + SS
|
||||
return mm + '-' + dd + ' ' + HH + ':' + MM + ':' + SS
|
||||
}())
|
||||
panelRecent.send(
|
||||
panelRecent.MSG.ADD_JOB, {
|
||||
cluster: ctx['cluster'],
|
||||
vc: ctx['vc'],
|
||||
user: ctx['user'],
|
||||
time: submissionTime,
|
||||
jobname: ctx['jobname'],
|
||||
joblink: ctx['joblink'],
|
||||
form: ctx['form'],
|
||||
state: 'WAITING'
|
||||
}
|
||||
)
|
||||
appendInformation('The job name is: ' + ctx['jobname'] + '<br>')
|
||||
appendInformation('The job link is: <a href="' + ctx['joblink'] + '" target="_blank">' + ctx['joblink'] + '</a>')
|
||||
return new Promise((resolve, reject) => resolve(ctx))
|
||||
} else {
|
||||
/* ctx["type"] === "edit" */
|
||||
var download = function (filename, text) {
|
||||
var element = document.createElement('a')
|
||||
element.setAttribute('href', 'data:text/plain;charset=utf-8,' + encodeURIComponent(text))
|
||||
element.setAttribute('download', filename)
|
||||
element.style.display = 'none'
|
||||
document.body.appendChild(element)
|
||||
element.click()
|
||||
document.body.removeChild(element)
|
||||
}
|
||||
download(ctx['jobname'] + '.yaml', ctx['job_config'])
|
||||
}
|
||||
}
|
||||
)
|
||||
if (submittingCtx['stage_num'] === 2) {
|
||||
promiseSubmitting = promiseSubmitting.then(
|
||||
function (ctx) {
|
||||
appendInformation('<br><br>')
|
||||
if (ctx['form'] === 'notebook') {
|
||||
appendInformation('Stage 2 / 2: Wait until the notebook is ready...' +
|
||||
Utils.getLoadingImg('loading-stage-2'))
|
||||
} else { appendInformation('Stage 2 / 2: Wait until the result is ready...' + Utils.getLoadingImg('loading-stage-2')) }
|
||||
appendInformation('<br>')
|
||||
if (ctx['form'] === 'notebook') {
|
||||
appendInformation('<p id="text-notebook-show">Note: This procedure may persist for several minutes. You can safely close' +
|
||||
' this submitter, and <b>the notebook URL will be shown here once it is prepared.</b></p><br>')
|
||||
} else {
|
||||
appendInformation('<p id="text-notebook-show">Note: The notebook will run in the background. You can safely close' +
|
||||
' this submitter, and <b>the result file link will be shown here once it is prepared.</b></p><br>')
|
||||
}
|
||||
appendInformation('<p id="text-clear-info-force">You can also click <a href="#" id="openpai-clear-info-force">[here]</a> to start a new OpenPAI Submitter job. Your previous job will be saved in the <a href="https://github.com/microsoft/pai/tree/master/contrib/notebook-extension#job-management" target="_blank">Recent Jobs panel</a>.</p>')
|
||||
var cancelThis
|
||||
var promise = Promise.race([
|
||||
Interface.wait_jupyter(ctx),
|
||||
new Promise(function (resolve, reject) {
|
||||
cancelThis = reject
|
||||
})
|
||||
])
|
||||
$('body').off('click', '#openpai-clear-info-force').on('click', '#openpai-clear-info-force', function () {
|
||||
if (confirm('Are you sure you want to start a new OpenPAI Submitter job? (Your previous job will be saved in the Recent Jobs panel.)')) {
|
||||
$('#text-clear-info-force').remove()
|
||||
cancelThis('cancelled')
|
||||
set(STATUS.NOT_READY)
|
||||
send(MSG.INIT_OK)
|
||||
}
|
||||
})
|
||||
return promise
|
||||
}
|
||||
).then(
|
||||
function (ctx) {
|
||||
if (!($('#openpai-panel-wrapper').is(':visible'))) {
|
||||
togglePanel()
|
||||
}
|
||||
$('#loading-stage-2').remove()
|
||||
$('#text-notebook-show').hide()
|
||||
$('#text-clear-info-force').hide()
|
||||
if (ctx['form'] === 'notebook') { appendInformation('The notebook URL is: <a href="' + ctx['notebook_url'] + '" target="_blank">' + ctx['notebook_url'] + '</a>') } else { appendInformation('The result file link is (please copy it to your clipboard and paste it into a new page): <a href="' + ctx['notebook_url'] + '" target="_blank">' + ctx['notebook_url'] + '</a>') }
|
||||
return new Promise((resolve, reject) => resolve(ctx))
|
||||
})
|
||||
}
|
||||
promiseSubmitting = promiseSubmitting.then(
|
||||
function (ctx) {
|
||||
set(STATUS.SUBMITTING_OK)
|
||||
appendInformation('<br><br> You can click <a href="#" id="openpai-clear-info">[here]</a> to start a new OpenPAI Submitter job. Your previous job will be saved in the <a href="https://github.com/microsoft/pai/tree/master/contrib/notebook-extension#job-management" target="_blank">Recent Jobs panel</a>.')
|
||||
$('body').off('click', '#openpai-clear-info').on('click', '#openpai-clear-info', function () {
|
||||
set(STATUS.NOT_READY)
|
||||
send(MSG.INIT_OK)
|
||||
})
|
||||
}
|
||||
).catch(function (e) {
|
||||
if (e !== 'cancelled') { send(MSG.ERROR, e) }
|
||||
})
|
||||
}
|
||||
|
||||
var handleError = function (err) {
|
||||
showInformation(
|
||||
'<p>An error occurred. ' +
'Please click [refresh] to retry.</p>' +
'<br><p>Error Information: ' + err + '</p>'
|
||||
)
|
||||
set(STATUS.ERROR)
|
||||
}
|
||||
|
||||
var handleFatalError = function (err) {
|
||||
showInformation(
|
||||
'<p>A fatal error occurred and the OpenPAI Submitter has been terminated. ' +
'Please refresh the page and click Kernel - Restart & Clear Output to retry.</p>' +
'<br><p>Error Information: ' + err + '</p>'
|
||||
)
|
||||
$('#refresh-panel-button').hide()
|
||||
set(STATUS.FATAL)
|
||||
}
|
||||
|
||||
var togglePanel = function (callback = null) {
|
||||
$('#openpai-panel-wrapper').toggle(speed, callback)
|
||||
}
|
||||
|
||||
var openPanel = function (callback = null) {
|
||||
$('#openpai-panel-wrapper').show(speed, callback)
|
||||
}
|
||||
|
||||
var closePanel = function (callback = null) {
|
||||
$('#openpai-panel-wrapper').hide(speed, callback)
|
||||
}
|
||||
|
||||
var bindPanelRecent = function (panelRecentInstance) {
|
||||
panelRecent = panelRecentInstance
|
||||
}
|
||||
|
||||
return {
|
||||
send: send,
|
||||
STATUS: STATUS,
|
||||
MSG: MSG,
|
||||
bindPanelRecent: bindPanelRecent
|
||||
}
|
||||
}
|
||||
|
||||
return Panel
|
||||
})
|
|
@@ -1,367 +0,0 @@
|
|||
// Copyright (c) Microsoft Corporation
|
||||
// All rights reserved.
|
||||
//
|
||||
// MIT License
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
define([
|
||||
'require',
|
||||
'jquery',
|
||||
'base/js/namespace',
|
||||
'base/js/events',
|
||||
'nbextensions/openpai_submitter/scripts/config',
|
||||
'nbextensions/openpai_submitter/scripts/interface',
|
||||
'nbextensions/openpai_submitter/scripts/utils'
|
||||
],
|
||||
function (requirejs, $, Jupyter, events, config, Interface, Utils) {
|
||||
function Panel () {
|
||||
var STATUS_R = [
|
||||
'NOT_READY',
|
||||
'READY_NOT_LOADING',
|
||||
'READY_LOADING',
|
||||
'SHOWING_INFO',
|
||||
'ERROR',
|
||||
'FATAL'
|
||||
]
|
||||
var MSG_R = [
|
||||
'PLEASE_INIT',
|
||||
'INIT_OK',
|
||||
'ADD_JOB',
|
||||
'CLICK_BUTTON',
|
||||
'CLICK_CLOSE',
|
||||
'CLICK_REFRESH',
|
||||
'ERROR',
|
||||
'FATAL_ERROR'
|
||||
]
|
||||
|
||||
var STATUS = {}
|
||||
for (var i = 0; i < STATUS_R.length; i += 1) { STATUS[STATUS_R[i]] = i }
|
||||
var MSG = {}
|
||||
for (var j = 0; j < MSG_R.length; j += 1) { MSG[MSG_R[j]] = j }
|
||||
|
||||
var set = function (s) {
|
||||
// console.log('[openpai submitter] [panel-recent] set status', STATUS_R[s])
|
||||
status = s
|
||||
}
|
||||
|
||||
var status
|
||||
var panel // main panel
|
||||
var jobStatusFinished = ['FAILED', 'STOPPED', 'SUCCEEDED']
|
||||
var hasAddFilter = false
|
||||
|
||||
set(STATUS.NOT_READY)
|
||||
|
||||
var speed = config.panel_toggle_speed
|
||||
|
||||
var showInformation = function (info) {
|
||||
/* this function will hide table and show information for users. */
|
||||
$('#panel-recent-table-wrapper').hide()
$('#panel-recent-information').html(info) // render the message text before showing the wrapper
$('#panel-recent-information-wrapper').show()
|
||||
}
|
||||
|
||||
var appendInformation = function (info) {
|
||||
$('#panel-recent-information').append(info)
|
||||
}
|
||||
|
||||
var send = function (msg, value) {
|
||||
// console.log('[openpai submitter] [panel-recent]', 'status:', STATUS_R[status], 'msg', MSG_R[msg], 'value', value)
|
||||
switch (msg) {
|
||||
case MSG.PLEASE_INIT:
|
||||
handleInit()
|
||||
break
|
||||
case MSG.INIT_OK:
|
||||
handleInitOK()
|
||||
break
|
||||
case MSG.ADD_JOB:
|
||||
handleAddJob(value)
|
||||
break
|
||||
case MSG.CLICK_BUTTON:
|
||||
if (!($('#openpai-panel-recent-wrapper').is(':visible'))) {
|
||||
if ((status !== STATUS.READY_LOADING) && (status !== STATUS.FATAL)) {
|
||||
Utils.set_timeout(config.panel_toggle_speed).then(
|
||||
() => send(MSG.CLICK_REFRESH)
|
||||
)
|
||||
}
|
||||
}
|
||||
togglePanel()
|
||||
break
|
||||
case MSG.CLICK_CLOSE:
|
||||
closePanel()
|
||||
break
|
||||
case MSG.CLICK_REFRESH:
|
||||
handleRefresh()
|
||||
break
|
||||
case MSG.ERROR:
|
||||
handleError(value)
|
||||
break
|
||||
case MSG.FATAL_ERROR:
|
||||
handleFatalError(value)
|
||||
break
|
||||
default:
|
||||
send(MSG.ERROR, 'unknown message received by the Recent Jobs panel!')
|
||||
}
|
||||
}
|
||||
|
||||
var turnOnFilter = function () {
|
||||
hasAddFilter = true
|
||||
$.fn.dataTable.ext.search.push(
|
||||
function (settings, data, dataIndex) {
|
||||
/* only show unfinished jobs */
|
||||
if (settings.nTable.getAttribute('id') !== 'recent-jobs') { return true }
|
||||
return jobStatusFinished.indexOf(data[4]) < 0
|
||||
})
|
||||
}
|
||||
|
||||
var turnOffFilter = function () {
|
||||
hasAddFilter = false
|
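// assumes our filter is still the last entry in DataTables' global search stack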
||||
$.fn.dataTable.ext.search.pop()
|
||||
}
|
||||
|
||||
var handleInit = function () {
|
||||
var panelUrl = requirejs.toUrl('../templates/panel_recent.html')
|
||||
var panel = $('<div id="openpai-panel-recent-wrapper" class="openpai-wrapper"></div>').load(panelUrl)
|
||||
|
||||
Promise.all([
|
||||
/* Promise 1: add panel to html body and bind functions */
|
||||
panel.promise()
|
||||
.then(
|
||||
function () {
|
||||
panel.draggable()
|
||||
panel.toggle()
|
||||
$('body').append(panel)
|
||||
$('body').on('click', '#close-panel-recent-button', function () {
|
||||
send(MSG.CLICK_CLOSE)
|
||||
})
|
||||
$('body').on('click', '#refresh-panel-recent-button', function () {
|
||||
send(MSG.CLICK_REFRESH)
|
||||
})
|
||||
|
||||
turnOnFilter()
|
||||
|
||||
$('body').on('click', '#openpai-if-hide-jobs', function () {
|
||||
if ($('#openpai-if-hide-jobs').prop('checked') === true &&
|
||||
hasAddFilter === false) {
|
||||
turnOnFilter()
|
||||
$('#recent-jobs').DataTable().draw()
|
||||
} else if ($('#openpai-if-hide-jobs').prop('checked') === false &&
|
||||
hasAddFilter === true) {
|
||||
turnOffFilter()
|
||||
$('#recent-jobs').DataTable().draw()
|
||||
}
|
||||
})
|
||||
}
|
||||
)
|
||||
.then(
|
||||
() => Utils.set_timeout(600)
|
||||
).then(function () {
|
||||
panel.resizable()
|
||||
$('.openpai-logo').attr('src', requirejs.toUrl('../misc/pailogo.jpg'))
|
||||
$('#recent-jobs')
|
||||
.DataTable({
|
||||
dom: 'rtip',
|
||||
order: [
|
||||
[2, 'desc']
|
||||
],
|
||||
data: []
|
||||
})
|
||||
}),
|
||||
/* Promise 2: load python script */
|
||||
new Promise(function (resolve, reject) {
|
||||
Interface.initiate(panel, resolve, reject)
|
||||
})
|
||||
]).then(function (value) {
|
||||
send(MSG.INIT_OK, value)
|
||||
})
|
||||
.catch(function (err) {
|
||||
send(MSG.FATAL_ERROR, err)
|
||||
})
|
||||
}
|
||||
|
||||
var handleInitOK = function () {
|
||||
if (status === STATUS.NOT_READY) {
|
||||
if ($('#openpai-panel-recent-wrapper').is(':visible')) {
|
||||
/* if the panel has been shown, then load the cluster info */
|
||||
set(STATUS.READY_NOT_LOADING)
|
||||
send(MSG.CLICK_REFRESH)
|
||||
} else {
|
||||
/* if the panel has not been shown, change the status to READY_NOT_LOADING and wait */
|
||||
showInformation('')
|
||||
set(STATUS.READY_NOT_LOADING)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
var handleAddJob = function (record) {
|
||||
Interface.add_job(record)
|
||||
.catch((e) => send(MSG.ERROR, e))
|
||||
}
|
||||
|
||||
var handleRefresh = function () {
|
||||
if (status === STATUS.NOT_READY || status === STATUS.READY_LOADING) { return }
|
||||
if (status === STATUS.FATAL) {
|
||||
alert('Please refresh the whole page to reload this extension.')
|
||||
return
|
||||
}
|
||||
set(STATUS.READY_LOADING)
|
||||
var jobData
|
||||
Interface.get_jobs().then(
|
||||
function (data) {
|
||||
var ret = []
|
||||
jobData = data
|
||||
for (var i = 0; i < data.length; i += 1) {
|
||||
var record = data[i]
|
||||
var item = {
|
||||
jobname: record['jobname'],
|
||||
cluster: record['cluster'],
|
||||
vc: record['vc'],
|
||||
user: record['user'],
|
||||
time: record['time'],
|
||||
joblink: '<a href="' + record['joblink'] + '" target="_blank"><i class="fa fa-external-link openpai-table-button"></i></a>'
|
||||
}
|
||||
if (jobStatusFinished.indexOf(record['state']) >= 0) {
|
||||
item['state'] = record['state']
|
||||
if (record['form'] !== 'silent') { item['notebook_url'] = '-' } else {
|
||||
if ((record['notebook_url'] === undefined) || (record['notebook_url'] === '-')) { item['notebook_url'] = '-' } else { item['notebook_url'] = '<a data-path="' + record['notebook_url'] + '" href="#"><i class="silent-link item_icon notebook_icon icon-fixed-width openpai-table-button"></i></a>' }
|
||||
}
|
||||
} else {
|
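/* unfinished job: render loading placeholders; Interface.detect_jobs() will fill in the real state and link later */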
||||
item['state'] = '<span class="datatable-status" data-jobname="' + record['jobname'] +
|
||||
'" data-cluster="' + record['cluster'] + '" data-vc="' + record['vc'] +
|
||||
'">' + Utils.getLoadingImgSmall() + '</span>'
|
||||
item['notebook_url'] = '<span class="datatable-notebook-url" data-jobname="' +
|
||||
record['jobname'] + '" data-cluster="' + record['cluster'] +
|
||||
'" data-vc="' + record['vc'] + '">' + Utils.getLoadingImgSmall() + '</span>'
|
||||
}
|
||||
ret.push(item)
|
||||
}
|
||||
$('#recent-jobs')
|
||||
.DataTable({
|
||||
dom: 'rtip',
|
||||
order: [
|
||||
[3, 'desc']
|
||||
],
|
||||
data: ret,
|
||||
destroy: true,
|
||||
rowId: rowData => 'openpai-job-' + rowData['jobname'],
|
||||
columns: [{
|
||||
data: 'jobname',
|
||||
width: '15%'
|
||||
}, {
|
||||
data: 'cluster',
|
||||
width: '12%'
|
||||
}, {
|
||||
data: 'vc',
|
||||
width: '12%'
|
||||
}, {
|
||||
data: 'time',
|
||||
width: '25%'
|
||||
}, {
|
||||
data: 'state',
|
||||
width: '12%'
|
||||
}, {
|
||||
data: 'joblink',
|
||||
width: '12%'
|
||||
}, {
|
||||
data: 'notebook_url',
|
||||
width: '12%'
|
||||
}],
|
||||
initComplete: function () {
|
||||
set(STATUS.READY_LOADING)
|
||||
$('body').off('click', '.silent-link').on('click', '.silent-link', function (e) {
|
||||
var url = $(e.target).parent().data('path')
|
||||
Utils.copy_to_clipboard(url).then(
|
||||
() => alert('The result file link has been copied to your clipboard! Please paste it to a new page.')
|
||||
).catch(
|
||||
() => alert('Failed to copy the file link. Please find the file manually. Location: ' + url)
|
||||
)
|
||||
})
|
||||
var jobsFinished = []
|
||||
var jobsUnfinished = []
|
||||
for (var item of jobData) {
|
||||
if (jobStatusFinished.indexOf(item['state']) >= 0) { jobsFinished.push(item) } else { jobsUnfinished.push(item) }
|
||||
}
|
||||
/* Only detect unfinished jobs */
|
||||
Interface.detect_jobs(jobsUnfinished)
|
||||
.then(function (jobsUnfinished) {
|
||||
Interface
|
||||
.save_jobs(jobsUnfinished.concat(jobsFinished))
|
||||
.catch((e) => console.error(e)) // Although it is a promise, we don't care whether it succeeds or not.
|
||||
for (var item of jobsUnfinished) {
|
||||
var originalData = $('#recent-jobs').DataTable().row('#openpai-job-' + item['jobname']).data()
|
||||
originalData['state'] = item['state']
|
||||
if (item['notebook_url'] !== undefined && item['notebook_url'] !== '-') {
|
||||
if (item['form'] === 'notebook') { originalData['notebook_url'] = '<a href="' + item['notebook_url'] + '" target="_blank"><i class="item_icon notebook_icon icon-fixed-width openpai-table-button"></i></a>' } else { originalData['notebook_url'] = '<a data-path="' + item['notebook_url'] + '" href="#"><i class="silent-link item_icon notebook_icon icon-fixed-width openpai-table-button"></i></a>' }
|
||||
} else { originalData['notebook_url'] = '-' }
|
||||
$('#recent-jobs').DataTable().row('#openpai-job-' + item['jobname']).data(originalData)
|
||||
}
|
||||
set(STATUS.SHOWING_INFO)
|
||||
})
|
||||
.catch(
|
||||
function (e) {
|
||||
console.error('[openpai submitter]', e)
|
||||
set(STATUS.SHOWING_INFO)
|
||||
}
|
||||
)
|
||||
}
|
||||
}
|
||||
)
|
||||
$('#panel-recent-information-wrapper').hide()
|
||||
$('#panel-recent-table-wrapper').show()
|
||||
}
|
||||
).catch((e) => send(MSG.ERROR, e))
|
||||
}
|
||||
|
||||
var handleError = function (err) {
|
||||
showInformation(
|
||||
'<p>An error occurred. ' +
'Please click [refresh] to retry.</p>' +
'<br><p>Error Information: ' + err + '</p>'
|
||||
)
|
||||
set(STATUS.ERROR)
|
||||
}
|
||||
|
||||
var handleFatalError = function (err) {
|
||||
showInformation(
|
||||
'<p>A fatal error occurred and the OpenPAI Submitter has been terminated. ' +
'Please refresh the page and click Kernel - Restart & Clear Output to retry.</p>' +
'<br><p>Error Information: ' + err + '</p>'
|
||||
)
|
||||
set(STATUS.FATAL)
|
||||
}
|
||||
|
||||
var togglePanel = function (callback = null) {
|
||||
$('#openpai-panel-recent-wrapper').toggle(speed, callback)
|
||||
}
|
||||
|
||||
var openPanel = function (callback = null) {
|
||||
$('#openpai-panel-recent-wrapper').show(speed, callback)
|
||||
}
|
||||
|
||||
var closePanel = function (callback = null) {
|
||||
$('#openpai-panel-recent-wrapper').hide(speed, callback)
|
||||
}
|
||||
|
||||
var bindPanel = function (panelInstance) {
|
||||
panel = panelInstance
|
||||
}
|
||||
|
||||
return {
|
||||
send: send,
|
||||
STATUS: STATUS,
|
||||
MSG: MSG,
|
||||
bindPanel: bindPanel
|
||||
}
|
||||
}
|
||||
|
||||
return Panel
|
||||
})
|
|
@@ -1,75 +0,0 @@
|
|||
// Copyright (c) Microsoft Corporation
|
||||
// All rights reserved.
|
||||
//
|
||||
// MIT License
|
||||
//
|
||||
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
// documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
// the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
// to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
//
|
||||
// THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
// BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
// DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
define(['require'], function (requirejs) {
|
||||
return {
|
||||
getLoadingImg: function (idName) {
|
||||
var loadingImg
|
||||
if (idName !== undefined) { loadingImg = '<img src="' + requirejs.toUrl('../misc/loading.gif') + '" class="loading-img" id="' + idName + '">' } else { loadingImg = '<img src="' + requirejs.toUrl('../misc/loading.gif') + '" class="loading-img">' }
|
||||
return loadingImg
|
||||
},
|
||||
getLoadingImgSmall: function (idName) {
|
||||
var loadingImg
|
||||
if (idName !== undefined) { loadingImg = '<img src="' + requirejs.toUrl('../misc/loading.gif') + '" class="loading-img-small" id="' + idName + '">' } else { loadingImg = '<img src="' + requirejs.toUrl('../misc/loading.gif') + '" class="loading-img-small">' }
|
||||
return loadingImg
|
||||
},
|
||||
copy_to_clipboard: function (text) {
|
||||
return new Promise(function (resolve, reject) {
|
||||
function fallbackCopyTextToClipboard (text) {
|
||||
var textArea = document.createElement('textarea')
|
||||
textArea.value = text
|
||||
document.body.appendChild(textArea)
|
||||
textArea.focus()
|
||||
textArea.select()
|
||||
try {
|
||||
var successful = document.execCommand('copy')
|
||||
if (successful) { resolve() } else { reject(new Error('copy command was unsuccessful')) }
|
||||
} catch (err) {
|
||||
reject(err)
|
||||
}
|
||||
document.body.removeChild(textArea)
|
||||
}
|
||||
function copyTextToClipboard (text) {
|
||||
if (!navigator.clipboard) {
|
||||
fallbackCopyTextToClipboard(text)
|
||||
return
|
||||
}
|
||||
navigator.clipboard.writeText(text).then(function () {
|
||||
resolve()
|
||||
}, function (err) {
|
||||
reject(err)
|
||||
})
|
||||
}
|
||||
copyTextToClipboard(text)
|
||||
})
|
||||
},
|
||||
set_timeout: function timeout (ms, value) {
|
||||
return new Promise((resolve, reject) => {
|
||||
setTimeout(resolve, ms, value)
|
||||
})
|
||||
},
|
||||
set_timeout_func: function timeoutFunc (ms, func, args) {
|
||||
return new Promise((resolve, reject) => {
|
||||
setTimeout(function () {
|
||||
func.apply(args)
|
||||
resolve()
|
||||
}, ms)
|
||||
})
|
||||
}
|
||||
}
|
||||
})
|
|
@@ -1,63 +0,0 @@
|
|||
<div id="openpai-panel">
|
||||
<div class="openpai-panel-header">
|
||||
<img src="" class="openpai-logo" class="openpai-inline" />
|
||||
<h3 class="openpai-inline openpai-header-text">OpenPAI Submitter</h3>
|
||||
<a id="close-panel-button" class="openpai-float-right" href="#">[close]</a>
|
||||
<span class="openpai-float-right"> </span>
|
||||
<a id="refresh-panel-button" class="openpai-float-right "href="#">[refresh]</a>
|
||||
</div>
|
||||
|
||||
<div id="openpai-panel-body">
|
||||
<fieldset class="openpai-fieldset" id="basic-setting-fieldset">
|
||||
<!-- <legend class="openpai-legend">Basic Settings: </legend> -->
|
||||
<div>
|
||||
<label for="submit-form-menu">Submit As: </label>
|
||||
<select name="submit-form-menu" id="submit-form-menu">
|
||||
<option value="notebook">Interactive Notebook</option>
|
||||
<option value="file">Python Script (.py File)</option>
|
||||
<option value="silent">Silent Notebook</option>
|
||||
</select>
|
||||
</div>
|
||||
<div>
|
||||
<label for="resource-menu" id="resouce-menu-label">Resource: </label>
|
||||
<!-- <select name="resource-menu" id="resource-menu">
|
||||
<option selected="selected" data-gpu="1" data-cpu="4" data-memory="8192">1 GPU, 4 vCores CPU, 8 GB memory</option>
|
||||
<option data-gpu="1" data-cpu="8" data-memory="16384">1 GPU, 8 vCores CPU, 16 GB memory</option>
|
||||
<option data-gpu="0" data-cpu="4" data-memory="8192">0 GPU, 4 vCores CPU, 8 GB memory</option>
|
||||
<option data-gpu="2" data-cpu="8" data-memory="16384">2 GPU, 8 vCores CPU, 16 GB memory</option>
|
||||
<option data-gpu="4" data-cpu="16" data-memory="32768">4 GPU, 16 vCores CPU, 32 GB memory</option>
|
||||
</select> -->
|
||||
</div>
|
||||
<div>
|
||||
<label for="docker-image-menu" id="docker-image-menu-label">Select a docker image: </label>
|
||||
<!-- <select name="docker-image-menu" id="docker-image-menu">
|
||||
<option value="openpai/pytorch-py36-cu90" selected="selected">PyTorch + Python3.6 with GPU, CUDA 9.0</option>
|
||||
<option value="openpai/pytorch-py36-cpu">PyTorch + Python3.6 with CPU</option>
|
||||
<option value="openpai/tensorflow-py36-cu90">TensorFlow + Python3.6 with GPU, CUDA 9.0</option>
|
||||
<option value="openpai/tensorflow-py36-cpu">TensorFlow + Python3.6 with CPU</option>
|
||||
</select> -->
|
||||
</div>
|
||||
</fieldset>
|
||||
<fieldset class="openpai-fieldset">
|
||||
|
||||
<div id="panel-table-wrapper" style="display:none;">
|
||||
<table id="cluster-data" class="display order-column" data-page-length="7">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Cluster</th>
|
||||
<th>VC</th>
|
||||
<th>Available GPU</th>
|
||||
<th></th>
|
||||
<th></th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="cluster-data-body">
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<div id="panel-information-wrapper">
|
||||
<span id="panel-information">The panel is not ready. Please wait.</span>
|
||||
</div>
|
||||
</fieldset>
|
||||
</div>
|
||||
</div>
|
|
@@ -1,41 +0,0 @@
|
|||
<div id="openpai-panel-recent">
|
||||
<div class="openpai-panel-header">
|
||||
<img src="" class="openpai-logo openpai-inline" />
|
||||
<h3 class="openpai-header-text openpai-inline">Recent Jobs</h3>
|
||||
<a id="close-panel-recent-button" class="openpai-float-right openpai-button" href="#">[close]</a>
|
||||
<span class="openpai-float-right"> </span>
|
||||
<a id="refresh-panel-recent-button" class="openpai-float-right openpai-button" href="#">[refresh]</a>
|
||||
</div>
|
||||
|
||||
<div class="openpai-panel-body">
|
||||
<fieldset class="openpai-fieldset">
|
||||
<div id="panel-recent-table-wrapper" style="display: none">
|
||||
<div class="openpai-float-right" id="openpai-hide-jobs-toggle">
|
||||
<label class="switch">
|
||||
<input type="checkbox" checked id="openpai-if-hide-jobs">
|
||||
<span class="slider round"></span>
|
||||
</label>
|
||||
<span>Hide Completed Jobs</span>
|
||||
</div>
|
||||
<table id="recent-jobs" class="display order-column nowrap" data-page-length="5">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Name</th>
|
||||
<th>Cluster</th>
|
||||
<th>VC</th>
|
||||
<th>Time</th>
|
||||
<th>Status</th>
|
||||
<th>Link</th>
|
||||
<th>Notebook</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody id="recent-jobs-body">
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<div id="panel-recent-information-wrapper">
|
||||
<span id="panel-recent-information">The panel is not ready. Please wait.</span>
|
||||
</div>
|
||||
</fieldset>
|
||||
</div>
|
||||
</div>
|
|
@@ -1,50 +0,0 @@
|
|||
"""this is the setup (install) script for OpenPAI notebook extension
|
||||
"""
|
||||
import os
|
||||
import sys
|
||||
from argparse import ArgumentParser
|
||||
from subprocess import check_output
|
||||
|
||||
|
||||
def run(cmds: list, comment: str = None):
|
||||
if comment:
|
||||
print(comment, flush=True)
|
||||
    check_output(cmds)  # pass the argument list directly; with shell=True only the first list item would be executed
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = ArgumentParser()
|
||||
parser.add_argument('--user', action='store_true', default=False, help='pip install in user mode')
|
||||
parser.add_argument('--ignore-sdk', '-i', action='store_true', default=False,
|
||||
help="don't install the Python SDK; make sure you have a workable version available instead")
|
||||
args = parser.parse_args()
|
||||
|
||||
pip_cmd = [sys.executable, '-m', 'pip', 'install']
|
||||
if args.user:
|
||||
pip_cmd += ['--user']
|
||||
jupyter_cmd = [sys.executable, '-m', 'jupyter']
|
||||
|
||||
run(
|
||||
pip_cmd + ['jupyter', 'jupyter_contrib_nbextensions'],
|
||||
'==== install requirements ===='
|
||||
)
|
||||
|
||||
run(
|
||||
jupyter_cmd + ['contrib', 'nbextension', 'install', '--user'],
|
||||
'==== install nbextension ===='
|
||||
)
|
||||
|
||||
if not args.ignore_sdk:
|
||||
run(
|
||||
pip_cmd + ['--upgrade', os.path.join('..', 'python-sdk')],
|
||||
'==== install sdk ===='
|
||||
)
|
||||
|
||||
run(
|
||||
jupyter_cmd + ['nbextension', 'install', 'openpai_submitter'],
|
||||
'==== install openpai_submitter ===='
|
||||
)
|
||||
run(
|
||||
jupyter_cmd + ['nbextension', 'enable', 'openpai_submitter/main'],
|
||||
'==== enable openpai_submitter ===='
|
||||
)
|
|
@@ -1,457 +0,0 @@
|
|||
The `Python` SDK and CLI for `OpenPAI`
|
||||
----
|
||||
|
||||
***Note: Python SDK is deprecated and will be removed in the future. New SDK and CLI support is available at [openpaisdk](https://github.com/microsoft/openpaisdk).***
|
||||
|
||||
This is a proof-of-concept SDK (Python) and CLI (command-line interface) tool for [OpenPAI](http://github.com/microsoft/pai). This project provides facilities to make `OpenPAI` more easily accessible and usable. With it,
|
||||
|
||||
- Users can easily access `OpenPAI` resources in scripts (`Python` or `Shell`) and `Jupyter` notebooks
- Users can easily submit and list jobs with simple commands or snippets of code
- Users can easily accomplish complicated operations with `OpenPAI`
- Users can easily reuse local code and notebooks
- Users can easily manage and switch between multiple `OpenPAI` clusters
|
||||
|
||||
Besides the above benefits, this project also provides powerful runtime support, which bridges users' (local) working environments and jobs' running environments (inside the containers started by the remote cluster). See more about [the scenarios and user stories](docs/scenarios-and-user-stories.md).
|
||||
|
||||
- [Get started](#get-started)
|
||||
- [Installation](#installation)
|
||||
- [Dependencies](#dependencies)
|
||||
- [Define your clusters](#define-your-clusters)
|
||||
- [How-to guide for the CLI tool](#how-to-guide-for-the-cli-tool)
|
||||
- [Cluster and storage management](#cluster-and-storage-management)
|
||||
- [How to list existing clusters](#how-to-list-existing-clusters)
|
||||
- [How to open and edit the cluster configuration file](#how-to-open-and-edit-the-cluster-configuration-file)
|
||||
- [How to check the available resources of clusters](#how-to-check-the-available-resources-of-clusters)
|
||||
- [How to add a cluster](#how-to-add-a-cluster)
|
||||
- [How to delete a cluster](#how-to-delete-a-cluster)
|
||||
- [How to access storages of a cluster](#how-to-access-storages-of-a-cluster)
|
||||
- [Job operations](#job-operations)
|
||||
- [How to query my jobs in a cluster](#how-to-query-my-jobs-in-a-cluster)
|
||||
- [How to submit a job from existing job config file](#how-to-submit-a-job-from-existing-job-config-file)
|
||||
- [How to change the configuration before submitting](#how-to-change-the-configuration-before-submitting)
|
||||
- [How to submit a job if I have no existing job config file](#how-to-submit-a-job-if-i-have-no-existing-job-config-file)
|
||||
- [How to request (GPU) resources for the job](#how-to-request-gpu-resources-for-the-job)
|
||||
- [How to reference a local file when submitting a job](#how-to-reference-a-local-file-when-submitting-a-job)
|
||||
- [How to submit a job given a sequence of commands](#how-to-submit-a-job-given-a-sequence-of-commands)
|
||||
- [How to add `pip install` packages](#how-to-add-pip-install-packages)
|
||||
- [How to preview the generated job config but not submit it](#how-to-preview-the-generated-job-config-but-not-submit-it)
|
||||
- [`Jupyter` notebook](#jupyter-notebook)
|
||||
- [How to run a local notebook with remote resources](#how-to-run-a-local-notebook-with-remote-resources)
|
||||
- [How to launch a remote `Jupyter` server and connect it](#how-to-launch-a-remote-jupyter-server-and-connect-it)
|
||||
- [Other FAQ of CLI](#other-faq-of-cli)
|
||||
- [How to select a cluster to use until I change it](#how-to-select-a-cluster-to-use-until-i-change-it)
|
||||
- [How to simplify the command](#how-to-simplify-the-command)
|
||||
- [How to install a different version of SDK](#how-to-install-a-different-version-of-sdk)
|
||||
- [How to specify the `python` environment I want to use in the job container](#how-to-specify-the-python-environment-i-want-to-use-in-the-job-container)
|
||||
- [Python binding](#python-binding)
|
||||
- [Cluster management](#cluster-management)
|
||||
- [Job management](#job-management)
|
||||
- [Make contributions](#make-contributions)
|
||||
- [Release plan](#release-plan)
|
||||
- [Debug the SDK](#debug-the-sdk)
|
||||
- [Unit tests](#unit-tests)
|
||||
|
||||
# Get started
|
||||
|
||||
This section gives guidance on installation and cluster management. Details not covered here can be found in the [command line ref](docs/command-line-references.md).
|
||||
|
||||
## Installation
|
||||
|
||||
We provide an installation method leveraging `pip install`:
|
||||
|
||||
```bash
|
||||
python -m pip install --upgrade pip
|
||||
pip install -U "git+https://github.com/Microsoft/pai@master#egg=openpaisdk&subdirectory=contrib/python-sdk"
|
||||
```
|
||||
|
||||
Refer to [How to install a different version of SDK](#How-to-install-a-different-version-of-SDK) for more details about installation. After installing, please verify via the CLI or the Python binding as below.
|
||||
|
||||
```bash
|
||||
opai -h
|
||||
python -c "from openpaisdk import __version__; print(__version__)"
|
||||
```
|
||||
|
||||
### Dependencies
|
||||
|
||||
- The package requires Python 3 (mainly because of `type hinting`), and we have only tested it in `py3.5+` environments. _Only the commands `job sub` and `job notebook` require installing this project inside the container; the others place no constraints on the `python` version in the docker container._
|
||||
- [`Pylon`](https://github.com/microsoft/pai/tree/master/docs/pylon) is required to parse REST API paths like `/rest-server/`.
|
||||
|
||||
## Define your clusters
|
||||
|
||||
Please store the list of your clusters in `~/.openpai/clusters.yaml`. Every cluster has an alias used to reference it, and you may save more than one cluster in the list.
|
||||
|
||||
```YAML
|
||||
- cluster_alias: <your-cluster-alias>
|
||||
pai_uri: http://x.x.x.x
|
||||
user: <your-user-name>
|
||||
password: <your-password>
|
||||
token: <your-authen-token> # if Azure AD is enabled, a token must be used for authentication
|
||||
pylon_enabled: true
|
||||
aad_enabled: false
|
||||
storages: # a cluster may have multiple storages
|
||||
builtin: # storage alias, every cluster would always have a builtin storage
|
||||
protocol: hdfs
|
||||
uri: http://x.x.x.x # if not specified, use <pai_uri>
|
||||
ports:
|
||||
native: 9000 # used for hdfs-mount
|
||||
webhdfs: webhdfs # used for webhdfs REST API wrapping
|
||||
virtual_clusters:
|
||||
- <your-virtual-cluster-1>
|
||||
- <your-virtual-cluster-2>
|
||||
- ...
|
||||
```
|
||||
|
||||
Now all of your clusters can be displayed with the command below.
|
||||
|
||||
```bash
|
||||
opai cluster list
|
||||
```
|
||||
|
||||
# How-to guide for the CLI tool
|
||||
|
||||
This section describes how to leverage the CLI tool (prefixed by `opai`) to interact with `OpenPAI` more productively. Below is a summary of the functions provided.
|
||||
|
||||
| Command | Description |
|
||||
| -------------------------- | ---------------------------------------------------------------------------------- |
|
||||
| `opai cluster list` | list clusters defined in `~/.openpai/clusters.yaml` |
|
||||
| `opai cluster resources` | list available resources of every cluster (GPUs/vCores/Memory per virtual cluster) |
|
||||
| `opai cluster edit` | open `~/.openpai/clusters.yaml` for your editing |
|
||||
| `opai cluster add` | add a cluster |
|
||||
| `opai job list` | list all jobs of given user (in a given cluster) |
|
||||
| `opai job status` | query the status of a job |
|
||||
| `opai job stop` | stop a job |
|
||||
| `opai job submit` | submit a given job config file to cluster |
|
||||
| `opai job sub` | shortcut to generate job config and submit from a given command |
|
||||
| `opai job notebook` | shortcut to run a local notebook remotely |
|
||||
| `opai storage <operation>` | execute `<operation>`* on selected storage (of a given cluster) |
|
||||
|
||||
_*: operations include `list`, `status`, `upload`, `download` and `delete`_
|
||||
|
||||
Before starting, we'd like to define some commonly used variables as below.
|
||||
|
||||
| Variable name | CLI options | Description |
|
||||
| ----------------- | --------------------- | --------------------------------------------- |
|
||||
| `<cluster-alias>` | `--cluster-alias, -a` | alias to specify a particular cluster |
|
||||
| `<job-name>` | `--job-name, -j` | job name |
|
||||
| `<docker-image>` | `--image, -i` | image name (and tag) for the job |
|
||||
| `<workspace>` | `--workspace, -w` | remote storage path to save files for a job * |
|
||||
|
||||
_*: if specified, a directory `<workspace>/jobs/<job-name>` and subfolders (e.g. `source`, `output` ...) will be created to store necessary files for the job named `<job-name>`_
|
||||
|
||||
## Cluster and storage management
|
||||
|
||||
### How to list existing clusters
|
||||
|
||||
To list all existing clusters in `~/.openpai/clusters.yaml`, execute the command below:
|
||||
|
||||
```bash
|
||||
opai cluster list
|
||||
```
|
||||
|
||||
### How to open and edit the cluster configuration file
|
||||
|
||||
We provide a convenient shortcut command to open the cluster configuration file directly in your editor:
|
||||
|
||||
```bash
|
||||
opai cluster edit [--editor <path/to/editor>]
|
||||
```
|
||||
|
||||
The default editor is VS Code (`code`); users may switch to another editor (e.g. `--editor notepad`).
|
||||
|
||||
### How to check the available resources of clusters
|
||||
|
||||
To check the availability of each cluster, use the command
|
||||
```bash
|
||||
opai cluster resources
|
||||
```
|
||||
It returns the available GPUs, vCores, and memory of every virtual cluster in every cluster.
|
||||
|
||||
Users can also check it in a `Python` script as below:
|
||||
```python
|
||||
from openpaisdk import __cluster_config_file__
|
||||
from openpaisdk.io_utils import from_file
|
||||
from openpaisdk.cluster import ClusterList
|
||||
|
||||
cfg = from_file(__cluster_config_file__, default=[])
|
||||
ClusterList(cfg).available_resources()
|
||||
```
|
||||
|
||||
### How to add a cluster
|
||||
|
||||
Use the `add` and `delete` commands to add or delete a cluster in the clusters file.
|
||||
|
||||
```bash
|
||||
# for user/password authentication
|
||||
opai cluster add --cluster-alias <cluster-alias> --pai-uri <pai-uri> --user <user> --password <password>
|
||||
# for Azure AD authentication
|
||||
opai cluster add --cluster-alias <cluster-alias> --pai-uri <pai-uri> --user <user> --token <token>
|
||||
```
|
||||
|
||||
On receiving the `add` command, the CLI tries to connect to the cluster and fetches its basic configuration.
|
||||
|
||||
A cluster can also be added via the `python` binding, as sketched below.
|
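A minimal sketch (mirroring the `ClusterList` usage shown in the [Python binding](#python-binding) section; the config keys follow the `clusters.yaml` fields above):

```python
from openpaisdk.cluster import ClusterList

cluster_cfg = {
    "cluster_alias": "<your-cluster-alias>",  # each cluster must have a unique alias
    "pai_uri": "http://x.x.x.x",
    "user": "<your-user-name>",
    "password": "<your-password>",  # or "token": "<your-token>" for Azure AD authentication
}
# load ~/.openpai/clusters.yaml, add the cluster, and write it back
ClusterList().load().add(cluster_cfg).save()
```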
||||
|
||||
|
||||
### How to delete a cluster
|
||||
|
||||
Delete a cluster by calling its alias.
|
||||
|
||||
```bash
|
||||
opai cluster delete <cluster-alias>
|
||||
```
|
||||
|
||||
### How to access storages of a cluster
|
||||
|
||||
Before accessing a storage, users need to attach storages to a specific cluster.
|
||||
|
||||
```bash
|
||||
opai cluster attach-hdfs --cluster-alias <cluster-alias> --storage-alias hdfs --web-hdfs-uri http://x.x.x.x:port --default
|
||||
```
|
||||
|
||||
Multiple heterogeneous storages (e.g. `HDFS`, `NFS`, ...) can be attached to a cluster, and one of them will be set as the default (used to upload local code). If not specified, the storage added first is set as the default.
|
||||
|
||||
After attaching, basic operations (e.g. `list`, `upload`, `download` ...) are provided.
|
||||
|
||||
```bash
|
||||
opai storage list -a <cluster-alias> -s <storage-alias> <remote-path>
|
||||
opai storage download -a <cluster-alias> -s <storage-alias> <remote-path> <local-path>
|
||||
opai storage upload -a <cluster-alias> -s <storage-alias> <local-path> <remote-path>
|
||||
```
|
||||
|
||||
## Job operations
|
||||
|
||||
### How to query my jobs in a cluster
|
||||
|
||||
Users can retrieve the list of submitted jobs from a cluster. If more information about a specific job is wanted, add `<job-name>` to the command.
|
||||
|
||||
```bash
|
||||
opai job list -a <cluster-alias> [<job-name>]
|
||||
```
|
||||
|
||||
### How to submit a job from existing job config file
|
||||
|
||||
If you already have a job config file, you can submit a job based on it directly. The job config file can be in `json` or `yaml` format, and it must be compatible with [job configuration specification v1](https://github.com/microsoft/pai/blob/master/docs/job_tutorial.md) or [pai-job-protocol v2](https://github.com/microsoft/openpai-protocol/blob/master/schemas/v2/schema.yaml).
|
||||
|
||||
```bash
|
||||
opai job submit -a <cluster-alias> <config-file>
|
||||
```
|
||||
|
||||
The CLI determines whether it is a `v1` or `v2` job configuration and calls the corresponding REST API to submit it.
|
||||
|
||||
### How to change the configuration before submitting
|
||||
|
||||
The CLI tool also provides a way to change some contents of an existing job config file before submitting it. For example, we may need to change the job name to avoid duplicated names, or want to switch to a virtual cluster with more available resources. Of course, users can change the contents of `jobName` and `virtualCluster` (in the `v1` format) or `name` and `virtualCluster` in `defaults` (in the `v2` format) manually, but the CLI provides a more efficient and easy way to do the same thing.
|
||||
|
||||
```bash
|
||||
# compatible with v2 specification
|
||||
opai job submit --update name=<job-name> -u defaults:virtualCluster=test <config-file>
|
||||
|
||||
# compatible with v1 specification
|
||||
opai job submit --update jobName=<job-name> -u virtualCluster=test <config-file>
|
||||
```
|
||||
|
||||
### How to submit a job if I have no existing job config file
|
||||
|
||||
Writing a job config file (whether to the `v1` or `v2` specification) is not always convenient. For users who just want to run a specific command (or a sequence of commands) on the cluster's resources, the CLI provides the command `sub` (distinct from `submit`), which generates the job config file first and then `submit`s it.
|
||||
|
||||
For example, if a user wants to run `mnist_cnn.py` in a docker container (where the file is already included in the docker image), the command would be
|
||||
|
||||
```bash
|
||||
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> python mnist_cnn.py
|
||||
```
|
||||
|
||||
### How to request (GPU) resources for the job
|
||||
|
||||
Users can request specific resources (CPUs, GPUs, and memory) for the job by adding the options below to the above commands (see the example after this list):
|
||||
|
||||
- `--cpu <#cpu>`
- `--gpu <#gpu>`
- `--memoryMB <#memory-in-unit-of-MB>`
- `--ports <label-1>=<port-1> [--ports <label-2>=<port-2> [...]]`
|
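For example, the command below sketches how these options combine with `opai job sub` (the resource amounts and port label are illustrative):

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> \
    --gpu 1 --cpu 4 --memoryMB 8192 --ports tensorboard=6006 \
    python mnist_cnn.py
```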
||||
|
||||
### How to reference a local file when submitting a job
|
||||
|
||||
If `mnist_cnn.py` is not included in the docker image but is a file stored on your local disk, the above command would fail because the file cannot be accessed in the remote job container. To solve this, add the option `--sources mnist_cnn.py` to the command. Since the job container cannot access your local disk directly, the file needs to be uploaded to a location (defined by `--workspace`) in [the default storage of the cluster](#How-to-access-storages-of-a-cluster).
|
||||
|
||||
```bash
|
||||
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> -w <workspace> --sources mnist_cnn.py python mnist_cnn.py
|
||||
```
|
||||
|
||||
### How to submit a job given a sequence of commands
|
||||
|
||||
In some cases, users want to run a sequence of commands in the job. The recommended way is to put the commands in a pair of quotes (like `"git clone ... && python ..."`) and join them with `&&`. Here is an example combining 3 commands.
|
||||
|
||||
```bash
|
||||
opai job sub [...] "git clone <repo-uri> && cd <repo-dir> && python run.py arg1 arg2 ..."
|
||||
```
|
||||
|
||||
### How to add `pip install` packages
|
||||
|
||||
Of course, you could write a command sequence like `pip install ... && python ...`. Another way is to use the `--pip-installs <package>` and `--pip-path <path/to/pip>` options, which add the corresponding install commands to `preCommands` in the `deployment` section.
|
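A sketch of the option in use (the package name is illustrative):

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> \
    --pip-installs keras python mnist_cnn.py
```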
||||
|
||||
### How to preview the generated job config but not submit it
|
||||
|
||||
In some cases, users may want to preview the job config (in the `v2` format) without submitting it directly. To do so, just add the `--preview` option. The commands supporting this feature include `job submit`, `job sub` and `job notebook`.
|
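For example:

```bash
opai job sub --preview -a <cluster-alias> -i <docker-image> -j <job-name> python mnist_cnn.py
```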
||||
|
||||
## `Jupyter` notebook
|
||||
|
||||
### How to run a local notebook with remote resources
|
||||
|
||||
Suppose you have a local `<notebook>` (e.g. `mnist_cnn.ipynb` stored on the local disk) and want to run it remotely (on `OpenPAI`) and see the result.
|
||||
|
||||
```bash
|
||||
opai job notebook -a <cluster-alias> -i <docker-image> -w <workspace> <notebook>
|
||||
```
|
||||
|
||||
This command takes the same options as `opai job sub` does. It would:
|
||||
|
||||
- _Local_ - upload `<notebook>` to `<workspace>/jobs/<job-name>/source` and submit the job to the cluster (`<job-name>` is set to `<notebook>_<random-string>` if not defined)
- _In job container_ - download `<notebook>` and execute it with `jupyter nbconvert --execute`; the result is saved as `<html-result>` with the same name (`*.html`)
- _In job container_ - upload `<html-result>` to `<workspace>/jobs/<job-name>/output`
- _Local_ - wait and query the job state until its status becomes `SUCCEEDED`
- _Local_ - download `<html-result>` and open it in a web browser
|
||||
|
||||
### How to launch a remote `Jupyter` server and connect it
|
||||
|
||||
Sometimes users may want to launch a remote `Jupyter` server and work on it interactively. To do this, just add `--interactive` to the `job notebook` command. After submitting the job, a link like `http://x.x.x.x:port/notebooks/<notebook>` will be opened in your browser. Since it takes a while to start the container, please wait and refresh the page until the notebook opens. Use the default token `abcd` (unless it is overridden by `--token <token>`) to log in to the notebook.
|
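For example (the same options as `opai job notebook` above, plus the interactive flags):

```bash
opai job notebook -a <cluster-alias> -i <docker-image> -w <workspace> --interactive --token <token> <notebook>
```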
||||
|
||||
## Other FAQ of CLI
|
||||
|
||||
### How to select a cluster to use until I change it
|
||||
|
||||
As shown in the above examples, `--cluster-alias, -a` is required by many commands but rarely changes, so typing it every time is tedious. The CLI tool provides a command to select the cluster to use:
|
||||
|
||||
```bash
|
||||
opai cluster select [-g] <cluster-alias>
|
||||
```
|
||||
|
||||
Commands after `opai cluster select` will have a default `--cluster-alias <cluster-alias>` option (where necessary), which can be overridden explicitly. The mechanism and priority order are the same as in the section below.
|
||||
|
||||
### How to simplify the command
|
||||
|
||||
The mechanism behind the `opai cluster select` command helps simplify commands further. For example, we can give `--workspace, -w` a default value by
|
||||
|
||||
```bash
|
||||
opai set [-g] workspace=<workspace>
|
||||
```
|
||||
|
||||
The SDK first loads the global defaults (`~/.openpai/defaults.yaml`) and then updates them with the contents of `.openpai/defaults.yaml` in your current working directory. Whenever a command requires a `--workspace, -w` option but no value is given, the default value is used.
|
||||
|
||||
Some commonly used default variables include (see the sketch after this list):
|
||||
|
||||
- `cluster-alias=<cluster-alias>`
|
||||
- `image=<docker-image>`
|
||||
- `workspace=<workspace>`
|
||||
- `container-sdk-branch=<container-sdk-branch-tag>`, which branch to use when installing the SDK in the job container
|
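A `.openpai/defaults.yaml` holding these variables might look like the sketch below (the exact file layout is an assumption; the values are illustrative):

```yaml
cluster-alias: <cluster-alias>
image: <docker-image>
workspace: <workspace>
container-sdk-branch: master
```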
||||
|
||||
### How to install a different version of SDK
|
||||
|
||||
Users can easily switch to another version of the SDK, both in the local environment and in the job container. In the local environment, just change `<your/branch>` to another branch (e.g. `pai-0.14.y` for the `OpenPAI` end-June release, or a feature development branch for the canary version).
|
||||
|
||||
```bash
|
||||
pip install -U "git+https://github.com/Microsoft/pai@<your/branch>#egg=openpaisdk&subdirectory=contrib/python-sdk"
|
||||
```
|
||||
|
||||
To debug a local update, just run `pip install -U` on your local directory containing `setup.py`.
|
||||
|
||||
For jobs submitted by the SDK or the command line tool, the version specified by `opai set container-sdk-branch=<your/version>` is used first. If not specified, the `master` branch is used.
|
||||
|
||||
### How to specify the `python` environment I want to use in the job container
|
||||
|
||||
In some cases, there is more than one `python` environment in a docker image. For example, there are both `python` and `python3` environments in `openpai/pai.example.keras.tensorflow`. Users can add `--python <path/to/python>` (e.g. `--python python3`) to the `job notebook` or `job sub` command to use a specific `python` environment. Refer to the [notebook example](examples/1-submit-and-query-via-command-line.ipynb) for more details.
|
||||
|
||||
# Python binding
|
||||
|
||||
## Cluster management
|
||||
|
||||
- [x] Users can describe multiple clusters with the `openpaisdk.core.ClusterList` class
|
||||
|
||||
```python
|
||||
clusters = ClusterList().load() # loaded from "~/.openpai/clusters.yaml" by default
|
||||
```
|
||||
|
||||
Use the `add` and `delete` methods to update clusters, and the `select` and `get_client` methods to pick one of multiple clusters.
|
||||
|
||||
To add a cluster:
|
||||
```python
|
||||
cluster_cfg = {
|
||||
"cluster_alias": ..., # each cluster mush have an unique alias
|
||||
"pai_uri": ...,
|
||||
"user": ...,
|
||||
# for user/password authentication
|
||||
"password": ...,
|
||||
# for Azure AD authentication
|
||||
"token": ...,
|
||||
}
|
||||
ClusterList().load().add(cluster_cfg).save()
|
||||
```
|
||||
|
||||
To delete a cluster:
|
||||
```python
|
||||
ClusterList().load().delete(cluster_alias).save()
|
||||
```
|
||||
|
||||
- [x] the `Cluster` class has methods to query and submit jobs
|
||||
|
||||
```python
|
||||
client = clusters.get_client(alias)
|
||||
client.jobs(name)
|
||||
client.rest_api_submit(job_config)
|
||||
```
|
||||
|
||||
- [x] the `Cluster` class has methods to access storage (through `WebHDFS` only for this version)
|
||||
|
||||
```python
|
||||
Cluster(...).storage.upload/download(...)
|
||||
```
|
||||
|
||||
## Job management
|
||||
|
||||
- [x] User can describe a job with `openpaisdk.core.Job` class, which is compatible with the v2 protocol
|
||||
|
||||
```python
|
||||
job = Job(name)
|
||||
job.submit(cluster_alias) # submit current job to a cluster
|
||||
```
|
||||
|
||||
- [x] provide some quick template of simple jobs
|
||||
|
||||
```python
|
||||
job.one_liner(...) # generate job config from a command
|
||||
job.from_notebook(...) # turn notebook to job
|
||||
```
|
||||
|
||||
# Make contributions
|
||||
|
||||
Users may open issues and feature requests on [GitHub](https://github.com/microsoft/pai).
|
||||
|
||||
## Release plan
|
||||
|
||||
If a function you need is not yet included, please open a feature request issue.
|
||||
|
||||
## Debug the SDK
|
||||
|
||||
For users who want to improve the functions themselves: create a branch of the `OpenPAI` project, make your modifications locally, and then set your own branch as the SDK installation source by
|
||||
|
||||
```bash
|
||||
opai set container-sdk-branch=<your/branch>
|
||||
```
|
||||
|
||||
Then the `pip install` command in the job container will use `<your/branch>`. You may inspect the generated job config to verify this.
|
||||
|
||||
To set the internal logger to debug level, create an empty file `.openpai/debug_enable` to enable debug logging in the SDK; remove the file to restore normal logging.
|
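For example:

```bash
touch .openpai/debug_enable   # enable debug logging
rm .openpai/debug_enable      # restore normal logging
```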
||||
|
||||
## Unit tests
|
||||
|
||||
Please execute the command below under the `tests` directory to run a quick unit test.
|
||||
```bash
|
||||
python -m unittest discover
|
||||
```
|
||||
|
||||
Since the unit tests will try to connect to your cluster, set up a test environment instead of corrupting your practical settings. Add a `ut_init.sh` file in `tests` as below:
|
||||
```bash
|
||||
opai set clusters-in-local=yes # don't corrupt practical environment
|
||||
opai cluster add -a <cluster-alias> --pai-uri http://x.x.x.x --user <user> --password <password>
|
||||
opai cluster select <cluster-alias>
|
||||
```
|
|
@ -1,409 +0,0 @@
|
|||
## The `Python` SDK and CLI for `OpenPAI`
|
||||
|
||||
This is a proof-of-concept SDK (Python) and CLI (command-line-interface) tool for the [OpenPAI](http://github.com/microsoft/pai). This project provides some facilities to make `OpenPAI` more easily accessible and usable for users. With it,
|
||||
|
||||
- User can easily access `OpenPAI` resources in scripts (`Python` or `Shell`) and `Jupyter` notebooks
|
||||
- User can easily submit and list jobs by simple commands, or snippets of code
|
||||
- User can easily accomplish complicated operations with `OpenPAI`
|
||||
- User can easily reuse local codes and notebooks
|
||||
- User can easily manage and switch between multiple `OpenPAI` clusters
|
||||
|
||||
Besides above benefits, this project also provides powerful runtime support, which bridges users' (local) working environments and jobs' running environments (inside the containers started by remote cluster). See more about[ the scenarios and user stories](docs/scenarios-and-user-stories.md).
|
||||
|
||||
- [Get started](#get-started)
|
||||
- [Installation](#installation)
|
||||
- [Dependencies](#dependencies)
|
||||
- [Define your clusters](#define-your-clusters)
|
||||
- [How-to guide for the CLI tool](#how-to-guide-for-the-cli-tool)
|
||||
- [Cluster and storage management](#cluster-and-storage-management)
|
||||
- [How to list existing clusters](#how-to-list-existing-clusters)
|
||||
- [How to open and edit the cluster configuration file](#how-to-open-and-edit-the-cluster-configuration-file)
|
||||
- [How to check the available resources of clusters](#how-to-check-the-available-resources-of-clusters)
|
||||
- [How to add / delete a cluster](#how-to-add--delete-a-cluster)
|
||||
- [How to access storages of a cluster](#how-to-access-storages-of-a-cluster)
|
||||
- [Job operations](#job-operations)
|
||||
- [How to query my jobs in a cluster](#how-to-query-my-jobs-in-a-cluster)
|
||||
- [How to submit a job from existing job config file](#how-to-submit-a-job-from-existing-job-config-file)
|
||||
- [How to change the configuration before submitting](#how-to-change-the-configuration-before-submitting)
|
||||
- [How to submit a job if I have no existing job config file](#how-to-submit-a-job-if-i-have-no-existing-job-config-file)
|
||||
- [How to request (GPU) resources for the job](#how-to-request-gpu-resources-for-the-job)
|
||||
- [How to reference a local file when submitting a job](#how-to-reference-a-local-file-when-submitting-a-job)
|
||||
- [How to submit a job given a sequence of commands](#how-to-submit-a-job-given-a-sequence-of-commands)
|
||||
- [How to add `pip install` packages](#how-to-add-pip-install-packages)
|
||||
- [How to preview the generated job config but not submit it](#how-to-preview-the-generated-job-config-but-not-submit-it)
|
||||
- [`Jupyter` notebook](#jupyter-notebook)
|
||||
- [How to run a local notebook with remote resources](#how-to-run-a-local-notebook-with-remote-resources)
|
||||
- [How to launch a remote `Jupyter` server and connect it](#how-to-launch-a-remote-jupyter-server-and-connect-it)
|
||||
- [Other FAQ of CLI](#other-faq-of-cli)
|
||||
- [How to select a cluster to use until I change it](#how-to-select-a-cluster-to-use-until-i-change-it)
|
||||
- [How to simplify the command](#how-to-simplify-the-command)
|
||||
- [How to install a different version of SDK](#how-to-install-a-different-version-of-sdk)
|
||||
- [How to specify the `python` environment I want to use in the job container](#how-to-specify-the-python-environment-i-want-to-use-in-the-job-container)
|
||||
- [Python binding](#python-binding)
|
||||
- [Cluster management](#cluster-management)
|
||||
- [Job management](#job-management)
|
||||
- [Make contributions](#make-contributions)
|
||||
- [Release plan](#release-plan)
|
||||
- [Debug the SDK](#debug-the-sdk)
|
||||
- [Unit tests](#unit-tests)
|
||||
|
||||
# Get started

This section gives guidance about installation and cluster management. More details not covered here can be found in the [command line references](docs/command-line-references.md).

## Installation

We provide an installation method leveraging `pip install`:

```bash
python -m pip install --upgrade pip
pip install -U "git+https://github.com/Microsoft/pai@master#egg=openpaisdk&subdirectory=contrib/python-sdk"
```

Refer to [How to install a different version of SDK](#How-to-install-a-different-version-of-SDK) for more details about installation. After installing, please verify it via the CLI or the python binding as below.

```bash
opai -h
python -c "from openpaisdk import __version__; print(__version__)"
```

### Dependencies

- The package requires python3 (mainly because of `type hinting`), and we have only tested it in `py3.5+` environments. *Only the commands `job sub` and `job notebook` require installing this project inside the container; the others place no constraints on the `python` version in the docker container.*
- [`Pylon`](https://github.com/microsoft/pai/tree/master/docs/pylon) is required to parse REST API paths like `/rest-server/`.

## Define your clusters

Please store the list of your clusters in `~/.openpai/clusters.yaml`. Every cluster has an alias for reference, and you may save more than one cluster in the list.

```yaml
- cluster_alias: cluster-for-test
  pai_uri: http://x.x.x.x
  user: myuser
  password: mypassword
  default_storage_alias: hdfs
  storages:
    - protocol: webHDFS
      storage_alias: hdfs
      web_hdfs_uri: http://x.x.x.x:port
```

Now the below command will display all your clusters.

```bash
opai cluster list
```

# How-to guide for the CLI tool

This section shows how to leverage the CLI tool (prefixed by `opai`) to improve your productivity when interacting with `OpenPAI`. Below is a summary of the functions provided.

| Command | Description |
| -------------------------------- | ---------------------------------------------------------------------------------- |
| `opai cluster list` | list clusters defined in `~/.openpai/clusters.yaml` |
| `opai cluster resources` | list available resources of every cluster (GPUs/vCores/Memory per virtual cluster) |
| `opai cluster edit` | open `~/.openpai/clusters.yaml` for your editing |
| `opai cluster add` | add a cluster |
| `opai cluster attach-hdfs` | attach an `hdfs` storage through `WebHDFS` |
| `opai job list` | list all jobs of the current user (in a given cluster) |
| `opai job submit` | submit a given job config file to a cluster |
| `opai job sub` | shortcut to generate a job config from a given command and submit it |
| `opai job notebook` | shortcut to run a local notebook remotely |
| `opai storage <operation>` | execute `<operation>`* on a selected storage (of a given cluster) |

**: operations include `list`, `status`, `upload`, `download` and `delete`*

Before starting, we'd like to define some commonly used variables as below.

| Variable name | CLI options | Description |
| ----------------------- | --------------------- | --------------------------------------------- |
| `<cluster-alias>` | `--cluster-alias, -a` | alias to specify a particular cluster |
| `<job-name>` | `--job-name, -j` | job name |
| `<docker-image>` | `--image, -i` | image name (and tag) for the job |
| `<workspace>` | `--workspace, -w` | remote storage path to save files for a job * |

**: if specified, a directory `<workspace>/jobs/<job-name>` and subfolders (e.g. `source`, `output` ...) will be created to store necessary files for the job named `<job-name>`*

## Cluster and storage management

### How to list existing clusters

To list all existing clusters in `~/.openpai/clusters.yaml`, execute the below command:

```bash
opai cluster list
```

### How to open and edit the cluster configuration file

We provide a convenient shortcut command to open the cluster configuration file with your editor directly:

```bash
opai cluster edit [--editor <path/to/editor>]
```

The default editor is VS Code (`code`); users may change to another editor (e.g. `--editor notepad`).

### How to check the available resources of clusters

To check the availability of each cluster, use the command

```bash
opai cluster resources
```

It will return the available GPUs, vCores and memory of every virtual cluster in every cluster.

User can also check it in a `Python` script as below:

```python
from openpaisdk import __cluster_config_file__
from openpaisdk.io_utils import from_file
from openpaisdk.cluster import ClusterList

cfg = from_file(__cluster_config_file__, default=[])
ClusterList(cfg).available_resources()
```

### How to add / delete a cluster

User can use the `add` and `delete` commands to add (or delete) a cluster in the clusters file.

```bash
opai cluster add --cluster-alias <cluster-alias> --pai-uri http://x.x.x.x --user myuser --password mypassword
opai cluster delete <cluster-alias>
```

After adding a cluster, user may add more information (such as storage info) to it.

### How to access storages of a cluster

Before accessing, user needs to attach storages to a specific cluster.

```bash
opai cluster attach-hdfs --cluster-alias <cluster-alias> --storage-alias hdfs --web-hdfs-uri http://x.x.x.x:port --default
```

It is supported to attach multiple heterogeneous storages (e.g. `HDFS`, `NFS` ...) to a cluster, and one of the storages will be set as the default (used to upload local codes). If not specified, the storage added first will be the default.

After attaching, basic operations (e.g. `list`, `upload`, `download` ...) are provided.

```bash
opai storage list -a <cluster-alias> -s <storage-alias> <remote-path>
opai storage download -a <cluster-alias> -s <storage-alias> <remote-path> <local-path>
opai storage upload -a <cluster-alias> -s <storage-alias> <local-path> <remote-path>
```

## Job operations

### How to query my jobs in a cluster

User could retrieve the list of submitted jobs from a cluster. If more information is wanted, add the `<job-name>` to the command.

```bash
opai job list -a <cluster-alias> [<job-name>]
```

### How to submit a job from existing job config file

If you already have a job config file, you could submit a job based on it directly. The job config file could be in `json` or `yaml` format, and it must be compatible with [job configuration specification v1](https://github.com/microsoft/pai/blob/master/docs/job_tutorial.md) or [pai-job-protocol v2](https://github.com/microsoft/openpai-protocol/blob/master/schemas/v2/schema.yaml).

```bash
opai job submit -a <cluster-alias> <config-file>
```

The CLI will determine whether the file is a `v1` or `v2` job configuration and call the corresponding REST API to submit it.

### How to change the configuration before submitting

The CLI also provides a way to change some contents of an existing job config file before submitting it. For example, we may need to change the job name to avoid duplicated names, or may want to switch to a virtual cluster with more available resources. Of course, user could change the contents of `jobName` and `virtualCluster` (in the `v1` format) or `name` and `virtualCluster` in `defaults` (in the `v2` format) manually, but the CLI provides a more efficient and easy way to do the same thing.

```bash
# compatible with v2 specification
opai job submit --update name=<job-name> -u defaults:virtualCluster=test <config-file>

# compatible with v1 specification
opai job submit --update jobName=<job-name> -u virtualCluster=test <config-file>
```

### How to submit a job if I have no existing job config file

It is not convenient to write a job config file (whether to the `v1` or `v2` specification). For users who just want to run a specific command (or a sequence of commands) on the resources of the cluster, the CLI provides a command `sub` (distinct from `submit`), which generates the job config first and then `submit`s it.

For example, if user wants to run `mnist_cnn.py` in a docker container (where the file is already contained in the docker image), the command would be

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> python mnist_cnn.py
```

### How to request (GPU) resources for the job

User could apply for specific resources (CPUs, GPUs and memory) for the job, just by adding the below options to the above commands (see the example after this list).

- `--cpu <#cpu>`
- `--gpu <#gpu>`
- `--memoryMB <#memory-in-unit-of-MB>`
- `--ports <label-1>=<port-1> [--ports <label-2>=<port-2> [...]]`

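For example, a `job sub` command requesting 1 GPU, 4 CPUs, 8 GB of memory and a labeled port might look like the following (the resource values and the port label are illustrative only):

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> \
    --gpu 1 --cpu 4 --memoryMB 8192 --ports tensorboard=6006 \
    python mnist_cnn.py
```
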
### How to reference a local file when submitting a job

If `mnist_cnn.py` is not included in the docker image but is a file stored on your local disk, the above command would fail because the file cannot be accessed in the remote job container. To solve this problem, add the option `--sources mnist_cnn.py` to the command. Since the job container cannot access your local disk directly, the file needs to be uploaded to somewhere (defined by `--workspace`) in [the default storage of the cluster](#How-to-access-storages-of-a-cluster).

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> -w <workspace> --sources mnist_cnn.py python mnist_cnn.py
```

### How to submit a job given a sequence of commands

In some cases, user wants to run a sequence of commands in the job. The recommended way is to put your commands in a pair of quotes (like `"git clone ... && python ..."`) and join them with `&&` if you have multiple commands to run. Here is an example of combining 3 commands.

```bash
opai job sub [...] "git clone <repo-uri> && cd <repo-dir> && python run.py arg1 arg2 ..."
```

### How to add `pip install` packages

Of course, you could write a sequence of commands like `pip install ... && python ...`. Another way is to use the `--pip-installs <package>` and `--pip-path <path/to/pip>` options, which add new commands to the `preCommands` of the `deployment`. A sketch follows.

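For instance, installing a package before the user command runs might look like the following (the package name is illustrative):

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> --pip-installs scikit-learn python mnist_cnn.py
```
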
### How to preview the generated job config but not submit it

In some cases, user may want to preview the job config (in the `v2` format) without submitting it directly. To do so, just add the `--preview` option. The commands supporting this feature include `job submit`, `job sub` and `job notebook`.

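For example, the following prints the generated job config instead of submitting the job:

```bash
opai job sub -a <cluster-alias> -i <docker-image> -j <job-name> --preview python mnist_cnn.py
```
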
## `Jupyter` notebook

### How to run a local notebook with remote resources

Suppose you have a local `<notebook>` (e.g. `mnist_cnn.ipynb` stored on your local disk) and want to run it remotely (on `OpenPAI`) and see the result:

```bash
opai job notebook -a <cluster-alias> -i <docker-image> -w <workspace> <notebook>
```

This command requires the same options as `opai job sub` does. It will

- *Local* - upload `<notebook>` to `<workspace>/jobs/<job-name>/source` and submit the job to the cluster (`<job-name>` is set to `<notebook>_<random-string>` if not defined)
- *In job container* - download `<notebook>` and execute it by `jupyter nbconvert --execute`; the result will be saved in `<html-result>` with the same name (`*.html`)
- *In job container* - upload `<html-result>` to `<workspace>/jobs/<job-name>/output`
- *Local* - wait and query the job state until its status becomes `SUCCEEDED`
- *Local* - download `<html-result>` and open it with the web browser

### How to launch a remote `Jupyter` server and connect it

Sometimes user may want to launch a remote `Jupyter` server and work on it interactively. To do this, just add `--interactive` to the `job notebook` command, as shown below. After submitting the job, a link like `http://x.x.x.x:port/notebooks/<notebook>` will be opened in your browser. Since it takes a while to start the container, please wait and refresh the page until the notebook opens. Use the default token `abcd` (unless it is overridden by `--token <token>`) to log in to the notebook.

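For example, reusing the placeholders from the previous command:

```bash
opai job notebook -a <cluster-alias> -i <docker-image> -w <workspace> --interactive --token <token> <notebook>
```
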
## Other FAQ of CLI

### How to select a cluster to use until I change it

As shown in the above examples, `--cluster-alias, -a` is required by lots of commands, but it may not change frequently, so it is annoying to type it every time. The CLI tool provides a command to select a cluster to use:

```bash
opai cluster select [-g] <cluster-alias>
```

Commands after `opai cluster select` will have a default option (if necessary) `--cluster-alias <cluster-alias>`, which can be overwritten explicitly. The mechanism and priority sequence are the same as described in the section below.

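For example, after selecting a cluster, `job list` no longer needs an explicit alias:

```bash
opai cluster select <cluster-alias>
opai job list    # equivalent to: opai job list -a <cluster-alias>
```
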
### How to simplify the command

The mechanism behind the `opai cluster select` command helps us simplify commands further. For example, we could set `--workspace, -w` with a default value by

```bash
opai set [-g] workspace=<workspace>
```

The SDK will first load the global defaults (`~/.openpai/defaults.yaml`), and then update them with the contents of `.openpai/defaults.yaml` in your current working directory. Whenever a command requires a `--workspace, -w` option but no value is given, the default value will be used. A combined example is shown after the list below.

Some commonly used default variables include

- `cluster-alias=<cluster-alias>`
- `image=<docker-image>`
- `workspace=<workspace>`
- `sdk-branch=<sdk-branch-tag>`, which branch to use when installing the SDK in the job container

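For example, setting the defaults once makes later submissions much shorter (placeholders as before):

```bash
opai set cluster-alias=<cluster-alias> image=<docker-image> workspace=<workspace>
# the alias, image and workspace need not be repeated from now on
opai job sub -j <job-name> python mnist_cnn.py
```
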
### How to install a different version of SDK

User could easily switch to another version of the SDK, both in the local environment and in the job container. For the local environment, just change `<your/branch>` to another branch (e.g. `pai-0.14.y` for the `OpenPAI` end-June release, or a feature development branch for the canary version).

```bash
pip install -U "git+https://github.com/Microsoft/pai@<your/branch>#egg=openpaisdk&subdirectory=contrib/python-sdk"
```

To debug a local change, just use `pip install -U <your/path/to/the/project>` (the directory containing `setup.py`).

For jobs submitted by the SDK or the command line tool, the version specified by `opai set sdk-branch=<your/version>` will be used first. If not specified, the `master` branch will be used.

### How to specify the `python` environment I want to use in the job container

In some cases, there is more than one `python` environment in a docker image. For example, there are both `python` and `python3` environments in `openpai/pai.example.keras.tensorflow`. User could add `--python <path/to/python>` (e.g. `--python python3`) to the `job notebook` or `job sub` command to use a specific `python` environment. Refer to the [notebook example](examples/1-submit-and-query-via-command-line.ipynb) for more details. A sketch follows.

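A minimal sketch, assuming the image above and that `--python` only selects the interpreter the SDK runtime uses inside the container:

```bash
opai job sub -a <cluster-alias> -i openpai/pai.example.keras.tensorflow --python python3 -j <job-name> python mnist_cnn.py
```
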
# Python binding

## Cluster management

- [x] User can describe multiple clusters with the `openpaisdk.core.ClusterList` class

```python
clusters = ClusterList().load() # loaded from "~/.openpai/clusters.yaml" by default
```

User can use the `add` and `delete` methods to update clusters, and the `select` and `get_client` methods to select one from multiple clusters

- [x] the `Cluster` class has methods to query and submit jobs

```python
client = clusters.get_client(alias)
client.jobs(name)
client.rest_api_submit(job_config)
```

- [x] the `Cluster` class has methods to access storage (through `WebHDFS` only for this version)

```python
Cluster(...).storage.upload/download(...)
```

## Job management

- [x] User can describe a job with the `openpaisdk.core.Job` class, which is compatible with the v2 protocol

```python
job = Job(name)
job.submit(cluster_alias) # submit current job to a cluster
```

- [x] provide some quick templates of simple jobs

```python
job.one_liner(...) # generate job config from a command
job.from_notebook(...) # turn a notebook into a job
```

# Make contributions

User may open issues and feature requests on [Github](https://github.com/microsoft/pai).

## Release plan

If there are functions not yet included, please open an issue for a feature request.

## Debug the SDK

For users who want to improve the functions themselves, you may create a branch of the `OpenPAI` project and make modifications locally, then set your own branch as the SDK installation source by

```bash
opai set sdk-branch=<your/branch>
```

The `pip install` command in the job container will then use `<your/branch>`. User may inspect the generated job config to verify.

To set the internal logger to debug level, create an empty file `.openpai/debug_enable` to let the SDK enable debug logging, as shown below. Remove the empty file to restore normal behavior.

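For example:

```bash
touch .openpai/debug_enable    # enable debug-level logging
rm .openpai/debug_enable       # restore normal logging
```
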
## Unit tests

Please execute the below command under the `tests` directory to run a quick unit test.

```bash
python -m unittest discover
```

@ -1,149 +0,0 @@

# 1. Get started

This section will give guidance about installation, cluster management and setting up frequently used variables. Refer to the README for more details.

## 1.1. Installation

Refer to the [README](../README.md#21-Installation) for how to install the SDK and specify your cluster information.

## 1.2. Set default values

It is annoying to specify some arguments every time (e.g. `-a <alias>` or `-i <image>`). During the workflow, user may often reference some variables without changing them. For example, it is common to use the same docker image for multiple jobs, and the storage root doesn't change either. To simplify this, it is suggested to set them with the `set` command; they will be stored in `.openpai/defaults.json` in the current working directory.

```bash
opai set [<variable1>=<value1> [<var2>=<val2> [...]]]
opai unset <variable1> [<var2> [...]]
```

Here are some frequently used variables.

| Variable | Description |
| -- | -- |
| `cluster-alias` | the alias to select which cluster to connect |
| `image` | docker image name (and tag) to use |
| `workspace` | the root path in remote storage to store job information (`<workspace>/jobs/<job-name>`) |

<font color=blue>_Note: some required arguments in the below examples are set in defaults (and omitted in the examples); please refer to the `help` information via `-h` or `--help`_</font>

# 2. CLI tools

The command line tool `opai` provides several useful subcommands.

| Scene | Action | Description |
| -- | -- | -- |
| `cluster` | `list` | cluster configuration management |
| `storage` | `list`, `status`, `upload`, `download`, `delete` | remote storage access |
| `job` | `list`, `new`, `submit`, `sub` | query, create and submit a job |
| `task` | `add` | add a task role to a job |
| `require` | `pip`, `weblink` | add requirements (prerequisites) to a job or task role |
| `runtime` | `execute` | python SDK run as the runtime |

## 2.1. Query your existing jobs

By executing the below command, all your existing job names will be displayed.

```bash
opai job list [-a <alias>] [<job-name>] [{config,ssh}]
```

## 2.2. Submit a job with an existing config file

Of course, you could submit a job from a job config `json` file by

```bash
opai job submit [-a <alias>] --config <your-job-config-file>
```

## 2.3. Submit a job step by step from scratch

To submit a job from scratch, user needs to create the job first via `job new` (it will be cached in `.openpai/jobs/<job-name>`). Then task roles can be added by the `task` command one by one, and the `submit` command will dump the job config to `.openpai/jobs/<job-name>/config.json` and submit it through the `REST` API.

```bash
opai job new [-a <alias>] -j <job-name> [-i <image>] [-s <source-file>]
opai task -t <name-1> [-n <num>] [--gpu <gpu>] [--cpu <cpu>] [--mem <memMB>] python ...
opai task -t <name-2> [-n <num>] [--gpu <gpu>] [--cpu <cpu>] [--mem <memMB>] python ...
opai job submit [--preview]
```

## 2.4. Add requirements (prerequisites)

It is a common scenario that users prepare their environments by adding requirements, such as installing python packages or mapping data storages. A prerequisite can apply to a specific task role (if both `--job-name, -j` and `--task-role-name, -t` are specified) or to all task roles in the job (if only `--job-name` is specified).

```bash
opai require pip ...
opai require weblink http://x.x.x.x/filename.zip /data
```

In the above commands, user can specify `--job-name <job-name>` (required) and `--task-role-name <task-role-name>` (optional). If a task role name is specified, the command only applies to that specific task role; otherwise, it applies to the job (all task roles), as the example after the list below shows.

Now we support

- python `pip` packages
- data mapping with weblinks

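For example, a sketch that installs a package for one task role only (the package name is illustrative):

```bash
opai require pip numpy -j <job-name> -t <task-role-name>
```
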
## 2.5. Submit a one-line job from the command line

For jobs that are simple (e.g. with only one task role), the CLI tool provides a shortcut command `sub`, which combines create, task and submit into a single command.

If your job has only one task role and its command looks like `python script.py arg1 arg2`, you may submit it in the simplest way, like

```bash
opai job sub -j <job-name> [-a <alias>] [-i <your-image>] python script.py arg1 arg2
```

## 2.6. _InProgress_ Job management and fetching outputs

The SDK provides simple job management based on a folder structure on _remote_ storage. It is recommended to upload user logs or results to the output directory.

```bash
workspace (remote storage)
└─jobs
    └─job-name-1
        ├─code
        └─output
    └─job-name-2
        ├─code
        └─output
```

The `workspace` and output directory paths are passed to the job container via the environment variables `PAI_SDK_JOB_WORKSPACE` and `PAI_SDK_JOB_OUTPUT_DIR`, as the sketch below shows.

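For instance, inside the job container a user command could save its results where the SDK expects outputs (a sketch; the file name is illustrative):

```bash
cp result.csv "$PAI_SDK_JOB_OUTPUT_DIR"/
```
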
User can use the below commands to fetch the outputs.

```bash
opai output list [-j <job-name>]
opai output download [-j <job-name>] <output-name> [<output-name-1> [...]]
opai output peek [-j <job-name>] [--stdout] [--stdin] [--save <local-copy-name>]
```

## 2.7. Storage access

```bash
opai storage list <remote-path>
opai storage delete <remote-path>
opai storage status <remote-path>
opai storage upload [--overwrite] <local-path> <remote-path>
opai storage download <remote-path> <local-path>
```

`HDFS` access is implemented with the `hdfs` package, whose backend is the `webHDFS` API.

## 2.8. _InProgress_ Job cloning and batch submitting

Advanced functions like job cloning have proven to be very useful. User can clone from a local job config file or an existing job name, and may change some parameters (nested dictionary paths joined by `::`) to new values.

```bash
opai job clone --from <job-name-or-config> -j <new-job-name> <parameter::path::config>=<new-value> [...]
```

It is natural to try submitting multiple jobs with only small changes in the config.

```python
from subprocess import check_call

# base job
check_call('opai job sub -j base_job --env LR=0.001 python train.py $LR'.split())
# batch submit
for lr in ["0.005", "0.01"]:
    check_call(f'opai job clone --from base_job -j bj_lr_{lr} jobEnvs::LR={lr}'.split())
```

@ -1,18 +0,0 @@

```mermaid
sequenceDiagram
    participant FE as Front End or Plugins
    participant Launcher as OpenPAI Core
    participant RT as Runtime (in container)
    Note left of FE: User
    FE->>FE: prepare data & codes *
    FE->>Launcher: submit a job *
    Launcher->>+RT: pass info through Protocol
    Note right of RT: parse protocol *
    Note over RT, Storage: access data (if any) *
    Note right of RT: execute cmds *
    Note right of RT: callbacks *
    RT->>Storage: save annotated files *
    RT->>-Launcher: exit container
    FE->>Launcher: query job info *
    FE->>Storage: fetch job outputs *
```
@ -1,61 +0,0 @@

# 1. Python binding

After installing the SDK, there is a package named `openpaisdk` that can be imported in python code. Here are some frequently used classes.

```python
from openpaisdk.core import Client # OpenPAI client
from openpaisdk.job import Job # job description
from openpaisdk.command_line import Engine # command dispatcher
```

## 1.1. Detect your execution environment

In your code, you may use `openpaisdk.core.in_job_container` to detect where you are. This lets you do different things according to your environment.

```python
from openpaisdk.core import in_job_container
# help(in_job_container) for more details
if in_job_container():
    pass
else:
    pass
```

This function is implemented by checking whether some environment variable (e.g. `PAI_CONTAINER_ID`) is set to a non-empty value.

## 1.2. Do it in an easy way

To unify the interface and reduce the user's learning cost, user can do whatever the CLI provides in python code in a similar way by calling `Engine`. For example, the following lines query all existing jobs submitted by the current user in the cluster named `your-alias`.

```python
from openpaisdk.command_line import Engine

job_name_list = Engine().process(['job', 'list', '--name', '-a', 'your-alias'])
```

The advantages of this way over using `os.system()` or `subprocess.check_call` lie in (a) avoiding overhead and (b) getting structured results (no need to parse the text output). This way also guarantees consistency between the CLI and the python binding.

## 1.3. Do it in a more pythonic way

Since some may not like the above solution, user can of course use the code snippets behind the CLI. Here is the code to do the same thing.

```python
from openpaisdk.core import Client
from openpaisdk import __cluster_config_file__

client, _ = Client.from_json(__cluster_config_file__, 'your-alias')
job_name_list = client.jobs(name_only=True)
```

## 1.4. Submit your working notebook running on a local server

If you are working in a local `Jupyter` notebook, adding the below cell and executing it will submit a job.

```python
from openpaisdk.notebook import submit_notebook
from openpaisdk.core import in_job_container
# help(submit_notebook) for more details
if not in_job_container():
    job_link = submit_notebook()
    print(job_link)
```
@ -1,62 +0,0 @@

# 1. _ToDiscuss_ Python SDK as a runtime

When submitting a job through the SDK (CLI or python binding), the SDK will be installed inside the job container automatically by default (turn this off by adding `--disable-sdk-install` in `job create`).

## 1.1. Reconstruct the client in the job container

The SDK passes necessary information to the job container through the `__clusters__` and `__defaults__` items of the `extras` part of the job config file, and the `runtime` command saves them to `~/.openpai/clusters.json` and `.openpai/defaults.json` respectively.

## 1.2. User can customize callbacks before or after the command execution

This is similar to the pre- and post-commands in protocol v2.

## 1.3. User can customize callbacks when an exception is raised

This is for debugging.

## 1.4. Implementation

An ideal implementation is for the SDK to provide some decorators for registering callbacks. Here is an example.

```python
# original codes
...

def main(args):
    ...

if __name__ == "__main__":
    ...
    result = main(args)
    ...
```

After customizing callbacks, it may look like

```python
# for openpai

from openpai.runtime import Runtime

app = Runtime.from_env()

@app.on('start')
def pre_commands(...): # if not defined, use that generated from job config
    ...

@app.on('end')
def post_commands(...): # if not defined, use that generated from job config
    ...

@app.on('main')
def main(args):
    ...

if __name__ == "__main__":
    ...
    result = app.run(args)
    ...
```

_Note: the Runtime may only be triggered when `in_job_container()` is true, or under some user-defined conditions_
@ -1,66 +0,0 @@

# 1. Benefits and scenarios

## 1.1. Easily accessible `OpenPAI` interface

- **User can easily access `OpenPAI` resources in scripts (`Python` or `Shell`) and `Jupyter` notebooks**

    The SDK provides classes to describe the clusters (`openpaisdk.core.Cluster`) and jobs (`openpaisdk.job.Job`). The Cluster class wraps the necessary REST APIs for convenient operations. The Job class is an implementation of the [protocol](https://github.com/microsoft/openpai-protocol/blob/master/schemas/v2/schema.yaml), with which user can easily organize (add or edit) the content of a job's `yaml` and `json` configuration.

    Besides wrapping the APIs, the SDK also provides functions that help users utilize `OpenPAI`, including *cluster management*, *storage access* and *execution environment detection (local or in a job container)*.

    _Refer to [this doc]() for more details of the Python binding_

- **User can submit and list jobs with simple commands**

    This SDK provides a command line interface with the prefix `opai`. User can complete basic and advanced operations with simple commands, e.g.

    ```bash
    # query jobs
    opai job list
    # submit an existing job config file
    opai job submit --config your/job/config/file
    # submit a job in one line
    opai job sub --image your/docker/image --gpu 1 some/commands
    # storage access
    opai storage upload/download/list ...
    ```

    _Refer to [command-line-references.md](command-line-references.md) or execute `opai -h` for more details about the command line interface_

- **User can easily accomplish complicated operations with `OpenPAI`**

    For some advanced users or tools running on `OpenPAI` (e.g. [NNI]()), it is quite convenient to have a programmatic way to complete operations. For example, user can submit tens of jobs to optimize a parameter in a simple `for`-loop, which would be inconvenient to do manually.

- **User can easily reuse local codes**

    `OpenPAI` is quite efficient at utilizing powerful computing resources to run deep learning jobs. However, users have to make their codes and environment ready first. A common way is to start a long-running interactive job and write (debug) codes in it before the real execution. There are two disadvantages: the inconvenience of remote debugging, and the waste of computing resources.

    The SDK aims to solve this problem by letting users code locally and execute on `OpenPAI`. For example, user can code and debug in a locally running notebook first, and then use `openpaisdk.notebook.submit_notebook` to turn it into a job with only a few lines.

## 1.2. Powerful runtime support

By installing this package in the docker container, the SDK can run as part of the runtime.

- **It can provide more powerful built-in functions than `pre-commands` and `post-commands`**

    The current `OpenPAI` leverages pre-commands and post-commands to do necessary operations before or after the user commands. However, this is limited by the expressiveness of shell commands, and it would be quite hard to specify complicated behaviors. For example, some operations (e.g. storage mounting) require conditional handling depending on the OS version; that is hard to implement in pre-commands, yet easy to do with a function in the SDK.

- **It provides basic job management based on the workspace and job folder structure**

    For jobs submitted by the SDK (or CLI), a storage structure will be constructed for them. The SDK will create `code` and `output` (or other, if required) directories in `<workspace>/jobs/<job-name>`. The SDK and CLI also provide interfaces to access them.

- **It can let user annotate output files to be saved before exiting the container**

    User can annotate some files (or folders) to be uploaded when submitting the job.

- **It can provide a mechanism to execute certain callbacks in specified scenarios**

    We provide pre- and post-commands in the current implementation; beyond that, the SDK will try to let user specify behaviors in other cases. For example, user can specify what to do when the user commands exit with a non-zero return code.

## 1.3. Unified workflow

In the new implementation, the [job protocol]() bridges the user specification and the real execution of the job. The SDK is one of the implementations of the protocol, and includes functions to organize, edit, parse and execute the protocol as the user expects.

![program model](medias/programming_model.svg)

_*: the functions provided by the SDK or CLI_
@ -1,113 +0,0 @@

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Install the SDK\n",
    "Refer to the **Installation** part of [README](https://github.com/microsoft/pai/blob/sdk-release-v0.4.00/contrib/python-sdk/README.md)\n",
    "\n",
    "*Note: the code is now in a feature developing branch, and will be merged to master when stable*\n",
    "\n",
    "*Note 2: Restarting the kernel may be required to let python load the newly installed package*\n",
    "\n",
    "After installation, check it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import openpaisdk\n",
    "print(openpaisdk.__version__)\n",
    "print(openpaisdk.__container_sdk_branch__)\n",
    "print(openpaisdk.get_install_uri())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And also check the command line interface (CLI) tool `opai`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! opai -h"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Specify `OpenPAI` cluster information\n",
    "Refer to the corresponding part of [README](https://github.com/microsoft/pai/blob/sdk-release-v0.4.00/contrib/python-sdk/README.md)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Add a cluster\n",
    "User may add a new cluster by `opai cluster add` and attach an hdfs storage to it as below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! opai cluster add --cluster-alias cluster-for-test --pai-uri http://x.x.x.x --user myuser --password mypassword\n",
    "! opai cluster attach-hdfs --default --cluster-alias cluster-for-test --storage-alias hdfs --web-hdfs-uri http://x.x.x.x:port"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## List your clusters\n",
    "User may list all specified clusters by `opai cluster list`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openpaisdk.command_line import Engine\n",
    "\n",
    "cluster_cfg = Engine().process(['cluster', 'list'])[\"cluster-for-test\"]\n",
    "cluster_cfg"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
@ -1,192 +0,0 @@

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Prerequisites\n",
    "Install the `OpenPAI` sdk from `github` and specify your cluster information in `~/.openpai/clusters.yaml`. \n",
    "\n",
    "And for simplicity and security, we recommend users set up the necessary information in `.openpai/defaults.json` rather than showing it in the example notebook. (Refer to [README](https://github.com/microsoft/pai/blob/sdk-release-v0.4.00/contrib/python-sdk/README.md) for more details.)\n",
    "\n",
    "_Please make sure you have set default values for ***cluster-alias***. This notebook will not set them explicitly for security and privacy reasons_\n",
    "\n",
    "If not, use the below commands to set them\n",
    "```bash\n",
    "opai set cluster-alias=<your/cluster/alias>\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "from openpaisdk.command_line import Engine\n",
    "from openpaisdk.core import ClusterList, in_job_container\n",
    "from uuid import uuid4 as randstr\n",
    "\n",
    "clusters = Engine().process(['cluster', 'list'])\n",
    "default_values = Engine().process(['set'])\n",
    "print(default_values)\n",
    "\n",
    "cluster_alias = default_values[\"cluster-alias\"]\n",
    "assert cluster_alias in clusters, \"please specify cluster-alias and workspace\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Submit jobs\n",
    "\n",
    "Now we submit jobs from \n",
    "- an existing version 1 job config file\n",
    "- an existing version 2 job config file\n",
    "- a hello-world command line"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile mnist_v1.json\n",
    "{\n",
    "  \"jobName\": \"keras_tensorflow_backend_mnist\",\n",
    "  \"image\": \"openpai/pai.example.keras.tensorflow:stable\",\n",
    "  \"taskRoles\": [\n",
    "    {\n",
    "      \"name\": \"mnist\",\n",
    "      \"taskNumber\": 1,\n",
    "      \"cpuNumber\": 4,\n",
    "      \"memoryMB\": 8192,\n",
    "      \"gpuNumber\": 1,\n",
    "      \"command\": \"python mnist_cnn.py\"\n",
    "    }\n",
    "  ]\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile mnist_v2.yaml\n",
    "protocolVersion: 2\n",
    "name: keras_tensorflow_mnist\n",
    "type: job\n",
    "version: 1.0\n",
    "contributor: OpenPAI\n",
    "description: |\n",
    "  # Keras Tensorflow Backend MNIST Digit Recognition Examples\n",
    "  Trains a simple convnet on the MNIST dataset.\n",
    "  Gets to 99.25% test accuracy after 12 epochs\n",
    "  (there is still a lot of margin for parameter tuning).\n",
    "  16 seconds per epoch on a GRID K520 GPU.\n",
    "\n",
    "  Reference https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py\n",
    "\n",
    "prerequisites:\n",
    "  - protocolVersion: 2\n",
    "    name: keras_tensorflow_example\n",
    "    type: dockerimage\n",
    "    version: 1.0\n",
    "    contributor : OpenPAI\n",
    "    description: |\n",
    "      This is an [example Keras with TensorFlow backend Docker image on OpenPAI](https://github.com/Microsoft/pai/tree/master/examples/keras).\n",
    "    uri : openpai/pai.example.keras.tensorflow\n",
    "\n",
    "taskRoles:\n",
    "  train:\n",
    "    instances: 1\n",
    "    completion:\n",
    "      minSucceededInstances: 1\n",
    "    dockerImage: keras_tensorflow_example\n",
    "    resourcePerInstance:\n",
    "      cpu: 4\n",
    "      memoryMB: 8192\n",
    "      gpu: 1\n",
    "    commands:\n",
|
||||
" - python mnist_cnn.py"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tests = [\"submit_v1\", \"submit_v2\", \"sub_oneliner\"]\n",
|
||||
"jobnames = {k: k + '_' + randstr().hex for k in tests}\n",
|
||||
"\n",
|
||||
"options = \"\"\n",
|
||||
"# options += \" --preview\"\n",
|
||||
"\n",
|
||||
"if not in_job_container():\n",
|
||||
" jobs, cmds = [], []\n",
|
||||
" \n",
|
||||
" # submit v1\n",
|
||||
" jobs.append(\"submit_v1_\" + randstr().hex)\n",
|
||||
" cmds.append(f'opai job submit {options} --update jobName={jobs[-1]} mnist_v1.json')\n",
|
||||
"\n",
|
||||
" # submit v2\n",
|
||||
" jobs.append(\"submit_v2_\" + randstr().hex)\n",
|
||||
" cmds.append(f'opai job submit {options} --update name={jobs[-1]} mnist_v2.yaml')\n",
|
||||
" \n",
|
||||
" # sub\n",
|
||||
" jobs.append(\"sub_\" + randstr().hex) \n",
|
||||
" resource = '-i openpai/pai.example.keras.tensorflow --cpu 4 --memoryMB 8192 --gpu 1'\n",
|
||||
" cmds.append(f'opai job sub {options} -j {jobs[-1]} {resource} python mnist_cnn.py')\n",
|
||||
"\n",
|
||||
" # notebook\n",
|
||||
" jobs.append(\"notebook_\" + randstr().hex) \n",
|
||||
" cmds.append(f'opai job notebook {options} -j {jobs[-1]} {resource} --python python3 --pip-installs keras 2-submit-job-from-local-notebook.ipynb')\n",
|
||||
"\n",
|
||||
" for cmd in cmds:\n",
|
||||
" print(cmd, \"\\n\")\n",
|
||||
" ! {cmd}\n",
|
||||
" print(\"\\n\")\n",
|
||||
" \n",
|
||||
" states = ClusterList().load().get_client(cluster_alias).wait(jobs)\n",
|
||||
" failed_jobs = [t for i, t in enumerate(jobs) if states[i] != \"SUCCEEDED\"]\n",
|
||||
" assert not failed_jobs, \"some of jobs fails %s\" % failed_jobs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.8"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@ -1,115 +0,0 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Keras MNIST CNN example\n",
|
||||
"\n",
|
||||
"https://keras.io/examples/mnist_cnn/\n",
|
||||
"\n",
|
||||
"Trains a simple convnet on the MNIST dataset.\n",
|
||||
"\n",
|
||||
"Gets to 99.25% test accuracy after 12 epochs (there is still a lot of margin for parameter tuning). 16 seconds per epoch on a GRID K520 GPU.\n",
|
||||
"\n",
|
||||
"Submit this notebook to openpai by \n",
|
||||
"\n",
|
||||
"```bash\n",
|
||||
"opai job notebook -i openpai/pai.example.keras.tensorflow --cpu 4 --memoryMB 8192 --gpu 1 --python python3 --pip-installs keras 2-submit-job-from-local-notebook.ipynb\n",
|
||||
" ```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from __future__ import print_function\n",
|
||||
"import keras\n",
|
||||
"from keras.datasets import mnist\n",
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers import Dense, Dropout, Flatten\n",
|
||||
"from keras.layers import Conv2D, MaxPooling2D\n",
|
||||
"from keras import backend as K\n",
|
||||
"\n",
|
||||
"batch_size = 128\n",
|
||||
"num_classes = 10\n",
|
||||
"epochs = 12\n",
|
||||
"\n",
|
||||
"# input image dimensions\n",
|
||||
"img_rows, img_cols = 28, 28\n",
|
||||
"\n",
|
||||
"# the data, split between train and test sets\n",
|
||||
"(x_train, y_train), (x_test, y_test) = mnist.load_data()\n",
|
||||
"\n",
|
||||
"if K.image_data_format() == 'channels_first':\n",
|
||||
" x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)\n",
|
||||
" x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)\n",
|
||||
" input_shape = (1, img_rows, img_cols)\n",
|
||||
"else:\n",
|
||||
" x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)\n",
|
||||
" x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)\n",
|
||||
" input_shape = (img_rows, img_cols, 1)\n",
|
||||
"\n",
|
||||
"x_train = x_train.astype('float32')\n",
|
||||
"x_test = x_test.astype('float32')\n",
|
||||
"x_train /= 255\n",
|
||||
"x_test /= 255\n",
|
||||
"print('x_train shape:', x_train.shape)\n",
|
||||
"print(x_train.shape[0], 'train samples')\n",
|
||||
"print(x_test.shape[0], 'test samples')\n",
|
||||
"\n",
|
||||
"# convert class vectors to binary class matrices\n",
|
||||
"y_train = keras.utils.to_categorical(y_train, num_classes)\n",
|
||||
"y_test = keras.utils.to_categorical(y_test, num_classes)\n",
|
||||
"\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Conv2D(32, kernel_size=(3, 3),\n",
|
||||
" activation='relu',\n",
|
||||
" input_shape=input_shape))\n",
|
||||
"model.add(Conv2D(64, (3, 3), activation='relu'))\n",
|
||||
"model.add(MaxPooling2D(pool_size=(2, 2)))\n",
|
||||
"model.add(Dropout(0.25))\n",
|
||||
"model.add(Flatten())\n",
|
||||
"model.add(Dense(128, activation='relu'))\n",
|
||||
"model.add(Dropout(0.5))\n",
|
||||
"model.add(Dense(num_classes, activation='softmax'))\n",
|
||||
"\n",
|
||||
"model.compile(loss=keras.losses.categorical_crossentropy,\n",
|
||||
" optimizer=keras.optimizers.Adadelta(),\n",
|
||||
" metrics=['accuracy'])\n",
|
||||
"\n",
|
||||
"model.fit(x_train, y_train,\n",
|
||||
" batch_size=batch_size,\n",
|
||||
" epochs=epochs,\n",
|
||||
" verbose=1,\n",
|
||||
" validation_data=(x_test, y_test))\n",
|
||||
"score = model.evaluate(x_test, y_test, verbose=0)\n",
|
||||
"print('Test loss:', score[0])\n",
|
||||
"print('Test accuracy:', score[1])"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.8"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@ -1,185 +0,0 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%load_ext autoreload\n",
|
||||
"%autoreload 2\n",
|
||||
"\n",
|
||||
"from hello import say_hello\n",
|
||||
"say_hello()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from openpaisdk.notebook import parse_notebook_path, get_notebook_path\n",
|
||||
"from openpaisdk.core import get_defaults, randstr\n",
|
||||
"from openpaisdk.io_utils import to_screen\n",
|
||||
"\n",
|
||||
"cluster = {\n",
|
||||
" \"cluster_alias\": get_defaults()[\"cluster-alias\"],\n",
|
||||
" \"virtual_cluster\": None,\n",
|
||||
" \"workspace\": get_defaults()[\"workspace\"],\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"job_name = parse_notebook_path()[0] + '_' + randstr().hex\n",
|
||||
"\n",
|
||||
"to_screen(cluster)\n",
|
||||
"to_screen(job_name)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from openpaisdk.core import Job\n",
|
||||
"help(Job.from_notebook)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"__nb_ext_custom_cfg__ = {\n",
|
||||
" \"token\": \"abcdef\", # not to set a int string like 1234\n",
|
||||
" \"image\": 'ufoym/deepo:pytorch-py36-cu90',\n",
|
||||
" \"resources\": {\n",
|
||||
" \"cpu\": 4, \"memoryMB\": 8192, \"gpu\": 0,\n",
|
||||
" },\n",
|
||||
" \"sources\": [\"hello.py\"], \n",
|
||||
" \"pip_installs\": [],\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"job = Job(job_name).from_notebook(nb_file=get_notebook_path(), cluster=cluster, **__nb_ext_custom_cfg__)\n",
|
||||
"# to_screen(job.get_config())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"help(Job.submit)\n",
|
||||
"job.submit(cluster[\"cluster_alias\"], cluster[\"virtual_cluster\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# restore the job from a name and cluster\n",
|
||||
"job2 = Job(job_name).load(cluster_alias=cluster[\"cluster_alias\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# one time check, return {state:..., notebook:...}\n",
|
||||
"job2.connect_jupyter()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# wait until notebook url is ready\n",
|
||||
"help(Job.wait)\n",
|
||||
"job2.wait(timeout=100)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# to_screen(job2.logs()[\"stderr\"])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# job2.stop()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.8"
|
||||
},
|
||||
"varInspector": {
|
||||
"cols": {
|
||||
"lenName": 16,
|
||||
"lenType": 16,
|
||||
"lenVar": 40
|
||||
},
|
||||
"kernels_config": {
|
||||
"python": {
|
||||
"delete_cmd_postfix": "",
|
||||
"delete_cmd_prefix": "del ",
|
||||
"library": "var_list.py",
|
||||
"varRefreshCmd": "print(var_dic_list())"
|
||||
},
|
||||
"r": {
|
||||
"delete_cmd_postfix": ") ",
|
||||
"delete_cmd_prefix": "rm(",
|
||||
"library": "var_list.r",
|
||||
"varRefreshCmd": "cat(var_dic_list()) "
|
||||
}
|
||||
},
|
||||
"types_to_exclude": [
|
||||
"module",
|
||||
"function",
|
||||
"builtin_function_or_method",
|
||||
"instance",
|
||||
"_Feature"
|
||||
],
|
||||
"window_display": false
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
|
@ -1,2 +0,0 @@
|
|||
def say_hello():
|
||||
print("Hello, OpenPAI")
|
|
@ -1,31 +0,0 @@
|
|||
import os
|
||||
import sys
|
||||
import shutil
|
||||
from openpaisdk.utils import run_command
|
||||
from openpaisdk.io_utils import browser_open
|
||||
|
||||
|
||||
try:
|
||||
import nbmerge
|
||||
except ImportError:
|
||||
run_command([sys.executable, '-m', 'pip', 'install', 'nbmerge'])
|
||||
|
||||
test_notebooks = [
|
||||
'0-install-sdk-specify-openpai-cluster.ipynb',
|
||||
'1-submit-and-query-via-command-line.ipynb',
|
||||
# '2-submit-job-from-local-notebook.ipynb',
|
||||
]
|
||||
|
||||
merged_file = "integrated_tests.ipynb"
|
||||
html_file = os.path.splitext(merged_file)[0] + '.html'
|
||||
# remove stale outputs if present (os.remove, not shutil.rmtree, since these are files)
if os.path.exists(merged_file):
    os.remove(merged_file)
if os.path.exists(html_file):
    os.remove(html_file)
|
||||
|
||||
# clear output for committing
|
||||
for f in test_notebooks:
|
||||
os.system("jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace %s" % f)
|
||||
|
||||
os.system('nbmerge %s -o %s' % (' '.join(test_notebooks), merged_file))
|
||||
os.system('jupyter nbconvert --ExecutePreprocessor.timeout=-1 --ExecutePreprocessor.allow_errors=True --to html --execute %s' % merged_file)
|
||||
|
||||
browser_open(html_file)
|
|
@ -1,23 +0,0 @@
|
|||
import os
from openpaisdk.flags import __flags__
|
||||
from openpaisdk.io_utils import to_screen
|
||||
from openpaisdk.defaults import get_defaults, update_default, LayeredSettings
|
||||
from openpaisdk.cluster import ClusterList, Cluster
|
||||
from openpaisdk.job import Job, JobStatusParser
|
||||
|
||||
|
||||
__version__ = '0.4.00'
|
||||
|
||||
|
||||
def in_job_container(varname: str = 'PAI_CONTAINER_ID'):
|
||||
"""in_job_container check whether it is inside a job container (by checking environmental variables)
|
||||
|
||||
|
||||
Keyword Arguments:
|
||||
varname {str} -- the variable to test (default: {'PAI_CONTAINER_ID'})
|
||||
|
||||
Returns:
|
||||
[bool] -- return True if os.environ[varname] is set
|
||||
"""
|
||||
if not os.environ.get(varname, ''):
|
||||
return False
|
||||
return True
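# A minimal usage sketch, mirroring the test notebooks above: guard
# cluster-side logic so it only runs outside an OpenPAI job container
# (the runtime sets PAI_CONTAINER_ID inside containers).
#
#     if not in_job_container():
#         pass  # e.g. build job configs and submit them from the client side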
|
|
@ -1,93 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
"""This file provides a mechanism to couple a Namespace (argparse) and pai protocol
|
||||
"""
|
||||
import argparse
|
||||
from openpaisdk.defaults import LayeredSettings
|
||||
|
||||
|
||||
class ArgumentFactory:
|
||||
|
||||
def __init__(self):
|
||||
self.factory = dict()
|
||||
|
||||
# deal with predefined defaults
|
||||
for name, params in LayeredSettings.definitions.items():
|
||||
args = ['--' + name]
|
||||
abbr = params.get('abbreviation', None)
|
||||
if abbr: # args = ['--{name}', '-{abbr}' or '--{abbr}']
|
||||
args += [('-' if len(abbr) == 1 else '--') + abbr]
|
||||
kwargs = {k: v for k, v in params.items() if k not in ["name", "abbreviation"]}
|
||||
kwargs["default"] = LayeredSettings.get(name)
|
||||
self.add_argument(*args, **kwargs)
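# For illustration (the exact definitions live in __flags__): a definition
# {"name": "job-name", "abbreviation": "j"} yields flags ['--job-name', '-j'],
# while a hypothetical multi-character abbreviation such as
# {"name": "memoryMB", "abbreviation": "mem"} would yield ['--memoryMB', '--mem'].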
|
||||
|
||||
# cluster
|
||||
self.add_argument('cluster_alias', help='cluster alias to select')
|
||||
|
||||
self.add_argument('--pai-uri', help="uri of openpai cluster, in format of http://x.x.x.x")
|
||||
self.add_argument('--user', help='username')
|
||||
self.add_argument('--password', help="password")
|
||||
self.add_argument('--authen-token', '--token', dest='token', help="authentication token")
|
||||
|
||||
self.add_argument('--editor', default="code", help="path to your editor used to open files")
|
||||
|
||||
# job spec
|
||||
self.add_argument('--job-name', '-j', help='job name')
|
||||
|
||||
self.add_argument('--is-global', '-g', action="store_true",
|
||||
help="set globally (not limited to current working folder)", default=False)
|
||||
self.add_argument('--update', '-u', action='append',
|
||||
help='replace current key-value pairs with new key=value (key1:key2:...=value for nested objects)')
|
||||
self.add_argument('--preview', action='store_true', help='preview result before doing action')
|
||||
self.add_argument('--no-browser', action='store_true', help='do not open the job link in a web browser')
|
||||
self.add_argument('--interactive', action='store_true', help='enter the interactive mode after job starts')
|
||||
self.add_argument('--notebook-token', '--token', dest='token', default="abcd",
|
||||
help='jupyter notebook authentication token')
|
||||
self.add_argument("--python", default="python",
|
||||
help="command or path of python, default is {python}, may be {python3}")
|
||||
|
||||
self.add_argument('--cmd-sep', default=r"\s*&&\s*", help="command separator, default is (&&)")
|
||||
self.add_argument('commands', nargs=argparse.REMAINDER, help='shell commands to execute')
|
||||
|
||||
# runtime
|
||||
self.add_argument('config', nargs='?', help='job config file')
|
||||
self.add_argument('notebook', nargs='?', help='Jupyter notebook file')
|
||||
|
||||
# storage
|
||||
self.add_argument('--recursive', action='store_true', default=False, help="recursive target operation")
|
||||
self.add_argument('--overwrite', action='store_true', default=False, help="enable overwrite if exists")
|
||||
self.add_argument('local_path', help="local path")
|
||||
self.add_argument('remote_path', help="remote path")
|
||||
|
||||
def add_argument(self, *args, **kwargs):
|
||||
self.factory[args[0]] = dict(args=args, kwargs=kwargs)
|
||||
|
||||
def get(self, key):
|
||||
value = self.factory[key]
|
||||
return value['args'], value['kwargs']
|
||||
|
||||
|
||||
__arguments_factory__ = ArgumentFactory()
|
||||
|
||||
|
||||
def cli_add_arguments(parser: argparse.ArgumentParser, names: list):
|
||||
for a in names:
|
||||
args, kwargs = __arguments_factory__.get(a)
|
||||
# assert parser.conflict_handler == 'resolve', "set conflict_handler to avoid duplicated"
|
||||
parser.add_argument(*args, **kwargs)
|
|
@ -1,133 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import argparse
|
||||
from openpaisdk.io_utils import to_screen
|
||||
from openpaisdk.job import Job
|
||||
from openpaisdk.cluster import ClusterList
|
||||
|
||||
|
||||
class ArgumentError(Exception):
|
||||
|
||||
pass
|
||||
|
||||
|
||||
class Action:
|
||||
|
||||
def __init__(self, action: str, help_s: str):
|
||||
self.action, self.help_s = action, help_s
|
||||
|
||||
def define_arguments(self, parser: argparse.ArgumentParser):
|
||||
pass
|
||||
|
||||
def check_arguments(self, args):
|
||||
pass
|
||||
|
||||
def restore(self, args):
|
||||
pass
|
||||
|
||||
def store(self, args):
|
||||
pass
|
||||
|
||||
def do_action(self, args):
|
||||
raise NotImplementedError
|
||||
|
||||
|
||||
class ActionFactory(Action):
|
||||
|
||||
def __init__(self, action: str, allowed_actions: dict):
|
||||
assert action in allowed_actions, ("unsupported action of job", action)
|
||||
super().__init__(action, allowed_actions[action])
|
||||
suffix = action.replace('-', '_')
|
||||
for attr in ["define_arguments", "check_arguments", "do_action"]:
|
||||
if hasattr(self, f"{attr}_{suffix}"):
|
||||
setattr(self, attr, getattr(self, f"{attr}_{suffix}"))
|
||||
else:
|
||||
assert attr != "do_action", f"must specify a method named {attr}_{suffix} in {self.__class__.__name__}"
|
||||
|
||||
self.__job__ = Job()
|
||||
self.__clusters__ = ClusterList()
|
||||
self.enable_saving = dict(job=False, clusters=False)
|
||||
|
||||
def restore(self, args):
|
||||
if getattr(args, 'job_name', None):
|
||||
self.__job__.load(job_name=args.job_name)
|
||||
self.__clusters__.load()
|
||||
return self
|
||||
|
||||
def store(self, args):
|
||||
if self.enable_svaing["job"]:
|
||||
self.__job__.save()
|
||||
if self.enable_svaing["clusters"]:
|
||||
self.__clusters__.save()
|
||||
return self
|
||||
|
||||
|
||||
class Scene:
|
||||
|
||||
def __init__(self, scene: str, help_s: str, parser: argparse.ArgumentParser,
|
||||
action_list # type: list[Action]
|
||||
):
|
||||
self.scene, self.help_s = scene, help_s
|
||||
self.single_action = len(action_list) == 1 and scene == action_list[0].action
|
||||
if self.single_action:
|
||||
self.actor = action_list[0]
|
||||
self.actor.define_arguments(parser)
|
||||
else:
|
||||
self.actions, subparsers = dict(), parser.add_subparsers(dest='action', help=help_s)
|
||||
for a in action_list:
|
||||
p = subparsers.add_parser(a.action, help=a.help_s)
|
||||
a.define_arguments(p)
|
||||
self.actions[a.action] = a
|
||||
|
||||
def process(self, args):
|
||||
actor = self.actor if self.single_action else self.actions[args.action]
|
||||
actor.check_arguments(args)
|
||||
actor.restore(args)
|
||||
result = actor.do_action(args)
|
||||
actor.store(args)
|
||||
return result
|
||||
|
||||
|
||||
class EngineFactory:
|
||||
|
||||
def __init__(self, cli_structure):
|
||||
self.parser = argparse.ArgumentParser(
|
||||
description='command line interface for OpenPAI',
|
||||
formatter_class=argparse.ArgumentDefaultsHelpFormatter
|
||||
)
|
||||
subparsers = self.parser.add_subparsers(
|
||||
dest='scene',
|
||||
help='openpai cli working scenarios',
|
||||
)
|
||||
self.scenes = dict()
|
||||
for k, v in cli_structure.items():
|
||||
p = subparsers.add_parser(k, help=v[0])
|
||||
self.scenes[k] = Scene(k, v[0], p, v[1])
|
||||
|
||||
def process(self, a: list):
|
||||
to_screen(f'Received arguments {a}', _type="debug")
|
||||
args = self.parser.parse_args(a)
|
||||
return self.process_args(args)
|
||||
|
||||
def process_args(self, args):
|
||||
to_screen(f'Parsed arguments {args}', _type="debug")
|
||||
if not args.scene:
|
||||
self.parser.print_help()
|
||||
return
|
||||
return self.scenes[args.scene].process(args)
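# Usage sketch, matching the notebooks above: EngineFactory dispatches a
# two-level command line of the form <scene> <action>, e.g.
#
#     engine = EngineFactory(cli_structure)   # cli_structure as in __init__
#     engine.process(['cluster', 'list'])     # scene "cluster", action "list"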
|
|
@ -1,307 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
from openpaisdk.io_utils import from_file, to_file, to_screen
|
||||
from openpaisdk.storage import Storage
|
||||
from openpaisdk.utils import OrganizedList
|
||||
from openpaisdk.utils import get_response, na, exception_free, RestSrvError, concurrent_map
|
||||
|
||||
|
||||
def get_cluster(alias: str, fname: str = None, get_client: bool = True):
|
||||
"""the generalized function call to load cluster
|
||||
return cluster client if assert get_client else return config"""
|
||||
if get_client:
|
||||
return ClusterList().load(fname).get_client(alias)
|
||||
else:
|
||||
return ClusterList().load(fname).select(alias)
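# Usage sketch (aliases as in the notebooks above):
#
#     client = get_cluster('cluster-for-test')                 # Cluster client
#     cfg = get_cluster('cluster-for-test', get_client=False)  # raw config dict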
|
||||
|
||||
|
||||
class ClusterList:
|
||||
"""Data structure corresponding to the contents of ~/.openpai/clusters.yaml
|
||||
We use an OrganizedList to handle operations on this class
|
||||
"""
|
||||
|
||||
def __init__(self, clusters: list = None):
|
||||
self.clusters = OrganizedList(clusters or [], _key="cluster_alias")
|
||||
|
||||
def load(self, fname: str = None):
|
||||
fname = na(fname, self.default_config_file)
|
||||
self.clusters = OrganizedList(from_file(fname, default=[]), _key="cluster_alias")
|
||||
return self
|
||||
|
||||
def save(self):
|
||||
to_file(self.clusters.as_list, self.default_config_file)
|
||||
|
||||
@property
|
||||
def default_config_file(self):
|
||||
from openpaisdk.flags import __flags__
|
||||
from openpaisdk.defaults import get_defaults
|
||||
return __flags__.get_cluster_cfg_file(get_defaults()["clusters-in-local"])
|
||||
|
||||
def tell(self):
|
||||
return {
|
||||
a: {
|
||||
v: dict(GPUs='-', memory='-', vCores='-', uri=cfg["pai_uri"], user=cfg["user"]) for v in cfg["virtual_clusters"]
|
||||
} for a, cfg in self.clusters.as_dict.items()
|
||||
}
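# Illustrative return shape (assuming one cluster "cluster-for-test" with a
# virtual cluster named "default"; resource columns are placeholders here):
#     {"cluster-for-test": {"default": {"GPUs": "-", "memory": "-",
#                                       "vCores": "-", "uri": "http://x.x.x.x",
#                                       "user": "myuser"}}}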
|
||||
|
||||
def add(self, cluster: dict):
|
||||
cfg = Cluster().load(**cluster).check().config
|
||||
self.clusters.add(cfg, replace=True)
|
||||
return self
|
||||
|
||||
def update_all(self):
|
||||
for a in self.aliases:
|
||||
self.add(self.clusters.first(a))
|
||||
|
||||
def delete(self, alias: str):
|
||||
return self.clusters.remove(alias)
|
||||
|
||||
def select(self, alias: str):
|
||||
return self.clusters.first(alias)
|
||||
|
||||
def get_client(self, alias: str):
|
||||
return Cluster().load(**self.select(alias))
|
||||
|
||||
def available_resources(self):
|
||||
"""concurrent version to get available resources"""
|
||||
aliases = self.aliases
|
||||
ret = concurrent_map(Cluster.available_resources, (self.get_client(a) for a in aliases))
|
||||
return {a: r for a, r in zip(aliases, ret) if r is not None}
|
||||
|
||||
@property
|
||||
def aliases(self):
|
||||
return [c["cluster_alias"] for c in self.clusters if "cluster_alias" in c]
|
||||
|
||||
|
||||
|
||||
|
||||
class Cluster:
|
||||
"""A wrapper of cluster to access the REST APIs"""
|
||||
|
||||
def __init__(self, token_expiration: int = 3600):
|
||||
# ! currently the sdk will not handle token refreshing
|
||||
self.config = {}
|
||||
self.__token_expire = token_expiration
|
||||
self.__token = None
|
||||
|
||||
def load(self, cluster_alias: str = None, pai_uri: str = None, user: str = None, password: str = None, token: str = None, **kwargs):
|
||||
import re
|
||||
self.config.update(
|
||||
cluster_alias=cluster_alias,
|
||||
pai_uri=pai_uri.strip("/"),
|
||||
user=user,
|
||||
password=password,
|
||||
token=token,
|
||||
)
|
||||
self.config.update(
|
||||
{k: v for k, v in kwargs.items() if k in ["info", "storages", "virtual_clusters"]}
|
||||
)
|
||||
# validate
|
||||
assert self.alias, "cluster must have an alias"
|
||||
assert self.user, "must specify a user name"
|
||||
assert re.match("^(http|https)://(.*[^/])$",
|
||||
self.pai_uri), "pai_uri should be a uri in the format of http(s)://x.x.x.x"
|
||||
return self
|
||||
|
||||
def check(self):
|
||||
to_screen("try to connect cluster {}".format(self.alias))
|
||||
storages = self.rest_api_storages()
|
||||
for i, s in enumerate(storages):
|
||||
s.setdefault("storage_alias", s["protocol"] + f'-{i}')
|
||||
cluster_info = na(self.rest_api_cluster_info(), {})
|
||||
if cluster_info.get("authnMethod", "basic") == "OIDC":
|
||||
assert self.config["token"], "must use authentication token (instead of password) in OIDC mode"
|
||||
self.config.update(
|
||||
info=cluster_info,
|
||||
storages=storages,
|
||||
virtual_clusters=self.virtual_clusters(),
|
||||
)
|
||||
# ! authentication type will be checked according to whether AAD is enabled
|
||||
return self
|
||||
|
||||
@property
|
||||
def alias(self):
|
||||
return self.config["cluster_alias"]
|
||||
|
||||
@property
|
||||
def pai_uri(self):
|
||||
return self.config["pai_uri"].strip("/")
|
||||
|
||||
@property
|
||||
def user(self):
|
||||
return self.config["user"]
|
||||
|
||||
@property
|
||||
def password(self):
|
||||
return str(self.config["password"])
|
||||
|
||||
@property
|
||||
def token(self):
|
||||
if self.config["token"]:
|
||||
return str(self.config["token"])
|
||||
if not self.__token:
|
||||
self.__token = self.rest_api_token(self.__token_expire)
|
||||
return self.__token
|
||||
|
||||
def get_storage(self, alias: str = None):
|
||||
# ! every cluster should have a builtin storage
|
||||
for sto in self.config.get("storages", []):
|
||||
if alias is None or sto["storage_alias"] == alias:
|
||||
if sto["protocol"] == 'hdfs':
|
||||
return Storage(protocol='webHDFS', url=sto["webhdfs"], user=sto.get('user', self.user))
|
||||
|
||||
def get_job_link(self, job_name: str):
|
||||
return '{}/job-detail.html?username={}&jobName={}'.format(self.pai_uri, self.user, job_name)
|
||||
|
||||
@property
|
||||
def rest_srv(self):
|
||||
return '{}/rest-server/api'.format(self.pai_uri)
|
||||
|
||||
# ! some older versions do not support this API
|
||||
@exception_free(Exception, None, "Cluster info API is not supported")
|
||||
def rest_api_cluster_info(self):
|
||||
"refer to https://github.com/microsoft/pai/pull/3281/"
|
||||
return get_response('GET', [self.rest_srv, 'v1'], allowed_status=[200]).json()
|
||||
|
||||
def rest_api_storages(self):
|
||||
# ! currently this is a faked (hard-coded) storage list
|
||||
return [
|
||||
{
|
||||
"protocol": "hdfs",
|
||||
"webhdfs": f"{self.pai_uri}/webhdfs"
|
||||
},
|
||||
]
|
||||
|
||||
@exception_free(RestSrvError, None)
|
||||
def rest_api_job_list(self, user: str = None):
|
||||
return get_response(
|
||||
'GET', [self.rest_srv, 'v1', ('user', user), 'jobs']
|
||||
).json()
|
||||
|
||||
@exception_free(RestSrvError, None)
|
||||
def rest_api_job_info(self, job_name: str = None, info: str = None, user: str = None):
|
||||
import json
|
||||
import yaml
|
||||
user = self.user if user is None else user
|
||||
assert info in [None, 'config', 'ssh'], ('unsupported query information', info)
|
||||
response = get_response(
|
||||
'GET', [self.rest_srv, 'v1', 'user', user, 'jobs', job_name, info]
|
||||
)
|
||||
try:
|
||||
return response.json()
|
||||
except json.decoder.JSONDecodeError:
|
||||
return yaml.load(response.text, Loader=yaml.FullLoader)
|
||||
|
||||
|
||||
@exception_free(Exception, None)
|
||||
def rest_api_token(self, expiration=3600):
|
||||
return get_response(
|
||||
'POST', [self.rest_srv, 'v1', 'token'],
|
||||
body={
|
||||
'username': self.user, 'password': self.password, 'expiration': expiration
|
||||
}
|
||||
).json()['token']
|
||||
|
||||
def rest_api_submit(self, job: dict):
|
||||
use_v2 = str(job.get("protocolVersion", 1)) == "2"
|
||||
if use_v2:
|
||||
import yaml
|
||||
return get_response(
|
||||
'POST', [self.rest_srv, 'v2', 'jobs'],
|
||||
headers={
|
||||
'Authorization': 'Bearer {}'.format(self.token),
|
||||
'Content-Type': 'text/yaml',
|
||||
},
|
||||
body=yaml.dump(job),
|
||||
allowed_status=[202, 201]
|
||||
)
|
||||
else:
|
||||
return get_response(
|
||||
'POST', [self.rest_srv, 'v1', 'user', self.user, 'jobs'],
|
||||
headers={
|
||||
'Authorization': 'Bearer {}'.format(self.token),
|
||||
'Content-Type': 'application/json',
|
||||
},
|
||||
body=job,
|
||||
allowed_status=[202, 201]
|
||||
)
|
||||
|
||||
@exception_free(RestSrvError, None)
|
||||
def rest_api_execute_job(self, job_name: str, e_type: str = "STOP"):
|
||||
assert e_type in ["START", "STOP"], "unsupported execute type {}".format(e_type)
|
||||
return get_response(
|
||||
'PUT', [self.rest_srv, 'v1', 'user', self.user, 'jobs', job_name, 'executionType'],
|
||||
headers={
|
||||
'Authorization': 'Bearer {}'.format(self.token),
|
||||
},
|
||||
body={
|
||||
"value": e_type
|
||||
},
|
||||
allowed_status=[200, 202],
|
||||
).json()
|
||||
|
||||
@exception_free(RestSrvError, None)
|
||||
def rest_api_virtual_clusters(self):
|
||||
return get_response(
|
||||
'GET', [self.rest_srv, 'v1', 'virtual-clusters'],
|
||||
headers={
|
||||
'Authorization': 'Bearer {}'.format(self.token),
|
||||
'Content-Type': 'application/json',
|
||||
},
|
||||
allowed_status=[200]
|
||||
).json()
|
||||
|
||||
@exception_free(RestSrvError, None)
|
||||
def rest_api_user(self, user: str = None):
|
||||
return get_response(
|
||||
'GET', [self.rest_srv, 'v1', 'user', user if user else self.user],
|
||||
headers={
|
||||
'Authorization': 'Bearer {}'.format(self.token),
|
||||
},
|
||||
).json()
|
||||
|
||||
def virtual_clusters(self, user_info: dict = None):
|
||||
user_info = na(user_info, self.rest_api_user())
|
||||
assert user_info, f'failed to get user information from {self.alias}'
|
||||
my_virtual_clusters = user_info["virtualCluster"]
|
||||
if isinstance(my_virtual_clusters, str):
|
||||
my_virtual_clusters = my_virtual_clusters.split(",")
|
||||
return my_virtual_clusters
|
||||
|
||||
def virtual_cluster_available_resources(self):
|
||||
vc_info = self.rest_api_virtual_clusters()
|
||||
dic = dict()
|
||||
for key, vc in vc_info.items():
|
||||
if "resourcesTotal" in vc:
|
||||
used, total = vc["resourcesUsed"], vc["resourcesTotal"]
|
||||
dic[key] = {
|
||||
k: max(0, int(total[k] - used[k])) for k in total
|
||||
}
|
||||
else:
|
||||
# return -1 if the REST API is not supported
|
||||
dic[key] = dict(GPUs=-1, memory=-1, vCores=-1)
|
||||
return dic
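# Illustrative result (assuming a virtual cluster named "default"):
#     {"default": {"GPUs": 2, "memory": 40960, "vCores": 12}}
# or, when the REST API does not report totals:
#     {"default": {"GPUs": -1, "memory": -1, "vCores": -1}}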
|
||||
|
||||
@exception_free(Exception, None)
|
||||
def available_resources(self):
|
||||
resources = self.virtual_cluster_available_resources()
|
||||
return {k: v for k, v in resources.items() if k in self.config["virtual_clusters"]}
|
|
@ -1,428 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import argparse
|
||||
import os
|
||||
import sys
|
||||
from openpaisdk.cli_arguments import cli_add_arguments
|
||||
from openpaisdk.cli_factory import ActionFactory, EngineFactory
|
||||
from openpaisdk.defaults import get_defaults, update_default
|
||||
from openpaisdk.io_utils import browser_open, to_screen
|
||||
from openpaisdk.utils import Nested, run_command, na, randstr
|
||||
from openpaisdk.defaults import __flags__
|
||||
|
||||
|
||||
def extract_args(args: argparse.Namespace, get_list: list = None, ignore_list: list = ["scene", "action"]):
|
||||
if get_list:
|
||||
return {k: getattr(args, k) for k in get_list}
|
||||
return {k: v for k, v in vars(args).items() if k not in ignore_list}
|
||||
|
||||
|
||||
class ActionFactoryForDefault(ActionFactory):
|
||||
|
||||
def define_arguments(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--is-global'])
|
||||
parser.add_argument('contents', nargs='*', help='(variable=value) pair to be set as default')
|
||||
|
||||
def do_action_set(self, args):
|
||||
import re
|
||||
if not args.contents:
|
||||
return get_defaults(False, True, False) if args.is_global else get_defaults(True, True, False)
|
||||
kv_pairs = []
|
||||
for content in args.contents:
|
||||
m = re.match(r"^([^=]+?)([+-]?=)([^=]*)$", content)
|
||||
if m:
|
||||
kv_pairs.append(m.groups())
|
||||
else:
|
||||
kv_pairs.append((content, '', ''))
|
||||
for kv_pair in kv_pairs:
|
||||
assert kv_pair[0] and kv_pair[1] in ["=", "+=", "-="] and kv_pair[2], \
|
||||
f"must specify a key=value pair ({kv_pair[0]}, {kv_pair[2]})"
|
||||
update_default(kv_pair[0], kv_pair[2], is_global=args.is_global)
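# Examples (a sketch; recognized variable names are declared in
# __flags__.default_var_definitions(), and "pip-installs" here is assumed):
#     opai set cluster-alias=cluster-for-test   # plain assignment
#     opai set pip-installs+=keras              # accumulates, if the variable
#                                               # is declared with action "append"
# Note: the operator is validated above, but append-vs-assign behavior is
# decided by the variable definition (see CfgLayer.act_append in defaults).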
|
||||
|
||||
def do_action_unset(self, args):
|
||||
for content in args.contents:
    key, _, value = content.partition("=")
    update_default(key, value, is_global=args.is_global, to_delete=True)
|
||||
|
||||
|
||||
class ActionFactoryForCluster(ActionFactory):
|
||||
|
||||
def define_arguments_edit(self, parser):
|
||||
cli_add_arguments(parser, ["--editor"])
|
||||
|
||||
def check_arguments_edit(self, args):
|
||||
assert args.editor, "cannot edit the file without an editor"
|
||||
|
||||
def do_action_edit(self, args):
|
||||
run_command([args.editor, cluster_cfg_file])
|
||||
|
||||
def define_arguments_update(self, parser):
|
||||
pass
|
||||
|
||||
def do_action_update(self, args):
|
||||
self.enable_svaing["clusters"] = True
|
||||
return self.__clusters__.update_all()
|
||||
|
||||
def define_arguments_list(self, parser):
|
||||
cli_add_arguments(parser, [])
|
||||
|
||||
@staticmethod
|
||||
def tabulate_resources(dic: dict):
|
||||
to_screen([
|
||||
[c, i.get("uri", None), i.get("user", None), v, i["GPUs"], i["vCores"], i["memory"]] for c in dic.keys() for v, i in dic[c].items()
|
||||
], _type="table", headers=["cluster", "uri", "user", "virtual-cluster", "GPUs", "vCores", "memory"])
|
||||
return dic
|
||||
|
||||
def do_action_list(self, args):
|
||||
info = self.__clusters__.tell()
|
||||
ActionFactoryForCluster.tabulate_resources(info)
|
||||
|
||||
def define_arguments_resources(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, [])
|
||||
|
||||
def do_action_resources(self, args):
|
||||
r = self.__clusters__.available_resources()
|
||||
ActionFactoryForCluster.tabulate_resources(r)
|
||||
|
||||
def define_arguments_add(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--pai-uri', '--user', '--password', '--authen-token'])
|
||||
|
||||
def check_arguments_add(self, args):
|
||||
assert args.cluster_alias and args.pai_uri and args.user, "must specify cluster-alias, pai-uri and user"
|
||||
assert args.password or args.token, "please add an authentication credential, password or token"
|
||||
|
||||
def do_action_add(self, args):
|
||||
self.enable_svaing["clusters"] = True
|
||||
self.__clusters__.add(extract_args(args))
|
||||
|
||||
def define_arguments_delete(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['cluster_alias'])
|
||||
|
||||
def do_action_delete(self, args):
|
||||
if self.__clusters__.delete(args.cluster_alias):
|
||||
to_screen("cluster %s deleted" % args.cluster_alias)
|
||||
return None
|
||||
|
||||
def define_arguments_select(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--is-global', 'cluster_alias'])
|
||||
|
||||
def check_arguments_select(self, args):
|
||||
assert args.cluster_alias, "must specify a valid cluster-alias"
|
||||
|
||||
def do_action_select(self, args):
|
||||
update_default('cluster-alias', args.cluster_alias,
|
||||
is_global=args.is_global)
|
||||
|
||||
|
||||
class ActionFactoryForJob(ActionFactory):
|
||||
|
||||
# basic commands
|
||||
def define_arguments_list(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--cluster-alias', '--user'])
|
||||
|
||||
def do_action_list(self, args):
|
||||
client = self.__clusters__.get_client(args.cluster_alias)
|
||||
if not args.user:
|
||||
args.user = client.user
|
||||
to_screen("if not set, only your job will be listed, user `--user __all__` to list jobs of all users")
|
||||
if args.user == '__all__':
|
||||
args.user = None
|
||||
jobs = client.rest_api_job_list(user=args.user)
|
||||
return ["%s [%s]" % (j["name"], j.get("state", "UNKNOWN")) for j in jobs]
|
||||
|
||||
def define_arguments_status(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--cluster-alias', '--user'])
|
||||
parser.add_argument('job_name', help='job name')
|
||||
parser.add_argument('query', nargs='?', choices=['config', 'ssh'])
|
||||
|
||||
def check_arguments_status(self, args):
|
||||
assert args.job_name, "must specify a job name"
|
||||
|
||||
def do_action_status(self, args):
|
||||
client = self.__clusters__.get_client(args.cluster_alias)
|
||||
if not args.user:
|
||||
args.user = client.user
|
||||
return client.rest_api_job_info(args.job_name, args.query, user=args.user)
|
||||
|
||||
def define_arguments_stop(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--cluster-alias'])
|
||||
parser.add_argument('job_names', nargs='+', help='job name')
|
||||
|
||||
def check_arguments_stop(self, args):
|
||||
assert args.job_names, "must specify a job name"
|
||||
|
||||
def do_action_stop(self, args):
|
||||
client = self.__clusters__.get_client(args.cluster_alias)
|
||||
for job_name in args.job_names:
|
||||
to_screen(client.rest_api_execute_job(job_name, "STOP"))
|
||||
|
||||
def define_arguments_submit(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--virtual-cluster', '--preview', '--update', 'config'])
|
||||
|
||||
def check_arguments_submit(self, args):
|
||||
assert args.config, "please specify a job config file (json or yaml format)"
|
||||
assert os.path.isfile(args.config), "%s cannot be read" % args.config
|
||||
|
||||
def submit_it(self, args):
|
||||
if args.preview:
|
||||
return self.__job__.validate().get_config()
|
||||
result = self.__job__.submit(args.cluster_alias, args.virtual_cluster)
|
||||
if "job_link" in result and not getattr(args, 'no_browser', False):
|
||||
browser_open(result["job_link"])
|
||||
return result
|
||||
|
||||
def do_action_submit(self, args):
|
||||
# the --update option supports nested keys, e.g. defaults:virtualCluster=<your-virtual-cluster> (key1:key2:...=value)
|
||||
self.__job__.load(fname=args.config)
|
||||
if args.update:
|
||||
for s in args.update:
|
||||
key, value = s.split("=")
|
||||
Nested(self.__job__.protocol).set(key, value)
|
||||
return self.submit_it(args)
|
||||
|
||||
def define_essentials(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, [
|
||||
'--job-name',
|
||||
'--cluster-alias', '--virtual-cluster', '--workspace', # for cluster
|
||||
'--sources', '--pip-installs', # for sdk_template
|
||||
'--image', '--cpu', '--gpu', '--mem', "--memoryMB",
|
||||
'--preview', '--no-browser',
|
||||
'--python',
|
||||
])
|
||||
|
||||
def check_essentials(self, args):
|
||||
assert args.cluster_alias, "must specify a cluster"
|
||||
args.sources = [] if not args.sources else args.sources
|
||||
args.pip_installs = [] if not args.pip_installs else args.pip_installs
|
||||
if args.sources:
|
||||
assert args.workspace, "must specify --workspace if --sources used"
|
||||
for s in args.sources:
|
||||
assert os.path.isfile(s), "file %s not found" % s
|
||||
assert args.image, "must specify a docker image"
|
||||
if args.job_name:
|
||||
args.job_name = args.job_name.replace("$", randstr(10))
|
||||
|
||||
def define_arguments_sub(self, parser: argparse.ArgumentParser):
|
||||
self.define_essentials(parser)
|
||||
cli_add_arguments(parser, [
|
||||
'commands'
|
||||
])
|
||||
|
||||
def check_arguments_sub(self, args):
|
||||
self.check_essentials(args)
|
||||
|
||||
def do_action_sub(self, args):
|
||||
self.__job__.new(args.job_name).one_liner(
|
||||
commands=" ".join(args.commands),
|
||||
image=args.image,
|
||||
resources=extract_args(args, ["gpu", "cpu", "memoryMB", "mem"]),
|
||||
cluster=extract_args(
|
||||
args, ["cluster_alias", "virtual_cluster", "workspace"]),
|
||||
sources=args.sources, pip_installs=args.pip_installs,
|
||||
)
|
||||
self.__job__.protocol["parameters"]["python_path"] = args.python
|
||||
return self.submit_it(args)
|
||||
|
||||
def define_arguments_notebook(self, parser: argparse.ArgumentParser):
|
||||
self.define_essentials(parser)
|
||||
cli_add_arguments(parser, [
|
||||
'--interactive',
|
||||
'--notebook-token',
|
||||
'notebook'
|
||||
])
|
||||
|
||||
def check_arguments_notebook(self, args):
|
||||
self.check_essentials(args)
|
||||
assert args.notebook or args.interactive, "must specify a notebook name unless in interactive mode"
|
||||
if not args.job_name:
|
||||
assert args.notebook or args.interactive, "must specify a notebook if no job name defined"
|
||||
args.job_name = (os.path.splitext(os.path.basename(args.notebook))[0] + "_" + randstr().hex) if args.notebook else "jupyter_server_{}".format(randstr().hex)
|
||||
if args.interactive and not args.token:
|
||||
to_screen("no authentication token is set", _type="warn")
|
||||
|
||||
def connect_notebook(self):
|
||||
result = self.__job__.wait()
|
||||
if result.get("notebook", None) is not None:
|
||||
browser_open(result["notebook"])
|
||||
return result
|
||||
|
||||
def do_action_notebook(self, args):
|
||||
self.__job__.new(args.job_name).from_notebook(
|
||||
nb_file=args.notebook, mode="interactive" if args.interactive else "silent", token=args.token,
|
||||
image=args.image,
|
||||
cluster=extract_args(
|
||||
args, ["cluster_alias", "virtual_cluster", "workspace"]),
|
||||
resources=extract_args(args, ["gpu", "cpu", "memoryMB", "mem"]),
|
||||
sources=args.sources, pip_installs=args.pip_installs,
|
||||
)
|
||||
self.__job__.protocol["parameters"]["python_path"] = args.python
|
||||
result = self.submit_it(args)
|
||||
if not args.preview:
|
||||
result.update(na(self.connect_notebook(), {}))
|
||||
return result
|
||||
|
||||
def define_arguments_connect(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--cluster-alias'])
|
||||
parser.add_argument('job_name', help="job name to connect")
|
||||
|
||||
def check_arguments_connect(self, args):
|
||||
assert args.cluster_alias, "must specify a cluster"
|
||||
assert args.job_name, "must specify a job name"
|
||||
|
||||
def do_action_connect(self, args):
|
||||
to_screen("retrieving job config from cluster")
|
||||
self.__job__.load(job_name=args.job_name, cluster_alias=args.cluster_alias)
|
||||
return self.connect_notebook()
|
||||
|
||||
|
||||
class ActionFactoryForStorage(ActionFactory):
|
||||
|
||||
def define_arguments_list_storage(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, ['--cluster-alias'])
|
||||
|
||||
def do_action_list_storage(self, args):
|
||||
return self.__clusters__.select(args.cluster_alias)['storages']
|
||||
|
||||
def define_arguments_list(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--storage-alias', 'remote_path'])
|
||||
|
||||
def do_action_list(self, args):
|
||||
return self.__clusters__.get_client(args.cluster_alias).get_storage(args.storage_alias).list(args.remote_path)
|
||||
|
||||
def define_arguments_status(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--storage-alias', 'remote_path'])
|
||||
|
||||
def do_action_status(self, args):
|
||||
return self.__clusters__.get_client(args.cluster_alias).get_storage(args.storage_alias).status(args.remote_path)
|
||||
|
||||
def define_arguments_delete(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--storage-alias', '--recursive', 'remote_path'])
|
||||
|
||||
def do_action_delete(self, args):
|
||||
return self.__clusters__.get_client(args.cluster_alias).get_storage(args.storage_alias).delete(args.remote_path, recursive=args.recursive)
|
||||
|
||||
def define_arguments_download(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(
|
||||
parser, ['--cluster-alias', '--storage-alias', 'remote_path', 'local_path'])
|
||||
|
||||
def do_action_download(self, args):
|
||||
return self.__clusters__.get_client(args.cluster_alias).get_storage(args.storage_alias).download(remote_path=args.remote_path, local_path=args.local_path)
|
||||
|
||||
def define_arguments_upload(self, parser: argparse.ArgumentParser):
|
||||
cli_add_arguments(parser, [
|
||||
'--cluster-alias', '--storage-alias', '--overwrite', 'local_path', 'remote_path'])
|
||||
|
||||
def do_action_upload(self, args):
|
||||
return self.__clusters__.get_client(args.cluster_alias).get_storage(args.storage_alias).upload(remote_path=args.remote_path, local_path=args.local_path, overwrite=getattr(args, "overwrite", False))
|
||||
|
||||
|
||||
cluster_cfg_file = __flags__.get_cluster_cfg_file(get_defaults()["clusters-in-local"])
|
||||
|
||||
|
||||
def generate_cli_structure(is_beta: bool):
|
||||
cli_s = {
|
||||
"cluster": {
|
||||
"help": "cluster management",
|
||||
"factory": ActionFactoryForCluster,
|
||||
"actions": {
|
||||
"list": "list clusters in config file %s" % cluster_cfg_file,
|
||||
"resources": "report the (available, used, total) resources of the cluster",
|
||||
"update": "check the healthness of clusters and update the information",
|
||||
"edit": "edit the config file in your editor %s" % cluster_cfg_file,
|
||||
"add": "add a cluster to config file %s" % cluster_cfg_file,
|
||||
"delete": "delete a cluster from config file %s" % cluster_cfg_file,
|
||||
"select": "select a cluster as default",
|
||||
}
|
||||
},
|
||||
"job": {
|
||||
"help": "job operations",
|
||||
"factory": ActionFactoryForJob,
|
||||
"actions": {
|
||||
"list": "list existing jobs",
|
||||
"status": "query the status of a job",
|
||||
"stop": "stop the job",
|
||||
"submit": "submit the job from a config file",
|
||||
"sub": "generate a config file from commands, and then `submit` it",
|
||||
"notebook": "run a jupyter notebook remotely",
|
||||
"connect": "connect to an existing job",
|
||||
}
|
||||
},
|
||||
"storage": {
|
||||
"help": "storage operations",
|
||||
"factory": ActionFactoryForStorage,
|
||||
"actions": {
|
||||
"list-storage": "list storage attached to the cluster",
|
||||
"list": "list items about the remote path",
|
||||
"status": "get detailed information about remote path",
|
||||
"upload": "upload",
|
||||
"download": "download",
|
||||
"delete": "delete",
|
||||
}
|
||||
},
|
||||
}
|
||||
dic = {
|
||||
key: [
|
||||
value["help"],
|
||||
[value["factory"](x, value["actions"])
|
||||
for x in value["actions"].keys()]
|
||||
] for key, value in cli_s.items()
|
||||
}
|
||||
dic.update({
|
||||
"set": [
|
||||
"set a (default) variable for cluster and job", [
|
||||
ActionFactoryForDefault("set", {"set": ["set"]})]
|
||||
],
|
||||
"unset": [
|
||||
"un-set a (default) variable for cluster and job", [
|
||||
ActionFactoryForDefault("unset", {"unset": ["unset"]})]
|
||||
],
|
||||
})
|
||||
return dic
|
||||
|
||||
|
||||
class Engine(EngineFactory):
|
||||
|
||||
def __init__(self):
|
||||
super().__init__(generate_cli_structure(is_beta=False))
|
||||
|
||||
|
||||
def main():
|
||||
try:
|
||||
eng = Engine()
|
||||
result = eng.process(sys.argv[1:])
|
||||
if result:
|
||||
to_screen(result)
|
||||
return 0
|
||||
except AssertionError as identifier:
|
||||
to_screen(f"Value error: {repr(identifier)}", _type="error")
|
||||
return 1
|
||||
except Exception as identifier:
|
||||
to_screen(f"Error: {repr(identifier)}", _type="error")
|
||||
return 2
|
||||
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
|
@ -1,166 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


""" this module provides a way to control the predefined configurations
"""
from openpaisdk.flags import __flags__
from openpaisdk.utils import na, OrganizedList
from openpaisdk.io_utils import from_file, to_file, to_screen


class CfgLayer:

    def __init__(self, name: str, include: list = None, exclude: list = None, file: str = None, values: dict = None, allow_unknown: bool = True):
        self.name = name
        self.file = file
        self.values = from_file(file, {}, silent=True) if file else na(values, {})
        self.definitions = OrganizedList(
            __flags__.default_var_definitions(),
            _key="name"
        ).filter(None, include, exclude)  # type: OrganizedList

    def update(self, key: str, value=None, delete: bool = False):
        if not self.allow(key):
            to_screen(f"{key} is not a recognized default variable, ignored")
            return
        dic = self.values
        if delete:
            if key not in dic:
                to_screen(f"key {key} not found in {self.name}, ignored")
            elif not self.act_append(key) or not value:  # drop the key entirely unless removing a single appended value
                del dic[key]
                to_screen(f"key {key} removed completely from {self.name} successfully")
            else:
                dic[key].remove(value)
                to_screen(f"{value} removed in {key} under {self.name} successfully")
        else:
            if self.act_append(key):
                def _append(dic, key, value):
                    dic.setdefault(key, [])
                    if value not in dic[key]:
                        dic[key].append(value)
                _append(dic, key, value)
                to_screen(f"{value} added to {key} under {self.name} successfully")
            else:
                dic[key] = value
                to_screen(f"{key} set to {value} under {self.name} successfully")
        if self.file:
            to_file(self.values, self.file)

    def allow(self, key: str):
        return self.definitions.first_index(key) is not None

    def act_append(self, key: str):
        if self.allow(key):
            return self.definitions.first(key).get("action", None) == "append"
        return False

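The `action: append` flag in a variable definition switches `update` from assignment to list-membership semantics. A minimal sketch, assuming the built-in definitions (see `flags.py` below) where `image` is a plain variable and `pip-installs` is an append-action one; with no backing `file`, nothing is persisted:

```python
layer = CfgLayer(name="demo", exclude=[])           # exclude=[] keeps all definitions
layer.update("image", "python:3.7")                 # plain variable: overwrite
layer.update("pip-installs", "numpy")               # append-action: grow a list
layer.update("pip-installs", "pandas")
assert layer.values["pip-installs"] == ["numpy", "pandas"]
layer.update("pip-installs", "numpy", delete=True)  # remove one value, keep the rest
assert layer.values["pip-installs"] == ["pandas"]
```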
class LayeredSettings:
    """key-value querying from a list of dicts, priority depends on list index
    refer to [TestDefaults](../tests/test_utils.py) for more usage examples
    """

    layers = None
    definitions = None

    @classmethod
    def init(cls):
        if cls.layers is None:
            cls.reset()

    @classmethod
    def reset(cls):
        cls.definitions = OrganizedList(__flags__.default_var_definitions(), _key="name").as_dict
        cls.layers = OrganizedList([
            CfgLayer(
                name="user_advanced",
                exclude=["clusters-in-local", "image-list", "resource-specs"]
            ),
            CfgLayer(
                name="user_basic",
                exclude=["clusters-in-local", "image-list", "resource-specs"]
            ),
            CfgLayer(
                name="local_default",
                exclude=[], file=__flags__.get_default_file(is_global=False)
            ),
            CfgLayer(
                name="global_default",
                exclude=[], file=__flags__.get_default_file(is_global=True)
            )
        ], _key="name", _getter=getattr)

    @classmethod
    def keys(cls):
        dic = set()
        for layer in cls.layers:
            for key in layer.values.keys():
                dic.add(key)
        dic = dic.union(cls.definitions.keys())
        return list(dic)

    @classmethod
    def act_append(cls, key):
        return cls.definitions.get(key, {}).get("action", None) == "append"

    @classmethod
    def get(cls, key):
        __not_found__ = "==Not-Found=="
        lst = [layer.values.get(key, __not_found__) for layer in cls.layers]
        lst.append(cls.definitions.get(key, {}).get("default", None))
        lst = [x for x in lst if x != __not_found__]

        if cls.act_append(key):
            from openpaisdk.utils import flatten
            return list(flatten(lst))
        else:
            return lst[0] if lst else None

    @classmethod
    def update(cls, layer: str, key: str, value=None, delete: bool = False):
        cls.layers.first(layer).update(key, value, delete)

    @classmethod
    def as_dict(cls):
        return {key: cls.get(key) for key in cls.keys()}

    @classmethod
    def print_supported_items(cls):
        headers = ['name', 'default', 'help']
        return to_screen([
            [x.get(k, None) for k in headers] for x in __flags__.default_var_definitions()
        ], _type="table", headers=headers)


LayeredSettings.init()


def get_defaults(en_local=True, en_global=True, en_predefined=True):
    return LayeredSettings.as_dict()


def update_default(key: str, value: str = None, is_global: bool = False, to_delete: bool = False):
    layer = "global_default" if is_global else "local_default"
    LayeredSettings.update(layer, key, value, to_delete)


def get_install_uri(ver: str = None):
    ver = get_defaults()["container-sdk-branch"] if not ver else ver
    return '-e "git+https://github.com/Microsoft/pai@{}#egg=openpaisdk&subdirectory=contrib/python-sdk"'.format(ver)

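Lookup priority follows the list order in `reset`: `user_advanced` shadows `user_basic`, which shadows the local and then the global defaults file, with the definition's own `default` as the final fallback; append-action keys are merged across all layers instead of shadowed. A quick sketch using only the in-memory layers (updating `local_default` / `global_default` would write through to their YAML files):

```python
LayeredSettings.reset()
LayeredSettings.update("user_basic", "image", "python:3.6")
LayeredSettings.update("user_advanced", "image", "python:3.7")
assert LayeredSettings.get("image") == "python:3.7"                 # higher layer wins
LayeredSettings.update("user_basic", "pip-installs", "numpy")
LayeredSettings.update("user_advanced", "pip-installs", "pandas")
assert LayeredSettings.get("pip-installs") == ["pandas", "numpy"]   # append keys merge
```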
@ -1,149 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import os


class __flags__(object):
    "store the flags and constants"
    disable_to_screen = False  # a flag to disable to_screen output
    debug_mode = os.path.isfile('debug_enable')

    # ! below attributes should not be changed
    cache = '.openpai'
    cluster_cfg_file = 'clusters.yaml'
    defaults_file = 'defaults.yaml'
    container_sdk_branch = 'master'
    resources_requirements = dict(cpu=2, gpu=0, memoryMB=4096, ports={})
    storage_root = '/openpai-sdk'
    custom_predefined = []

    @staticmethod
    def default_var_definitions():
        return [
            {
                "name": "clusters-in-local",
                "default": "no",
                "help": f"[yes / no]; if yes, the clusters configuration is stored in {__flags__.get_cluster_cfg_file('yes')} rather than ~/{__flags__.get_cluster_cfg_file('yes')}",
            },
            {
                "name": "cluster-alias",
                "abbreviation": "a",
                "help": "cluster alias",
            },
            {
                "name": "virtual-cluster",
                "abbreviation": "vc",
                "help": "virtual cluster name"
            },
            {
                "name": "storage-alias",
                "abbreviation": "s",
                "help": "alias of the storage to use"
            },
            {
                "name": "workspace",
                "default": None,
                "abbreviation": "w",
                "help": f"storage root for a job to store its codes / data / outputs ... (default is {__flags__.storage_root}/$user)"
            },
            {
                "name": "container-sdk-branch",
                "default": __flags__.container_sdk_branch,
                "help": "code branch to install the sdk from (in a job container)"
            },
            {
                "name": "image",
                "abbreviation": "i",
                "help": "docker image"
            },
            {
                "name": "cpu",
                "help": f"cpu number per instance (default is {__flags__.resources_requirements['cpu']})"
            },
            {
                "name": "gpu",
                "help": f"gpu number per instance (default is {__flags__.resources_requirements['gpu']})"
            },
            {
                "name": "memoryMB",
                "help": f"memory (MB) per instance (default is {__flags__.resources_requirements['memoryMB']}) (will be overridden by --mem)"
            },
            {
                "name": "mem",
                "help": "memory (MB / GB) per instance (default is %.0fGB)" % (__flags__.resources_requirements["memoryMB"] / 1024.0)
            },
            {
                "name": "sources",
                "default": [],
                "abbreviation": "src",
                "action": "append",
                "help": "source files to upload (into the container)"
            },
            {
                "name": "pip-installs",
                "default": [],
                "abbreviation": "pip",
                "action": "append",
                "help": "packages to install via pip"
            },
            {
                "name": "image-list",
                "default": [],
                "action": "append",
                "help": "list of images that are frequently used"
            },
            {
                "name": "resource-list",
                "default": [],
                "action": "append",
                "help": "list of resource specs that are frequently used"
            },
            {
                "name": "web-default-form",
                "help": "web-default-form (in Submitter)"
            },
            {
                "name": "web-default-image",
                "help": "web-default-image (in Submitter)"
            },
            {
                "name": "web-default-resource",
                "help": "web-default-resource (in Submitter), format: '<gpu>,<cpu>,<memoryMB>'"
            },
        ] + __flags__.custom_predefined

    @staticmethod
    def get_cluster_cfg_file(clusters_in_local: str = 'no') -> str:
        assert clusters_in_local in ['no', 'yes'], f"only yes / no is allowed, but {clusters_in_local} was received"
        pth = [__flags__.cache, __flags__.cluster_cfg_file]
        if clusters_in_local == 'no':
            pth = [os.path.expanduser('~')] + pth
        return os.path.join(*pth)

    @staticmethod
    def get_default_file(is_global: bool) -> str:
        pth = [__flags__.cache, __flags__.defaults_file]
        pth = [os.path.expanduser('~')] + pth if is_global else pth
        return os.path.join(*pth)

    @staticmethod
    def print_predefined(exclude: list = None, include: list = None):
        from tabulate import tabulate
        citems = __flags__.predefined_defaults(exclude, include)
        print(tabulate(citems, headers=citems[0]._asdict().keys()), flush=True)

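Given the constants above, the two path helpers resolve as follows (`~` is the user's home directory):

```python
__flags__.get_cluster_cfg_file('no')         # -> ~/.openpai/clusters.yaml (the default)
__flags__.get_cluster_cfg_file('yes')        # -> .openpai/clusters.yaml, relative to the cwd
__flags__.get_default_file(is_global=True)   # -> ~/.openpai/defaults.yaml
__flags__.get_default_file(is_global=False)  # -> .openpai/defaults.yaml
```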
@ -1,204 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import os
import errno
import shutil
from webbrowser import open_new_tab
from contextlib import contextmanager
from functools import partial
import json
import yaml
import logging
from urllib.request import urlopen
from urllib.parse import urlsplit
from urllib.request import urlretrieve
import cgi
from openpaisdk.flags import __flags__

logging.basicConfig(format='%(name)s - %(levelname)s - %(message)s')
__logger__ = logging.getLogger(name="openpai")
__logger__.setLevel(level=logging.DEBUG if __flags__.debug_mode else logging.INFO)


def to_screen(msg, _type: str = "normal", **kwargs):
    """a general wrapper to deal with interactive IO and logging
    """
    def print_out(msg, **kwargs):
        out = yaml.dump(msg, default_flow_style=False, **kwargs) if not isinstance(msg, str) else msg
        if not __flags__.disable_to_screen:
            print(out, flush=True)
        return out

    def print_table(msg, **kwargs):
        from tabulate import tabulate
        out = tabulate(msg, **kwargs)
        if not __flags__.disable_to_screen:
            print(out, flush=True)
        return out

    func_dict = {
        "normal": print_out,
        "table": print_table,
        "warn": partial(__logger__.warning, exc_info=__flags__.debug_mode),
        "debug": __logger__.debug,
        "error": partial(__logger__.error, exc_info=True),
    }
    assert _type in func_dict, f"unsupported output type {_type}, only {list(func_dict.keys())} are valid"
    ret = func_dict[_type](msg, **kwargs)
    return ret if _type == "table" else msg

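`to_screen` routes a message by `_type`: plain values are YAML-dumped and printed, tables go through `tabulate`, and warn / debug / error go to the `openpai` logger. For example:

```python
to_screen("job submitted")                                   # plain text
to_screen({"state": "RUNNING", "vc": "default"})             # YAML-dumped dict
to_screen([["a", 1], ["b", 2]], _type="table", headers=["name", "count"])
to_screen("cannot reach cluster", _type="warn")              # goes to the logger
```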

def listdir(path):
    assert os.path.isdir(path), "{} is not a valid directory path".format(path)
    root, dirs, files = next(os.walk(path))
    return {
        "root": root,
        "dirs": dirs,
        "files": files
    }


def browser_open(url: str):
    __logger__.info("open in browser: %s", url)
    try:
        open_new_tab(url)
    except Exception as e:
        to_screen(f"fail to open {url} due to {repr(e)}", _type="warn")


def from_file(fname: str, default=None, silent: bool = False, **kwargs):
    """read a yaml or json file; return default (only when default is not None) if
    - the file does not exist
    - the file is empty or its contents are not valid
    - the loaded content is not of the expected type (type(default))
    """
    import yaml
    assert os.path.splitext(fname)[1] in __json_exts__ + __yaml_exts__, f"unrecognized {fname}"
    try:
        with open(fname) as fp:
            dic = dict(kwargs)
            dic.setdefault('Loader', yaml.FullLoader)
            ret = yaml.load(fp, **dic)
            assert ret, f"read empty object ({ret}) from {fname}, return {default}"
            assert default is None or isinstance(
                ret, type(default)), f"read wrong type ({type(ret)}, expected {type(default)}) from {fname}, return {default}"
            return ret
    except Exception as identifier:
        if default is None:
            to_screen(f"{repr(identifier)} when reading {fname}", _type="error")
            raise identifier
        if not silent:
            to_screen(f"{repr(identifier)} when reading {fname}", _type="warn")
        return default


def get_url_filename_from_server(url):
    try:
        blah = urlopen(url).info()['Content-Disposition']
        _, params = cgi.parse_header(blah)
        return params["filename"]
    except Exception as e:
        to_screen(f'Failed to get filename from server: {repr(e)}', _type="warn")
        return None


def web_download_to_folder(url: str, folder: str, filename: str = None):
    if not filename:
        split = urlsplit(url)
        filename = split.path.split("/")[-1]
    filename = os.path.join(folder, filename)
    os.makedirs(folder, exist_ok=True)
    try:
        urlretrieve(url, filename)
        __logger__.info('download from %s to %s', url, filename)
        return filename
    except Exception:
        __logger__.error("failed to download", exc_info=True)


def mkdir_for(pth: str):
    d = os.path.dirname(pth)
    if d:
        os.makedirs(d, exist_ok=True)
    return d


def file_func(kwargs: dict, func=shutil.copy2, tester: str = 'dst'):
    try:
        return func(**kwargs)
    except IOError as identifier:
        # ENOENT(2): file does not exist; also raised when the destination parent dir is missing
        if identifier.errno != errno.ENOENT:
            print(identifier.__dict__)
        assert tester in kwargs.keys(), 'wrong parameter {}'.format(tester)
        os.makedirs(os.path.dirname(kwargs[tester]), exist_ok=True)
        return func(**kwargs)
    except Exception as identifier:
        print(identifier)
        return None


@contextmanager
def safe_open(filename: str, mode: str = 'r', func=open, **kwargs):
    "if the directory of filename does not exist, create it first"
    mkdir_for(filename)
    fn = func(filename, mode=mode, **kwargs)
    yield fn
    fn.close()


@contextmanager
def safe_chdir(pth: str):
    "safely change directory to pth, and then go back"
    currdir = os.getcwd()
    try:
        if not pth:
            pth = currdir
        os.chdir(pth)
        __logger__.info("changing directory to %s", pth)
        yield pth
    finally:
        os.chdir(currdir)
        __logger__.info("changing directory back to %s", currdir)


def safe_copy(src: str, dst: str):
    "if the directory of dst does not exist, create it first"
    return file_func({'src': src, 'dst': dst})


__yaml_exts__, __json_exts__ = ['.yaml', '.yml'], ['.json', '.jsn']


def to_file(obj, fname: str, fmt=None, **kwargs):
    if not fmt:
        _, ext = os.path.splitext(fname)
        if ext in __json_exts__:
            fmt, dic = json, dict(indent=4)
        elif ext in __yaml_exts__:
            import yaml
            fmt, dic = yaml, dict(default_flow_style=False)
        else:
            raise NotImplementedError
        dic.update(kwargs)
    else:
        dic = kwargs
    with safe_open(fname, 'w') as fp:
        fmt.dump(obj, fp, **dic)
        __logger__.debug("serialize object to file %s", fname)

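Since `to_file` picks the serializer from the file extension and `safe_open` creates missing parent directories, a configuration round-trip needs no setup:

```python
cfg = {"cluster-alias": "my-cluster", "pip-installs": ["numpy"]}
to_file(cfg, ".openpai/demo.yaml")                    # parent dir created on demand
assert from_file(".openpai/demo.yaml", {}) == cfg
from_file("missing.yaml", default={}, silent=True)    # -> {} instead of raising
```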
@ -1,659 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import json
import os
import re
import pathlib
from typing import Union, List
from copy import deepcopy

from openpaisdk.flags import __flags__
from openpaisdk.defaults import get_install_uri, LayeredSettings
from openpaisdk.io_utils import from_file, safe_open, to_file, to_screen
from openpaisdk.utils import Retry, concurrent_map, exception_free, find, get_response, na, na_lazy
from openpaisdk.cluster import get_cluster

__protocol_filename__ = "job_protocol.yaml"
__config_filename__ = "job_config.json"
__protocol_unit_types__ = ["job", "data", "script", "dockerimage", "output"]


class ProtocolUnit:

    @staticmethod
    def validate(u: dict):
        # assert u["protocolVersion"] in ["1", "2", 1, 2], "invalid protocolVersion (%s)" % u["protocolVersion"]
        assert u["type"] in __protocol_unit_types__, "invalid type (%s)" % u["type"]
        assert u["name"], "invalid name"
        # uri: string or list, required; only a unit of type "data" may use a list
        assert isinstance(u["uri"], str) or u["type"] == "data" and isinstance(u["uri"], list), \
            "uri must be a string (or a list for data units only) (error: %s)" % u


class TaskRole:

    @staticmethod
    def validate(t: dict):
        assert t["dockerImage"], "unknown dockerImage"
        assert t["resourcePerInstance"]["cpu"] > 0, "invalid cpu number (%d)" % t["resourcePerInstance"]["cpu"]
        assert t["resourcePerInstance"]["gpu"] >= 0, "invalid gpu number (%d)" % t["resourcePerInstance"]["gpu"]
        assert t["resourcePerInstance"]["memoryMB"] > 0, "invalid memoryMB number (%d)" % t["resourcePerInstance"]["memoryMB"]
        for label, port in t["resourcePerInstance"].get("ports", {}).items():
            assert port >= 0, "invalid port (%s : %d)" % (label, port)
        assert isinstance(t["commands"], list) and t["commands"], "empty commands"


class Deployment:

    @staticmethod
    def validate(d: dict, task_role_names: list):
        assert d["name"], "deployment should have a name"
        for t, c in d["taskRoles"].items():
            assert t in task_role_names, "invalid taskrole name (%s)" % (t)
            assert isinstance(c.get("preCommands", []), list), "preCommands should be a list"
            assert isinstance(c.get("postCommands", []), list), "postCommands should be a list"


class JobResource:

    def __init__(self, r: dict = None):

        def gb2mb(m):
            if not isinstance(m, str) or m.isnumeric():
                return int(m)
            if m.lower().endswith('g'):
                return int(m[:-1]) * 1024
            if m.lower().endswith('gb'):
                return int(m[:-2]) * 1024
            raise ValueError(m)

        r = {} if not r else r
        dic = deepcopy(__flags__.resources_requirements)
        for key in ["cpu", "gpu", "memoryMB", "ports"]:
            if r.get(key, None) is not None:
                dic[key] = int(r[key]) if not key == "ports" else r[key]
        if r.get("mem", None) is not None:
            dic["memoryMB"] = gb2mb(r["mem"])
        self.req = dic

    def add_port(self, name: str, num: int = 1):
        self.req.setdefault("ports", {})[name] = num
        return self

    @property
    def as_dict(self):
        return self.req

    @staticmethod
    def parse_list(lst: List[str]):
        r = []
        for spec in lst:
            s = spec.replace(" ", '').split(",")
            r.append(JobResource({
                "gpu": s[0], "cpu": s[1], "mem": s[2],
            }).as_dict)
        return r

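`JobResource` normalizes partial specs against the defaults in `__flags__.resources_requirements`, converting `mem` strings such as `"8g"` / `"8gb"` to MB. For instance:

```python
assert JobResource({"gpu": 1, "mem": "8g"}).as_dict == \
    {"cpu": 2, "gpu": 1, "memoryMB": 8192, "ports": {}}
# parse_list turns "<gpu>,<cpu>,<mem>" strings into full specs
specs = JobResource.parse_list(["1, 4, 8g"])
assert specs[0]["gpu"] == 1 and specs[0]["memoryMB"] == 8192
```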
class Job:
    """
    the data structure and methods to describe a job compatible with
    https://github.com/microsoft/openpai-protocol/blob/master/schemas/v2/schema.yaml
    external methods:
    - I/O
        - save(...) / load(...): store to and restore from the disk
    - job protocol wizard
        - sdk_job_template(...): generate a job template with the sdk (embedding cluster / storage information)
        - one_liner(...): generate a single-taskrole job protocol from commands and other essential information
        - from_notebook(...): generate a job protocol from a jupyter notebook
    - interaction with clusters
        - submit(...): submit to a cluster, including archiving and uploading local source files
        - wait(...): wait for a job until completed
        - log(...): parse logs
        - connect_jupyter(...): wait until the job is running and the jupyter server is connectable
    """

    def __init__(self, name: str = None, **kwargs):
        self.protocol = dict()  # follows the schema of https://github.com/microsoft/openpai-protocol/blob/master/schemas/v2/schema.yaml
        self._client = None  # cluster client
        self.new(name, **kwargs)

    def new(self, name: str, **kwargs):
        self.protocol = {
            "name": name,
            "protocolVersion": 2,
            "type": "job",
            "prerequisites": [],
            "parameters": dict(),
            "secrets": dict(),
            "taskRoles": dict(),
            "deployments": [],
            "defaults": dict(),
            "extras": dict(),
        }
        self.protocol.update(kwargs)
        return self

    def load(self, fname: str = None, job_name: str = None, cluster_alias: str = None):
        if cluster_alias:  # load the job config from the cluster by REST api
            job_name = na(job_name, self.name)
            self.protocol = get_cluster(cluster_alias).rest_api_job_info(job_name, 'config')
        else:  # load from a local file
            if not fname:
                fname = Job(job_name).protocol_file
            if os.path.isfile(fname):
                self.protocol = from_file(fname, default="==FATAL==")
        self.protocol.setdefault('protocolVersion', '1')  # a v1 protocol (json) has no protocolVersion
        return self

    def save(self):
        if self.name:
            to_file(self.protocol, self.protocol_file)
        return self

    def validate(self):
        assert self.protocolVersion in ["1", "2"], "unknown protocolVersion (%s)" % self.protocol["protocolVersion"]
        assert self.name is not None, "job name is null %s" % self.protocol
        if self.protocolVersion == "2":
            assert self.protocol["type"] == "job", "type must be job (%s)" % self.protocol["type"]
            for t in self.protocol.get("taskRoles", {}).values():
                TaskRole.validate(t)
            for d in self.protocol.get("deployments", []):
                Deployment.validate(d, list(self.protocol["taskRoles"].keys()))
            for u in self.protocol.get("prerequisites", []):
                ProtocolUnit.validate(u)
        return self

    @property
    def protocolVersion(self):
        return str(self.protocol.get("protocolVersion", "1"))

    @property
    def name(self):
        return self.protocol.get("name" if self.protocolVersion == "2" else "jobName", None)

    @property
    def cache_dir(self):
        assert self.name, "cannot get the cache directory for an empty job name"
        return os.path.join(__flags__.cache, self.name)

    def cache_file(self, fname):
        return os.path.join(self.cache_dir, fname)

    @property
    def protocol_file(self):
        return self.cache_file(__protocol_filename__)

    @property
    def temp_archive(self):
        return self.cache_file(self.name + ".tar.gz")

    @staticmethod
    def get_config_file(job_name: str, v2: bool = True):
        return Job(job_name).cache_file(__protocol_filename__ if v2 else __config_filename__)

    def param(self, key, default=None, field: str = "parameters"):
        return self.protocol.get(field, {}).get(key, default)

    def set_param(self, key, value, field: str = "parameters"):
        self.protocol.setdefault(field, {})[key] = value

    def secret(self, key, default=None):
        return self.param(key, default, "secrets")

    def set_secret(self, key, value):
        self.set_param(key, value, "secrets")

    def extra(self, key, default=None):
        return self.param(key, default, "extras")

    def set_extra(self, key, value):
        self.set_param(key, value, "extras")

    def tags(self):
        return self.param("tags", [], "extras")

    def add_tag(self, tag: str):
        lst = self.tags()
        if tag not in lst:
            lst.append(tag)
        self.set_param("tags", lst, "extras")
        return self

    def has_tag(self, tag: str):
        return tag in self.tags()

    def get_config(self):
        if self.protocolVersion == "2":
            self.interpret_sdk_plugin()
            for d in self.protocol.get("deployments", []):
                r = d["taskRoles"]
                t_lst = list(r.keys())
                for t in t_lst:
                    for k in ["preCommands", "postCommands"]:  # pre- / post-
                        if k not in r[t]:
                            continue
                        if len(r[t][k]) == 0:
                            del r[t][k]
                    if len(r[t]) == 0:
                        del r[t]
            for key in ["deployments", "parameters"]:
                if key in self.protocol and len(self.protocol[key]) == 0:
                    del self.protocol[key]
            for t in self.protocol["taskRoles"].values():
                if "ports" in t["resourcePerInstance"] and len(t["resourcePerInstance"]["ports"]) == 0:
                    del t["resourcePerInstance"]["ports"]
            return self.protocol
        else:
            dic = deepcopy(self.protocol)
            del dic["protocolVersion"]
            return dic

    def sdk_job_template(self, cluster_alias_lst: str = [], workspace: str = None, sources: list = None, pip_installs: list = None):
        "generate the job template for an sdk-submitted job"
        # secrets
        clusters = [get_cluster(alias, get_client=False) for alias in cluster_alias_lst]
        workspace = na(workspace, LayeredSettings.get("workspace"))
        workspace = na(workspace, f"{__flags__.storage_root}/{clusters[0]['user']}")
        self.set_secret("clusters", json.dumps(clusters))
        self.set_param("cluster_alias", cluster_alias_lst[0] if cluster_alias_lst else None)
        self.set_param("work_directory", '{}/jobs/{}'.format(workspace, self.name) if workspace else None)

        # parameters
        self.set_param("python_path", "python")

        # signature
        self.add_tag(__internal_tags__["sdk"])

        # sdk.plugins
        sdk_install_uri = "-U {}".format(get_install_uri())
        c_dir = '~/{}'.format(__flags__.cache)
        c_file = '%s/%s' % (c_dir, __flags__.cluster_cfg_file)

        plugins = []
        if sources:
            plugins.append({
                "plugin": "local.uploadFiles",
                "parameters": {
                    "files": list(set([os.path.relpath(s) for s in sources])),
                },
            })

        plugins.extend([
            {
                "plugin": "container.preCommands",  # commands to install essential pip packages
                "parameters": {
                    "commands": [
                        "<% $parameters.python_path %> -m pip install {}".format(p) for p in [sdk_install_uri] + na(pip_installs, [])
                    ]
                }
            },
            {
                "plugin": "container.preCommands",  # copy cluster information
                "parameters": {
                    "commands": [
                        "mkdir %s" % c_dir,
                        "echo \"write config to {}\"".format(c_file),
                        "echo <% $secrets.clusters %> > {}".format(c_file),
                        "opai cluster select <% $parameters.cluster_alias %>",
                    ]
                }
            }
        ])

        if sources:
            a_file = os.path.basename(self.temp_archive)
            plugins.append({
                "plugin": "container.preCommands",
                "parameters": {
                    "commands": [
                        "opai storage download <% $parameters.work_directory %>/source/{} {}".format(a_file, a_file),
                        "tar xvfz {}".format(a_file)
                    ]
                }
            })
        self.set_extra("sdk.plugins", plugins)
        return self

    def one_liner(self,
                  commands: Union[list, str], image: str, cluster: dict, resources: dict = None,
                  sources: list = None, pip_installs: list = None
                  ):
        """generate a single-task-role job protocol from essentials such as commands, docker image, ...
        :param cluster (dict): a dictionary including {cluster_alias, virtual_cluster, workspace}
        """
        self.sdk_job_template([cluster["cluster_alias"]], cluster.get("workspace", None), sources, pip_installs)
        self.protocol["prerequisites"].append({
            "name": "docker_image",
            "type": "dockerimage",
            "protocolVersion": "2",
            "uri": image,
        })
        self.protocol.setdefault("taskRoles", {})["main"] = {
            "dockerImage": "docker_image",
            "resourcePerInstance": JobResource(resources).as_dict,
            "commands": commands if isinstance(commands, list) else [commands]
        }
        self.add_tag(__internal_tags__["one_liner"])
        return self

    def from_notebook(self,
                      nb_file: str, mode: str = "interactive", token: str = "abcd",
                      image: str = None, cluster: dict = None, resources: dict = None,
                      sources: list = None, pip_installs: list = None
                      ):
        """
        mode: interactive / silent / script
        """
        assert mode in ["interactive", "silent", "script"], "unsupported mode %s" % mode
        if not nb_file:
            mode, nb_file = "interactive", ""
        else:
            assert os.path.isfile(nb_file), "cannot read the ipython notebook {}".format(nb_file)
            sources = na(sources, [])
            sources.append(nb_file)
        self.set_param("notebook_file", os.path.splitext(os.path.basename(nb_file))[0] if nb_file else "")
        resources = JobResource(resources)
        if mode == "interactive":
            resources.add_port("jupyter")
            self.set_secret("token", token)
            cmds = [
                " ".join([
                    "jupyter notebook",
                    "--no-browser", "--ip 0.0.0.0", "--port $PAI_CONTAINER_HOST_jupyter_PORT_LIST",
                    "--NotebookApp.token=<% $secrets.token %>",
                    "--allow-root --NotebookApp.file_to_run=<% $parameters.notebook_file %>.ipynb",
                ]),
            ]
        elif mode == "silent":
            cmds = [
                " ".join([
                    "jupyter nbconvert --ExecutePreprocessor.timeout=-1 --ExecutePreprocessor.allow_errors=True",
                    "--to html --execute <% $parameters.notebook_file %>.ipynb",
                ]),
                "opai storage upload <% $parameters.notebook_file %>.html <% $parameters.work_directory %>/output/<% $parameters.notebook_file %>.html",
            ]
        else:
            cmds = [
                "jupyter nbconvert --to script <% $parameters.notebook_file %>.ipynb --output openpai_submitter_entry",
                "echo ======================== Python Script Starts ========================",
                # execute the notebook with iPython; "--no-term-title" plus the sed below strips color codes
                """ipython --no-term-title openpai_submitter_entry.py | sed -r "s/\\x1B\\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | tr -dc '[[:print:]]\\n'""",
            ]
        self.one_liner(cmds, image, cluster, resources.as_dict, sources, na(pip_installs, []) + ["jupyter"])
        mode_to_tag = {"interactive": "interactive_nb", "silent": "batch_nb", "script": "script_nb"}
        self.add_tag(__internal_tags__[mode_to_tag[mode]])
        return self

    def interpret_sdk_plugin(self):
        plugins = self.extra("sdk.plugins", [])
        # concatenate commands
        if len(self.protocol.setdefault("deployments", [])) == 0:  # will move to plugin fields when they are ready
            # we could use a new deployment for every pre- / post- commands plugin
            deployment_name, task_role_names = "sdk_deployment", list(self.protocol["taskRoles"])
            deployment = {key: dict(preCommands=[], postCommands=[]) for key in task_role_names}
            plugins_to_remove = []
            for i, plugin in enumerate(plugins):
                target = find(r"container\.(\w+)", plugin["plugin"])
                if target not in ["preCommands", "postCommands"]:
                    continue
                for t in plugin.get("taskRoles", task_role_names):
                    deployment[t][target].extend(plugin["parameters"]["commands"])
                plugins_to_remove.append(i)
            if plugins_to_remove:
                self.protocol["deployments"].append({
                    "name": deployment_name,
                    "taskRoles": deployment,
                })
                self.protocol.setdefault("defaults", {})["deployment"] = deployment_name
                for i in reversed(plugins_to_remove):
                    del plugins[i]
        return self

    @property
    def client(self):
        if self._client is None:
            alias = self.param("cluster_alias")
            if alias:
                self._client = get_cluster(alias)
        return self._client

    def select_cluster(self, cluster_alias: str = None, virtual_cluster: str = None):
        self._client = get_cluster(cluster_alias)
        if virtual_cluster:
            if self.protocolVersion == "1":
                self.protocol["virtualCluster"] = virtual_cluster
            else:
                self.set_param("virtualCluster", virtual_cluster, field="defaults")
        return self

    # methods only for SDK-enabled jobs
    def submit(self, cluster_alias: str = None, virtual_cluster: str = None):
        cluster_alias = na(cluster_alias, self.param("cluster_alias", None))
        self.select_cluster(cluster_alias, virtual_cluster)
        self.validate().local_process()
        to_screen("submit job %s to cluster %s" % (self.name, cluster_alias))
        try:
            self.client.rest_api_submit(self.get_config())
            job_link = self.client.get_job_link(self.name)
            return {"job_link": job_link, "job_name": self.name}
        except Exception as identifier:
            to_screen(f"submit failed due to {repr(identifier)}", _type="error")
            to_screen(self.get_config())
            raise identifier

    def stop(self):
        return self.client.rest_api_execute_job(self.name)

    def get_status(self):
        return self.client.rest_api_job_info(self.name)

    def wait(self, t_sleep: float = 10, timeout: float = 3600, silent: bool = False):
        """for a jupyter job, wait until it is ready to connect;
        for a normal job, wait until it is completed"""
        exit_states = __job_states__["completed"]
        repeater = Retry(timeout=timeout, t_sleep=t_sleep, silent=silent)
        interactive_nb = self.has_tag(__internal_tags__["interactive_nb"])
        batch_nb = self.has_tag(__internal_tags__["batch_nb"])
        if interactive_nb or batch_nb:
            if interactive_nb:
                to_screen("{} is recognized as an interactive jupyter notebook job".format(self.name))
                to_screen("the notebook job needs to be in RUNNING state with the kernel started")
            if batch_nb:
                to_screen("{} is recognized as a silent jupyter notebook job".format(self.name))
                to_screen("the notebook job needs to be in SUCCEEDED state with the output ready")
            return repeater.retry(
                lambda x: x.get('state', None) in exit_states or x.get("notebook", None) is not None,
                self.connect_jupyter
            )
        to_screen("wait until the job is completed ({})".format(exit_states))
        return repeater.retry(
            lambda x: JobStatusParser.state(x) in exit_states,  # x: job status
            self.get_status
        )

    def plugin_uploadFiles(self, plugin: dict):
        import tarfile
        to_screen("archiving and uploading ...")
        work_directory = self.param("work_directory")
        assert work_directory, "must specify a storage to upload to"
        with safe_open(self.temp_archive, "w:gz", func=tarfile.open) as fn:
            for src in plugin["parameters"]["files"]:
                src = os.path.relpath(src)
                if os.path.dirname(src) != "":
                    to_screen("files not in the current folder may end up in a wrong location when unarchived in the container, please check {}".format(src), _type="warn")
                fn.add(src)
                to_screen("{} archived and waiting to be uploaded".format(src))
        self.client.get_storage().upload(
            local_path=self.temp_archive,
            remote_path="{}/source/{}".format(work_directory, os.path.basename(self.temp_archive)),
            overwrite=True
        )

    def local_process(self):
        "pre-process the job protocol locally: upload files, handle pre- / post- commands"
        self.validate()
        plugins = self.protocol.get("extras", {}).get("sdk.plugins", [])
        for plugin in plugins:
            s = find(r"local\.(\w+)", plugin["plugin"])
            if not s:
                continue
            getattr(self, "plugin_" + s)(plugin)
        return self

    def connect_jupyter(self):
        if self.has_tag(__internal_tags__["script_nb"]):
            return self.connect_jupyter_script()
        if self.has_tag(__internal_tags__["batch_nb"]):
            return self.connect_jupyter_batch()
        if self.has_tag(__internal_tags__["interactive_nb"]):
            return self.connect_jupyter_interactive()

    def connect_jupyter_batch(self):
        "fetch the html result if ready"
        status = self.get_status()
        state = JobStatusParser.state(status)
        url = None
        if state in __job_states__["successful"]:
            html_file = self.param("notebook_file") + ".html"
            local_path = html_file
            remote_path = '{}/output/{}'.format(self.param("work_directory"), html_file)
            self.client.get_storage().download(remote_path=remote_path, local_path=local_path)
            url = pathlib.Path(os.path.abspath(html_file)).as_uri()
        return dict(state=state, notebook=url)

    def connect_jupyter_interactive(self):
        "get the url of the notebook if ready"
        status = self.get_status()
        nb_file = self.param("notebook_file") + ".ipynb" if self.param("notebook_file") else None
        return JobStatusParser.interactive_jupyter_url(status, nb_file)

    def connect_jupyter_script(self):
        status = self.get_status()
        state = JobStatusParser.state(status)  # Job has no state() of its own; delegate to the parser
        return dict(state=state, notebook=None)

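Putting the pieces together, a typical SDK-side flow builds a protocol with `one_liner` (or `from_notebook`) and then submits and waits. A sketch, assuming a cluster has already been registered under the hypothetical alias `my-cluster`:

```python
job = Job("demo-job").one_liner(
    commands="echo hello openpai",
    image="python:3.7",
    cluster={"cluster_alias": "my-cluster"},
    resources={"gpu": 0, "cpu": 2, "mem": "4g"},
)
info = job.submit()          # archives / uploads sources, then calls the REST API
to_screen(info["job_link"])
job.wait()                   # polls until the job reaches a completed state
```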
__internal_tags__ = {
    "sdk": "py-sdk",
    "one_liner": 'py-sdk-one-liner',
    "interactive_nb": 'py-sdk-notebook-interactive',
    "batch_nb": 'py-sdk-notebook-batch',
    "script_nb": 'py-sdk-notebook-script',
}


__job_states__ = {
    "successful": ["SUCCEEDED"],
    "failed": ["FAILED", "STOPPED"],
    "ongoing": ["WAITING", "RUNNING", "COMPLETING"],
}
__job_states__["completed"] = __job_states__["successful"] + __job_states__["failed"]
__job_states__["ready"] = __job_states__["completed"] + ["RUNNING"]
__job_states__["valid"] = [s for sub in __job_states__.values() for s in sub]


class JobStatusParser:

    @staticmethod
    @exception_free(KeyError, None)
    def state(status: dict):
        return status["jobStatus"]["state"]

    @staticmethod
    @exception_free(KeyError, None)
    def single_task_logs(status: dict, task_role: str = 'main', index: int = 0, log_type: dict = None, return_urls: bool = False):
        """change to use containerLog"""
        log_type = na(log_type, {
            "stdout": "user.pai.stdout/?start=0",
            "stderr": "user.pai.stderr/?start=0"
        })
        containers = status.get("taskRoles", {}).get(task_role, {}).get("taskStatuses", [])
        if len(containers) < index + 1:
            return None
        containerLog = containers[index].get("containerLog", None)
        if not containerLog:
            return None
        urls = {
            k: "{}{}".format(containerLog, v)
            for k, v in log_type.items()
        }
        if return_urls:
            return urls
        else:
            html_contents = {k: get_response('GET', v).text for k, v in urls.items()}
            try:
                from html2text import html2text
                return {k: html2text(v) for k, v in html_contents.items()}
            except ImportError:
                return html_contents

    @staticmethod
    @exception_free(Exception, None)
    def all_tasks_logs(status: dict):
        """retrieve logs of all tasks"""
        logs = {
            'stdout': {}, 'stderr': {}
        }
        for tr_name, tf_info in status['taskRoles'].items():
            for task_status in tf_info['taskStatuses']:
                task_id = '{}[{}]'.format(tr_name, task_status['taskIndex'])
                task_logs = JobStatusParser.single_task_logs(status, tr_name, task_status['taskIndex'])
                for k, v in task_logs.items():
                    logs.setdefault(k, {})[task_id] = v
        return logs

    @staticmethod
    @exception_free(Exception, dict(state=None, notebook=None))
    def interactive_jupyter_url(status: dict, nb_file: str = None, task_role: str = 'main', index: int = 0):
        "get the url of notebook if ready"
        state = JobStatusParser.state(status)
        url = None
        if state == "RUNNING":
            job_log = JobStatusParser.single_task_logs(
                status, task_role, index
            )["stderr"].split('\n')
            for line in job_log:
                if re.search("The Jupyter Notebook is running at:", line):
                    from openpaisdk.utils import path_join
                    container = status["taskRoles"][task_role]["taskStatuses"][index]
                    ip, port = container["containerIp"], container["containerPorts"]["jupyter"]
                    url = path_join([f"http://{ip}:{port}", "notebooks", nb_file])
                    break
        return dict(state=state, notebook=url)


def job_spider(cluster, jobs: list = None):
    jobs = na_lazy(jobs, cluster.rest_api_job_list)
    to_screen("{} jobs to be captured in the cluster {}".format(len(jobs), cluster.alias))
    job_statuses = concurrent_map(
        lambda j: cluster.rest_api_job_info(j['name'], info=None, user=j['username']),
        jobs
    )
    job_configs = concurrent_map(
        lambda j: cluster.rest_api_job_info(j['name'], info='config', user=j['username']),
        jobs
    )
    job_logs = concurrent_map(JobStatusParser.all_tasks_logs, job_statuses)
    for job, sta, cfg, logs in zip(jobs, job_statuses, job_configs, job_logs):
        job['status'] = sta
        job['config'] = cfg
        job['logs'] = logs
    return jobs

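`JobStatusParser` works on raw REST payloads, so it can be used standalone; for instance, to grab a task's log URLs without downloading the contents (`status` below stands for a payload obtained from `Job.get_status()` or `cluster.rest_api_job_info(...)`):

```python
# status: the raw REST payload, e.g. from Job.get_status()
state = JobStatusParser.state(status)                                 # e.g. "RUNNING"
urls = JobStatusParser.single_task_logs(status, task_role="main", return_urls=True)
# -> {"stdout": "<containerLog>user.pai.stdout/?start=0", "stderr": "..."}
```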
@ -1,83 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


import json
import os.path
import re
from openpaisdk.defaults import LayeredSettings, __flags__


def get_notebook_path():
    """
    Return the full path of the jupyter notebook.
    Reference: https://github.com/jupyter/notebook/issues/1000#issuecomment-359875246
    """
    import requests
    from requests.compat import urljoin
    from notebook.notebookapp import list_running_servers
    import ipykernel

    kernel_id = re.search('kernel-(.*).json',
                          ipykernel.connect.get_connection_file()).group(1)
    servers = list_running_servers()
    for ss in servers:
        response = requests.get(urljoin(ss['url'], 'api/sessions'),
                                params={'token': ss.get('token', '')})
        info = json.loads(response.text)
        if isinstance(info, dict) and info['message'] == 'Forbidden':
            continue
        for nn in info:
            if nn['kernel']['id'] == kernel_id:
                relative_path = nn['notebook']['path']
                return os.path.join(ss['notebook_dir'], relative_path)


def parse_notebook_path():
    "parse the running notebook path into name, folder, extension"
    nb_file = get_notebook_path()
    folder, fname = os.path.split(nb_file)
    name, ext = os.path.splitext(fname)
    return name, folder, ext


class NotebookConfiguration:
    "wrapper of LayeredSettings"

    @staticmethod
    def reset():
        LayeredSettings.reset()

    @staticmethod
    def print_supported_items():
        ret = LayeredSettings.print_supported_items()
        if __flags__.disable_to_screen:
            print(ret)

    @staticmethod
    def set(key, value):
        LayeredSettings.update("user_advanced", key, value)

    @staticmethod
    def get(*args):
        dic = LayeredSettings.as_dict()
        if not args:
            return dic
        elif len(args) == 1:
            return dic[args[0]]
        else:
            return [dic[a] for a in args]

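`NotebookConfiguration` is what the notebook extension calls to read and write these defaults; `get` accepts zero, one, or many keys:

```python
NotebookConfiguration.set("image", "python:3.7")
NotebookConfiguration.get()                    # full dict of all defaults
NotebookConfiguration.get("image")             # a single value
NotebookConfiguration.get("image", "cpu")      # a list of values, in argument order
```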
@ -1,52 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


"""a thin wrapper over storage backends (currently webHDFS only)
"""
from openpaisdk.io_utils import mkdir_for, to_screen


class Storage:

    def __init__(self, protocol: str = 'webHDFS', *args, **kwargs):
        self.protocol, self.client = protocol.lower(), None
        if protocol.lower() == 'webHDFS'.lower():
            from hdfs import InsecureClient
            self.client = InsecureClient(*args, **kwargs)
        for f in 'upload download list status delete'.split():
            setattr(self, f, getattr(self, '%s_%s' %
                                     (f, protocol.lower())))

    def upload_webhdfs(self, local_path: str, remote_path: str, **kwargs):
        to_screen("upload %s -> %s" % (local_path, remote_path))
        return self.client.upload(local_path=local_path, hdfs_path=remote_path, **kwargs)

    def download_webhdfs(self, remote_path: str, local_path: str, **kwargs):
        mkdir_for(local_path)
        to_screen("download %s -> %s" % (remote_path, local_path))
        return self.client.download(local_path=local_path, hdfs_path=remote_path, overwrite=True, **kwargs)

    def list_webhdfs(self, remote_path: str, **kwargs):
        return self.client.list(hdfs_path=remote_path, **kwargs)

    def status_webhdfs(self, remote_path: str, **kwargs):
        return self.client.status(hdfs_path=remote_path, **kwargs)

    def delete_webhdfs(self, remote_path: str, **kwargs):
        return self.client.delete(hdfs_path=remote_path, **kwargs)

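The constructor binds the protocol-specific methods to the generic names, so callers use `upload` / `download` / `list` / `status` / `delete` regardless of the backend. A sketch against a webHDFS endpoint (the URL and user are placeholders; extra arguments are forwarded to `hdfs.InsecureClient`):

```python
storage = Storage(protocol='webHDFS', url='http://hdfs-namenode:50070', user='openpai')
storage.upload(local_path='data.csv', remote_path='/openpai-sdk/demo/data.csv')
print(storage.list(remote_path='/openpai-sdk/demo'))
storage.download(remote_path='/openpai-sdk/demo/data.csv', local_path='copy.csv')
```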
@ -1,298 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


"""
common helper functions shared across the sdk
"""
from openpaisdk.io_utils import safe_chdir, to_screen, __logger__
import subprocess
import importlib
import os
import time
import requests
from typing import Union
from functools import wraps
from collections.abc import Iterable
from requests_toolbelt.utils import dump
from urllib3.exceptions import InsecureRequestWarning

# suppress only the single InsecureRequestWarning from urllib3
requests.packages.urllib3.disable_warnings(category=InsecureRequestWarning)


def exception_free(err_type, default, err_msg: str = None):
    "return the default value if the exception is caught"
    def inner_func(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except err_type as e:
                if not err_msg:
                    to_screen(repr(e), _type="warn")
                else:
                    to_screen(err_msg, _type="warn")
                return default
            except Exception as e:
                raise e
        return wrapper
    return inner_func

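`exception_free` is what `JobStatusParser` uses to turn missing fields into a default instead of raising; as a decorator it reads like this:

```python
@exception_free(KeyError, default=None, err_msg="state not available yet")
def state_of(status: dict):
    return status["jobStatus"]["state"]

state_of({})   # -> None after a warning, instead of raising a KeyError
```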
def concurrent_map(fn, it, max_workers=None):
|
||||
"a wrapper of concurrent.futures.ThreadPoolExecutor.map, retrieve the results"
|
||||
from concurrent.futures import ThreadPoolExecutor
|
||||
ret = []
|
||||
with ThreadPoolExecutor(max_workers=max_workers) as executor:
|
||||
futures = executor.map(fn, it)
|
||||
for f in futures:
|
||||
ret.append(f)
|
||||
return ret
|
||||
|
||||
|
||||
class OrganizedList(list):
|
||||
|
||||
def __init__(self, lst: list, _key: str = None, _getter=dict.get):
|
||||
super().__init__(lst)
|
||||
self._getter = _getter
|
||||
self._key = _key
|
||||
|
||||
@property
|
||||
def _fn_get(self):
|
||||
return lambda elem: self._getter(elem, self._key)
|
||||
|
||||
def first_index(self, target):
|
||||
for i, elem in enumerate(self):
|
||||
if self._fn_get(elem) == target:
|
||||
return i
|
||||
return None
|
||||
|
||||
def first(self, target):
|
||||
i = self.first_index(target)
|
||||
return self[i] if i is not None else None
|
||||
|
||||
def filter_index(self, target=None, include: list = None, exclude: list = None):
|
||||
if include is not None:
|
||||
return [i for i, elem in enumerate(self) if self._fn_get(elem) in include]
|
||||
if exclude is not None:
|
||||
return [i for i, elem in enumerate(self) if self._fn_get(elem) not in exclude]
|
||||
return [i for i, elem in enumerate(self) if self._fn_get(elem) == target]
|
||||
|
||||
def filter(self, target=None, include=None, exclude=None):
|
||||
return OrganizedList([self[i] for i in self.filter_index(target, include, exclude)], self._key, self._getter)
|
||||
|
||||
@property
|
||||
def as_dict(self):
|
||||
return {self._fn_get(elem): elem for elem in self}
|
||||
|
||||
@property
|
||||
def as_list(self):
|
||||
return [x for x in self]
|
||||
|
||||
def add(self, elem: dict, getter=dict.get, silent: bool = False, replace: bool = False):
|
||||
for i in self.filter_index(self._fn_get(elem)):
|
||||
if replace:
|
||||
self[i] = elem
|
||||
if not silent:
|
||||
to_screen(f"OrganizedList: {self._key} = {self._fn_get(elem)} already exists, replace it")
|
||||
else:
|
||||
self[i].update(elem)
|
||||
if not silent:
|
||||
to_screen(f"OrderedDict: {self._key} = {self._fn_get(elem)} already exists, update it")
|
||||
return self # ~ return
|
||||
self.append(elem)
|
||||
if not silent:
|
||||
to_screen(f"OrganizedList: {self._key} = {self._fn_get(elem)} added")
|
||||
return self
|
||||
|
||||
def remove(self, target):
|
||||
indexes = self.filter_index(target)
|
||||
if not indexes:
|
||||
to_screen(f"OrganizedList: {self._key} = {target} cannot be deleted due to non-existence")
|
||||
return self
|
||||
for index in sorted(indexes, reverse=True):
|
||||
del self[index]
|
||||
to_screen(f"OrganizedList: {self._key} = {target} removed")
|
||||
return self
|
||||
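`OrganizedList` is essentially a list of records keyed by one field, with merge-on-add semantics. A minimal sketch (the cluster records are illustrative, mirroring how the SDK stores cluster configs):

```python
clusters = OrganizedList([
    {"cluster_alias": "a", "pai_uri": "http://a.example"},
    {"cluster_alias": "b", "pai_uri": "http://b.example"},
], _key="cluster_alias")

clusters.first("b")                                 # record whose alias is "b"
clusters.add({"cluster_alias": "b", "user": "me"})  # merged into existing "b"
clusters.add({"cluster_alias": "c"})                # appended as a new record
clusters.remove("a")                                # deleted by key
sorted(clusters.as_dict)                            # -> ["b", "c"]
```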
|
||||
|
||||
class Nested:
|
||||
|
||||
def __init__(self, t, sep: str = ":"):
|
||||
self.__sep__ = sep
|
||||
self.content = t
|
||||
|
||||
def get(self, keys: str):
|
||||
return Nested.s_get(self.content, keys.split(self.__sep__))
|
||||
|
||||
def set(self, keys: str, value):
|
||||
return Nested.s_set(self.content, keys.split(self.__sep__), value)
|
||||
|
||||
@staticmethod
|
||||
def _validate(context: Union[list, dict], idx: Union[str, int]):
|
||||
return int(idx) if isinstance(context, list) else idx
|
||||
|
||||
@staticmethod
|
||||
def s_get(target, keys: list):
|
||||
k = Nested._validate(target, keys[0])
|
||||
if len(keys) == 1:
|
||||
return target[k]
|
||||
return Nested.s_get(target[k], keys[1:])
|
||||
|
||||
@staticmethod
|
||||
def s_set(target, keys: list, value):
|
||||
# ! not allow to create a list
|
||||
k = Nested._validate(target, keys[0])
|
||||
if len(keys) == 1:
|
||||
target[k] = value
|
||||
return
|
||||
if isinstance(target, dict) and k not in target:
|
||||
target[k] = dict()
|
||||
return Nested.s_set(target[k], keys[1:], value)
|
||||
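A short sketch of `Nested`: keys are split on the separator, list indices are coerced to `int` by `_validate`, and `set` creates intermediate dicts (but never lists) on the way down.

```python
cfg = Nested({"jobs": [{"name": "job-1"}]}, sep=":")
cfg.get("jobs:0:name")               # -> "job-1"
cfg.set("jobs:0:resources:gpu", 1)   # creates the intermediate "resources" dict
cfg.get("jobs:0:resources:gpu")      # -> 1
```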
|
||||
|
||||
def getobj(name: str):
|
||||
mod_name, func_name = name.rsplit('.', 1)
|
||||
mod = importlib.import_module(mod_name)
|
||||
return getattr(mod, func_name)
|
||||
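`getobj` resolves a dotted name to an object by importing the module part and fetching the attribute, e.g.:

```python
join = getobj("os.path.join")   # imports os.path, returns its join function
join("a", "b")                  # -> "a/b" (on POSIX)
```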
|
||||
|
||||
class RestSrvError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
class NotReadyError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
class Retry:
|
||||
|
||||
def __init__(self, max_try: int = 10, t_sleep: float = 10, timeout: float = 600, silent: bool = True):
|
||||
self.max_try = max_try
|
||||
self.t_sleep = t_sleep
|
||||
self.timeout = timeout
|
||||
if self.timeout:
|
||||
assert self.t_sleep, "must specify a period to sleep if timeout is set"
|
||||
self.silent = silent
|
||||
|
||||
def retry(self, f_exit, func, *args, **kwargs):
|
||||
t, i = 0, 0
|
||||
while True:
|
||||
try:
|
||||
x = func(*args, **kwargs)
|
||||
if f_exit(x):
|
||||
if not self.silent:
|
||||
to_screen("ready: {}".format(x))
|
||||
return x
|
||||
except NotReadyError as identifier:
|
||||
__logger__.debug("condition not satisfied", identifier)
|
||||
if not self.silent:
|
||||
to_screen("not ready yet: {}".format(x))
|
||||
i, t = i + 1, t + self.t_sleep
|
||||
if (self.max_try and i >= self.max_try) or (self.timeout and t >= self.timeout):
|
||||
return None
|
||||
if self.t_sleep:
|
||||
time.sleep(self.t_sleep)
|
||||
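A usage sketch for `Retry`: poll a function until the exit predicate holds, giving up after `max_try` attempts or `timeout` seconds. The `client` and `job_name` names below are assumed to exist; they mirror the SDK's test code later in this diff.

```python
# Wait until the job leaves the WAITING state; returns None on give-up.
state = Retry(max_try=10, t_sleep=10, silent=False).retry(
    lambda s: s != "WAITING",  # f_exit: stop polling once this holds
    lambda: JobStatusParser.state(client.rest_api_job_info(job_name)),
)
```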
|
||||
|
||||
def path_join(path: Union[list, str], sep: str = '/'):
|
||||
""" join path from list or str
|
||||
- ['aaa', 'bbb', 'ccc'] -> 'aaa/bbb/ccc'
|
||||
- ['aaa', 'bbb', ('xxx', None), 'ddd'] -> 'aaa/bbb/ddd'
|
||||
- ['aaa', 'bbb', ('xxx', 'x-val'), 'ddd'] -> 'aaa/bbb/xxx/x-val/ddd'
|
||||
"""
|
||||
def is_single_element(x):
|
||||
return isinstance(x, str) or not isinstance(x, Iterable)
|
||||
if is_single_element(path):
|
||||
return str(path)
|
||||
p_lst = []
|
||||
for p in path:
|
||||
if not p:
|
||||
continue
|
||||
if is_single_element(p):
|
||||
p_lst.append(str(p))
|
||||
elif all(p):
|
||||
p_lst.extend([str(x) for x in p])
|
||||
return sep.join(p_lst)
|
||||
|
||||
|
||||
def get_response(method: str, path: Union[list, str], headers: dict = None, body: dict = None, allowed_status: list = [200], **kwargs):
|
||||
"""an easy wrapper of request, including:
|
||||
- path accepts a plain string or a list of segments (joined by path_join)
|
||||
- checks the response status_code and raises RestSrvError if it is not in allowed_status
|
||||
"""
|
||||
path = path_join(path)
|
||||
headers = na(headers, {})
|
||||
body = na(body, {})
|
||||
application_json = 'Content-Type' not in headers or headers['Content-Type'] == 'application/json'
|
||||
response = requests.request(method, path, headers=headers, **kwargs, **{
|
||||
"json" if application_json else "data": body,
|
||||
"verify": False, # support https
|
||||
})
|
||||
__logger__.debug('----------Response-------------\n%s', dump.dump_all(response).decode('utf-8'))
|
||||
if allowed_status and response.status_code not in allowed_status:
|
||||
__logger__.warning("%s %s", response.status_code, response.json())
|
||||
raise RestSrvError(response.status_code, response.json())
|
||||
return response
|
||||
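A hedged usage sketch; the REST endpoint shown is illustrative rather than a documented path, and `<pai-uri>`/`<token>` are placeholders:

```python
response = get_response(
    "GET",
    ["<pai-uri>", "rest-server", "api", "v1", "jobs"],  # joined by path_join
    headers={"Authorization": "Bearer <token>"},
)
jobs = response.json()  # a non-200 status would have raised RestSrvError
```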
|
||||
|
||||
def run_command(commands, # type: Union[list, str]
|
||||
cwd=None, # type: str
|
||||
):
|
||||
command = commands if isinstance(commands, str) else " ".join(commands)
|
||||
with safe_chdir(cwd):
|
||||
rtn_code = os.system(command)
|
||||
if rtn_code:
|
||||
raise subprocess.CalledProcessError(rtn_code, commands)
|
||||
|
||||
|
||||
def sys_call(args, dec_mode: str = 'utf-8'):
|
||||
p = subprocess.Popen(args, shell=True, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
|
||||
out, err = p.communicate()
|
||||
if dec_mode:
|
||||
out, err = out.decode(dec_mode), err.decode(dec_mode)
|
||||
if p.returncode:
|
||||
raise subprocess.CalledProcessError(f"ErrCode: {p.returncode}, {err}")
|
||||
return out, err
|
||||
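The difference between these two helpers in one sketch: `run_command` only checks the exit code, while `sys_call` also captures and decodes the output streams.

```python
run_command(["echo", "hello"], cwd=".")  # raises CalledProcessError on failure
out, err = sys_call("echo hello")
assert out.strip() == "hello"
```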
|
||||
|
||||
def find(fmt: str, s: str, g: int = 1, func=None):
|
||||
import re
|
||||
func = na(func, re.match)
|
||||
m = func(fmt, s)
|
||||
return m.group(g) if m else None
|
||||
|
||||
|
||||
def na(a, default):
|
||||
return a if a is not None else default
|
||||
|
||||
|
||||
def na_lazy(a, fn, *args, **kwargs):
|
||||
return a if a is not None else fn(*args, **kwargs)
|
||||
|
||||
|
||||
def flatten(lst: list):
|
||||
return sum(lst, [])
|
||||
|
||||
|
||||
def randstr(num: int = 10, letters=None):
|
||||
"get a random string with given length"
|
||||
import string
|
||||
import random
|
||||
letters = na(letters, string.ascii_letters)
|
||||
return ''.join(random.choice(letters) for i in range(num))
|
|
@ -1,15 +0,0 @@
|
|||
from setuptools import setup
|
||||
|
||||
setup(name='openpaisdk',
|
||||
version='0.4.00',
|
||||
description='A simple SDK for OpenPAI',
|
||||
url='https://github.com/microsoft/pai/contrib/python-sdk',
|
||||
packages=['openpaisdk'],
|
||||
install_requires=[
|
||||
'requests', 'hdfs', 'PyYAML', 'requests-toolbelt', 'html2text', 'tabulate'
|
||||
],
|
||||
entry_points={
|
||||
'console_scripts': ['opai=openpaisdk.command_line:main'],
|
||||
},
|
||||
zip_safe=False
|
||||
)
|
|
@ -1,62 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import os
|
||||
import unittest
|
||||
from typing import Union
|
||||
from openpaisdk.io_utils import to_screen, safe_chdir
|
||||
|
||||
|
||||
def separated(method):
|
||||
"run the each test in a separated directory"
|
||||
def func(*args, **kwargs):
|
||||
dir_name = 'utdir_' + method.__name__
|
||||
os.makedirs(dir_name, exist_ok=True)
|
||||
try:
|
||||
with safe_chdir(dir_name):
|
||||
method(*args, **kwargs)
|
||||
except Exception as identifier:
|
||||
raise identifier
|
||||
finally:
|
||||
to_screen(f"trying to remove {dir_name}")
|
||||
# ! shutil.rmtree does not work reliably on Windows
|
||||
os.system(f'rm -rf {dir_name}')
|
||||
return func
|
||||
|
||||
|
||||
class OrderedUnitTestCase(unittest.TestCase):
|
||||
|
||||
def get_steps(self):
|
||||
for name in dir(self): # dir() result is implicitly sorted
|
||||
if name.lower().startswith("step"):
|
||||
yield name, getattr(self, name)
|
||||
|
||||
def run_steps(self):
|
||||
for name, func in self.get_steps():
|
||||
try:
|
||||
to_screen(f"\n==== begin to test {name} ====")
|
||||
func()
|
||||
except Exception as identifier:
|
||||
self.fail("test {} failed ({}: {})".format(name, type(identifier), repr(identifier)))
|
||||
|
||||
def cmd_exec(self, cmds: Union[list, str]):
|
||||
if isinstance(cmds, list):
|
||||
cmds = ' '.join(cmds)
|
||||
print(cmds)
|
||||
exit_code = os.system(cmds)
|
||||
self.assertEqual(exit_code, 0, f"fail to run {cmds}")
|
|
@ -1,111 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import os
|
||||
from openpaisdk import get_defaults, ClusterList, JobStatusParser
|
||||
from openpaisdk.utils import run_command, randstr
|
||||
from openpaisdk.io_utils import to_screen
|
||||
from typing import Union
|
||||
from basic_test import OrderedUnitTestCase, separated
|
||||
|
||||
|
||||
def get_cmd(cmd: Union[str, list], flags: dict, args: Union[list, str] = None):
|
||||
lst = []
|
||||
lst.extend(cmd if isinstance(cmd, list) else cmd.split())
|
||||
for flag, value in flags.items():
|
||||
lst.extend(["--" + flag, value.__str__()])
|
||||
if args:
|
||||
lst.extend(args if isinstance(args, list) else args.split())
|
||||
return lst
|
||||
|
||||
|
||||
def run_commands(*cmds, sep: str = '&&'):
|
||||
lst = []
|
||||
for i, c in enumerate(cmds):
|
||||
lst.extend(c)
|
||||
if i != len(cmds) - 1:
|
||||
lst.append(sep)
|
||||
run_command(lst)
|
||||
|
||||
|
||||
def run_test_command(cmd: Union[str, list], flags: dict, args: Union[list, str] = None):
|
||||
run_command(get_cmd(cmd, flags, args))
|
||||
|
||||
|
||||
def gen_expected(dic: dict, **kwargs):
|
||||
dic2 = {k.replace("-", "_"): v if k != "password" else "******" for k, v in dic.items()}
|
||||
dic2.update(kwargs)
|
||||
return dic2
|
||||
|
||||
|
||||
class TestCommandLineInterface(OrderedUnitTestCase):
|
||||
|
||||
ut_init_shell = os.path.join('..', 'ut_init.sh')
|
||||
|
||||
def step1_init_clusters(self):
|
||||
to_screen("""\
|
||||
testing REST APIs related to retrieving cluster info, including
|
||||
- rest_api_cluster_info
|
||||
- rest_api_user
|
||||
- rest_api_token
|
||||
- rest_api_virtual_clusters
|
||||
""")
|
||||
with open(self.ut_init_shell) as fn:
|
||||
for line in fn:
|
||||
if line.startswith('#'):
|
||||
continue
|
||||
self.cmd_exec(line)
|
||||
alias = get_defaults()["cluster-alias"]
|
||||
self.assertTrue(alias, "not specify a cluster")
|
||||
self.cmd_exec('opai cluster resources')
|
||||
|
||||
def step2_submit_job(self):
|
||||
import time
|
||||
to_screen("""\
|
||||
testing REST APIs related to submitting a job, including
|
||||
- rest_api_submit
|
||||
""")
|
||||
self.job_name = 'ut_test_' + randstr(10)
|
||||
self.cmd_exec(['opai', 'job', 'sub', '-i', 'python:3', '-j', self.job_name, 'opai cluster resources'])
|
||||
time.sleep(10)
|
||||
|
||||
def step3_job_monitoring(self):
|
||||
to_screen("""\
|
||||
testing REST APIs related to querying a job, including
|
||||
- rest_api_job_list
|
||||
- rest_api_job_info
|
||||
""")
|
||||
client = ClusterList().load().get_client(get_defaults()["cluster-alias"])
|
||||
self.cmd_exec(['opai', 'job', 'list'])
|
||||
job_list = client.rest_api_job_list(client.user) # ! only jobs from current user to reduce time
|
||||
job_list = [job['name'] for job in job_list]
|
||||
assert self.job_name in job_list, job_list
|
||||
to_screen(f"testing job monitoring with {self.job_name}")
|
||||
status = client.rest_api_job_info(self.job_name)
|
||||
to_screen(f"retrieving job status and get its state {JobStatusParser.state(status)}")
|
||||
client.rest_api_job_info(self.job_name, 'config')
|
||||
to_screen("retrieving job config")
|
||||
logs = JobStatusParser.all_tasks_logs(status)
|
||||
assert logs, f"failed to read logs from status \n{status}"
|
||||
for k, v in logs.items():
|
||||
for t, content in v.items():
|
||||
to_screen(f"reading logs {k} for {t} and get {len(content)} Bytes")
|
||||
|
||||
@separated
|
||||
def test_commands_sequence(self):
|
||||
self.run_steps()
|
|
@ -1,62 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import os
|
||||
import sys
|
||||
import unittest
|
||||
|
||||
|
||||
in_place_changing = False
|
||||
|
||||
|
||||
class TestFormat(unittest.TestCase):
|
||||
|
||||
folders = [os.path.join('..', 'openpaisdk'), '.']
|
||||
|
||||
def test_format(self):
|
||||
for folder in self.folders:
|
||||
root, dirs, files = next(os.walk(folder))
|
||||
for src in [fn for fn in files if fn.endswith(".py")]:
|
||||
os.system(' '.join([
|
||||
sys.executable, '-m', 'autoflake',
|
||||
'--remove-unused-variables',
|
||||
'--remove-all-unused-imports',
|
||||
'--remove-duplicate-keys',
|
||||
'--ignore-init-module-imports',
|
||||
'-i' if in_place_changing else '',
|
||||
os.path.join(folder, src)
|
||||
]))
|
||||
|
||||
def clear_notebook_output(self):
|
||||
folders = [
|
||||
os.path.join('..', 'examples'),
|
||||
os.path.join('..', '..', 'notebook-extension', 'examples'),
|
||||
]
|
||||
for folder in folders:
|
||||
root, dirs, files = next(os.walk(folder))
|
||||
for file in [fn for fn in files if fn.endswith('.ipynb')]:
|
||||
src = os.path.join(folder, file)
|
||||
print(src)
|
||||
os.system(f"jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace {src}")
|
||||
os.system(f"dos2unix {src}")
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
in_place_changing = True
|
||||
TestFormat().test_format()
|
||||
TestFormat().clear_notebook_output()
|
|
@ -1,51 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
from basic_test import OrderedUnitTestCase, separated
|
||||
from openpaisdk import to_screen
|
||||
|
||||
|
||||
class TestJobResource(OrderedUnitTestCase):
|
||||
|
||||
def test_job_resource_parser(self):
|
||||
from openpaisdk.job import JobResource
|
||||
from openpaisdk import __flags__
|
||||
self.assertDictEqual(__flags__.resources_requirements, JobResource(None).as_dict)
|
||||
self.assertDictEqual(__flags__.resources_requirements, JobResource().as_dict)
|
||||
self.assertDictEqual(__flags__.resources_requirements, JobResource({}).as_dict)
|
||||
dic = dict(cpu=-1, gpu=-2, memoryMB=-1024)
|
||||
for key, value in dic.items():
|
||||
self.assertEqual(value, JobResource(dic).as_dict[key])
|
||||
dic['mem'] = '-2gb'
|
||||
self.assertEqual(-2048, JobResource(dic).as_dict["memoryMB"])
|
||||
dic['mem'] = '-3g'
|
||||
self.assertEqual(-3072, JobResource(dic).as_dict["memoryMB"])
|
||||
dic['mem'] = 10240
|
||||
self.assertEqual(10240, JobResource(dic).as_dict["memoryMB"])
|
||||
self.assertEqual({"a": 1}, JobResource(dic).add_port("a").as_dict["ports"])
|
||||
|
||||
def test_job_resource_list(self):
|
||||
from openpaisdk.job import JobResource
|
||||
samples = {
|
||||
"3,3,3g": dict(gpu=3, cpu=3, memoryMB=3072, ports={}),
|
||||
"3,1, 2g": dict(gpu=3, cpu=1, memoryMB=2048, ports={}),
|
||||
}
|
||||
keys = list(samples.keys())
|
||||
rets = JobResource.parse_list(keys)
|
||||
for k, r in zip(keys, rets):
|
||||
self.assertDictEqual(r, samples[k])
|
|
@ -1,46 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
from basic_test import OrderedUnitTestCase, separated
|
||||
from openpaisdk import to_screen
|
||||
|
||||
|
||||
class TestNbExtCfg(OrderedUnitTestCase):
|
||||
|
||||
settings = dict(cpu=100, gpu=-2, mem='90g')
|
||||
|
||||
def step1_init(self):
|
||||
from openpaisdk.notebook import NotebookConfiguration
|
||||
NotebookConfiguration.print_supported_items()
|
||||
|
||||
def step2_setup(self):
|
||||
from openpaisdk.notebook import NotebookConfiguration
|
||||
from openpaisdk import LayeredSettings
|
||||
NotebookConfiguration.set(**self.settings)
|
||||
for key in self.settings.keys():
|
||||
LayeredSettings.update('user_basic', key, -1)
|
||||
|
||||
def step3_check(self):
|
||||
from openpaisdk.notebook import NotebookConfiguration
|
||||
to_screen(NotebookConfiguration.get())
|
||||
dic = {k: NotebookConfiguration.get(k) for k in self.settings}
|
||||
self.assertDictEqual(dic, self.settings)
|
||||
|
||||
@separated
|
||||
def test_nbext_configuration(self):
|
||||
self.run_steps()
|
|
@ -1,226 +0,0 @@
|
|||
# Copyright (c) Microsoft Corporation
|
||||
# All rights reserved.
|
||||
#
|
||||
# MIT License
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
|
||||
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
|
||||
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
|
||||
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
||||
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
|
||||
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
||||
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
|
||||
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||||
|
||||
|
||||
import os
|
||||
import unittest
|
||||
from copy import deepcopy
|
||||
from openpaisdk.utils import OrganizedList as ol
|
||||
from openpaisdk.utils import Nested
|
||||
from openpaisdk.utils import randstr
|
||||
from openpaisdk.io_utils import __flags__, from_file, to_screen
|
||||
from openpaisdk import get_defaults, update_default, LayeredSettings
|
||||
from basic_test import separated
|
||||
|
||||
|
||||
class TestIOUtils(unittest.TestCase):
|
||||
|
||||
@separated
|
||||
def test_reading_failures(self):
|
||||
with self.assertRaises(Exception): # non existing file
|
||||
from_file(randstr(8) + '.yaml')
|
||||
with self.assertRaises(AssertionError): # unsupported file extension
|
||||
from_file(randstr(10))
|
||||
with self.assertRaises(Exception):
|
||||
fname = randstr(10) + '.json'
|
||||
os.system(f"touch {fname}")
|
||||
from_file(fname)
|
||||
|
||||
@separated
|
||||
def test_returning_default(self):
|
||||
for dval in [[], ['a', 'b'], {}, {'a': 'b'}]:
|
||||
ass_fn = self.assertListEqual if isinstance(dval, list) else self.assertDictEqual
|
||||
with self.assertRaises(AssertionError): # unsupported file extension
|
||||
from_file(randstr(10))
|
||||
fname = randstr(8) + '.yaml'
|
||||
ass_fn(from_file(fname, dval), dval) # non existing
|
||||
os.system(f"echo '' > {fname}")
|
||||
ass_fn(from_file(fname, dval), dval)
|
||||
os.system(f"echo 'abcd' > {fname}")
|
||||
ass_fn(from_file(fname, dval), dval)
|
||||
|
||||
|
||||
class TestDefaults(unittest.TestCase):
|
||||
|
||||
global_default_file = __flags__.get_default_file(is_global=True)
|
||||
local_default_file = __flags__.get_default_file(is_global=False)
|
||||
|
||||
def get_random_var_name(self):
|
||||
import random
|
||||
from openpaisdk import LayeredSettings
|
||||
lst = [x for x in LayeredSettings.keys() if not LayeredSettings.act_append(x)]
|
||||
ret = lst[random.randint(0, len(lst) - 1)]
|
||||
to_screen(f"random select {ret} in {lst}")
|
||||
return ret
|
||||
|
||||
@separated
|
||||
def test_update_defaults(self):
|
||||
# ! not test global defaults updating, test it in integration tests
|
||||
test_key, test_value = self.get_random_var_name(), randstr(10)
|
||||
# add a default key
|
||||
update_default(test_key, test_value, is_global=False, to_delete=False)
|
||||
self.assertEqual(get_defaults()[test_key], test_value,
|
||||
msg=f"failed to check {test_key} in {LayeredSettings.as_dict()}")
|
||||
# should appear in local
|
||||
self.assertEqual(from_file(self.local_default_file)[test_key], test_value)
|
||||
# delete
|
||||
update_default(test_key, test_value, is_global=False, to_delete=True)
|
||||
with self.assertRaises(KeyError):
|
||||
os.system(f"cat {self.local_default_file}")
|
||||
from_file(self.local_default_file, {})[test_key]
|
||||
# add not allowed
|
||||
test_key = randstr(10)
|
||||
update_default(test_key, test_value, is_global=False, to_delete=False)
|
||||
with self.assertRaises(KeyError):
|
||||
from_file(self.local_default_file, {})[test_key]
|
||||
|
||||
@separated
|
||||
def test_layered_settings(self):
|
||||
from openpaisdk import LayeredSettings, __flags__
|
||||
__flags__.custom_predefined = [
|
||||
{
|
||||
'name': 'test-key-1',
|
||||
},
|
||||
{
|
||||
'name': 'test-key-2',
|
||||
'action': 'append',
|
||||
'default': []
|
||||
}
|
||||
]
|
||||
LayeredSettings.reset()
|
||||
# ? add / update append key
|
||||
for test_key in ['test-key-1', 'test-key-2']:
|
||||
for i, layer in enumerate(LayeredSettings.layers):
|
||||
LayeredSettings.update(layer.name, test_key, i)
|
||||
if layer.act_append(test_key):
|
||||
self.assertTrue(isinstance(layer.values[test_key], list), msg=f"{layer.values}")
|
||||
self.assertEqual(0, LayeredSettings.get('test-key-1'))
|
||||
self.assertListEqual([0, 1, 2, 3], LayeredSettings.get('test-key-2'))
|
||||
# ? delete
|
||||
for test_key in ['test-key-1', 'test-key-2']:
|
||||
for i, layer in enumerate(LayeredSettings.layers):
|
||||
LayeredSettings.update(layer.name, test_key, None, delete=True)
|
||||
# ? reset the predefined
|
||||
__flags__.custom_predefined = []
|
||||
LayeredSettings.reset()
|
||||
|
||||
@separated
|
||||
def test_unknown_variable_defined(self):
|
||||
from openpaisdk import LayeredSettings, __flags__
|
||||
test_key, test_value = 'test-key-long-existing', randstr(10)
|
||||
__flags__.custom_predefined = [
|
||||
{
|
||||
'name': test_key,
|
||||
},
|
||||
]
|
||||
LayeredSettings.reset()
|
||||
# ? add / update append key
|
||||
LayeredSettings.update('local_default', test_key, test_value)
|
||||
# ? reset the predefined
|
||||
__flags__.custom_predefined = []
|
||||
LayeredSettings.reset()
|
||||
self.assertEqual(test_value, LayeredSettings.get(test_key))
|
||||
# cannot delete or change the unknown variable
|
||||
LayeredSettings.update('local_default', test_key, randstr(10))
|
||||
LayeredSettings.reset()
|
||||
self.assertEqual(test_value, LayeredSettings.get(test_key))
|
||||
LayeredSettings.update('local_default', test_key, delete=True)
|
||||
LayeredSettings.reset()
|
||||
self.assertEqual(test_value, LayeredSettings.get(test_key))
|
||||
|
||||
|
||||
class TestOrganizedList(unittest.TestCase):
|
||||
|
||||
class foo:
|
||||
|
||||
def __init__(self, a=None, b=None, c=None, d=None):
|
||||
self.a, self.b, self.c, self.d = a, b, c, d
|
||||
|
||||
@property
|
||||
def as_dict(self):
|
||||
return {k: v for k, v in vars(self).items() if v is not None}
|
||||
|
||||
def update(self, other):
|
||||
for key, value in other.as_dict.items():
|
||||
setattr(self, key, value)
|
||||
|
||||
lst_objs = [foo("x", 0), foo("x", 1), foo("y", 2), foo("y", c=1), foo("z", 4)]
|
||||
lst = [obj.as_dict for obj in lst_objs]
|
||||
|
||||
def ol_test_run(self, lst, getter):
|
||||
def to_dict(obj):
|
||||
return obj if isinstance(obj, dict) else obj.as_dict
|
||||
dut = ol(lst[:3], "a", getter)
|
||||
# find
|
||||
self.assertEqual(2, dut.first_index("y"))
|
||||
self.assertDictEqual(to_dict(lst[2]), to_dict(dut.first("y")))
|
||||
# filter
|
||||
self.assertListEqual([0, 1], dut.filter_index("x"))
|
||||
self.assertListEqual(lst[:2], dut.filter("x").as_list)
|
||||
# as_dict
|
||||
self.assertDictEqual(dict(x=lst[1], y=lst[2]), dut.as_dict)
|
||||
# add (update)
|
||||
elem = lst[-2]
|
||||
dut.add(elem)
|
||||
self.assertEqual(2, getter(lst[2], "b"))
|
||||
self.assertEqual(1, getter(lst[2], "c"))
|
||||
# add (replace)
|
||||
elem = lst[-2]
|
||||
dut.add(elem, replace=True)
|
||||
self.assertEqual(None, getter(dut[2], "b"))
|
||||
# add (append)
|
||||
elem = lst[-1]
|
||||
dut.add(elem)
|
||||
self.assertEqual(4, getter(dut[-1], "b"))
|
||||
# delete
|
||||
dut.remove("z")
|
||||
self.assertEqual(3, len(dut))
|
||||
dut.remove("z")
|
||||
self.assertEqual(3, len(dut))
|
||||
|
||||
def test_dict(self):
|
||||
self.ol_test_run(deepcopy(self.lst), dict.get)
|
||||
|
||||
def test_obj(self):
|
||||
self.ol_test_run(deepcopy(self.lst_objs), getattr)
|
||||
|
||||
|
||||
class TestNested(unittest.TestCase):
|
||||
|
||||
def test_set(self):
|
||||
nested_obj = {
|
||||
"a": [
|
||||
{
|
||||
"aa0": {
|
||||
"aaa": "val_aaa"
|
||||
},
|
||||
},
|
||||
{
|
||||
"aa1": {
|
||||
"aaa1": "val_aaa1"
|
||||
}
|
||||
}
|
||||
|
||||
],
|
||||
"b": "haha"
|
||||
}
|
||||
n = Nested(nested_obj, sep="->")
|
||||
self.assertEqual(n.get("a->0->aa0->aaa"), "val_aaa")
|
||||
with self.assertRaises(KeyError):
|
||||
nested_obj["a"][1]["aa2"]["aaa"]
|
||||
n.set("a->1->aa2->aaa", "val_aaa2")
|
||||
self.assertEqual(nested_obj["a"][1]["aa2"]["aaa"], "val_aaa2")
|
|
@ -1,74 +0,0 @@
|
|||
# Samba server with AAD integration
|
||||
|
||||
A Samba server integrated with AAD. It has a shared path and private paths for AD users, and creates a shared account.
|
||||
It also offers an API to query user groups by user name.
|
||||
This is an example of a Samba server with AAD integration; please change it to your own configuration before use.
|
||||
|
||||
## Index
|
||||
- [Components](#Components)
|
||||
- [How to Use](#How_to_Use)
|
||||
|
||||
### Components <a name="Components"></a>
|
||||
- Samba server
|
||||
Data Structure:
|
||||
```
|
||||
root
|
||||
-- data
|
||||
-- users
|
||||
-- user1
|
||||
-- user2
|
||||
-- user3
|
||||
```
|
||||
data: Shared folder.
|
||||
users: Private user folders; a user's folder is created the first time the user uses Samba.
|
||||
|
||||
- Nginx service
|
||||
A service that can query user groups through domain user name.
|
||||
|
||||
|
||||
### How to Use <a name="How_to_Use"></a>
|
||||
- Replace with your own configs
|
||||
krb5.conf: Replace realms.
|
||||
smb.conf: Replace realm and id map.
|
||||
domaininfo.py: Replace corp domains.
|
||||
|
||||
- Build docker image
|
||||
```
|
||||
./build.sh
|
||||
```
|
||||
|
||||
- Start service
|
||||
```
|
||||
./start.sh <DOMAIN> <DOMAINUSER> <DOMAINPWD> <PAISMBUSER> <PAISMBPWD>
|
||||
```
|
||||
Variable|Spec
|
||||
--|:--:
|
||||
DOMAIN|Domain to join, e.g. FAREAST
|
||||
DOMAINUSER|Existing domain user name. Will join domain using this account
|
||||
DOMAINPWD|Password for domain user
|
||||
PAISMBUSER|Create new local samba account for PAI to use
|
||||
PAISMBPWD|Password for new samba account
|
||||
|
||||
- Access Samba from a domain-joined Windows system.
|
||||
In Windows File Explorer, enter:
|
||||
```
|
||||
\\<server address>
|
||||
```
|
||||
This will show two folders: data and home.
|
||||
The data folder is shared by all users.
|
||||
The home folder is the private folder of the current AD user.
|
||||
|
||||
- Mount samba using personal account
|
||||
```
|
||||
mount -t cifs //<server address>/<folder> <mount point> -o username=<domain user name>,password=<domain user password>,domain=<domain>
|
||||
```
|
||||
|
||||
- Mount samba using PAI account
|
||||
```
|
||||
mount -t cifs //<server address>/<folder> <mount point> -o username=<pai smb user>,password=<pai smb password>,domain=WORKGROUP
|
||||
```
|
||||
|
||||
- Query user groups
|
||||
```
|
||||
http://<server address>:<server port>/GetUserId?userName=<domain user name>
|
||||
```
|
|
@ -1 +0,0 @@
|
|||
docker build -t paismb:stable build/
|
|
@ -1,39 +0,0 @@
|
|||
FROM ubuntu:16.04
|
||||
|
||||
COPY krb5.conf /etc/krb5.conf
|
||||
COPY nsswitch.conf /etc/nsswitch.conf
|
||||
|
||||
RUN apt-get update && \
|
||||
apt-get install -y \
|
||||
samba \
|
||||
attr \
|
||||
winbind \
|
||||
libpam-winbind \
|
||||
libnss-winbind \
|
||||
libpam-krb5 \
|
||||
krb5-config \
|
||||
krb5-user \
|
||||
cifs-utils \
|
||||
nginx \
|
||||
python-dev \
|
||||
python-pip
|
||||
|
||||
RUN pip install flask \
|
||||
flask_restful \
|
||||
uwsgi
|
||||
|
||||
COPY smb.conf /etc/samba/smb.conf
|
||||
COPY default /etc/nginx/sites-available/default
|
||||
|
||||
ENV SHARE_ROOT=/share/pai
|
||||
|
||||
ADD infosrv /infosrv
|
||||
RUN mkdir -p /infosrv/uwsgi
|
||||
COPY run.sh /run.sh
|
||||
RUN chmod +x /run.sh
|
||||
COPY sambauserhomecreate /usr/bin/
|
||||
RUN chmod +x /usr/bin/sambauserhomecreate
|
||||
COPY sambadatacreate /usr/bin/
|
||||
RUN chmod +x /usr/bin/sambadatacreate
|
||||
|
||||
CMD /run.sh
|
|
@ -1,28 +0,0 @@
|
|||
##
|
||||
# You should look at the following URL's in order to grasp a solid understanding
|
||||
# of Nginx configuration files in order to fully unleash the power of Nginx.
|
||||
# http://wiki.nginx.org/Pitfalls
|
||||
# http://wiki.nginx.org/QuickStart
|
||||
# http://wiki.nginx.org/Configuration
|
||||
#
|
||||
# Generally, you will want to move this file somewhere, and start with a clean
|
||||
# file but keep this around for reference. Or just disable in sites-enabled.
|
||||
#
|
||||
# Please see /usr/share/doc/nginx-doc/examples/ for more detailed examples.
|
||||
##
|
||||
|
||||
# Default server configuration
|
||||
#
|
||||
server {
|
||||
listen 80 default_server;
|
||||
listen [::]:80 default_server;
|
||||
|
||||
server_name _domaininfo_;
|
||||
|
||||
location / {
|
||||
include uwsgi_params;
|
||||
#uwsgi_pass unix:/infosrv/uwsgi/uwsgi.sock;
|
||||
uwsgi_pass 127.0.0.1:8988;
|
||||
}
|
||||
|
||||
}
|
|
@ -1,62 +0,0 @@
|
|||
import sys
|
||||
import json
|
||||
import os
|
||||
|
||||
from flask import Flask
|
||||
from flask_restful import reqparse, abort, Api, Resource
|
||||
from flask import request, jsonify
|
||||
import base64
|
||||
import subprocess
|
||||
|
||||
|
||||
app = Flask(__name__)
|
||||
api = Api(app)
|
||||
|
||||
|
||||
|
||||
parser = reqparse.RequestParser()
|
||||
|
||||
def cmd_exec(cmdStr):
|
||||
try:
|
||||
output = subprocess.check_output(["bash","-c", cmdStr]).strip()
|
||||
except Exception as e:
|
||||
print(e)
|
||||
output = ""
|
||||
return output
|
||||
|
||||
class GetUserId(Resource):
|
||||
def get(self):
|
||||
parser.add_argument('userName')
|
||||
args = parser.parse_args()
|
||||
ret = {}
|
||||
|
||||
if args["userName"] is not None and len(args["userName"].strip()) > 0:
|
||||
# Replace with your corp domains
|
||||
corpDomains = ['ATHENA']
|
||||
ret["uid"] = ""
|
||||
|
||||
for corpDomain in corpDomains:
|
||||
if len(ret["uid"].strip())==0:
|
||||
userName = str(args["userName"]).strip().split("@")[0]
|
||||
uid = cmd_exec("id -u %s\\\\%s" % (corpDomain,userName))
|
||||
gid = cmd_exec("id -g %s\\\\%s" % (corpDomain,userName))
|
||||
groups = cmd_exec("id -Gnz %s\\\\%s" % (corpDomain,userName)).split("\0")
|
||||
|
||||
ret["uid"] = uid
|
||||
ret["gid"] = gid
|
||||
ret["groups"] = groups
|
||||
|
||||
|
||||
resp = jsonify(ret)
|
||||
resp.headers["Access-Control-Allow-Origin"] = "*"
|
||||
resp.headers["dataType"] = "json"
|
||||
|
||||
return resp
|
||||
|
||||
##
|
||||
## Set up the API resource routing here
|
||||
##
|
||||
api.add_resource(GetUserId, '/GetUserId')
|
||||
|
||||
if __name__ == '__main__':
|
||||
app.run(debug=False,host="0.0.0.0",threaded=True)
|
|
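For reference, a client-side sketch of calling this endpoint with Python requests; the host is a placeholder, and the 8079 port assumes the mapping done by start.sh (container port 80 to host port 8079):

```python
import requests

resp = requests.get(
    "http://<server address>:8079/GetUserId",
    params={"userName": "user1@example.org"},  # placeholder domain user
)
print(resp.json())  # e.g. {"uid": "...", "gid": "...", "groups": [...]}
```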
@ -1,17 +0,0 @@
|
|||
[uwsgi]
|
||||
chdir=/infosrv
|
||||
module=domaininfo
|
||||
callable=app
|
||||
master=true
|
||||
processes=4
|
||||
chmod-socket=666
|
||||
logfile-chmod=644
|
||||
procname-prefix-spaced=DomainInfo
|
||||
py-autoreload=1
|
||||
socket=127.0.0.1:8988
|
||||
|
||||
vacuum=true
|
||||
socket=%(chdir)/uwsgi/uwsgi.sock
|
||||
stats=%(chdir)/uwsgi/uwsgi.status
|
||||
pidfile=%(chdir)/uwsgi/uwsgi.pid
|
||||
daemonize=%(chdir)/uwsgi/uwsgi.log
|
|
@ -1,39 +0,0 @@
|
|||
# This is a template configure file. Please change to your own settings before use.
|
||||
[libdefaults]
|
||||
ticket_lifetime = 24h
|
||||
# Replace with your own default realm
|
||||
default_realm = ATHENA.MIT.EDU
|
||||
forwardable = true
|
||||
|
||||
# Replace with your own realms
|
||||
[realms]
|
||||
ATHENA.MIT.EDU = {
|
||||
kdc = kerberos.mit.edu
|
||||
kdc = kerberos-1.mit.edu
|
||||
kdc = kerberos-2.mit.edu:88
|
||||
admin_server = kerberos.mit.edu
|
||||
default_domain = mit.edu
|
||||
}
|
||||
|
||||
# Replace with your own domain realms
|
||||
[domain_realm]
|
||||
.mit.edu = ATHENA.MIT.EDU
|
||||
mit.edu = ATHENA.MIT.EDU
|
||||
|
||||
#[kdc]
|
||||
# profile = /etc/krb5kdc/kdc.conf
|
||||
|
||||
[appdefaults]
|
||||
pam = {
|
||||
debug = false
|
||||
ticket_lifetime = 36000
|
||||
renew_lifetime = 36000
|
||||
forwardable = true
|
||||
krb4_convert = false
|
||||
}
|
||||
|
||||
[logging]
|
||||
kdc = SYSLOG:INFO:DAEMON
|
||||
kdc = FILE:/var/log/krb5kdc.log
|
||||
admin_server = FILE:/var/log/kadmin.log
|
||||
default = FILE:/var/log/krb5lib.log
|
|
@ -1,20 +0,0 @@
|
|||
# /etc/nsswitch.conf
|
||||
#
|
||||
# Example configuration of GNU Name Service Switch functionality.
|
||||
# If you have the `glibc-doc-reference' and `info' packages installed, try:
|
||||
# `info libc "Name Service Switch"' for information about this file.
|
||||
|
||||
passwd: compat winbind
|
||||
group: compat winbind
|
||||
shadow: compat
|
||||
gshadow: files
|
||||
|
||||
hosts: files dns
|
||||
networks: files
|
||||
|
||||
protocols: db files
|
||||
services: db files
|
||||
ethers: db files
|
||||
rpc: db files
|
||||
|
||||
netgroup: compat winbind
|
|
@ -1,14 +0,0 @@
|
|||
#!/bin/bash
|
||||
sed -i 's/%$(PAISMBUSER)/'$PAISMBUSER'/' /etc/samba/smb.conf
|
||||
sed -i 's/%$(DOMAIN)/'$DOMAIN'/' /etc/samba/smb.conf
|
||||
|
||||
net ads join -U "$DOMAINUSER"%"$DOMAINPWD"
|
||||
service winbind restart
|
||||
service smbd restart
|
||||
|
||||
useradd "$PAISMBUSER"
|
||||
(echo "$PAISMBPWD" && echo "$PAISMBPWD") | ./usr/bin/smbpasswd -a "$PAISMBUSER"
|
||||
|
||||
uwsgi --ini /infosrv/uwsgi.ini
|
||||
service nginx stop
|
||||
nginx -g 'daemon off;'
|
|
@ -1,9 +0,0 @@
|
|||
#!/bin/bash
|
||||
paiuser=$1
|
||||
|
||||
datapath=/share/pai/data
|
||||
umask 000
|
||||
if [ ! -d "$datapath" ];then
|
||||
mkdir -p "$datapath"
|
||||
chown "$paiuser":"$paiuser" "$datapath"
|
||||
fi
|
|
@ -1,25 +0,0 @@
|
|||
#!/bin/bash
|
||||
user=$1
|
||||
domain=$2
|
||||
uname=$3
|
||||
paiuser=$4
|
||||
|
||||
userspath=/share/pai/users
|
||||
umask 000
|
||||
if [ ! -d "$userspath" ];then
|
||||
mkdir -p "$userspath"
|
||||
chown "$paiuser":"$paiuser" "$userspath"
|
||||
fi
|
||||
|
||||
umask 007
|
||||
userpath="$userspath"/"$user"
|
||||
if [ ! -d "$userpath" ];then
|
||||
mkdir -p "$userpath"
|
||||
if [ $user != $uname ]
|
||||
then
|
||||
chown "$domain\\$user":"$paiuser" $userpath
|
||||
else
|
||||
chown "$user":"$paiuser" $userpath
|
||||
fi
|
||||
setfacl -m u:"$paiuser":rwx $userpath
|
||||
fi
|
|
@ -1,118 +0,0 @@
|
|||
# This is a template configure file. Please change to your own settings before use.
|
||||
# Further doco is here
|
||||
# https://www.samba.org/samba/docs/man/manpages/smb.conf.5.html
|
||||
[global]
|
||||
# No .tld
|
||||
workgroup = %$(DOMAIN)
|
||||
# Active Directory System
|
||||
security = ADS
|
||||
|
||||
# Replace with your own realm defined in krb5.conf
|
||||
realm = ATHENA.MIT.EDU
|
||||
|
||||
# map to guest = bad user
|
||||
# guest account = guest
|
||||
# Just a member server
|
||||
domain master = No
|
||||
local master = No
|
||||
preferred master = No
|
||||
# Works both in samba 3.2 and 3.6 and 4.1
|
||||
|
||||
# Replace with your own idmap config
|
||||
idmap config * : backend = rid
|
||||
idmap config * : range = 900000000-999999999
|
||||
idmap config ATHENA : backend = rid
|
||||
idmap config ATHENA : range = 100000000-199999999
|
||||
|
||||
|
||||
# One week is the default
|
||||
idmap cache time = 604800
|
||||
# If you set this to 0 winbind will get thrown into a loop and
|
||||
# be stuck at 99% mem and cpu.
|
||||
# 5m is the default
|
||||
winbind cache time = 300
|
||||
winbind enum users = No
|
||||
winbind enum groups = No
|
||||
# This way users log in with username instead of username@example.org
|
||||
winbind use default domain = No
|
||||
# Do not recursively descend into groups, it kills performance
|
||||
winbind nested groups = No
|
||||
# This is what slows down logins, if we didn't care about resolving groups
|
||||
# we could set this to 0
|
||||
winbind expand groups = 0
|
||||
winbind refresh tickets = Yes
|
||||
# Using offline login = Yes forces max domain connections to 1
|
||||
winbind offline logon = No
|
||||
winbind max clients = 1500
|
||||
winbind max domain connections = 50
|
||||
|
||||
# winbind separator = @
|
||||
winbind:ignore domains = 001D 064D 343I ADVENTUREWORKS9 AMALGA AMALGATEST BIGPARK BINGLAB CAE CCSSELFHOST CDV CERDP CETI CFDEV CLOUDLAB CONNECTED CONTOSO-01 CPEXEC CPMT CPMTPPE CRMDFIFDDOM CSLAB CTDEV DCLAB E14 E15 ERIDANUS EXCHANGE EXTRANET EXTRANETTEST FORNAX FULTONDOMAIN GME GMR HADEV HAVANATWO HEALTH HOSPITALA HVAADCS HYDRI HYPER-V IDCNETTEST ISLAND IT ITNEXTGENLAB LAB1BOISE LHWKSTA MASSIVEINCORPOR MEXEXCHANGEDC MGDNOK MMS MPSD-WI MR MSGENG MS-GMR MSLPA MSSTORE MSTT MTETCS MUTEST MYOPWV NEBCPS1 NEBCPS2 NEBCPS3 NEBCPS4 NEBCPS5 NLCPS1 NEBCPST NEBCPST NOE NOKIAEA NORTHWINDTEST NTDEV OBPPERF OCTANS OEXTRANET OFFICEDOG OFORNAX OSSCPUB OUALAB PARTNERS PARTTEST PCTS PDSTEAM PEOPLETEST PHX PIN PORTAL PROSUPPORT PRVFAB PYXIDIS RESOURCE REVOLUTION2 SAW SDITESTT SEDEV SEGROUP SENET SENTILLIONINC SLCLAB SPEECH SPWLAB SPXMAILDOMAIN STBTEST STODC01 SYS-SQLSVR SYS-WINGROUP TANGODOM1 TELECOMLAB TEQUILA Threshold TNT UKMCS UPGROUP VE VMLIBDOM VOMJUMPSTART WGIA WINDEPLOY WINSE WINSE-CTDEV WINSRVLAB WMD WPDEV XCORP XCORP XGROUP XGROUP XGROUPPPE XPORTAL XRED ZIPLINE
|
||||
# Disable printer support
|
||||
load printers = No
|
||||
printing = bsd
|
||||
printcap name = /dev/null
|
||||
disable spoolss = yes
|
||||
# Becomes /home/example/username
|
||||
template homedir = /storage/users/%U
|
||||
# shell access
|
||||
template shell = /bin/bash
|
||||
client use spnego = Yes
|
||||
client ntlmv2 auth = Yes
|
||||
encrypt passwords = Yes
|
||||
restrict anonymous = 2
|
||||
log level = 2
|
||||
log file = /var/log/samba/samba.log
|
||||
smb2 max read = 8388608
|
||||
smb2 max write = 8388608
|
||||
smb2 max trans = 8388608
|
||||
# This is fairly custom to Ubuntu
|
||||
# See www.samba.org/samba/docs/man/manpages-3/smb.conf.5.html#ADDMACHINESCRIPT
|
||||
# and https://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/domain-member.html
|
||||
add machine script = /usr/sbin/adduser --system --gecos %u --home /var/lib/nobody --shell /bin/false --uid 300 --no-create-home %u
|
||||
|
||||
|
||||
[root]
|
||||
comment = Samba share root
|
||||
path = /share/pai
|
||||
valid users = %$(PAISMBUSER)
|
||||
writable = yes
|
||||
browseable = no
|
||||
#root preexec = /usr/bin/sambarootcreate %$(PAISMBUSER)
|
||||
create mask = 0777
|
||||
directory mask = 0777
|
||||
|
||||
[users]
|
||||
comment = Samba share users
|
||||
path = /share/pai/users
|
||||
valid users = %$(PAISMBUSER)
|
||||
writable = yes
|
||||
browseable = no
|
||||
root preexec = /usr/bin/sambauserhomecreate %U %D %u %$(PAISMBUSER)
|
||||
create mask = 0777
|
||||
directory mask = 0777
|
||||
|
||||
[home]
|
||||
comment = Samba share user home
|
||||
path = /share/pai/users/%U
|
||||
writeable = yes
|
||||
browseable = yes
|
||||
valid users = %$(PAISMBUSER) %D\%U
|
||||
root preexec = /usr/bin/sambauserhomecreate %U %D %u %$(PAISMBUSER)
|
||||
create mask = 0777
|
||||
|
||||
[data]
|
||||
comment = Samba share data
|
||||
path = /share/pai/data
|
||||
valid users = %$(PAISMBUSER) %D\%U
|
||||
writable = yes
|
||||
browseable = yes
|
||||
root preexec = /usr/bin/sambadatacreate %$(PAISMBUSER)
|
||||
directory mask = 0777
|
||||
force directory mode = 0777
|
||||
directory security mask = 0777
|
||||
force directory security mode = 0777
|
||||
create mask = 0777
|
||||
force create mode = 0777
|
||||
security mask = 0777
|
||||
force security mode = 0777
|
|
@ -1,15 +0,0 @@
|
|||
#!/bin/bash
|
||||
if [ -z "$5" ]; then
|
||||
echo "usage: ./start.sh <DOMAIN> <DOMAINUSER> <DOMAINPWD> <PAISMBUSER> <PAISMBPWD>"
|
||||
else
|
||||
DOMAIN=$1
|
||||
DOMAINUSER=$2
|
||||
DOMAINPWD=$3
|
||||
PAISMBUSER=$4
|
||||
PAISMBPWD=$5
|
||||
|
||||
mkdir -p /share/pai
|
||||
docker run -dit --privileged --restart=always -p 8079:80 -p 445:445 --mount type=bind,source=/share/pai,target=/share/pai \
|
||||
--name paismb -e DOMAIN="$DOMAIN" -e DOMAINUSER="$DOMAINUSER" -e DOMAINPWD="$DOMAINPWD" \
|
||||
-e PAISMBUSER="$PAISMBUSER" -e PAISMBPWD="$PAISMBPWD" paismb:stable
|
||||
fi
|
|
@ -1,241 +0,0 @@
|
|||
# Team wise storage
|
||||
|
||||
*NOTICE: This tool has been deprecated, please refer to [Setup Kubernetes Persistent Volumes as Storage on PAI](../../docs/setup-persistent-volumes-on-pai.md).*
|
||||
|
||||
|
||||
A tool to manage external storage in PAI.
|
||||
|
||||
## Index
|
||||
- [ What is team wise storage](#Team_storage)
|
||||
- [ Team wise storage usages ](#Usages)
|
||||
- [ Setup server ](#Usages_setup_server)
|
||||
- [ Create storage server in PAI ](#Usages_server)
|
||||
- [ Create storage config in PAI ](#Usages_config)
|
||||
- [ Set storage config access for group ](#Usages_groupsc)
|
||||
- [ Use Storage in PAI ](#Usages_job)
|
||||
- [ Example ](#Usages_example)
|
||||
- [ Storage data structure ](#Data_structure)
|
||||
- [ Server data structure ](#Server_data)
|
||||
- [ Nfs Server data structure ](#Nfs_data)
|
||||
- [ Samba Server data structure ](#Samba_data)
|
||||
- [ Azurefile Server data structure ](#Azurefile_data)
|
||||
- [ Azureblob Server data structure ](#Azureblob_data)
|
||||
- [ Hdfs Server data structure ](#Hdfs_data)
|
||||
- [ Config data structure ](#Config_data)
|
||||
- [ Config in group data ](#Config_in_group_data)
|
||||
|
||||
## What is team wise storage <a name="Team_storage"></a>
|
||||
Team wise storage is a solution that helps admins manage NAS (network-attached storage) by team/group. After the admin has configured the team wise storage settings, users can easily use NAS in their jobs.<br/>
|
||||
The team wise storage solution offers:
|
||||
- Multiple NAS support, including NFS, Samba, Azurefile, Azureblob and HDFS
|
||||
- Configurable mount structure settings
|
||||
- Mixed usage for different NAS
|
||||
- Configuration for Team/Group scope
|
||||
|
||||
## Team wise storage usages <a name="Usages"></a>
|
||||
|
||||
### Setup server <a name="Usages_setup_server"></a>
|
||||
- NFS
|
||||
|
||||
Edit /etc/exports, export /root/path/to/share
|
||||
```
|
||||
/root/path/to/share *(rw,sync,no_root_squash)
|
||||
```
|
||||
no_root_squash is needed for the storage plugin to create folders.
|
||||
|
||||
- Samba
|
||||
|
||||
After creating the Samba server, create a user for PAI to use Samba.
|
||||
```
|
||||
useradd paismb
|
||||
smbpasswd -a paismb
|
||||
#Input password for paismb
|
||||
```
|
||||
|
||||
- Azurefile
|
||||
|
||||
Create Azurefile share through azure web portal.
|
||||
|
||||
- Azureblob
|
||||
|
||||
Create Azureblob share through azure web portal.
|
||||
|
||||
|
||||
### Create storage server in PAI <a name="Usages_server"></a>
|
||||
In the PAI dev-box, switch to the folder pai/contrib/storage-plugin
|
||||
|
||||
Create server config using command:
|
||||
- NFS:
|
||||
```
|
||||
python storagectl.py server set NAME nfs ADDRESS ROOTPATH
|
||||
```
|
||||
|
||||
- Samba:
|
||||
```
|
||||
python storagectl.py server set NAME samba ADDRESS ROOTPATH USERNAME PASSWORD DOMAIN
|
||||
```
|
||||
|
||||
- Azurefile:
|
||||
```
|
||||
python storagectl.py server set NAME azurefile DATASTORE FILESHARE ACCOUNTNAME KEY
|
||||
```
|
||||
|
||||
- Azureblob:
|
||||
```
|
||||
python storagectl.py server set NAME azureblob DATASTORE CONTAINERNAME ACCOUNTNAME KEY
|
||||
```
|
||||
|
||||
- HDFS:
|
||||
```
|
||||
python storagectl.py server set NAME hdfs NAMENODE PORT
|
||||
```
|
||||
|
||||
### Create storage config in PAI <a name="Usages_config"></a>
|
||||
In the PAI dev-box, switch to the folder pai/contrib/storage-plugin
|
||||
|
||||
Create config using command:
|
||||
```
|
||||
python storagectl.py config set CONFIG_NAME GROUP_NAME [-s SERVER_NAME_1 SERVER_NAME_2 ...] [-m MOUNT_POINT SERVER PATH]... [-d]
|
||||
```
|
||||
|
||||
### Set storage config access for group <a name="Usages_groupsc"></a>
|
||||
In the PAI dev-box, switch to the folder pai/contrib/storage-plugin
|
||||
|
||||
Set storage config access for group using command:
|
||||
```
|
||||
python storagectl.py groupsc add GROUP_NAME CONFIG_NAME
|
||||
```
|
||||
|
||||
### Use Storage info in job container <a name="Usages_job"></a>
|
||||
Users can use team wise storage through the job submit page. Please refer to the related page for details.
|
||||
|
||||
### Example <a name="Usages_example"></a>
|
||||
Suppose the admin has set up a new Samba server "smbserver" on "10.0.0.1" and created the PAI account "paismb" with password "paipwd".
|
||||
The structure of the Samba server is as follows:
|
||||
```
|
||||
-- root
|
||||
-- data
|
||||
-- users
|
||||
-- user1
|
||||
-- user2
|
||||
...
|
||||
```
|
||||
Now we want all members of "paigroup" mount server's data folder to /data, and user's data (e.g user1) to /user by default. The admin should setup storage config in PAI using:
|
||||
```bash
|
||||
python storagectl.py server set smbserver samba 10.0.0.1 root paismb paipwd local
|
||||
python storagectl.py config set configsmb -s smbserver -m /data smbserver data -m /user smbserver 'users/${PAI_USER_NAME}' -d
|
||||
python storagectl.py groupsc add paigroup configsmb
|
||||
```
|
||||
Then when "paiuser" from "paigroup" uses job submit page, the configsmb will be shown and user can choose whether to use it <br/>
|
||||
|
||||
|
||||
## Team wise storage data structures <a name="Data_structure"></a>
|
||||
|
||||
### Server data structure <a name="Server_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "nfs|samba|azurefile|azureblob"
|
||||
}
|
||||
```
|
||||
#### Nfs Server data structure <a name="Nfs_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "nfs",
|
||||
"address": "server/address",
|
||||
"rootPath": "server/root/path"
|
||||
}
|
||||
```
|
||||
|
||||
#### Samba Server data structure <a name="Samba_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "samba",
|
||||
"address": "server/address",
|
||||
"rootPath": "server/root/path",
|
||||
"userName": "username",
|
||||
"password": "password",
|
||||
"domain": "userdomain"
|
||||
}
|
||||
```
|
||||
|
||||
#### Azurefile Server data structure <a name="Azurefile_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "azurefile",
|
||||
"dataStore": "datastore",
|
||||
"fileShare": "fileshare",
|
||||
"accountName": "accountname",
|
||||
"key": "key"
|
||||
}
|
||||
```
|
||||
|
||||
#### Azureblob Server data structure <a name="Azureblob_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "azureblob",
|
||||
"dataStore": "datastore",
|
||||
"containerName": "containername",
|
||||
"accountName": "accountname",
|
||||
"key": "key"
|
||||
}
|
||||
```
|
||||
|
||||
#### Hdfs Server data structure <a name="Hdfs_data"></a>
|
||||
```json
|
||||
{
|
||||
"spn": "servername",
|
||||
"type": "hdfs",
|
||||
"namenode": "namenode",
|
||||
"port": "port",
|
||||
}
|
||||
```
|
||||
|
||||
### Config data structure <a name="Config_data"></a>
|
||||
```json
|
||||
{
|
||||
"name": "configname",
|
||||
"gpn": "groupname",
|
||||
"default": false,
|
||||
"servers": [
|
||||
"servername",
|
||||
],
|
||||
"mountInfos": [
|
||||
{
|
||||
"mountpoint": "local/mount/point",
|
||||
"server": "servername",
|
||||
"path": "server/sub/path"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
- MountInfo: how a user mounts a server path locally.
|
||||
```json
|
||||
{
|
||||
"mountpoint": "local/mount/point",
|
||||
"server": "servername",
|
||||
"path": "server/sub/path"
|
||||
}
|
||||
```
|
||||
|
||||
### Config in group data <a name="Config_in_group_data"></a>
- The storage configs a group can access are stored in the group data's extension field. For example, a group that can access STORAGE_CONFIG looks like the following:
```json
{
    "groupname": "groupname",
    "externalName": "externalName",
    "description": "description",
    "extension": {
        "acls": {
            "admin": false,
            "virtualClusters": [],
            "storageConfigs": ["STORAGE_CONFIG"]
        }
    }
}
```
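
On the cluster side, this extension lives in a Kubernetes secret in the `pai-group` namespace, keyed by the hex-encoded group name (see `storagectl.py` below). A minimal sketch for inspecting it directly, assuming `kubectl` access and standard shell tools:

```bash
# Sketch: dump a group's storage extension (assumes kubectl access to the PAI cluster).
# The secret name is the hex encoding of the group name; the value is base64-encoded JSON.
kubectl get secret "$(printf paigroup | xxd -p)" -n pai-group \
    -o jsonpath='{.data.extension}' | base64 -d
```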
@ -1,16 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@ -1,8 +0,0 @@
{
    "type": "nfs",
    "title": "nfs_example",
    "address": "10.0.0.1",
    "rootPath": "/share/nfs",
    "sharedFolders": ["data"],
    "privateFolders": ["users"]
}
@ -1,6 +0,0 @@
{
    "defaultStorage": "nfs_example.json",
    "externalStorages": [
        "nfs_example.json"
    ]
}
@ -1,37 +0,0 @@
{
    "type": "object",
    "properties": {
        "type": {
            "type": "string",
            "description": "The type of external storage"
        },
        "title": {
            "type": "string",
            "description": "Shown name of external storage"
        },
        "address": {
            "type": "string",
            "description": "The IP address of external storage"
        },
        "rootPath": {
            "type": "string",
            "description": "The root path of external storage"
        },
        "sharedFolders": {
            "type": "array",
            "description": "Shared folders under root path",
            "items": { "type": "string" }
        },
        "privateFolders": {
            "type": "array",
            "description": "The base of user private folders under root path, representing rootPath/$base/$username",
            "items": { "type": "string" }
        }
    },
    "required": [
        "type",
        "title",
        "address",
        "rootPath"
    ]
}
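As a side note, the nfs_example.json shown earlier should validate against this schema. A minimal sketch using the third-party `jsonschema` package (the file names here are illustrative, not from the repo):

```python
import json

from jsonschema import validate  # third-party package; assumed available

# Illustrative file names -- adjust to the actual layout.
with open("nfs_example.json") as f:
    instance = json.load(f)
with open("external-storage.schema.json") as f:
    schema = json.load(f)

validate(instance=instance, schema=schema)  # raises ValidationError on mismatch
```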
@ -1,18 +0,0 @@
{
    "type": "object",
    "properties": {
        "defaultStorage": {
            "type": "string",
            "description": "User default external storage"
        },
        "externalStorages": {
            "type": "array",
            "description": "All external storages that the user has permission to access",
            "items": { "type": "string" }
        }
    },
    "required": [
        "defaultStorage",
        "externalStorages"
    ]
}
@ -1,111 +0,0 @@
# storagectl

A tool to manage your storage config.

## Index
- [ Manage server ](#Server_config)
    - [ Set server ](#Server_set)
        - [ Set nfs server ](#Server_set_nfs)
        - [ Set samba server ](#Server_set_samba)
        - [ Set azurefile server ](#Server_set_azurefile)
        - [ Set azureblob server ](#Server_set_azureblob)
        - [ Set hdfs server ](#Server_set_hdfs)
    - [ List server ](#Server_list)
    - [ Delete server ](#Server_delete)

- [ Manage config ](#Config_config)
    - [ Set config ](#Config_set)
    - [ List config ](#Config_list)
    - [ Delete config ](#Config_delete)

- [ Manage group storage access ](#Groupsc_config)
    - [ Add group storage config ](#Groupsc_add)
    - [ List group storage configs ](#Groupsc_list)
    - [ Delete group storage config ](#Groupsc_delete)


## Manage Server <a name="Server_config"></a>
Manage servers in PAI. A server entry defines how PAI accesses a NAS server.
### Set server <a name="Server_set"></a>

#### Set nfs server <a name="Server_set_nfs"></a>
```
python storagectl.py server set NAME nfs ADDRESS ROOTPATH
```

#### Set samba server <a name="Server_set_samba"></a>
```
python storagectl.py server set NAME samba ADDRESS ROOTPATH USERNAME PASSWORD DOMAIN
```

#### Set azurefile server <a name="Server_set_azurefile"></a>
```
python storagectl.py server set NAME azurefile DATASTORE FILESHARE ACCOUNTNAME KEY [-p PROXY_ADDRESS PROXY_PASSWORD]
```

#### Set azureblob server <a name="Server_set_azureblob"></a>
```
python storagectl.py server set NAME azureblob DATASTORE CONTAINERNAME ACCOUNTNAME KEY
```

#### Set hdfs server <a name="Server_set_hdfs"></a>
```
python storagectl.py server set NAME hdfs NAMENODE PORT
```

### List server <a name="Server_list"></a>
```
python storagectl.py server list [-n SERVER_NAME_1, SERVER_NAME_2 ...]
```
- If -n is specified, list only the named servers. Otherwise list all servers.

### Delete server <a name="Server_delete"></a>
```
python storagectl.py server delete SERVER_NAME
```


## Manage Config <a name="Config_config"></a>
Manage configs for groups in PAI. A config defines a set of mount infos. Every config belongs to a group; that is, one group may have 0 to n configs.
### Set config <a name="Config_set"></a>
```
python storagectl.py config set CONFIG_NAME [-s SERVER_NAME_1 SERVER_NAME_2 ...] [-m MOUNT_POINT SERVER PATH]... [-d]
```
- If -d is set, the config's storage is mounted by default.
- -m specifies mount info for the config. If -m is given, PATH on SERVER will be mounted to MOUNT_POINT.
    - [Job environment variables](https://github.com/microsoft/pai/blob/master/docs/job_tutorial.md#environment-variables) can be referenced in PATH. Please use '' to quote job environment variables, to avoid expanding local variables in the dev-box.

For example, suppose we have set a config using:
```
python storagectl.py config set SAMPLE_CONFIG -m /mnt/job SAMPLE_SERVER 'users/${PAI_USER_NAME}/jobs/${PAI_JOB_NAME}'
```
If the current user is 'paiuser' and the current job is 'job-TEST', this config will mount SAMPLE_SERVER/users/paiuser/jobs/job-TEST to /mnt/job.

### List config <a name="Config_list"></a>
```
python storagectl.py config list [-n CONFIG_NAME_1, CONFIG_NAME_2 ...]
```
- If -n is specified, list only the named configs. Otherwise list all configs.

### Delete config <a name="Config_delete"></a>
```
python storagectl.py config delete CONFIG_NAME
```


## Manage group storage access <a name="Groupsc_config"></a>
Manage a PAI group's storage config access.
### Add group storage config <a name="Groupsc_add"></a>
```
python storagectl.py groupsc add GROUP_NAME CONFIG_NAME
```

### List group storage configs <a name="Groupsc_list"></a>
```
python storagectl.py groupsc list GROUP_NAME
```

### Delete group storage config <a name="Groupsc_delete"></a>
```
python storagectl.py groupsc delete GROUP_NAME CONFIG_NAME
```
@ -1,273 +0,0 @@
#!/usr/bin/env python
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

from __future__ import absolute_import
from __future__ import print_function

import os
import sys
import argparse
import datetime
import logging
import logging.config
import json
import base64
import subprocess
import multiprocessing
import random, string

from kubernetes import client, config, watch
from kubernetes.client.rest import ApiException

from utils.storage_util import *

import binascii

logger = logging.getLogger(__name__)

# Save server config to k8s secret
def save_secret(secret_name, name, content_dict):
    secret_dict = dict()
    secret_dict[name] = base64.b64encode(json.dumps(content_dict))
    patch_secret(secret_name, secret_dict, "pai-storage")

def show_secret(args):
    secret_data = get_secret(args.secret_name, "pai-storage")
    if secret_data is None:
        logger.error("No secret found.")
    else:
        for key, value in secret_data.iteritems():
            if args.name is None or key in args.name:
                print(key)
                print(base64.b64decode(value))

def delete_secret(args):
    delete_secret_content(args.secret_name, args.name, "pai-storage")


def server_set(args):
    content_dict = dict()
    content_dict["spn"] = args.name
    content_dict["type"] = args.server_type
    if args.server_type == "nfs":
        content_dict["address"] = args.address
        content_dict["rootPath"] = args.root_path
    elif args.server_type == "samba":
        content_dict["address"] = args.address
        content_dict["rootPath"] = args.root_path
        content_dict["userName"] = args.user_name
        content_dict["password"] = args.password
        content_dict["domain"] = args.domain
    elif args.server_type == "azurefile":
        content_dict["dataStore"] = args.data_store
        content_dict["fileShare"] = args.file_share
        content_dict["accountName"] = args.account_name
        content_dict["key"] = args.key
        if args.proxy is not None:
            content_dict["proxy"] = args.proxy
    elif args.server_type == "azureblob":
        content_dict["dataStore"] = args.data_store
        content_dict["containerName"] = args.container_name
        content_dict["accountName"] = args.account_name
        content_dict["key"] = args.key
    elif args.server_type == "hdfs":
        content_dict["namenode"] = args.namenode
        content_dict["port"] = args.port
    else:
        logger.error("Unknown storage type")
        sys.exit(1)
    save_secret("storage-server", args.name, content_dict)


def config_set(args):
    try:
        content_dict = dict()
        content_dict["name"] = args.name
        content_dict["servers"] = args.servers
        content_dict["default"] = args.default
        if args.mount_info is not None:
            mount_infos = []
            for info_data in args.mount_info:
                # Verify mount info: mountPoint should start with "/" and path should not
                if not info_data[0].startswith("/"):
                    raise NameError("MOUNT_POINT should be an absolute path and start with '/'")
                elif info_data[2].startswith("/"):
                    raise NameError("PATH should be a relative path and not start with '/'")
                else:
                    info = {"mountPoint" : info_data[0], "server" : info_data[1], "path" : info_data[2]}
                    mount_infos.append(info)
            content_dict["mountInfos"] = mount_infos
    except NameError as e:
        logger.error(e)
    else:
        save_secret("storage-config", args.name, content_dict)

def get_group_extension(group_name):
    group_hex = binascii.hexlify(group_name)
    secret_data = get_secret(group_hex, "pai-group")
    if secret_data is None:
        logger.error("No group found.")
        return None
    else:
        extension = json.loads(base64.b64decode(secret_data["extension"]))
        return extension

def groupsc_add(args):
    extension = get_group_extension(args.group_name)
    if extension is not None:
        if "storageConfigs" not in extension["acls"]:
            extension["acls"]["storageConfigs"] = []
        storageConfigs = extension["acls"]["storageConfigs"]
        if args.config_name not in storageConfigs:
            storageConfigs.append(args.config_name)
        secret_dict = dict()
        secret_dict["extension"] = base64.b64encode(json.dumps(extension))
        patch_secret(binascii.hexlify(args.group_name), secret_dict, "pai-group")
        logger.info("Successfully added storage config to group!")

def groupsc_delete(args):
    extension = get_group_extension(args.group_name)
    if extension is not None:
        storageConfigs = extension["acls"]["storageConfigs"]
        if args.config_name in storageConfigs:
            storageConfigs.remove(args.config_name)
        secret_dict = dict()
        secret_dict["extension"] = base64.b64encode(json.dumps(extension))
        patch_secret(binascii.hexlify(args.group_name), secret_dict, "pai-group")
        logger.info("Successfully deleted storage config from group!")

def groupsc_list(args):
    extension = get_group_extension(args.group_name)
    if extension is not None:
        print(extension["acls"]["storageConfigs"])

def setup_logger_config(logger):
    """
    Setup logging configuration.
    """
    if len(logger.handlers) == 0:
        logger.propagate = False
        logger.setLevel(logging.DEBUG)
        consoleHandler = logging.StreamHandler()
        consoleHandler.setLevel(logging.DEBUG)
        formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        consoleHandler.setFormatter(formatter)
        logger.addHandler(consoleHandler)


def main():
    scriptFolder = os.path.dirname(os.path.realpath(__file__))
    os.chdir(scriptFolder)

    parser = argparse.ArgumentParser(description="pai storage management tool")
    subparsers = parser.add_subparsers(help='Storage management cli')

    # ./storagectl.py server set|list|delete
    server_parser = subparsers.add_parser("server", description="Commands to manage servers.", formatter_class=argparse.RawDescriptionHelpFormatter)
    server_subparsers = server_parser.add_subparsers(help="Add/modify, list or delete server")
    # ./storagectl.py server set ...
    server_set_parser = server_subparsers.add_parser("set")
    server_set_parser.add_argument("name")
    server_set_subparsers = server_set_parser.add_subparsers(help="Add/modify storage types, currently supports nfs, samba, azurefile, azureblob and hdfs")
    # ./storagectl.py server set NAME nfs ADDRESS ROOTPATH
    server_set_nfs_parser = server_set_subparsers.add_parser("nfs")
    server_set_nfs_parser.add_argument("address", metavar="address", help="Nfs remote address")
    server_set_nfs_parser.add_argument("root_path", metavar="rootpath", help="Nfs remote root path")
    server_set_nfs_parser.set_defaults(func=server_set, server_type="nfs")
    # ./storagectl.py server set NAME samba ADDRESS ROOTPATH USERNAME PASSWORD DOMAIN
    server_set_samba_parser = server_set_subparsers.add_parser("samba")
    server_set_samba_parser.add_argument("address", metavar="address", help="Samba remote address")
    server_set_samba_parser.add_argument("root_path", metavar="rootpath", help="Samba remote root path")
    server_set_samba_parser.add_argument("user_name", metavar="username", help="Samba PAI username")
    server_set_samba_parser.add_argument("password", metavar="password", help="Samba PAI password")
    server_set_samba_parser.add_argument("domain", metavar="domain", help="Samba PAI domain")
    server_set_samba_parser.set_defaults(func=server_set, server_type="samba")
    # ./storagectl.py server set NAME azurefile DATASTORE FILESHARE ACCOUNTNAME KEY [-p PROXY_ADDRESS PROXY_PASSWORD]
    server_set_azurefile_parser = server_set_subparsers.add_parser("azurefile")
    server_set_azurefile_parser.add_argument("data_store", metavar="datastore", help="Azurefile data store")
    server_set_azurefile_parser.add_argument("file_share", metavar="fileshare", help="Azurefile file share")
    server_set_azurefile_parser.add_argument("account_name", metavar="accountname", help="Azurefile account name")
    server_set_azurefile_parser.add_argument("key", metavar="key", help="Azurefile share key")
    server_set_azurefile_parser.add_argument("-p", "--proxy", dest="proxy", nargs=2, help="Proxy to mount azure file: PROXY_INFO PROXY_PASSWORD")
    server_set_azurefile_parser.set_defaults(func=server_set, server_type="azurefile")
    # ./storagectl.py server set NAME azureblob DATASTORE CONTAINERNAME ACCOUNTNAME KEY
    server_set_azureblob_parser = server_set_subparsers.add_parser("azureblob")
    server_set_azureblob_parser.add_argument("data_store", metavar="datastore", help="Azureblob data store")
    server_set_azureblob_parser.add_argument("container_name", metavar="containername", help="Azureblob container name")
    server_set_azureblob_parser.add_argument("account_name", metavar="accountname", help="Azureblob account name")
    server_set_azureblob_parser.add_argument("key", metavar="key", help="Azureblob share key")
    server_set_azureblob_parser.set_defaults(func=server_set, server_type="azureblob")
    # ./storagectl.py server set NAME hdfs NAMENODE PORT
    server_set_hdfs_parser = server_set_subparsers.add_parser("hdfs")
    server_set_hdfs_parser.add_argument("namenode", metavar="namenode", help="HDFS name node")
    server_set_hdfs_parser.add_argument("port", metavar="port", help="HDFS name node port")
    server_set_hdfs_parser.set_defaults(func=server_set, server_type="hdfs")
    # ./storagectl.py server list [-n SERVER_NAME_1, SERVER_NAME_2 ...]
    server_list_parser = server_subparsers.add_parser("list")
    server_list_parser.add_argument("-n", "--name", dest="name", nargs="+", help="filter result by names")
    server_list_parser.set_defaults(func=show_secret, secret_name="storage-server")
    # ./storagectl.py server delete SERVER_NAME
    server_del_parser = server_subparsers.add_parser("delete")
    server_del_parser.add_argument("name")
    server_del_parser.set_defaults(func=delete_secret, secret_name="storage-server")

    # ./storagectl.py config ...
    config_parser = subparsers.add_parser("config", description="Manage config", formatter_class=argparse.RawDescriptionHelpFormatter)
    config_subparsers = config_parser.add_subparsers(help="Manage config")
    # ./storagectl.py config set CONFIG_NAME [-s SERVER_NAME_1 SERVER_NAME_2 ...] [-m MOUNT_POINT SERVER PATH]... [-d]
    config_set_parser = config_subparsers.add_parser("set")
    config_set_parser.add_argument("name", help="Config name")
    config_set_parser.add_argument("-s", "--server", dest="servers", nargs="+", help="-s SERVER_NAME_1 SERVER_NAME_2 ...")
    config_set_parser.add_argument("-m", "--mountinfo", dest="mount_info", nargs=3, action="append", help="-m MOUNT_POINT SERVER SUB_PATH")
    config_set_parser.add_argument("-d", "--default", action="store_true", help="Mount by default")
    config_set_parser.set_defaults(func=config_set)
    # ./storagectl.py config list [-n CONFIG_NAME_1, CONFIG_NAME_2 ...] [-g GROUP_NAME_1, GROUP_NAME_2 ...]
    config_list_parser = config_subparsers.add_parser("list")
    config_list_parser.add_argument("-n", "--name", dest="name", nargs="+", help="filter result by names")
    config_list_parser.add_argument("-g", "--group", dest="group", nargs="+", help="filter result by groups")
    config_list_parser.set_defaults(func=show_secret, secret_name="storage-config")
    # ./storagectl.py config delete CONFIG_NAME
    config_del_parser = config_subparsers.add_parser("delete")
    config_del_parser.add_argument("name")
    config_del_parser.set_defaults(func=delete_secret, secret_name="storage-config")

    # ./storagectl.py groupsc add|delete|list
    groupsc_parser = subparsers.add_parser("groupsc", description="Manage group storage config", formatter_class=argparse.RawDescriptionHelpFormatter)
    groupsc_subparsers = groupsc_parser.add_subparsers(help="Manage group storage config")
    # ./storagectl.py groupsc add GROUP_NAME STORAGE_CONFIG_NAME
    groupsc_add_parser = groupsc_subparsers.add_parser("add")
    groupsc_add_parser.add_argument("group_name")
    groupsc_add_parser.add_argument("config_name")
    groupsc_add_parser.set_defaults(func=groupsc_add)
    # ./storagectl.py groupsc delete GROUP_NAME STORAGE_CONFIG_NAME
    groupsc_delete_parser = groupsc_subparsers.add_parser("delete")
    groupsc_delete_parser.add_argument("group_name")
    groupsc_delete_parser.add_argument("config_name")
    groupsc_delete_parser.set_defaults(func=groupsc_delete)
    # ./storagectl.py groupsc list GROUP_NAME
    groupsc_list_parser = groupsc_subparsers.add_parser("list")
    groupsc_list_parser.add_argument("group_name")
    groupsc_list_parser.set_defaults(func=groupsc_list)

    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    setup_logger_config(logger)
    main()
@ -1,16 +0,0 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@ -1,194 +0,0 @@
#!/usr/bin/env python
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

import os
import sys
import time
import logging
import logging.config
import base64

from kubernetes import client, config, watch
from kubernetes.client.rest import ApiException

logger = logging.getLogger(__name__)

def confirm_namespace(namespace):
    config.load_kube_config()
    api_instance = client.CoreV1Api()

    try:
        api_response = api_instance.read_namespace(namespace)
    except ApiException as e:
        if e.status == 404:
            logger.info("Couldn't find namespace {0}. Create new namespace".format(namespace))
            try:
                meta_data = client.V1ObjectMeta(name=namespace)
                body = client.V1Namespace(metadata=meta_data)
                api_response = api_instance.create_namespace(body)
                logger.info("Namespace {0} is created".format(namespace))
            except ApiException as ie:
                logger.error("Exception when calling CoreV1Api->create_namespace: {0}".format(str(ie)))
                sys.exit(1)
        else:
            logger.error("Exception when calling CoreV1Api->read_namespace: {0}".format(str(e)))
            sys.exit(1)


# List usernames from pai-user secrets
def get_pai_users():
    users = []
    config.load_kube_config()
    api_instance = client.CoreV1Api()

    try:
        api_response = api_instance.list_namespaced_secret("pai-user")
        for item in api_response.items:
            users.append(base64.b64decode(item.data["username"]))
    except ApiException as e:
        if e.status == 404:
            logger.info("Couldn't find secret in namespace pai-user, exit")
            sys.exit(1)
        else:
            logger.error("Exception when calling CoreV1Api->list_namespaced_secret: {0}".format(str(e)))
            sys.exit(1)

    return users


def update_configmap(name, data_dict, namespace):
    confirm_namespace(namespace)

    config.load_kube_config()
    api_instance = client.CoreV1Api()

    meta_data = client.V1ObjectMeta()
    meta_data.namespace = namespace
    meta_data.name = name
    body = client.V1ConfigMap(
        metadata = meta_data,
        data = data_dict)

    try:
        api_response = api_instance.patch_namespaced_config_map(name, namespace, body)
        logger.info("configmap named {0} is updated.".format(name))
    except ApiException as e:
        if e.status == 404:
            try:
                logger.info("Couldn't find configmap named {0}. Create a new configmap".format(name))
                api_response = api_instance.create_namespaced_config_map(namespace, body)
                logger.info("Configmap named {0} is created".format(name))
            except ApiException as ie:
                logger.error("Exception when calling CoreV1Api->create_namespaced_config_map: {0}".format(str(ie)))
                sys.exit(1)
        else:
            logger.error("Exception when calling CoreV1Api->patch_namespaced_config_map: {0}".format(str(e)))
            sys.exit(1)


def get_storage_config(storage_config_name, namespace):
    confirm_namespace(namespace)

    config.load_kube_config()
    api_instance = client.CoreV1Api()

    try:
        api_response = api_instance.read_namespaced_config_map(storage_config_name, namespace)
    except ApiException as e:
        if e.status == 404:
            logger.info("Couldn't find configmap named {0}.".format(storage_config_name))
            return None
        else:
            logger.error("Exception when calling CoreV1Api->read_namespaced_config_map: {0}".format(str(e)))
            sys.exit(1)

    return api_response.data


def patch_secret(name, data_dict, namespace):
    confirm_namespace(namespace)

    config.load_kube_config()
    api_instance = client.CoreV1Api()

    meta_data = client.V1ObjectMeta()
    meta_data.namespace = namespace
    meta_data.name = name
    body = client.V1Secret(metadata = meta_data, data = data_dict)

    try:
        api_response = api_instance.patch_namespaced_secret(name, namespace, body)
        logger.info("Secret named {0} is updated.".format(name))
    except ApiException as e:
        logger.info(e)
        if e.status == 404:
            try:
                logger.info("Couldn't find secret named {0}. Create a new secret".format(name))
                api_response = api_instance.create_namespaced_secret(namespace, body)
                logger.info("Secret named {0} is created".format(name))
            except ApiException as ie:
                logger.error("Exception when calling CoreV1Api->create_namespaced_secret: {0}".format(str(ie)))
                sys.exit(1)
        else:
            logger.error("Exception when calling CoreV1Api->patch_namespaced_secret: {0}".format(str(e)))
            sys.exit(1)


def get_secret(name, namespace):
    confirm_namespace(namespace)

    config.load_kube_config()
    api_instance = client.CoreV1Api()

    try:
        api_response = api_instance.read_namespaced_secret(name, namespace)
    except ApiException as e:
        if e.status == 404:
            logger.info("Couldn't find secret named {0}.".format(name))
            return None
        else:
            logger.error("Exception when calling CoreV1Api->read_namespaced_secret: {0}".format(str(e)))
            sys.exit(1)

    return api_response.data


def delete_secret_content(name, key, namespace):
    confirm_namespace(namespace)

    config.load_kube_config()
    api_instance = client.CoreV1Api()
    try:
        api_response = api_instance.read_namespaced_secret(name, namespace)
        if api_response is not None and type(api_response.data) is dict:
            removed_content = api_response.data.pop(key, None)
            if removed_content is not None:
                meta_data = client.V1ObjectMeta()
                meta_data.namespace = namespace
                meta_data.name = name
                body = client.V1Secret(metadata = meta_data, data = api_response.data)
                api_instance.replace_namespaced_secret(name, namespace, body)
    except ApiException as e:
        if e.status == 404:
            logger.info("Couldn't find secret named {0}.".format(name))
        else:
            logger.error("Exception when trying to delete {0} from {1}: reason: {2}".format(key, name, str(e)))
            sys.exit(1)
@ -1,14 +0,0 @@
root = true

[*]
indent_style = space
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
max_line_length = 80

[*.md]
indent_size = 4
trim_trailing_whitespace = false
@ -1,89 +0,0 @@
# Created by https://www.gitignore.io/api/node
# Edit at https://www.gitignore.io/?templates=node

### Node ###
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Runtime data
pids
*.pid
*.seed
*.pid.lock

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage

# nyc test coverage
.nyc_output

# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# Bower dependency directory (https://bower.io/)
bower_components

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# TypeScript v1 declaration files
typings/

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variables file
.env
.env.test

# parcel-bundler cache (https://parceljs.org/)
.cache

# next.js build output
.next

# nuxt.js build output
.nuxt

# vuepress build output
.vuepress/dist

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# End of https://www.gitignore.io/api/node

dist/
@ -1,52 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import * as React from "react";

import { IFormControlProps } from ".";

interface ICheckBoxProps extends IFormControlProps<boolean> {}

const CheckBox: React.FunctionComponent<ICheckBoxProps> = (props) => {
  const { children, className, onChange, value } = props;

  const onInputChange: React.ChangeEventHandler<HTMLInputElement> = (event) => {
    if (onChange !== undefined) {
      onChange(event.target.checked);
    }
  };

  return (
    <div className={className}>
      <div className="checkbox">
        <label>
          <input type="checkbox" checked={value} onChange={onInputChange}/>
          {children ? " " + children : null}
        </label>
      </div>
    </div>
  );
};

export default CheckBox;
@ -1,59 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import classNames from "classnames";
import * as React from "react";

import { IFormControlProps } from ".";

interface INumberInputProps extends IFormControlProps<number> {
  min?: number;
  max?: number;
}

const NumberInput: React.FunctionComponent<INumberInputProps> = (props) => {
  const { children, className, max, min, onChange, value } = props;
  const onInputChange: React.ChangeEventHandler<HTMLInputElement> = (event) => {
    if (onChange !== undefined) { onChange(event.target.valueAsNumber); }
  };
  const UID = "U" + Math.floor(Math.random() * 0xFFFFFF).toString(16);

  return (
    <div className={classNames("form-group", className)}>
      <label htmlFor={UID}>{children}</label>
      <input
        type="number"
        className="form-control"
        id={UID}
        placeholder={children}
        min={min}
        max={max}
        value={value}
        onChange={onInputChange}
      />
    </div>
  );
};

export default NumberInput;
@ -1,75 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import classNames from "classnames";
import * as React from "react";

import { IFormControlProps } from ".";

interface IOptionProps {
  label: string;
  value: string;
}

const Option: React.FunctionComponent<IOptionProps> = ({ value, label }) => {
  return <option value={value}>{label}</option>;
};

interface ISelectProps extends IFormControlProps<string> {
  options: Array<IOptionProps | string>;
}

const Select: React.FunctionComponent<ISelectProps> = (props) => {
  const { children, className, options, value, onChange } = props;
  const onSelectChange: React.ChangeEventHandler<HTMLSelectElement> = (event) => {
    if (onChange !== undefined) {
      onChange(event.target.value);
    }
  };
  const UID = "U" + Math.floor(Math.random() * 0xFFFFFF).toString(16);
  return (
    <div className={classNames("form-group", className)}>
      <label htmlFor={UID}>{children}</label>
      <select
        className="form-control"
        id={UID}
        placeholder={children}
        value={value}
        onChange={onSelectChange}
      >
        {
          options.map((option) => {
            if (typeof option === "string") {
              return <Option key={option} label={option} value={option}/>;
            } else {
              return <Option key={option.value} {...option}/>;
            }
          })
        }
      </select>
    </div>
  );
};

export default Select;
@ -1,59 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import classNames from "classnames";
import * as React from "react";

import { IFormControlProps } from ".";

interface ITextAreaProps extends IFormControlProps<string> {
  cols?: number;
  rows?: number;
}

const TextArea: React.FunctionComponent<ITextAreaProps> = (props) => {
  const { children, className, rows, cols, value, onChange } = props;
  const onTextAreaChange: React.ChangeEventHandler<HTMLTextAreaElement> = (event) => {
    if (onChange !== undefined) {
      onChange(event.target.value);
    }
  };
  const UID = "U" + Math.floor(Math.random() * 0xFFFFFF).toString(16);
  return (
    <div className={classNames("form-group", className)}>
      <label htmlFor={UID}>{children}</label>
      <textarea
        className="form-control"
        id={UID}
        placeholder={children}
        rows={rows}
        cols={cols}
        value={value}
        onChange={onTextAreaChange}
      />
    </div>
  );
};

export default TextArea;
@ -1,56 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import classNames from "classnames";
import * as React from "react";

import { IFormControlProps } from ".";

interface ITextInputProps extends IFormControlProps<string> {
  type?: string;
}

const TextInput: React.FunctionComponent<ITextInputProps> = (props) => {
  const { children, className, onChange, type = "text", value } = props;
  const onInputChange: React.ChangeEventHandler<HTMLInputElement> = (event) => {
    if (onChange !== undefined) { onChange(event.target.value); }
  };
  const UID = "U" + Math.floor(Math.random() * 0xFFFFFF).toString(16);

  return (
    <div className={classNames("form-group", className)}>
      <label htmlFor={UID}>{children}</label>
      <input
        type={type}
        className="form-control"
        id={UID}
        placeholder={children}
        value={value}
        onChange={onInputChange}
      />
    </div>
  );
};

export default TextInput;
@ -1,30 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
export interface IFormControlProps<V> {
  children?: string;
  className?: string;
  value?: V;
  onChange?(value: V): void;
}
@ -1,72 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import classNames from "classnames";
import * as React from "react";

interface IPanelProps {
  className?: string;
  title: string;
}

interface IPanelState {
  collapse: boolean;
}

const headingStyle: React.CSSProperties = {
  cursor: "pointer",
  display: "block",
};

export default class Panel extends React.Component<IPanelProps, IPanelState> {
  constructor(props: IPanelProps) {
    super(props);
    this.state = { collapse: true };
  }

  public render() {
    const { children, className, title } = this.props;
    const { collapse } = this.state;
    const iconClassName = collapse ? "glyphicon-triangle-bottom" : "glyphicon-triangle-top";
    return (
      <div className={classNames("panel", "panel-default", className)}>
        <a className="panel-heading" style={headingStyle}>
          <p className="panel-title" onClick={this.onClickTitle}>
            <span className={classNames("glyphicon", iconClassName)}/>
            {" "}{title}
          </p>
        </a>
        <div className={classNames("panel-collapse", "collapse", { in: !collapse })}>
          <div className="panel-body rows">
            {children}
          </div>
        </div>
      </div>
    );
  }

  private onClickTitle = () => {
    this.setState(({ collapse }) => ({ collapse: !collapse }));
  }
}
@ -1,79 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import * as React from "react";

import SimpleJob from "..";
import SimpleJobContext from "../Context";

const DatabaseOperation: React.FunctionComponent = () => (
  <SimpleJobContext.Consumer>
    { ({ value: simpleJob, apply }) => {
      const download = () => {
        const json = SimpleJob.toLegacyJSON(simpleJob);
        const blob = new Blob([json], { type: "application/octet-stream" });
        const filename = `${simpleJob.name}.json`;
        if (navigator.msSaveBlob) {
          navigator.msSaveBlob(blob, filename);
        } else {
          const anchor = document.createElement("a");
          anchor.href = URL.createObjectURL(blob);
          anchor.download = filename;
          document.body.appendChild(anchor);
          setTimeout(() => {
            anchor.click();
            setTimeout(() => {
              document.body.removeChild(anchor);
            }, 0);
          }, 0);
        }
      };
      const upload = (event: React.ChangeEvent<HTMLInputElement>) => {
        if (event.currentTarget.files == null) { return; }
        const file = event.currentTarget.files[0];
        if (file == null) { return; }

        const fileReader = new FileReader();
        fileReader.addEventListener("load", () => {
          apply(fileReader.result as string);
        });
        fileReader.readAsText(file);
      };
      return (
        <div className="col-md-12">
          <button type="button" className="btn btn-success" onClick={download}>
            Download JSON
          </button>
          {" "}
          <label>
            <a type="button" className="btn btn-success">Upload JSON</a>
            <input type="file" className="sr-only" accept="application/json,.json" onChange={upload}/>
          </label>
        </div>
      );
    }}
  </SimpleJobContext.Consumer>
);

export default DatabaseOperation;
@ -1,94 +0,0 @@
/*!
 * Copyright (c) Microsoft Corporation
 * All rights reserved.
 *
 * MIT License
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
import * as React from "react";

import TextInput from "../../Components/FormControls/TextInput";

import SimpleJobContext from "../Context";

const EnvironmentVariables: React.FunctionComponent = () => (
  <SimpleJobContext.Consumer>
    { ({ value: simpleJob, set: setSimpleJob }) => {
      const variables = simpleJob.environmentVariables;
      const setVariables = setSimpleJob("environmentVariables");

      const setVariableName = (index: number) => (name: string) => {
        setVariables([
          ...variables.slice(0, index),
          { name, value: variables[index].value },
          ...variables.slice(index + 1),
        ]);
      };
      const setVariableValue = (index: number) => (value: string) => {
        setVariables([
          ...variables.slice(0, index),
          { name: variables[index].name, value },
          ...variables.slice(index + 1),
        ]);
      };

      const addVariable = () => {
        setVariables(variables.concat({ name: "", value: "" }));
      };

      return (
        <React.Fragment>
          <div className="rows">
            {
              simpleJob.environmentVariables.map((variable, index) => (
                <React.Fragment key={index}>
                  <TextInput
                    type="text"
                    className="col-sm-6"
                    value={variable.name}
                    onChange={setVariableName(index)}
                  >
                    Name of the Environment Variable
                  </TextInput>
                  <TextInput
                    type="text"
                    className="col-sm-6"
                    value={variable.value}
                    onChange={setVariableValue(index)}
                  >
                    Value of the Environment Variable
                  </TextInput>
                </React.Fragment>
              ))
            }
            <div className="col-sm-12">
              <button type="button" className="btn btn-info" onClick={addVariable}>
                <span className="glyphicon glyphicon-plus"/>
                {" Add Environment Variable"}
              </button>
            </div>
          </div>
        </React.Fragment>
      );
    }}
  </SimpleJobContext.Consumer>
);

export default EnvironmentVariables;