Training Service
The training service is responsible for communicating with a third-party service to train a new ML model. In our case, that service is Azure Databricks. Switching to another third party requires an implementation change in this service, but the other parts of the system should remain the same.
Request Flows
New Training Request
New training requests go through the following steps:
- A new request (POST) for training arrives with the following body: { "modelType": "MODEL", "parameters": { // Parameters key-values } }
- The service retrieves the notebook path in Databricks according to the modelType ('MODEL' in this case). Note: the modelType -> 'Databricks notebook path' mappings are described in the JSON value of the DATABRICKS_TYPE_MAPPING environment variable (more details below).
- The service starts the Databricks cluster if it is in the 'TERMINATED' state.
- The service sends a request to Databricks to run the notebook with the specified parameters.
- The response from Databricks includes a runId, which is returned in the response JSON in the following structure: { "runId": 123 }
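The steps above can be sketched as a few pure functions. This is a minimal illustration, not the service's actual code: the helper names (`resolveNotebookPath`, `shouldStartCluster`, `buildRunSubmitBody`, `toTrainingResponse`) and the `config` object shape are hypothetical. It also assumes the run is submitted through the Databricks Jobs API 2.0 `runs/submit` endpoint and that DATABRICKS_RUN_TIMEOUT is expressed in seconds.

```javascript
// Sketch only: helper names and the config shape are hypothetical.

// Resolve the notebook path for a model type from the JSON mapping
// stored in DATABRICKS_TYPE_MAPPING.
function resolveNotebookPath(modelType, mappingJson) {
  const mapping = JSON.parse(mappingJson);
  const notebookPath = mapping[modelType];
  if (typeof notebookPath !== "string") {
    throw new Error(`Unknown modelType: ${modelType}`);
  }
  return notebookPath;
}

// The cluster only needs an explicit start when it is 'TERMINATED'.
function shouldStartCluster(clusterState) {
  return clusterState === "TERMINATED";
}

// Build the body of a one-time notebook run on an existing cluster
// (assumed endpoint: POST /api/2.0/jobs/runs/submit).
function buildRunSubmitBody(request, config) {
  return {
    run_name: `training-${request.modelType}`,
    existing_cluster_id: config.clusterId,
    timeout_seconds: config.runTimeout, // assumption: timeout is in seconds
    notebook_task: {
      notebook_path: resolveNotebookPath(request.modelType, config.typeMapping),
      base_parameters: request.parameters || {},
    },
  };
}

// Shape the service response from the Databricks response.
function toTrainingResponse(databricksResponse) {
  return { runId: databricksResponse.run_id };
}
```

Keeping these steps free of network calls makes them easy to unit test; the HTTP calls to Databricks would wrap around them.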
Get Run Status Request
- A new request for run status is received with the runId in the request path. Example: GET /123
- The service checks the run on Databricks and returns a response that includes state and message in the following structure: { "state": "<Run Status>", "message": "<Run Message>" }
  - 'Run Status' is one of the following values: pending, running or completed
  - 'Run Message' is generally empty, but includes an error message if the run finishes unsuccessfully.
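One way to derive the state and message fields is to translate the run state that Databricks reports. The sketch below is an assumption about how such a mapping could look, not the service's confirmed logic. The function name `toRunStatus` is hypothetical; the input is assumed to be the `state` object of a Jobs API 2.0 run, with its `life_cycle_state`, `result_state` and `state_message` fields.

```javascript
// Sketch only: this mapping is an assumption, not the service's confirmed logic.
function toRunStatus(databricksState) {
  const {
    life_cycle_state: lifeCycle,
    result_state: result,
    state_message: stateMessage,
  } = databricksState;

  if (lifeCycle === "PENDING") {
    return { state: "pending", message: "" };
  }
  if (lifeCycle === "RUNNING" || lifeCycle === "TERMINATING") {
    return { state: "running", message: "" };
  }

  // Any other life cycle state (e.g. TERMINATED, SKIPPED, INTERNAL_ERROR)
  // means the run is over; only a SUCCESS result leaves the message empty.
  if (result === "SUCCESS") {
    return { state: "completed", message: "" };
  }
  return {
    state: "completed",
    message: stateMessage || `Run ended with ${result || lifeCycle}`,
  };
}
```

For example, a run that terminated with a FAILED result and the state message "Notebook failed" would be reported as { "state": "completed", "message": "Notebook failed" }.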
Environment Variables
The service expects several environment variables to be set in order to run:
Var | Required | Description |
---|---|---|
PORT | yes | Service port. default=80 |
DATABRICKS_WORKSPACE_URL | yes | Databricks Workspace URL |
DATABRICKS_AUTH_TOKEN | yes | Authentication Token for Databricks |
DATABRICKS_CLUSTER_ID | yes | Databricks cluster ID. More information can be found here |
DATABRICKS_RUN_TIMEOUT | yes | Run timeout for notebook runs |
DATABRICKS_TYPE_MAPPING | yes | JSON object mapping each model type to a notebook path (MODEL:NOTEBOOK_PATH) |
NODE_ENV | no | Set to test for unit testing |
APP_INSIGHTS_INSTRUMENTATION_KEY | no | Application Insights instrumentation key |
SERVICE_NAME | no | Service name for Application Insights logging |
Sample environment variables
PORT=3000
DATABRICKS_WORKSPACE_URL=https://westeurope.azuredatabricks.net
DATABRICKS_AUTH_TOKEN=abcdefghi123456a123a1234a123456abc12
DATABRICKS_CLUSTER_ID=1234-123456-hurts123
DATABRICKS_RUN_TIMEOUT=3600
DATABRICKS_TYPE_MAPPING={"wine":"/shared/wine_notebook","diabetes":"/shared/diabetes_notebook"}
APP_INSIGHTS_INSTRUMENTATION_KEY=01e9c546-1234-1234-cf56-7d6b49fc053a
SERVICE_NAME=training
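A startup check along the following lines can catch missing or malformed variables before the first request arrives. This is a sketch under assumptions, not the service's actual startup code: `loadConfig` is a hypothetical helper, PORT falls back to the documented default of 80 when unset, and DATABRICKS_RUN_TIMEOUT is treated as a positive integer.

```javascript
// Sketch only: loadConfig is a hypothetical helper, not part of the service.
const REQUIRED_VARS = [
  "DATABRICKS_WORKSPACE_URL",
  "DATABRICKS_AUTH_TOKEN",
  "DATABRICKS_CLUSTER_ID",
  "DATABRICKS_RUN_TIMEOUT",
  "DATABRICKS_TYPE_MAPPING",
];

function loadConfig(env) {
  // Report every missing variable at once instead of failing one by one.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const runTimeout = Number(env.DATABRICKS_RUN_TIMEOUT);
  if (!Number.isInteger(runTimeout) || runTimeout <= 0) {
    throw new Error("DATABRICKS_RUN_TIMEOUT must be a positive integer");
  }

  let typeMapping;
  try {
    typeMapping = JSON.parse(env.DATABRICKS_TYPE_MAPPING);
  } catch (err) {
    throw new Error(`DATABRICKS_TYPE_MAPPING is not valid JSON: ${err.message}`);
  }

  return {
    port: Number(env.PORT) || 80, // assumption: fall back to the documented default
    workspaceUrl: env.DATABRICKS_WORKSPACE_URL,
    authToken: env.DATABRICKS_AUTH_TOKEN,
    clusterId: env.DATABRICKS_CLUSTER_ID,
    runTimeout,
    typeMapping,
  };
}
```

In a Node.js process this would typically be called once at startup as `loadConfig(process.env)`.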
Build and Run with Docker
docker build . -t training-service
docker run --env-file=.env-file -p 127.0.0.1:3000:3000 training-service