Added Experiment Tracking doc and corresponding code (#79)
* Initial commit of Kubeflow Pipeline MLFlow Experiment Run Dashboard
* Modified Power BI report and Kubeflow installation doc. Added Experiment Tracking doc and corresponding images
* Added link to Experiment Tracking doc
* Added friendlier naming and converted millisecond timestamp to datetime
* Added links for Kubeflow pipelines and MLflow experiments, changed Experiment Tracking doc structure, and fixed error in MLOps GitHub doc
* Added a few screenshots from Kubeflow and MLflow dashboards
@ -38,10 +38,10 @ This documentation helps you get started with the sample from infrastructure set
- [Securing Kubeflow on AKS](./docs/Kubeflow-secure.md)
- [MLOps with GitHub](./docs/mlops-github.md)
- [MLOps with Azure DevOps](./docs/mlops-azdo.md)
- [Experiment Tracking](./docs/experiment_tracking.md)

Code for the following can be found in the code directory, but currently there is no documentation:

- Experiment Tracking
- Running Kubeflow component in parallel
- Running Jupyter Server within Kubeflow
- Running MLFlow Project from Kubeflow
@ -2,8 +2,8 @@
## Connect to AKS

* Login to Azure: az login
* Create user credentials: az aks get-credentials -n <AKS_NAME> -g <RESOURCE_GROUP_NAME>
* Login to Azure: `az login`
* Create user credentials: `az aks get-credentials -n <AKS_NAME> -g <RESOURCE_GROUP_NAME>`

## Install Istio (if not already installed on the cluster)
Binary file changed: docs/diagrams/kubeflow-dashboard.png (before: 61 KiB, after: 75 KiB)
Binary files added: 80 KiB, 61 KiB, 46 KiB, 38 KiB, 96 KiB, 86 KiB, 107 KiB, 110 KiB
@ -0,0 +1,64 @@
# Experiment Tracking

## Tracking pipelines with Kubeflow

### Access the Kubeflow dashboard

Go to http://{KUBEFLOW_HOST}/_/pipelines/

![Kubeflow Dashboard](./diagrams/kubeflow-dashboard.png)

### Tracking pipeline artifacts and experiments

To get pipeline details, go to http://{KUBEFLOW_HOST}/_/pipelines/details/{PIPELINE_ID}.

![Kubeflow Pipeline Details](./diagrams/kubeflow-pipeline-details.png)

To get experiment details, go to http://{KUBEFLOW_HOST}/_/experiments/details/{EXPERIMENT_ID}.

![Kubeflow Experiment Details](./diagrams/kubeflow-experiment-details.png)
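
The URL patterns above can be assembled programmatically, for example when linking to the dashboard from a script or a report. The following is a minimal sketch, not part of the sample code: the function name `kubeflow_urls` and its parameter names are hypothetical, and it assumes the dashboard is served over plain `http` exactly as shown in the URLs above.

```python
from urllib.parse import quote


def kubeflow_urls(host: str, pipeline_id: str, experiment_id: str) -> dict:
    """Build the Kubeflow dashboard URLs following the patterns in this doc.

    `host` stands in for {KUBEFLOW_HOST}; the IDs are percent-encoded so that
    unusual characters cannot break the path.
    """
    base = f"http://{host}/_"
    return {
        "dashboard": f"{base}/pipelines/",
        "pipeline": f"{base}/pipelines/details/{quote(pipeline_id, safe='')}",
        "experiment": f"{base}/experiments/details/{quote(experiment_id, safe='')}",
    }
```

For example, `kubeflow_urls("localhost:8080", "abc-123", "exp-9")["pipeline"]` yields `http://localhost:8080/_/pipelines/details/abc-123`.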

## Tracking experiments with MLflow

### Access the MLflow dashboard

Go to http://{MLFLOW_HOST}/mlflow/#

![MLflow Dashboard](./diagrams/mlflow-dashboard.png)

### Tracking experiment runs and artifacts

Go to http://{MLFLOW_HOST}/mlflow/#/experiments/{EXPERIMENT_ID}/runs/{RUN_ID}

![MLflow Experiment Artifacts](./diagrams/mlflow-experiment-artifacts.png)
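
Because the experiment and run identifiers live in the URL fragment, they can be recovered from a link copied out of the browser. The sketch below builds and parses run URLs of the form shown above; the helper names `mlflow_run_url` and `parse_mlflow_run_url` are hypothetical, and the `/mlflow/` path prefix is taken from this doc rather than from MLflow itself.

```python
from typing import Tuple
from urllib.parse import urlsplit


def mlflow_run_url(host: str, experiment_id: str, run_id: str) -> str:
    """Build the run page URL using the pattern from this doc."""
    return f"http://{host}/mlflow/#/experiments/{experiment_id}/runs/{run_id}"


def parse_mlflow_run_url(url: str) -> Tuple[str, str]:
    """Return (experiment_id, run_id) extracted from a run page URL."""
    # Everything after '#' is the fragment, e.g. "/experiments/1/runs/abc".
    parts = urlsplit(url).fragment.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "experiments" or parts[2] != "runs":
        raise ValueError(f"not an MLflow run URL: {url!r}")
    return parts[1], parts[3]
```

Parsing the output of `mlflow_run_url` returns the same two identifiers that were passed in, which makes the pair convenient for round-tripping links in scripts.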

## Unified Dashboard for Kubeflow pipelines and MLflow experiments

### Forward Kubeflow and MLflow portals to localhost

- Login to Azure: `az login`
- Port forward: `kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80`

If the `kubectl` command returns an error, make sure your subscription is set to the one containing your AKS cluster and that user credentials have been created for the cluster (see [Kubeflow Installation on AKS](Kubeflow-install.md)).
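
Before opening the report, it can help to confirm that the port forward is actually listening. The following is a small sketch using only the Python standard library; the function name `is_port_open` is hypothetical, and the defaults assume the `8080:80` mapping from the command above.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8080, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing is listening on the forwarded port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result only says that nothing accepted a TCP connection on that port; it does not distinguish between a missing port forward and other network problems.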

### Power BI Report

A sample Power BI report visualizes the Kubeflow pipeline and MLflow experiment run data. Its filters support analysis over time and across multiple dimensions. The corresponding PBIX file can be found in the `code\powerbi` directory. The report contains three visuals:

- A bar graph of Kubeflow pipelines based on status information
- A question and answer visual displaying the average duration of Kubeflow pipelines
- A table showing each Kubeflow pipeline and its corresponding MLflow experiment run data

![Power BI Report](./diagrams/powerbi_full.png)
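
The first two visuals boil down to a count of pipelines per status and an average duration. The sketch below reproduces those two aggregates in plain Python as a way to sanity-check what the report shows. The record layout (`status`, `duration_s`) and the sample values are hypothetical and are not taken from the PBIX file.

```python
from collections import Counter
from statistics import mean

# Hypothetical run records for illustration only; the field names are assumptions.
SAMPLE_RUNS = [
    {"name": "run-a", "status": "Succeeded", "duration_s": 120.0},
    {"name": "run-b", "status": "Failure", "duration_s": 60.0},
    {"name": "run-c", "status": "Succeeded", "duration_s": 180.0},
]


def summarize(runs):
    """Return (count of runs per status, average duration in seconds or None)."""
    status_counts = Counter(r["status"] for r in runs)
    durations = [r["duration_s"] for r in runs]
    average = mean(durations) if durations else None
    return status_counts, average
```

Calling `summarize(SAMPLE_RUNS)` gives the numbers behind a bar graph with two bars and a single average-duration value.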

To filter the full page by Kubeflow pipelines with the status of `Succeeded`, click on the `Succeeded` bar in the bar graph.

![Power BI Report filtered by Succeeded](./diagrams/powerbi_succeeded.png)

To filter the full page by Kubeflow pipelines with the status of `Failure`, click on the `Failure` bar in the bar graph.

![Power BI Report filtered by Failure](./diagrams/powerbi_failed.png)

To filter the full page by date, select the dates in `kfp.created_at` or `mlflow.start_time` under `Filters` -> `Filters on all pages` on the right side of the report. For example, selecting all the dates in `kfp.created_at` that contain `2020-07-18` produces the following report:

![Power BI Report filtered by 2020-07-18](./diagrams/powerbi_2020-07-18.png)
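
The date filter relies on timestamps being real datetimes rather than raw epoch values. MLflow reports a run's start time as milliseconds since the Unix epoch, so a conversion step is needed somewhere before filtering by day. The sketch below shows one way to do it; the helper names, the row layout, and the choice of UTC are assumptions, and the report itself may perform this conversion differently.

```python
from datetime import date, datetime, timezone


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime (UTC is an assumption)."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def rows_on_day(rows, day: date, field: str = "mlflow.start_time"):
    """Keep rows whose millisecond timestamp in `field` falls on `day` (UTC)."""
    return [r for r in rows if ms_to_datetime(r[field]).date() == day]
```

For instance, `ms_to_datetime(1595030400000)` is midnight UTC on 2020-07-18, so a row with that value passes `rows_on_day(rows, date(2020, 7, 18))`.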
@ -32,14 +32,14 @@ Clicking on `Actions` tab at the top, `CI` on the left side, your pull request,
### Access KFP via Kubeflow Central Dashboard

1. Go to http://{KUBEFLOW_HOST}/_/pipeline/
1. Go to http://{KUBEFLOW_HOST}/_/pipelines/
2. Clicking on the pipeline with the matching RUN_ID, you will get details on failed and finished steps.
and the job, you will get details on failed, finished, and skipped steps.
![Kubeflow Dashboard Pipeline Steps](./diagrams/kubeflow-dashboard.png)
![Kubeflow Dashboard Pipeline Steps](./diagrams/kubeflow-pipeline-details.png)

### Access registered model in MLFlow

1. Go to http://{KUBEFLOW_HOST}/_/pipeline/#/runs
1. Go to http://{KUBEFLOW_HOST}/_/pipelines/#/runs
2. Clicking the Run name with the Pipeline Version that contains the matching RUN_ID, `register-to-aml`, and `Logs`, you will get execution logs for pipeline steps, which show the run_id.
3. Go to http://{MLFLOW_HOST}/mlflow/#/models/tacosandburritos
4. Clicking on the model version registered at the closest time to the execution of the KFP, you will get details about the Source Run. ![MLFlow Model](./diagrams/mlflow-model.png)