{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# AutoML with FLAML Library\n",
"\n",
"\n",
"| | | | |\n",
"|-----|--------|--------|--------|\n",
"| \n",
"\n",
"\n",
"\n",
"### Goal\n",
"In this notebook, we demonstrate how to use AutoML with FLAML to find the best model for our dataset.\n",
"\n",
"\n",
"## 1. Introduction\n",
"\n",
"FLAML is a Python library (https://github.com/microsoft/FLAML) designed to automatically produce accurate machine learning models \n",
"with low computational cost. It is fast and economical. The simple and lightweight design makes it easy to use and extend, such as adding new learners. FLAML can \n",
"- serve as an economical AutoML engine,\n",
"- be used as a fast hyperparameter tuning tool, or \n",
"- be embedded in self-tuning software that requires low latency & resource in repetitive\n",
" tuning tasks.\n",
"\n",
"In this notebook, we use one real data example (binary classification) to showcase how to use FLAML library.\n",
"\n",
"FLAML requires `Python>=3.8`. To run this notebook example, please install the following packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1",
"metadata": {
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:27.1218348Z",
"execution_start_time": "2024-09-03T04:45:00.4763917Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "edb1122f-690f-4826-ae22-f45c675bab46",
"queued_time": "2024-09-03T04:44:51.3993564Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": "2024-09-03T04:44:51.4277029Z",
"spark_pool": null,
"state": "finished",
"statement_id": 7,
"statement_ids": [
3,
4,
5,
6,
7
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 7, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting openml==0.14.2\n",
" Downloading openml-0.14.2.tar.gz (144 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m144.5/144.5 kB\u001b[0m \u001b[31m8.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h Installing build dependencies ... \u001b[?25l-\b \b\\\b \b|\b \bdone\n",
"\u001b[?25h Getting requirements to build wheel ... \u001b[?25l-\b \bdone\n",
"\u001b[?25h Installing backend dependencies ... \u001b[?25l-\b \b\\\b \bdone\n",
"\u001b[?25h Preparing metadata (pyproject.toml) ... \u001b[?25l-\b \bdone\n",
"\u001b[?25hRequirement already satisfied: scikit-learn>=1.3.0 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (1.3.0)\n",
"Collecting rgf-python==3.12.0\n",
" Downloading rgf_python-3.12.0-py3-none-manylinux1_x86_64.whl (757 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m757.8/757.8 kB\u001b[0m \u001b[31m45.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: liac-arff>=2.4.0 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (2.5.0)\n",
"Requirement already satisfied: xmltodict in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (0.13.0)\n",
"Requirement already satisfied: requests in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (2.31.0)\n",
"Requirement already satisfied: python-dateutil in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (2.8.2)\n",
"Requirement already satisfied: pandas>=1.0.0 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (2.0.3)\n",
"Requirement already satisfied: scipy>=0.13.3 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (1.10.1)\n",
"Requirement already satisfied: numpy>=1.6.2 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (1.24.3)\n",
"Collecting minio (from openml==0.14.2)\n",
" Downloading minio-7.2.8-py3-none-any.whl (93 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m93.5/93.5 kB\u001b[0m \u001b[31m42.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: pyarrow in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from openml==0.14.2) (12.0.1)\n",
"Requirement already satisfied: joblib in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from rgf-python==3.12.0) (1.3.2)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from scikit-learn>=1.3.0) (3.2.0)\n",
"Requirement already satisfied: pytz>=2020.1 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from pandas>=1.0.0->openml==0.14.2) (2023.3.post1)\n",
"Requirement already satisfied: tzdata>=2022.1 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from pandas>=1.0.0->openml==0.14.2) (2023.3)\n",
"Requirement already satisfied: six>=1.5 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from python-dateutil->openml==0.14.2) (1.16.0)\n",
"Requirement already satisfied: certifi in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from minio->openml==0.14.2) (2023.7.22)\n",
"Requirement already satisfied: urllib3 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from minio->openml==0.14.2) (1.26.17)\n",
"Requirement already satisfied: argon2-cffi in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from minio->openml==0.14.2) (23.1.0)\n",
"Collecting pycryptodome (from minio->openml==0.14.2)\n",
" Downloading pycryptodome-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.1/2.1 MB\u001b[0m \u001b[31m146.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: typing-extensions in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from minio->openml==0.14.2) (4.5.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from requests->openml==0.14.2) (3.3.1)\n",
"Requirement already satisfied: idna<4,>=2.5 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from requests->openml==0.14.2) (3.4)\n",
"Requirement already satisfied: argon2-cffi-bindings in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from argon2-cffi->minio->openml==0.14.2) (21.2.0)\n",
"Requirement already satisfied: cffi>=1.0.1 in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from argon2-cffi-bindings->argon2-cffi->minio->openml==0.14.2) (1.16.0)\n",
"Requirement already satisfied: pycparser in /home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages (from cffi>=1.0.1->argon2-cffi-bindings->argon2-cffi->minio->openml==0.14.2) (2.21)\n",
"Building wheels for collected packages: openml\n",
" Building wheel for openml (pyproject.toml) ... \u001b[?25l-\b \b\\\b \bdone\n",
"\u001b[?25h Created wheel for openml: filename=openml-0.14.2-py3-none-any.whl size=158697 sha256=284673299d0458cde5751ea69ec6de4e90e7410adec2b980af071fc934877200\n",
" Stored in directory: /home/trusted-service-user/.cache/pip/wheels/2e/4e/af/5e721761d86375dbca82e63cc2470019e97815bc39f11451ea\n",
"Successfully built openml\n",
"Installing collected packages: pycryptodome, rgf-python, minio, openml\n",
"Successfully installed minio-7.2.8 openml-0.14.2 pycryptodome-3.20.0 rgf-python-3.12.0\n",
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.2\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpython -m pip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Warning: PySpark kernel has been restarted to use updated packages.\n",
"\n"
]
}
],
"source": [
"%pip install \"openml==0.14.2\" \"scikit-learn>=1.3.0\" \"rgf-python==3.12.0\""
]
},
{
"cell_type": "markdown",
"id": "2",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Set the logging level\n",
"\n",
"You can configure the logging level to suppress unnecessary outputs to keep the logs cleaner."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:32.370685Z",
"execution_start_time": "2024-09-03T04:45:31.9932557Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "ab2d1ef3-9329-48e2-9429-71cd00e8af55",
"queued_time": "2024-09-03T04:44:51.168749Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 9,
"statement_ids": [
9
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 9, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import logging\n",
"import warnings\n",
" \n",
"logging.getLogger('synapse.ml').setLevel(logging.CRITICAL)\n",
"logging.getLogger('mlflow.utils').setLevel(logging.CRITICAL)\n",
"warnings.simplefilter('ignore', category=FutureWarning)\n",
"warnings.simplefilter('ignore', category=UserWarning)"
]
},
{
"cell_type": "markdown",
"id": "4",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Set up MLflow experiment tracking\n",
"\n",
"MLflow is an open source platform that is deeply integrated into the Data Science experience in Fabric and allows to easily track and compare the performance of different models and experiments without the need for manual tracking. For more information, see [Autologging in Microsoft Fabric](https://aka.ms/fabric-autologging)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:34.6803604Z",
"execution_start_time": "2024-09-03T04:45:32.7740919Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "18d52168-8765-4f77-b8b8-a4539b51c1aa",
"queued_time": "2024-09-03T04:44:51.1694279Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 10,
"statement_ids": [
10
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 10, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
""
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import mlflow\n",
"\n",
"# Set the MLflow experiment to \"automl-tutorial\" and enable automatic logging\n",
"mlflow.set_experiment(\"automl-tutorial\")"
]
},
{
"cell_type": "markdown",
"id": "6",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 2. Classification Example\n",
"### Load data and preprocess\n",
"\n",
"Download [Airlines dataset](https://www.openml.org/d/1169) from OpenML. The task is to predict whether a given flight will be delayed, given the information of the scheduled departure."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7",
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"slideshow": {
"slide_type": "subslide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:55.4940947Z",
"execution_start_time": "2024-09-03T04:45:35.0725962Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "ca8d70f5-47e5-4674-a31c-738f39ac3175",
"queued_time": "2024-09-03T04:44:51.1702979Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 11,
"statement_ids": [
11
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 11, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"No permission to create OpenML directory at /home/trusted-service-user/.config/openml! This can result in OpenML-Python not working properly.\n",
"download dataset from openml\n",
"Dataset name: airlines\n",
"X_train.shape: (404537, 7), y_train.shape: (404537,);\n",
"X_test.shape: (134846, 7), y_test.shape: (134846,)\n"
]
}
],
"source": [
"from flaml.automl.data import load_openml_dataset\n",
"X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=1169, data_dir='./')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:56.9587345Z",
"execution_start_time": "2024-09-03T04:45:55.8839894Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "f5ca1611-8219-4ad8-aab8-478e625b31a6",
"queued_time": "2024-09-03T04:44:51.1714393Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 12,
"statement_ids": [
12
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 12, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.synapse.widget-view+json": {
"widget_id": "e951537a-8f45-4ae7-8cd7-5835f56b7d40",
"widget_type": "Synapse.DataFrame"
},
"text/plain": [
"SynapseWidget(Synapse.DataFrame, e951537a-8f45-4ae7-8cd7-5835f56b7d40)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(X_train.join(y_train))"
]
},
{
"cell_type": "markdown",
"id": "9",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Run FLAML\n",
"In the FLAML automl run configuration, users can specify the task type, time budget, error metric, learner list, whether to subsample, resampling strategy type, and so on. All these arguments have default values which will be used if users do not provide them. For example, the default classifiers are `['lgbm', 'xgboost', 'xgb_limitdepth', 'catboost', 'rf', 'extra_tree', 'lrl1']`. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:57.7243859Z",
"execution_start_time": "2024-09-03T04:45:57.3434117Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "698ef231-f161-4732-b95c-2b7793946034",
"queued_time": "2024-09-03T04:44:51.1728277Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 13,
"statement_ids": [
13
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 13, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"''' import AutoML class from flaml package '''\n",
"from flaml import AutoML\n",
"automl = AutoML()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:45:58.4767064Z",
"execution_start_time": "2024-09-03T04:45:58.1184273Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "fa55a057-1d3e-4f40-800e-0b225e675b14",
"queued_time": "2024-09-03T04:44:51.1751637Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 14,
"statement_ids": [
14
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 14, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"settings = {\n",
" \"time_budget\": 120, # total running time in seconds\n",
" \"metric\": 'accuracy', # check the documentation for options of metrics (https://microsoft.github.io/FLAML/docs/Use-Cases/Task-Oriented-AutoML#optimization-metric)\n",
" \"task\": 'classification', # task type\n",
" \"seed\": 42, # random seed\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": [
"outputPrepend"
]
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:28.7534447Z",
"execution_start_time": "2024-09-03T04:45:58.882723Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "f261610f-4a20-4c09-8e00-645e1b2bb533",
"queued_time": "2024-09-03T04:44:51.1773522Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 15,
"statement_ids": [
15
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 15, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:46:00] {1787} INFO - task = classification\n",
"[flaml.automl.logger: 09-03 04:46:00] {1798} INFO - Evaluation method: holdout\n",
"[flaml.automl.logger: 09-03 04:46:00] {1901} INFO - Minimizing error metric: 1-accuracy\n",
"[flaml.automl.logger: 09-03 04:46:01] {2019} INFO - List of ML learners in AutoML Run: ['lgbm', 'rf', 'xgboost', 'extra_tree', 'xgb_limitdepth', 'sgd', 'catboost', 'lrl1']\n",
"[flaml.automl.logger: 09-03 04:46:01] {2329} INFO - iteration 0, current learner lgbm\n",
"[flaml.automl.logger: 09-03 04:46:02] {2464} INFO - Estimated sufficient time budget=158419s. Estimated necessary time budget=3905s.\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6223304330630809,
"best_validation_loss": 0.3776695669369191,
"iter_counter": 0,
"trial_time": 0.4332137107849121,
"validation_loss": 0.3776695669369191,
"wall_clock_time": 2.640165090560913
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "1.0",
"learner": "lgbm",
"learning_rate": "0.09999999999999995",
"log_max_bin": "8",
"min_child_samples": "20",
"n_estimators": "4",
"num_leaves": "4",
"reg_alpha": "0.0009765625",
"reg_lambda": "1.0",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_0",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "0",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/14aad3dc-eed4-4466-ad8e-8d5c138b6aac/artifacts",
"end_time": 1725338787,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "14aad3dc-eed4-4466-ad8e-8d5c138b6aac",
"run_name": "",
"run_uuid": "14aad3dc-eed4-4466-ad8e-8d5c138b6aac",
"start_time": 1725338762,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:46:27] {2513} INFO - at 2.6s,\testimator lgbm's best error=0.3777,\tbest estimator lgbm's best error=0.3777\n",
"[flaml.automl.logger: 09-03 04:46:27] {2329} INFO - iteration 1, current learner lgbm\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6223304330630809,
"best_validation_loss": 0.3776695669369191,
"iter_counter": 1,
"trial_time": 0.032134056091308594,
"validation_loss": 0.3776695669369191,
"wall_clock_time": 27.87287712097168
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "0.9546978871157568",
"learner": "lgbm",
"learning_rate": "0.06631176582925748",
"log_max_bin": "9",
"min_child_samples": "17",
"n_estimators": "5",
"num_leaves": "4",
"reg_alpha": "0.0020990359248076857",
"reg_lambda": "21.92381099932143",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_1",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "1",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/4b351a28-0a7b-4b83-8a4e-0529c24ccba9/artifacts",
"end_time": 1725338806,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "4b351a28-0a7b-4b83-8a4e-0529c24ccba9",
"run_name": "",
"run_uuid": "4b351a28-0a7b-4b83-8a4e-0529c24ccba9",
"start_time": 1725338788,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:46:46] {2513} INFO - at 27.9s,\testimator lgbm's best error=0.3777,\tbest estimator lgbm's best error=0.3777\n",
"[flaml.automl.logger: 09-03 04:46:46] {2329} INFO - iteration 2, current learner lgbm\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6236652165315404,
"best_validation_loss": 0.3763347834684596,
"iter_counter": 2,
"trial_time": 0.04708147048950195,
"validation_loss": 0.3763347834684596,
"wall_clock_time": 47.18543338775635
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "1.0",
"learner": "lgbm",
"learning_rate": "0.15080280060326612",
"log_max_bin": "7",
"min_child_samples": "24",
"n_estimators": "4",
"num_leaves": "10",
"reg_alpha": "0.0009765625",
"reg_lambda": "0.04561250779031766",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_2",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "2",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/81eac128-06f7-495e-afc5-d7792f17ab92/artifacts",
"end_time": 1725338826,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "81eac128-06f7-495e-afc5-d7792f17ab92",
"run_name": "",
"run_uuid": "81eac128-06f7-495e-afc5-d7792f17ab92",
"start_time": 1725338807,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:47:07] {2513} INFO - at 47.2s,\testimator lgbm's best error=0.3763,\tbest estimator lgbm's best error=0.3763\n",
"[flaml.automl.logger: 09-03 04:47:07] {2329} INFO - iteration 3, current learner lgbm\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6365186869685584,
"best_validation_loss": 0.3634813130314416,
"iter_counter": 3,
"trial_time": 0.04140424728393555,
"validation_loss": 0.3634813130314416,
"wall_clock_time": 67.61185884475708
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "0.9006280463830675",
"learner": "lgbm",
"learning_rate": "0.2177704579511968",
"log_max_bin": "6",
"min_child_samples": "20",
"n_estimators": "15",
"num_leaves": "6",
"reg_alpha": "0.0021638671012090007",
"reg_lambda": "0.03731988103956423",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_3",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "3",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/25a86fd4-0939-4fbe-825b-548af7352935/artifacts",
"end_time": 1725338847,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "25a86fd4-0939-4fbe-825b-548af7352935",
"run_name": "",
"run_uuid": "25a86fd4-0939-4fbe-825b-548af7352935",
"start_time": 1725338827,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:47:27] {2513} INFO - at 67.6s,\testimator lgbm's best error=0.3635,\tbest estimator lgbm's best error=0.3635\n",
"[flaml.automl.logger: 09-03 04:47:27] {2329} INFO - iteration 4, current learner lgbm\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6236404983191616,
"best_validation_loss": 0.3634813130314416,
"iter_counter": 4,
"trial_time": 0.032787322998046875,
"validation_loss": 0.3763595016808384,
"wall_clock_time": 88.09981751441956
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "1.0",
"learner": "lgbm",
"learning_rate": "0.15080280060326612",
"log_max_bin": "8",
"min_child_samples": "24",
"n_estimators": "4",
"num_leaves": "10",
"reg_alpha": "0.0009765625",
"reg_lambda": "0.04561250779031766",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_4",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "4",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/20063586-3f6e-4b7e-8edc-af22ed666853/artifacts",
"end_time": 1725338865,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "20063586-3f6e-4b7e-8edc-af22ed666853",
"run_name": "",
"run_uuid": "20063586-3f6e-4b7e-8edc-af22ed666853",
"start_time": 1725338848,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:47:45] {2513} INFO - at 88.1s,\testimator lgbm's best error=0.3635,\tbest estimator lgbm's best error=0.3635\n",
"[flaml.automl.logger: 09-03 04:47:45] {2329} INFO - iteration 5, current learner lgbm\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6389163535693099,
"best_validation_loss": 0.36108364643069013,
"iter_counter": 5,
"trial_time": 0.07941198348999023,
"validation_loss": 0.36108364643069013,
"wall_clock_time": 106.20823979377747
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "0.8871559629536413",
"learner": "lgbm",
"learning_rate": "0.1292426830415275",
"log_max_bin": "6",
"min_child_samples": "14",
"n_estimators": "66",
"num_leaves": "4",
"reg_alpha": "0.02960826033957992",
"reg_lambda": "0.023368135622249268",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline_child_5",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "5",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/4d4f70aa-e916-4ed9-8209-b99191c4d5e1/artifacts",
"end_time": 1725338884,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "4d4f70aa-e916-4ed9-8209-b99191c4d5e1",
"run_name": "",
"run_uuid": "4d4f70aa-e916-4ed9-8209-b99191c4d5e1",
"start_time": 1725338866,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:48:05] {2513} INFO - at 106.2s,\testimator lgbm's best error=0.3611,\tbest estimator lgbm's best error=0.3611\n",
"[flaml.automl.logger: 09-03 04:48:06] {569} INFO - logging best model lgbm\n",
"[flaml.automl.logger: 09-03 04:48:09] {2756} INFO - retrain lgbm for 0.6s\n",
"[flaml.automl.logger: 09-03 04:48:09] {2759} INFO - retrained model: LGBMClassifier(colsample_bytree=0.8871559629536413,\n",
" learning_rate=0.1292426830415275, max_bin=63,\n",
" min_child_samples=14, n_estimators=1, n_jobs=-1, num_leaves=4,\n",
" reg_alpha=0.02960826033957992, reg_lambda=0.023368135622249268,\n",
" verbose=-1)\n",
"[flaml.automl.logger: 09-03 04:48:09] {2760} INFO - Auto Feature Engineering pipeline: None\n",
"[flaml.automl.logger: 09-03 04:48:09] {2762} INFO - Best MLflow run name: \n",
"[flaml.automl.logger: 09-03 04:48:09] {2763} INFO - Best MLflow run id: 4d4f70aa-e916-4ed9-8209-b99191c4d5e1\n",
"[flaml.automl.logger: 09-03 04:48:27] {2055} INFO - fit succeeded\n",
"[flaml.automl.logger: 09-03 04:48:27] {2056} INFO - Time taken to find the best model: 106.20823979377747\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"accuracy": 0.6389163535693099,
"best_validation_loss": 0.36108364643069013,
"iter_counter": 5,
"trial_time": 0.07941198348999023,
"validation_loss": 0.36108364643069013,
"wall_clock_time": 106.20823979377747
},
"params": {
"FLAML_sample_size": "10000",
"best_learner": "lgbm",
"colsample_bytree": "0.8871559629536413",
"learner": "lgbm",
"learning_rate": "0.1292426830415275",
"log_max_bin": "6",
"min_child_samples": "14",
"n_estimators": "66",
"num_leaves": "4",
"reg_alpha": "0.02960826033957992",
"reg_lambda": "0.023368135622249268",
"sample_size": "10000"
},
"tags": {
"mlflow.rootRunId": "ca6ad732-f549-4374-9126-b3796d2cb671",
"mlflow.runName": "flight_delays_baseline",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "True",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "5",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "accuracy",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/ca6ad732-f549-4374-9126-b3796d2cb671/artifacts",
"end_time": 1725338907,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "ca6ad732-f549-4374-9126-b3796d2cb671",
"run_name": "",
"run_uuid": "ca6ad732-f549-4374-9126-b3796d2cb671",
"start_time": 1725338759,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"'''The main flaml automl API'''\n",
"with mlflow.start_run(run_name=\"flight_delays_baseline\"):\n",
"    automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "markdown",
"id": "13",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Best model and metric"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:29.5268906Z",
"execution_start_time": "2024-09-03T04:48:29.1513034Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "25022243-df4f-4391-91b6-07701d41262f",
"queued_time": "2024-09-03T04:44:51.1795881Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 16,
"statement_ids": [
16
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 16, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best ML learner: lgbm\n",
"Best hyperparameter config: {'n_estimators': 66, 'num_leaves': 4, 'min_child_samples': 14, 'learning_rate': 0.1292426830415275, 'log_max_bin': 6, 'colsample_bytree': 0.8871559629536413, 'reg_alpha': 0.02960826033957992, 'reg_lambda': 0.023368135622249268}\n",
"Best accuracy on validation data: 0.6389\n",
"Training duration of best run: 0.5744 s\n"
]
}
],
"source": [
"'''retrieve best config and best learner'''\n",
"print('Best ML learner:', automl.best_estimator)\n",
"print('Best hyperparameter config:', automl.best_config)\n",
"print('Best accuracy on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))"
]
},
{
"cell_type": "markdown",
"id": "15",
"metadata": {},
"source": [
"## 3. Model saving and prediction\n",
"\n",
"### Save model\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:35.5551143Z",
"execution_start_time": "2024-09-03T04:48:29.9813151Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "1a35f263-201c-43ae-8558-2903b57b1973",
"queued_time": "2024-09-03T04:44:51.1817601Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 17,
"statement_ids": [
17
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 17, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Registered model 'flight_delays_baseline' already exists. Creating a new version of this model...\n",
"2024/09/03 04:48:33 INFO mlflow.tracking._model_registry.client: Waiting up to 300 seconds for model version to finish creation. Model name: flight_delays_baseline, version 4\n",
"Created version '4' of model 'flight_delays_baseline'.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model 'flight_delays_baseline' version 4 registered successfully.\n"
]
}
],
"source": [
"model_path = f\"runs:/{automl.best_run_id}/model\"\n",
"\n",
"# Register the model to the MLflow registry\n",
"registered_model = mlflow.register_model(model_uri=model_path, name=\"flight_delays_baseline\")\n",
"\n",
"# Print the registered model's name and version\n",
"print(f\"Model '{registered_model.name}' version {registered_model.version} registered successfully.\")"
]
},
{
"cell_type": "markdown",
"id": "17",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Predict with saved model"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:38.1921149Z",
"execution_start_time": "2024-09-03T04:48:35.9728626Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "65b0cb5e-e7fb-4936-b5c2-c15bec521833",
"queued_time": "2024-09-03T04:44:51.1840028Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 18,
"statement_ids": [
18
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 18, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c145060f2a5943719ee2f8d11642df52",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Downloading artifacts: 0%| | 0/5 [00:00, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Predicted labels [1 0 1 ... 1 0 0]\n",
"True labels 118331 0\n",
"328182 0\n",
"335454 0\n",
"520591 1\n",
"344651 0\n",
" ..\n",
"367080 0\n",
"203510 1\n",
"254894 0\n",
"296512 1\n",
"362444 0\n",
"Name: Delay, Length: 134846, dtype: category\n",
"Categories (2, object): ['0' < '1']\n"
]
},
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:47.5877957Z",
"execution_start_time": "2024-09-03T04:54:47.2059723Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "b85756fc-4249-4378-8bf4-238dbb0ac984",
"queued_time": "2024-09-03T04:48:49.1748862Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 38,
"statement_ids": [
38
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 38, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"loaded_model = mlflow.sklearn.load_model(f\"models:/{registered_model.name}/{registered_model.version}\")\n",
"\n",
"y_pred = loaded_model.predict(X_test)\n",
"print('Predicted labels', y_pred)\n",
"print('True labels', y_test)\n",
"y_pred_proba = automl.predict_proba(X_test)[:,1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:38.936582Z",
"execution_start_time": "2024-09-03T04:48:38.5760909Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "cd3d0b23-95b6-43fc-b72d-7343692a7964",
"queued_time": "2024-09-03T04:44:51.185972Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 19,
"statement_ids": [
19
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 19, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"accuracy = 0.6425997063316672\n",
"roc_auc = 0.6863937336290802\n",
"log_loss = 0.6294673392946836\n"
]
}
],
"source": [
"'''compute different metric values on the test dataset'''\n",
"from flaml.automl.ml import sklearn_metric_loss_score\n",
"print('accuracy', '=', 1 - sklearn_metric_loss_score('accuracy', y_pred, y_test.astype(float)))\n",
"print('roc_auc', '=', 1 - sklearn_metric_loss_score('roc_auc', y_pred_proba, y_test.astype(float)))\n",
"print('log_loss', '=', sklearn_metric_loss_score('log_loss', y_pred_proba, y_test.astype(float)))"
]
},
{
"cell_type": "markdown",
"id": "20",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## 4. Customized Learner"
]
},
{
"cell_type": "markdown",
"id": "21",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Experienced AutoML users may have a preferred model to tune, or may already have a reasonably hand-tuned model, before launching an AutoML experiment. They need to search for the best configuration of this customized model alongside the standard built-in learners.\n",
"\n",
"FLAML can incorporate customized or new learners provided by users (preferably exposing the scikit-learn API) at run time, as demonstrated below."
]
},
{
"cell_type": "markdown",
"id": "22",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example of Regularized Greedy Forest\n",
"\n",
"[Regularized Greedy Forest](https://arxiv.org/abs/1109.0887) (RGF) is a machine learning method not currently included in FLAML. RGF has many tuning parameters, the most critical of which are `[max_leaf, n_iter, n_tree_search, opt_interval, min_samples_leaf]`. To run a customized/new learner, the user needs to provide the following information:\n",
"* an implementation of the customized/new learner\n",
"* a list of hyperparameter names and types\n",
"* rough ranges of the hyperparameters (i.e., upper/lower bounds)\n",
"* low-cost initial values for cost-related hyperparameters (e.g., the initial values for `max_leaf` and `n_iter` should be small)\n",
"\n",
"In this example, the above information for RGF is wrapped in a Python class called *MyRegularizedGreedyForest* that exposes the hyperparameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:39.6956108Z",
"execution_start_time": "2024-09-03T04:48:39.3399804Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "e78fb67f-58c5-4aff-afd1-0ce0a6a0f2a7",
"queued_time": "2024-09-03T04:44:51.1880435Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 20,
"statement_ids": [
20
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 20, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"'''SKLearnEstimator is the super class for a sklearn learner'''\n",
"from flaml.automl.model import SKLearnEstimator\n",
"from flaml import tune\n",
"from flaml.automl.task.task import CLASSIFICATION\n",
"\n",
"\n",
"class MyRegularizedGreedyForest(SKLearnEstimator):\n",
"    def __init__(self, task='binary', **config):\n",
"        '''Constructor\n",
"\n",
"        Args:\n",
"            task: A string of the task type, one of\n",
"                'binary', 'multiclass', 'regression'\n",
"            config: A dictionary containing the hyperparameter names\n",
"                and 'n_jobs' as keys. n_jobs is the number of parallel threads.\n",
"        '''\n",
"\n",
"        super().__init__(task, **config)\n",
"\n",
"        # task is 'binary' or 'multiclass' for a classification task\n",
"        if task in CLASSIFICATION:\n",
"            from rgf.sklearn import RGFClassifier\n",
"\n",
"            self.estimator_class = RGFClassifier\n",
"        else:\n",
"            from rgf.sklearn import RGFRegressor\n",
"\n",
"            self.estimator_class = RGFRegressor\n",
"\n",
"    @classmethod\n",
"    def search_space(cls, data_size, task):\n",
"        '''[required method] search space\n",
"\n",
"        Returns:\n",
"            A dictionary of the search space.\n",
"            Each key is the name of a hyperparameter, and value is a dict with\n",
"            its domain (required) and low_cost_init_value, init_value,\n",
"            cat_hp_cost (if applicable).\n",
"            e.g.,\n",
"            {'domain': tune.randint(lower=1, upper=10), 'init_value': 1}.\n",
"        '''\n",
"        space = {\n",
"            'max_leaf': {'domain': tune.lograndint(lower=4, upper=data_size[0]), 'init_value': 4, 'low_cost_init_value': 4},\n",
"            'n_iter': {'domain': tune.lograndint(lower=1, upper=data_size[0]), 'init_value': 1, 'low_cost_init_value': 1},\n",
"            'n_tree_search': {'domain': tune.lograndint(lower=1, upper=32768), 'init_value': 1, 'low_cost_init_value': 1},\n",
"            'opt_interval': {'domain': tune.lograndint(lower=1, upper=10000), 'init_value': 100},\n",
"            'learning_rate': {'domain': tune.loguniform(lower=0.01, upper=20.0)},\n",
"            'min_samples_leaf': {'domain': tune.lograndint(lower=1, upper=20), 'init_value': 20},\n",
"        }\n",
"        return space\n",
"\n",
"    @classmethod\n",
"    def size(cls, config):\n",
"        '''[optional method] memory size of the estimator in bytes\n",
"\n",
"        Args:\n",
"            config: A dict of the hyperparameter config.\n",
"\n",
"        Returns:\n",
"            A float of the memory size required by the estimator to train the\n",
"            given config.\n",
"        '''\n",
"        max_leaves = int(round(config['max_leaf']))\n",
"        n_estimators = int(round(config['n_iter']))\n",
"        return (max_leaves * 3 + (max_leaves - 1) * 4 + 1.0) * n_estimators * 8\n",
"\n",
"    @classmethod\n",
"    def cost_relative2lgbm(cls):\n",
"        '''[optional method] relative cost compared to lightgbm'''\n",
"        return 1.0\n"
]
},
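{
"cell_type": "markdown",
"id": "23a",
"metadata": {},
"source": [
"The search space above declares most RGF hyperparameters with `tune.lograndint`, so small (cheap) values are explored about as often as large ones. The following is a minimal, standalone sketch of log-uniform integer sampling. `sample_log_uniform_int` is a hypothetical helper written for illustration, not FLAML's implementation, and it assumes inclusive bounds.\n",
"\n",
"```python\n",
"import math\n",
"import random\n",
"\n",
"\n",
"def sample_log_uniform_int(lower, upper, rng):\n",
"    '''Draw an integer in [lower, upper] whose logarithm is uniform.'''\n",
"    log_value = rng.uniform(math.log(lower), math.log(upper))\n",
"    value = int(round(math.exp(log_value)))\n",
"    # Clamp to guard against floating-point rounding at the boundaries.\n",
"    return min(upper, max(lower, value))\n",
"\n",
"\n",
"rng = random.Random(0)  # fixed seed so the sketch is repeatable\n",
"samples = [sample_log_uniform_int(4, 10000, rng) for _ in range(1000)]\n",
"share_small = sum(s <= 200 for s in samples) / len(samples)\n",
"print('share of samples <= 200:', round(share_small, 2))\n",
"```\n",
"\n",
"With bounds 4 and 10000, roughly half of the draws fall at or below 200 (the geometric midpoint of the range), which is why a log-scaled domain suits cost-related hyperparameters such as `max_leaf` and `n_iter`."
]
},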
{
"cell_type": "markdown",
"id": "24",
"metadata": {},
"source": [
"## 5. Customized Metric\n",
"\n",
"It's also easy to customize the optimization metric. As an example, we define a custom metric function that combines the training loss and the validation loss into the final loss to minimize."
]
},
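{
"cell_type": "markdown",
"id": "24a",
"metadata": {},
"source": [
"To make the objective concrete: the function in the next cell returns `val_loss * (1 + alpha) - alpha * train_loss`, which equals the validation loss plus `alpha` times the gap between validation and training loss. The sketch below isolates that arithmetic. `combined_loss` is a hypothetical helper used only for illustration, and the loss values are made up.\n",
"\n",
"```python\n",
"def combined_loss(val_loss, train_loss, alpha=0.5):\n",
"    '''Validation loss plus alpha times the validation-training gap.'''\n",
"    return val_loss * (1 + alpha) - alpha * train_loss\n",
"\n",
"\n",
"# Same validation loss, different training loss (illustrative numbers).\n",
"well_fit = combined_loss(val_loss=0.60, train_loss=0.58)\n",
"overfit = combined_loss(val_loss=0.60, train_loss=0.30)\n",
"print('well fit:', round(well_fit, 3))\n",
"print('overfit: ', round(overfit, 3))\n",
"```\n",
"\n",
"Both candidates have the same validation loss, but the one with the larger train/validation gap gets the larger (worse) objective, about 0.75 versus 0.61, so the search is nudged toward configurations that overfit less."
]
},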
{
"cell_type": "code",
"execution_count": null,
"id": "25",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:40.4231961Z",
"execution_start_time": "2024-09-03T04:48:40.0771372Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "266b8a24-e1b0-4ace-a1c8-daa39254d35c",
"queued_time": "2024-09-03T04:44:51.1898314Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 21,
"statement_ids": [
21
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 21, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def custom_metric(X_val, y_val, estimator, labels, X_train, y_train,\n",
"                  weight_val=None, weight_train=None, config=None,\n",
"                  groups_val=None, groups_train=None):\n",
"    from sklearn.metrics import log_loss\n",
"    import time\n",
"    start = time.time()\n",
"    y_pred = estimator.predict_proba(X_val)\n",
"    pred_time = (time.time() - start) / len(X_val)\n",
"    val_loss = log_loss(y_val, y_pred, labels=labels,\n",
"                        sample_weight=weight_val)\n",
"    y_pred = estimator.predict_proba(X_train)\n",
"    train_loss = log_loss(y_train, y_pred, labels=labels,\n",
"                          sample_weight=weight_train)\n",
"    alpha = 0.5\n",
"    # two elements are returned:\n",
"    # the first element is the metric to minimize as a float number,\n",
"    # the second element is a dictionary of the metrics to log\n",
"    return val_loss * (1 + alpha) - alpha * train_loss, {\n",
"        \"val_loss\": val_loss, \"train_loss\": train_loss, \"pred_time\": pred_time\n",
"    }"
]
},
{
"cell_type": "markdown",
"id": "26",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Add Customized Learner and Metric\n",
"\n",
"After adding RGF to the list of learners, we run AutoML, tuning the hyperparameters of RGF as well as those of the default learners."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "27",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:41.1826867Z",
"execution_start_time": "2024-09-03T04:48:40.8242072Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "7b5820ad-343c-4f0c-9d32-4c392dae6775",
"queued_time": "2024-09-03T04:44:51.1916814Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 22,
"statement_ids": [
22
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 22, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"automl = AutoML()\n",
"automl.add_learner(learner_name='RGF', learner_class=MyRegularizedGreedyForest)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "28",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:48:41.9465962Z",
"execution_start_time": "2024-09-03T04:48:41.5871773Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "8126bb8c-4bae-467d-9fd3-f74e245f0e6c",
"queued_time": "2024-09-03T04:44:51.1931417Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 23,
"statement_ids": [
23
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 23, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"settings = {\n",
"    \"time_budget\": 120,  # total running time in seconds\n",
"    \"metric\": custom_metric,  # pass the custom metric function here\n",
"    \"estimator_list\": ['RGF', 'lgbm', 'rf', 'xgboost'],  # list of ML learners\n",
"    \"task\": 'classification',  # task type\n",
"    \"seed\": 42,  # random seed\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "29",
"metadata": {},
"source": [
"We can then pass this custom learner and metric function to automl's `fit` method."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "30",
"metadata": {
"slideshow": {
"slide_type": "slide"
},
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:51:39.497045Z",
"execution_start_time": "2024-09-03T04:48:42.3996139Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "535eef5a-4837-412e-9498-93fb8350fe28",
"queued_time": "2024-09-03T04:44:51.1949269Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 24,
"statement_ids": [
24
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 24, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:48:43] {1787} INFO - task = classification\n",
"[flaml.automl.logger: 09-03 04:48:43] {1798} INFO - Evaluation method: holdout\n",
"[flaml.automl.logger: 09-03 04:48:43] {1901} INFO - Minimizing error metric: customized metric\n",
"[flaml.automl.logger: 09-03 04:48:43] {2019} INFO - List of ML learners in AutoML Run: ['RGF', 'lgbm', 'rf', 'xgboost']\n",
"[flaml.automl.logger: 09-03 04:48:43] {2329} INFO - iteration 0, current learner RGF\n",
"[flaml.automl.logger: 09-03 04:48:44] {2464} INFO - Estimated sufficient time budget=329015s. Estimated necessary time budget=329s.\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6624032549382345,
"customized metric": 0.6624032549382345,
"iter_counter": 0,
"trial_time": 0.8938319683074951,
"validation_loss": 0.6624032549382345,
"wall_clock_time": 1.7741670608520508
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "5.203358777802588",
"max_leaf": "4",
"min_samples_leaf": "20",
"n_iter": "1",
"n_tree_search": "1",
"opt_interval": "100",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_0",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "0",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/a9847e9c-7653-442b-b680-71640de74923/artifacts",
"end_time": 1725338941,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "a9847e9c-7653-442b-b680-71640de74923",
"run_name": "",
"run_uuid": "a9847e9c-7653-442b-b680-71640de74923",
"start_time": 1725338924,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:49:02] {2513} INFO - at 1.8s,\testimator RGF's best error=0.6624,\tbest estimator RGF's best error=0.6624\n",
"[flaml.automl.logger: 09-03 04:49:02] {2329} INFO - iteration 1, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6624032549382345,
"customized metric": 0.7225073543734466,
"iter_counter": 1,
"trial_time": 0.2959563732147217,
"validation_loss": 0.7225073543734466,
"wall_clock_time": 19.77244234085083
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "15.174906417562942",
"max_leaf": "6",
"min_samples_leaf": "16",
"n_iter": "1",
"n_tree_search": "1",
"opt_interval": "45",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_1",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "1",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/b5a50f25-c1ec-4c7b-98e6-5c5595b4fe0a/artifacts",
"end_time": 1725338960,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "b5a50f25-c1ec-4c7b-98e6-5c5595b4fe0a",
"run_name": "",
"run_uuid": "b5a50f25-c1ec-4c7b-98e6-5c5595b4fe0a",
"start_time": 1725338942,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:49:20] {2513} INFO - at 19.8s,\testimator RGF's best error=0.6624,\tbest estimator RGF's best error=0.6624\n",
"[flaml.automl.logger: 09-03 04:49:20] {2329} INFO - iteration 2, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6579708674713179,
"customized metric": 0.6579708674713179,
"iter_counter": 2,
"trial_time": 0.2928922176361084,
"validation_loss": 0.6579708674713179,
"wall_clock_time": 38.359304904937744
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "1.7841917324247605",
"max_leaf": "4",
"min_samples_leaf": "19",
"n_iter": "7",
"n_tree_search": "2",
"opt_interval": "224",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_2",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "2",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/b884dd57-97cd-4974-a5ef-f333a8d1782e/artifacts",
"end_time": 1725338978,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "b884dd57-97cd-4974-a5ef-f333a8d1782e",
"run_name": "",
"run_uuid": "b884dd57-97cd-4974-a5ef-f333a8d1782e",
"start_time": 1725338961,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:49:39] {2513} INFO - at 38.4s,\testimator RGF's best error=0.6580,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:49:39] {2329} INFO - iteration 3, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6564226864118912,
"customized metric": 0.6564226864118912,
"iter_counter": 3,
"trial_time": 0.34310221672058105,
"validation_loss": 0.6564226864118912,
"wall_clock_time": 57.05687379837036
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "1.382765995393735",
"max_leaf": "7",
"min_samples_leaf": "19",
"n_iter": "85",
"n_tree_search": "7",
"opt_interval": "151",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_3",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "3",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/d41bd941-0a23-480c-8c05-f868906289e1/artifacts",
"end_time": 1725338997,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "d41bd941-0a23-480c-8c05-f868906289e1",
"run_name": "",
"run_uuid": "d41bd941-0a23-480c-8c05-f868906289e1",
"start_time": 1725338980,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:49:57] {2513} INFO - at 57.1s,\testimator RGF's best error=0.6564,\tbest estimator RGF's best error=0.6564\n",
"[flaml.automl.logger: 09-03 04:49:57] {2329} INFO - iteration 4, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6564226864118912,
"customized metric": 0.6579708674713179,
"iter_counter": 4,
"trial_time": 0.2889101505279541,
"validation_loss": 0.6579708674713179,
"wall_clock_time": 75.13033628463745
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "1.784191732424759",
"max_leaf": "4",
"min_samples_leaf": "17",
"n_iter": "7",
"n_tree_search": "2",
"opt_interval": "223",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_4",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "4",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/4af65c7e-451e-4644-ab03-0797af0c540a/artifacts",
"end_time": 1725339014,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "4af65c7e-451e-4644-ab03-0797af0c540a",
"run_name": "",
"run_uuid": "4af65c7e-451e-4644-ab03-0797af0c540a",
"start_time": 1725338998,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:50:15] {2513} INFO - at 75.1s,\testimator RGF's best error=0.6564,\tbest estimator RGF's best error=0.6564\n",
"[flaml.automl.logger: 09-03 04:50:15] {2329} INFO - iteration 5, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6564226864118912,
"customized metric": 0.7285741410304108,
"iter_counter": 5,
"trial_time": 0.3273744583129883,
"validation_loss": 0.7285741410304108,
"wall_clock_time": 92.89774966239929
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "4.977190932165611",
"max_leaf": "4",
"min_samples_leaf": "16",
"n_iter": "33",
"n_tree_search": "11",
"opt_interval": "137",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_5",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "5",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/76410fa1-87e6-4f66-8a5a-6e1de4cb6310/artifacts",
"end_time": 1725339032,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "76410fa1-87e6-4f66-8a5a-6e1de4cb6310",
"run_name": "",
"run_uuid": "76410fa1-87e6-4f66-8a5a-6e1de4cb6310",
"start_time": 1725339016,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:50:32] {2513} INFO - at 92.9s,\testimator RGF's best error=0.6564,\tbest estimator RGF's best error=0.6564\n",
"[flaml.automl.logger: 09-03 04:50:32] {2329} INFO - iteration 6, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6399577405588128,
"customized metric": 0.6399577405588128,
"iter_counter": 6,
"trial_time": 0.6532900333404541,
"validation_loss": 0.6399577405588128,
"wall_clock_time": 110.61953806877136
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "0.38416082968818005",
"max_leaf": "37",
"min_samples_leaf": "19",
"n_iter": "222",
"n_tree_search": "4",
"opt_interval": "167",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric_child_6",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "6",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/4e7d64c8-227b-4596-a717-a160161778d3/artifacts",
"end_time": 1725339049,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "4e7d64c8-227b-4596-a717-a160161778d3",
"run_name": "",
"run_uuid": "4e7d64c8-227b-4596-a717-a160161778d3",
"start_time": 1725339033,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:50:49] {2513} INFO - at 110.6s,\testimator RGF's best error=0.6400,\tbest estimator RGF's best error=0.6400\n",
"[flaml.automl.logger: 09-03 04:50:50] {569} INFO - logging best model RGF\n",
"[flaml.automl.logger: 09-03 04:51:20] {2756} INFO - retrain RGF for 27.4s\n",
"[flaml.automl.logger: 09-03 04:51:20] {2759} INFO - retrained model: RGFClassifier(learning_rate=0.38416082968818005, max_leaf=37,\n",
" min_samples_leaf=19, n_iter=222, n_tree_search=4,\n",
" opt_interval=167)\n",
"[flaml.automl.logger: 09-03 04:51:20] {2760} INFO - Auto Feature Engineering pipeline: None\n",
"[flaml.automl.logger: 09-03 04:51:20] {2762} INFO - Best MLflow run name: \n",
"[flaml.automl.logger: 09-03 04:51:20] {2763} INFO - Best MLflow run id: 4e7d64c8-227b-4596-a717-a160161778d3\n",
"[flaml.automl.logger: 09-03 04:51:35] {2055} INFO - fit succeeded\n",
"[flaml.automl.logger: 09-03 04:51:35] {2056} INFO - Time taken to find the best model: 110.61953806877136\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6399577405588128,
"customized metric": 0.6399577405588128,
"iter_counter": 6,
"trial_time": 0.6532900333404541,
"validation_loss": 0.6399577405588128,
"wall_clock_time": 110.61953806877136
},
"params": {
"FLAML_sample_size": "10000",
"best_learner": "RGF",
"learner": "RGF",
"learning_rate": "0.38416082968818005",
"max_leaf": "37",
"min_samples_leaf": "19",
"n_iter": "222",
"n_tree_search": "4",
"opt_interval": "167",
"sample_size": "10000"
},
"tags": {
"mlflow.rootRunId": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"mlflow.runName": "flight_delays_rgf_metric",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "True",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "6",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/56f60b2e-eb73-41b4-a32d-ec188c73cd00/artifacts",
"end_time": 1725339096,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"run_name": "",
"run_uuid": "56f60b2e-eb73-41b4-a32d-ec188c73cd00",
"start_time": 1725338922,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"with mlflow.start_run(run_name=\"flight_delays_rgf_metric\"):\n",
" automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:51:40.2690103Z",
"execution_start_time": "2024-09-03T04:51:39.917342Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "47087381-f1b4-406d-b54b-fca76b79d4ce",
"queued_time": "2024-09-03T04:44:51.1970927Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 25,
"statement_ids": [
25
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 25, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Best ML learner: RGF\n",
"Best hyperparameter config: {'max_leaf': 37, 'n_iter': 222, 'n_tree_search': 4, 'opt_interval': 167, 'min_samples_leaf': 19, 'learning_rate': 0.38416082968818005}\n",
"Best accuracy on validation data: 0.36\n",
"Training duration of best run: 27.35 s\n"
]
}
],
"source": [
"'''retrieve best config and best learner'''\n",
"print('Best ML learner:', automl.best_estimator)\n",
"print('Best hyperparameter config:', automl.best_config)\n",
"print('Best accuracy on validation data: {0:.4g}'.format(1-automl.best_loss))\n",
"print('Training duration of best run: {0:.4g} s'.format(automl.best_config_train_time))"
]
},
{
"cell_type": "markdown",
"id": "32",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## 6. Auto Featurization"
]
},
{
"cell_type": "markdown",
"id": "33",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"Next, we introduce the new `featurization` module, which can automatically search for a feature engineering pipeline as part of the AutoML process.\n",
"\n",
"This module uses hyperparameter optimization (HPO) algorithms to select, transform, and construct features from raw data, with the aim of improving the model's predictive power.\n",
"\n",
"Because the module is integrated with AutoML, feature engineering and model selection are optimized jointly in a single automated process.\n",
"\n",
"To try the module, set the `featurization` parameter to `auto`. Set it to `force` to have FLAML choose a method for each stage, or to `off` to disable the module. On Fabric, the default is `auto`.\n",
"\n",
"\n",
"\n",
"Currently available feature engineering methods:\n",
"1. Stage `categorical`: Methods to encode categorical features. Available:\n",
" - `ordinal`: Ordinal encoding for categorical features.\n",
" \n",
"\n",
"2. Stage `numerical`: Methods to transform numerical features. Available:\n",
" - `null`: No transformation applied to numerical features.\n",
" - `scaler_standard`: Standard scaling for numerical features, normalizing them to have zero mean and unit variance.\n",
" - `scaler_minmax`: Min-Max scaling, transforming features by scaling each feature to a given range, typically [0, 1].\n",
" - `scaler_maxabs`: MaxAbs scaling, scales each feature by its maximum absolute value. This is meant for data that is already centered at zero or sparse data.\n",
" - `scaler_robust`: Robust scaling using statistics that are robust to outliers, particularly useful when dealing with features that contain many outliers.\n",
" - `normalizer_sparse`: Normalization applied to sparse input, making each feature vector have unit norm.\n",
" \n",
" \n",
"3. Stage `selection`: Methods for feature selection. Available:\n",
" - `null`: No feature selection is applied.\n",
" - `cardinality`: Selecting features based on their cardinality.\n",
" - `variance`: Selecting features based on variance threshold.\n",
" \n",
"\n",
"4. Stage `extraction`: Feature extraction methods, applicable based on task type. Available:\n",
" - `null`: No feature extraction is applied.\n",
" - `PCA`: Principal Component Analysis.\n",
" - `LDA`: Linear Discriminant Analysis (for classification tasks only)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "34",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:51:41.7103725Z",
"execution_start_time": "2024-09-03T04:51:40.6545872Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "83c9a530-9a27-4628-b4c3-89d2dea935ca",
"queued_time": "2024-09-03T04:44:51.1989052Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 26,
"statement_ids": [
26
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 26, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"automl = AutoML()\n",
"automl.add_learner(learner_name='RGF', learner_class=MyRegularizedGreedyForest)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "35",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:51:42.534623Z",
"execution_start_time": "2024-09-03T04:51:42.1730398Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "5c454e4d-2e58-48e7-adf7-4dd173e08eb9",
"queued_time": "2024-09-03T04:44:51.1999902Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 27,
"statement_ids": [
27
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 27, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"settings = {\n",
" \"time_budget\": 120, # total running time in seconds\n",
" \"metric\": custom_metric, # pass the custom metric function here\n",
" \"estimator_list\": ['RGF', 'lgbm', 'rf', 'xgboost'], # list of ML learners\n",
" \"task\": 'classification', # task type\n",
" \"seed\": 42, # random seed\n",
" \"featurization\": \"auto\",\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "36",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:28.7835981Z",
"execution_start_time": "2024-09-03T04:51:42.9699791Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "b27c4285-4823-45a1-a6e2-201c5e6ee824",
"queued_time": "2024-09-03T04:44:51.2008184Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 28,
"statement_ids": [
28
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 28, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:51:43] {1787} INFO - task = classification\n",
"[flaml.automl.logger: 09-03 04:51:43] {1798} INFO - Evaluation method: holdout\n",
"[flaml.automl.logger: 09-03 04:51:44] {1901} INFO - Minimizing error metric: customized metric\n",
"Auto featurization is not supported for spark data. Featurization is turned off.\n",
"[flaml.automl.logger: 09-03 04:51:44] {2019} INFO - List of ML learners in AutoML Run: ['RGF', 'lgbm', 'rf', 'xgboost']\n",
"[flaml.automl.logger: 09-03 04:51:44] {2329} INFO - iteration 0, current learner RGF\n",
"[flaml.automl.logger: 09-03 04:51:44] {2464} INFO - Estimated sufficient time budget=108490s. Estimated necessary time budget=108s.\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6624032549382345,
"customized metric": 0.6624032549382345,
"iter_counter": 0,
"trial_time": 0.29631805419921875,
"validation_loss": 0.6624032549382345,
"wall_clock_time": 1.1034765243530273
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "5.203358777802588",
"max_leaf": "4",
"min_samples_leaf": "20",
"n_iter": "1",
"n_tree_search": "1",
"opt_interval": "100",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_0",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "0",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/fee4346a-61b9-44a5-86a9-18bdab0fc95c/artifacts",
"end_time": 1725339121,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "fee4346a-61b9-44a5-86a9-18bdab0fc95c",
"run_name": "",
"run_uuid": "fee4346a-61b9-44a5-86a9-18bdab0fc95c",
"start_time": 1725339104,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:52:01] {2513} INFO - at 1.1s,\testimator RGF's best error=0.6624,\tbest estimator RGF's best error=0.6624\n",
"[flaml.automl.logger: 09-03 04:52:01] {2329} INFO - iteration 1, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6624032549382345,
"customized metric": 0.7225073543734466,
"iter_counter": 1,
"trial_time": 0.29607105255126953,
"validation_loss": 0.7225073543734466,
"wall_clock_time": 18.643904447555542
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "15.174906417562942",
"max_leaf": "6",
"min_samples_leaf": "16",
"n_iter": "1",
"n_tree_search": "1",
"opt_interval": "45",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_1",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "1",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/25568343-c9e2-4ea1-b2dc-87c2de178cb5/artifacts",
"end_time": 1725339140,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "25568343-c9e2-4ea1-b2dc-87c2de178cb5",
"run_name": "",
"run_uuid": "25568343-c9e2-4ea1-b2dc-87c2de178cb5",
"start_time": 1725339122,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:52:20] {2513} INFO - at 18.6s,\testimator RGF's best error=0.6624,\tbest estimator RGF's best error=0.6624\n",
"[flaml.automl.logger: 09-03 04:52:20] {2329} INFO - iteration 2, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6579708674713179,
"customized metric": 0.6579708674713179,
"iter_counter": 2,
"trial_time": 0.2899649143218994,
"validation_loss": 0.6579708674713179,
"wall_clock_time": 37.70584297180176
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "1.7841917324247605",
"max_leaf": "4",
"min_samples_leaf": "19",
"n_iter": "7",
"n_tree_search": "2",
"opt_interval": "224",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_2",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "2",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/3e40efd0-6742-433f-ac4f-0dd3e491a964/artifacts",
"end_time": 1725339158,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "3e40efd0-6742-433f-ac4f-0dd3e491a964",
"run_name": "",
"run_uuid": "3e40efd0-6742-433f-ac4f-0dd3e491a964",
"start_time": 1725339141,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:52:39] {2513} INFO - at 37.7s,\testimator RGF's best error=0.6580,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:52:39] {2329} INFO - iteration 3, current learner lgbm\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages/flaml/fabric/autofe.py:288: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" X[categorical_features] = X[categorical_features].astype(str).astype(\"category\")\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6771484988075377,
"customized metric": 0.6771484988075377,
"iter_counter": 3,
"trial_time": 0.6715874671936035,
"validation_loss": 0.6771484988075377,
"wall_clock_time": 56.74079608917236
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "1.0",
"fe.categorical": "ordinal",
"fe.extraction": "LDA",
"fe.selection": "null",
"learner": "lgbm",
"learning_rate": "0.09999999999999995",
"log_max_bin": "8",
"min_child_samples": "20",
"n_estimators": "4",
"num_leaves": "4",
"reg_alpha": "0.0009765625",
"reg_lambda": "1.0",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_3",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "3",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/b439be54-2ef7-4820-b2f4-eab53bfb0dd7/artifacts",
"end_time": 1725339179,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "b439be54-2ef7-4820-b2f4-eab53bfb0dd7",
"run_name": "",
"run_uuid": "b439be54-2ef7-4820-b2f4-eab53bfb0dd7",
"start_time": 1725339160,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:52:59] {2513} INFO - at 56.7s,\testimator lgbm's best error=0.6771,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:52:59] {2329} INFO - iteration 4, current learner xgboost\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages/flaml/fabric/autofe.py:288: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" X[categorical_features] = X[categorical_features].astype(str).astype(\"category\")\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6798173003929756,
"customized metric": 0.6798173003929756,
"iter_counter": 4,
"trial_time": 0.31815600395202637,
"validation_loss": 0.6798173003929756,
"wall_clock_time": 76.9973726272583
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bylevel": "1.0",
"colsample_bytree": "1.0",
"fe.categorical": "ordinal",
"fe.extraction": "LDA",
"fe.selection": "null",
"learner": "xgboost",
"learning_rate": "0.09999999999999995",
"max_leaves": "4",
"min_child_weight": "0.9999999999999993",
"n_estimators": "4",
"reg_alpha": "0.0009765625",
"reg_lambda": "1.0",
"sample_size": "10000",
"subsample": "1.0"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_4",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "XGBoostSklearnEstimator",
"synapseml.flaml.estimator_name": "xgboost",
"synapseml.flaml.iteration_number": "4",
"synapseml.flaml.learner": "xgboost",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/289b3da0-381b-47d0-9b70-c4fa92077764/artifacts",
"end_time": 1725339199,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "289b3da0-381b-47d0-9b70-c4fa92077764",
"run_name": "",
"run_uuid": "289b3da0-381b-47d0-9b70-c4fa92077764",
"start_time": 1725339180,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:53:19] {2513} INFO - at 77.0s,\testimator xgboost's best error=0.6798,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:53:19] {2329} INFO - iteration 5, current learner RGF\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6579708674713179,
"customized metric": 0.7024846118198472,
"iter_counter": 5,
"trial_time": 0.3247342109680176,
"validation_loss": 0.7024846118198472,
"wall_clock_time": 96.81968307495117
},
"params": {
"FLAML_sample_size": "10000",
"learner": "RGF",
"learning_rate": "1.382765995393735",
"max_leaf": "7",
"min_samples_leaf": "19",
"n_iter": "85",
"n_tree_search": "7",
"opt_interval": "151",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_5",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "5",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/09b06cd7-685c-42a2-9713-4ee67b028791/artifacts",
"end_time": 1725339220,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "09b06cd7-685c-42a2-9713-4ee67b028791",
"run_name": "",
"run_uuid": "09b06cd7-685c-42a2-9713-4ee67b028791",
"start_time": 1725339200,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:53:41] {2513} INFO - at 96.8s,\testimator RGF's best error=0.6580,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:53:41] {2329} INFO - iteration 6, current learner lgbm\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/trusted-service-user/cluster-env/trident_env/lib/python3.10/site-packages/flaml/fabric/autofe.py:288: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" X[categorical_features] = X[categorical_features].astype(str).astype(\"category\")\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6771484988075377,
"customized metric": 0.6788861940642658,
"iter_counter": 6,
"trial_time": 0.21529173851013184,
"validation_loss": 0.6788861940642658,
"wall_clock_time": 118.05959796905518
},
"params": {
"FLAML_sample_size": "10000",
"colsample_bytree": "0.9532787078931052",
"fe.categorical": "ordinal",
"fe.extraction": "LDA",
"fe.selection": "null",
"learner": "lgbm",
"learning_rate": "0.06546385281889436",
"log_max_bin": "9",
"min_child_samples": "17",
"n_estimators": "5",
"num_leaves": "4",
"reg_alpha": "0.0021499603681893195",
"reg_lambda": "24.150321938204367",
"sample_size": "10000"
},
"tags": {
"mlflow.parentRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe_child_6",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "False",
"synapseml.flaml.estimator_class": "LGBMEstimator",
"synapseml.flaml.estimator_name": "lgbm",
"synapseml.flaml.iteration_number": "6",
"synapseml.flaml.learner": "lgbm",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/7d1bee9c-9e91-4774-ab71-e3634d3b37cb/artifacts",
"end_time": 1725339239,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "7d1bee9c-9e91-4774-ab71-e3634d3b37cb",
"run_name": "",
"run_uuid": "7d1bee9c-9e91-4774-ab71-e3634d3b37cb",
"start_time": 1725339221,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[flaml.automl.logger: 09-03 04:54:00] {2513} INFO - at 118.1s,\testimator lgbm's best error=0.6771,\tbest estimator RGF's best error=0.6580\n",
"[flaml.automl.logger: 09-03 04:54:01] {569} INFO - logging best model RGF\n",
"[flaml.automl.logger: 09-03 04:54:06] {2756} INFO - retrain RGF for 2.6s\n",
"[flaml.automl.logger: 09-03 04:54:06] {2759} INFO - retrained model: RGFClassifier(learning_rate=1.7841917324247605, max_leaf=4, min_samples_leaf=19,\n",
" n_iter=7, n_tree_search=2, opt_interval=224)\n",
"[flaml.automl.logger: 09-03 04:54:06] {2760} INFO - Auto Feature Engineering pipeline: None\n",
"[flaml.automl.logger: 09-03 04:54:06] {2762} INFO - Best MLflow run name: \n",
"[flaml.automl.logger: 09-03 04:54:06] {2763} INFO - Best MLflow run id: 3e40efd0-6742-433f-ac4f-0dd3e491a964\n",
"[flaml.automl.logger: 09-03 04:54:23] {2055} INFO - fit succeeded\n",
"[flaml.automl.logger: 09-03 04:54:23] {2056} INFO - Time taken to find the best model: 37.70584297180176\n"
]
},
{
"data": {
"application/vnd.mlflow.run-widget+json": {
"data": {
"metrics": {
"best_validation_loss": 0.6579708674713179,
"customized metric": 0.6579708674713179,
"iter_counter": 2,
"trial_time": 0.2899649143218994,
"validation_loss": 0.6579708674713179,
"wall_clock_time": 37.70584297180176
},
"params": {
"FLAML_sample_size": "10000",
"best_learner": "RGF",
"learner": "RGF",
"learning_rate": "1.7841917324247605",
"max_leaf": "4",
"min_samples_leaf": "19",
"n_iter": "7",
"n_tree_search": "2",
"opt_interval": "224",
"sample_size": "10000"
},
"tags": {
"mlflow.rootRunId": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"mlflow.runName": "flight_delays_autofe",
"mlflow.user": "1b884fa3-ac7e-44f0-a171-1f215a13ecd4",
"synapseml.experiment.artifactId": "d85e32e4-32da-4c71-986b-a0d0723750aa",
"synapseml.experimentName": "automl-tutorial",
"synapseml.flaml.automl_display_configurations": "{\"task\": null, \"automl_mode\": null, \"metric\": null, \"time_budget\": null, \"early_stop\": null, \"featurization\": null}",
"synapseml.flaml.automl_user_configurations": "{}",
"synapseml.flaml.best_run": "True",
"synapseml.flaml.estimator_class": "MyRegularizedGreedyForest",
"synapseml.flaml.estimator_name": "RGF",
"synapseml.flaml.iteration_number": "2",
"synapseml.flaml.learner": "RGF",
"synapseml.flaml.log_type": "r_autolog",
"synapseml.flaml.meric": "customized metric",
"synapseml.flaml.run_source": "flaml-automl",
"synapseml.flaml.sample_size": "10000",
"synapseml.flaml.version": "2.2.0.post1",
"synapseml.livy.id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"synapseml.notebook.artifactId": "4305ee52-6e3d-4e54-8bc5-cd02304d23fc",
"synapseml.user.id": "8abb9091-0a62-4ecd-bf6a-e49dbbf94431",
"synapseml.user.name": "Li Jiang"
}
},
"info": {
"artifact_uri": "sds://onelakemsit.pbidedicated.windows.net/c5eb3571-5221-4fd2-957a-456e893a6545/d85e32e4-32da-4c71-986b-a0d0723750aa/40117031-4c02-4f09-8fd2-3a3453661e9b/artifacts",
"end_time": 1725339264,
"experiment_id": "0499e3cc-c690-4b30-86b9-55838e7df486",
"lifecycle_stage": "active",
"run_id": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"run_name": "",
"run_uuid": "40117031-4c02-4f09-8fd2-3a3453661e9b",
"start_time": 1725339103,
"status": "FINISHED",
"user_id": "e83b0ff5-f802-4776-bae4-11fe73ba932a"
},
"inputs": {
"dataset_inputs": []
}
}
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"with mlflow.start_run(run_name=\"flight_delays_autofe\"):\n",
" automl.fit(X_train=X_train, y_train=y_train, **settings)"
]
},
{
"cell_type": "markdown",
"id": "37",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Standalone Featurization Pipeline\n",
"Once the AutoML run completes, the fitted featurization pipeline can be retrieved and applied on its own, independently of the AutoML process.\n",
"\n",
"To retrieve only the feature engineering pipeline, use `automl.model.autofe`. It can be `None`, as in the run above, where the log reports `Auto Feature Engineering pipeline: None`.  \n",
"For the complete set of preprocessing steps, including FLAML's built-in preprocessors, use `automl.feature_transformer`.\n",
"\n",
"- To view the configuration details of the entire pipeline, use `autofe.show_transformations()`.\n",
"- To visualize the pipeline structure interactively, run `display(autofe)` or evaluate `autofe` as the last expression in a cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "38",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:30.1784335Z",
"execution_start_time": "2024-09-03T04:54:29.1617339Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "c4ac362a-a6bb-4b02-8c33-77323b7f2b31",
"queued_time": "2024-09-03T04:44:51.2018554Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 29,
"statement_ids": [
29
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 29, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"autofe = automl.model.autofe\n",
"display(autofe)\n",
"# autofe can be None (the log above reports no feature engineering pipeline for the best model)\n",
"if autofe is not None:\n",
"    display(autofe.transform(X_test))"
]
},
{
"cell_type": "markdown",
"id": "39",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"The full set of data preprocessors, including FLAML's built-in preprocessing and the featurization pipeline:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "40",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:31.7039608Z",
"execution_start_time": "2024-09-03T04:54:30.5957553Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "d61327b6-6d7d-471f-83c7-501e6e653430",
"queued_time": "2024-09-03T04:44:51.2031479Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 30,
"statement_ids": [
30
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 30, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.synapse.widget-view+json": {
"widget_id": "01e178a6-4c9f-4b90-825f-a05558a8ee9e",
"widget_type": "Synapse.DataFrame"
},
"text/plain": [
"SynapseWidget(Synapse.DataFrame, 01e178a6-4c9f-4b90-825f-a05558a8ee9e)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"transformer = automl.feature_transformer\n",
"display(transformer)\n",
"display(transformer.transform(X_test))"
]
},
{
"cell_type": "markdown",
"id": "41",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## 7. Visualization\n",
"The `flaml.visualization` module provides utility functions for plotting the optimization process with [plotly](https://plotly.com/python/), so experiment results can be explored interactively. Each plotting function takes the results of a hyperparameter tuning or AutoML experiment as input and accepts optional keyword arguments."
]
},
{
"cell_type": "markdown",
"id": "42",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Available Plots\n",
"- `plot_contour`: Plot the parameter relationship as a contour plot in the experiment.\n",
"- `plot_edf`: Plot the objective value EDF (empirical distribution function) of the experiment.\n",
"- `plot_feature_importance`: Plot the importance of each feature in the dataset.\n",
"- `plot_optimization_history`: Plot the optimization history of all trials in the experiment.\n",
"- `plot_parallel_coordinate`: Plot the high-dimensional parameter relationships in the experiment.\n",
"- `plot_slice`: Plot the parameter relationship as a slice plot in the experiment.\n",
"- `plot_timeline`: Plot the timeline of the experiment."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "43",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"import flaml.visualization as fviz\n",
"\n",
"# Optionally restrict the plot to selected parameters, e.g. params=['num_leaves', 'fe.extraction']\n",
"fig = fviz.plot_slice(automl)\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"fig = fviz.plot_contour(automl, learner=\"RGF\", params=[\"max_leaf\", \"learning_rate\"])\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "45",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:42.5267328Z",
"execution_start_time": "2024-09-03T04:54:38.3407954Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "1d372090-9c00-419f-a7b4-a6e367dd0e31",
"queued_time": "2024-09-03T04:44:51.2069779Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 33,
"statement_ids": [
33
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 33, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"hovertemplate": "learner=RGF<br>score=%{x}<br>probability=%{y}<extra></extra>",
"legendgroup": "RGF",
"line": {
"dash": "solid",
"shape": "hv"
},
"marker": {
"color": "#636efa",
"symbol": "circle"
},
"mode": "lines",
"name": "RGF",
"orientation": "v",
"showlegend": true,
"type": "scatter",
"x": [
0.2774926456265534,
0.2975153881801528,
0.3375967450617655,
0.34202913252868206
],
"xaxis": "x",
"y": [
0.25,
0.5,
0.75,
1
],
"yaxis": "y"
},
{
"hovertemplate": "learner=lgbm<br>score=%{x}<br>probability=%{y}<extra></extra>",
"legendgroup": "lgbm",
"line": {
"dash": "solid",
"shape": "hv"
},
"marker": {
"color": "#EF553B",
"symbol": "circle"
},
"mode": "lines",
"name": "lgbm",
"orientation": "v",
"showlegend": true,
"type": "scatter",
"x": [
0.32111380593573424,
0.3228515011924623
],
"xaxis": "x",
"y": [
0.5,
1
],
"yaxis": "y"
},
{
"hovertemplate": "learner=xgboost<br>score=%{x}<br>probability=%{y}<extra></extra>",
"legendgroup": "xgboost",
"line": {
"dash": "solid",
"shape": "hv"
},
"marker": {
"color": "#00cc96",
"symbol": "circle"
},
"mode": "lines",
"name": "xgboost",
"orientation": "v",
"showlegend": true,
"type": "scatter",
"x": [
0.3201826996070244
],
"xaxis": "x",
"y": [
1
],
"yaxis": "y"
}
],
"layout": {
"legend": {
"title": {
"text": "learner"
},
"tracegroupgap": 0
},
"margin": {
"t": 60
},
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"fillpattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"text": "Empirical Distribution Function"
},
"xaxis": {
"anchor": "y",
"domain": [
0,
1
],
"title": {
"text": "Score"
}
},
"yaxis": {
"anchor": "x",
"domain": [
0,
1
],
"rangemode": "tozero",
"title": {
"text": "Probability"
}
}
}
},
"text/html": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Plot the empirical distribution function (EDF) of the trial scores.\n",
"fig = fviz.plot_edf(automl)\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"# Plot the optimization history, i.e. the score of each trial in the search.\n",
"fig = fviz.plot_optimization_history(automl)\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"# Plot the timeline of the trials run during the search.\n",
"fig = fviz.plot_timeline(automl)\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "48",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"# Plot the importance of each input feature.\n",
"fig = fviz.plot_feature_importance(automl)\n",
"fig.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49",
"metadata": {
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"data": {
"application/vnd.livy.statement-meta+json": {
"execution_finish_time": "2024-09-03T04:54:46.8278294Z",
"execution_start_time": "2024-09-03T04:54:45.8054335Z",
"livy_statement_state": "available",
"normalized_state": "finished",
"parent_msg_id": "3277a4ef-8679-424c-a2d1-7805471c6793",
"queued_time": "2024-09-03T04:44:51.2103857Z",
"session_id": "44ec0d80-7d47-447e-8932-68bbcc94ad2b",
"session_start_time": null,
"spark_pool": null,
"state": "finished",
"statement_id": 37,
"statement_ids": [
37
]
},
"text/plain": [
"StatementMeta(, 44ec0d80-7d47-447e-8932-68bbcc94ad2b, 37, Finished, Available, Finished)"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.plotly.v1+json": {
"config": {
"plotlyServerURL": "https://plot.ly"
},
"data": [
{
"dimensions": [
{
"label": "sample_size",
"values": [
10000,
10000,
10000,
null,
10000
]
},
{
"label": "max_leaf",
"values": [
4,
6,
4,
null,
7
]
},
{
"label": "n_iter",
"values": [
1,
1,
7,
null,
85
]
},
{
"label": "n_tree_search",
"values": [
1,
1,
2,
null,
7
]
},
{
"label": "opt_interval",
"values": [
100,
45,
224,
null,
151
]
},
{
"label": "min_samples_leaf",
"values": [
20,
16,
19,
null,
19
]
},
{
"label": "learning_rate",
"values": [
5.203358777802588,
15.174906417562942,
1.7841917324247605,
null,
1.382765995393735
]
},
{
"label": "FLAML_sample_size",
"values": [
10000,
10000,
10000,
null,
10000
]
},
{
"label": "score",
"values": [
0.3375967450617655,
0.2774926456265534,
0.34202913252868206,
0.2975153881801528,
null
]
}
],
"domain": {
"x": [
0,
1
],
"y": [
0,
1
]
},
"line": {
"color": [
0.3375967450617655,
0.2774926456265534,
0.34202913252868206,
0.2975153881801528,
null
],
"coloraxis": "coloraxis"
},
"name": "",
"type": "parcoords"
}
],
"layout": {
"coloraxis": {
"colorbar": {
"title": {
"text": "score"
}
},
"colorscale": [
[
0,
"rgb(7, 64, 80)"
],
[
0.16666666666666666,
"rgb(16, 89, 101)"
],
[
0.3333333333333333,
"rgb(33, 122, 121)"
],
[
0.5,
"rgb(76, 155, 130)"
],
[
0.6666666666666666,
"rgb(108, 192, 139)"
],
[
0.8333333333333334,
"rgb(151, 225, 150)"
],
[
1,
"rgb(211, 242, 163)"
]
]
},
"legend": {
"tracegroupgap": 0
},
"margin": {
"t": 60
},
"template": {
"data": {
"bar": [
{
"error_x": {
"color": "#2a3f5f"
},
"error_y": {
"color": "#2a3f5f"
},
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "bar"
}
],
"barpolar": [
{
"marker": {
"line": {
"color": "#E5ECF6",
"width": 0.5
},
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "barpolar"
}
],
"carpet": [
{
"aaxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"baxis": {
"endlinecolor": "#2a3f5f",
"gridcolor": "white",
"linecolor": "white",
"minorgridcolor": "white",
"startlinecolor": "#2a3f5f"
},
"type": "carpet"
}
],
"choropleth": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "choropleth"
}
],
"contour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "contour"
}
],
"contourcarpet": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "contourcarpet"
}
],
"heatmap": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmap"
}
],
"heatmapgl": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "heatmapgl"
}
],
"histogram": [
{
"marker": {
"pattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
}
},
"type": "histogram"
}
],
"histogram2d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2d"
}
],
"histogram2dcontour": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "histogram2dcontour"
}
],
"mesh3d": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"type": "mesh3d"
}
],
"parcoords": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "parcoords"
}
],
"pie": [
{
"automargin": true,
"type": "pie"
}
],
"scatter": [
{
"fillpattern": {
"fillmode": "overlay",
"size": 10,
"solidity": 0.2
},
"type": "scatter"
}
],
"scatter3d": [
{
"line": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatter3d"
}
],
"scattercarpet": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattercarpet"
}
],
"scattergeo": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergeo"
}
],
"scattergl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattergl"
}
],
"scattermapbox": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scattermapbox"
}
],
"scatterpolar": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolar"
}
],
"scatterpolargl": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterpolargl"
}
],
"scatterternary": [
{
"marker": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"type": "scatterternary"
}
],
"surface": [
{
"colorbar": {
"outlinewidth": 0,
"ticks": ""
},
"colorscale": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"type": "surface"
}
],
"table": [
{
"cells": {
"fill": {
"color": "#EBF0F8"
},
"line": {
"color": "white"
}
},
"header": {
"fill": {
"color": "#C8D4E3"
},
"line": {
"color": "white"
}
},
"type": "table"
}
]
},
"layout": {
"annotationdefaults": {
"arrowcolor": "#2a3f5f",
"arrowhead": 0,
"arrowwidth": 1
},
"autotypenumbers": "strict",
"coloraxis": {
"colorbar": {
"outlinewidth": 0,
"ticks": ""
}
},
"colorscale": {
"diverging": [
[
0,
"#8e0152"
],
[
0.1,
"#c51b7d"
],
[
0.2,
"#de77ae"
],
[
0.3,
"#f1b6da"
],
[
0.4,
"#fde0ef"
],
[
0.5,
"#f7f7f7"
],
[
0.6,
"#e6f5d0"
],
[
0.7,
"#b8e186"
],
[
0.8,
"#7fbc41"
],
[
0.9,
"#4d9221"
],
[
1,
"#276419"
]
],
"sequential": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
],
"sequentialminus": [
[
0,
"#0d0887"
],
[
0.1111111111111111,
"#46039f"
],
[
0.2222222222222222,
"#7201a8"
],
[
0.3333333333333333,
"#9c179e"
],
[
0.4444444444444444,
"#bd3786"
],
[
0.5555555555555556,
"#d8576b"
],
[
0.6666666666666666,
"#ed7953"
],
[
0.7777777777777778,
"#fb9f3a"
],
[
0.8888888888888888,
"#fdca26"
],
[
1,
"#f0f921"
]
]
},
"colorway": [
"#636efa",
"#EF553B",
"#00cc96",
"#ab63fa",
"#FFA15A",
"#19d3f3",
"#FF6692",
"#B6E880",
"#FF97FF",
"#FECB52"
],
"font": {
"color": "#2a3f5f"
},
"geo": {
"bgcolor": "white",
"lakecolor": "white",
"landcolor": "#E5ECF6",
"showlakes": true,
"showland": true,
"subunitcolor": "white"
},
"hoverlabel": {
"align": "left"
},
"hovermode": "closest",
"mapbox": {
"style": "light"
},
"paper_bgcolor": "white",
"plot_bgcolor": "#E5ECF6",
"polar": {
"angularaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"radialaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"scene": {
"xaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"yaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
},
"zaxis": {
"backgroundcolor": "#E5ECF6",
"gridcolor": "white",
"gridwidth": 2,
"linecolor": "white",
"showbackground": true,
"ticks": "",
"zerolinecolor": "white"
}
},
"shapedefaults": {
"line": {
"color": "#2a3f5f"
}
},
"ternary": {
"aaxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"baxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
},
"bgcolor": "#E5ECF6",
"caxis": {
"gridcolor": "white",
"linecolor": "white",
"ticks": ""
}
},
"title": {
"x": 0.05
},
"xaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
},
"yaxis": {
"automargin": true,
"gridcolor": "white",
"linecolor": "white",
"ticks": "",
"title": {
"standoff": 15
},
"zerolinecolor": "white",
"zerolinewidth": 2
}
}
},
"title": {
"automargin": true,
"text": "Parallel Coordinate Plot for RGF",
"y": 0.1
}
}
},
"text/html": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Parallel coordinate plot of the sampled hyperparameter values against the score.\n",
"fig = fviz.plot_parallel_coordinate(automl)\n",
"fig.show()"
]
}
],
"metadata": {
"a365ComputeOptions": null,
"description": null,
"kernel_info": {
"name": "synapse_pyspark"
},
"kernelspec": {
"display_name": "Synapse PySpark",
"language": "Python",
"name": "synapse_pyspark"
},
"language_info": {
"name": "python"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
},
"save_output": true,
"sessionKeepAliveTimeout": 0,
"spark_compute": {
"compute_id": "/trident/default"
}
},
"nbformat": 4,
"nbformat_minor": 5
}