Fast Retraining

This repo shows how to perform fast retraining with LightGBM in different business cases.

In this repo we compare two of the fastest boosted decision tree libraries: XGBoost and LightGBM. We evaluate them across datasets from several domains and of different sizes.
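The comparison reports a training time and an F1 score per library. Below is a minimal sketch of how such a measurement could be taken for a binary task. It is not the code used in the notebooks: `benchmark`, `train_fn`, `predict_fn` and the majority-class stand-in model are hypothetical names introduced only for illustration.

```python
import time


def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def benchmark(train_fn, predict_fn, X_train, y_train, X_test, y_test):
    """Time only the training call, then score predictions on held-out data."""
    start = time.perf_counter()
    model = train_fn(X_train, y_train)
    elapsed = time.perf_counter() - start
    y_pred = predict_fn(model, X_test)
    return {"time_s": elapsed, "f1": f1_score(y_test, y_pred)}


# Hypothetical stand-in for a real library: "training" memorises the majority class.
def majority_train(X, y):
    return 1 if sum(y) * 2 >= len(y) else 0


def majority_predict(model, X):
    return [model] * len(X)


result = benchmark(
    majority_train, majority_predict,
    [[0], [1], [2], [3]], [1, 1, 0, 1],   # toy training data
    [[4], [5], [6]], [1, 0, 1],           # toy test data
)
print(result)
```

To compare real libraries, one would pass a `train_fn` and `predict_fn` pair per library and keep the data split fixed, so that only the training call differs between runs.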

Installation and Setup

The installation instructions can be found in INSTALL.md.

Project

The folder experiments contains the experiments of the project. We developed 5 experiments with the CPU versions of the libraries and 2 experiments with the GPU versions.

Airline
BCI
Football
Planet Amazon
Fraud Detection
Airline (GPU version)
HIGGS (GPU version)

The common code for the project is in the folder experiments/libs.

Benchmark

The following table summarizes some results of the benchmarks performed in the experiments:

Dataset	Experiment	Data size	Features	time xgb	time xgb_hist	time lgb	F1 xgb	F1 xgb_hist	F1 lgb
Airline	link	115069017	13	2180	578	366	0.717	0.698	0.694
BCI	link	20497	2048	15.97	52.69	6.38	0.110	0.142	0.137
Football	link	19673	46	2.27	2.47	0.582	0.458	0.460	0.587
Planet Amazon	link	40479	2048	291.93	2238.96	78.43	0.805	0.819	0.891
Fraud Detection	link	284807	30	4.45	1.20	0.73	0.824	0.802	0.813
Airline (GPU)	link	115069017	13	65.04	27.15	21.35	0.726	0.738	0.728
HIGGS (GPU)	link	1000000	28	107.63	51.26	28.90	0.770	0.770	0.766
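To make the time columns easier to compare, the sketch below turns them into speedup ratios of LightGBM over each XGBoost variant. The numbers are copied from the table; the names `times`, `speedups` and `speedup_table` are introduced here only for illustration. The ratios are unitless, so no assumption about the time unit is needed.

```python
# Training times copied from the table above: (xgb, xgb_hist, lgb)
times = {
    "Airline": (2180, 578, 366),
    "BCI": (15.97, 52.69, 6.38),
    "Football": (2.27, 2.47, 0.582),
    "Planet Amazon": (291.93, 2238.96, 78.43),
    "Fraud Detection": (4.45, 1.20, 0.73),
    "Airline (GPU)": (65.04, 27.15, 21.35),
    "HIGGS (GPU)": (107.63, 51.26, 28.90),
}


def speedups(row):
    """Return how many times faster lgb is than xgb and than xgb_hist."""
    xgb, xgb_hist, lgb = row
    return xgb / lgb, xgb_hist / lgb


speedup_table = {name: speedups(row) for name, row in times.items()}

# Print one line per dataset: speedup vs xgb, then speedup vs xgb_hist
for name, (vs_xgb, vs_hist) in speedup_table.items():
    print(f"{name:16s} {vs_xgb:6.2f}x {vs_hist:6.2f}x")
```

In every row of the table both ratios are above 1, that is, LightGBM has the shortest training time in each experiment listed. The F1 columns show no such uniform ordering and should be read separately.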

Contributing

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.