Fast Retraining

This repo shows how to perform fast retraining with LightGBM in different business cases.

In this repo we compare two of the fastest boosted decision tree libraries: XGBoost and LightGBM. We evaluate them on datasets from several domains and of different sizes.
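
As a rough illustration of the kind of head-to-head measurement the experiments make, the sketch below trains both libraries on the same data and records training time and F1 score. The synthetic dataset and the parameters are made up for the example, not the settings used in the actual experiments:

```python
# A minimal sketch of a head-to-head benchmark of XGBoost vs LightGBM.
# The synthetic dataset and parameters are illustrative only; the real
# experiments use the datasets listed in the benchmark table below.
import time

import lightgbm as lgb
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

results = {}

# Train and time XGBoost.
t0 = time.time()
xgb_model = xgb.XGBClassifier(n_estimators=100, max_depth=6)
xgb_model.fit(X_train, y_train)
results["xgb"] = (time.time() - t0, f1_score(y_test, xgb_model.predict(X_test)))

# Train and time LightGBM on the same data with comparable settings.
t0 = time.time()
lgb_model = lgb.LGBMClassifier(n_estimators=100, num_leaves=63)
lgb_model.fit(X_train, y_train)
results["lgb"] = (time.time() - t0, f1_score(y_test, lgb_model.predict(X_test)))

for name, (seconds, f1) in results.items():
    print(f"{name}: trained in {seconds:.2f}s, F1 = {f1:.3f}")
```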

Installation and Setup

The installation instructions can be found in INSTALL.md.

Project

In the folder experiments you can find the different experiments of the project. We developed five experiments with the CPU versions of the libraries and two with the GPU versions:

  • Airline
  • BCI
  • Football
  • Planet Amazon
  • Fraud Detection
  • Airline (GPU version)
  • HIGGS (GPU version)

The common code for the project is in the folder experiments/libs.
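
The shared helpers are not reproduced here, but a benchmarking project like this one typically factors out utilities such as timing across the experiments. A hypothetical sketch of such a helper (the actual code in experiments/libs may differ):

```python
# Hypothetical example of a shared timing utility for the benchmarks;
# the actual helpers in experiments/libs may look different.
import time
from contextlib import contextmanager

@contextmanager
def timer(name):
    """Measure and print the wall-clock time of a code block."""
    start = time.time()
    yield
    print(f"{name}: {time.time() - start:.2f}s")

# Usage:
# with timer("xgb training"):
#     model.fit(X_train, y_train)
```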

Benchmark

The following table summarizes some results of the benchmarks performed in the experiments:

| Dataset | Experiment | Data size | Features | Time xgb | Time xgb_hist | Time lgb | F1 xgb | F1 xgb_hist | F1 lgb |
|---------|------------|----------:|---------:|---------:|--------------:|---------:|-------:|------------:|-------:|
| Airline | link | 115069017 | 13 | 2180 | 578 | 366 | 0.717 | 0.698 | 0.694 |
| BCI | link | 20497 | 2048 | 15.97 | 52.69 | 6.38 | 0.110 | 0.142 | 0.137 |
| Football | link | 19673 | 46 | 2.27 | 2.47 | 0.582 | 0.458 | 0.460 | 0.587 |
| Planet Amazon | link | 40479 | 2048 | 291.93 | 2238.96 | 78.43 | 0.805 | 0.819 | 0.891 |
| Fraud Detection | link | 284807 | 30 | 4.45 | 1.20 | 0.73 | 0.824 | 0.802 | 0.813 |
| Airline (GPU) | link | 115069017 | 13 | 65.04 | 27.15 | 21.35 | 0.726 | 0.738 | 0.728 |
| HIGGS (GPU) | link | 1000000 | 28 | 107.63 | 51.26 | 28.90 | 0.770 | 0.770 | 0.766 |
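
In the table, xgb refers to XGBoost with exact split finding, xgb_hist to XGBoost with its histogram-based tree method, and lgb to LightGBM, which is histogram-based by design. A minimal sketch of how the three variants can be configured (the parameter values here are illustrative, not the tuned settings from the experiments):

```python
# Illustrative configurations for the three variants in the table.
# Parameter values are placeholders; the per-experiment settings live
# in the notebooks under experiments/.
import lightgbm as lgb
import xgboost as xgb

# xgb: XGBoost with exact greedy split finding.
xgb_exact = xgb.XGBClassifier(tree_method="exact", max_depth=8, n_estimators=100)

# xgb_hist: XGBoost with the faster histogram-based tree method.
xgb_hist = xgb.XGBClassifier(tree_method="hist", max_depth=8, n_estimators=100)

# lgb: LightGBM, histogram-based by design. Setting num_leaves close to
# 2**max_depth keeps model capacity roughly comparable across libraries.
lgb_model = lgb.LGBMClassifier(num_leaves=255, n_estimators=100)
```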

Contributing

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.