A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Belinda Trotta cc7a1e2739 Predefined bin thresholds (#2325)
* Fix bug where small values of max_bin cause crash.

* Revert "Fix bug where small values of max_bin cause crash."

This reverts commit fe5c8e2547.

* Add functionality to force bin thresholds.

* Fix style issues.

* Use stable sort.

* Minor style and doc fixes.

* Change binning behavior to be same as PR #2342.

* Use different bin finding function for predefined bounds.

* Fix style issues.

* Minor refactoring, overload FindBinWithZeroAsOneBin.

* Fix style issues.

* Fix bug and add new test.

* Add warning when using categorical features with forced bins.

* Pass forced_upper_bounds by reference.

* Pass container types by const reference.

* Get categorical features using FeatureBinMapper.

* Fix bug for small max_bin.

* Move GetForcedBins to DatasetLoader.

* Find forced bins in dataset_loader.

* Minor fixes.
2019-09-28 23:31:31 +08:00
.ci [ci] temp fix for missing dependency on Windows (#2440) 2019-09-24 18:20:32 -05:00
.github added JAVA codeowner (#2430) 2019-09-23 23:07:10 +08:00
.nuget [docs] updated Microsoft GitHub URL (#2152) 2019-05-08 13:51:28 +08:00
R-package added new early_stopping param alias (#2431) 2019-09-26 12:07:59 +08:00
compute@36c89134d4 [ci] update CI stuff (#2079) 2019-04-09 10:23:32 +08:00
docker updated docker folder (#2316) 2019-08-13 15:32:39 +03:00
docs Predefined bin thresholds (#2325) 2019-09-28 23:31:31 +08:00
examples Predefined bin thresholds (#2325) 2019-09-28 23:31:31 +08:00
helpers [python] removed unused import and variable (#2213) 2019-06-04 13:43:27 +08:00
include/LightGBM Predefined bin thresholds (#2325) 2019-09-28 23:31:31 +08:00
pmml [docs][python] made OS detection more reliable and little docs improvements (#1414) 2018-06-03 12:46:59 +03:00
python-package [python] avoid data copy where possible (#2383) 2019-09-26 23:37:47 +03:00
src Predefined bin thresholds (#2325) 2019-09-28 23:31:31 +08:00
swig [Java] add missing JNI exception checks to fix warnings (#2405) 2019-09-26 13:01:26 +08:00
tests Predefined bin thresholds (#2325) 2019-09-28 23:31:31 +08:00
windows code refactoring: cost effective gradient boosting (#2407) 2019-09-26 23:35:05 +08:00
.appveyor.yml [ci] temp fix for missing dependency on Windows (#2440) 2019-09-24 18:20:32 -05:00
.editorconfig added editorconfig (#2403) 2019-09-16 14:38:26 +03:00
.gitignore [tests] simplified test and added dumped data to gitignore (#2372) 2019-09-02 18:59:54 +03:00
.gitmodules Initial GPU acceleration support for LightGBM (#368) 2017-04-09 21:53:14 +08:00
.readthedocs.yml [docs][R] added R-package docs generation routines (#2176) 2019-09-01 19:47:04 +03:00
.travis.yml [ci] added cpplint check (#2416) 2019-09-19 16:48:53 +03:00
.vsts-ci.yml [ci][tests] install joblib for test directly (#2374) 2019-09-03 14:08:10 +03:00
CMakeLists.txt fixed minor issues with R package (#2167) 2019-05-12 10:36:27 +02:00
CODE_OF_CONDUCT.md Create CODE_OF_CONDUCT.md (#803) 2017-08-18 19:01:47 +08:00
LICENSE added editorconfig (#2403) 2019-09-16 14:38:26 +03:00
README.md [docs] added reference to leaves Go package (#2379) 2019-09-05 16:22:13 +03:00
VERSION.txt update version number at master branch (#1996) 2019-02-05 19:27:24 +08:00
build_r.R [docs][R] added R-package docs generation routines (#2176) 2019-09-01 19:47:04 +03:00
build_r_site.R [docs][R] added R-package docs generation routines (#2176) 2019-09-01 19:47:04 +03:00

README.md

LightGBM, Light Gradient Boosting Machine


LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed and efficient with the following advantages:

  • Faster training speed and higher efficiency.
  • Lower memory usage.
  • Better accuracy.
  • Support of parallel and GPU learning.
  • Capable of handling large-scale data.

For further details, please refer to Features.
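To make the phrase "gradient boosting with tree based learning algorithms" concrete, here is a minimal pure-Python sketch of the general idea: each round fits a one-split tree (a stump) to the current residuals and adds a shrunken copy of it to the model. This is a toy illustration only, not LightGBM's implementation or API. The names `fit_stump`, `fit_boosted_stumps`, `BoostedModel` and `predict` are hypothetical, and the single feature and squared-error loss are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A stump is one split on a single feature: (threshold, left_value, right_value).
Stump = Tuple[float, float, float]


@dataclass
class BoostedModel:
    base: float            # constant initial prediction (mean of the targets)
    learning_rate: float   # shrinkage applied to every stump
    stumps: List[Stump]


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Return the split that minimises squared error on the residuals."""
    best_error = float("inf")
    best_stump: Stump = (0.0, 0.0, 0.0)
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if error < best_error:
            best_error = error
            best_stump = (threshold, left_mean, right_mean)
    return best_stump


def predict_stump(stump: Stump, x: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs: List[float], ys: List[float],
                       n_rounds: int = 50,
                       learning_rate: float = 0.1) -> BoostedModel:
    """Gradient boosting for squared error with stumps as base learners."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return BoostedModel(base, learning_rate, stumps)


def predict(model: BoostedModel, x: float) -> float:
    return model.base + model.learning_rate * sum(
        predict_stump(s, x) for s in model.stumps)


def mean_squared_error(ys: List[float], preds: List[float]) -> float:
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data: learn y = x^2 on a small grid.
xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]

model = fit_boosted_stumps(xs, ys)
baseline_mse = mean_squared_error(ys, [model.base] * len(ys))
boosted_mse = mean_squared_error(ys, [predict(model, x) for x in xs])
print(f"baseline MSE: {baseline_mse:.3f}, boosted MSE: {boosted_mse:.3f}")
```

The sketch scans every distinct feature value as a candidate split, which is quadratic in the number of rows per round. It is meant only to show the boosting loop, not how an efficient framework finds splits.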

Benefiting from these advantages, LightGBM is widely used in many winning solutions of machine learning competitions.

Comparison experiments on public datasets show that LightGBM can outperform existing boosting frameworks in both efficiency and accuracy, with significantly lower memory consumption. In addition, parallel experiments show that, in specific settings, LightGBM can achieve a linear speed-up when training on multiple machines.

Get Started and Documentation

Our primary documentation is at https://lightgbm.readthedocs.io/ and is generated from this repository. If you are new to LightGBM, follow the installation instructions on that site.

Next you may want to read:

Documentation for contributors:

News

08/15/2017 : Optimal split for categorical features.

07/13/2017 : Gitter is available.

06/20/2017 : Python-package is on PyPI now.

06/09/2017 : LightGBM Slack team is available.

05/03/2017 : LightGBM v2 stable release.

04/10/2017 : LightGBM supports GPU-accelerated tree learning now. Please read our GPU Tutorial and Performance Comparison.

02/20/2017 : Update to LightGBM v2.

02/12/2017 : LightGBM v1 stable release.

01/08/2017 : R-package beta version released. Please try it and provide feedback.

12/05/2016 : Categorical features can be used as input directly (without one-hot encoding).

12/02/2016 : Python-package beta version released. Please try it and provide feedback.

More detailed update logs : Key Events.

External (Unofficial) Repositories

Julia-package: https://github.com/Allardvm/LightGBM.jl

JPMML (Java PMML converter): https://github.com/jpmml/jpmml-lightgbm

Treelite (model compiler for efficient deployment): https://github.com/dmlc/treelite

leaves (Go model applier): https://github.com/dmitryikh/leaves

ONNXMLTools (ONNX converter): https://github.com/onnx/onnxmltools

SHAP (model output explainer): https://github.com/slundberg/shap

MMLSpark (LightGBM on Spark): https://github.com/Azure/mmlspark

ML.NET (.NET/C#-package): https://github.com/dotnet/machinelearning

LightGBM.NET (.NET/C#-package): https://github.com/rca22/LightGBM.Net

Dask-LightGBM (distributed and parallel Python-package): https://github.com/dask/dask-lightgbm

Ruby gem: https://github.com/ankane/lightgbm

Support

How to Contribute

LightGBM has been developed and used by many active community members. Your help is valuable in making it better for everyone.

  • Contribute to the tests to make it more reliable.
  • Contribute to the documentation to make it clearer for everyone.
  • Contribute to the examples to share your experience with other users.
  • Look for issues with tag "help wanted" and submit pull requests to address them.
  • Add your stories and experience to Awesome LightGBM. If LightGBM helped you in a machine learning competition or some research application, we want to hear about it!
  • Open an issue to report problems or recommend new features.

Microsoft Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Reference Papers

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, Tie-Yan Liu. "LightGBM: A Highly Efficient Gradient Boosting Decision Tree". Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 3149-3157.

Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tie-Yan Liu. "A Communication-Efficient Parallel Algorithm for Decision Tree". Advances in Neural Information Processing Systems 29 (NIPS 2016), pp. 1279-1287.

Huan Zhang, Si Si and Cho-Jui Hsieh. "GPU Acceleration for Large-scale Tree Boosting". SysML Conference, 2018.

Note: If you use LightGBM in your GitHub projects, please add lightgbm to your requirements.txt.

License

This project is licensed under the terms of the MIT license. See LICENSE for additional details.