LightGBM/examples/python-guide
James Lamb e0cda880fc
[python-package] remove uses of deprecated NumPy random number generation APIs, require 'numpy>=1.17.0' (#6468)
2024-06-03 20:17:40 -05:00
dask [ci] [python-package] enable ruff-format on tests and examples (#6317) 2024-02-21 12:15:38 -06:00
notebooks [R-package] [python-package] deprecate Dataset arguments to cv() and train() (#6446) 2024-05-10 19:26:39 -05:00
README.md [ci] prevent trailing whitespace, ensure files end with newline (#6373) 2024-03-18 23:24:14 -05:00
advanced_example.py [R-package] [python-package] deprecate Dataset arguments to cv() and train() (#6446) 2024-05-10 19:26:39 -05:00
dataset_from_multi_hdf5.py [ci] [python-package] enable ruff-format on tests and examples (#6317) 2024-02-21 12:15:38 -06:00
logistic_regression.py [python-package] remove uses of deprecated NumPy random number generation APIs, require 'numpy>=1.17.0' (#6468) 2024-06-03 20:17:40 -05:00
plot_example.py [R-package] [python-package] deprecate Dataset arguments to cv() and train() (#6446) 2024-05-10 19:26:39 -05:00
simple_example.py [ci] [python-package] enable ruff-format on tests and examples (#6317) 2024-02-21 12:15:38 -06:00
sklearn_example.py [ci] [python-package] enable ruff-format on tests and examples (#6317) 2024-02-21 12:15:38 -06:00

README.md

Python-package Examples

This folder contains examples of how to use the LightGBM Python-package.

Install the LightGBM Python-package before running them.

Running the examples also requires scikit-learn, pandas, matplotlib (plot example only), and scipy (logistic regression example only). These are not required by the package itself. You can install them with pip:

pip install scikit-learn pandas matplotlib scipy -U
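To confirm that the optional dependencies are available before running an example, a small check along the following lines may help. This is only a sketch: the helper name `missing_packages` and the `OPTIONAL_MODULES` mapping are made up for illustration and are not part of LightGBM or of this folder.

```python
import importlib.util

# Map import names to pip names (scikit-learn is imported as "sklearn").
# This mapping is an illustrative assumption based on the list above.
OPTIONAL_MODULES = {
    "sklearn": "scikit-learn",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scipy": "scipy",
}


def missing_packages(modules=OPTIONAL_MODULES):
    """Return the pip names of modules that cannot be found in this environment."""
    return [
        pip_name
        for module, pip_name in modules.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing optional packages:", ", ".join(missing))
else:
    print("All optional example dependencies are installed.")
```

Anything reported as missing can be installed with the pip command above.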

You can now run any example in this folder, for instance:

python simple_example.py

Examples include:

  • dask/: examples using Dask for distributed training
  • simple_example.py
    • Construct Dataset
    • Basic train and predict
    • Eval during training
    • Early stopping
    • Save model to file
  • sklearn_example.py
    • Create data for learning with sklearn interface
    • Basic train and predict with sklearn interface
    • Feature importances with sklearn interface
    • Self-defined eval metric with sklearn interface
    • Find best parameters for the model with sklearn's GridSearchCV
  • advanced_example.py
    • Construct Dataset
    • Set feature names
    • Directly use categorical features without one-hot encoding
    • Save model to file
    • Dump model to JSON format
    • Get feature names
    • Get feature importances
    • Load model to predict
    • Dump and load model with pickle
    • Load model file to continue training
    • Change learning rates during training
    • Change any parameters during training
    • Self-defined objective function
    • Self-defined eval metric
    • Callback function
  • logistic_regression.py
    • Use objective xentropy or binary
    • Use xentropy with binary labels or probability labels
    • Use binary only with binary labels
    • Compare speed of xentropy versus binary
  • plot_example.py
    • Construct Dataset
    • Train and record eval results for further plotting
    • Plot metrics recorded during training
    • Plot feature importances
    • Plot split value histogram
    • Plot one specified tree
    • Plot one specified tree with Graphviz
  • dataset_from_multi_hdf5.py
    • Construct Dataset from multiple HDF5 files
    • Avoid loading all data into memory
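
The "Change learning rates during training" item can be illustrated with a plain decay function. The function name and the constants below are illustrative assumptions. With LightGBM installed, a function of this shape could be passed to training via `callbacks=[lgb.reset_parameter(learning_rate=decayed_learning_rate)]`, which calls it with the current iteration index.

```python
def decayed_learning_rate(iteration, base=0.05, decay=0.99):
    """Return an exponentially decayed learning rate for a 0-based iteration index."""
    return base * (decay ** iteration)


# Learning rates for the first few boosting rounds.
schedule = [decayed_learning_rate(i) for i in range(5)]
for i, rate in enumerate(schedule):
    print(f"iteration {i}: learning_rate={rate:.6f}")
```

Keeping the schedule as a standalone function makes it easy to inspect the values before using it in training.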