зеркало из https://github.com/microsoft/LightGBM.git
89 строки
3.1 KiB
ReStructuredText
89 строки
3.1 KiB
ReStructuredText
Quick Start
|
|
===========
|
|
|
|
This is a quick start guide for LightGBM CLI version.
|
|
|
|
Follow the `Installation Guide <./Installation-Guide.rst>`__ to install LightGBM first.
|
|
|
|
**List of other helpful links**
|
|
|
|
- `Parameters <./Parameters.rst>`__
|
|
|
|
- `Parameters Tuning <./Parameters-Tuning.rst>`__
|
|
|
|
- `Python-package Quick Start <./Python-Intro.rst>`__
|
|
|
|
- `Python API <./Python-API.rst>`__
|
|
|
|
Training Data Format
|
|
--------------------
|
|
|
|
LightGBM supports input data files with `CSV`_, `TSV`_ and `LibSVM`_ formats.
|
|
|
|
Files could be both with and without `headers <./Parameters.rst#header>`__.
|
|
|
|
`Label column <./Parameters.rst#label_column>`__ could be specified both by index and by name.
|
|
|
|
Some columns could be `ignored <./Parameters.rst#ignore_column>`__.
|
|
|
|
Categorical Feature Support
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
LightGBM can use categorical features directly (without one-hot encoding).
|
|
The experiment on `Expo data`_ shows about 8x speed-up compared with one-hot encoding.
|
|
|
|
For the setting details, please refer to the ``categorical_feature`` `parameter <./Parameters.rst#categorical_feature>`__.
|
|
|
|
Weight and Query/Group Data
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
LightGBM also supports weighted training, it needs an additional `weight data <./Parameters.rst#weight-data>`__.
|
|
And it needs an additional `query data <./Parameters.rst#query-data>`_ for ranking task.
|
|
|
|
Also, `weight <./Parameters.rst#weight_column>`__ and `query <./Parameters.rst#group_column>`__ data could be specified as columns in training data in the same manner as label.
|
|
|
|
Parameters Quick Look
|
|
---------------------
|
|
|
|
The parameters format is ``key1=value1 key2=value2 ...``.
|
|
|
|
Parameters can be set both in config file and command line.
|
|
If one parameter appears in both command line and config file, LightGBM will use the parameter from the command line.
|
|
|
|
The most important parameters which new users should take a look to are located into `Core Parameters <./Parameters.rst#core-parameters>`__
|
|
and the top of `Learning Control Parameters <./Parameters.rst#learning-control-parameters>`__
|
|
sections of the full detailed list of `LightGBM's parameters <./Parameters.rst>`__.
|
|
|
|
Run LightGBM
|
|
------------
|
|
|
|
::
|
|
|
|
"./lightgbm" config=your_config_file other_args ...
|
|
|
|
Parameters can be set both in the config file and command line, and the parameters in command line have higher priority than in the config file.
|
|
For example, the following command line will keep ``num_trees=10`` and ignore the same parameter in the config file.
|
|
|
|
::
|
|
|
|
"./lightgbm" config=train.conf num_trees=10
|
|
|
|
Examples
|
|
--------
|
|
|
|
- `Binary Classification <https://github.com/microsoft/LightGBM/tree/master/examples/binary_classification>`__
|
|
|
|
- `Regression <https://github.com/microsoft/LightGBM/tree/master/examples/regression>`__
|
|
|
|
- `Lambdarank <https://github.com/microsoft/LightGBM/tree/master/examples/lambdarank>`__
|
|
|
|
- `Parallel Learning <https://github.com/microsoft/LightGBM/tree/master/examples/parallel_learning>`__
|
|
|
|
.. _CSV: https://en.wikipedia.org/wiki/Comma-separated_values
|
|
|
|
.. _TSV: https://en.wikipedia.org/wiki/Tab-separated_values
|
|
|
|
.. _LibSVM: https://www.csie.ntu.edu.tw/~cjlin/libsvm/
|
|
|
|
.. _Expo data: http://stat-computing.org/dataexpo/2009/
|