Commit graph

12 commits

Author SHA1 Message Date
Nikita Titov 54facc4d72
[python] rename `print_evaluation()` into `log_evaluation()` (#4604)
* Update __init__.py

* Update Python-API.rst

* Update engine.py

* Update test_utilities.py

* Update sklearn.py

* Update callback.py

* Update callback.py

* Update callback.py
2021-09-16 01:26:02 +03:00
Chen Yufei c359896e9b
[python-package] Create Dataset from multiple data files (#4089)
* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support reading data from multiple input files
3. Read data in batches, so not all data has to be held in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389

* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-07-02 15:17:17 +03:00
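
The three points listed in the commit above (#4089), random access for sampling, multiple input files, and batched reads, can be illustrated with a minimal pure-Python sketch. It does not use LightGBM or its `Sequence` class; `FileRows`, `get_row`, `sample_rows`, and `iter_batches` are hypothetical names introduced only for this illustration, and in-memory lists stand in for files on disk.

```python
import random
from typing import Iterator, List


class FileRows:
    """Hypothetical stand-in for one input file with random access by row index."""

    def __init__(self, rows: List[List[float]], batch_size: int = 2) -> None:
        self.rows = rows
        self.batch_size = batch_size

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx):
        # An int returns one row; a slice returns a batch of rows.
        return self.rows[idx]


def get_row(files: List[FileRows], global_idx: int) -> List[float]:
    """Map an index over the concatenated files to a row in a single file."""
    for f in files:
        if global_idx < len(f):
            return f[global_idx]
        global_idx -= len(f)
    raise IndexError("row index out of range")


def sample_rows(files: List[FileRows], sample_count: int, seed: int = 0) -> List[List[float]]:
    """Pick sample_count rows by random access instead of scanning every row."""
    total = sum(len(f) for f in files)
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(total), min(sample_count, total)))
    return [get_row(files, i) for i in indices]


def iter_batches(files: List[FileRows]) -> Iterator[List[List[float]]]:
    """Yield rows one batch at a time, file by file."""
    for f in files:
        for start in range(0, len(f), f.batch_size):
            yield f[start:start + f.batch_size]


files = [FileRows([[0.0], [1.0], [2.0]]), FileRows([[3.0], [4.0]])]
sampled = sample_rows(files, sample_count=3)
batches = list(iter_batches(files))
```

With real files, `__getitem__` would read only the requested rows from disk, so neither sampling nor batch iteration would need the whole dataset loaded at once.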
James Lamb 84c4b7504a
[docs][dask] add versionadded note to Dask docs (#3935)
2021-02-10 15:19:20 +03:00
Nikita Titov 36322ceeae
[dask][docs] initial setup for Dask docs (#3822)
* initial Dask docs

* fix MRO

* address review comments
2021-01-24 20:58:52 -06:00
Nikita Titov b7ccdaf066
[python] Allow to register custom logger in Python-package (#3820)
* centralize Python-package logging in one place

* continue

* fix test name

* removed unused import

* enhance test

* fix lint

* hotfix test

* workaround for GPU test

* remove custom logger from Dask-package

* replace one log func with flags by multiple funcs
2021-01-24 23:37:45 +03:00
momijiame 1d59a04566
[python] add return_cvbooster flag to cv func and publish _CVBooster (#283,#2105,#1445) (#3204)
* [python] add return_cvbooster flag to cv function and rename _CVBooster to make public (#283,#2105)

* [python] Reduce expected metric of unit testing

* [docs] add the CVBooster to the documentation

* [python] reflect the review comments

- Add some clarifications to the documentation
- Rename CVBooster.append to make private
- Decrease iteration rounds of testing to save CI time
- Use CVBooster as root member of lgb

* [python] add more checks in testing for cv

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* [python] add docstring for instance attributes of CVBooster

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* [python] fix docstring

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2020-08-02 19:03:32 +03:00
Nikita Titov c633c6c2af
[python] Re-enable scikit-learn 0.22+ support (#2949)
* Revert "specify the last supported version of scikit-learn (#2637)"

This reverts commit d100277649.

* ban scikit-learn 0.22.0 and skip broken test

* fix updated test

* fix lint test

* Revert "fix lint test"

This reverts commit 8b4db0805f.
2020-04-10 12:53:21 +09:00
Nikita Titov d100277649
specify the last supported version of scikit-learn (#2637)
2019-12-19 19:00:29 +08:00
Alexander L. Hayes 207bb3ef32
[docs] 🎨 Sphinx Autosummary for generating Python-API documentation (#2286)
* 🎨 `sphinx.ext.autosummary` for generating Python-API summaries

Add `docs/.gitignore` to not track autosummary stubs
Add `sphinx.ext.autosummary` in `docs/conf.py`
  Add 'members' and 'inherited-members' as default parameters
  Add 'autosummary = True' for setting output with `:toctree:`
Add `.. autosummary::` tags to replace `.. autoclass::`

Previously the `Python-API.rst` dumped all of the Python API onto
a single page.

This replaces the Python-API documentation with an index listing
all modules, and paginates all functions and classes onto
separate pages.

* ✏️ Corrections following feedback

Drop `docs/.gitignore` to use the general `.gitignore`
Add `show-inheritance` to `autodoc_default_flags` in `docs/conf.py`
Fix `both` to `class` in `autoclass_content` in `docs/conf.py`

* ✏️ Replacing deprecated Sphinx parameter

Fix deprecated `autodoc_default_flags` to `autodoc_default_options`

* ✏️ Adding `autodoc_default_flags` in to support early Sphinx versions

Add `autodoc_default_flags` with parameters from
  `autodoc_default_options`
2019-07-27 14:59:51 +03:00
Nikita Titov 611cf5d414
[python] added plot_split_value_histogram function (#2043)
* added plot_split_value_histogram function

* updated init module

* added plot split value histogram example

* added plot_split_value_histogram to notebook

* added test

* fixed pylint

* updated API docs

* fixed grammar

* set y ticks to int value in more sufficient way
2019-05-01 23:05:16 +09:00
Nikita Titov cd6d058386
various improvements around metric param and early_stopping_rounds param description (#1589)
* bring consistency and clearness into early_stopping_rounds desc, metric desc and implementation

* hotfix

* hotfix

* used NDCG as default metric for lambdarank task

* fixed missed methods at ReadTheDocs and changed default eval_metric

* left only unique metrics

* fixed comment
2018-08-27 14:46:18 +03:00
Nikita Titov 6d34fb8635
[docs] move wiki to Read the Docs (#945)
* fixed Python-API references

* moved Features section to ReadTheDocs

* fixed index of ReadTheDocs

* moved Experiments section to ReadTheDocs

* fixed capital letter

* fixed citing

* moved Parallel Learning section to ReadTheDocs

* fixed markdown

* fixed Python-API

* fixed link to Quick-Start

* fixed gpu docker README

* moved Installation Guide from wiki to ReadTheDocs

* removed references to wiki

* fixed capital letters in headings

* hotfixes

* fixed non-Unicode symbols and reference to Python API

* fixed citing references

* fixed links in .md files

* fixed links in .rst files

* store images locally in the repo

* fixed missed word

* fixed indent in Experiments.rst

* fixed 'Duplicate implicit target name' message which is successfully
resolved by adding anchors

* less verbose

* prevented mailto: ref creation

* fixed indents

* fixed 404

* fixed 403

* fixed 301

* fixed fake anchors

* fixed file extensions

* fixed Sphinx warnings

* added StrikerRUS profile link to FAQ

* added henry0312 profile link to FAQ
2017-10-07 23:43:58 +08:00