* [dask] raise more informative error for duplicates in 'machines'
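  A minimal sketch of the kind of check this commit refers to, assuming `machines` is the comma-separated `host:port` list accepted by the Dask estimators (the helper name is illustrative, not the actual implementation):

  ```python
  def _check_machines(machines: str) -> None:
      # 'machines' is a comma-separated list of host:port entries;
      # duplicates would make workers collide on the same address.
      addresses = machines.split(",")
      if len(set(addresses)) != len(addresses):
          raise ValueError(
              f"Found duplicates in 'machines' ({machines}). "
              "Each entry must be a unique host:port."
          )
  ```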
* uncomment
* avoid test failure
* Revert "avoid test failure"
This reverts commit 9442bdf00f.
* include multiclass-classification task and task_to_model_factory dicts
* define the centers' coordinates. flatten init_scores within each partition for multiclass-classification
* include issue comment and fix linting error
* include support for init_score
* use dataframe from init_score and test difference with and without init_score in local model
* revert refactoring
* initial docs. test between distributed models with and without init_score
* remove ranker from tests
* test value for root node and change docs
* comma
* re-include parametrize
* fix incorrect merge
* use single init_score and the booster_ attribute
* use np.float64 instead of float
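  The commits above add `init_score` support to the Dask estimators. A rough sketch of the per-partition handling they describe, assuming a multiclass `init_score` arrives as a 2-D `(n_rows, n_classes)` array that must be flattened to a single 1-D array (the helper name and the flattening order are assumptions for illustration):

  ```python
  import numpy as np

  def _flatten_multiclass_init_score(init_score: np.ndarray) -> np.ndarray:
      # Regression/binary init_score is already 1-D; multiclass init_score
      # comes in as (n_rows, n_classes) and is flattened per partition.
      # NOTE: row-major order is assumed here for illustration only.
      if init_score.ndim == 2:
          return init_score.ravel()
      return init_score
  ```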
* [docs] Add alt text to image in Parameters-Tuning.rst
Add alt text to Leaf-wise growth image, as part of #4028
* Update docs/Parameters-Tuning.rst
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* [dask] [ci] add support for scikit-learn 0.24+ in tests (fixes #4031)
* Update tests/python_package_test/test_dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* try upgrading miktexsetup
* they changed the executable name UGH
* more changes for executable name
* another path change
* changing package mirrors
* undo experiments
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* include support for column array as label
* remove nested ifs
* fix linting errors
* include tests for sklearn regressors
* include docstring for numpy_1d_array_to_dtype
* include . at end of docstring
* remove pandas import and test for regression, classification and ranking
* check predictions of sklearn models as well
* test training only in dask. drop pandas series tests
* use PANDAS_INSTALLED and pd_Series
* inline imports
* use col array in fit for test_dask
* include review comments
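  These commits let the scikit-learn wrappers accept a column array (shape `(n, 1)`) as the label. A hedged sketch of that coercion, loosely modeled on the `numpy_1d_array_to_dtype` helper mentioned above (exact name and behavior may differ):

  ```python
  import numpy as np

  def _label_to_1d_float64(y) -> np.ndarray:
      # Accept either a 1-D array or a column vector of shape (n, 1)
      # and return a 1-D float64 array suitable for fit().
      y = np.asarray(y)
      if y.ndim == 2 and y.shape[1] == 1:
          y = y.ravel()
      return y.astype(np.float64, copy=False)
  ```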
* use socket.bind with port 0 and client.run to find random open ports
* include test for found ports
* find random open ports as default
* parametrize local_listen_port. add type hint to _find_random_open_port. find open ports only on workers with data.
* make indentation consistent and pass list of workers to client.run
* remove socket import
* change random port implementation
* fix test
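  A minimal sketch of the port-discovery approach these commits describe: bind to port 0 so the OS assigns a free port, then use `client.run` to do that only on the workers of interest (the second helper name is illustrative):

  ```python
  import socket

  from dask.distributed import Client

  def _find_random_open_port() -> int:
      # Binding to port 0 lets the operating system choose a free port.
      with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
          s.bind(("", 0))
          return s.getsockname()[1]

  def _find_ports_for_workers(client: Client, worker_addresses) -> dict:
      # Run the search on the given workers only (e.g. those holding data);
      # returns {worker_address: open_port}.
      return client.run(_find_random_open_port, workers=list(worker_addresses))
  ```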
* Fix index out-of-range exception generated by BaggingHelper on small datasets.
Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
* Update goss.hpp
* Update goss.hpp
* Add API method LGBM_BoosterPredictForMats, which runs prediction on a data set given as an array of pointers to rows (as opposed to the existing method LGBM_BoosterPredictForMat, which requires data given as a contiguous array)
* Fix incorrect upstream merge
* Add link to LightGBM.NET
* Fix indenting to 2 spaces
* Dummy edit to trigger CI
* Dummy edit to trigger CI
* remove duplicate functions from merge
* Fix evaluation of linear trees with a single leaf.
Note that trees without linear models at the leaves always handle num_leaves = 1 as a special case and directly output the leaf value. Linear trees were missing this special-case handling, and hence would have the following issues:
* Calling Tree::Predict or Tree::PredictByMap would cause an access violation exception attempting to access the first value of the empty split_feature_ array in GetLeaf.
* PredictionFunLinear would either cause an access violation or go into an infinite loop when attempting to do the equivalent of GetLeaf.
Note also that PredictionFun does not need the same changes as PredictionFunLinear, since both are only called by Tree::AddPredictionToScore, which has a special case for (!is_linear_ && num_leaves_ <= 1) that precludes calling PredictionFun.
Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
* In the Tree::ToString() method, print double values for linear tree models with high precision, so that the tree can be accurately reproduced elsewhere (LightGBM.Net in particular)
* Need to use the more precise StringToArray instead of StringToArrayFast when parsing double-valued arrays for linear trees, to ensure models round-trip via string or file correctly.
Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
* Fix for the CreatePredictor function: in VS2017 Debug builds, the previous version would end up with an uninitialised prediction function that threw access violation exceptions when invoked.
Co-authored-by: matthew-peacock <matthew.peacock@whiteoakam.com>
Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
Approximately 80% of the runtime when loading "low column count, high row
count" DataFrames into Datasets is consumed in `np.fromiter`, called
as part of the `Dataset.get_field` method.
This is a particularly pernicious hotspot: unlike other ctypes-based
methods, it is a hot loop over a Python iterator and causes
significant GIL contention in multi-threaded applications.
Replace `np.fromiter` with a direct call to `np.ctypeslib.as_array`,
which allows a single-shot `copy` of the underlying array.
This reduces the load time of a ~35 million row categorical dataframe
with 1 column from ~5 seconds to ~1 second, and allows multi-threaded
execution.
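A sketch of the change being described, assuming `cptr` is a ctypes pointer to `length` doubles obtained from the C API (simplified; the function names here are illustrative and the real code also handles other dtypes):

```python
import numpy as np

# Before: np.fromiter walks the ctypes pointer element by element in a
# Python-level loop, holding the GIL for the entire copy.
def _double_pointer_to_numpy_slow(cptr, length):
    return np.fromiter(cptr, dtype=np.float64, count=length)

# After: view the C buffer as an ndarray and copy it in a single shot.
def _double_pointer_to_numpy_fast(cptr, length):
    return np.ctypeslib.as_array(cptr, shape=(length,)).copy()
```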