Commit graph

616 commits

Author SHA1 Message Date
Thomas J. Fan e4658e1aa8
[docs] Add scikit-learn for intersphinx (fixes #5954) (#5956) 2023-07-03 21:55:25 -05:00
James Lamb 3df9698bf3
[docs] more fixes for broken links (#5941) 2023-06-23 14:00:11 -05:00
James Lamb 22c09f90cf
[docs] fix broken links in docs (#5939) 2023-06-23 10:34:49 -05:00
James Lamb 1c8a7abdea
[c++] support building with Ninja on Linux (#5910) 2023-06-08 21:06:19 -05:00
Jacek Laskowski ad487fee57
[docs] Update Quick Start (#5900) 2023-05-31 20:02:20 -05:00
James Lamb e75352cf2c
[docs] recommend a more efficient source installation of the python package (#5881) 2023-05-10 14:36:04 -05:00
shiyu1994 17ecfab335
Add quantized training (CPU part) (#5800)
* add quantized training (first stage)

* add histogram construction functions for integer gradients

* add stochastic rounding

* update docs

* fix compilation errors by adding template instantiations

* update files for compilation

* fix compilation of gpu version

* initialize gradient discretizer before share states

* add a test case for quantized training

* add quantized training for data distributed training

* Delete origin.pred

* Delete ifelse.pred

* Delete LightGBM_model.txt

* remove useless changes

* fix lint error

* remove debug loggings

* fix mismatch of vector and allocator types

* remove changes in main.cpp

* fix bugs with uninitialized gradient discretizer

* initialize ordered gradients in gradient discretizer

* disable quantized training with gpu and cuda

fix msvc compilation errors and warnings

* fix bug in data parallel tree learner

* make quantized training test deterministic

* make quantized training in test case more accurate

* refactor test_quantized_training

* fix leaf splits initialization with quantized training

* check distributed quantized training result
2023-05-05 16:41:48 +08:00
James Lamb a97c444b4c
[ci] [python-package] replace 'python setup.py' with a shell script (#5837) 2023-05-04 17:06:11 -05:00
Mohamed Ziada 771bad8cca
[docs] adjusted bagging_freq parameter description (#5698) 2023-02-12 18:27:56 -06:00
James Lamb 4f47547c88
[CUDA] consolidate CUDA versions (#5677)
* [ci] speed up if-else, swig, and lint conda setup

* add 'source activate'

* python constraint

* start removing cuda v1

* comment out CI

* remove more references

* revert some unnecessary changes

* revert a few more mistakes

* revert another change that ignored params

* sigh

* remove CUDATreeLearner

* fix tests, docs

* fix quoting in setup.py

* restore all CI

* Apply suggestions from code review

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Apply suggestions from code review

* completely remove cuda_exp, update docs

---------

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
2023-02-01 11:27:52 +08:00
James Lamb 5ffd757119
[docs] remove unnecessary interactivity in 'docker run' instructions (#5692) 2023-01-31 13:11:07 -06:00
James Lamb 3c3f79e771
[docs] encourage use of releases for GUI-only installations (#5649) 2023-01-16 22:08:52 -06:00
James Lamb 02d212b4c0
[ci] [python-package] enforce flake8 checks (fixes #5566) (#5659) 2023-01-12 15:59:52 -06:00
Yifei Liu fffd066cb3
Decouple Boosting Types (fixes #3128) (#4827)
* add parameter data_sample_strategy

* abstract GOSS as a sample strategy (GOSS1), together with original GOSS (Normal Bagging has not been abstracted, so do NOT use it now)

* abstract Bagging as a subclass (BAGGING), but original Bagging members in GBDT are still kept

* fix some variables

* remove GOSS(as boost) and Bagging logic in GBDT

* rename GOSS1 to GOSS(as sample strategy)

* add warning about use GOSS as boosting_type

* a little ; bug

* remove CHECK when "gradients != nullptr"

* rename DataSampleStrategy to avoid confusion

* remove and add some comments, following convention

* fix bug about GBDT::ResetConfig (ObjectiveFunction inconsistency bet…

* add std::ignore to avoid compiler warnings (anpotential fails)

* update Makevars and vcxproj

* handle constant hessian

move resize of gradient vectors out of sample strategy

* mark override for IsHessianChange

* fix lint errors

* rerun parameter_generator.py

* update config_auto.cpp

* delete redundant blank line

* update num_data_ when train_data_ is updated

set gradients and hessians when GOSS

* check bagging_freq is not zero

* reset config_ value

merge ResetBaggingConfig and ResetGOSS

* remove useless check

* add tests in test_engine.py

* remove whitespace in blank line

* remove arguments verbose_eval and evals_result

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

reduce num_boost_round

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update src/boosting/sample_strategy.cpp

modify warning about setting goss as `boosting_type`

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update tests/python_package_test/test_engine.py

replace load_boston() with make_regression()

remove value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Update tests/python_package_test/test_engine.py

add value checks of mean_squared_error in test_sample_strategy_with_boosting()

* Modify warning about using goss as boosting type

* Update tests/python_package_test/test_engine.py

add random_state=42 for make_regression()

reduce the threshold of mean_square_error

* Update src/boosting/sample_strategy.cpp

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* remove goss from boosting types in documentation

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/bagging.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update src/boosting/goss.hpp

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* rename GOSS with GOSSStrategy

* update doc

* address comments

* fix table in doc

* Update include/LightGBM/config.h

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update documentation

* update test case

* revert useless change in test_engine.py

* add tests for evaluation results in test_sample_strategy_with_boosting

* include <string>

* change to assert_allclose in test_goss_boosting_and_strategy_equivalent

* more tolerance in result checking, due to minor difference in results of gpu versions

* change == to np.testing.assert_allclose

* fix test case

* set gpu_use_dp to true

* change --report to --report-level for rstcheck

* use gpu_use_dp=true in test_goss_boosting_and_strategy_equivalent

* revert unexpected changes of non-ascii characters

* revert unexpected changes of non-ascii characters

* remove useless changes

* allocate gradients_pointer_ and hessians_pointer when necessary

* add spaces

* remove redundant virtual

* include <LightGBM/utils/log.h> for USE_CUDA

* check for  in test_goss_boosting_and_strategy_equivalent

* check for identity in test_sample_strategy_with_boosting

* remove cuda  option in test_sample_strategy_with_boosting

* Update tests/python_package_test/test_engine.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update tests/python_package_test/test_engine.py

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after ResetSampleConfig

* ResetGradientBuffers after bagging

* remove useless code

* check objective_function_ instead of gradients

* enable rf with goss

simplify params in test cases

* remove useless changes

* allow rf with feature subsampling alone

* change position of ResetGradientBuffers

* check for dask

* add parameter types for data_sample_strategy

Co-authored-by: Guangda Liu <v-guangdaliu@microsoft.com>
Co-authored-by: Yu Shi <shiyu_k1994@qq.com>
Co-authored-by: GuangdaLiu <90019144+GuangdaLiu@users.noreply.github.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-12-28 14:09:11 +08:00
James Lamb 61ef3ada68
[ci] switch to manylinux_2_28 for Linux artifacts (fixes #5514, fixes #5589) (#5580) 2022-11-20 23:13:38 -06:00
Jianting Feng 9dae0e6d91
[docs] fix a typo Features.rst (#5463)
fix a typo in docs/Features.rst
2022-09-02 22:20:46 -05:00
Nikita Titov 5fceb4a156
[docs] Fix links (#5451)
* Update Parallel-Learning-Guide.rst

* Update GPU-Windows.rst
2022-08-30 16:32:38 +03:00
James Lamb f12d465101
[docs] [R-package] upgrade to roxygen2 7.2.1 (#5381) 2022-07-28 09:13:12 -05:00
James Lamb 8d12dca136
[ci] make check-docs job compatible with rstcheck 6.x (#5388) 2022-07-27 23:23:44 -05:00
James Lamb feb4cf4f47
[docs] set language = 'en' in Sphinx config (#5322) 2022-06-22 11:54:09 -05:00
Nikita Titov ab9236baa0
[docs] Use https links in docs (#5284)
Use https links in docs
2022-06-13 21:57:13 +03:00
Jonathan Giannuzzi 6b89651de1
[ci] Run Linux OpenCL tests against POCL instead of the AMD App SDK (#5282)
* Run OpenCL tests against POCL instead of the AMD App SDK

* Update .ci/setup.sh

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Run Linux gpu source on default Python version

* [docs] Update GPU Targets Table

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-06-13 19:58:34 +03:00
James Lamb 1cc9f9dcee
[R-package] [docs] upgrade docs to roxygen2==7.2.0 (#5251)
[R-package] upgrade docs to roxygen2==7.2.0
2022-06-04 23:42:35 -05:00
Nikita Titov f715645725
[docs] Update broken links in docs (#5265)
Update GPU-Windows.rst
2022-06-04 19:37:21 -05:00
James Lamb eababef8bc
[R-package] remove lgb.unloader() (#5204)
* [R-package] remove lgb.unloader()

* update version reference
2022-05-10 17:20:37 -05:00
Nikita Titov a52d9c3097
[ci] Update version of Azure REST API (#5172) 2022-04-24 10:44:40 -05:00
Samuel Wilson fc0c8fd452
[docs] Fix formula in path smoothing docs (fixes #5139) (#5154) 2022-04-15 09:12:15 -05:00
James Lamb b462d0a40f
[ci] update to R 4.1.3 and use macOS-latest for R jobs (fixes #4990) (#5137)
* [ci] update to R 4.1.3 and use macOS-latest for R jobs (fixes #4990)

* update Windows version

* update docs env

* simplify r-package config
2022-04-09 22:57:19 -05:00
Pablo Dávila Herrero 5f57d6c673
[docs] Document behaviour of the first linear estimator (#5132)
* Document behaviour of the first linear estimator

* Properly update docs

Co-authored-by: Pablo-Davila <Pablo-Davila@users.noreply.github.com>
2022-04-10 03:10:17 +03:00
shiyu1994 17d4e00785
Load initial scores with binary data files in CLI version (#4807) 2022-03-26 22:21:04 +03:00
shiyu1994 6b56a90cd1
[CUDA] New CUDA version Part 1 (#4630)
* new cuda framework

* add histogram construction kernel

* before removing multi-gpu

* new cuda framework

* tree learner cuda kernels

* single tree framework ready

* single tree training framework

* remove comments

* boosting with cuda

* optimize for best split find

* data split

* move boosting into cuda

* parallel synchronize best split point

* merge split data kernels

* before code refactor

* use tasks instead of features as units for split finding

* refactor cuda best split finder

* fix configuration error with small leaves in data split

* skip histogram construction of too small leaf

* skip split finding of invalid leaves

stop when no leaf to split

* support row wise with CUDA

* copy data for split by column

* copy data from host to CPU by column for data partition

* add synchronize best splits for one leaf from multiple blocks

* partition dense row data

* fix sync best split from task blocks

* add support for sparse row wise for CUDA

* remove useless code

* add l2 regression objective

* sparse multi value bin enabled for CUDA

* fix cuda ranking objective

* support for number of items <= 2048 per query

* speedup histogram construction by interleaving global memory access

* split optimization

* add cuda tree predictor

* remove comma

* refactor objective and score updater

* before use struct

* use structure for split information

* use structure for leaf splits

* return CUDASplitInfo directly after finding best split

* split with CUDATree directly

* use cuda row data in cuda histogram constructor

* clean src/treelearner/cuda

* gather shared cuda device functions

* put shared CUDA functions into header file

* change smaller leaf from <= back to < for consistent result with CPU

* add tree predictor

* remove useless cuda_tree_predictor

* predict on CUDA with pipeline

* add global sort algorithms

* add global argsort for queries with many items in ranking tasks

* remove limitation of maximum number of items per query in ranking

* add cuda metrics

* fix CUDA AUC

* remove debug code

* add regression metrics

* remove useless file

* don't use mask in shuffle reduce

* add more regression objectives

* fix cuda mape loss

add cuda xentropy loss

* use template for different versions of BitonicArgSortDevice

* add multiclass metrics

* add ndcg metric

* fix cross entropy objectives and metrics

* fix cross entropy and ndcg metrics

* add support for customized objective in CUDA

* complete multiclass ova for CUDA

* separate cuda tree learner

* use shuffle based prefix sum

* clean up cuda_algorithms.hpp

* add copy subset on CUDA

* add bagging for CUDA

* clean up code

* copy gradients from host to device

* support bagging without using subset

* add support of bagging with subset for CUDAColumnData

* add support of bagging with subset for dense CUDARowData

* refactor copy sparse subrow

* use copy subset for column subset

* add reset train data and reset config for CUDA tree learner

add deconstructors for cuda tree learner

* add USE_CUDA ifdef to cuda tree learner files

* check that dataset doesn't contain CUDA tree learner

* remove printf debug information

* use full new cuda tree learner only when using single GPU

* disable all CUDA code when using CPU version

* recover main.cpp

* add cpp files for multi value bins

* update LightGBM.vcxproj

* update LightGBM.vcxproj

fix lint errors

* fix lint errors

* fix lint errors

* update Makevars

fix lint errors

* fix the case with 0 feature and 0 bin

fix split finding for invalid leaves

create cuda column data when loaded from bin file

* fix lint errors

hide GetRowWiseData when cuda is not used

* recover default device type to cpu

* fix na_as_missing case

fix cuda feature meta information

* fix UpdateDataIndexToLeafIndexKernel

* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore

* add refit by tree for cuda tree learner

* fix test_refit in test_engine.py

* create set of large bin partitions in CUDARowData

* add histogram construction for columns with a large number of bins

* add find best split for categorical features on CUDA

* add bitvectors for categorical split

* cuda data partition split for categorical features

* fix split tree with categorical feature

* fix categorical feature splits

* refactor cuda_data_partition.cu with multi-level templates

* refactor CUDABestSplitFinder by grouping task information into struct

* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder

* fix misuse of reference

* remove useless changes

* add support for path smoothing

* virtual destructor for LightGBM::Tree

* fix overlapped cat threshold in best split infos

* reset histogram pointers in data partition and split finder in ResetConfig

* comment useless parameter

* fix reverse case when na is missing and default bin is zero

* fix mfb_is_na and mfb_is_zero and is_single_feature_column

* remove debug log

* fix cat_l2 when one-hot

fix gradient copy when data subset is used

* switch shared histogram size according to CUDA version

* gpu_use_dp=true when cuda test

* revert modification in config.h

* fix setting of gpu_use_dp=true in .ci/test.sh

* fix linter errors

* fix linter error

remove useless change

* recover main.cpp

* separate cuda_exp and cuda

* fix ci bash scripts

add description for cuda_exp

* add USE_CUDA_EXP flag

* switch off USE_CUDA_EXP

* revert changes in python-packages

* more careful separation for USE_CUDA_EXP

* fix CUDARowData::DivideCUDAFeatureGroups

fix set fields for cuda metadata

* revert config.h

* fix test settings for cuda experimental version

* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version

* fix lint issue by adding a blank line

* fix lint errors by resorting imports

* fix lint errors by resorting imports

* fix lint errors by resorting imports

* merge cuda.yml and cuda_exp.yml

* update python version in cuda.yml

* remove cuda_exp.yml

* remove unrelated changes

* fix compilation warnings

fix cuda exp ci task name

* recover task

* use multi-level template in histogram construction

check split only in debug mode

* ignore NVCC related lines in parameter_generator.py

* update job name for CUDA tests

* apply review suggestions

* Update .github/workflows/cuda.yml

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update .github/workflows/cuda.yml

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update header

* remove useless TODOs

* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062

* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only

* fix include order

* fix include order

* remove extra space

* address review comments

* add warning when cuda_exp is used together with deterministic

* add comment about gpu_use_dp in .ci/test.sh

* revert changing order of included headers

Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-03-23 10:39:23 +08:00
Nikita Titov 8e721c564d
[docs] Improve rendering of class properties in docs (fixes #5073) 2022-03-17 19:43:57 -05:00
Nikita Titov 0ef8fe0d4a
[docs] update categorical feature description in Advanced Topics (#5044) 2022-03-05 04:36:11 +03:00
Miguel Trejo Marrufo 6ced58ad9a
[Docs] Weights non-negative for train data (#5013)
* docs: weight parameter non-negative

* docs: weights non negative only for train data

* docs: weights should be non negative for validation data

* typo in html render

* docs: brief weights non-negative description
2022-02-23 13:40:58 +08:00
Nikita Titov caa087bcc5
[docs] improve docs for sklearn wrapper (fixes #4479) (#5026)
* improve docs for sklearn wrapper

* empty commit

* install add scikit-learn to conda environment for building docs
2022-02-22 21:49:39 -06:00
José Morales 820ae7e65e
[docs] clarify that categorical features will be converted to integers internally (#4959)
* clarify that categoricals will be converted to ints and not that they should be ints in the input data

* update remaining sections

* update config.h

* add suggestions
2022-02-20 23:30:49 +08:00
James Lamb 2f27d4b226
[ci][docs] use miniforge for readthedocs builds (fixes #4954) (#4957)
* [ci] [docs] use mamba for readthedocs builds (fixes #4954)

* update docs

* simplify build script and add docs flag to gitignore

* exit with non-0 if build fails

* update CI job

* add doxygen

* remove outdated requirement_base.txt reference

* use conda create instead of conda env create

* fix conda create flags

* add nodefaults to env.yml

* Update docs/README.rst

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* try to fix check-docs CI job

* additional changes

* switch from mamba to miniforge

* simplify docker command and fix issues in local build script

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update docs and conda

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-02-19 06:29:56 +03:00
Nikita Titov 057ba07801
[docs] document rounding behavior of floating point numbers in categorical features (#5009) 2022-02-17 01:28:54 +03:00
Yu Shi 2d1caf140a document rounding behavior of floating point numbers in categorical features 2022-02-14 16:23:28 +00:00
Nikita Titov 49a10c3222
[docs] document `conda-forge` channel preference over `default` one and describe possible workaround for OpenMP conflicts in FAQ (#4994)
* Update README.rst

* Update FAQ.rst

* Python FAQ entry about conda-forge

* fix syntax
2022-02-12 03:50:39 +03:00
James Lamb a06fadfb7a
[dask] add support for custom objective functions (fixes #3934) (#4920)
* add test for custom objective with regressor

* add test for custom binary classification objective with classifier

* isort

* got tests working for multiclass

* update docs

* train deeper model for classifier

* Apply suggestions from code review

Co-authored-by: José Morales <jmoralz92@gmail.com>

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update multiclass tests

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix multiclass probabilities

* linting

Co-authored-by: José Morales <jmoralz92@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-01-17 23:30:26 +03:00
Nikita Titov 4aaeb22932
[docs] minor docs improvements (#4938)
* Update README.rst

* Update FAQ.rst
2022-01-09 16:58:00 +03:00
Nikita Titov ce486e5b45
[python] remove `early_stopping_rounds` argument of `train()` and `cv()` functions (#4908) 2021-12-26 17:20:49 +03:00
Nikita Titov 90a71b9403
Add support for Visual Studio 2022 (#4889)
* Update .vsts-ci.yml

* Update .vsts-ci.yml

* Update Installation-Guide.rst

* Update install.libs.R

* Update setup.py

* Update r_package.yml

* Update install.libs.R
2021-12-16 00:28:26 +03:00
James Lamb b0327574fe
[R-package] remove Dataset `setinfo()` (#4854)
* [R-package] remove Dataset setinfo()

* revert unintended docs changes

* fix examples

* revert FAQ changes

* empty commit
2021-12-07 02:23:11 +03:00
Nikita Titov 67b4205c80
[docs] document that `pred_early_stop` can be used only in normal and raw scores prediction (#4823) 2021-11-28 20:06:39 -06:00
Nikita Titov 2f5d898520
[docs][python] simplify mocking in docs (#4830) 2021-11-28 16:50:56 +03:00
James Lamb 5fa887bb79
[R-package] [docs] add intro vignette (#3946) (#4775)
* [R-package] [docs] add intro vignette (#3946)

* add 10 test vignettes

* Revert "add 10 test vignettes"

This reverts commit 40fb2e2f19.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Michael Mayer <mayermichael79@gmail.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-11-17 22:15:32 -06:00
chjinche b0137debe6
Add customized parser support (#4782)
* add customized parser support

* fix typo of parser_config_file description

* make delimiter as parameter of JoinedLines
2021-11-16 14:27:23 +08:00
Nikita Titov 6e6fb14cdf
[docs] fix broken SynapseML link (#4795) 2021-11-13 06:14:16 +03:00
Michael Mahoney 3b6ebd794b
Add 'nrounds' as an alias for 'num_iterations' (fixes #4743) (#4746)
* Add 'nrounds' as an alias for 'num_iterations'

* Improve tests

* Compare against nrounds directly

* Fix whitespace lints
2021-11-11 12:12:17 +08:00
James Lamb 99f0f3ecf1
[ci] only use conda-forge when installing R packages in docs builds (#4767) 2021-11-05 16:56:33 +03:00
Nikita Titov da98f24711
[docs][python] add `datatable` to the mocked modules during docs building process and sort them alphabetically (#4750) 2021-10-31 20:28:44 -05:00
Nikita Titov dac0dffeb2
[docs] improve docs about `nthreads` parameter (#4756)
* in predict(), respect params set via `set_params()` after fit()

* extract docs changes
2021-10-31 02:31:48 +03:00
James Lamb 585b86a370
[docs] fix broken SynapseML link (#4712)
* [docs] fix broken SynapseML link

* just remove link
2021-10-28 01:40:04 +03:00
shiyu1994 717f037c3a
[docs] Add Tong Wu and Zhiyuan He as code owners (#4717)
* add hzy46 and tongwu-msft as code owners

* fix github link for Zhiyuan He
2021-10-26 16:01:06 +08:00
Nikita Titov df12c1b955
[docs] fix R API link to point to the current version of docs (#4691)
* fix R API link to point to the current version of docs
2021-10-24 22:28:29 -05:00
Zhiyuan He dc02dcaf02
Fix some parameter hints when loading from binary file (#4701)
Co-authored-by: hzy46 <email@example.com>
2021-10-25 09:48:02 +08:00
Nikita Titov d88b44566e
[docs] fix C API docs rendering (#4688)
* fix C API docs rendering

* place comments before members they describe
2021-10-22 02:21:05 +03:00
Nikita Titov e95d5ab849
add param aliases from scikit-learn (#4637) 2021-10-05 10:45:29 +08:00
Nikita Titov b5557b6b98
[docs] add Mars to docs (#4616)
* Update README.md

* Update Parallel-Learning-Guide.rst

* Update README.md

* Update docs/Parallel-Learning-Guide.rst

Co-authored-by: James Lamb <jaylamb20@gmail.com>

Co-authored-by: James Lamb <jaylamb20@gmail.com>
2021-09-21 21:55:25 +03:00
Nikita Titov d10f9d43c4
[docs] update link to MinGW-w64 site (#4606)
* Update README.rst

* Update README.md

* Update Installation-Guide.rst
2021-09-17 21:50:14 +03:00
Nikita Titov 54facc4d72
[python] rename `print_evaluation()` into `log_evaluation()` (#4604)
* Update __init__.py

* Update Python-API.rst

* Update engine.py

* Update test_utilities.py

* Update sklearn.py

* Update callback.py

* Update callback.py

* Update callback.py
2021-09-16 01:26:02 +03:00
James Lamb fa1d06f136
[docs] add lightgbm_ray to docs (#4584)
* [docs] add lightgbm_ray to docs

* add docs link

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* move Ray

* Update docs/Parallel-Learning-Guide.rst

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>
2021-09-09 19:10:03 +03:00
Nikita Titov ba8533c1ee
[docs] add José Morales to repo maintainers (#4563)
* Update CODEOWNERS

* Update FAQ.rst
2021-08-28 12:58:03 -05:00
Nikita Titov bdee57c0aa
[docs] update links to SynapseML (former MMLSpark) (#4564)
* Update README.md

* Update Parallel-Learning-Guide.rst
2021-08-28 00:31:49 -05:00
James Lamb 417ba19217
[docs] Clarify the fact that predict() on a file does not support saved Datasets (fixes #4034) (#4545)
* documentation changes

* add list of supported formats to error message

* add unit tests

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* update per review comments

* make references consistent

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-08-24 21:33:13 -05:00
James Lamb 67f2cb3162
[python] add type hints in docs/conf.py (#4526)
* [python] add type hints in docs/conf.py

* more specific hint for sphinx app

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-08-18 20:31:59 -05:00
Nikita Titov 521a5c47e3
[docs] Add notes in installation guide, including ones about OpenMP (#4520) 2021-08-17 21:16:27 +03:00
James Lamb 3c781ba08c
[docs] [R-package] use CRAN-style builds when building pkgdown site (#4513)
* [docs] [R-package] use CRAN-style builds when building pkgdown site

* install with --with-keep.source

* empty commit

* set new_proccess = FALSE to get a better traceback

* copy pkgdown config
2021-08-14 15:39:12 +03:00
Nikita Titov 2370961ae0
[docs][ci] bump versions of R-package dependencies at RTD (#4488)
* remove R docs

* bump deps
2021-07-26 19:41:33 +03:00
James Lamb fdc582ea6b
[docs] document CLI behavior when label_column is omitted (#4485) 2021-07-24 23:05:48 -05:00
Nikita Titov 96583ab589
[python] migrate to pathlib in setup.py and use `absolute()` on paths first (#4444)
* use absolute() on paths first

* migrate to pathlib in setup.py
2021-07-10 16:18:50 +03:00
Nikita Titov 0d1d12fb46
[docs] clarify description of prediction early stopping (#4411) 2021-07-09 17:22:09 +03:00
Nikita Titov 02ca158fe0
[python] migrate to pathlib in conf.py (#4427) 2021-07-03 00:24:00 +03:00
Chen Yufei c359896e9b
[python-package] Create Dataset from multiple data files (#4089)
* [python-package] create Dataset from sampled data.

* [python-package] create Dataset from List[Sequence].

1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory

* [python-package] example: create Dataset from multiple HDF5 file.

* fix: revert is_class implementation for seq

* fix: unwanted memory view reference for seq

* fix: seq is_class accepts sklearn matrices

* fix: requirements for example

* fix: pycode

* feat: print static code linting stage

* fix: linting: avoid shell str regex conversion

* code style: doc style

* code style: isort

* fix ci dependency: h5py on windows

* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623

* docs(python): init_from_sample summary

https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389

* remove dataset dump sample data debugging code.

* remove typo fix.

Create separate PR for this.

* fix typo in src/c_api.cpp

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* style(linting): py3 type hint for seq

* test(basic): os.path style path handling

* Revert "feat: print static code linting stage"

This reverts commit 10bd79f7f8.

* feat(python): sequence on validation set

* minor(python): comment

* minor(python): test option hint

* style(python): fix code linting

* style(python): add pydoc for ref_dataset

* doc(python): sequence

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* revert(python): sequence class abc

* chore(python): remove rm_files

* Remove useless static_assert.

* refactor: test_basic test for sequence.

* fix lint complaint.

* remove dataset._dump_text in sequence test.

* Fix reverting typo fix.

* Apply suggestions from code review

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Fix type hint, code and doc style.

* fix failing test_basic.

* Remove TODO about keep constant in sync with cpp.

* Install h5py only when running python-examples.

* Fix lint complaint.

* Apply suggestions from code review

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Doc fixes, remove unused params_str in __init_from_seqs.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Remove unnecessary conda install in windows ci script.

* Keep param as example in dataset_from_multi_hdf5.py

* Add _get_sample_count function to remove code duplication.

* Use batch_size parameter in generate_hdf.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Fix after applying suggestions.

* Fix test, check idx is instance of numbers.Integral.

* Update python-package/lightgbm/basic.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Expose Sequence class in Python-API doc.

* Handle Sequence object not having batch_size.

* Fix isort lint complaint.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docstring to mention Sequence as data input.

* Remove get_one_line in test_basic.py

* Make Sequence an abstract class.

* Reduce number of tests for test_sequence.

* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.

* empty commit to trigger ci

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.

Also rename total_nrow to num_total_row in c_api.h for consistency.

* Doc about Sequence in docs/Python-Intro.rst.

* Fix: basic.py change LGBM_SampleIndices out_len to int32.

* Add create_valid test case with Dataset from Sequence.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Apply suggestions from code review

Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>

* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.

* Update python-package/lightgbm/basic.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-07-02 15:17:17 +03:00
Nikita Titov aab8fc18a2
fix param aliases (#4387) 2021-06-26 15:07:37 +03:00
Nikita Titov 113b44d7e5
[docs] update link to LightGBM example in MMLSpark repo (#4401) 2021-06-24 22:16:43 +03:00
Nikita Titov df79713725
[docs] document sanitizers (#4365)
* document sanitizers

* rephrase
2021-06-14 22:15:38 +03:00
Nikita Titov 57fc2f9847
add anchor for nightly builds in docs (#4366) 2021-06-11 09:32:13 -05:00
James Lamb 24ac920879
[docs] document how to pass multi-value params from Python and R (fixes #4345) (#4346)
* [R-package] add docs and tests on monotone constraints (fixes #4345)

* remove tests

* move doc to top level

* slightly more specific

* Update docs/Parameters.rst

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-06-09 20:07:50 +03:00
Nikita Titov 28c3c45d3b
[docs] make building of C++ tests section collapsible (#4340) 2021-06-04 21:44:43 +03:00
James Lamb c34b7ffd3b
[docs] replace broken mmlspark notebook link in docs (#4303) 2021-05-20 14:30:15 +03:00
Nikita Titov 0a172d9e3b
[ci][docs] Unpin Sphinx version (#4277)
This reverts commit a421217e4e.
2021-05-12 00:27:03 +03:00
Nikita Titov a421217e4e
[ci][docs] Restrict Sphinx version (#4267) 2021-05-09 17:04:40 +03:00
Chen Yufei f83180883a
Precise text file parsing (#4081)
* New build option: USE_PRECISE_TEXT_PARSER.

Use fast_double_parser for text file parsing. For each number, fall back
to strtod in case of parse failure.

* Add benchmark for CSVParser with Atof and AtofPrecise.

* Fix lint complaint.

* Fix typo in open result error message.

* Revert "Fix lint complaint."

This reverts commit 92ab0b6bce9f17d7be9eaeb20f19d4a0a36f0387.

* Revert "Add benchmark for CSVParser with Atof and AtofPrecise."

This reverts commit 4f8639abd06c679d4382eb715a1793afd94df3d2.

* Use AtofPrecise in Common::__StringToTHelper.

* [option] precise_float_parser: precise float number parsing for text input.

* Remove USE_PRECISE_TEXT_PARSER compile option.

* test: add test for Common::AtofPrecise.

* test: remove ChunkedArrayTest with 0 length.

This triggers Log::Fatal which aborts the test program.

* fix lint, add copyright.

* Revert "test: remove ChunkedArrayTest with 0 length."

This reverts commit 346c76affe9e78b6ca2738c4a56dbb9c00f31102.

* Use LightGBM::Common::Sign

* save precise_float_parser in model file.

* Fix error checking in AtofPrecise. Add more test cases.

* Remove test case that can't pass under macOS.

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2021-05-07 11:00:48 +08:00
Andrew Ziem e79716e0b6
Correct spelling (#4250)
* Correct spelling

Most changes were in comments, and there were a few changes to literals for log output.

There were no changes to variable names, function names, IDs, or functionality.

* Clarify a phrase in a comment

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Clarify a phrase in a comment

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Correct spelling

Most are code comments, but one case is a literal in a logging message.

There are a few grammar fixes too.

Co-authored-by: James Lamb <jaylamb20@gmail.com>
2021-05-04 10:10:55 -05:00
James Lamb eb7a1b7c63
[docs] fix broken MS MPI link in Installation Guide (#4224) 2021-04-25 23:29:02 +03:00
Nikita Titov 0b1c02225f
[ci][docs] Unpin Breathe version in requirements.txt (#4222)
This reverts commit 536946e303.
2021-04-25 23:25:21 +03:00
Nikita Titov 8b477ba393
added aliases to params (#4205) 2021-04-23 14:51:38 +03:00
Nikita Titov dac774935c
[docs] fix markdown in docs (#4191) 2021-04-18 02:11:12 +03:00
Akshita Dixit 50bca9e4be
[docs] Add changes to gcc-tips (#4187)
* Add changes to gcc-tips

* Update docs/gcc-Tips.rst

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Update docs/gcc-Tips.rst

Co-authored-by: James Lamb <jaylamb20@gmail.com>

* Remove image files for gcc-tips

Co-authored-by: James Lamb <jaylamb20@gmail.com>
2021-04-17 21:24:55 +03:00
Nikita Titov 211ef7878f
[ci] run cpp tests at CI (#4166)
* run cpp tests at CI

* Update docs/Installation-Guide.rst

Co-authored-by: James Lamb <jaylamb20@gmail.com>
2021-04-16 16:22:46 +03:00
Nikita Titov fba18e488c
[docs] bring back macOS installation method with Homebrew formula in docs (#4182)
This reverts commit e98da99d88.
2021-04-15 20:53:13 -05:00
Nikita Titov 536946e303
[ci] Restrict breathe version at CI (#4168)
* Restrict Doxygen version at CI

* fix typo in version number

* Update requirements.txt

* Update test.sh
2021-04-09 17:34:42 -05:00
Nikita Titov b674439aec
[docs] update link to Boost binaries (#4157) 2021-04-05 23:10:12 +03:00
NovusEdge b2d73deea6
[python] added f-strings to docs/conf.py (#4147)
* Added f-strings to docs/conf.py

* fixed some linting errors

* fixed indent on 210:25

* YAF

* yet another try at fixing the linting

* Try: 1

* Try 2

* Update docs/conf.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docs/conf.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* Update docs/conf.py

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
2021-04-05 13:57:00 -05:00
James Lamb 6d825cd3a1
clarify DEBUG-level log about tree depth (#4126)
* clarify DEBUG-level log about tree depth

* more places
2021-04-05 08:28:01 -05:00
Nikita Titov 9cab93a9f3
[docs] add missed CUDA device type in docs (#4130) 2021-03-28 22:22:44 +03:00
Nikita Titov e98da99d88
[docs] remove macOS installation method with Homebrew formula (#4122)
This reverts commit f2632a6e1e.
2021-03-27 23:40:13 +03:00