LightGBM

Commit graph

Author	SHA1	Message	Date
Nikita Titov	41152eab4b	[python][docs] reworked predict method in sklearn wrapper and docs improvements (#1351 ) * fixed docs * reworker predict method of sklearn wrapper * fixed encapsulation * added test * fixed consistency between docstring and params docs * fixed verbose * replaced predict_proba with predict in test * fixed verbose again * fixed fraction params descriptions * added description of skip_drop and drop_rate constraints * fixed subsample_freq consistency with C++ default value * fixed nice look of params list * made force splits json file example clickable * fixed nice look of metrics list and added comma * reduced warning in test about same param specified twice * replaced pred_parameter with *kwargs in predict method added test for *kwargs in predict method fixed warnings * fixed pylint	2018-05-10 17:48:29 +08:00
Nikita Titov	21487d8a28	[ci][python] updated pep8 to pycodestyle (#1358 ) * updated pep8 to pycodestyle * fixed E722 do not use bare 'except' * fixed W605 invalid escape sequence '\' fixed W504 line break after binary operator * ignore W605 invalid escape sequence '\' in nuget builder made pycodestyle happy	2018-05-08 12:23:35 +08:00
Guolin Ke	e005cdb049	Monotone Constraint (#1314 )	2018-04-18 11:12:36 +08:00
ebernhardson	7e186a5783	Experimental support for HDFS (#1243 ) * Read and write datsets from hdfs. * Only enabled when cmake is run with -DUSE_HDFS:BOOL=TRUE * Introduces VirtualFile(Reader\|Writer) to asbtract VFS differences	2018-02-27 12:53:21 +08:00
Guolin Ke	1e61f24f72	try to fix problem with multi-dimensional sliced object. (#1210 )	2018-01-24 23:46:55 +08:00
Guolin Ke	5a89a76df3	fix early stopping edge case (#1133 ) * fix early stopping edge case * fix message. * fix tests * fix GPU tests.	2017-12-23 11:55:53 +08:00
Guolin Ke	8a5ec366aa	Speed up saving and loading model (#1083 ) * remove protobuf * add version number * remove pmml script * use float for split gain * fix warnings * refine the read model logic of gbdt * fix compile error * improve decode speed * fix some bugs * fix double accuracy problem * fix bug * multi-thread save model * speed up save model to string * parallel save/load model * fix some warnings. * fix warnings. * fix a bug * remove debug output * fix doc * fix max_bin warning in tests. * fix max_bin warning * fix pylint * clean code for stringToArray * clean code for TToString * remove max_bin * replace "class" with typename	2017-11-26 16:07:06 +08:00
wxchan	bc0579c81b	add init_score & test cpp and python result consistency (#1007 ) * add init_score & test cpp and python result consistency * try fix common.h * Fix tests (#3) * update atof * fix bug * fix tests. * fix bug * fix dtypes * fix categorical feature override * fix protobuf on vs build (#1004) * [optional] support protobuf * fix windows/LightGBM.vcxproj * add doc * fix doc * fix vs support (#2) * fix vs support * fix cmake * fix #1012 * [python] add network config api (#1019) * add network * update doc * add float tolerance in bin finder. * fix a bug * update tests * add double torelance on tree model * fix tests * simplify the double comparison * fix lightsvm zero base * move double tolerance to the bin finder. * fix pylint * clean test.sh * add sklearn test * remove underline * clean codes * set random_state=None * add last line * fix doc * rename file * try fix test	2017-11-09 23:24:20 +08:00
Nikita Titov	b9dc51a6c5	[python] fixed stratifiedkfold for non-classifying tasks (#1016 ) * Update test_engine.py * Update test_engine.py	2017-10-24 10:56:58 +08:00
Guolin Ke	087ec475b2	Use one-vs-other for small categorical features. commit c9e123f24fcbb159c04e6694c7f830530bb2f27e Author: Guolin Ke <i@yumumu.me> Date: Wed Oct 18 10:00:19 2017 +0800 change default max_cat_to_onehot commit 805a5c3125b9979d634922e1708877fa0fec80c6 Author: Guolin Ke <i@yumumu.me> Date: Tue Oct 17 22:57:18 2017 +0800 use one hot coding for the small cats	2017-10-18 10:00:55 +08:00
Guolin Ke	db9ec2176c	reduce parameters in categorical split	2017-10-17 01:58:15 +08:00
Guolin Ke	eadc7b9d3f	Refine categorical features (#993 ) * many fixes for categorical feature * add l2 to categorcial split. * remove useless file * update version * add cat_l2 * update appveyor verison * remove file * fix tests. * change default cat_l2 value * fix a bug in bin finder * change default cat_smooth_ratio	2017-10-16 14:55:25 +08:00
Guolin Ke	ef221275d1	fix #991 (#992 ) * refine categorical split * a bug fix * fix a bug	2017-10-14 00:01:38 +08:00
ChenZhiyong	cc11525d26	refine categorical split (#919 ) * refine categorical split * add test	2017-09-28 12:29:18 +08:00
Nikita Titov	0350a9a6ff	[python] bring pandas support to the sklearn wrapper back (#904 ) * added test for sklearn handle categorical features * use raw X, y in sklearn wrapper in case of pandas.DataFrame * fixed probs	2017-09-19 14:55:03 +08:00
Scott Lundberg	67c2bdf905	Fix feature attributions for regression models and add Python bindings (#861 ) * Fix feature attributions for regression models and add Python bindings * Address pylint issue * Lazy fix missing tree depth info	2017-09-16 23:03:07 +08:00
Nikita Titov	8984111f05	[python] [setup] improving installation (#880 ) * disabled logs from compilers; fixed #874 * fixed safe clear_fplder * added windows folder to manifest.in * added windows folder to build * added library path * added compilation with MSBuild from .sln-file * fixed unknown PlatformToolset returns exitcode 0 * hotfix * updated Readme * removed return * added installation with mingw test to appveyor * let's test appveyor with both VS 2015 and VS 2017; but MinGW isn't installed on VS 2017 image * fixed built-in name 'file' * simplified appveyor * removed excess data_files * fixed unreadable paths * separated exceptions for cmake and mingw * refactored silent_call * don't create artifacts with VS 2015 and mingw * be more precise with python versioning in Travis * removed unnecessary if statement * added classifiers for PyPI and python versions badge * changed python version in travis * added support of scikit-learn 0.18.x * added more python versions to Travis * added more python versions to Appveyor * reduced number of tests in Travis * Travis trick is not needed anymore * attempt to fix according to https://github.com/Microsoft/LightGBM/pull/880#discussion_r137438856	2017-09-08 18:17:00 +08:00
Nikita Titov	db8b6b00a1	[python] fixed sklearn test on python 2.7 (#888 ) * fixed sklearn test on python 2.7 * commit to show that problem has been solved * come back to python 3.6 * removed warnings check	2017-09-05 21:14:14 +08:00
Nikita Titov	015c8fff72	[python] improved sklearn interface (#870 ) * improved sklearn interface; added sklearns' tests * moved best_score into the if statement * improved docstrings; simplified LGBMCheckConsistentLength * fixed typo * pylint * updated example * fixed Ranker interface * added missed boosting_type * fixed more comfortable autocomplete without unused objects * removed check for None of eval_at * fixed according to review * fixed typo * added description of fit return type * dictionary->dict for short * markdown cleanup	2017-09-05 18:19:45 +08:00
wxchan	603bffcfac	[MRG] expose feature importance to c_api (#860 ) * expose feature importance to c_api * support type=gain * remove dump model from examples and tests temporarily because it's unstable * use double instead of float	2017-08-24 23:09:43 +08:00
Nikita Titov	3f0061ca5f	[python] parameters renaming for sklearn naming convention (#854 ) * updated scikit-learn interface * fixed better description * updated set_params() * removed backward compatibility * removed excess lines * replaced pop with setdefault * added deprecated warnings * added tests	2017-08-23 13:25:30 +08:00
Mikhail Korobov	6be7aa7ab8	TST check that single-leaf trees don't cause segfaults (#852 )	2017-08-20 23:40:57 +08:00
wxchan	c8142e3037	[MRG] [python] check params for num_boost_round & early_stopping_rounds (#806 ) * check params * add test case * fix pylint	2017-08-18 19:07:57 +08:00
Guolin Ke	4e9b589bfd	update tests.	2017-08-18 19:01:21 +08:00
j-mark-hou	e7c53270a0	added test for training when both train and valid are subsets of a si… (#759 ) * added test for training when both train and valid are subsets of a single lgb.Dataset object * pep8 changes * more pep8 * added test involving subsets of subsets of lgb.Dataset objects * minor fix to contruction of X matrix * even more pep8 * simplified test further	2017-08-18 18:52:01 +08:00
Guolin Ke	00cb04a255	Better missing value handle (#747 ) * finish the data loading part * allow prediction. * fix bug for decision type. * finish split finding part * fix bugs. * bug fixed. add a test . * fix pep8 . * update documents. * fix test bugs. * fix a format * fix import error in python test. * disable missing handle in categorial features. * fix a bug. * add more tests. * fix pep8 * fix bugs. * remove the missing handle code for categorical feature.	2017-07-30 20:09:41 +08:00
Guolin Ke	6a7470a2b0	Add Random Forest Mode (#678 ) * add draft of RF. * fix score bugs. * fix scores. * fix tests. * update document * fix GetPredictAt	2017-07-11 19:44:46 +08:00
Guolin Ke	6d4c7b03b7	Support early stopping of prediction in CLI (#565 ) * fix multi-threading. * fix name style. * support in CLI version. * remove warnings. * Not default parameters. * fix if...else... . * fix bug. * fix warning. * refine c_api. * fix R-package. * fix R's warning. * fix tests. * fix pep8 .	2017-05-30 18:28:17 +08:00
cbecker	993bbd5f91	Add prediction early stopping (#550 ) * Add early stopping for prediction * Fix GBDT if-else prediction with early stopping * Small C++ embelishments to early stopping API and functions * Fix early stopping efficiency issue by creating a singleton for no early stopping * Python improvements to early stopping API * Add assertion check for binary and multiclass prediction score length * Update vcxproj and vcxproj.filters with new early stopping files * Remove inline from PredictRaw(), the linker was not able to find it otherwise	2017-05-29 23:09:58 +08:00
Tsukasa OMOTO	babf01c2f6	python: use pytest for tests (#498 ) https://docs.pytest.org/	2017-05-11 13:27:18 +08:00
wxchan	35440b9cb9	[python-package] change default best_iteration to 0 (#495 ) * make test fail * change default best_iteration to 0 * fix test * change data_splitter to folds in cv * update docs	2017-05-06 23:37:41 +08:00
wxchan	a39141e10b	re-write test cases: remove global template (#479 )	2017-05-02 10:13:06 +08:00
wxchan	ef408f552a	lambdarank cv (#459 )	2017-04-26 15:05:26 +08:00
wxchan	7339ed648c	replace whitespaces with underlines in feature name (#426 ) * change whitespace to underline in feature names * add test * fix bug * fix bug * warning -> fatal	2017-04-18 11:03:32 +08:00
Guolin Ke	f6b25ac98d	fix test.	2017-04-17 11:17:37 +08:00
wxchan	45c1c6e8c1	add best score (#413 )	2017-04-15 19:04:35 +08:00
Laurae	ba99bcddc6	Switch RMSE to MSE (true L2 loss) (#408 ) * RMSE (L2) -> MSE (true L2) * Remove sqrt unneeded reference * Square L2 test (RMSE to MSE) * No square root on test * Attempt to add RMSE	2017-04-13 18:43:41 +08:00
Huan Zhang	a5f11d47ef	Use only one thread in test_basic.py (#412 )	2017-04-13 12:20:29 +08:00
Huan Zhang	0bb4a825af	Initial GPU acceleration support for LightGBM (#368 ) * add dummy gpu solver code * initial GPU code * fix crash bug * first working version * use asynchronous copy * use a better kernel for root * parallel read histogram * sparse features now works, but no acceleration, compute on CPU * compute sparse feature on CPU simultaneously * fix big bug; add gpu selection; add kernel selection * better debugging * clean up * add feature scatter * Add sparse_threshold control * fix a bug in feature scatter * clean up debug * temporarily add OpenCL kernels for k=64,256 * fix up CMakeList and definition USE_GPU * add OpenCL kernels as string literals * Add boost.compute as a submodule * add boost dependency into CMakeList * fix opencl pragma * use pinned memory for histogram * use pinned buffer for gradients and hessians * better debugging message * add double precision support on GPU * fix boost version in CMakeList * Add a README * reconstruct GPU initialization code for ResetTrainingData * move data to GPU in parallel * fix a bug during feature copy * update gpu kernels * update gpu code * initial port to LightGBM v2 * speedup GPU data loading process * Add 4-bit bin support to GPU * re-add sparse_threshold parameter * remove kMaxNumWorkgroups and allows an unlimited number of features * add feature mask support for skipping unused features * enable kernel cache * use GPU kernels withoug feature masks when all features are used * REAdme. * REAdme. * update README * fix typos (#349) * change compile to gcc on Apple as default * clean vscode related file * refine api of constructing from sampling data. * fix bug in the last commit. * more efficient algorithm to sample k from n. * fix bug in filter bin * change to boost from average output. * fix tests. * only stop training when all classes are finshed in multi-class. * limit the max tree output. change hessian in multi-class objective. * robust tree model loading. * fix test. 
* convert the probabilities to raw score in boost_from_average of classification. * fix the average label for binary classification. * Add boost_from_average to docs (#354) * don't use "ConvertToRawScore" for self-defined objective function. * boost_from_average seems doesn't work well in binary classification. remove it. * For a better jump link (#355) * Update Python-API.md * for a better jump in page A space is needed between `#` and the headers content according to Github's markdown format [guideline](https://guides.github.com/features/mastering-markdown/) After adding the spaces, we can jump to the exact position in page by click the link. * fixed something mentioned by @wxchan * Update Python-API.md * add FitByExistingTree. * adapt GPU tree learner for FitByExistingTree * avoid NaN output. * update boost.compute * fix typos (#361) * fix broken links (#359) * update README * disable GPU acceleration by default * fix image url * cleanup debug macro * remove old README * do not save sparse_threshold_ in FeatureGroup * add details for new GPU settings * ignore submodule when doing pep8 check * allocate workspace for at least one thread during builing Feature4 * move sparse_threshold to class Dataset * remove duplicated code in GPUTreeLearner::Split * Remove duplicated code in FindBestThresholds and BeforeFindBestSplit * do not rebuild ordered gradients and hessians for sparse features * support feature groups in GPUTreeLearner * Initial parallel learners with GPU support * add option device, cleanup code * clean up FindBestThresholds; add some omp parallel * constant hessian optimization for GPU * Fix GPUTreeLearner crash when there is zero feature * use np.testing.assert_almost_equal() to compare lists of floats in tests * travis for GPU	2017-04-09 21:53:14 +08:00
Laurae	21861cd49f	[Python-package]: Fix RandomState issue #376 (#377 ) * Python: Fix RandomState issue #376 * Add test case for Python's Shuffle=True	2017-04-02 09:00:50 +08:00
wxchan	6ed335df29	refine early stopping and add a test case (#369 )	2017-03-28 10:37:38 +08:00
Guolin Ke	b38a19a489	fix test.	2017-03-24 21:01:34 +08:00
Guolin Ke	2e962c779f	fix tests.	2017-03-23 00:46:36 +08:00
Guolin Ke	ef77806934	Add categorical feature support back.	2017-03-01 21:00:46 +08:00
Guolin Ke	4f77bd2860	update to v2.	2017-03-01 20:59:35 +08:00
wxchan	13d4581b96	add data_splitter to cv (#298 ) * add data_splitter for cv * update gitignore * clean code	2017-02-18 20:46:44 +08:00
wxchan	eef4d2d0c8	refine plotting library (#282 ) * refine plot * use warnings * refine logic * revert 'move to compat.py'	2017-02-03 14:38:47 +08:00
wxchan	ab4ed7254c	add feature name (#280 )	2017-02-02 15:36:44 +08:00
wxchan	58565547e8	[python-package] add plot metrics (#266 ) * add plot metrics * move 'raise Exception' to check_not_tuple_of_2_elements * rename 'plot_metrics' to 'plot_metric' * fix misleading message/docs * change 'Metrics' in title to 'Metric' * fix misleading comment	2017-01-28 19:10:02 +08:00
wxchan	8980fc7220	[python-package] add plot tree (#262 ) * add plot tree * add docs * add example * add test * fix test * fix decision type * add show_info * use feature name if available	2017-01-25 19:03:00 +08:00

83 commits