* clear split info buffer in cegb_ before every iteration
* check nullable of cegb_ in serial_tree_learner.cpp
* add a test case for checking the split buffer in CEGB
* switch to Threading::For instead of raw OpenMP
* apply review suggestions
* apply review comments
* remove device cpu
* allow using feature names when retrieving number of bins
* unname vector
* use default feature names when not defined
* lint
* apply suggestions
* remove extra comma
* add test with categorical feature
* make feature names sync more transparent
* feat: support custom metrics in params
* feat: support objective in params
* test: custom objective and metric
* fix: imports are incorrectly sorted
* feat: convert eval metrics str and set to list
* feat: convert single callable eval_metric to list
* test: single callable objective in params
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* feat: callable fobj in basic cv function
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: cv support objective callable
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* fix: assert in cv_res
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* docs: objective callable in params
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* recover test_boost_from_average_with_single_leaf_trees
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* fix failing linters
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* remove metrics helper functions
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* feat: choose objective through _choose_param_values
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: test objective through _choose_param_values
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: test objective is callable in train
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: parametrize choose_param_value with objective aliases
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: cv booster metric is none
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* fix: if string and callable choose callable
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test train uses custom objective metrics
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: cv uses custom objective metrics
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* refactor: remove fobj parameter in train and cv
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* refactor: objective through params in sklearn API
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* custom objective function in advanced_example
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* fix whitespace lint
* objective is none not a particular case for predict method
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* replace scipy.expit with custom implementation
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* test: set num_boost_round value to 20
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* fix: custom objective default_value is none
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* refactor: remove self._fobj
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* custom_objective default value is None
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* refactor: variables name reference dummy_obj
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* linter errors
* fix: process objective parameter when calling predict
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* linter errors
* fix: objective is None during predict call
Signed-off-by: Miguel Trejo <armando.trejo.marrufo@gmail.com>
* new cuda framework
* add histogram construction kernel
* before removing multi-gpu
* new cuda framework
* tree learner cuda kernels
* single tree framework ready
* single tree training framework
* remove comments
* boosting with cuda
* optimize for best split find
* data split
* move boosting into cuda
* parallel synchronize best split point
* merge split data kernels
* before code refactor
* use tasks instead of features as units for split finding
* refactor cuda best split finder
* fix configuration error with small leaves in data split
* skip histogram construction of too small leaf
* skip split finding of invalid leaves
stop when no leaf to split
* support row wise with CUDA
* copy data for split by column
* copy data from host to GPU by column for data partition
* add synchronize best splits for one leaf from multiple blocks
* partition dense row data
* fix sync best split from task blocks
* add support for sparse row wise for CUDA
* remove useless code
* add l2 regression objective
* sparse multi value bin enabled for CUDA
* fix cuda ranking objective
* support for number of items <= 2048 per query
* speedup histogram construction by interleaving global memory access
* split optimization
* add cuda tree predictor
* remove comma
* refactor objective and score updater
* before use struct
* use structure for split information
* use structure for leaf splits
* return CUDASplitInfo directly after finding best split
* split with CUDATree directly
* use cuda row data in cuda histogram constructor
* clean src/treelearner/cuda
* gather shared cuda device functions
* put shared CUDA functions into header file
* change smaller leaf from <= back to < for consistent result with CPU
* add tree predictor
* remove useless cuda_tree_predictor
* predict on CUDA with pipeline
* add global sort algorithms
* add global argsort for queries with many items in ranking tasks
* remove limitation of maximum number of items per query in ranking
* add cuda metrics
* fix CUDA AUC
* remove debug code
* add regression metrics
* remove useless file
* don't use mask in shuffle reduce
* add more regression objectives
* fix cuda mape loss
add cuda xentropy loss
* use template for different versions of BitonicArgSortDevice
* add multiclass metrics
* add ndcg metric
* fix cross entropy objectives and metrics
* fix cross entropy and ndcg metrics
* add support for customized objective in CUDA
* complete multiclass ova for CUDA
* separate cuda tree learner
* use shuffle based prefix sum
* clean up cuda_algorithms.hpp
* add copy subset on CUDA
* add bagging for CUDA
* clean up code
* copy gradients from host to device
* support bagging without using subset
* add support of bagging with subset for CUDAColumnData
* add support of bagging with subset for dense CUDARowData
* refactor copy sparse subrow
* use copy subset for column subset
* add reset train data and reset config for CUDA tree learner
add deconstructors for cuda tree learner
* add USE_CUDA ifdef to cuda tree learner files
* check that dataset doesn't contain CUDA tree learner
* remove printf debug information
* use full new cuda tree learner only when using single GPU
* disable all CUDA code when using CPU version
* recover main.cpp
* add cpp files for multi value bins
* update LightGBM.vcxproj
* update LightGBM.vcxproj
fix lint errors
* fix lint errors
* fix lint errors
* update Makevars
fix lint errors
* fix the case with 0 feature and 0 bin
fix split finding for invalid leaves
create cuda column data when loaded from bin file
* fix lint errors
hide GetRowWiseData when cuda is not used
* recover default device type to cpu
* fix na_as_missing case
fix cuda feature meta information
* fix UpdateDataIndexToLeafIndexKernel
* create CUDA trees when needed in CUDADataPartition::UpdateTrainScore
* add refit by tree for cuda tree learner
* fix test_refit in test_engine.py
* create set of large bin partitions in CUDARowData
* add histogram construction for columns with a large number of bins
* add find best split for categorical features on CUDA
* add bitvectors for categorical split
* cuda data partition split for categorical features
* fix split tree with categorical feature
* fix categorical feature splits
* refactor cuda_data_partition.cu with multi-level templates
* refactor CUDABestSplitFinder by grouping task information into struct
* pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder
* fix misuse of reference
* remove useless changes
* add support for path smoothing
* virtual destructor for LightGBM::Tree
* fix overlapped cat threshold in best split infos
* reset histogram pointers in data partition and split finder in ResetConfig
* comment useless parameter
* fix reverse case when na is missing and default bin is zero
* fix mfb_is_na and mfb_is_zero and is_single_feature_column
* remove debug log
* fix cat_l2 when one-hot
fix gradient copy when data subset is used
* switch shared histogram size according to CUDA version
* set gpu_use_dp=true for CUDA tests
* revert modification in config.h
* fix setting of gpu_use_dp=true in .ci/test.sh
* fix linter errors
* fix linter error
remove useless change
* recover main.cpp
* separate cuda_exp and cuda
* fix ci bash scripts
add description for cuda_exp
* add USE_CUDA_EXP flag
* switch off USE_CUDA_EXP
* revert changes in python-packages
* more careful separation for USE_CUDA_EXP
* fix CUDARowData::DivideCUDAFeatureGroups
fix set fields for cuda metadata
* revert config.h
* fix test settings for cuda experimental version
* skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version
* fix lint issue by adding a blank line
* fix lint errors by resorting imports
* fix lint errors by resorting imports
* fix lint errors by resorting imports
* merge cuda.yml and cuda_exp.yml
* update python version in cuda.yml
* remove cuda_exp.yml
* remove unrelated changes
* fix compilation warnings
fix cuda exp ci task name
* recover task
* use multi-level template in histogram construction
check split only in debug mode
* ignore NVCC related lines in parameter_generator.py
* update job name for CUDA tests
* apply review suggestions
* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Update .github/workflows/cuda.yml
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* update header
* remove useless TODOs
* remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062
* #include <LightGBM/utils/log.h> for USE_CUDA_EXP only
* fix include order
* fix include order
* remove extra space
* address review comments
* add warning when cuda_exp is used together with deterministic
* add comment about gpu_use_dp in .ci/test.sh
* revert changing order of included headers
Co-authored-by: Yu Shi <shiyu1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* expose FeatureNumBin in C api
* parametrize min_data_in_bin and add test with max_bin_by_feature
* include feature_num_bin in R package
* add suggestion from review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* update error message and lint
* lint
* add call method
* minor improvements in tests
* add suggestions from review
* lint
* rename argument to feature in python and r packages
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* fix duplicate added initial scores for single-leaf trees
* add test case
* Fix import in Python test
* commit python suggestions
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* map nullable dtypes to regular float dtypes
* cast x3 to float after introducing missing values
* add test for regular dtypes
* use .astype and then values. update nullable_dtypes test and include test for regular numpy dtypes
* more specific allowed dtypes. test no copy when single float dtype df
* use np.find_common_type. set np.float128 to None when it isn't supported
* set default as type(None)
* move tests that use lgb.train to test_engine
* include np.float32 when finding common dtype
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* add linebreak
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* reshape predictions, grad and hess in multiclass custom objective
* add sklearn test. move custom obj to utils. docs for numpy
* use num_model_per_iteration to get num_classes
* update docs and dask multiclass custom objective test
* move reshaping to __inner_predict. add test for feval
* add missing note. remove extra line
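The commits above reshape predictions, grad, and hess to 2-D in multiclass custom objectives. A sketch of what such an objective looks like once arrays are 2-D, assuming `preds` has shape `(n_samples, n_classes)` and `labels` holds integer class ids (function name and exact conventions are illustrative, not LightGBM's API):

```python
import numpy as np

def softmax_logloss_objective(preds, labels):
    """Multiclass log-loss gradient and Hessian on 2-D arrays.

    Sketch only: relies on the reshaping described above having already
    produced preds of shape (n_samples, n_classes).
    """
    preds = preds - preds.max(axis=1, keepdims=True)  # stabilize softmax
    prob = np.exp(preds)
    prob /= prob.sum(axis=1, keepdims=True)
    onehot = np.eye(prob.shape[1])[labels.astype(int)]
    grad = prob - onehot                 # dL/df_k = p_k - 1{y = k}
    hess = prob * (1.0 - prob)           # common diagonal approximation
    return grad, hess
```

Returning `(n_samples, n_classes)` arrays directly is exactly what the reshaping change enables; previously the user had to deal with a flattened 1-D layout.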
* make record_evaluation compatible with cv
* test multiple metrics in cv
* lint
* fix cv with train metric. save stdv as well
* always add dataset prefix to cv_agg
* remove unused function
* feat: refit additional kwargs for dataset and predict
* test: kwargs for refit method
* fix: __init__ got multiple values for argument
* fix: pycodestyle E302 error
* refactor: dataset_params to avoid breaking change
* refactor: expose all Dataset params in refit
* feat: dataset_params updates new_params
* fix: remove unnecessary params to test
* test: parameters input are the same
* docs: address StrikerRUS changes
* test: refit test changes in train dataset
* test: set init_score and decay_rate to zero
* fix for bad grads causing segfault
* adjust checking criteria to properly reflect reality of multi-class classifiers
* fix styling
* Line break before operator
* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* add a note to the C-API docs
* rearrange text slightly
* add some tests to python package
* Update include/LightGBM/c_api.h
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* PR comments
* match argument is a regex and our expression has brackets
* rework tests
* isorting imports
* update test to reflect that the Python API does not take preds/labels as a fobj function
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* add C API function that returns all param names with aliases
* add C API function that returns all param names with aliases
* add R code
* test R code
* remove debug CI
* fix R lint
* refactor
* run CI
* fix R
* fix
* revert CI checks
* revert changes in docs
* Try to make function `const`
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* add `const` in cpp file
* address review comments and sync with `master`
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* initial implementation of init_score for multiclass classification
* check for 1d or 2d collection in init_score
* remove dataset import
* initial comments
* update dask test and docstrings
* update docstrings
* move logic to set_field. reshape back on get_field
* add type hints and update docstrings for dask. fix Dataset.set_field
* revert wrong docstrings and type hints
* add extra comma for consistency
* prefix private functions with underscore
add type hints to new functions
make commas consistent in dask and basic
* add missing spaces after type hint
* remove shape condition for dataframe in is_2d_collection
Co-authored-by: Nikita Titov <nekit94-12@hotmail.com>
* [python] support Dataset.get_data for Sequence input.
* Tweaks according to review comments.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Add test cases.
* fix import order in test_basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* include distributed tests
* remove github action file
* try CI
* build shared library and fix linting error
* ignore files created for testing. add type hints and check with mypy. include docstrings
* lint
* use pre_partition and write separate model files. remove mypy
* update docs
* remove ci. lower rtol. pass num_machines in config
* write predict.conf in the predict method. more robust port setup. use subprocess.run and check returncode
* add paths to tests and binary. remove lgb dependency. update .gitignore.
* lint
* allow to pass executable dir as argument to pytest
* pass execfile to pytest instead of execdir
* add suggestions
* use os.path and add type hint to predict_config
* Update tests/distributed/_test_distributed.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* [python-package] create Dataset from sampled data.
* [python-package] create Dataset from List[Sequence].
1. Use random access for data sampling
2. Support read data from multiple input files
3. Read data in batch so no need to hold all data in memory
* [python-package] example: create Dataset from multiple HDF5 file.
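The `Sequence` commits above describe a random-access, batched data source: single-row access for data sampling, sliced access so data is read in batches and never held fully in memory. A minimal list-backed sketch of that contract (a hypothetical stand-in, not the actual `lightgbm.Sequence` base class, whose real subclasses typically wrap on-disk data such as HDF5):

```python
class ListSequence:
    """Minimal random-access data source in the spirit of lightgbm.Sequence."""

    def __init__(self, rows, batch_size=2):
        self._rows = rows
        self.batch_size = batch_size

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, idx):
        # supports single-row access (for sampling) and slices (for batches)
        return self._rows[idx]

def iter_batches(seq):
    """Read the sequence batch_size rows at a time."""
    for start in range(0, len(seq), seq.batch_size):
        yield seq[start:start + seq.batch_size]
```

A real implementation would open the backing file lazily inside `__getitem__`, so each batch is materialized only while it is being consumed.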
* fix: revert is_class implementation for seq
* fix: unwanted memory view reference for seq
* fix: seq is_class accepts sklearn matrices
* fix: requirements for example
* fix: pycode
* feat: print static code linting stage
* fix: linting: avoid shell str regex conversion
* code style: doc style
* code style: isort
* fix ci dependency: h5py on windows
* [py] remove rm files in test seq
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612929623
* docs(python): init_from_sample summary
https://github.com/microsoft/LightGBM/pull/4089#discussion_r612903389
* remove dataset dump sample data debugging code.
* remove typo fix.
Create separate PR for this.
* fix typo in src/c_api.cpp
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* style(linting): py3 type hint for seq
* test(basic): os.path style path handling
* Revert "feat: print static code linting stage"
This reverts commit 10bd79f7f8.
* feat(python): sequence on validation set
* minor(python): comment
* minor(python): test option hint
* style(python): fix code linting
* style(python): add pydoc for ref_dataset
* doc(python): sequence
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
* revert(python): sequence class abc
* chore(python): remove rm_files
* Remove useless static_assert.
* refactor: test_basic test for sequence.
* fix lint complaint.
* remove dataset._dump_text in sequence test.
* Fix reverting typo fix.
* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Fix type hint, code and doc style.
* fix failing test_basic.
* Remove TODO about keep constant in sync with cpp.
* Install h5py only when running python-examples.
* Fix lint complaint.
* Apply suggestions from code review
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Doc fixes, remove unused params_str in __init_from_seqs.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Remove unnecessary conda install in windows ci script.
* Keep param as example in dataset_from_multi_hdf5.py
* Add _get_sample_count function to remove code duplication.
* Use batch_size parameter in generate_hdf.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Fix after applying suggestions.
* Fix test, check idx is instance of numbers.Integral.
* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Expose Sequence class in Python-API doc.
* Handle Sequence object not having batch_size.
* Fix isort lint complaint.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Update docstring to mention Sequence as data input.
* Remove get_one_line in test_basic.py
* Make Sequence an abstract class.
* Reduce number of tests for test_sequence.
* Add c_api: LGBM_SampleCount, fix potential bug in LGBMSampleIndices.
* empty commit to trigger ci
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Rename to LGBM_GetSampleCount, change LGBM_SampleIndices out_len to int32_t.
Also rename total_nrow to num_total_row in c_api.h for consistency.
* Doc about Sequence in docs/Python-Intro.rst.
* Fix: basic.py change LGBM_SampleIndices out_len to int32.
* Add create_valid test case with Dataset from Sequence.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Apply suggestions from code review
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
* Remove no longer used DEFAULT_BIN_CONSTRUCT_SAMPLE_CNT.
* Update python-package/lightgbm/basic.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Willian Zhang <willian@willian.email>
Co-authored-by: Willian Z <Willian@Willian-Zhang.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: shiyu1994 <shiyu_k1994@qq.com>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Log warning instead of fatal when parsing a float results in under/overflow.
For texts that resolve to infinity, under or overflow should be
accepted.
* Remove outdated unit test.
* empty commit to trigger ci
* updated the old syntax with fstrings
* Updated the strings with + concatenation to f-strings
* Updated the strings with + concatenation to f-strings
* Update tests/python_package_test/test_dask.py
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* run Dask tests on aarch64 architecture
* make random Dask test fail
* Revert "make random Dask test fail"
This reverts commit c43c98507f.
* empty commit
* empty commit
* empty commit
* empty commit
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* New build option: USE_PRECISE_TEXT_PARSER.
Use fast_double_parser for text file parsing. For each number, fall back
to strtod in case of parse failure.
* Add benchmark for CSVParser with Atof and AtofPrecise.
* Fix lint complaint.
* Fix typo in open result error message.
* Revert "Fix lint complaint."
This reverts commit 92ab0b6bce9f17d7be9eaeb20f19d4a0a36f0387.
* Revert "Add benchmark for CSVParser with Atof and AtofPrecise."
This reverts commit 4f8639abd06c679d4382eb715a1793afd94df3d2.
* Use AtofPrecise in Common::__StringToTHelper.
* [option] precise_float_parser: precise float number parsing for text input.
* Remove USE_PRECISE_TEXT_PARSER compile option.
* test: add test for Common::AtofPrecise.
* test: remove ChunkedArrayTest with 0 length.
This triggers Log::Fatal which aborts the test program.
* fix lint, add copyright.
* Revert "test: remove ChunkedArrayTest with 0 length."
This reverts commit 346c76affe9e78b6ca2738c4a56dbb9c00f31102.
* Use LightGBM::Common::Sign
* save precise_float_parser in model file.
* Fix error checking in AtofPrecise. Add more test cases.
* Remove test case that can't pass under macOS.
* Apply suggestions from code review
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* Correct spelling
Most changes were in comments, and there were a few changes to literals for log output.
There were no changes to variable names, function names, IDs, or functionality.
* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Clarify a phrase in a comment
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Correct spelling
Most are code comments, but one case is a literal in a logging message.
There are a few grammar fixes too.
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* run cpp tests at CI
* Update docs/Installation-Guide.rst
Co-authored-by: James Lamb <jaylamb20@gmail.com>
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* include voting_parallel tree_learner in test_regressor, test_classifier and test_ranker
* remove test for warnings and test for error when using feature_parallel
* use real names for tree_learner in test and include test for aliases. use the error message in the test for error in feature parallel
* split all tests with rf in test_classifier
* remove task parametrization for tree_learner aliases test. smaller input data for feature_parallel error
* define task for tree_learner aliases
* [dask] make random port search more resilient to random collisions
* linting
* more reliable ports check
* address review comments
* add error message
* include test for prediction with raw_score
* close client
* initial comments
* update data creation and include ranking task
* linting
* update _create_data
* compare unique raw_predictions with values in leaves_df
* [feature] Add ChunkedArray to SWIG
* Add ChunkedArray
* Add ChunkedArray_API_extensions.i
* Add SWIG class wrappers
* Address some review comments
* Fix linting issues
* Move test to tests/test_ChunkedArray_manually.cpp
* Add test note
* Move ChunkedArray to include/LightGBM/utils/
* Declare more explicit types of ChunkedArray in the SWIG API.
* Port ChunkedArray tests to googletest
* Please C++ linter
* Address StrikerRUS' review comments
* Update SWIG doc & disable ChunkedArray<int64_t>
* Use CHECK_EQ instead of assert
* Change include order (linting)
* Rename ChunkedArray -> chunked_array files
* Change header guards
* Address last comments from StrikerRUS
* [dask] raise more informative error for duplicates in 'machines'
* uncomment
* avoid test failure
* Revert "avoid test failure"
This reverts commit 9442bdf00f.
* include multiclass-classification task and task_to_model_factory dicts
* define centers coordinates. flatten init_scores within each partition for multiclass-classification
* include issue comment and fix linting error
* include support for init_score
* use dataframe from init_score and test difference with and without init_score in local model
* revert refactoring
* initial docs. test between distributed models with and without init_score
* remove ranker from tests
* test value for root node and change docs
* comma
* re-include parametrize
* fix incorrect merge
* use single init_score and the booster_ attribute
* use np.float64 instead of float
* [dask] [ci] add support for scikit-learn 0.24+ in tests (fixes #4031)
* Update tests/python_package_test/test_dask.py
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* try upgrading miktexsetup
* they changed the executable name UGH
* more changes for executable name
* another path change
* changing package mirrors
* undo experiments
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* include support for column array as label
* remove nested ifs
* fix linting errors
* include tests for sklearn regressors
* include docstring for numpy_1d_array_to_dtype
* include . at end of docstring
* remove pandas import and test for regression, classification and ranking
* check predictions of sklearn models as well
* test training only in dask. drop pandas series tests
* use PANDAS_INSTALLED and pd_Series
* inline imports
* use col array in fit for test_dask
* include review comments
* use socket.bind with port 0 and client.run to find random open ports
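The commit above finds free ports by binding to port 0, letting the OS pick an unused ephemeral port. A sketch of that idiom (in the Dask setup it is executed on each worker via `client.run`, so the returned port is free on that worker's host):

```python
import socket

def find_random_open_port() -> int:
    """Ask the OS for a currently free TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))           # port 0: the kernel assigns an unused port
        return s.getsockname()[1]
```

Note the small race inherent to this approach: the port is released when the socket closes, so another process could grab it before it is reused — which is why the later commits make the search "more resilient to random collisions".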
* include test for found ports
* find random open ports as default
* parametrize local_listen_port. type hint to _find_random_open_port. find open ports only on workers with data.
* make indentation consistent and pass list of workers to client.run
* remove socket import
* change random port implementation
* fix test
* include test for training when a worker has no data
* test single partition against local model for all tasks and outputs
* remove futures_of
* include james' comments
* remove product import
* [dask] Add unit tests that signatures are the same between Dask and scikit-learn estimators (fixes microsoft#3907)