LightGBM/CMakeLists.txt

option(USE_MPI "Enable MPI-based distributed learning" OFF)
option(USE_OPENMP "Enable OpenMP" ON)
option(USE_GPU "Enable GPU-accelerated training" OFF)
option(USE_SWIG "Enable SWIG to generate Java API" OFF)
option(USE_HDFS "Enable HDFS support (EXPERIMENTAL)" OFF)
option(USE_TIMETAG "Set to ON to output time costs" OFF)
option(USE_CUDA "Enable CUDA-accelerated training (EXPERIMENTAL)" OFF)
option(USE_CUDA_EXP "Enable CUDA-accelerated training with more acceleration (EXPERIMENTAL)" OFF)
option(USE_DEBUG "Set to ON for Debug mode" OFF)
option(USE_SANITIZER "Use sanitizer flags" OFF)
set(
  ENABLED_SANITIZERS
  "address" "leak" "undefined"
  CACHE
  STRING
  "Semicolon separated list of sanitizer names, e.g., 'address;leak'. \
Supported sanitizers are address, leak, undefined and thread."
)
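# Example (a sketch, not part of the upstream file): ENABLED_SANITIZERS is only
# consulted when USE_SANITIZER is ON, so both are normally set together at
# configure time. The out-of-source "build" directory is an assumption.
#
#   mkdir build && cd build
#   cmake -DUSE_SANITIZER=ON -DENABLED_SANITIZERS="address;undefined" ..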
option(BUILD_CPP_TEST "Build C++ tests with Google Test" OFF)
option(BUILD_STATIC_LIB "Build static library" OFF)
option(__BUILD_FOR_R "Set to ON if building lib_lightgbm for use with the R package" OFF)
option(__INTEGRATE_OPENCL "Set to ON if building LightGBM with the OpenCL ICD Loader and its dependencies included" OFF)
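# Example (a sketch, not part of the upstream file): each option() above is a
# boolean cache variable, so it can be overridden with -D when configuring.
# The "build" directory name and the chosen options are for illustration only.
#
#   mkdir build && cd build
#   cmake -DUSE_MPI=ON -DBUILD_STATIC_LIB=ON ..
#   cmake --build .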
if(APPLE)
  option(APPLE_OUTPUT_DYLIB "Output dylib shared library" OFF)
endif()
if(__INTEGRATE_OPENCL)
  cmake_minimum_required(VERSION 3.11)
elseif(USE_GPU OR APPLE)
  cmake_minimum_required(VERSION 3.2)
elseif(USE_CUDA OR USE_CUDA_EXP)
  cmake_minimum_required(VERSION 3.16)
else()
  cmake_minimum_required(VERSION 3.0)
endif()

project(lightgbm LANGUAGES C CXX)

list(APPEND CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}/cmake/modules")
#-- Sanitizer
if(USE_SANITIZER)
  if(MSVC)
    message(FATAL_ERROR "Sanitizers are not supported with MSVC.")
  endif()
  include(cmake/Sanitizer.cmake)
  enable_sanitizers("${ENABLED_SANITIZERS}")
endif()
if(__INTEGRATE_OPENCL)
  set(__INTEGRATE_OPENCL ON CACHE BOOL "" FORCE)
  set(USE_GPU OFF CACHE BOOL "" FORCE)
  message(STATUS "Building library with integrated OpenCL components")
endif()
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "4.8.2")
    message(FATAL_ERROR "Insufficient gcc version")
  endif()
elseif(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
  if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "3.8")
    message(FATAL_ERROR "Insufficient Clang version")
  endif()
elseif(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang")
  if(CMAKE_CXX_COMPILER_VERSION VERSION_LESS "8.1.0")
    message(FATAL_ERROR "Insufficient AppleClang version")
  endif()
  cmake_minimum_required(VERSION 3.16)
elseif(MSVC)
  if(MSVC_VERSION LESS 1900)
    message(
      FATAL_ERROR
      "The compiler ${CMAKE_CXX_COMPILER} doesn't support required C++11 features. Please use a newer MSVC."
    )
  endif()
  cmake_minimum_required(VERSION 3.8)
endif()
if(USE_SWIG)
  find_package(SWIG REQUIRED)
  find_package(Java REQUIRED)
  find_package(JNI REQUIRED)
  include(UseJava)
  include(UseSWIG)
  set(SWIG_CXX_EXTENSION "cxx")
  set(SWIG_EXTRA_LIBRARIES "")
  set(SWIG_JAVA_EXTRA_FILE_EXTENSIONS ".java" "JNI.java")
  set(SWIG_MODULE_JAVA_LANGUAGE "JAVA")
  set(SWIG_MODULE_JAVA_SWIG_LANGUAGE_FLAG "java")
  set(CMAKE_SWIG_OUTDIR "${CMAKE_CURRENT_BINARY_DIR}/java")
  # Dereference the variables; the bare names would be added as literal
  # directory names. An unset variable expands to nothing here (a no-op).
  include_directories(${Java_INCLUDE_DIRS})
  include_directories(${JNI_INCLUDE_DIRS})
  include_directories($ENV{JAVA_HOME}/include)
  # Per-platform output directory for the packaged native library and the
  # matching platform-specific JNI headers.
  if(WIN32)
    file(MAKE_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/com/microsoft/ml/lightgbm/windows/x86_64")
    include_directories($ENV{JAVA_HOME}/include/win32)
  elseif(APPLE)
    file(MAKE_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/com/microsoft/ml/lightgbm/osx/x86_64")
    include_directories($ENV{JAVA_HOME}/include/darwin)
  else()
    file(MAKE_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/com/microsoft/ml/lightgbm/linux/x86_64")
    include_directories($ENV{JAVA_HOME}/include/linux)
  endif()
endif()
set(EIGEN_DIR "${PROJECT_SOURCE_DIR}/external_libs/eigen")
include_directories(${EIGEN_DIR})
# See https://gitlab.com/libeigen/eigen/-/blob/master/COPYING.README
add_definitions(-DEIGEN_MPL2_ONLY)
if(__BUILD_FOR_R)
  find_package(LibR REQUIRED)
  message(STATUS "LIBR_EXECUTABLE: ${LIBR_EXECUTABLE}")
  message(STATUS "LIBR_INCLUDE_DIRS: ${LIBR_INCLUDE_DIRS}")
  message(STATUS "LIBR_CORE_LIBRARY: ${LIBR_CORE_LIBRARY}")
  include_directories(${LIBR_INCLUDE_DIRS})
  add_definitions(-DLGB_R_BUILD)
endif()
if(USE_TIMETAG)
  add_definitions(-DTIMETAG)
endif()
if(USE_DEBUG)
  add_definitions(-DDEBUG)
endif()

if(USE_MPI)
  find_package(MPI REQUIRED)
  add_definitions(-DUSE_MPI)
else()
  add_definitions(-DUSE_SOCKET)
endif()
a large number of bins * add find best split for categorical features on CUDA * add bitvectors for categorical split * cuda data partition split for categorical features * fix split tree with categorical feature * fix categorical feature splits * refactor cuda_data_partition.cu with multi-level templates * refactor CUDABestSplitFinder by grouping task information into struct * pre-allocate space for vector split_find_tasks_ in CUDABestSplitFinder * fix misuse of reference * remove useless changes * add support for path smoothing * virtual destructor for LightGBM::Tree * fix overlapped cat threshold in best split infos * reset histogram pointers in data partition and spllit finder in ResetConfig * comment useless parameter * fix reverse case when na is missing and default bin is zero * fix mfb_is_na and mfb_is_zero and is_single_feature_column * remove debug log * fix cat_l2 when one-hot fix gradient copy when data subset is used * switch shared histogram size according to CUDA version * gpu_use_dp=true when cuda test * revert modification in config.h * fix setting of gpu_use_dp=true in .ci/test.sh * fix linter errors * fix linter error remove useless change * recover main.cpp * separate cuda_exp and cuda * fix ci bash scripts add description for cuda_exp * add USE_CUDA_EXP flag * switch off USE_CUDA_EXP * revert changes in python-packages * more careful separation for USE_CUDA_EXP * fix CUDARowData::DivideCUDAFeatureGroups fix set fields for cuda metadata * revert config.h * fix test settings for cuda experimental version * skip some tests due to unsupported features or differences in implementation details for CUDA Experimental version * fix lint issue by adding a blank line * fix lint errors by resorting imports * fix lint errors by resorting imports * fix lint errors by resorting imports * merge cuda.yml and cuda_exp.yml * update python version in cuda.yml * remove cuda_exp.yml * remove unrelated changes * fix compilation warnings fix cuda exp ci task name * 
recover task * use multi-level template in histogram construction check split only in debug mode * ignore NVCC related lines in parameter_generator.py * update job name for CUDA tests * apply review suggestions * Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * Update .github/workflows/cuda.yml Co-authored-by: Nikita Titov <nekit94-08@mail.ru> * update header * remove useless TODOs * remove [TODO(shiyu1994): constrain the split with min_data_in_group] and record in #5062 * #include <LightGBM/utils/log.h> for USE_CUDA_EXP only * fix include order * fix include order * remove extra space * address review comments * add warning when cuda_exp is used together with deterministic * add comment about gpu_use_dp in .ci/test.sh * revert changing order of included headers Co-authored-by: Yu Shi <shiyu1994@qq.com> Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
2022-03-23 05:39:23 +03:00
if(USE_CUDA OR USE_CUDA_EXP)
    set(CMAKE_CUDA_HOST_COMPILER "${CMAKE_CXX_COMPILER}")
    enable_language(CUDA)
    set(USE_OPENMP ON CACHE BOOL "CUDA requires OpenMP" FORCE)
endif()
if(USE_OPENMP)
    find_package(OpenMP REQUIRED)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
endif()
if(USE_GPU)
    set(BOOST_COMPUTE_HEADER_DIR ${PROJECT_SOURCE_DIR}/external_libs/compute/include)
    include_directories(${BOOST_COMPUTE_HEADER_DIR})
    find_package(OpenCL REQUIRED)
    include_directories(${OpenCL_INCLUDE_DIRS})
    message(STATUS "OpenCL include directory: " ${OpenCL_INCLUDE_DIRS})
    if(WIN32)
        set(Boost_USE_STATIC_LIBS ON)
    endif()
    find_package(Boost 1.56.0 COMPONENTS filesystem system REQUIRED)
    if(WIN32)
        # disable autolinking in boost
        add_definitions(-DBOOST_ALL_NO_LIB)
    endif()
    include_directories(${Boost_INCLUDE_DIRS})
    add_definitions(-DUSE_GPU)
endif()
if(__INTEGRATE_OPENCL)
    if(WIN32)
        include(cmake/IntegratedOpenCL.cmake)
        add_definitions(-DUSE_GPU)
    else()
        message(FATAL_ERROR "Integrated OpenCL build is available only for Windows")
    endif()
endif()
if(USE_CUDA OR USE_CUDA_EXP)
    if(USE_CUDA)
        find_package(CUDA 9.0 REQUIRED)
    else()
        find_package(CUDA 10.0 REQUIRED)
    endif()
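    include_directories(${CUDA_INCLUDE_DIRS})
    # Forward the OpenMP, -fPIC, and -Wall flags to the host compiler through nvcc.
    set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler=${OpenMP_CXX_FLAGS} -Xcompiler=-fPIC -Xcompiler=-Wall")
    # Baseline architectures; newer ones are appended only when the detected CUDA version supports them.
    set(CUDA_ARCHS "6.0" "6.1" "6.2" "7.0")
    if(CUDA_VERSION VERSION_GREATER_EQUAL "10.0")
        list(APPEND CUDA_ARCHS "7.5")
    endif()
    if(CUDA_VERSION VERSION_GREATER_EQUAL "11.0")
        list(APPEND CUDA_ARCHS "8.0")
    endif()
    if(CUDA_VERSION VERSION_GREATER_EQUAL "11.1")
        list(APPEND CUDA_ARCHS "8.6")
    endif()
    # Re-add the newest architecture with a "+PTX" suffix so PTX is also generated for it.
    list(POP_BACK CUDA_ARCHS CUDA_LAST_SUPPORTED_ARCH)
    list(APPEND CUDA_ARCHS "${CUDA_LAST_SUPPORTED_ARCH}+PTX")
    cuda_select_nvcc_arch_flags(CUDA_ARCH_FLAGS ${CUDA_ARCHS})
    # Turn the semicolon-separated CMake list into a space-separated flag string.
    string(REPLACE ";" " " CUDA_ARCH_FLAGS "${CUDA_ARCH_FLAGS}")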
    set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} ${CUDA_ARCH_FLAGS}")
    if(USE_DEBUG)
        set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -g")
    else()
        set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3 -lineinfo")
    endif()
    message(STATUS "CMAKE_CUDA_FLAGS: ${CMAKE_CUDA_FLAGS}")
    if(USE_CUDA)
        add_definitions(-DUSE_CUDA)
    elseif(USE_CUDA_EXP)
        add_definitions(-DUSE_CUDA_EXP)
    endif()
    if(NOT DEFINED CMAKE_CUDA_STANDARD)
set(CMAKE_CUDA_STANDARD 11)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)
endif()
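# Note (reading aid, not part of the upstream file): because of the
# if(NOT DEFINED CMAKE_CUDA_STANDARD) guard above, a value supplied at
# configure time should take precedence over the default of 11. In that case
# this block also leaves CMAKE_CUDA_STANDARD_REQUIRED untouched.
# A minimal sketch, assuming an out-of-source build directory and the
# USE_CUDA option declared near the top of this file:
#   cmake -DUSE_CUDA=ON -DCMAKE_CUDA_STANDARD=14 ..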
set(
BASE_DEFINES
-DPOWER_FEATURE_WORKGROUPS=12
-DUSE_CONSTANT_BUF=0
)
set(
ALLFEATS_DEFINES
${BASE_DEFINES}
-DENABLE_ALL_FEATURES
)
set(
FULLDATA_DEFINES
${ALLFEATS_DEFINES}
-DIGNORE_INDICES
)
message(STATUS "ALLFEATS_DEFINES: ${ALLFEATS_DEFINES}")
message(STATUS "FULLDATA_DEFINES: ${FULLDATA_DEFINES}")
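# Reading aid (not part of the upstream file): the three define sets are
# cumulative, so each one extends the previous one.
#   BASE_DEFINES     = -DPOWER_FEATURE_WORKGROUPS=12 -DUSE_CONSTANT_BUF=0
#   ALLFEATS_DEFINES = BASE_DEFINES     + -DENABLE_ALL_FEATURES
#   FULLDATA_DEFINES = ALLFEATS_DEFINES + -DIGNORE_INDICES
# CMake stores each set as a semicolon-separated list, so the status messages
# above print the flags joined by ";" rather than by spaces.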
function(add_histogram hsize hname hadd hconst hdir)
add_library(histo${hsize}${hname} OBJECT src/treelearner/kernels/histogram${hsize}.cu)
set_target_properties(histo${hsize}${hname} PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
set_target_properties(histo${hsize}${hname} PROPERTIES CUDA_ARCHITECTURES OFF)
if(hadd)
list(APPEND histograms histo${hsize}${hname})
set(histograms ${histograms} PARENT_SCOPE)
endif()
target_compile_definitions(
histo${hsize}${hname}
PRIVATE
-DCONST_HESSIAN=${hconst}
${hdir}
)
endfunction()
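# Worked example (reading aid, not part of the upstream file). Reading the
# function body above, a call such as
#   add_histogram("_16_64_256" "_sp_const" "True" "1" "${BASE_DEFINES}")
# is expected to:
#   - create the OBJECT library target histo_16_64_256_sp_const from
#     src/treelearner/kernels/histogram_16_64_256.cu,
#   - append that target to the caller's `histograms` list, because hadd
#     evaluates to true,
#   - compile the kernel with CONST_HESSIAN=1 plus the definitions carried in
#     BASE_DEFINES (target_compile_definitions strips the leading -D).
# With hadd set to "False", the target is still created but is not added to
# `histograms`.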
foreach(hsize _16_64_256)
add_histogram("${hsize}" "_sp_const" "True" "1" "${BASE_DEFINES}")
add_histogram("${hsize}" "_sp" "True" "0" "${BASE_DEFINES}")
add_histogram("${hsize}" "-allfeats_sp_const" "False" "1" "${ALLFEATS_DEFINES}")
add_histogram("${hsize}" "-allfeats_sp" "False" "0" "${ALLFEATS_DEFINES}")
add_histogram("${hsize}" "-fulldata_sp_const" "True" "1" "${FULLDATA_DEFINES}")
add_histogram("${hsize}" "-fulldata_sp" "True" "0" "${FULLDATA_DEFINES}")
endforeach()
endif()
if(USE_HDFS)
find_package(JNI REQUIRED)
find_path(HDFS_INCLUDE_DIR hdfs.h REQUIRED)
find_library(HDFS_LIB NAMES hdfs REQUIRED)
include_directories(${HDFS_INCLUDE_DIR})
add_definitions(-DUSE_HDFS)
set(HDFS_CXX_LIBRARIES ${HDFS_LIB} ${JAVA_JVM_LIBRARY})
endif()
include(CheckCXXSourceCompiles)
check_cxx_source_compiles("
#include <xmmintrin.h>
int main() {
int a = 0;
_mm_prefetch(&a, _MM_HINT_NTA);
return 0;
}
" MM_PREFETCH)
if(MM_PREFETCH)
message(STATUS "Using _mm_prefetch")
add_definitions(-DMM_PREFETCH)
endif()
check_cxx_source_compiles("
#include <mm_malloc.h>
int main() {
char *a = (char*)_mm_malloc(8, 16);
_mm_free(a);
return 0;
}
" MM_MALLOC)
if(MM_MALLOC)
message(STATUS "Using _mm_malloc")
add_definitions(-DMM_MALLOC)
endif()
if(UNIX OR MINGW OR CYGWIN)
set(
CMAKE_CXX_FLAGS
"${CMAKE_CXX_FLAGS} -std=c++11 -pthread -Wextra -Wall -Wno-ignored-attributes -Wno-unknown-pragmas -Wno-return-type"
)
if(USE_DEBUG)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g -O0")
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3")
endif()
if(USE_SWIG)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-strict-aliasing")
endif()
if(NOT USE_OPENMP)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unknown-pragmas -Wno-unused-private-field")
endif()
if(__BUILD_FOR_R AND CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-cast-function-type")
endif()
endif()
if(WIN32 AND MINGW)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -static-libstdc++")
endif()
if(MSVC)
set(
variables
CMAKE_C_FLAGS_DEBUG
CMAKE_C_FLAGS_MINSIZEREL
CMAKE_C_FLAGS_RELEASE
CMAKE_C_FLAGS_RELWITHDEBINFO
CMAKE_CXX_FLAGS_DEBUG
CMAKE_CXX_FLAGS_MINSIZEREL
CMAKE_CXX_FLAGS_RELEASE
CMAKE_CXX_FLAGS_RELWITHDEBINFO
)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /W4 /MP")
if(USE_DEBUG)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /Od")
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /O2 /Ob2 /Oi /Ot /Oy")
endif()
else()
if(NOT BUILD_STATIC_LIB)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC")
endif()
if(NOT USE_DEBUG)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -funroll-loops")
endif()
endif()
set(LightGBM_HEADER_DIR ${PROJECT_SOURCE_DIR}/include)
set(EXECUTABLE_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
set(LIBRARY_OUTPUT_PATH ${PROJECT_SOURCE_DIR})
include_directories(${LightGBM_HEADER_DIR})
if(APPLE)
if(APPLE_OUTPUT_DYLIB)
set(CMAKE_SHARED_LIBRARY_SUFFIX ".dylib")
else()
set(CMAKE_SHARED_LIBRARY_SUFFIX ".so")
endif()
endif()
if(USE_MPI)
include_directories(${MPI_CXX_INCLUDE_PATH})
endif()
file(
GLOB
SOURCES
src/boosting/*.cpp
src/io/*.cpp
src/metric/*.cpp
src/objective/*.cpp
src/network/*.cpp
src/treelearner/*.cpp
)
# if()/endif() cannot be nested inside a file() call: their tokens would be
# passed to file(GLOB) as ordinary glob expressions, so the CUDA patterns
# would always be globbed. Collect the CUDA sources in separate, conditional
# file(GLOB) calls and append them to SOURCES instead.
if(USE_CUDA OR USE_CUDA_EXP)
file(GLOB CUDA_TREELEARNER_SOURCES src/treelearner/*.cu)
list(APPEND SOURCES ${CUDA_TREELEARNER_SOURCES})
endif()
if(USE_CUDA_EXP)
file(
GLOB
CUDA_EXP_SOURCES
src/treelearner/cuda/*.cpp
src/treelearner/cuda/*.cu
src/io/cuda/*.cu
src/io/cuda/*.cpp
src/cuda/*.cpp
src/cuda/*.cu
)
list(APPEND SOURCES ${CUDA_EXP_SOURCES})
endif()
add_library(lightgbm_objs OBJECT ${SOURCES})
add_executable(lightgbm src/main.cpp src/application/application.cpp)
target_link_libraries(lightgbm PRIVATE lightgbm_objs)
set(API_SOURCES "src/c_api.cpp")
# Only build the R part of the library if building for
# use with the R package
if(__BUILD_FOR_R)
list(APPEND API_SOURCES "src/lightgbm_R.cpp")
endif()
add_library(lightgbm_capi_objs OBJECT ${API_SOURCES})
if(BUILD_STATIC_LIB)
  add_library(_lightgbm STATIC)
else()
  add_library(_lightgbm SHARED)
endif()
# LightGBM headers include openmp, cuda, R etc. headers,
# thus PUBLIC is required for building _lightgbm_swig target.
target_link_libraries(_lightgbm PUBLIC lightgbm_capi_objs lightgbm_objs)
if(MSVC)
  set_target_properties(_lightgbm PROPERTIES OUTPUT_NAME "lib_lightgbm")
endif()
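# Illustrative sketch only, kept commented out so it does not affect the build.
# It mirrors the object-library pattern used for _lightgbm above: sources are
# compiled once into an OBJECT library, and each final target links the same
# object files. All demo_* target names and src/demo_*.cpp paths are
# hypothetical. Linking an OBJECT library through target_link_libraries()
# requires CMake 3.12 or newer.
#
#   add_library(demo_objs OBJECT src/demo_a.cpp src/demo_b.cpp)
#   add_library(demo_shared SHARED)
#   target_link_libraries(demo_shared PUBLIC demo_objs)
#   add_executable(demo_cli src/demo_main.cpp)
#   target_link_libraries(demo_cli PRIVATE demo_objs)
#
# Note that object files are only added to targets that link the OBJECT
# library directly; they are not forwarded through another OBJECT library.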
if(USE_SWIG)
  set_property(SOURCE swig/lightgbmlib.i PROPERTY CPLUSPLUS ON)
  list(APPEND swig_options -package com.microsoft.ml.lightgbm)
  set_property(SOURCE swig/lightgbmlib.i PROPERTY SWIG_FLAGS "${swig_options}")
  swig_add_module(_lightgbm_swig java swig/lightgbmlib.i)
  swig_link_libraries(_lightgbm_swig _lightgbm)
  # needed to ensure Linux build does not have lib prefix specified twice, e.g. liblib_lightgbm_swig
  set_target_properties(_lightgbm_swig PROPERTIES PREFIX "")
  # needed in some versions of CMake for VS and MinGW builds to ensure output dll has lib prefix
  set_target_properties(_lightgbm_swig PROPERTIES OUTPUT_NAME "lib_lightgbm_swig")
  if(WIN32)
    if(MINGW OR CYGWIN)
      add_custom_command(
        TARGET _lightgbm_swig
        POST_BUILD
        COMMAND "${Java_JAVAC_EXECUTABLE}" -d . java/*.java
        COMMAND
          "${CMAKE_COMMAND}"
          -E
          copy_if_different
          "${PROJECT_SOURCE_DIR}/lib_lightgbm.dll"
          com/microsoft/ml/lightgbm/windows/x86_64
        COMMAND
          "${CMAKE_COMMAND}"
          -E
          copy_if_different
          "${PROJECT_SOURCE_DIR}/lib_lightgbm_swig.dll"
          com/microsoft/ml/lightgbm/windows/x86_64
        COMMAND "${Java_JAR_EXECUTABLE}" -cf lightgbmlib.jar com
      )
    else()
      add_custom_command(
        TARGET _lightgbm_swig
        POST_BUILD
        COMMAND "${Java_JAVAC_EXECUTABLE}" -d . java/*.java
        COMMAND cp "${PROJECT_SOURCE_DIR}/Release/*.dll" com/microsoft/ml/lightgbm/windows/x86_64
        COMMAND "${Java_JAR_EXECUTABLE}" -cf lightgbmlib.jar com
      )
    endif()
  elseif(APPLE)
    add_custom_command(
      TARGET _lightgbm_swig
      POST_BUILD
      COMMAND "${Java_JAVAC_EXECUTABLE}" -d . java/*.java
      COMMAND cp "${PROJECT_SOURCE_DIR}/*.dylib" com/microsoft/ml/lightgbm/osx/x86_64
      COMMAND
        cp
        "${PROJECT_SOURCE_DIR}/lib_lightgbm_swig.jnilib"
        com/microsoft/ml/lightgbm/osx/x86_64/lib_lightgbm_swig.dylib
      COMMAND "${Java_JAR_EXECUTABLE}" -cf lightgbmlib.jar com
    )
  else()
    add_custom_command(
      TARGET _lightgbm_swig
      POST_BUILD
      COMMAND "${Java_JAVAC_EXECUTABLE}" -d . java/*.java
      COMMAND cp "${PROJECT_SOURCE_DIR}/*.so" com/microsoft/ml/lightgbm/linux/x86_64
      COMMAND "${Java_JAR_EXECUTABLE}" -cf lightgbmlib.jar com
    )
  endif()
endif()
if(USE_MPI)
  target_link_libraries(lightgbm_objs PUBLIC ${MPI_CXX_LIBRARIES})
endif()
if(USE_OPENMP)
  if(CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang")
    target_link_libraries(lightgbm_objs PUBLIC OpenMP::OpenMP_CXX)
    # c_api headers also include OpenMP headers, so compiling
    # lightgbm_capi_objs needs the include directory for OpenMP.
    # Specifying OpenMP in target_link_libraries provides the include
    # directory requirements for compilation.
    # This uses CMake's Transitive Usage Requirements. Refer to the CMake docs:
    # https://cmake.org/cmake/help/v3.16/manual/cmake-buildsystem.7.html#transitive-usage-requirements
    target_link_libraries(lightgbm_capi_objs PUBLIC OpenMP::OpenMP_CXX)
  endif()
endif()
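# Illustrative sketch only, kept commented out so it does not affect the build.
# It shows the transitive usage requirements referred to in the OpenMP block
# above, under the assumption of a hypothetical target demo_objs and a
# hypothetical source src/demo.cpp. Linking the imported target
# OpenMP::OpenMP_CXX (provided by find_package(OpenMP)) attaches its compile
# options and include directories to demo_objs, so no separate
# target_include_directories() call should be needed for the OpenMP headers.
#
#   find_package(OpenMP REQUIRED)
#   add_library(demo_objs OBJECT src/demo.cpp)
#   target_link_libraries(demo_objs PUBLIC OpenMP::OpenMP_CXX)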
if(USE_GPU)
  target_link_libraries(lightgbm_objs PUBLIC ${OpenCL_LIBRARY} ${Boost_LIBRARIES})
endif()
if(__INTEGRATE_OPENCL)
  # targets OpenCL and Boost are added in IntegratedOpenCL.cmake
  add_dependencies(lightgbm_objs OpenCL Boost)
  # variables INTEGRATED_OPENCL_* are set in IntegratedOpenCL.cmake
  target_include_directories(lightgbm_objs PRIVATE ${INTEGRATED_OPENCL_INCLUDES})
  target_compile_definitions(lightgbm_objs PRIVATE ${INTEGRATED_OPENCL_DEFINITIONS})
  target_link_libraries(lightgbm_objs PUBLIC ${INTEGRATED_OPENCL_LIBRARIES})
endif()
if(USE_CUDA OR USE_CUDA_EXP)
  # Disable the CMake warning about policy CMP0104. Refer to issue #3754 and PR #4268.
  # Custom target properties do not propagate, so we need to set them on
  # each target that contains or depends on CUDA sources.
  set_target_properties(lightgbm_objs PROPERTIES CUDA_ARCHITECTURES OFF)
  set_target_properties(_lightgbm PROPERTIES CUDA_ARCHITECTURES OFF)
  set_target_properties(lightgbm PROPERTIES CUDA_ARCHITECTURES OFF)
  set_target_properties(lightgbm_objs PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
  # Device linking is not supported for object libraries,
  # so we have to enable it on the final targets.
  set_target_properties(lightgbm PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
  set_target_properties(_lightgbm PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
  # `histograms` is a list of object libraries. Linking an object library to
  # another object library only propagates usage requirements; the linked
  # objects are not used. Thus we have to call target_link_libraries on the
  # final targets here.
  target_link_libraries(lightgbm PRIVATE ${histograms})
  target_link_libraries(_lightgbm PRIVATE ${histograms})
endif()
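
# A minimal sketch of the object-library behaviour described in the comments
# above. It is kept commented out so that it cannot affect this build. All
# target and file names (demo_kernels, demo_core, demo_final, kernels.cu,
# core.cpp) are hypothetical, and the sketch assumes CMake 3.12 or newer with
# the CUDA language enabled.
#
#   add_library(demo_kernels OBJECT kernels.cu)
#   add_library(demo_core OBJECT core.cpp)
#   # Object library -> object library: demo_core inherits only the usage
#   # requirements (include directories, definitions) of demo_kernels.
#   target_link_libraries(demo_core PUBLIC demo_kernels)
#   add_library(demo_final SHARED)
#   # Only direct links from the final target pull in the object files, so
#   # both object libraries are listed here.
#   target_link_libraries(demo_final PRIVATE demo_core demo_kernels)
#   # Device-link properties are likewise set on the final target.
#   set_target_properties(demo_final PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS ON)
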
if(USE_HDFS)
  target_link_libraries(lightgbm_objs PUBLIC ${HDFS_CXX_LIBRARIES})
endif()
if(WIN32)
  if(MINGW OR CYGWIN)
    target_link_libraries(lightgbm_objs PUBLIC Ws2_32 IPHLPAPI)
  endif()
endif()
if(__BUILD_FOR_R)
  # utils/log.h and the C API use R headers, so both object libraries need to
  # link with the R library.
  if(MSVC)
    set(R_LIB ${LIBR_MSVC_CORE_LIBRARY})
  else()
    set(R_LIB ${LIBR_CORE_LIBRARY})
  endif()
  target_link_libraries(lightgbm_objs PUBLIC ${R_LIB})
  target_link_libraries(lightgbm_capi_objs PUBLIC ${R_LIB})
endif()
#-- Google C++ tests
if(BUILD_CPP_TEST)
  find_package(GTest CONFIG)
  if(NOT GTEST_FOUND)
    message(STATUS "Did not find Google Test in the system root. Fetching Google Test now...")
    include(FetchContent)
    FetchContent_Declare(
      googletest
      GIT_REPOSITORY https://github.com/google/googletest.git
      GIT_TAG release-1.11.0
    )
    FetchContent_MakeAvailable(googletest)
    add_library(GTest::GTest ALIAS gtest)
  endif()
  file(GLOB CPP_TEST_SOURCES tests/cpp_tests/*.cpp)
  if(MSVC)
    set(
      CompilerFlags
      CMAKE_CXX_FLAGS
      CMAKE_CXX_FLAGS_DEBUG
      CMAKE_CXX_FLAGS_RELEASE
      CMAKE_C_FLAGS
      CMAKE_C_FLAGS_DEBUG
      CMAKE_C_FLAGS_RELEASE
    )
    foreach(CompilerFlag ${CompilerFlags})
      string(REPLACE "/MD" "/MT" ${CompilerFlag} "${${CompilerFlag}}")
    endforeach()
  endif()
  add_executable(testlightgbm ${CPP_TEST_SOURCES})
  target_link_libraries(testlightgbm PRIVATE lightgbm_objs GTest::GTest)
endif()
install(
  TARGETS lightgbm _lightgbm
  RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
  LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
  ARCHIVE DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
)
install(DIRECTORY ${LightGBM_HEADER_DIR}/LightGBM DESTINATION ${CMAKE_INSTALL_PREFIX}/include)