* Fix LightGBM models locale sensitivity and improve R/W performance.
When Java is used, the default C++ locale is broken. This is true for
Java providers that use the C API or even Python models that require JEP.
This patch solves that issue making the model reads/writes insensitive
to such settings.
To achieve it, within the model read/write codebase:
- C++ streams are imbued with the classic locale
- Calls to functions that are dependent on the locale are replaced
- The default locale is not changed!
This approach means:
- The user's locale is never tampered with, avoiding issues such as
https://github.com/microsoft/LightGBM/issues/2979 with the previous
approach https://github.com/microsoft/LightGBM/pull/2891
- Datasets can still be read according the user's locale
- The model file has a single format independent of locale
Changes:
- Add CommonC namespace which provides faster locale-independent versions of Common's methods
- Model code makes conversions through CommonC
- Cleanup unused Common methods
- Performance improvements. Use fast libraries for locale-agnostic conversion:
- value->string: https://github.com/fmtlib/fmt
- string->double: https://github.com/lemire/fast_double_parser (10x
faster double parsing according to their benchmark)
Bugfixes:
- https://github.com/microsoft/LightGBM/issues/2500
- https://github.com/microsoft/LightGBM/issues/2890
- https://github.com/ninia/jep/issues/205 (as it is related to LGBM as well)
* Align CommonC namespace
* Add new external_libs/ to python setup
* Try fast_double_parser fix#1
Testing commit e09e5aad828bcb16bea7ed0ed8322e019112fdbe
If it works it should fix more LGBM builds
* CMake: Attempt to link fmt without explicit PUBLIC tag
* Exclude external_libs from linting
* Add exernal_libs to MANIFEST.in
* Set dynamic linking option for fmt.
* linting issues
* Try to fix lint includes
* Try to pass fPIC with static fmt lib
* Try CMake P_I_C option with fmt library
* [R-package] Add CMake support for R and CRAN
* Cleanup CMakeLists
* Try fmt hack to remove stdout
* Switch to header-only mode
* Add PRIVATE argument to target_link_libraries
* use fmt in header-only mode
* Remove CMakeLists comment
* Change OpenMP to PUBLIC linking in Mac
* Update fmt submodule to 7.1.2
* Use fmt in header-only-mode
* Remove fmt from CMakeLists.txt
* Upgrade fast_double_parser to v0.2.0
* Revert "Add PRIVATE argument to target_link_libraries"
This reverts commit 3dd45dde7b92531b2530ab54522bb843c56227a7.
* Address James Lamb's comments
* Update R-package/.Rbuildignore
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* Upgrade to fast_double_parser v0.3.0 - Solaris support
* Use legacy code only in Solaris
* Fix lint issues
* Fix comment
* Address StrikerRUS's comments (solaris ifdef).
* Change header guards
Co-authored-by: James Lamb <jaylamb20@gmail.com>
* [R-package] fix R examples and lgb.plot.interpretation
* remove space in gitignore
* try data.table from conda-forge
* update FAQ
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* [R-package] made roxygen2 tags explicit and cleaned up documentation
* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
* Apply suggestions from code review
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
* Update R-package/man/lightgbm.Rd
Co-Authored-By: Nikita Titov <nekit94-08@mail.ru>
* [R-package] moved @name to the top of roxygen blocks and removed some inaccurate information in documentation on parameters
Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
* added R-package docs generation routines
* change theme to be more consistent with sphinx_rtd_theme on main site in terms of color scheme
* placed man folder with old Rd files back
* specify full path to conda and make script more readable by one line - one pkg
* removed commented lines from build_r_site script
* made one line - one argument in build_reference() call
* pin R package versions
* fixed conflict
* Fixed typos in docs
* Fixed inconsistencies in documentation
* Updated strategy for registering routines
* Fixed issues caused by smashing multiple functions into one Rd
* Fixed issues with documentation
* Removed VignetteBuilder and updated Rbuildignore
* Added R build artefacts to gitignore
* Added namespacing on data.table set function. Updated handling of CMakeLists file to get around CRAN check.
* Updated build instructions
* Added R build script
* Removed build_r.sh script and updated R-package install instructions