Mirror of https://github.com/microsoft/LightGBM.git
* [R-package] clarify parameter documentation * fixes to braces * linting
This commit is contained in:
Parent
8e126c80ba
Commit
53602afa47
@@ -730,11 +730,22 @@ Dataset <- R6::R6Class(
#' @description Construct \code{lgb.Dataset} object from dense matrix, sparse matrix
#' or local file (that was created previously by saving an \code{lgb.Dataset}).
#' @param data a \code{matrix} object, a \code{dgCMatrix} object or a character representing a filename
#' @param params a list of parameters
#' @param reference reference dataset
#' @param params a list of parameters. See
#' \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#dataset-parameters}{
#' The "Dataset Parameters" section of the documentation} for a list of parameters
#' and valid values.
#' @param reference reference dataset. When LightGBM creates a Dataset, it does some preprocessing like binning
#' continuous features into histograms. If you want to apply the same bin boundaries from an existing
#' dataset to new \code{data}, pass that existing Dataset to this argument.
#' @param colnames names of columns
#' @param categorical_feature categorical features
#' @param free_raw_data TRUE for need to free raw data after construct
#' @param categorical_feature categorical features. This can either be a character vector of feature
#' names or an integer vector with the indices of the features (e.g.
#' \code{c(1L, 10L)} to say "the first and tenth columns").
#' @param free_raw_data LightGBM constructs its data format, called a "Dataset", from tabular data.
#' By default, that Dataset object on the R side does not keep a copy of the raw data.
#' This reduces LightGBM's memory consumption, but it means that the Dataset object
#' cannot be changed after it has been constructed. If you'd prefer to be able to
#' change the Dataset object after construction, set \code{free_raw_data = FALSE}.
#' @param info a list of information of the \code{lgb.Dataset} object
#' @param ... other information to pass to \code{info} or parameters to pass to \code{params}
#'
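The `reference` and `free_raw_data` behavior documented in the hunk above can be illustrated with a short R sketch (it assumes the lightgbm package is installed; the `max_bin` value is illustrative, not taken from this commit):

```r
library(lightgbm)

# demo data bundled with the package
data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")

# free_raw_data = FALSE keeps a copy of the raw matrix, so the Dataset
# can still be changed after construction
dtrain <- lgb.Dataset(
    data = agaricus.train$data
    , label = agaricus.train$label
    , params = list(max_bin = 63L)
    , free_raw_data = FALSE
)

# passing an existing Dataset as `reference` applies its bin boundaries
# to the new data
dtest <- lgb.Dataset(
    data = agaricus.test$data
    , label = agaricus.test$label
    , reference = dtrain
)
```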
@@ -5,19 +5,20 @@
#' @param valids a list of \code{lgb.Dataset} objects, used for validation
#' @param record Boolean, TRUE will record iteration message to \code{booster$record_evals}
#' @param colnames feature names, if not null, will use this to overwrite the names in dataset
#' @param categorical_feature list of str or int
#' type int represents index,
#' type str represents feature names
#' @param categorical_feature categorical features. This can either be a character vector of feature
#' names or an integer vector with the indices of the features (e.g.
#' \code{c(1L, 10L)} to say "the first and tenth columns").
#' @param callbacks List of callback functions that are applied at each iteration.
#' @param reset_data Boolean, setting it to TRUE (not the default value) will transform the
#' booster model into a predictor model which frees up memory and the
#' original datasets
#' @param ... other parameters, see Parameters.rst for more information. A few key parameters:
#' @param ... other parameters, see \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
#' the "Parameters" section of the documentation} for more information. A few key parameters:
#' \itemize{
#' \item{\code{boosting}: Boosting type. \code{"gbdt"}, \code{"rf"}, \code{"dart"} or \code{"goss"}.}
#' \item{\code{num_leaves}: Maximum number of leaves in one tree.}
#' \item{\code{max_depth}: Limit the max depth for tree model. This is used to deal with
#' overfit when #data is small. Tree still grow by leaf-wise.}
#' overfitting. Trees still grow leaf-wise.}
#' \item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
#' the number of real CPU cores (\code{parallel::detectCores(logical = FALSE)}),
#' not the number of threads (most CPUs use hyper-threading to generate 2 threads
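The two `categorical_feature` forms described above (character vector of names or integer vector of indices) can be sketched as follows (assumes the lightgbm package is installed; the toy data and parameter values are illustrative):

```r
library(lightgbm)

set.seed(708L)
# toy data: column 1 holds integer category codes, column 2 is continuous
X <- matrix(
    c(sample(0L:3L, size = 200L, replace = TRUE), rnorm(200L))
    , ncol = 2L
)
y <- as.integer(X[, 1L] == 2L)

# integer-index form; with named columns, the character form would also work
dtrain <- lgb.Dataset(
    data = X
    , label = y
    , categorical_feature = c(1L)
)

model <- lgb.train(
    params = list(
        objective = "binary"
        , num_leaves = 31L
        # physical cores, as the num_threads documentation recommends
        , num_threads = parallel::detectCores(logical = FALSE)
    )
    , data = dtrain
    , nrounds = 5L
)
```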
@@ -5,10 +5,11 @@
#' @param data a \code{lgb.Dataset} object, used for training. Some functions, such as \code{\link{lgb.cv}},
#' may allow you to pass other types of data like \code{matrix} and then separately supply
#' \code{label} as a keyword argument.
#' @param early_stopping_rounds int. Activates early stopping. Requires at least one validation data
#' and one metric. If there's more than one, will check all of them
#' except the training data. Returns the model with (best_iter + early_stopping_rounds).
#' If early stopping occurs, the model will have 'best_iter' field.
#' @param early_stopping_rounds int. Activates early stopping. When this parameter is non-null,
#' training will stop if the evaluation of any metric on any validation set
#' fails to improve for \code{early_stopping_rounds} consecutive boosting rounds.
#' If training stops early, the returned model will have attribute \code{best_iter}
#' set to the iteration number of the best iteration.
#' @param eval evaluation function(s). This can be a character vector, function, or list with a mixture of
#' strings and functions.
#'
@@ -48,7 +49,8 @@
#' @param obj objective function, can be character or custom objective function. Examples include
#' \code{regression}, \code{regression_l1}, \code{huber},
#' \code{binary}, \code{lambdarank}, \code{multiclass}
#' @param params List of parameters
#' @param params a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
#' the "Parameters" section of the documentation} for a list of parameters and valid values.
#' @param verbose verbosity for output, if <= 0, will also disable printing of evaluation during training
#' @section Early Stopping:
#'
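The reworded `early_stopping_rounds` behavior can be exercised with a short sketch (assumes the lightgbm package is installed; the metric and round counts are illustrative):

```r
library(lightgbm)

data(agaricus.train, package = "lightgbm")
data(agaricus.test, package = "lightgbm")

dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
dvalid <- lgb.Dataset.create.valid(
    dtrain
    , agaricus.test$data
    , label = agaricus.test$label
)

model <- lgb.train(
    params = list(objective = "binary", metric = "auc")
    , data = dtrain
    , nrounds = 100L
    , valids = list(valid = dvalid)
    # stop if AUC on the validation set fails to improve for
    # 5 consecutive boosting rounds
    , early_stopping_rounds = 5L
)

# if training stopped early, the best iteration number is recorded here
model$best_iter
```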
@@ -2,7 +2,7 @@
#' @title saveRDS for \code{lgb.Booster} models
#' @description Attempts to save a model using RDS. Has an additional parameter (\code{raw})
#' which decides whether to save the raw model or not.
#' @param object R object to serialize.
#' @param object \code{lgb.Booster} object to serialize.
#' @param file a connection or the name of the file where the R object is saved to or read from.
#' @param ascii a logical. If TRUE or NA, an ASCII representation is written; otherwise (default),
#' a binary one is used. See the comments in the help for save.
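A minimal round trip with `saveRDS.lgb.Booster` might look like the sketch below (assumes the lightgbm package is installed; the tiny model and file name are illustrative only):

```r
library(lightgbm)

# fit a throwaway model on the bundled demo data
data(agaricus.train, package = "lightgbm")
dtrain <- lgb.Dataset(agaricus.train$data, label = agaricus.train$label)
model <- lgb.train(
    params = list(objective = "binary")
    , data = dtrain
    , nrounds = 2L
)

model_file <- tempfile(fileext = ".rds")

# the `raw` parameter mentioned above controls whether the raw model
# is stored alongside the R object
saveRDS.lgb.Booster(model, file = model_file)

# reload it later with the matching reader
model2 <- readRDS.lgb.Booster(model_file)
```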
@@ -18,15 +18,26 @@ lgb.Dataset(
\arguments{
\item{data}{a \code{matrix} object, a \code{dgCMatrix} object or a character representing a filename}

\item{params}{a list of parameters}
\item{params}{a list of parameters. See
\href{https://lightgbm.readthedocs.io/en/latest/Parameters.html#dataset-parameters}{
The "Dataset Parameters" section of the documentation} for a list of parameters
and valid values.}

\item{reference}{reference dataset}
\item{reference}{reference dataset. When LightGBM creates a Dataset, it does some preprocessing like binning
continuous features into histograms. If you want to apply the same bin boundaries from an existing
dataset to new \code{data}, pass that existing Dataset to this argument.}

\item{colnames}{names of columns}

\item{categorical_feature}{categorical features}
\item{categorical_feature}{categorical features. This can either be a character vector of feature
names or an integer vector with the indices of the features (e.g.
\code{c(1L, 10L)} to say "the first and tenth columns").}

\item{free_raw_data}{TRUE for need to free raw data after construct}
\item{free_raw_data}{LightGBM constructs its data format, called a "Dataset", from tabular data.
By default, that Dataset object on the R side does not keep a copy of the raw data.
This reduces LightGBM's memory consumption, but it means that the Dataset object
cannot be changed after it has been constructed. If you'd prefer to be able to
change the Dataset object after construction, set \code{free_raw_data = FALSE}.}

\item{info}{a list of information of the \code{lgb.Dataset} object}
@@ -29,7 +29,8 @@ lgb.cv(
)
}
\arguments{
\item{params}{List of parameters}
\item{params}{a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for a list of parameters and valid values.}

\item{data}{a \code{lgb.Dataset} object, used for training. Some functions, such as \code{\link{lgb.cv}},
may allow you to pass other types of data like \code{matrix} and then separately supply
@@ -104,10 +105,11 @@ the \code{nfold} and \code{stratified} parameters are ignored.}
names or an integer vector with the indices of the features (e.g.
\code{c(1L, 10L)} to say "the first and tenth columns").}

\item{early_stopping_rounds}{int. Activates early stopping. Requires at least one validation data
and one metric. If there's more than one, will check all of them
except the training data. Returns the model with (best_iter + early_stopping_rounds).
If early stopping occurs, the model will have 'best_iter' field.}
\item{early_stopping_rounds}{int. Activates early stopping. When this parameter is non-null,
training will stop if the evaluation of any metric on any validation set
fails to improve for \code{early_stopping_rounds} consecutive boosting rounds.
If training stops early, the returned model will have attribute \code{best_iter}
set to the iteration number of the best iteration.}

\item{callbacks}{List of callback functions that are applied at each iteration.}
@@ -24,7 +24,8 @@ lgb.train(
)
}
\arguments{
\item{params}{List of parameters}
\item{params}{a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for a list of parameters and valid values.}

\item{data}{a \code{lgb.Dataset} object, used for training. Some functions, such as \code{\link{lgb.cv}},
may allow you to pass other types of data like \code{matrix} and then separately supply
@@ -82,14 +83,15 @@ may allow you to pass other types of data like \code{matrix} and then separately

\item{colnames}{feature names, if not null, will use this to overwrite the names in dataset}

\item{categorical_feature}{list of str or int
type int represents index,
type str represents feature names}
\item{categorical_feature}{categorical features. This can either be a character vector of feature
names or an integer vector with the indices of the features (e.g.
\code{c(1L, 10L)} to say "the first and tenth columns").}

\item{early_stopping_rounds}{int. Activates early stopping. Requires at least one validation data
and one metric. If there's more than one, will check all of them
except the training data. Returns the model with (best_iter + early_stopping_rounds).
If early stopping occurs, the model will have 'best_iter' field.}
\item{early_stopping_rounds}{int. Activates early stopping. When this parameter is non-null,
training will stop if the evaluation of any metric on any validation set
fails to improve for \code{early_stopping_rounds} consecutive boosting rounds.
If training stops early, the returned model will have attribute \code{best_iter}
set to the iteration number of the best iteration.}

\item{callbacks}{List of callback functions that are applied at each iteration.}
@@ -97,12 +99,13 @@ If early stopping occurs, the model will have 'best_iter' field.}
booster model into a predictor model which frees up memory and the
original datasets}

\item{...}{other parameters, see Parameters.rst for more information. A few key parameters:
\item{...}{other parameters, see \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for more information. A few key parameters:
\itemize{
\item{\code{boosting}: Boosting type. \code{"gbdt"}, \code{"rf"}, \code{"dart"} or \code{"goss"}.}
\item{\code{num_leaves}: Maximum number of leaves in one tree.}
\item{\code{max_depth}: Limit the max depth for tree model. This is used to deal with
overfit when #data is small. Tree still grow by leaf-wise.}
overfitting. Trees still grow leaf-wise.}
\item{\code{num_threads}: Number of threads for LightGBM. For the best speed, set this to
the number of real CPU cores (\code{parallel::detectCores(logical = FALSE)}),
not the number of threads (most CPUs use hyper-threading to generate 2 threads
@@ -10,10 +10,11 @@
may allow you to pass other types of data like \code{matrix} and then separately supply
\code{label} as a keyword argument.}

\item{early_stopping_rounds}{int. Activates early stopping. Requires at least one validation data
and one metric. If there's more than one, will check all of them
except the training data. Returns the model with (best_iter + early_stopping_rounds).
If early stopping occurs, the model will have 'best_iter' field.}
\item{early_stopping_rounds}{int. Activates early stopping. When this parameter is non-null,
training will stop if the evaluation of any metric on any validation set
fails to improve for \code{early_stopping_rounds} consecutive boosting rounds.
If training stops early, the returned model will have attribute \code{best_iter}
set to the iteration number of the best iteration.}

\item{eval}{evaluation function(s). This can be a character vector, function, or list with a mixture of
strings and functions.
@@ -59,7 +60,8 @@ If early stopping occurs, the model will have 'best_iter' field.}
\code{regression}, \code{regression_l1}, \code{huber},
\code{binary}, \code{lambdarank}, \code{multiclass}}

\item{params}{List of parameters}
\item{params}{a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for a list of parameters and valid values.}

\item{verbose}{verbosity for output, if <= 0, will also disable printing of evaluation during training}
}
@@ -28,7 +28,8 @@ may allow you to pass other types of data like \code{matrix} and then separately

\item{weight}{vector of sample weights. If not NULL, will be set in the dataset}

\item{params}{List of parameters}
\item{params}{a list of parameters. See \href{https://lightgbm.readthedocs.io/en/latest/Parameters.html}{
the "Parameters" section of the documentation} for a list of parameters and valid values.}

\item{nrounds}{number of training rounds}
@@ -36,10 +37,11 @@ may allow you to pass other types of data like \code{matrix} and then separately

\item{eval_freq}{evaluation output frequency, only has an effect when verbose > 0}

\item{early_stopping_rounds}{int. Activates early stopping. Requires at least one validation data
and one metric. If there's more than one, will check all of them
except the training data. Returns the model with (best_iter + early_stopping_rounds).
If early stopping occurs, the model will have 'best_iter' field.}
\item{early_stopping_rounds}{int. Activates early stopping. When this parameter is non-null,
training will stop if the evaluation of any metric on any validation set
fails to improve for \code{early_stopping_rounds} consecutive boosting rounds.
If training stops early, the returned model will have attribute \code{best_iter}
set to the iteration number of the best iteration.}

\item{save_name}{File name to use when writing the trained model to disk. Should end in ".model".}
@@ -15,7 +15,7 @@ saveRDS.lgb.Booster(
)
}
\arguments{
\item{object}{R object to serialize.}
\item{object}{\code{lgb.Booster} object to serialize.}

\item{file}{a connection or the name of the file where the R object is saved to or read from.}