[docs] document how to pass multi-value params from Python and R (fixes #4345) #4346

Merged
merged 5 commits into from
Jun 9, 2021
125 changes: 125 additions & 0 deletions R-package/tests/testthat/test_basic.R
@@ -2042,3 +2042,128 @@ test_that(paste0("lgb.train() gives same results when using interaction_constrai
expect_equal(pred1, pred2)

})

context("monotone constraints")

.generate_trainset_for_monotone_constraints_tests <- function(x3_to_categorical) {
  n_samples <- 3000L
  x1_positively_correlated_with_y <- rnorm(n = n_samples)
  x2_negatively_correlated_with_y <- rnorm(n = n_samples)
  x3_negatively_correlated_with_y <- rnorm(n = n_samples)
  if (x3_to_categorical) {
    x3_negatively_correlated_with_y <- as.integer(abs(runif(n_samples) / 0.25))
    categorical_features <- 3L
  } else {
    x3_negatively_correlated_with_y <- runif(n_samples)
    categorical_features <- NULL
  }
  X <- data.matrix(
    data.frame(
      list(
        x1_positively_correlated_with_y
        , x2_negatively_correlated_with_y
        , x3_negatively_correlated_with_y
      )
    )
  )
  zs <- rnorm(n = n_samples, mean = 0.0, sd = 0.01)
  # scale factors must be positive so each term keeps the sign its name implies
  scales <- 10.0 * (runif(6L) + 0.5)
  y <- (
    scales[1L] * x1_positively_correlated_with_y
    + sin(scales[2L] * pi * x1_positively_correlated_with_y)
    - scales[3L] * x2_negatively_correlated_with_y
    - cos(scales[4L] * pi * x2_negatively_correlated_with_y)
    - scales[5L] * x3_negatively_correlated_with_y
    - cos(scales[6L] * pi * x3_negatively_correlated_with_y)
    + zs
  )
  return(lgb.Dataset(
    data = X
    , label = y
    , categorical_feature = categorical_features
    , free_raw_data = FALSE
  ))
}

.is_increasing <- function(y) {
  return(all(diff(y) >= 0.0))
}

.is_decreasing <- function(y) {
  return(all(diff(y) <= 0.0))
}

.is_non_monotone <- function(y) {
  return(any(diff(y) < 0.0) && any(diff(y) > 0.0))
}

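.is_correctly_constrained <- function(learner, x3_to_category) {
  iterations <- 10L
  n <- 1000L
  variable_x <- seq_len(n) / n
  fixed_xs_values <- seq_len(n)
  for (i in seq_len(iterations)) {
    # hold the non-varying features constant at the i-th fixed value
    fixed_x <- rep(fixed_xs_values[i], n)
    monotonically_increasing_x <- data.matrix(
      data.frame(
        list(variable_x, fixed_x, fixed_x)
      )
    )
    monotonically_increasing_y <- predict(learner, monotonically_increasing_x)

    monotonically_decreasing_x <- data.matrix(
      data.frame(
        list(fixed_x, variable_x, fixed_x)
      )
    )
    monotonically_decreasing_y <- predict(learner, monotonically_decreasing_x)

    # assumed completion: vary the unconstrained third feature, using the same
    # integer encoding as the training data when it is categorical
    if (x3_to_category) {
      x3_values <- as.integer(variable_x / 0.25)
    } else {
      x3_values <- variable_x
    }
    non_monotone_x <- data.matrix(
      data.frame(
        list(
          fixed_x
          , fixed_x
          , x3_values
        )
      )
    )
    non_monotone_y <- predict(learner, non_monotone_x)

    if (!(.is_increasing(monotonically_increasing_y)
          && .is_decreasing(monotonically_decreasing_y)
          && .is_non_monotone(non_monotone_y))) {
      return(FALSE)
    }
  }
  return(TRUE)
}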

for (x3_to_categorical in c(TRUE, FALSE)) {
  for (monotone_constraints_method in c("basic", "intermediate", "advanced")) {
    test_msg <- paste0(
      "lgb.train() supports monotone constraints ("
      , "categoricals="
      , x3_to_categorical
      , ", method="
      , monotone_constraints_method
      , ")"
    )
    test_that(test_msg, {
      set.seed(708L)
      dtrain <- .generate_trainset_for_monotone_constraints_tests(
        x3_to_categorical = x3_to_categorical
      )
      params <- list(
        min_data = 20L
        , num_leaves = 20L
        , use_missing = FALSE
      )
      unconstrained_model <- lgb.train(
        params = params
        , data = dtrain
        , obj = "regression_l2"
        , nrounds = 10L
      )
      params[["monotone_constraints"]] <- c(1L, -1L, 0L)
      params[["monotone_constraints_method"]] <- monotone_constraints_method
      constrained_model <- lgb.train(
        params = params
        , data = dtrain
        , obj = "regression_l2"
        , nrounds = 10L
      )
      X <- dtrain$.__enclos_env__$private$raw_data
    })
  }
}
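
The three predicates used by the R test (`.is_increasing`, `.is_decreasing`, `.is_non_monotone`) are simple checks on consecutive differences. As a rough illustration only, here is a minimal Python sketch of the same checks; the function names are hypothetical translations and are not part of LightGBM or its test suite.

```python
def _diffs(ys):
    """Consecutive differences, analogous to R's diff()."""
    return [b - a for a, b in zip(ys, ys[1:])]


def is_increasing(ys):
    # non-strict: flat segments are allowed
    return all(d >= 0.0 for d in _diffs(ys))


def is_decreasing(ys):
    # non-strict: flat segments are allowed
    return all(d <= 0.0 for d in _diffs(ys))


def is_non_monotone(ys):
    # must go both up and down somewhere along the sequence
    diffs = _diffs(ys)
    return any(d < 0.0 for d in diffs) and any(d > 0.0 for d in diffs)
```

Note that a constant sequence counts as both increasing and decreasing under these non-strict definitions, which matches the `>=` and `<=` comparisons in the R helpers.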
6 changes: 6 additions & 0 deletions docs/Parameters.rst
@@ -514,6 +514,12 @@ Learning Control Parameters

- you need to specify all features in order. For example, ``mc=-1,0,1`` means decreasing for 1st feature, non-constraint for 2nd feature and increasing for the 3rd feature

- in the CLI or C++, use a string like ``"-1,0,1"``

- in the Python package, can use either a string or a list like ``[-1, 0, 1]``
Collaborator

@StrikerRUS StrikerRUS Jun 7, 2021

@jameslamb I'd like to keep this important doc in a consistent state. The changes proposed in this PR actually apply to any multi-***-type parameter. We have a lot of such params, e.g. max_bin_by_feature, cegb_penalty_feature_lazy, cegb_penalty_feature_coupled, categorical_feature, label_gain, etc. I think it will be better to write a separate paragraph about how multi-*** params can be passed to a program, if you think there should be some clarification for this.

Also, I'm against documenting the internal string format for language wrappers. Actually, all params are passed via a string internally.
#4101 (comment)
I don't think we should expose this and it's better to encourage users to use native language structures to pass params.
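
To make the point about internal strings concrete, here is a minimal, hedged sketch of how a wrapper could flatten native list values into the comma-separated form before handing parameters to the core library. `serialize_param` and `params_to_str` are hypothetical helpers written for illustration; they are not claimed to be LightGBM's actual implementation.

```python
def serialize_param(value):
    """Hypothetical helper: render one parameter value as a string."""
    if isinstance(value, (list, tuple)):
        # multi-value params become a comma-separated string, e.g. "-1,0,1"
        return ",".join(str(item) for item in value)
    return str(value)


def params_to_str(params):
    """Hypothetical helper: join key=value pairs into one parameter string."""
    return " ".join(
        f"{key}={serialize_param(value)}" for key, value in params.items()
    )


# native structures, as a user would normally write them
params_native = {
    "monotone_constraints": [-1, 0, 1],
    "max_bin_by_feature": [10, 20, 30],
}
# the equivalent string form
params_string = {
    "monotone_constraints": "-1,0,1",
    "max_bin_by_feature": "10,20,30",
}
```

Under this sketch both dictionaries serialize to the same parameter string, which is why users can stick to native lists and never need to know the string format.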

Collaborator Author

ok, I agree with both of these points. I'll split the unit tests part of this into a separate PR.

> if you think there should be some clarification for this.

I definitely do. If an expert LightGBM user like @mayer79 wasn't aware (#4345 (comment)) then I think many others will be unaware or will spend time trying to figure it out from unit tests / example code.

Collaborator Author

Ok, I've changed this to just document the concept generally. I think it's ready for review.

- in the R package, can use either a string or a vector like ``c(-1, 0, 1)``

- ``monotone_constraints_method`` :raw-html:`<a id="monotone_constraints_method" title="Permalink to this parameter" href="#monotone_constraints_method">&#x1F517;&#xFE0E;</a>`, default = ``basic``, type = enum, options: ``basic``, ``intermediate``, ``advanced``, aliases: ``monotone_constraining_method``, ``mc_method``

- used only if ``monotone_constraints`` is set
3 changes: 3 additions & 0 deletions include/LightGBM/config.h
@@ -471,6 +471,9 @@ struct Config {
// desc = used for constraints of monotonic features
// desc = ``1`` means increasing, ``-1`` means decreasing, ``0`` means non-constraint
// desc = you need to specify all features in order. For example, ``mc=-1,0,1`` means decreasing for 1st feature, non-constraint for 2nd feature and increasing for the 3rd feature
// desc = in the CLI or C++, use a string like ``"-1,0,1"``
// desc = in the Python package, can use either a string or a list like ``[-1, 0, 1]``
// desc = in the R package, can use either a string or a vector like ``c(-1, 0, 1)``
Contributor

Suggested change
- // desc = in the R package, can use either a string or a vector like ``c(-1, 0, 1)``
+ // desc = in the R package, can use either a string as in the CLI or a vector like ``c(-1, 0, 1)``

Collaborator Author

based on #4346 (comment), I think we're going to add general documentation on the fact that you can use a list in Python / vector in R, instead of specifically adding notes like this on each parameter. So I've reverted the changes to the specific monotone_constraints docs.

std::vector<int8_t> monotone_constraints;

// type = enum
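
The `monotone_constraints` description above says that values are given per feature, in order, with ``1`` meaning increasing, ``-1`` decreasing and ``0`` non-constraint. The following is a small sketch, under the assumption that both the string form and the list form should yield the same per-feature values; `parse_monotone_constraints` and `describe_constraints` are hypothetical names used only for this illustration.

```python
MEANING = {1: "increasing", -1: "decreasing", 0: "non-constraint"}


def parse_monotone_constraints(raw):
    """Hypothetical parser: accept "-1,0,1" or [-1, 0, 1], return a list of ints."""
    if isinstance(raw, str):
        values = [int(token) for token in raw.split(",") if token.strip()]
    else:
        values = [int(value) for value in raw]
    for value in values:
        if value not in MEANING:
            raise ValueError(f"invalid monotone constraint: {value}")
    return values


def describe_constraints(raw):
    """One human-readable line per feature, numbered from 1 as in the docs."""
    return [
        f"feature {index}: {MEANING[value]}"
        for index, value in enumerate(parse_monotone_constraints(raw), start=1)
    ]
```

For example, `describe_constraints("-1,0,1")` and `describe_constraints([-1, 0, 1])` both describe a decreasing first feature, an unconstrained second feature and an increasing third feature.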