[R-package] CRAN error: r-devel-linux-x86_64-debian-clang #5502
Comments
@shiyu1994 can you please check your email and let us know here if CRAN has contacted you? The last time something similar happened (#4923), we were given just a few days to submit a patched version (#4930) before CRAN archived the package. If we need a v3.3.3 release with a patch, it's important that we start that work as soon as possible, to avoid disruptions to users. |
@jameslamb Thanks for the reminder! I just searched my email inbox and found this. It seems that the issue is due to the test case for non-ASCII characters in feature names.
-- 1. Failure (test_basic.R:1276:5): lgb.train() supports non-ASCII feature name
dumped_model[["feature_names"]] not identical to iconv(feature_names, to = "UTF-8").
4/4 mismatches
x[1]: "F_é\u009b¶"
y[1]: "F_<U+96F6>"
x[2]: "F_äž\u0080"
y[2]: "F_<U+4E00>"
x[3]: "F_äº\u008c"
y[3]: "F_<U+4E8C>"
x[4]: "F_äž\u0089"
y[4]: "F_<U+4E09>" |
Sorry for ignoring the email. I will pay closer attention to such emails from now on. |
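For anyone reading the mismatch above: the `x` values look like the UTF-8 bytes of the feature names decoded with a single-byte encoding. The `ž` in `x[2]` and `x[4]` suggests ISO-8859-15 rather than ISO-8859-1 (byte 0xB8 is `ž` only in the former), but the thread does not confirm which locale CRAN's machine uses, so treat that as an assumption. A minimal base-R sketch that reproduces the garbled strings under that assumption:

```r
# Feature names from the failing test, written with escapes so that this
# file parses the same way in any locale.
feature_names <- c("F_\u96f6", "F_\u4e00", "F_\u4e8c", "F_\u4e09")

# Raw UTF-8 bytes of each name, e.g. 46 5f e9 9b b6 for the first one.
utf8_bytes <- lapply(enc2utf8(feature_names), charToRaw)

# ASSUMPTION: the bytes were decoded as ISO-8859-15.
# iconv() accepts a list of raw vectors as input.
garbled <- iconv(utf8_bytes, from = "ISO-8859-15", to = "UTF-8")

# The strings reported as `x` in the CRAN failure, spelled with escapes.
expected <- c(
    "F_\u00e9\u009b\u00b6"
    , "F_\u00e4\u017e\u0080"
    , "F_\u00e4\u00ba\u008c"
    , "F_\u00e4\u017e\u0089"
)
stopifnot(garbled == expected)
```

If that reading is right, the model dump returned correct UTF-8 bytes and the mismatch came from how those bytes were labelled or interpreted on the R side, not from the bytes themselves.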
AH! @shiyu1994 thank you. This is a VERY serious problem. It means we now only have 3 more days to upload a new version of the R package to CRAN, or
The current So I think we should do a special, patch-only v3.3.3 release containing only the changes necessary to satisfy CRAN, similar to what we did the last time this happened (#4930). I'm going to do the following:
Then I think we should try to release from that branch. If we run into any issues (e.g. maybe our CI jobs don't work any more with such an old version of the code), I think we can manually create a Usually I'd rely heavily on @StrikerRUS 's support for this, but I haven't seen him here for a few weeks so I don't think we should wait for him. |
@jameslamb Thanks. If you need any help during the release, please feel free to ping me. |
ok thank you very much! |
Actually @shiyu1994 ... could you review my proposed patch in #5503? I think the change to |
I'm investigating this one right now. |
❗ I just noticed that on the most recent successful build of the CI job that is supposed to mimic this check (build link), we're getting the following environment:
That is
It looks like the container image
source: https://hub.docker.com/r/rhub/debian-clang-devel/tags
And I can see why... the GitHub Actions job that is supposed to publish a new version of that image daily seems to have been broken since August 1st 😭
source: https://github.com/r-hub/rhub-linux-builders/actions/workflows/debian.yml |
Ok, so! After applying a small change to the I suspect that that means the root cause then is that between July 31st and September 24th, one of the following changed in a way that causes this test to fail:
I'll continue investigating. |
I'm able to reproduce the failing test, but it's weird.... when I run the code of the test directly in an R REPL in the new But when I run something like the following in there
cd LightGBM/R-package/tests
Rscript testthat.R
That test on non-ASCII feature names fails 🤔 . So maybe something about the |
It really seems to me like something Consider the following R script.
library(lightgbm)
dtrain <- lgb.Dataset(
data = matrix(rnorm(400L), ncol = 4L)
, label = rnorm(100L)
)
feature_names <- c("F_零", "F_一", "F_二", "F_三")
bst <- lgb.train(
data = dtrain
, nrounds = 5L
, obj = "regression"
, params = list(
metric = "rmse"
, verbose = -1L
)
, colnames = feature_names
)
dumped_model <- jsonlite::fromJSON(bst$dump_model())
all_equal <- all(
dumped_model[["feature_names"]] == feature_names
)
stopifnot(all_equal)
testthat::test_that("empty test", {
testthat::expect_true(TRUE)
})
The presence of a Well...
Rscript --vanilla test-ascii.R
succeeds, like this:
but using
Rscript --vanilla -e "testthat::test_file('test-ascii.R')"
fails!
I'm so confused 😫 |
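One way to narrow down a difference like this is to print the session's locale and the declared encoding of the string literals under both invocations, then compare. The sketch below uses only base R. The file name `test-encoding-diag.R` is hypothetical, and whether the two invocations actually differ in any of these values is not confirmed anywhere in this thread.

```r
# Same literals as the reproduction script. How they end up marked
# depends on how the file was read and parsed.
feature_names <- c("F_零", "F_一", "F_二", "F_三")

diagnostics <- list(
    # character-type locale of the running session
    lc_ctype = Sys.getlocale("LC_CTYPE")
    # TRUE only when the session runs in a UTF-8 locale
    , utf8_locale = isTRUE(l10n_info()[["UTF-8"]])
    # "unknown" (native), "UTF-8", "latin1", or "bytes" per element
    , declared_encoding = Encoding(feature_names)
    # byte length of each literal as stored in memory
    , n_bytes = nchar(feature_names, type = "bytes")
)
utils::str(diagnostics)
```

Saving this as a file and running it once with `Rscript --vanilla` and once through `testthat::test_file()`, as in the two commands above, should show whether the literals are marked differently in the two cases.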
I should have clarified....that difference only happens in the |
Found some evidence that
I'm not sure how or if that's related to the behavior I'm seeing yet, but it does feel relevant. |
@StrikerRUS I know you have a lot of notifications to catch up on, want to be sure you see this: r-hub/rhub-linux-builders#62 ^ until that PR is merged, the CI check here in LightGBM that tries to replicate this CRAN failure will be stuck on a version of r-devel from July 31st, 2022. So you might want to subscribe to notifications there. |
350+, hahaha 😬
Thanks for the reminder! Yeah, I saw that PR. Hope it will be merged. Also, I saw you
for the |
r-hub/rhub-linux-builders#62 has been merged, and a new I can see that that image contains a recent version of R-devel.
docker run \
--rm \
--entrypoint="" \
-it rhub/debian-clang-devel \
/opt/R-devel/bin/R --version
So the CI job in this repo using that image should again be replicating the CRAN check! cc @hcho3 @trivialfis, might be helpful for you as well |
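Since the stale image went unnoticed for almost two months, a small guard in a CI job could flag that kind of drift earlier. A rough sketch, relying only on base R's `R.version`; the 30-day threshold is an arbitrary assumption, and this is not something the LightGBM CI is known to do:

```r
# Build date of the running R, taken from base R's `R.version` list.
build_date <- as.Date(
    paste(R.version$year, R.version$month, R.version$day, sep = "-")
)
age_days <- as.numeric(Sys.Date() - build_date)

# ASSUMPTION: 30 days is an arbitrary threshold chosen for illustration.
max_age_days <- 30

# Only development builds are expected to be refreshed frequently.
is_devel <- grepl("development", R.version$status, fixed = TRUE)

if (is_devel && age_days > max_age_days) {
    stop(sprintf(
        "R-devel build is %.0f days old (built %s)"
        , age_days
        , format(build_date)
    ))
}
```

Run with the container's R-devel, this would fail the job when the image has not been rebuilt recently, instead of silently testing against an old snapshot.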
Thank you for the ping, that's really helpful as we are looking to improve the tests for the R package. |
Description
CRAN checks for {lightgbm} are showing an ERROR for the r-devel-linux-x86_64-debian-clang check flavor.
https://cran.r-project.org/web/checks/check_results_lightgbm.html
Reproducible example
I haven't tried to reproduce this yet outside of the CRAN system.
According to CRAN's logs (link), exactly one test is failing.
Additional Comments
This project has a CI job (#4164) that is intended to exactly replicate CRAN's r-devel-linux-x86_64-debian-clang check. That job has been succeeding. I'm not sure yet what the difference is between that job and CRAN's setup.