Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] Use R standard routines to access character data in C++ #4252

Merged
merged 3 commits into from
May 4, 2021

Conversation

jameslamb
Copy link
Collaborator

Another step towards #3016.

Similar to the changes for numeric data in #4247, this proposes replacing moost uses of the LightGBM-custom R_CHAR_PTR with R's standard interface for accessing character data. Unlike that PR, this requires two steps...extracting a CHARSXP from the object passed in from R, then getting a pointer to its data (char *). That results in a pattern like this:

CHAR(Rf_asChar(filename))

As a result of these changes, the utility function lgb.c_str() is no longer necessary.

Notes for Reviewers

  1. [R-package] predict() breaks when using a Dataset stored in a file #4034 is not fixed by this change, but it is partially fixed. As of this PR, the reproducible examples in that issue yield [LightGBM] [Fatal] Unknown format of training data. instead of [LightGBM] [Fatal] Data file ��?��V doesn't exist.. That can be addressed in a followup PR.

  2. R_CHAR_PTR cannot be completely removed in this PR, because it is still used in EncodeChar. That can be addressed in a future PR.

LGBM_SE EncodeChar(LGBM_SE dest, const char* src, SEXP buf_len, SEXP actual_len, size_t str_len) {
if (str_len > INT32_MAX) {
Log::Fatal("Don't support large string in R-package");
}
INTEGER(actual_len)[0] = static_cast<int>(str_len);
if (Rf_asInteger(buf_len) < static_cast<int>(str_len)) {
return dest;
}
auto ptr = R_CHAR_PTR(dest);
std::memcpy(ptr, src, str_len);
return dest;
}

References

@jameslamb jameslamb requested review from shiyu1994 and StrikerRUS May 4, 2021 04:19
@jameslamb jameslamb requested a review from Laurae2 as a code owner May 4, 2021 04:19
@jameslamb
Copy link
Collaborator Author

jameslamb commented May 4, 2021

/gha run r-valgrind

Workflow R valgrind tests has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/808927490

Status: success ✔️.

@jameslamb
Copy link
Collaborator Author

jameslamb commented May 4, 2021

/gha run r-solaris

Workflow Solaris CRAN check has been triggered! 🚀
https://github.com/microsoft/LightGBM/actions/runs/808928092

solaris-x86-patched: https://builder.r-hub.io/status/lightgbm_3.2.1.99.tar.gz-794f797bf8d943f1b2c7f39086497200
solaris-x86-patched-ods: https://builder.r-hub.io/status/lightgbm_3.2.1.99.tar.gz-347306e12acd4464b21e7121abcb01dd
Reports also have been sent to LightGBM public e-mail: http://www.yopmail.com/lightgbm_rhub_checks
Status: success ✔️.

@jameslamb
Copy link
Collaborator Author

Getting closer! Hopefully just a few more PRs and we can finally be done with #3016.

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants