Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[python] [R-package] refine the parameters for Dataset #2594

Merged
merged 57 commits into from
Feb 19, 2020

Conversation

guolinke
Copy link
Collaborator

@guolinke guolinke commented Nov 26, 2019

to fix #2590

  • remove the pre-filter by min_data

@guolinke guolinke changed the title [WIP]refine the parameters for Dataset refine the parameters for Dataset Nov 27, 2019
@guolinke
Copy link
Collaborator Author

guolinke commented Nov 27, 2019

@StrikerRUS could you help to add the tests, to ensure the warnings are shown as expected?

@StrikerRUS
Copy link
Collaborator

@guolinke Sure, I'll try!

src/c_api.cpp Outdated Show resolved Hide resolved
@StrikerRUS StrikerRUS self-requested a review February 4, 2020 22:22
@guolinke
Copy link
Collaborator Author

guolinke commented Feb 8, 2020

@StrikerRUS I think we can move the discussion to the main thread. it is hard to notice the new conversion in the above sub-thread.

@StrikerRUS
Copy link
Collaborator

@guolinke

unfortunately, It is not trivial. As we don't know the filename passed in c_api is training or k-th valid data. It is hard to retrieve the corresponding init score filename. The easier way in python/R is read the init_score_file by user, and set init_score field in dataset.

another solution is to force the init_score_filename to concat(data_filename, ".init"), and remove all initscore filename parameters.

I think removing filename params is good idea! It will decrease overall number of LightGBM params which is good change for newcomers who struggling with learning a huge number of settings. Also, it will fix inconsistency in behavior when for language packages default filename is respected but it cannot be changed by those params. As we've already removed some sparse-related params in #2699 and minor (or major) version bump is required, it is OK in terms of backward compatibility.

@guolinke
Copy link
Collaborator Author

guolinke commented Feb 8, 2020

yeah, maybe do it in a new PR?

@StrikerRUS
Copy link
Collaborator

@guolinke

yeah, maybe do it in a new PR?

Sure, as you wish!

@StrikerRUS StrikerRUS mentioned this pull request Feb 9, 2020
2 tasks
@StrikerRUS StrikerRUS requested a review from jameslamb February 9, 2020 16:04
@guolinke
Copy link
Collaborator Author

ping @jameslamb for R tests.
@StrikerRUS any other issues need to be addressed?

Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guolinke

any other issues need to be addressed?

As we agreed on making the breakage in docs into Dataset and Booster params, and removing init_score params in a separate PRs, I think this PR is ready except R side (aliases and tests).

@guolinke
Copy link
Collaborator Author

ping @jameslamb for R's part. Maybe R test could be in a new PR?

@jameslamb
Copy link
Collaborator

ping @jameslamb for R's part. Maybe R test could be in a new PR?

just added #2767 . Let me know if you think anything is missing!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reset_param feature_contrib not fixed
3 participants