Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a featurizer preset with additional matminer featurizer #150

Merged
merged 15 commits into from
Jun 1, 2023

Conversation

gbrunin
Copy link
Collaborator

@gbrunin gbrunin commented May 9, 2023

This is related to composition featurizers only.
This adds a preset to the existing ones, with hopefully additional meaningful features.
It also adds the possibility to use only features that are continuous with respect to the composition. Some features are by their nature not continuous, which can lead to unphysical discontinuities when predicting a property as a function of the materials composition.
I have not added tests for this, it should be tested first with matbench to see if it is actually useful or not.

Copy link
Collaborator

@ml-evs ml-evs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for compiling this @gbrunin, I think this is a useful addition even without fully benchmarking against matbench again. My only general comment, and I'm not convinced myself whether its worth the effort, is whether it makes sense to have the continuous-only vs all features as different presets? This could be very simple in code, i.e. continuous only just inherits from "All" but overrides the lists used in load_featurizer. What do you think?

@gbrunin
Copy link
Collaborator Author

gbrunin commented May 17, 2023

Thanks. I'm not sure about splitting into different presets, because -if I'm not wrong- we would then have 4 presets instead of 2:

  • MatminerAll2023Featurizer
  • CompositionOnlyMatminerAll2023Featurizer
  • ContinuousMatminerAll2023Featurizer
  • ContinuousCompositionOnlyMatminerAll2023Featurizer

It could indeed easily be done, but, as you said, not sure it's worth the effort :p let's see what @ppdebreuck thinks.

Copy link
Owner

@ppdebreuck ppdebreuck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me semble très bien @gbrunin :p . As discussed yesterday, let's keep the number of Featurizer classes to a minimum, and add subsets as params to the constructor. Also, in the future, it would be nice to have a __str__ function that represents nicely the configuration of the featurizer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants