Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix(9559)] - Validation fails for enum field with decimal type #1324

Conversation

mikeleppane
Copy link

@mikeleppane mikeleppane commented Jun 8, 2024

Change Summary

Fixes regression documented in pydantic/pydantic#9559.

Related issue number

pydantic/issues/9559

Checklist

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @davidhewitt

The code changes modify the test cases for enum validation with decimal values. The test cases now only include the values -1, 0, and 1. Additionally, a new assertion is added to validate the decimal values as floats. This change improves the validation of enum fields with decimal values.
Copy link

codecov bot commented Jun 8, 2024

Codecov Report

Attention: Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.64%. Comparing base (ab503cb) to head (5d6986c).
Report is 179 commits behind head on main.

Files with missing lines Patch % Lines
src/validators/enum_.rs 90.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1324      +/-   ##
==========================================
- Coverage   90.21%   89.64%   -0.57%     
==========================================
  Files         106      108       +2     
  Lines       16339    17252     +913     
  Branches       36       41       +5     
==========================================
+ Hits        14740    15466     +726     
- Misses       1592     1766     +174     
- Partials        7       20      +13     
Files with missing lines Coverage Δ
src/validators/generator.rs 93.92% <100.00%> (+0.28%) ⬆️
src/validators/literal.rs 95.45% <ø> (-0.30%) ⬇️
src/validators/model.rs 97.48% <ø> (-0.34%) ⬇️
src/validators/enum_.rs 93.57% <90.00%> (ø)

... and 34 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1f8b30b...5d6986c. Read the comment docs.

Copy link

codspeed-hq bot commented Jun 8, 2024

CodSpeed Performance Report

Merging #1324 will not alter performance

Comparing mikeleppane:fix(9559)/validation-fails-for-enum-field-with-decimal-type (5d6986c) with main (9507a28)

Summary

✅ 155 untouched benchmarks

@mikeleppane
Copy link
Author

please review

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for the PR!

Having just reviewed with this, I'd like to see a different solution. As per pydantic/pydantic#9559 (comment) I think the problem is not limited just to Decimal but to many arbitrary types.

I think also we don't want to add Decimal coercion to our Literal pipeline as it's never valid to have a Decimal as a Literal.

I think instead I would prefer if we added a branch to src/validators/enum_.rs which, as a last resort, attempts to do the equivalent of MyEnum(input). This would add worst-case performance but should restore all the old behaviour.

@mikeleppane
Copy link
Author

mikeleppane commented Jun 15, 2024

Thanks very much for the PR!

 

Having just reviewed with this, I'd like to see a different solution. As per pydantic/pydantic#9559 (comment) I think the problem is not limited just to Decimal but to many arbitrary types.

 

I think also we don't want to add Decimal coercion to our Literal pipeline as it's never valid to have a Decimal as a Literal.

 

I think instead I would prefer if we added a branch to src/validators/enum_.rs which, as a last resort, attempts to do the equivalent of MyEnum(input). This would add worst-case performance but should restore all the old behaviour.

Hey @davidhewitt,

Thanks for the feedback!

Yes, you're correct. The problem isn't limited to Decimal. I placed the implementation in src/validators/enum_.rs initially. But, I moved it to src/validators/literal.rs.  

I've examined this thoroughly and face uncertain next steps. So I need a little bit of help. Perhaps, I don't understand something. Or, the current machinery can't handle arbitrary types well. The problem that I've been pondering is as follows:

from enum import Enum


from pydantic import BaseModel


class EnumClass(Enum):

    enum_value = 1

    enum_value = 2


class ModelClass(BaseModel):

    enum_field: EnumClass

    enum_field_2: EnumClass


class IntWrappable:

    def __init__(self, value: int):

        self.value = value

    

    def __eq__(self, value: object) -> bool:

        return self.value == value


ModelClass(enum_field=1, enum_field_2=IntWrappable(2))
LiteralLookup { expected_bool: None, expected_int: Some({1: 0, 2: 1}), expected_str: None, expected_py_dict: None, expected_py_list: None, values: [Py(0x7f396c73c950), Py(0x7f396cfe3da0)] }

How can we know which expected_ints to validate? This is in the context of enum validation in src/validators/enum_.rs. I understand handling types that can be converted to int (like Decimal). But what about those custom, artificial IntWrappable kinds of classes (NewType works perfectly)? Do we need to require that those classes implement __float__ instead of __eq__ (or both) in this case? Now, if I understand, to validate IntWrappable we need to loop "blindly" LiteralLookup values to check which one matches?

UPDATE:

Äh, maybe I thought about this too hard. It works now for any type. The types must have a test for equality. But, I feel that there might some corner case that this implementation does not cover.

removed try_validate_any function and instead try to create Python enum class.

Test case modifications and fixes.
Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! So rereading after the latest push has sparked new thoughts 🙈

Comment on lines +456 to +458
# GIVEN
class MyEnum(Enum):
VALUE = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realise now that we probably also want to test this with e.g. MyEnum(IntEnum), which I think goes through a separate code pathway but probably was also broken when we moved enum validation to Rust?

Comment on lines +120 to 123
} else if let Some(v) = T::validate_value(py, input, &self.lookup, &self.class, strict)? {
state.floor_exactness(Exactness::Lax);
return Ok(v);
} else if let Some(ref missing) = self.missing {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now that I look at this new point in the diff, I see that (and I regret forgetting this) that we already call _missing_ here. But as we observe in the issue and this PR, simply calling _missing_ is not enough because enum __new__ has more complex logic which isn't encapsulated purely by _missing_.

I wonder if there is a case to have a new branch (before or after this one? not sure 🤔) which calls the enum type, with the logic going here instead of in PlainEnumValidator. Putting the logic here would also solve the special-cased enums like IntEnum, I think.

That does beg the question, though: if we add the case of calling the enum type here, do we need logic for _missing_ at all? My intuition is that we don't, and we should try to phase out the _missing_ logic.

@@ -183,8 +185,14 @@ impl EnumValidateValue for PlainEnumValidator {
} else if py_input.is_instance_of::<PyFloat>() {
return Ok(lookup.validate_int(py, input, false)?.map(|v| v.clone_ref(py)));
}
if py_input.is_instance_of::<PyAny>() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_instance_of::<PyAny>() I think will always be true, will probably be optimized away by the compiler but also not necessary at all IMO.

Suggested change
if py_input.is_instance_of::<PyAny>() {

@sydney-runkle
Copy link
Member

Closing in favor of #1456

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants