We recently saw these metadata validation errors on a dandiset in production (dandi/dandi-archive#1958). These are the errors that were reported:
contributor: String should match pattern '^([\w\s\-\.']+),\s+([\w\s\-\.']+)$'
contributor: Input should be 'Organization'
contributor: String should match pattern 'https://ror.org/[a-z0-9]+$'
The invalid contributor in question turned out to be a Person with an invalid name field; in other words, the first validation error was the actual issue, while the other two were irrelevant and somewhat misleading. What's happening here is that pydantic has no way of knowing, from a validation perspective, whether the object is intended to be a Person or an Organization, because contributor is of type List[Union[Person, Organization]]. It therefore checks both cases: first it validates the object as if it were a Person and gets the first error, then it validates it as an Organization and gets the other two errors.
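The behaviour described above can be reproduced with a minimal sketch. The models below are simplified stand-ins, not the real dandischema classes: the field sets, the `Dandiset` wrapper, the `collect_errors` helper, and the sample data are all assumptions made for illustration, and the sketch assumes pydantic v2.

```python
from typing import List, Literal, Optional, Union

from pydantic import BaseModel, Field, ValidationError


# Simplified, hypothetical stand-ins for the real schema models.
class Person(BaseModel):
    schemaKey: Literal["Person"] = "Person"
    # "Last, First" name pattern taken from the reported error message.
    name: str = Field(pattern=r"^([\w\s\-\.']+),\s+([\w\s\-\.']+)$")
    identifier: Optional[str] = None


class Organization(BaseModel):
    schemaKey: Literal["Organization"] = "Organization"
    name: str
    identifier: str = Field(pattern=r"^https://ror.org/[a-z0-9]+$")


class Dandiset(BaseModel):
    # Plain union: pydantic tries every member and reports all failures.
    contributor: List[Union[Person, Organization]]


def collect_errors(model, data):
    """Return the list of validation errors (empty if the data is valid)."""
    try:
        model.model_validate(data)
    except ValidationError as exc:
        return exc.errors()
    return []


# A Person whose name lacks the required "Last, First" comma (made-up data).
bad = {
    "contributor": [
        {
            "schemaKey": "Person",
            "name": "Jane Doe",
            "identifier": "0000-0002-1825-0097",
        }
    ]
}

plain_errors = collect_errors(Dandiset, bad)
for err in plain_errors:
    print(err["loc"], err["msg"])
```

With these stand-in models, one error comes from the Person branch (the name pattern) and two come from the Organization branch (the `schemaKey` literal and the ROR pattern), mirroring the three messages reported above.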
I propose that we use discriminated unions on the schemaKey field of each pydantic model so we can avoid this in the future. This would allow pydantic to narrow validation to the specific type of the object based on its schemaKey. If we had had this in the scenario above, pydantic would have recognized that the invalid contributor is supposed to be a Person and would not have reported the additional, misleading validation errors that assume it's an Organization.
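A rough sketch of what the proposal could look like, assuming pydantic v2 and using simplified stand-in models rather than the real dandischema classes (the field sets, the `Contributor` alias, the `Dandiset` wrapper, and the sample data are hypothetical). The discriminator requires `schemaKey` to be a `Literal` on every member of the union and to be present in the input.

```python
from typing import Annotated, List, Literal, Optional, Union

from pydantic import BaseModel, Field, ValidationError


# Simplified, hypothetical stand-ins for the real schema models.
class Person(BaseModel):
    schemaKey: Literal["Person"] = "Person"
    name: str = Field(pattern=r"^([\w\s\-\.']+),\s+([\w\s\-\.']+)$")
    identifier: Optional[str] = None


class Organization(BaseModel):
    schemaKey: Literal["Organization"] = "Organization"
    name: str
    identifier: str = Field(pattern=r"^https://ror.org/[a-z0-9]+$")


# Discriminated union: pydantic reads schemaKey first and validates
# only against the matching model.
Contributor = Annotated[
    Union[Person, Organization], Field(discriminator="schemaKey")
]


class Dandiset(BaseModel):
    contributor: List[Contributor]


# Same kind of invalid Person as in the report: the name has no comma.
bad = {
    "contributor": [
        {
            "schemaKey": "Person",
            "name": "Jane Doe",
            "identifier": "0000-0002-1825-0097",
        }
    ]
}

try:
    Dandiset.model_validate(bad)
    discriminated_errors = []
except ValidationError as exc:
    discriminated_errors = exc.errors()

for err in discriminated_errors:
    print(err["loc"], err["msg"])
```

Under these assumptions only the name-pattern error for the Person branch is reported, since the Organization model is never tried. One trade-off worth noting: input that omits `schemaKey` would be rejected outright instead of being matched against each model in turn.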