I am new to Pandera and am using it to run schema validations on my dataframe. I want to use a mix of warnings and errors. The error part works seamlessly: I use the index column of failure_cases in SchemaErrors to find the rows of the original dataframe that failed validation, and send those rows back to users as a dataframe.
Below works for errors.
try:
    dynamic_schema.validate(df, lazy=True)
except pa.errors.SchemaErrors as e:
    # run some custom logic using the index of the rows which failed validation
    print(e.failure_cases.groupby('index'))
I want to do the same with warnings, but I cannot find an easy way to get the index of the rows that failed validation: the warning message I get back is a plain str rather than a dataframe.
For the code below, I get
with warnings.catch_warnings(record=True) as caught_warnings:
    dynamic_schema.validate(df, lazy=True, inplace=False)
if caught_warnings:
    for warning in caught_warnings:
        print(type(warning.message.args[0]))  # prints <class 'str'>
output (below is one single string)
<Schema Column(name=series_value_date, type=DataType(str))> failed element-wise validator 0:
<Check validate: last saved date check failed>
failure cases:
index failure_case
0 0 2024-03-26T00:00:00.0000000
1 1 2024-03-24T00:00:00.0000000
2 2 2024-03-23T00:00:00.0000000
3 3 2024-03-22T00:00:00.0000000
If I could somehow access the index column in the string output above, that would cover my use case. Is there a way to do this?
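For reference, here is a minimal sketch of the string-parsing fallback I have in mind. It is not a Pandera API: `failed_index_labels` is a hypothetical helper of my own, and it assumes the warning text keeps the layout shown above (a `failure cases:` line followed by a printed table whose rows are a row number, the index label, and the failure case). It also assumes the index labels contain no whitespace and that the printed table is complete.

```python
import re

# Matches one row of the printed failure_cases table:
# "<row number>  <index label>  <failure_case>"
# The header line ("index  failure_case") does not start with a digit, so it is skipped.
_ROW = re.compile(r"^\s*\d+\s+(\S+)\s+\S.*$")


def failed_index_labels(message):
    """Return the index labels listed under 'failure cases:' in a warning string.

    Hypothetical helper: relies on the message layout, which is an assumption.
    """
    _, found, table = message.partition("failure cases:")
    if not found:
        return []
    labels = []
    for line in table.splitlines():
        match = _ROW.match(line)
        if match:
            label = match.group(1)
            # Convert purely numeric labels so they can be used with an integer index.
            labels.append(int(label) if label.isdigit() else label)
    return labels


# Sample text copied from the output above (column spacing is illustrative).
sample = (
    "<Schema Column(name=series_value_date, type=DataType(str))> "
    "failed element-wise validator 0:\n"
    "<Check validate: last saved date check failed>\n"
    "failure cases:\n"
    "   index                 failure_case\n"
    "0      0  2024-03-26T00:00:00.0000000\n"
    "1      1  2024-03-24T00:00:00.0000000\n"
    "2      2  2024-03-23T00:00:00.0000000\n"
    "3      3  2024-03-22T00:00:00.0000000\n"
)

print(failed_index_labels(sample))
```

Inside the loop over `caught_warnings`, the result could then be passed to `df.loc[...]` to pull the offending rows, the same way I do with `e.failure_cases` for errors. This feels brittle because it depends on how the message is formatted, which is why I am asking whether a structured alternative exists.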