Warnings about bad input data #152
Comments
If reading a dataset fails, only the reader class can issue a specific warning, since pytesmo cannot know why the read failed. We can of course add the requested gpi or lon/lat and the data source name to the pytesmo-level warning.
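A minimal sketch of what such a contextual warning could carry; the wrapper function, its arguments, and the `read_ts` call are assumptions for illustration, not the exact pytesmo interface:

```python
import warnings


def read_with_context(reader, gpi, lon, lat, dataset_name):
    """Read one time series and attach context to any failure warning.

    Sketch only: the wrapper name and reader interface are assumptions,
    not pytesmo's actual API.
    """
    try:
        return reader.read_ts(gpi)
    except Exception as exc:  # only the reader knows why the read failed
        warnings.warn(
            "Reading dataset '{}' failed for gpi={} (lon={}, lat={}): {}".format(
                dataset_name, gpi, lon, lat, exc
            ),
            UserWarning,
        )
        return None
```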
For the temporal matching we can add a warning if no matches are found. Probably in the validation framework, since the temporal matcher does not have all the information needed to issue a useful warning.
I could also imagine a strict mode, or something like it, that raises an exception on these failures.
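A strict mode could be little more than a flag that turns the warning into an exception. A minimal sketch, assuming a hypothetical `strict` option that pytesmo does not have today:

```python
import warnings


def report_failure(message, strict=False):
    """Warn about a failed step, or raise immediately when strict mode is on.

    Sketch of the proposed behaviour; the ``strict`` flag is hypothetical.
    """
    if strict:
        raise RuntimeError(message)
    warnings.warn(message, UserWarning)
```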
A more general question is whether a warning is enough for your purposes. Wouldn't you prefer a results object with more detailed information about the step at which a validation failed?
On Mon, Sep 17, 2018, 16:34 D. Baum wrote:
As far as I can see: if there's "bad" input data, pytesmo usually either
drops it or issues a (sometimes quite generic) warning.
Examples are:
- pytesmo.validation_framework.data_manager.DataManager.read_ds:
warnings are issued, but the exception and sometimes the dataset name and
argument information are omitted.
- pytesmo.temporal_matching.df_match, lines 90-117: If there are no
matches between data and reference, no warning is given and an empty (or
filled with NaN) DataFrame is returned.
Is there a general philosophy behind this, like "don't bother the user at
all, just give them the results we can produce and let them look into
missing or faulty data themselves"?
Since we're currently trying to build a user-friendly webservice that uses
pytesmo for validations, we'd like to tell the user not only "x% of your
input data didn't yield results" but also ideally *why* that was the
case. But that may clash with the more Python-developer-oriented approach
pytesmo has?
Would you be open to us adding more warnings? How much would be too much?
Might this be done with https://docs.python.org/3/library/warnings.html#the-warnings-filter (see the sketch below)? Re the results object: I hadn't thought that far. It sounds promising/interesting, but it may be a major change, right? A tricky part may be storing the results in a netCDF file when they contain error reports as well as results arrays. PS: I'm currently playing around in a branch here but haven't done too much yet: https://github.com/awst-austria/pytesmo/tree/verbose_warnings
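For reference, the standard warnings filter can already promote selected warnings to exceptions, so callers could opt into stricter behaviour without any new pytesmo API. The warning message used here is made up:

```python
import warnings

# Turn every UserWarning whose message starts with "Reading dataset" into an
# exception; all other warnings keep their default behaviour.
warnings.filterwarnings("error", message="Reading dataset", category=UserWarning)

try:
    warnings.warn("Reading dataset 'ASCAT' failed for gpi=1234", UserWarning)
except UserWarning as exc:
    print("caught as exception:", exc)
```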
Yes, that should work fine.
Using a results object instead of the dictionary we currently use should not be too big of a change. But I could be wrong.
We would have to come up with a flagging system in which each error has a value. That should then be fairly easy to store according to CF conventions. See http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#flags
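A sketch of how such a flag variable could look in the output file, following the CF flags convention; the file name, variable name, and specific flag meanings are made up for illustration, since pytesmo's ResultsManager does not write anything like this yet:

```python
import netCDF4
import numpy as np

with netCDF4.Dataset("validation_results.nc", "w") as nc:
    nc.createDimension("loc", 4)

    status = nc.createVariable("status", "i1", ("loc",))
    status.long_name = "validation status per location"
    # CF flags convention: enumerated codes plus space-separated meanings
    status.flag_values = np.array([0, 1, 2, 3], dtype="i1")
    status.flag_meanings = "ok reading_failed no_temporal_matches metrics_failed"

    # e.g. locations 1 and 3 had problems
    status[:] = np.array([0, 1, 0, 2], dtype="i1")
```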
And the results object would be put together in … Of course, the trick for creating a netCDF output format would be to foresee the problems that occur and categorise them in a useful fashion (NOT so that all practically occurring issues end up in "other errors"). And then to write a reader/writer for it, I guess?
Yes.
For every exception that we have, we can add an error code/value/bit that we then set in the result. The ResultsManager will have to be updated.
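One way to express "an error code/value/bit per exception" is a bit mask, so that several problems at the same point can be combined into one stored value. A sketch with made-up flag names; the actual codes would have to be agreed on:

```python
import enum


class ValidationStatus(enum.IntFlag):
    """Bit flags for problems during a validation run (names are illustrative)."""

    OK = 0
    READING_FAILED = 1
    NO_TEMPORAL_MATCHES = 2
    SCALING_FAILED = 4
    METRICS_FAILED = 8


# Example: temporal matching found nothing, so the metrics step failed too.
status = ValidationStatus.NO_TEMPORAL_MATCHES | ValidationStatus.METRICS_FAILED
print(int(status))  # 10 -- the integer that would be stored per gpi in the results
```

With bit flags like these, the netCDF status variable would carry the CF `flag_masks` attribute instead of `flag_values`.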