-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port apply-geolocation-rules to augur curate #1491
Conversation
87053d9
to
6f67295
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for picking this up @victorlin! I added another test in b80bc4e to make sure I'm not crazy for remembering that we can override default rules.
pass | ||
|
||
|
||
def load_geolocation_rules(geolocation_rules_file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
silly question #2: do we eschew types in augur
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't, you'll find some here and there in two forms:
-
Types in docstrings
Lines 33 to 39 in 295f280
metadata_file : str Path to a metadata file to load. delimiters : list of str List of possible delimiters to check for between columns in the metadata. Only one delimiter will be inferred. columns : list of str List of columns to read. If unspecified, read all columns. -
Type annotations
augur/augur/filter/subsample.py
Lines 383 to 386 in 295f280
def _calculate_sequences_per_group( target_max_value: int, group_sizes: Collection[int] ) -> int:
These are picked up and rendered in docs such as augur.io.read_metadata
:
But typing isn't ubiquitous and what's typed isn't even validated.
Historically most of the codebase has been untyped and we as a team have only recently begun using types in untyped languages. It hasn't been a priority to go back and add more types to existing code, but that's something I would love to see happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most of the codebase has been untyped and we as a team have only recently begun using types in untyped languages. It hasn't been a priority to go back and add more types to existing code, but that's something I would love to see happen.
one way to expand the use of types would be to have an general expectation that any new code being added would (at a minimum) annotate method signatures...
} | ||
""" | ||
geolocation_rules = defaultdict(lambda: defaultdict(lambda: defaultdict(dict))) | ||
with open(geolocation_rules_file, 'r') as rules_fh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm gonna have a bunch of silly questions i think: is there a pattern in augur around who is responsible for handling (or, i guess, not handling) exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see #642.
Keep the questions coming! It's good to know what seems weird from a fresh perspective. Also this comment prompted me go back and update that outdated issue.
Removed the executable permission and shebang since this file is no longer expected to be called directly. Copied from: <https://github.com/nextstrain/ingest/blob/c94d78d1f38b99e893007a76526f3d3824ecded0/apply-geolocation-rules>
Added the minimum required changes to allow this subcommand to work: - renamed the file to work as a python module (i.e. replace hyphens with underscores, add .py extension) - added it as a subcommand of augur curate (after titlecase to be consistent with the order used in existing workflows) - replaced the main entrypoint with register_parser() + run() - hooked up to augur curate's argparse parser - updated to work with JSON records directly (i.e. letting augur curate handle stdin/stdout)
To be consistent with the rest of the codebase.
b80bc4e
to
295f280
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1491 +/- ##
==========================================
+ Coverage 68.86% 69.15% +0.29%
==========================================
Files 69 70 +1
Lines 7608 7694 +86
Branches 1861 1887 +26
==========================================
+ Hits 5239 5321 +82
- Misses 2086 2087 +1
- Partials 283 286 +3 ☔ View full report in Codecov by Sentry. |
Description of proposed changes
Porting over from nextstrain/ingest with minimal functional changes.
Related issue(s)
Checklist