[cattle-flu] consistent colours #102
The previous implementation generated a metadata file for each subtype (i.e. before filtering) and then used that for all downstream rules. It's more common to (also) generate a post-subsampling metadata file and use this instead. This is both smaller and useful for debugging (as all the data in the TSV file are used in the build). It's also going to make generating colours for specific builds much easier, which I'll do in subsequent commits.
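As a rough illustration of what a post-subsampling metadata file contains, here is a minimal sketch that keeps only the metadata rows whose strain survived subsampling. The function name `subset_metadata`, the `strain` ID column and the in-memory strings are assumptions for the example; the actual workflow rule may well use a different mechanism.

```python
import csv
import io


def subset_metadata(metadata_tsv, kept_strains, id_column="strain"):
    """Return a TSV string with only the rows whose ID is in kept_strains.

    Hypothetical helper: the column name and in-memory I/O are assumptions.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Keep a row only if its strain is part of the subsampled build.
        if row[id_column] in kept_strains:
            writer.writerow(row)
    return out.getvalue()


full = "strain\tdivision\nA\tTexas\nB\tOhio\nC\tKansas\n"
subset = subset_metadata(full, {"A", "C"})
```

Every row in `subset` is used in the build, which is what makes the smaller file easier to debug against.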
This shifts the modification of metadata which is specific to the h5n1-cattle-outbreak/genome/default build to be downstream of the subsetted-metadata file, which is much simpler to reason about. We could consider doing something similar for the `add_h5_clade` rule, however this wouldn't allow us to use that data as a filtering criterion, and it seems plausible we'd want to do that one day.
Make variables and functions defined in the snakefile available to custom rule files such as cattle-flu.smk. See <#100 (comment)> for more context.
Uses our well-established pattern of a color-ordering file and a set of hexes to produce a well-ordered color palette. Here we restrict the values within the generating script to better maximise the final color range (rather than setting colors for all divisions and then having augur export subset them). This approach also ensures colors are consistent across "division" and "division_metadata", which aids interpretation of the data. Closes #101
It's possible segment-level builds will include a division not found in the genome build, but in this case Auspice will use a grey colour swatch which I think is acceptable.
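The colour-assignment approach described in the commit message above can be sketched roughly as follows. This is not the script from the PR: `assign_colors`, `color_lines`, the ordering list and the hex values are all illustrative assumptions. The idea is to restrict the curated ordering to the values actually present in the build's metadata before picking a scheme, then reuse the same assignment for both colourings.

```python
def assign_colors(ordering, schemes, present):
    """Map each present value to a hex colour, following the curated order.

    ordering: values in the desired display order (hypothetical input).
    schemes:  dict mapping n -> list of n hex colours (hypothetical input).
    present:  set of values observed in the subsampled metadata.
    """
    # Restrict first, so the scheme is sized to the values in this build
    # rather than to every value in the ordering file.
    kept = [value for value in ordering if value in present]
    if not kept:
        return {}
    if len(kept) not in schemes:
        raise ValueError(f"no colour scheme with {len(kept)} colours")
    return dict(zip(kept, schemes[len(kept)]))


def color_lines(traits, assignment):
    """Emit 'trait<TAB>value<TAB>hex' lines, one block per trait."""
    return [
        f"{trait}\t{value}\t{hex_code}"
        for trait in traits
        for value, hex_code in assignment.items()
    ]


# Illustrative data only; the real ordering and hexes live in the repo's files.
ordering = ["Texas", "Kansas", "Michigan", "Ohio"]
schemes = {
    1: ["#4042C7"],
    2: ["#4042C7", "#DB2823"],
    3: ["#4042C7", "#7CB879", "#DB2823"],
}
assignment = assign_colors(ordering, schemes, present={"Ohio", "Texas"})
lines = color_lines(["division", "division_metadata"], assignment)
```

Because both traits are written from one assignment, a given division gets the same hex under "division" and "division_metadata". A value absent from the assignment simply has no line, which matches the grey-swatch fallback mentioned above.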
7836494 to 4823e7a
CI passed, trial builds passed (ncbi, fauna), results:
assignment = {}
with open(fname) as f:
    for line in f.readlines():
        array = line.lstrip().rstrip().split("\t")
- array = line.lstrip().rstrip().split("\t")
+ array = line.strip().split("\t")
Yes - better! However, I'm going to merge without changes to keep it identical to where it was copied from and thus (maybe?) make "ENH: Add augur colors subcommand based on assign-colors.py scripts" smoother when we get round to it.
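For reference, a small sketch showing that the suggested form parses a line identically to the original one. The sample line and the `parse_line` helper are made up for illustration.

```python
def parse_line(line):
    """Split a tab-separated line after trimming surrounding whitespace."""
    return line.strip().split("\t")


# Hypothetical input line with leading spaces and a trailing space/newline.
raw = "  division\tTexas\t#4042C7 \n"

# str.strip() trims both ends, so it matches lstrip() followed by rstrip().
same = raw.lstrip().rstrip() == raw.strip()
fields = parse_line(raw)
```

Both forms also trim leading or trailing tabs, so an empty first or last field is dropped either way.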
with open(args.color_schemes) as f:
    for line in f.readlines():
        counter += 1
        array = line.lstrip().rstrip().split("\t")
- array = line.lstrip().rstrip().split("\t")
+ array = line.strip().split("\t")
Closes #101. Final commit is a WIP and needs testing, but putting this up early for 👀
Admin division (inferred):
Admin division (metadata):
Colors will always be internally consistent, but may change build to build as new values show up in the data.
(Aside: #100 introduced a new coloring for `division_metadata` but not a new geographic resolution, so we can colour by known metadata but will always have inferred values plotted on the map. That's quite informative actually, but subtle.)