[cattle-flu] consistent colours #102

Merged on Nov 4, 2024 (5 commits)

jameshadfield
Member

Closes #101. Final commit is a WIP and needs testing, but putting this up early for 👀

Admin division (inferred):
image

Admin division (metadata):
image

Colors will always be internally consistent within a build, but may change from build to build as new values show up in the data.

(Aside: #100 introduced a new coloring for division_metadata but not a new geographic resolution, so we can colour by known metadata but will always have inferred values plotted on the map. That's quite informative actually, but subtle.)

The previous implementation generated a metadata file for each subtype
(i.e. before filtering) and then used that for all downstream rules.
It's more common to (also) generate a post-subsampling metadata file
and use this instead. This is both smaller and useful for debugging
(as all the data in the TSV file are used in the build). It's also
going to make generating colours for specific builds much easier,
which I'll do in subsequent commits.

This shifts the modification of metadata which is specific to the
h5n1-cattle-outbreak/genome/default build to be downstream of the
subsetted-metadata file, which is much simpler to reason about.
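
As a rough illustration of the post-subsampling idea (not the rule used in this repo), the sketch below keeps only the metadata rows whose strain survived filtering. The helper name `subset_metadata`, the `strain` column name, and the in-memory TSV strings are all assumptions made for the example.

```python
import csv
import io


def subset_metadata(metadata_tsv: str, kept_strains: set, id_column: str = "strain") -> str:
    """Return a TSV containing only the rows whose ID is in `kept_strains`.

    Hypothetical helper: the function and the column name are illustrative,
    not taken from the workflow.
    """
    reader = csv.DictReader(io.StringIO(metadata_tsv), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Keep a row only if its strain is part of the subsampled set.
        if row[id_column] in kept_strains:
            writer.writerow(row)
    return out.getvalue()


# Made-up metadata for three strains; two of them survive subsampling.
full = "strain\tdivision\nA/1\tTexas\nA/2\tKansas\nA/3\tOhio\n"
subset = subset_metadata(full, {"A/1", "A/3"})
print(subset)
```

Every row left in the subset is then one that the build actually uses, which is what makes the smaller file handy for debugging and for per-build colour generation.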

We could consider doing something similar for the `add_h5_clade` rule,
however this wouldn't allow us to use that data as a filtering criterion,
and it seems plausible we'd want to do that one day.

Make variables and functions defined in the snakefile available to
custom rule files such as cattle-flu.smk. See
<#100 (comment)>
for more context.

Uses our well-established pattern of a color-ordering file and a set of
hexes to produce a well-ordered color palette. Here we restrict the
values within the generating script to better maximise the final
color range (rather than setting colors for all divisions then having
augur export subset them). This approach also ensures colors are
consistent across "division" and "division_metadata" which aids
interpretation of the data.

Closes #101

It's possible segment-level builds will include a division not found
in the genome build, but in this case Auspice will use a grey colour
swatch which I think is acceptable.
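
A minimal sketch of the ordering-plus-hexes approach described above, under simplified assumptions: an ordered list of divisions, and a list of colour schemes in which scheme `n - 1` holds `n` hexes. The function `assign_colors`, the example divisions, and the hex values are hypothetical, and taking the union of the two colourings' values is an assumed way to get a shared hex per value; this is not the script added in this PR.

```python
def assign_colors(ordering, present, schemes):
    """Assign hexes to the ordered values that are actually present in the data.

    ordering: all known values in their preferred display order
    present:  values observed in this build's metadata
    schemes:  list of hex lists; schemes[n - 1] holds n colours
    """
    # Restrict to observed values *before* choosing a scheme, so the whole
    # colour range is spread over only the values that will be displayed.
    values = [v for v in ordering if v in present]
    if not values:
        return {}
    if len(values) > len(schemes):
        raise ValueError("not enough colours for the number of values")
    hexes = schemes[len(values) - 1]
    return dict(zip(values, hexes))


# Made-up ordering and schemes for illustration.
ordering = ["California", "Texas", "Kansas", "Ohio"]
schemes = [
    ["#4C90C0"],
    ["#4C90C0", "#E67030"],
    ["#4C90C0", "#AEBD50", "#E67030"],
    ["#4042C7", "#4C90C0", "#AEBD50", "#E67030"],
]

# Assumption: colour the union of values seen under both colourings,
# so that each value gets one hex shared by both keys.
division = {"Texas", "Ohio"}
division_metadata = {"Texas", "Kansas"}
colors = assign_colors(ordering, division | division_metadata, schemes)

# Emit the same hex under both keys as tab-separated "key, value, hex" lines.
lines = [
    f"{key}\t{value}\t{hex_}"
    for key in ("division", "division_metadata")
    for value, hex_ in colors.items()
]
print("\n".join(lines))
```

A value that never appears in `present` gets no entry at all, which corresponds to the grey-swatch case mentioned above.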
@jameshadfield
Member Author

CI passed, trial builds passed (ncbi, fauna), results:

assignment = {}
with open(fname) as f:
    for line in f.readlines():
        array = line.lstrip().rstrip().split("\t")
Contributor

Suggested change
-        array = line.lstrip().rstrip().split("\t")
+        array = line.strip().split("\t")
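
For reference, a quick check that the two forms give the same result on a typical tab-separated line (the sample line is made up):

```python
# A sample line with leading spaces and a trailing space + newline.
line = "  division\tTexas\t#4C90C0 \n"

old = line.lstrip().rstrip().split("\t")
new = line.strip().split("\t")

# Both strip whitespace from each end before splitting on tabs.
assert old == new
print(new)
```

Both forms also strip leading and trailing tabs, so the suggestion does not change how empty first or last fields are handled.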

Member Author

Yes - better! However, I'm going to merge without changes to keep it identical to where it was copied from, and thus (maybe?) make "ENH: Add augur colors subcommand based on assign-colors.py scripts" smoother when we get round to it.

with open(args.color_schemes) as f:
    for line in f.readlines():
        counter += 1
        array = line.lstrip().rstrip().split("\t")
Contributor

Suggested change
-        array = line.lstrip().rstrip().split("\t")
+        array = line.strip().split("\t")

jameshadfield merged commit 8df04f1 into master on Nov 4, 2024
15 checks passed
jameshadfield deleted the james/consistent-colours branch on November 4, 2024 18:14
Successfully merging this pull request may close these issues.

Consistent colours across all cattle-flu builds
2 participants