-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding annotate
module in the cp_process_singlecells
workflow
#58
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left a couple of comments to address. Minor documentation.
add_metadata_id_to_platemap: True | ||
format_broad_cmap: False | ||
clean_cellprofiler: True | ||
external_metadata: "none" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a lot of experience with what the norm is for these .yaml
files but is it possible to add hints to what the possibilities are for each param? ex. dest_datatype: parquet
what are the possible input types? I am not sure is this is doable or a norm just a thought
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. We are still figuring out what is the best way to document all the different types of parameters that each workflow module have to offer. Our current approach is to direct the user to the documentation inside the
If you look at the documentation in the workflow configs, the description of each technology is used along with it's corresponding link.
In terms of "what are the available parameters", we are currently discussing in developing a schema #55 which we will define the data type, options, etc of the parameters in the workflow config.
barcodes=BARCODES, | ||
metadata=METADATA_DIR, | ||
output: | ||
ANNOTATED_DATA, | ||
conda: | ||
"../envs/cytominer_env.yaml" | ||
log: | ||
"logs/annotate_{file_name}.log", | ||
"logs/annotate_{basename}.log", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less ambiguous, I like basename better!
@@ -34,6 +35,7 @@ configfile: "./configs/wf_configs/cp_process_singlecells.yaml" | |||
# importing modules | |||
include: "../rules/common.smk" | |||
include: "../rules/cytotable_convert.smk" | |||
include: "../rules/annotate.smk" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at line 30-32: how do multiple config files know which rules they belong to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each module represented by the include
takes in the same workflow config.
Here's an example from: https://github.com/WayScience/CytoSnake/blob/aa852f9f4886057cb96f5aa19e648e2d760adc22/configs/wf_configs/cp_process_singlecells.yaml
cytotable_convert:
params:
dest_datatype: parquet
source_datatype: sqlite
concat: True
join: True
infer_common_schema: True
drop_null: True
preset: cellprofiler_sqlite
log_level: ERROR
annotate_configs:
params:
join_on:
- Metadata_well_position
- Image_Metadata_Well
add_metadata_id_to_platemap: True
format_broad_cmap: False
clean_cellprofiler: True
external_metadata: "none"
external_join_left: "none"
external_join_right: "none"
compression_options:
method: "gzip"
mtime: 1
float_format: null
cmap_args: {}
The corresponding parameters are selected by the name of the module. Above presents the parameters for both:
include: "../rules/cytotable_convert.smk"
include: "../rules/annotate.smk"
Here's an simple image:
Thanks for the discussion points, merging! |
About PR
In this PR, the
cp_process_singlecells
will have the annotation module imported for analysis.