This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Documentation: mapping between WGS/WXS RNA-seq pairs is confusing without documentation! #399

Open
jaclyn-taroni opened this issue Jan 3, 2020 · 2 comments
Labels
documentation Improvements or additions to documentation

Comments

@jaclyn-taroni
Member

Context: #251 (comment)

Multiple analyses, including many (all?) of the molecular subtyping tickets and the OncoPrint plotting, require analysts to identify WGS/WXS and RNA-seq biospecimen IDs that map to the same sample. One can use the sample_id field in pbta-histologies.tsv to do so: rows that share a sample_id value correspond to the same sample. This use of sample_id is not documented anywhere, so multiple people have run into this issue!
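A minimal sketch of the sample_id approach, for illustration only. It assumes the histologies file has a biospecimen ID column named `Kids_First_Biospecimen_ID` and a sequencing-type column named `experimental_strategy` with values such as `WGS`, `WXS`, and `RNA-Seq`; those column names and values, the inline example rows, and the helper `pair_dna_rna` are assumptions and not taken from this thread. Only `sample_id` and `pbta-histologies.tsv` are.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt standing in for pbta-histologies.tsv.
# Column names other than sample_id are assumptions; the IDs are made up.
EXAMPLE_TSV = (
    "Kids_First_Biospecimen_ID\tsample_id\texperimental_strategy\n"
    "BS_DNA1\tS001\tWGS\n"
    "BS_RNA1\tS001\tRNA-Seq\n"
    "BS_DNA2\tS002\tWXS\n"
    "BS_RNA3\tS003\tRNA-Seq\n"
)

# Assumed labels for DNA and RNA assays in experimental_strategy.
DNA_STRATEGIES = {"WGS", "WXS"}
RNA_STRATEGIES = {"RNA-Seq"}


def pair_dna_rna(handle):
    """Group biospecimen IDs by sample_id and keep samples with both DNA and RNA."""
    by_sample = defaultdict(lambda: {"dna": [], "rna": []})
    for row in csv.DictReader(handle, delimiter="\t"):
        strategy = row["experimental_strategy"]
        biospecimen = row["Kids_First_Biospecimen_ID"]
        if strategy in DNA_STRATEGIES:
            by_sample[row["sample_id"]]["dna"].append(biospecimen)
        elif strategy in RNA_STRATEGIES:
            by_sample[row["sample_id"]]["rna"].append(biospecimen)
    # Drop samples that lack either a DNA or an RNA biospecimen.
    return {s: ids for s, ids in by_sample.items() if ids["dna"] and ids["rna"]}


pairs = pair_dna_rna(io.StringIO(EXAMPLE_TSV))
for sample_id, ids in pairs.items():
    print(sample_id, ids["dna"], ids["rna"])
```

To run this against a real file, one would pass an open file handle instead of the `io.StringIO` wrapper. Lists are kept per sample because a sample_id may have more than one biospecimen of a given type.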

@sjspielman any thoughts as to where this documentation should go? Future thought outside the scope of this ticket: perhaps we should "functionalize" the code here and put it in a util directory at the root of the repository...

@jaclyn-taroni added the documentation label on Jan 3, 2020
@sjspielman
Member

One idea is to have an Rmd document in docs/ entitled something like (but definitely not exactly this) "Introduction to working with our data", with a brief demo of a general approach to obtaining the metadata one needs for maybe 1-3 types of analyses. In other words, the document would read "So, you want to work with samples? Here's a brief tutorial for how you can get what you need" and then show a walk-through of wrangling pbta-histologies.tsv for a few applications. Eventually this could be functionalized for consistency across analyses?

@sjspielman
Member

Related, and potentially a new issue: there is a lot of reinventing the wheel with regard to parsing many of the data files, i.e., it's trivial to find multiple analyses with entirely different code for parsing expression data into the same final result. This is in part due to some non-tidy data structures that have been saved, but it would be helpful to refactor code to use common parsing strategies. Again, I can volunteer.
