This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Independent sample analysis (2 of 2) #171

Merged

jashapiro
Member

Purpose/implementation

Generate files of independent samples for downstream analyses where including more than one sample from the same individual would bias the analysis or is otherwise undesirable.

Issue

#155

Directions for reviewers

Do the sample lists look reasonable (truly independent?)
Should this PR include adding canonical versions of these files elsewhere in the project? If so, where? Should they be part of the data download?

Results

The script generates 4 files of independent samples: one with all primary tumors and WGS sequences, as well as files that include WXS samples and/or non-primary tumors.

In summary, the independent sample lists contain:
641 WGS primary specimens
788 WGS specimens (including non-primary)
657 WGS+WXS primary specimens
804 WGS+WXS specimens (including non-primary)
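
As a rough illustration of how four lists like these can be derived, here is a minimal Python sketch. It is not the logic of the R script in this PR: the field names (`specimen`, `participant`, `strategy`, `tumor_type`), the toy records, and the selection rule (prefer WGS over WXS for a participant, then pick at random with a fixed seed) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def pick_independent(specimens, strategies, primary_only, seed=0):
    """Return one specimen ID per participant from the eligible specimens.

    Assumed rule (not taken from the PR): prefer WGS over WXS when a
    participant has both, then choose at random among what remains.
    """
    rng = random.Random(seed)
    eligible = defaultdict(list)
    for s in specimens:
        if s["strategy"] not in strategies:
            continue
        if primary_only and s["tumor_type"] != "primary":
            continue
        eligible[s["participant"]].append(s)

    chosen = []
    for participant in sorted(eligible):
        candidates = eligible[participant]
        wgs = [s for s in candidates if s["strategy"] == "WGS"]
        pool = wgs or candidates  # fall back to WXS only if no WGS exists
        chosen.append(rng.choice(pool)["specimen"])
    return chosen


# Toy data; field names and values are hypothetical.
specimens = [
    {"specimen": "S1", "participant": "P1", "strategy": "WGS", "tumor_type": "primary"},
    {"specimen": "S2", "participant": "P1", "strategy": "WGS", "tumor_type": "relapse"},
    {"specimen": "S3", "participant": "P2", "strategy": "WXS", "tumor_type": "primary"},
    {"specimen": "S4", "participant": "P3", "strategy": "WGS", "tumor_type": "relapse"},
]

lists = {
    "wgs_primary": pick_independent(specimens, {"WGS"}, primary_only=True),
    "wgs_all": pick_independent(specimens, {"WGS"}, primary_only=False),
    "wgs_wxs_primary": pick_independent(specimens, {"WGS", "WXS"}, primary_only=True),
    "wgs_wxs_all": pick_independent(specimens, {"WGS", "WXS"}, primary_only=False),
}
```

Each list holds at most one specimen per participant, so the list lengths count participants rather than specimens. Fixing the seed keeps the random choice reproducible between runs.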

Docker and continuous integration

Check all those that apply or remove this section if it is not applicable.

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

jashapiro and others added 30 commits September 10, 2019 14:47
This workbook is an analysis of the cases where we have multiple samples from the same participant, to explore how best to manage future analysis where we want only a single sample from each tumor
Script complete, and generated files are in directory.
@jashapiro jashapiro changed the title Independent sample analysis (1 of 2) Independent sample analysis (2 of 2) Oct 24, 2019
Member

@jaclyn-taroni jaclyn-taroni left a comment

I had one question to check my understanding. I did pull this locally and kick the tires a bit. Everything was as expected/described as far as the sample sets go.

Should this PR include adding canonical versions of these files elsewhere in the project? If so, where? Should they be part of the data download?

I am a big fan of including this as part of the download. Is this something we could potentially do as part of #146 @jharenza?

@jharenza
Collaborator

I had one question to check my understanding. I did pull this locally and kick the tires a bit. Everything was as expected/described as far as the sample sets go.

Should this PR include adding canonical versions of these files elsewhere in the project? If so, where? Should they be part of the data download?

I am a big fan of including this as part of the download. Is this something we could potentially do as part of #146 @jharenza?

How do you envision this in the download? A separate clinical file or as another column in the clinical file for which samples to use? I am working on adding a lot more info to the clinical file, and we also have two samples which need to be pulled due to non-consent, so I would suggest not as another file, but we could add a column if you give me the final list?

@jashapiro
Member Author

How do you envision this in the download? A separate clinical file or as another column in the clinical file for which samples to use?

I would think having a separate file(s) is probably the easiest. You can run the 01-generate-independent-specimens.R script and include the output.

@jharenza
Collaborator

How do you envision this in the download? A separate clinical file or as another column in the clinical file for which samples to use?

I would think having a separate file(s) is probably the easiest. You can run the 01-generate-independent-specimens.R script and include the output.

OK - I can provide these as lists in the release so people can extract those samples from the clinical file (since so many clinical file edits are upcoming). Once the PR is merged, can you point me to the output files? I will also have to remove two samples newly annotated as non-consent. I haven't yet set up Docker for this; doing so will require some tutorial-ing on my end, and I have a pressing deadline on Nov 15, so I won't be able to do this until after then.
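
For readers who want to see what extracting those samples from the clinical file could look like, here is a minimal Python sketch. The column names, the inline TSV, and the specimen IDs are hypothetical, and the project's own tooling is in R; the sketch only shows the idea of keeping rows whose ID is on an independent-specimen list while dropping IDs flagged as non-consent.

```python
import csv
import io


def filter_clinical(clinical_rows, independent_ids, excluded_ids=()):
    """Keep clinical rows on the independent list, minus excluded IDs."""
    keep = set(independent_ids) - set(excluded_ids)
    return [row for row in clinical_rows if row["specimen_id"] in keep]


# Hypothetical clinical table; a real file would be read from disk.
clinical_tsv = (
    "specimen_id\tparticipant_id\ttumor_type\n"
    "S1\tP1\tprimary\n"
    "S2\tP1\trelapse\n"
    "S3\tP2\tprimary\n"
    "S4\tP3\trelapse\n"
)
rows = list(csv.DictReader(io.StringIO(clinical_tsv), delimiter="\t"))

independent = ["S1", "S3", "S4"]  # e.g. one of the generated lists
non_consent = ["S4"]              # IDs that must be withdrawn

subset = filter_clinical(rows, independent, non_consent)
subset_ids = [row["specimen_id"] for row in subset]
```

Keeping the lists separate from the clinical file means the clinical file can keep changing without the lists needing to be regenerated, as long as the specimen IDs stay stable.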

@jaclyn-taroni jaclyn-taroni merged commit 1c465be into AlexsLemonade:master Oct 25, 2019
@jashapiro jashapiro deleted the jashapiro/independent-samples2 branch October 25, 2019 15:22
@jharenza jharenza mentioned this pull request Nov 8, 2019