This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

v18 CI files #871

Closed
jaclyn-taroni opened this issue Dec 13, 2020 · 0 comments · Fixed by #888
Labels
ci Related to continuous integration

Comments

@jaclyn-taroni
Member

We need to update the CI files to reflect the v18 release. The v18 release makes the following changes (quoting from #857):

  • changes:
    • Add RNA independent specimen lists from #795 and #797
    • Update DNA independent specimen lists per comment here
    • Add LGAT fusion summary file from #808 and #830
    • Update fusion files per comment here
    • Add kinase domain and reciprocal information to pbta-fusion-putative-oncogenic.tsv per #812 and #816 and #821
    • Add 8 rRNA-depleted, total stranded RNA-Seq for tumors which previously had polyA sequencing performed: BS_FXJY0MNH, BS_KABQQA0T, BS_D7XRFE0R, BS_SHJA4MR0, BS_HE0WJRW6, BS_8QB4S4VA, BS_FN07P04C, BS_SB12W1XT per #749
    • Update pbta-histologies.tsv:
      • Add new RNA samples above
      • Pull latest clinical data
      • Add extent of tumor resection per comment
      • Rerun molecular subtyping modules to date

Known changes required to create-subset-files

  • We need to copy the pbta-histologies-base.tsv file in the shell script, as we already do for pbta-histologies.tsv:

cp $FULL_DIRECTORY/pbta-histologies.tsv $SUBSET_DIRECTORY

  • Update the default release, currently v17, to release-v18-20201123:

RELEASE=${RELEASE:-release-v17-20200908}
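
A minimal sketch of the two changes above, assuming the v18 release is named release-v18-20201123 (taken from the data/testing path in the next steps). The `copy_histologies` function is a hypothetical helper for illustration, not something the existing script defines; the actual change may be as small as one extra `cp` line.

```shell
# Assumed v18 default; the release name comes from the data/testing path in this issue.
RELEASE=${RELEASE:-release-v18-20201123}

# Hypothetical helper: copy both histologies files into the subset directory.
# Arguments: full release directory, subset directory.
copy_histologies() {
  local full_dir=$1 subset_dir=$2 file
  mkdir -p "$subset_dir" || return 1
  for file in pbta-histologies.tsv pbta-histologies-base.tsv; do
    # Fail loudly if the release is missing a file we expect to copy.
    if [ ! -f "$full_dir/$file" ]; then
      echo "Missing expected file: $full_dir/$file" >&2
      return 1
    fi
    cp "$full_dir/$file" "$subset_dir" || return 1
  done
}
```

Inside the script this would be called as `copy_histologies "$FULL_DIRECTORY" "$SUBSET_DIRECTORY"`. Checking for each file first means a renamed or missing histologies file stops the run instead of silently producing an incomplete subset.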

Potential "gotchas"

I think some of the v18 file additions, which usually break things, should be covered by the existing code. I'm including this section just in case there's anything unexpected.

  • Add RNA independent specimen lists from #795 and #797

These files are named independent-specimens.rnaseq.primary-plus-polya.tsv and independent-specimens.rnaseq.primary-plus-stranded.tsv according to #857. That should be covered by:

cp $FULL_DIRECTORY/independent-specimens*.tsv $SUBSET_DIRECTORY

  • Add LGAT fusion summary file from #808 and #830

This file is called fusion_summary_lgat_foi.tsv according to #857. That should be covered by:

cp $FULL_DIRECTORY/fusion_summary* $SUBSET_DIRECTORY
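
One way to double-check that the existing glob patterns cover the new v18 file names before running the full script is sketched below. `matches_glob` is a hypothetical helper; the file names and patterns are the ones quoted above.

```shell
# Hypothetical helper: return 0 if a file name matches a glob pattern, 1 otherwise.
matches_glob() {
  local name=$1 pattern=$2
  # An unquoted variable in a case arm is interpreted as a glob pattern.
  case "$name" in
    $pattern) return 0 ;;
    *) return 1 ;;
  esac
}

# name:pattern pairs taken from the v18 file names and the existing cp commands.
for pair in \
  "independent-specimens.rnaseq.primary-plus-polya.tsv:independent-specimens*.tsv" \
  "independent-specimens.rnaseq.primary-plus-stranded.tsv:independent-specimens*.tsv" \
  "fusion_summary_lgat_foi.tsv:fusion_summary*"
do
  name=${pair%%:*}
  pattern=${pair#*:}
  if matches_glob "$name" "$pattern"; then
    echo "covered: $name"
  else
    echo "NOT covered: $name"
  fi
done
```

This only checks the names against the patterns; it does not confirm that the files are actually present in the v18 release directory.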

Next steps

These steps are adapted from #670

  1. Start AWS instance with 128 GB RAM
  2. Clone OpenPBTA-analysis
  3. Create a new branch off of jharenza:v18-files called v18-ci
  4. Update analyses/create-subset-files/create_subset_files.sh as outlined above ☝️
  5. Run bash analyses/create-subset-files/create_subset_files.sh
  6. Commit the biospecimen RDS file to the v18-ci branch.
  7. Zip up the subset files in data/testing/release-v18-20201123/ to testing_v18.zip
  8. Download testing_v18.zip
  9. File a pull request from v18-ci
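
Steps 2 through 7 could be sketched roughly as follows. Several things here are assumptions rather than anything stated in this issue: the two repository URLs, the `jharenza` remote name, and the layout of the zip archive. The path of the biospecimen RDS file is not given here, so the commit step is left as a comment. The `run` wrapper only prints commands unless `DRY_RUN=0` is set.

```shell
# Print each command; execute it only when DRY_RUN=0 (dry run is the default).
DRY_RUN=${DRY_RUN:-1}
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

ci_steps() {
  # Step 2: clone the repository (URL is an assumption).
  run git clone https://github.com/AlexsLemonade/OpenPBTA-analysis.git || return 1
  run cd OpenPBTA-analysis || return 1

  # Step 3: branch off jharenza:v18-files (fork URL and remote name are assumptions).
  run git remote add jharenza https://github.com/jharenza/OpenPBTA-analysis.git || return 1
  run git fetch jharenza || return 1
  run git checkout -b v18-ci jharenza/v18-files || return 1

  # Steps 4 and 5: after editing the script as outlined above, create the subset files.
  run bash analyses/create-subset-files/create_subset_files.sh || return 1

  # Step 6: commit the biospecimen RDS file here (its path is not given in this issue).

  # Step 7: zip the subset files (archive layout is an assumption).
  run zip -r testing_v18.zip data/testing/release-v18-20201123 || return 1
}

ci_steps
```

With the default dry run, this prints the planned commands without touching the network or the file system, which makes it easy to review the sequence on the AWS instance before running it with `DRY_RUN=0`.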