Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

change to kf-openaccess-us-east-1-prd-pbta/data #1121

Conversation

kgaonkar6
Copy link
Collaborator

@kgaonkar6 kgaonkar6 commented Jul 29, 2021

Purpose/implementation Section

What scientific question is your analysis addressing?

v20 CI testing files check

What was your approach?

Changed the s3 bucket path to https://s3.amazonaws.com/kf-openaccess-us-east-1-prd-pbta/data and copied testing_v20.zip files to testing folder

What GitHub issue does your pull request address?

#1048

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Add Shatterseek errored out here e6028f7

[1] "Running: BS_20TBZG09 (4 of 17)"
Error in if ((seg1$chrom == seg2$chrom) & (seg1$copy.num == seg2$copy.num)) { : 
  missing value where TRUE/FALSE needed
Calls: mergeCNsegments -> compare_adjacent_segs
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

so I commented it out to run CI.

Will try to fix below

Copy link
Member

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kgaonkar6
Copy link
Collaborator Author

[1] "Running: BS_20TBZG09 (4 of 17)"
Error in if ((seg1$chrom == seg2$chrom) & (seg1$copy.num == seg2$copy.num)) { : 
  missing value where TRUE/FALSE needed
Calls: mergeCNsegments -> compare_adjacent_segs
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

It seems like the code errors because the test cnv consensus data only has 1 row for BS_20TBZG09 which I tried to fix it with a273dcc

cnv_consensus %>% filter(ID %in% "BS_20TBZG09",!is.na(cnv_consensus$copy.num)) 
# A tibble: 1 x 7
  ID          chrom loc.start   loc.end num.mark seg.mean copy.num
  <chr>       <chr>     <dbl>     <dbl> <lgl>       <dbl>    <dbl>
1 BS_20TBZG09 chr12 132132000 132470393 NA          0.300        3

@jaclyn-taroni
Copy link
Member

The tests currently pass and there are no changes, outside of those to the download data step, to the CircleCI config file?

@kgaonkar6
Copy link
Collaborator Author

In addition to download data and circleCI config file, an update to 02-run-shatterseek-and-classify-confidence.R here a273dcc was needed to pass CI checks

@jaclyn-taroni
Copy link
Member

jaclyn-taroni commented Jul 30, 2021

Yes, I saw that. I guess my point is that you didn't just try to solve the problem, you successfully solved the problem and that doesn't preclude merging this!

@kgaonkar6
Copy link
Collaborator Author

kgaonkar6 commented Jul 30, 2021

Sorry for the confusion, I only added the 02-run-shatterseek-and-classify-confidence.R code update as a code review update for the CI to pass.

While rerunning the chromothripsis module I am erroring out in another place which I am still trying to figure out :

Evaluating the statistical criteria
Successfully finished!
[1] "Running: BS_JJNSP29S (102 of 777)"
Running..


Evaluating the statistical criteria
Error in seq.default(max(idxa), min(idxb), 1) : 
  'to' must be a finite number
Calls: shatterseek ... fmax2 -> nrow -> [ -> [.data.frame -> seq -> seq.default
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

@kgaonkar6
Copy link
Collaborator Author

kgaonkar6 commented Jul 30, 2021

Assistance from Chromothripsis Analysis authors @LauraEgolf @yangyangclover will be helpful, I couldn't figured out why the sample BS_JJNSP29S errors out of shatterseek in v20.

@jaclyn-taroni
Copy link
Member

@kgaonkar6 is this a problem with this pull request, the CI files, or a separate issue that comes up when running the entire pipeline (e.g., not using tested files)? I am confused.

@kgaonkar6
Copy link
Collaborator Author

The 02-run-shatterseek-and-classify-confidence.R with the update a273dcc works on the testing data (since I figured out the issue was that code could not handle cnv_current dataframes that had only 1 row) which then passes the CI.

Now I am trying to run the chromothripsis module on v20 data release to update the module before we can merge to master and I get the issue below:

Evaluating the statistical criteria
Successfully finished!
[1] "Running: BS_JJNSP29S (102 of 777)"
Running..


Evaluating the statistical criteria
Error in seq.default(max(idxa), min(idxb), 1) : 
  'to' must be a finite number
Calls: shatterseek ... fmax2 -> nrow -> [ -> [.data.frame -> seq -> seq.default
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

@jaclyn-taroni
Copy link
Member

Does the output of that script go into the release or is it required for subtyping? If not, then it doesn’t preclude release in my opinion and we should file an issue that is separate from the release process.

@kgaonkar6
Copy link
Collaborator Author

Ok sounds good, thanks for clarifying!

The outputs from chromothripsis are not released and not used in subtyping. I'll write up a different issue.

@jaclyn-taroni jaclyn-taroni merged commit 941d006 into AlexsLemonade:master Jul 30, 2021
@jaclyn-taroni jaclyn-taroni mentioned this pull request Jul 30, 2021
5 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants