Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FlowFeatureMapper: added surrounding-median-quality-size feature #8222

Merged

Conversation

dror27
Copy link
Contributor

@dror27 dror27 commented Feb 27, 2023

New feature: calculates the median quality over the N bases following or preceding the reported substitution.

For example, in the case where you have the following sequence and quality:

ATGACTGCA
II5II++5+

Where C is a substitution relative to the reference and the other bases match the reference.

The median of the qualities of the 4 bases after the substitution, ++5+, would be 20 (I = BQ40, 5 = BQ30, + = BQ20), and of the 4 bases before, II5I, 40.

@meganshand meganshand self-assigned this Mar 6, 2023
Copy link
Contributor

@meganshand meganshand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments

@@ -1,8 +1,11 @@
package org.broadinstitute.hellbender.tools.walkers.featuremapping;

import htsjdk.samtools.CigarElement;
import org.apache.commons.lang.ArrayUtils;
import org.apache.commons.math3.stat.descriptive.DescriptiveStatistics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can remove import

import org.apache.commons.text.similarity.LevenshteinDistance;
import org.broadinstitute.hellbender.engine.ReferenceContext;
import org.broadinstitute.hellbender.utils.MathUtils;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this import as well can be removed.

from = Math.max(0, Math.min(quals.length, from));
to = Math.max(0, Math.min(quals.length, to - 1));
if ( from > to ) {
return 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an error rather than returning 0?

// calc median
byte[] range = Arrays.copyOfRange(quals, from, to + 1);
if ( range.length == 0 ) {
return 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question about an error message here? 0 seems like a different value than missing.

@meganshand meganshand removed their assignment Mar 6, 2023
@meganshand meganshand merged commit 4ba4ab5 into broadinstitute:master Mar 7, 2023
@dror27 dror27 deleted the ultima.feature_mapper.dev branch March 7, 2023 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants