It's unclear to me when to use the --no_force_align option to ProGraph. The README describes this as "do not force alignment of initial Methionine".
What's the scientific motivation for skipping initial M by default?
I ask because of a potential bug in the interaction with the --repeat option, which matches the sequences to a T-REKS output alignment. These files reference sequence positions, so they cause an off-by-one error if the M was stripped.
I can think of several possible solutions:
1. Default to --no_force_align when the --repeat option is also specified.
2. For each sequence, store a flag indicating whether it has been truncated. If so, account for that when reading in the repeats file.
3. Be more permissive when verifying the FASTA/T-REKS alignment. Automatically recover from off-by-one errors in the coordinates. (This would have the side benefit of supporting malformed T-REKS files that used 0-based indexes rather than the correct 1-based positions.)
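To make the last option concrete, here is a minimal Python sketch of a permissive coordinate check. It is not ProGraph code: the function name `resolve_repeat_start`, its arguments, and the assumption that each repeat record gives a start position plus the ungapped repeat unit are all hypothetical, chosen only to illustrate the idea of trying the reported position first and then the two off-by-one neighbours.

```python
def resolve_repeat_start(stored_seq, start, unit):
    """Return the 0-based index in stored_seq where the repeat unit begins.

    stored_seq: the sequence as held in memory (its initial M may have
                been stripped).
    start:      start position as written in the repeats file, nominally
                1-based against the full sequence.
    unit:       ungapped residues of the first repeat unit (hypothetical
                input; assumes the file lets us recover them).
    """
    # Shift 0: coordinates are correct and the sequence is intact.
    # Shift -1: the initial M was stripped, so everything moved left by one.
    # Shift +1: the file used 0-based indexes against the full sequence.
    for shift in (0, -1, 1):
        idx = start - 1 + shift
        if idx >= 0 and stored_seq.startswith(unit, idx):
            return idx
    raise ValueError(
        f"repeat unit {unit!r} not found within one residue of position {start}"
    )


full = "MKTAYAKTAY"       # toy sequence with an initial M
stripped = full[1:]       # same sequence after the M is removed

print(resolve_repeat_start(full, 2, "KTAY"))      # intact sequence
print(resolve_repeat_start(stripped, 2, "KTAY"))  # M stripped
print(resolve_repeat_start(full, 1, "KTAY"))      # 0-based file
```

One caveat with this approach: for low-complexity repeats (for example a run of a single residue) more than one shift can match, so the first hit is not guaranteed to be the intended one. Storing a per-sequence truncation flag, as in the second option, avoids that ambiguity.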
For now, let's go for a quick fix: Default to --no_force_align when the --repeat option is also specified
Later when we have more time, it would be good to fix it properly (the last proposed solution).
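As an illustration of the quick fix only, the sketch below shows the option logic in Python with `argparse`. ProGraph's real option parsing is not shown in this thread, so everything other than the two flag names is an assumption. Because --no_force_align is a plain on/off flag with no negated form, "defaulting" to it when --repeat is given amounts to switching it on.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical subset of the command line options."""
    parser = argparse.ArgumentParser(description="option-handling sketch")
    parser.add_argument(
        "--no_force_align",
        action="store_true",
        help="do not force alignment of initial Methionine",
    )
    parser.add_argument(
        "--repeat",
        metavar="FILE",
        default=None,
        help="repeats file whose coordinates refer to the full sequence",
    )
    args = parser.parse_args(argv)

    # Quick fix: repeat coordinates assume the untouched sequence, so
    # turn off the special handling of the initial M whenever a repeats
    # file is supplied.
    if args.repeat is not None:
        args.no_force_align = True
    return args


print(parse_args(["--repeat", "repeats.txt"]).no_force_align)
print(parse_args([]).no_force_align)
```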