Further improvements to the timestamp heuristics.
Prompted by: #1378 (reply in thread)
In that example, the first two words of a segment were significantly stretched. There is an existing heuristic to handle this, but it only applied to the first two words of a "window". Therefore, this PR moves the timestamp heuristics out of `find_alignment` (where the segment boundaries are unavailable) and up one level into `add_word_timestamps`, where the segment boundaries are known. Since some of the heuristics were already there, this effectively puts all of the timestamp heuristics in the same place.
The same heuristic was also made more robust by applying it only if it looks like there was a pause between this segment and the previous one. If the two segments are tightly packed together, the timestamps are left as is.
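To make the pause check concrete, here is a minimal sketch of the idea, not the code in this PR. The function name, the dict layout of `words`, the `4 * median_duration` pause threshold, and the way the boundary is chosen are all assumptions for illustration.

```python
def clamp_first_words_after_pause(words, last_speech_end, median_duration, max_duration):
    """Shorten a stretched first/second word, but only after an apparent pause.

    `words` is a list of dicts with "start" and "end" keys (assumed seconds).
    All names and thresholds are hypothetical.
    """
    if not words:
        return words
    first = words[0]

    # Assumed pause test: the first word ends well after the previous speech ended.
    follows_pause = first["end"] - last_speech_end > 4 * median_duration
    if not follows_pause:
        return words  # tightly packed segments are left as is

    first_too_long = first["end"] - first["start"] > max_duration
    pair_too_long = (
        len(words) > 1 and words[1]["end"] - first["start"] > 2 * max_duration
    )
    if first_too_long or pair_too_long:
        if len(words) > 1 and words[1]["end"] - words[1]["start"] > max_duration:
            # The second word is stretched too: move the shared boundary later.
            boundary = max(first["start"], words[1]["end"] - max_duration)
            first["end"] = words[1]["start"] = boundary
        # Pull the first word's start forward so it is at most max_duration long.
        first["start"] = max(0.0, first["end"] - max_duration)
    return words
```

With a first word spanning 0.0 to 5.0 and the previous speech ending at 0.0, the start is pulled forward to 4.5 (for `max_duration=0.5`); if the previous speech ended at 4.9, nothing changes.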
I also continued work on one of the heuristics from #1114. In particular, an existing heuristic picks the segment end timestamp over the word end timestamp if the word timestamp looks far from where it should be. This PR adds a mirror heuristic for choosing between the segment's "start" timestamp and the word's.
I would note that there is a magic number of 0.5 in the code:
The 0.5 comes from the fact that segment timestamps can sometimes be unreliable and snap to the nearest integer; half of the time they snap one way, and half of the time the other. If a segment start timestamp snaps to the right, it could overshoot and cause the subtitle to come in late if we let this heuristic get too close. The 0.5 creates a buffer to prevent that. In the end, we don't know whether we can trust the segment timestamp, but outside of that unsafe 0.5 zone we can at least consider the segment start timestamp to be more reliable than the word timestamp when the word timestamp is too far in the past, which indicates probable poor alignment (a known failure mode). Similar logic applies to the other heuristic for the segment end timestamp.
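As a rough illustration of the reasoning above, the start/end choice could look something like the sketch below. This is not the snippet from the PR: the function name, the `buffer` parameter, the dict layout, and the clamping by `median_duration` are assumptions, and times are assumed to be in seconds.

```python
def reconcile_segment_bounds(segment, words, median_duration, buffer=0.5):
    """Choose between segment-level and word-level start/end timestamps.

    `segment` and each entry of `words` are dicts with "start" and "end".
    Hypothetical sketch; mutates its arguments in place.
    """
    if not words:
        return
    first, last = words[0], words[-1]

    # Start: prefer the segment start only when the word start lies more than
    # `buffer` before it, i.e. outside the zone where snapping could mislead us.
    if segment["start"] < first["end"] and segment["start"] - buffer > first["start"]:
        first["start"] = max(
            0.0, min(first["end"] - median_duration, segment["start"])
        )
    else:
        segment["start"] = first["start"]

    # End: the mirror case, preferring the segment end when the last word
    # runs more than `buffer` past it.
    if segment["end"] > last["start"] and segment["end"] + buffer < last["end"]:
        last["end"] = max(last["start"] + median_duration, segment["end"])
    else:
        segment["end"] = last["end"]
```

For a segment starting at 10.0 whose first word is aligned at 8.0, the word start moves to 10.0; if the word were aligned at 9.8 (inside the 0.5 buffer), the word timestamp would win and the segment start would become 9.8.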