Another thought for sub synchronization #42

Open
Sec-ant opened this issue Aug 23, 2019 · 0 comments
Sec-ant commented Aug 23, 2019

First, I want to say: great job, and a big thanks!

I've been considering writing a subsync-like tool for a long time and wrote some prototype code to validate the idea. This repo amazed me, and I'm thrilled by the FFT algorithm for alignment, but I do want to share my initial thinking from when I implemented my code:

Most subtitles are out of sync because the frame rate is wrong (e.g. a 25 fps subtitle for a 24 fps movie), because there is some offset at the beginning, or both, so most of them can be synced by applying a linear transformation to the timestamps. The problem is therefore much like a linear regression problem, and the vital point is finding corresponding points between the subtitles and the audio or a reference sub.

So, similarly, I transform subs into long vectors, with 1 for sub on and 0 for sub off. Inspired by feature detection in computer vision, I chose the SIFT (scale-invariant feature transform) algorithm and modified it so it can be applied in a lower dimension (computer vision is 2D and this is 1D). SIFT-1D returns a set of interest points (timestamps and their feature vectors) for each sub. After that I use the common methods to compare the distances between the two sets of feature vectors, match them as pairs, and then use RANSAC or other linear regression algorithms to calculate the linear transformation coefficients (scale and offset).

The entire process takes several seconds at a resolution of 0.1 s. In most cases it works fine, but in some cases you have to adjust the parameters for SIFT-1D or RANSAC or the result can turn really ugly, and the result is often unstable (there is some randomness in RANSAC). The speed is also not optimized. I'm not sure whether the problems lie in the overall idea or in my code.
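To make the first and last steps of that pipeline concrete, here is a minimal sketch in plain Python. It is not my prototype and not this repo's code: the function names (`rasterize`, `fit_line`, `ransac_linear`), the tolerance, and the iteration count are all assumptions chosen for illustration. The SIFT-1D detection and descriptor matching are skipped entirely, and the matched timestamp pairs are synthetic.

```python
import random

def rasterize(intervals, resolution=0.1):
    """Turn (start, end) subtitle intervals in seconds into a 0/1 vector."""
    if not intervals:
        return []
    length = int(round(max(end for _, end in intervals) / resolution)) + 1
    vec = [0] * length
    for start, end in intervals:
        lo = int(round(start / resolution))
        hi = int(round(end / resolution))
        for i in range(lo, hi):
            vec[i] = 1  # subtitle is on during this time bin
    return vec

def fit_line(pairs):
    """Ordinary least-squares fit of y = scale * x + offset."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    if sxx == 0:
        return None  # all x identical, the scale is undefined
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    scale = sxy / sxx
    return scale, my - scale * mx

def ransac_linear(pairs, iterations=200, tolerance=0.3, seed=0):
    """RANSAC over matched (sub_time, ref_time) pairs.

    A fixed seed (an assumption of this sketch) makes the run repeatable,
    which sidesteps the run-to-run instability mentioned above.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(pairs, 2)
        if x1 == x2:
            continue
        scale = (y2 - y1) / (x2 - x1)
        offset = y1 - scale * x1
        inliers = [(x, y) for x, y in pairs
                   if abs(scale * x + offset - y) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        return None
    # Refine the estimate on the largest consensus set.
    return fit_line(best_inliers)

# Synthetic matches: 25 fps subtitle on a 24 fps movie plus a 2 s offset,
# with three deliberately wrong matches mixed in.
pairs = [(float(t), 25 / 24 * t + 2.0) for t in range(10, 3000, 100)]
pairs += [(500.0, 40.0), (1200.0, 2900.0), (2500.0, 100.0)]
scale, offset = ransac_linear(pairs)
grid = rasterize([(0.0, 0.3), (0.5, 0.6)])
```

With real data the pairs would come from the feature matcher, and `tolerance` would probably need to be tied to the 0.1 s resolution rather than hard-coded.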

When I came across your repo, I noticed that it supports only an offset, not scaling. I was hoping SIFT-1D might be a solution when properly reimplemented. However, I agree with what @kaegi mentioned in #10:

If you lower the split-penalty it can even correct the framerate difference because it automatically finds that splitting the movies in 3-4 (almost) equal parts with slightly different offsets optimizes the alignment rating.

So scaling support may not be that necessary.
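A rough way to see why a few splits can stand in for a scale factor: if the true mapping is `t -> scale * t + offset`, each equal-length segment can be given one constant offset, and the leftover error inside a segment is bounded by half the drift accumulated across it. The sketch below is only that arithmetic, not the repo's split-penalty algorithm. The function name `segment_offsets`, the 2-hour duration, and the scale of 1.001 (roughly 24 vs 23.976 fps) are assumptions picked for illustration.

```python
def segment_offsets(scale, offset, duration, parts):
    """Constant offset per equal-length segment approximating a linear drift.

    Returns the per-segment offsets (taken at each segment's midpoint) and
    the worst remaining timing error inside any segment, in seconds.
    """
    seg = duration / parts
    offsets = []
    for k in range(parts):
        mid = (k + 0.5) * seg
        # The shift the linear mapping applies at the segment midpoint.
        offsets.append((scale - 1.0) * mid + offset)
    worst = abs(scale - 1.0) * seg / 2.0
    return offsets, worst

# Hypothetical case: 2-hour movie, subtitles drifting by 0.1 %.
offsets, worst = segment_offsets(scale=1.001, offset=0.0,
                                 duration=7200.0, parts=4)
for k, off in enumerate(offsets):
    print(f"segment {k}: shift by {off:.2f} s")
print(f"worst remaining error: {worst:.2f} s")
```

The remaining error grows with both the drift rate and the segment length, so a larger frame rate mismatch would need more splits to reach the same accuracy.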

Well, any comment is welcome ^~^
