[WIP] Snapgrid #2917

Draft
wants to merge 25 commits into
base: main

daschuer
Member

@daschuer daschuer commented Jul 5, 2020

This is a proof of concept for a new beat detection strategy.

This PR is just an experimental PR on top of #2877

Currently we "move" the detected beats to positions that are considered musically reasonable.
Because of this, a loud onset that has nothing to do with the rhythm can trick the detection algorithm, because it is treated as a beat.

Here we collect all possible onsets and, as an experiment, show them in the beat grid.
The user can now loop between these onsets to find out which of them are actual beats.

Instead of shifting loud onsets it will snap to another possible onset.
This avoids noticeable double-beat issues when syncing, because we are not off-beat, but just snapped to a less loud beat.

We still need to allow shifting in a small region, because of quantization noise and jitter caused by a live drummer or by different characteristics of the drums. But this can now be limited to a small, unnoticeable region of ±25 ms.
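
One possible reading of the snapping rule above, as a minimal sketch: a beat estimate is moved to the nearest detected onset only if that onset lies within the ±25 ms tolerance. The function name `snapToOnset`, its parameters, and the use of seconds as the unit are assumptions for illustration and are not taken from the PR.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper, not taken from the PR: snaps one beat position
// (in seconds) to the nearest onset, but only if that onset lies within
// maxShiftSec. Otherwise the beat is returned unchanged.
// `onsets` must be sorted in ascending order.
double snapToOnset(double beatSec,
        const std::vector<double>& onsets,
        double maxShiftSec = 0.025) {
    if (onsets.empty()) {
        return beatSec;
    }
    // First onset that is not before the beat.
    auto next = std::lower_bound(onsets.begin(), onsets.end(), beatSec);
    double best = (next != onsets.end()) ? *next : onsets.back();
    if (next != onsets.begin()) {
        // The onset just before the beat may be closer.
        double prev = *(next - 1);
        if (std::abs(prev - beatSec) < std::abs(best - beatSec)) {
            best = prev;
        }
    }
    return (std::abs(best - beatSec) <= maxShiftSec) ? best : beatSec;
}
```

With onsets at 0.50 s, 1.00 s and 1.52 s, a beat estimate at 1.01 s would snap to 1.00 s, while an estimate at 1.20 s has no onset within 25 ms and stays where it is.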

This algorithm is a good foundation for omitted, skipped or added beat detection.

I have tested this with
"Kahn - Helter Skelter"
https://www.youtube.com/watch?v=-byQwJ2Uoh0

At 0:31.80 the rhythm is shifted. The original qm algorithm has a hard time detecting that. It tries to iron out this region, which is musically wrong. Instead the beat grid should be snapped to the new rhythm.

Known Issues:
In a "Boom / Tshack" rhythm the spectral difference is a lot more sensitive to Tshack than to Boom. I have used thresshold_smoothing to adjust the sensitivity. To detect all Booms I had to increase the sensitivity to 0.05. This creates some false positives in the first region of the track, where a value of 0.02 works perfectly.

In other tracks with a deeper Boom, we need more sensitivity to detect it. Detecting the Booms is very important, because they are likely the downbeats.
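
The trade-off described in the known issues can be illustrated with a generic peak picker over a detection function. This is only a sketch under stated assumptions: `pickOnsets`, `halfWindow` and `offset` are hypothetical names, the threshold is a plain moving mean, and how the real thresshold_smoothing value maps onto it is not shown in this thread. Here a smaller `offset` means more detections.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical peak picker: returns the indices i where df[i] is a local
// maximum and exceeds the mean of its neighbourhood by more than `offset`.
std::vector<std::size_t> pickOnsets(const std::vector<double>& df,
        std::size_t halfWindow,
        double offset) {
    std::vector<std::size_t> onsets;
    for (std::size_t i = 1; i + 1 < df.size(); ++i) {
        // Moving threshold: mean of the detection function around i.
        std::size_t begin = (i > halfWindow) ? i - halfWindow : 0;
        std::size_t end = std::min(df.size(), i + halfWindow + 1);
        double sum = 0.0;
        for (std::size_t j = begin; j < end; ++j) {
            sum += df[j];
        }
        double threshold = sum / static_cast<double>(end - begin) + offset;
        bool isPeak = df[i] > df[i - 1] && df[i] >= df[i + 1];
        if (isPeak && df[i] > threshold) {
            onsets.push_back(i);
        }
    }
    return onsets;
}
```

A weak peak between two strong ones is only reported when the offset is small enough, which is the same tension as above: the setting that catches every quiet Boom also lets more false positives through.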

I think we should use this snapGrid when manually editing the beat grid. I am not sure if we should actually show it, because you can already predict the snap positions from the waveform.

@hacksdump @crisclacerda

What do you think?

@daschuer
Member Author

daschuer commented Jul 5, 2020

[image]

@Be-ing
Contributor

Be-ing commented Jul 5, 2020

This is an interesting idea. On the community call the other day we briefly discussed an idea to store the analysis results separately from the user adjusted beats so the analysis results could be used to help the infer right/left features. Repsys seems to do something like this from the video demo, but I have yet to look at the repsys code in detail. This "snapgrid" may serve that purpose if we store it as an immutable part of the protobuf.

As for whether it would be helpful to show this on the waveform sometimes, we can defer that decision until @hacksdump's project is further along.

@daschuer
Member Author

daschuer commented Jul 6, 2020

Blue + first peak : Boom
Red + second peak: Tshack

[image]

[image]

This happens because Tshack is not that high but it has many more bins in the spectral difference.
The bin size is ~50 Hz
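
Why a broad-band hit can outscore a louder narrow-band one is easiest to see with a simplified detection function. The sketch below uses a magnitude-only spectral difference (the sum of per-bin magnitude increases between two frames), which is a stand-in and not the ComplexSD function the analyzer uses; the function name and the frame layout are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified, magnitude-only spectral difference between two frames:
// the sum over all bins of the increase in magnitude. Every bin that
// gains energy contributes, no matter how small the gain is.
double spectralDifference(const std::vector<double>& prevMag,
        const std::vector<double>& currMag) {
    double sum = 0.0;
    std::size_t bins = std::min(prevMag.size(), currMag.size());
    for (std::size_t k = 0; k < bins; ++k) {
        sum += std::max(0.0, currMag[k] - prevMag[k]);
    }
    return sum;
}
```

With bins of roughly 50 Hz, a Boom that raises two low bins by 1.0 each scores 2.0, while a Tshack that raises 160 bins by only 0.1 each scores about 16, even though its peak magnitude is ten times lower.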

@daschuer daschuer mentioned this pull request Jul 6, 2020
7 tasks
@Be-ing
Contributor

Be-ing commented Jul 12, 2020

@satchelspencer we got this idea from Repsys. We're working on building a new beatgrid format and analyzer in Mixxx capable of working with variable tempos. We'd be curious to get your input on this work.

@Be-ing
Contributor

Be-ing commented Jul 12, 2020

I gave this a test with some jazz tracks with variable tempos and rhythms. I think this is a really promising idea. I think it could be useful for more than adjusting the beatgrid. It may be useful in some cases as an alternative grid to quantize to, particularly for setting off beat cue points and adjusting manual loops.
[image]

@Be-ing
Contributor

Be-ing commented Jul 12, 2020

I propose we call this the "sound grid" to contrast with the beat grid which marks musical time.

@hacksdump give this a try. I think we should show this unobtrusively on the scrolling waveforms, for example small lines at the top & bottom of the waveforms, when it is relevant. This could be when moving the downbeats, cue points, and loop boundaries. We may consider turning quantize into a 3 state switch: off, beat quantize, sound quantize. Or maybe we could have beatloops and sync quantize to the beatgrid whereas cues and manual loops quantize to the sound grid.

@daschuer
Member Author

I think this is only useful for editing the beatgrid.
Quantize during performance should snap to the beatgrid and probably to integer fractions of it.

The issue is that the snap grid is the raw onset detection output with a moving threshold. It is not ironed out. Different instruments have differently shaped onsets, and we have the quantization jitter.

An alternative name would be "Onsets". Or "Raw Onsets" or such.

@Be-ing
Contributor

Be-ing commented Jul 12, 2020

"Onsets" sounds good to me.

@Swiftb0y
Member

Swiftb0y commented Jul 12, 2020

What about "transients"? At least it seems like that's what you mean with "onsets".

@daschuer
Member Author

"Onset" is the technical term for exactly this:
https://en.wikipedia.org/wiki/Onset_%28audio%29?wprov=sfla1

@Be-ing
Contributor

Be-ing commented Jul 20, 2020

This algorithm is a good foundation for omitted, skipped or added beat detection.

Can you elaborate on this? In my testing of #2877, the biggest problem was the analyzer detecting a very incorrect tempo when there are no loud rhythmic elements but the tempo is actually constant. Could this help with that somehow?

@daschuer
Member Author

Yes.

This is in contrast to the Queen Mary algorithm, which has no expectation of skipped or added beats built in.
It warps the beats onto a grid with a smoothly changing tempo. That reaches as many beats as possible and works perfectly for following tracks like the fold4, warp5 track, but fails for added and skipped beats.

The new approach is to not move beats out of the onset region, so that no double beats are produced when beatmatching.

@Be-ing
Contributor

Be-ing commented Jul 21, 2020

Great, if we can overcome the problem of inaccurately detecting tempo changes when there are not prominent beats, I think users will not need to manually tell Mixxx to assume a constant tempo.

@Holzhaus Holzhaus marked this pull request as draft July 24, 2020 08:31
* Adjust the Boom vs Tshack compensation curve with more test tracks
* Use a smoothing filter for the floating threshold after detecting a beat
* Also look at the falling edge of a beat when finding SD peaks
@Be-ing Be-ing changed the base branch from master to main October 23, 2020 23:16
@github-actions

This PR is marked as stale because it has been open 90 days with no activity.

@github-actions github-actions bot added the stale Stale issues that haven't been updated for a long time. label Jan 22, 2021
@daschuer daschuer mentioned this pull request Mar 5, 2021
Member

@Holzhaus Holzhaus left a comment

There are merge conflicts and a bunch of compiler warnings now.

</widget>
</item>
<item row="6" column="0">
<widget class="QLabel" name="label_21">
Member

AutoUic: /home/jan/Projects/mixxx/src/preferences/dialog/dlgprefrhythm.ui: Warning: The name 'label_21' (QLabel) is already in use, defaulting to 'label_211'.

max = tempos.value();
}
}
return mode;
Member

If tempos.hasNext() is false, then mode is returned uninitialized.
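
A minimal sketch of one way to address this, assuming the helper computes the most frequent tempo from a histogram: initialize `mode` with a defined fallback before the loop. The function name, the `std::map` stand-in for the Qt container, and the fallback value of 0.0 are all hypothetical.

```cpp
#include <map>

// Hypothetical stand-in for the reviewed helper: returns the tempo with the
// highest count, or `fallback` when the histogram is empty.
double mostFrequentTempo(const std::map<double, int>& tempoCounts,
        double fallback = 0.0) {
    double mode = fallback; // initialized, so the empty case is well defined
    int max = 0;
    for (const auto& entry : tempoCounts) {
        if (entry.second > max) {
            mode = entry.first;
            max = entry.second;
        }
    }
    return mode;
}
```

Callers can then treat the fallback as "no tempo found" instead of reading an indeterminate value.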


} // namespace

AnalyzerRhythm::AnalyzerRhythm(UserSettingsPointer pConfig)
Member

/home/jan/Projects/mixxx/src/analyzer/analyzerrhythm.cpp: In constructor ‘AnalyzerRhythm::AnalyzerRhythm(UserSettingsPointer)’:
/home/jan/Projects/mixxx/src/analyzer/analyzerrhythm.cpp:52:52: error: unused parameter ‘pConfig’ [-Werror=unused-parameter]
   52 | AnalyzerRhythm::AnalyzerRhythm(UserSettingsPointer pConfig)
      |                                ~~~~~~~~~~~~~~~~~~~~^~~~~~~
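
One common way to silence `-Werror=unused-parameter` without changing the signature is to leave the parameter unnamed; Qt's `Q_UNUSED(pConfig)` macro is another option. The sketch below uses stand-in types so that it compiles on its own. `UserSettings`, `AnalyzerRhythmSketch` and the initial value of 4 are assumptions, not the Mixxx code.

```cpp
// Hypothetical stand-ins for the Mixxx types, only to keep the sketch
// self-contained.
struct UserSettings {};
using UserSettingsPointer = UserSettings*;

class AnalyzerRhythmSketch {
  public:
    // Leaving the parameter unnamed keeps the signature intact but tells
    // the compiler that it is intentionally unused.
    explicit AnalyzerRhythmSketch(UserSettingsPointer /* pConfig */)
            : m_beatsPerBar(4) { // assumed default, for illustration only
    }

    int beatsPerBar() const {
        return m_beatsPerBar;
    }

  private:
    int m_beatsPerBar;
};
```

If the configuration is meant to be read later, storing the pointer in a member would also remove the warning.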

std::vector<double> AnalyzerRhythm::computeSnapGrid() {
int size = m_detectionResults.size();

int dfType = 3; // ComplexSD
Member

unused variable

// here we detect if the segment has a constant tempo or not
if (partLenght > m_beatsPerBar * 2) {
int middle = partLenght / 2;
auto beatsAtLeft = QVector<double>::fromStdVector(std::vector<double>(
Member

/home/jan/Projects/mixxx/src/analyzer/analyzerrhythmbpm.cpp: In member function ‘std::tuple<QVector<double>, QMap<int, double> > AnalyzerRhythm::FixBeatsPositions()’:
/home/jan/Projects/mixxx/src/analyzer/analyzerrhythmbpm.cpp:158:49: error: ‘static QVector<T> QVector<T>::fromStdVector(const std::vector<T>&) [with T = double]’ is deprecated: Use QVector<T>(vector.begin(), vector.end()) instead. [-Werror=deprecated-declarations]
  158 |             auto beatsAtLeft = QVector<double>::fromStdVector(std::vector<double>(
      |                                                 ^~~~~~~~~~~~~
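
The compiler message itself points to the replacement: construct the container from an iterator range instead of calling `fromStdVector`. The sketch below shows the same iterator-range pattern with `std::vector` only, so that it compiles without Qt. `splitAtMiddle` is a hypothetical helper and does not reproduce the logic of `FixBeatsPositions()`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: splits `beats` at the middle into a left and a right
// half using the iterator-range constructor. The compiler suggests the same
// pattern for Qt: QVector<T>(vector.begin(), vector.end()).
std::pair<std::vector<double>, std::vector<double>> splitAtMiddle(
        const std::vector<double>& beats) {
    auto middle = beats.begin() +
            static_cast<std::ptrdiff_t>(beats.size() / 2);
    return {std::vector<double>(beats.begin(), middle),
            std::vector<double>(middle, beats.end())};
}
```

Building each half directly from its iterator range also avoids the temporary `std::vector` that the deprecated call needed as an argument.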

@poelzi
Contributor

poelzi commented Mar 5, 2021

Interesting, I will give this a try soon.
I think if we are able to detect not only onsets but transients as well, we could find the proper alignment of phrases based on that. Most music, as far as I can see, always starts a phrase with a transient. But I guess this is more something for when #2961 lands.

@github-actions github-actions bot removed the stale Stale issues that haven't been updated for a long time. label Mar 6, 2021
@github-actions

github-actions bot commented Jun 4, 2021

This PR is marked as stale because it has been open 90 days with no activity.

@github-actions github-actions bot added the stale Stale issues that haven't been updated for a long time. label Jun 4, 2021
Labels
stale Stale issues that haven't been updated for a long time.
6 participants