Hi, this is a question, not an issue.
I have a bunch of features that I track over time. I am feeding them into
import ruptures as rpt

algo = rpt.Pelt(model=model, min_size=1, jump=1)  # model is a cost string, e.g. "l1" or "l2"
algo.fit(signal)
result = algo.predict(pen=p)  # result of change point detection: list of breakpoint indices
signal here is (for example) a 500x16 array (timepoints x features). The features themselves live on pretty different scales, so I thought some kind of scaling/normalization (for example via sklearn.preprocessing.scale, https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html#sklearn.preprocessing.scale) could make sense. Now I wonder, though, how the different cost functions would be affected by that. In the example I am attaching below you can see the normalized signal for the L1 and L2 norms; change points are depicted with dashed lines. You can see that there are some obvious misses (calibrating the penalty helps sometimes, but it is a finicky process).
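For concreteness, the preprocessing step I have in mind is just per-feature standardization before fitting, roughly as in the sketch below (the "l2" cost and the re-use of the penalty p are illustrative; the penalty would need re-tuning after scaling):

# Hedged sketch: standardize each feature (column) to zero mean and
# unit variance, then run PELT on the scaled signal.
import ruptures as rpt
from sklearn.preprocessing import scale

signal_scaled = scale(signal)  # signal: (n_timepoints, n_features) array, scaled column-wise
algo = rpt.Pelt(model="l2", min_size=1, jump=1).fit(signal_scaled)
result = algo.predict(pen=p)  # p must be re-calibrated after scaling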
Should normalization be skipped altogether, or is there a better alternative cost for these kinds of signals?
As an unrelated question: what are you using to draw these graphs?
Should normalization be skipped altogether, or is there a better alternative cost for these kinds of signals?
I do agree that in some instances there might be a need to remove any preprocessing of the data; this can be done upstream if needed, unless it's an inherent part of the PELT algorithm.
It's not inherent to the PELT algorithm, I think? Unless there is some hidden preprocessing going on.
I would like to know whether I should do my own normalization up front, and how it might affect the different cost functions available in PELT (L1, L2, ...).
Whether to normalize or not is task-dependent, and there is no definitive answer. For multivariate signals, PELT will detect the largest shifts, i.e., those with a large norm ||m_before - m_after||, where m_before and m_after are the multivariate means just before and after the change.
As an example, consider the following 2D signal.
One dimension has large shifts and the other has small shifts. Without normalization, only changes in the large dimension are detected.
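A minimal sketch reproducing this effect (the signal construction, change-point locations, and penalty value are all made up for illustration):

# Two features changing at different times: one large shift on a
# large scale, one small shift on a small scale.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
n = 500
big = np.where(np.arange(n) < 200, 0.0, 5.0) + rng.normal(scale=1.0, size=n)
small = np.where(np.arange(n) < 350, 0.0, 0.5) + rng.normal(scale=0.1, size=n)
signal = np.c_[big, small]

# Without scaling, the L2 cost is dominated by the large-scale feature,
# so only the change at t=200 tends to be found with this penalty.
print(rpt.Pelt(model="l2").fit(signal).predict(pen=50))

# After per-feature standardization, both dimensions contribute
# comparably, and the change at t=350 is usually recovered as well.
scaled = (signal - signal.mean(axis=0)) / signal.std(axis=0)
print(rpt.Pelt(model="l2").fit(scaled).predict(pen=50))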