Parallel processing repairing playlist #203
Conversation
@touwys, can you review this PR? Expected result:
Please ignore the version number. |
Tested with a c 13-year-old Windows 7 x64 PC with a 4-core Intel Core i7-3770K CPU. These observations are not absolutely observed: Numerous playlists were submitted for test-repair. Those tested, contain any number of tracks from 19 up to 631. Result:
@Borewit: Great job — once again. 🏆 |
I had a look at the closest-matches algorithm; it looks like Jeremy optimized the algorithm for your PC @touwys: listFix/src/main/java/listfix/model/playlists/PlaylistEntry.java Lines 148 to 150 in 9a9c96a
🤣 |
Despite the funny comment, I removed that optimization and replaced it with parallel processing as well. Looking forward to hearing whether the find-closest-match algorithm runs faster as well @touwys. |
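As a rough illustration of the idea being discussed (not the actual listFix() code), a find-closest-match pass can be parallelized by scoring every candidate file on a parallel stream and keeping the best score. The class, the prefix-based `score` function, and the file names below are all hypothetical stand-ins:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ClosestMatchSketch {
    // Hypothetical similarity score: length of the common lowercase prefix.
    // The real algorithm would use a proper string-distance metric.
    static int score(String broken, String candidate) {
        String a = broken.toLowerCase(), b = candidate.toLowerCase();
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    // Score every media file concurrently; max() reduces to the best match.
    static Optional<String> closestMatch(String brokenName, List<String> mediaFiles) {
        return mediaFiles.parallelStream()
                .max(Comparator.comparingInt(c -> score(brokenName, c)));
    }

    public static void main(String[] args) {
        List<String> library = List.of("track01.mp3", "song-a.mp3", "song-abc.mp3");
        // "song-abc.mp3" shares the longest prefix with the broken entry.
        System.out.println(closestMatch("song-ab.mp3", library).orElse("none"));
    }
}
```

Because each candidate is scored independently, the reduction is safe to run on a parallel stream; only the scoring function itself must be thread-safe.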
Is it ready to try out? I can only attend to it tomorrow.
I hope that it is as noticeable a difference as with finding the exact matches. What really counts though, is the accuracy of the matches, rather than the speed by which they are delivered. What I have seen so far, is that the current search algorithm is actually quite accurate for finding matching tracks where the broken ones form part of various artist compilations. Obviously, it's got more data available to process, in the file name of the broken track. So, one could probably retain that strength, and focus on massaging the algorithm where it tries to locate the tracks with very little data available in the filename. |
It's a good thing you didn't test yet. That was not the smartest location to apply parallel processing. I have adjusted the processing, keeping both parallel processing and Jeremy's memory optimization for now. For me the closest match runs very fast, so it is hard to notice a big difference. In this PR nothing changed from a functional point of view, so the accuracy has not been changed. On my Intel(R) Core(TM) i7-7800X CPU @ 3.50GHz, 6 cores (12 logical cores): Before this PR: After this PR: |
@Borewit : The advance in your results is clearly very impressive. Even if the huge difference in hardware makes a straightforward comparison of the results quite impossible, I easily recognised the improvement in my results. You've certainly done something special here. To what extent do other variables, e.g. the number of tracks in the media database, parsing of the track information by the search algorithm, etc., play a role in advancing the search speed?
Which build should I then download to use next, or should I wait for another? Please indicate. |
I expect no significant difference. For a background job that takes at least a few seconds, the advantage of parallel processing will likely be higher than for a relatively short job, as a result of a more optimal spread of work and the overhead being relatively smaller. I am not much in favor of gaining small advantages at the cost of more complex code, in line with "premature optimization is the root of all evil". The optimization done for the "users with an ancient computer" is probably an example of that; it is almost asking for trouble. If you can, please re-test the latest build. |
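The point about short jobs can be made concrete: for a small playlist, the cost of splitting work across the fork/join pool can outweigh any speed-up, so one reasonable pattern is to fall back to a sequential stream below a size threshold. This is a sketch of that idea only; the class name, the threshold value, and the trivial `toUpperCase` stand-in for the repair work are all assumptions, not listFix() code:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class AdaptiveStreamSketch {
    // Hypothetical cut-off: below this size, thread-pool overhead likely
    // outweighs any parallel speed-up, so stay sequential.
    static final int PARALLEL_THRESHOLD = 100;

    static List<String> process(List<String> entries) {
        Stream<String> s = entries.size() >= PARALLEL_THRESHOLD
                ? entries.parallelStream()
                : entries.stream();
        // The per-entry work is a placeholder for the real repair step.
        return s.map(String::toUpperCase).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A 2-entry playlist takes the sequential branch.
        System.out.println(process(List.of("a.mp3", "b.mp3")));
    }
}
```

Both branches produce identical results; only the scheduling differs, which keeps the added complexity contained to one conditional.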
@Borewit : listFix()-2.8.1.2 delivered some very interesting results.
Setup
Results
Test 1
Test 2
CPU |
Changed playlist repair to use parallel processing.
On a CPU with at least 3 cores, on lengthy playlists, this should reduce the total processing time.
The playlist is no longer fixed in perfect sequence.
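The "no longer fixed in perfect sequence" remark can be illustrated with a small sketch (hypothetical names; the `repair` step below is a trivial placeholder, not the real repair logic): on a parallel stream the per-entry work runs out of order, but collecting the results still assembles them in encounter order, so the repaired playlist keeps its original track sequence:

```java
import java.util.List;
import java.util.stream.Collectors;

public class ParallelRepairSketch {
    // Placeholder repair step: normalize the path separator.
    static String repair(String entry) {
        return entry.replace('\\', '/');
    }

    public static void main(String[] args) {
        List<String> playlist = List.of("a\\1.mp3", "b\\2.mp3", "c\\3.mp3");

        // Entries are repaired concurrently, so progress (and any UI updates)
        // happens out of sequence, but collect() on a parallel stream still
        // emits results in encounter order: the playlist order is preserved.
        List<String> repaired = playlist.parallelStream()
                .map(ParallelRepairSketch::repair)
                .collect(Collectors.toList());

        System.out.println(repaired);
    }
}
```

In other words, only the order in which individual entries get fixed changes; the final playlist comes out in its original order.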