TrackRef = Track Location + Track ID #901

uklotzde · 2016-03-03T16:40:13Z

A class for referencing tracks with or without an id. It is needed for the introduction of a TrackCache that manages tracks by their TrackRef.

The new class encapsulates both the location and id of a track. Each track has a unique location, but only tracks that are stored in the library have an id. The id is created and assigned when storing the track in the database for the first time.

This class also consistently calculates the track location from QFileInfo.

daschuer · 2016-03-04T15:00:16Z

I think I have understand what you aim, even if the new TrackRef is not used.

... but IMHO for the additional self documentation we introduce a new degree of complexity, another Object type other developers have to learn.
Can't we move the new features to the TrackInfoObject or to a track info base class?

Always calculating and storing the canonical path, might be an overhead that we don't need for all use-cases of TrackRef. I would prefer a solution that caches the canonical path only on demand.
Such caching is already done by QFileInfo, so way may want to wrap that, instead of our own caching.

daschuer · 2016-03-04T15:02:35Z

src/track/trackref.h

+};
+
+inline
+bool operator==(const TrackRef& lhs, const TrackRef& rhs) {


This is a candidate for missus since it might fail even if we reference the same track.
I would prefer to introduce a named function for comparing.
Like isRefferencingTheSameTrack() or similar.

I don't agree. This is the canonical implementation of the equality operator
for a simple value object, member by member.

If on the other hand the class name of TrackRef is misleading than we should
instead choose a different one.

On 03/04/2016 04:02 PM, Daniel Schürmann wrote:

In src/track/trackref.h
#901 (comment):

}

bool isValid() const {

return hasId() || hasCanonicalLocation();

}

+private:

bool verifyConsistency() const;

QString m_location;

QString m_canonicalLocation;

TrackId m_id;
+};

+inline
+bool operator==(const TrackRef& lhs, const TrackRef& rhs) {

This is a candidate for missus since it might fail even if we reference
the same track.
I would prefer to introduce a named function for comparing.
Like isRefferencingTheSameTrack() or similar.

—
Reply to this email directly or view it on GitHub
https://github.com/mixxxdj/mixxx/pull/901/files#r55039820.

...because construction from QFileInfo is costly and should be used consciously.

uklotzde · 2016-03-05T11:11:31Z

TrackRef is a new class, but it encapsulates a separate aspect of tracks
that does not fit into TrackInfoObject. TrackInfoObject is still too large
and needs to be decomposed.

I don't think that we should introduce an additional layer of caching.
Storing the QFileInfo and calculating canonicalLocation() dynamically is not
an option, because this class is used as a key in the TrackCache and keys
must be immutable! It is only allowed to set the ID once.

Added comments to location()/canonicalLocation() about potential file
system access
Moved the construction from a QFileInfo into a separate factory function
to avoid misuse

On 03/04/2016 04:00 PM, Daniel Schürmann wrote:

I think I have understand what you aim, even if the new TrackRef is not used.

... but IMHO for the additional self documentation we introduce a new
degree of complexity, another Object type other developers have to learn.
Can't we move the new features to the TrackInfoObject or to a track info
base class?

Always calculating and storing the canonical path, might be an overhead
that we don't need for all use-cases of TrackRef. I would prefer a
solution that caches the canonical path only on demand.
Such caching is already done by QFileInfo, so way may want to wrap that,
instead of our own caching.

—
Reply to this email directly or view it on GitHub
#901 (comment).

daschuer · 2016-03-05T16:57:59Z

It looks like we have a different view on this PR. Maybe it is because the
new object is still unused and I have other target issues in mind than you.

The object itself, locks nice and you are right in separating different
things to extra objects, so that is not the point.

For me, the main issue we have to solve is a concurrent file access by two
track objects. This may happen because Mixxx may crate two track objects in
different parts that reference the same file. Due to originating from
external libraries or duplicates by symlinks.

As I understand the way to solve it, is a track cache that tracks tracks by
its canonical path.
Once Mixxx accesses the file, to look up the track in this cache an respect
sync facilities.
Locking up the canonical path is a non cheap system call, so we should only
look it up for this purpose.
Mixxx should be remain to be able to instantiate two Track objects for the
same file, since this reflects the truth if thy presents two tracks in the
library with different meta data or as members of different playlist.
Of cause this is an annoying situation we should be able to handle. That
can be a part of a duplicate merge class.

The canonical path is cached in the Qfileinfo class which is part of
tracking.
From this point of view the is no need for a new trackref class?

Will this approach work, or do I miss something?

Which use case do you have in mind?
When do we need for example the compare operators?
TrackRef is a new class, but it encapsulates a separate aspect of tracks
that does not fit into TrackInfoObject. TrackInfoObject is still too large
and needs to be decomposed.

I don't think that we should introduce an additional layer of caching.
Storing the QFileInfo and calculating canonicalLocation() dynamically is
not
an option, because this class is used as a key in the TrackCache and keys
must be immutable! It is only allowed to set the ID once.

Added comments to location()/canonicalLocation() about potential file
system access
Moved the construction from a QFileInfo into a separate factory function
to avoid misuse

On 03/04/2016 04:00 PM, Daniel Schürmann wrote:

I think I have understand what you aim, even if the new TrackRef is not
used.

... but IMHO for the additional self documentation we introduce a new
degree of complexity, another Object type other developers have to learn.
Can't we move the new features to the TrackInfoObject or to a track info
base class?

Always calculating and storing the canonical path, might be an overhead
that we don't need for all use-cases of TrackRef. I would prefer a
solution that caches the canonical path only on demand.
Such caching is already done by QFileInfo, so way may want to wrap that,
instead of our own caching.

—
Reply to this email directly or view it on GitHub
#901 (comment).

—
Reply to this email directly or view it on GitHub
#901 (comment).

uklotzde · 2016-03-05T18:27:11Z

We must not instantiate two TrackInfoObjects for the same file! Otherwise
writing tags back into files is unsafe. This is the current situation.

The TrackCache will create and hand out managed pointers to tracks, both
from the library and from the file system. TrackInfoObjects MUST always be
allocated through and managed by this cache. Tracks are registered with both
their canonical path and their ID in this cache. The ID is missing for
tracks that have not been added to the library (yet).

Maybe we can delay the calculation of the canonical path until it is
actually needed, i.e. after looking up the track by ID failed. But this
optimization will only affect the interface of TrackCache, i.e. TrackRef
will no longer appear as a parameter in the lookup()/resolve() functions.
The TrackRef class itself is not affected. It is just only created
internally by TrackCache on demand.

I thought we could pull this PR out of the TrackCache stuff, because
TrackRef represents concepts that are independent of the TrackCache. But if
it causes too much confusion I will close this PR.

On 03/05/2016 05:58 PM, Daniel Schürmann wrote:

It looks like we have a different view on this PR. Maybe it is because the
new object is still unused and I have other target issues in mind than you.

The object itself, locks nice and you are right in separating different
things to extra objects, so that is not the point.

For me, the main issue we have to solve is a concurrent file access by two
track objects. This may happen because Mixxx may crate two track objects in
different parts that reference the same file. Due to originating from
external libraries or duplicates by symlinks.

As I understand the way to solve it, is a track cache that tracks tracks by
its canonical path.
Once Mixxx accesses the file, to look up the track in this cache an respect
sync facilities.
Locking up the canonical path is a non cheap system call, so we should only
look it up for this purpose.
Mixxx should be remain to be able to instantiate two Track objects for the
same file, since this reflects the truth if thy presents two tracks in the
library with different meta data or as members of different playlist.
Of cause this is an annoying situation we should be able to handle. That
can be a part of a duplicate merge class.

The canonical path is cached in the Qfileinfo class which is part of
tracking.

From this point of view the is no need for a new trackref class?

Will this approach work, or do I miss something?

Which use case do you have in mind?
When do we need for example the compare operators?
TrackRef is a new class, but it encapsulates a separate aspect of tracks
that does not fit into TrackInfoObject. TrackInfoObject is still too large
and needs to be decomposed.

I don't think that we should introduce an additional layer of caching.
Storing the QFileInfo and calculating canonicalLocation() dynamically is
not
an option, because this class is used as a key in the TrackCache and keys
must be immutable! It is only allowed to set the ID once.

Added comments to location()/canonicalLocation() about potential file
system access

Moved the construction from a QFileInfo into a separate factory function
to avoid misuse

On 03/04/2016 04:00 PM, Daniel Schürmann wrote:

I think I have understand what you aim, even if the new TrackRef is not
used.

... but IMHO for the additional self documentation we introduce a new
degree of complexity, another Object type other developers have to learn.
Can't we move the new features to the TrackInfoObject or to a track info
base class?

Always calculating and storing the canonical path, might be an overhead
that we don't need for all use-cases of TrackRef. I would prefer a
solution that caches the canonical path only on demand.
Such caching is already done by QFileInfo, so way may want to wrap that,
instead of our own caching.

—
Reply to this email directly or view it on GitHub
#901 (comment).

—
Reply to this email directly or view it on GitHub
#901 (comment).

—
Reply to this email directly or view it on GitHub
#901 (comment).

uklotzde · 2016-03-05T19:07:25Z

Delaying the calculation of the canonical file path was a good idea! That
even improves the interface of TrackCache :)

But as I already mentioned, the class TrackRef is unaffected.

On 03/05/2016 07:27 PM, Uwe Klotz wrote:

We must not instantiate two TrackInfoObjects for the same file! Otherwise
writing tags back into files is unsafe. This is the current situation.

The TrackCache will create and hand out managed pointers to tracks, both
from the library and from the file system. TrackInfoObjects MUST always be
allocated through and managed by this cache. Tracks are registered with
both their canonical path and their ID in this cache. The ID is missing
for tracks that have not been added to the library (yet).

Maybe we can delay the calculation of the canonical path until it is
actually needed, i.e. after looking up the track by ID failed. But this
optimization will only affect the interface of TrackCache, i.e. TrackRef
will no longer appear as a parameter in the lookup()/resolve() functions.
The TrackRef class itself is not affected. It is just only created
internally by TrackCache on demand.

I thought we could pull this PR out of the TrackCache stuff, because
TrackRef represents concepts that are independent of the TrackCache. But
if it causes too much confusion I will close this PR.

On 03/05/2016 05:58 PM, Daniel Schürmann wrote:

It looks like we have a different view on this PR. Maybe it is because the
new object is still unused and I have other target issues in mind than you.

The object itself, locks nice and you are right in separating different
things to extra objects, so that is not the point.

For me, the main issue we have to solve is a concurrent file access by two
track objects. This may happen because Mixxx may crate two track objects in
different parts that reference the same file. Due to originating from
external libraries or duplicates by symlinks.

As I understand the way to solve it, is a track cache that tracks tracks by
its canonical path.
Once Mixxx accesses the file, to look up the track in this cache an respect
sync facilities.
Locking up the canonical path is a non cheap system call, so we should only
look it up for this purpose.
Mixxx should be remain to be able to instantiate two Track objects for the
same file, since this reflects the truth if thy presents two tracks in the
library with different meta data or as members of different playlist.
Of cause this is an annoying situation we should be able to handle. That
can be a part of a duplicate merge class.

The canonical path is cached in the Qfileinfo class which is part of
tracking.

From this point of view the is no need for a new trackref class?

Will this approach work, or do I miss something?

Which use case do you have in mind?
When do we need for example the compare operators?
TrackRef is a new class, but it encapsulates a separate aspect of tracks
that does not fit into TrackInfoObject. TrackInfoObject is still too large
and needs to be decomposed.

I don't think that we should introduce an additional layer of caching.
Storing the QFileInfo and calculating canonicalLocation() dynamically is
not
an option, because this class is used as a key in the TrackCache and keys
must be immutable! It is only allowed to set the ID once.

Added comments to location()/canonicalLocation() about potential file
system access

Moved the construction from a QFileInfo into a separate factory function
to avoid misuse

On 03/04/2016 04:00 PM, Daniel Schürmann wrote:

I think I have understand what you aim, even if the new TrackRef is not
used.

... but IMHO for the additional self documentation we introduce a new
degree of complexity, another Object type other developers have to learn.
Can't we move the new features to the TrackInfoObject or to a track info
base class?

Always calculating and storing the canonical path, might be an overhead
that we don't need for all use-cases of TrackRef. I would prefer a
solution that caches the canonical path only on demand.
Such caching is already done by QFileInfo, so way may want to wrap that,
instead of our own caching.

—
Reply to this email directly or view it on GitHub
#901 (comment).

—
Reply to this email directly or view it on GitHub
#901 (comment).

—
Reply to this email directly or view it on GitHub
#901 (comment).

daschuer · 2016-03-05T22:23:06Z

I am starting to understand your approach.

We must not instantiate two TrackInfoObjects for the same file! Otherwise writing tags back into files is unsafe. This is the current situation.

This is a true requirement from your point of view. The underlying root requirement is:
We must not open a file twice.

Even if we have only a single track object there is no guarantee. But the advantage of your approach is that it allows us to set-up locking mechanism quite easy.
So lets head to a solution that combines the best of both ideas.

Currently a TrackInfo if build around the unique database ID. If we now prevent Mixxx from instantiating a TrackInfo object from a different database ID of a file that is only referenced by an other Track, it will change a lot of the meaning of a track info object.
This will move the "remove duplicates" task to every instantiation of a track info object.
I am afraid this will leads to various issues.

For example in SetlogFeature::slotJoinWithPrevious the TrackInfo object is only required to set the played counter, which has nothing to do with the track file on the HDD.
If we return a track ID from an other track info object which already reference the same file. We will write the wrong DB entry.

This is the source of my Objections: If we do the deduplicate it at the Track level, we will have problems to graceful handle duplicates, which we will have in some legacy databases. Of cause, duplicates are not desired, but they exists and we need keep a way to write the Database entries of a specific Track ID in the database even if we have already an other track object around which referenced the same file but from and on other DB ID.

How do you think about this idea:
Keep two caches:

The track ID cache (as we have already)
An canonical path cache, required to allow only one open file operation at a time.

uklotzde · 2016-03-06T12:03:48Z

I'll close this PR for now. The commits are still available in my "WriteAudioTags" branch/PR.

I have to think about the implications of an alternative solution or how to handle inconsistencies/duplicates in existing libraries.

uklotzde added 4 commits March 3, 2016 17:33

Cleanup includes for TrackInfoObject

e2f8010

Reorder member function in TrackInfoObject

e732c44

Reorder some includes

f02e7fc

Encapsulate location and id of tracks in TrackRef

d673284

daschuer reviewed Mar 4, 2016
View reviewed changes

Move construction from QFileInfo into a separate factory function

2eacde6

...because construction from QFileInfo is costly and should be used consciously.

uklotzde mentioned this pull request Mar 6, 2016

Write audio tags back into files #728

Merged

uklotzde closed this Mar 6, 2016

uklotzde deleted the TrackRef branch March 6, 2016 12:09

mixxxbot mentioned this pull request Aug 22, 2022

duplicate titles when using Symlinks in Library #8009

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TrackRef = Track Location + Track ID #901

TrackRef = Track Location + Track ID #901

uklotzde commented Mar 3, 2016

daschuer commented Mar 4, 2016

daschuer Mar 4, 2016

uklotzde Mar 5, 2016

uklotzde commented Mar 5, 2016

daschuer commented Mar 5, 2016

uklotzde commented Mar 5, 2016

uklotzde commented Mar 5, 2016

daschuer commented Mar 5, 2016

uklotzde commented Mar 6, 2016

TrackRef = Track Location + Track ID #901

TrackRef = Track Location + Track ID #901

Conversation

uklotzde commented Mar 3, 2016

daschuer commented Mar 4, 2016

daschuer Mar 4, 2016

Choose a reason for hiding this comment

uklotzde Mar 5, 2016

Choose a reason for hiding this comment

uklotzde commented Mar 5, 2016

daschuer commented Mar 5, 2016

uklotzde commented Mar 5, 2016

uklotzde commented Mar 5, 2016

daschuer commented Mar 5, 2016

uklotzde commented Mar 6, 2016