Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What information can be abstracted to the track.Track object? #224

Closed
rabitt opened this issue Apr 8, 2020 · 1 comment
Closed

What information can be abstracted to the track.Track object? #224

rabitt opened this issue Apr 8, 2020 · 1 comment
Labels
enhancement New feature or request question Further information is requested

Comments

@rabitt
Copy link
Collaborator

rabitt commented Apr 8, 2020

The generic mirdata.track.Track object provides a simple abstraction in order to automate the __repr__ method which we used to write manually for each dataset's module.

In #219 @lostanlen experimented with what other information we could abstract in order to simplify each module's Track object. He proposed to have the following live as attributes/properties/methods of the new Track object:

  • audio which loads an attribute in the track index called audio
  • duration (pulling from either the metadata or from audio)
  • a from_jams method which instantiates mirdata objects from a jams object
  • a load_metadata method, which would live at the track-level rather than the module level
  • the to_jams method by programatically parsing the track attributes
  • a validate method, which would live at the track-level rather than the module level
  • the entire init function, by instantiating the object with a track_metadata dictionary

Any thoughts about each of these abstractions?

(@lostanlen correct any of this if I got the details wrong)

@rabitt rabitt added the question Further information is requested label Apr 8, 2020
@lostanlen
Copy link
Collaborator

lostanlen commented Apr 8, 2020

Thank you for raising this. Yes, all of this is correct. Here are some details:

  • audio is really an easy one. It relies on the fact that mirdata indexes have keys to describe them. So, if we encourage index providers to denote their audio files by the key "audio" (and this isn't an eccentric thing to ask), then this audio is immediately available through the cached property track.audio, which is implemented in the parent class. So it ends up being less code for the contributors, because they don't need to implement "audio" (nor load_audio). If the Track in question has no audio, well: raise AttributeError. Then, for fancier datasets which have more than one kind of audio (e.g. GuitarSet), the dataset providers may add other properties on top: audio_hex, audio_mic, etc. But there can only be one audio property; and by convention, we assume that it's where the file indexed as audio lives.

  • duration, on the contrary, is the hardest one. Is it mutable or immutable? It is a property or an attribute? Should we cache it? Suppose that we have an annotation on disk but we haven't downloaded the audio: should we raise an AttributeError until we can get the duration from the audio, or have a tentative guesstimate of the duration via the annotation (e.g. time of the last beat?) These are difficult design questions, and i don't have a set-in-stone answer. The reason why i wrote a hacky version of duration is that i needed it for to_jams to validate.

  • a load_metadata method, which would live at the track-level rather than the module level``
    Well, load_metadata is a `@staticmethod`, so at the scale of a`Track`, it doesn't really matter whether it lives as the track level or the module level. The reason why i included it as part of the object is that i wanted to pass it to the `Dataset` constructor. So this question is largely orthogonal, and we can leave it to a different issue (Should dataset modules have a common Dataset class? #225).

  • however, on the same topic, one thing that i realized last night is that track_id is not a great argument to pass to a Track constructor, even if it is module-specialized. Rather than the key, I much prefer passing track_index and track_metadata, i.e. the values of index and metadata at the key track_id. This makes the values potentially observable to the callsite (Dataset), and therefore allows for relatively neat features, such as filtering Tracks by annotation type ("only load tracks with melody3 annotation" module.load_* should raise IOError, not return None #222) or by some metadata field ("only load GuitarSet tracks above 120 bpm"). This is pretty difficult to do with the current design, because a track_id (string) only tells you so much about what's in there.

*to_jams was the reason i started all this. I think we're on the same page here. My track.to_jams implementation is nothing more than a jams_utils.jams_converter withself as argument.

  • having a Track-level validate is a nice bonus. It's the ten-line piece of code we see in every module right now
    def validate(self):
        missing_files = []
        invalid_checksums = []
        for track_tuple in self.track_index.values():
            track_path, checksum = track_tuple
            # validate that the file exists on disk
            if not os.path.exists(track_path):
                missing_files.append(track_path)
            # validate that the checksum matches
            elif utils.md5(track_path) != checksum:
                invalid_checksums.append(track_path)
        return missing_files, invalid_checksums

Long-term, i would love to implement a "safe" mode of mirdata where we'd validate the track automatically before accessing the audio property, and raises an error if a checksum is invalid. This safe mode could be a flag in the Track constructor. In this way, users wouldn't need to run validate manually to be warned of a problem: they would just load stuff and access it with peace of mind.

  • the entire init function, by instantiating the object with a track_metadata dictionary

Yes. In fact, i could see three stage of "loading" a Track:
(1) parsing the track_id with a static function (parse_track_id). Note that this can be done from the index, without downloading any remote
(2) loading the metadata of the track. load_metadata taking data_home as argument. This is usually done at the Dataset level, because in many cases, there's not one metadata file per track, but one metadata file per dataset.
(3) loading the annotations of the track. This is done at the track level with specialized parsers such as load_beats, `load_melody. We can't unify this part at all because every dataset is different.

At the moment, mirdata does (3) splendidly, (2) inelegantly (copying over values from the track_metadata dictionary into the self object ... in every module), and (1) not at all.

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants