-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding loader for Beatport EDM key #286
Adding loader for Beatport EDM key #286
Conversation
add to README list
Codecov Report
@@ Coverage Diff @@
## master #286 +/- ##
=======================================
Coverage 99.04% 99.04%
=======================================
Files 24 24
Lines 2523 2523
=======================================
Hits 2499 2499
Misses 24 24 |
# This dataset annotations have characters that doesnt correspond with json format. In particular, "bpm": nan | ||
# doesn't correspond to json format. | ||
# input file | ||
find_replace(os.path.join(data_home, "meta"), ": nan", ": null", "*.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the role of this function when you have an index for all the files in the dataset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Original JSON files have nan
that are not of the JSON standard format. So I can't read JSON. So from what is downloaded of Zenodo is changed each nan
to none
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why the function is there now. Maybe it would be more informative adding the long comment on the readme of the function above and leaving a short one here. Also, thinking out loud, I guess is a good choice to do it in the download so you do it only once and in a consistent way to the index, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, right
mirdata/jams_utils.py
Outdated
@@ -249,7 +249,7 @@ def jams_converter( | |||
|
|||
|
|||
def beats_to_jams(beats): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should not be included in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Black modifies it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try running black -S
mirdata/rwc_classical.py
Outdated
@@ -287,14 +287,14 @@ def track_ids(): | |||
def load(data_home=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should not be included in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Black modifies it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try running it as defined in CONTRIBUTING
black --target-version py37 --skip-string-normalization mirdata/
as a rule of thumb try avoiding global commits and refer to specific files.
you can discard the commits on a specific file by updating the master and replacing your version of the file with the version from the master
git checkout master file_to_restore.ext
mirdata/utils.py
Outdated
@@ -208,10 +208,10 @@ def load_json_index(filename): | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file should not be included in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Black modifies it.
docs/source/datasets.rst
Outdated
| giantsteps_key_ | Giantsteps EDM key | - audio: ✅ | - global :ref:`key` | 1486 | | ||
| | | - annotations: ✅ | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be Beatport?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the audio provided? or should it be an X?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!!!
yes!! the audio is provided!!
mirdata/beatport_key.py
Outdated
@@ -0,0 +1,348 @@ | |||
# -*- coding: utf-8 -*- | |||
"""beatport_key Dataset Loader | |||
The Beatport EDM Key Dataset includes 1,486 two-minute sound excerpts from various EDM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, I think we're using 1486 as notation for now, will see for bigger datasets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok perfect!
done
|
||
Data License: Creative Commons Attribution Share Alike 4.0 International | ||
""" | ||
import fnmatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just leaving a comment here to check if this is in requirements or partial requirements. If it is ignore this comment, if it's not we should discuss it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that fnmatch is from "python standard"
def find_replace(directory, find, replace, pattern): | ||
""" | ||
Replace in some directory all the songs with the format pattern find by replace | ||
Parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point reading the module I'm not sure why this function is needed. Maybe expand on the docs why is it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok! done then
|
||
@utils.cached_property | ||
def key(self): | ||
"""String: key annotation""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've noticed we're mixing docs styles. I don't know if this is the only loader that does it, but I'll add it as part of the audit issue to make sure we are consistent later on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was imitating the Beatles loader. What style must we use?
# This dataset annotations have characters that doesnt correspond with json format. In particular, "bpm": nan | ||
# doesn't correspond to json format. | ||
# input file | ||
find_replace(os.path.join(data_home, "meta"), ": nan", ": null", "*.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why the function is there now. Maybe it would be more informative adding the long comment on the readme of the function above and leaving a short one here. Also, thinking out loud, I guess is a good choice to do it in the download so you do it only once and in a consistent way to the index, right?
mirdata/jams_utils.py
Outdated
@@ -249,7 +249,7 @@ def jams_converter( | |||
|
|||
|
|||
def beats_to_jams(beats): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try running black -S
}, | ||
"guest_pick": false, | ||
"sub_genres": [] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: end line
tests/test_beatport_key.py
Outdated
|
||
audio, sr = track.audio | ||
assert sr == 44100, 'sample rate {} is not 44100'.format(sr) | ||
assert audio.shape == (5292000,), 'audio shape {} was not (5294592,)'.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't numbers match here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@PRamoneda great job! Left my comments there. After you address them I'll take a look again, but first maybe it would be good to also merge the master so I can check and merge right away? |
numbers must match
1486 without comma
@magdalenafuentes I dont know I have requested or not the review! hahahaa Thnks so much! |
Adding loader for Beatport EDM key (#286) * make script and module * fix nan problem * fixed tests * black * Update README.md add to README list * Update mirdata.rst * Update datasets.rst * fix codecov * fix codecov xd * added test in order to pass codecov * Update datasets.rst * Update test_beatport_key.py numbers must match * 1486 without comma 1486 without comma * move comments * try black
Adding loader for Beatport EDM key
Please use the following title: "Adding loader for MyDATASET". If your pull request is work in progress, change your title to "[WIP] Adding loader for MyDATASET" to avoid reviews while the loader is not ready.
Description
Please include the following information at the top level docstring for the dataset's module mydataset.py:
Dataset loaders checklist:
scripts/
, e.g.make_my_dataset_index.py
, which generates an index file.mirdata/indexes/
e.g.my_dataset_index.json
.mirdata/my_dataset.py
tests/
, e.g.test_my_dataset.py
docs/source/mirdata.rst
anddocs/source/datasets.rst
mirdata/__init__.py
README.md
file, sectionCurrently supported datasets
If your dataset is not fully downloadable there are two extra steps you should follow:
It is fully downloadable
Please-do-not-edit flag
To reduce friction, we will make commits on top of contributor's pull requests by default unless they use the
please-do-not-edit
flag. If you don't want this to happen don't forget to add the flag when you start your pull request.