Adding loader for BAF #577

guillemcortes · 2023-02-23T15:19:50Z

Adding loader for BAF

Description

Please include the following information in the top-level docstring of the dataset's module mydataset.py:

Describe annotations included in the dataset
Indicate the size of the dataset (e.g. number of files and duration in hours)
Mention the origin of the dataset (e.g. creator, institution)
Describe the type of music included in the dataset
Indicate any relevant papers related to the dataset
Include a description about how the data can be accessed and the license it uses (if applicable)

Dataset loaders checklist:

Create a script in scripts/, e.g. make_my_dataset_index.py, which generates an index file.
Run the script on the canonical version of the dataset and save the index in mirdata/indexes/ e.g. my_dataset_index.json.
Create a module in mirdata, e.g. mirdata/my_dataset.py
Create tests for your loader in tests/datasets/, e.g. test_my_dataset.py
Add your module to docs/source/mirdata.rst and docs/source/table.rst
Run tests/test_full_dataset.py on your dataset.
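
The index-generation step in the checklist above can be sketched as follows. This is a minimal illustration, not the BAF script: the function names `md5` and `make_index`, the `*.wav` glob, and the index layout (a `version` key plus a `tracks` mapping of track id to `[relative path, checksum]`) are assumptions for the example, and the real BAF index may be organized differently.

```python
import hashlib
import json
from pathlib import Path


def md5(path: Path) -> str:
    """Return the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fhandle:
        for chunk in iter(lambda: fhandle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_index(data_path: str, version: str = "1.0") -> dict:
    """Map each audio file's stem (used as track id) to its path and checksum."""
    root = Path(data_path)
    tracks = {}
    for audio in sorted(root.rglob("*.wav")):
        # Paths are stored relative to the dataset root, with forward slashes.
        tracks[audio.stem] = {
            "audio": [audio.relative_to(root).as_posix(), md5(audio)]
        }
    return {"version": version, "tracks": tracks}


def write_index(data_path: str, index_path: str) -> None:
    """Serialize the index to a JSON file (hypothetical helper)."""
    with open(index_path, "w") as fhandle:
        json.dump(make_index(data_path), fhandle, indent=2)
```

Storing a checksum next to each path is what lets a validation step detect missing or altered files in a local copy of the dataset.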

If your dataset is not fully downloadable there are two extra steps you should follow:

Contact the mirdata organizers by opening an issue or PR so we can discuss how to proceed with the closed dataset. --> Talked to @genisplaja
Show that the version used to create the checksum is the "canonical" one, either by getting the version from the dataset creator, or by verifying equivalence with several other copies of the dataset.
Make sure someone has run pytest -s tests/test_full_dataset.py --local --dataset my_dataset once on your dataset locally and confirmed it passes

Please-do-not-edit flag

To reduce friction, we will make commits on top of contributors' pull requests by default unless they use the please-do-not-edit flag. If you don't want this to happen, don't forget to add the flag when you start your pull request.

guillemcortes · 2023-02-24T15:49:27Z

I might need some help passing the failing checks.

Both buildpy37 and buildpy38 are failing in tests/datasets/test_egfxset.py, a test for another dataset.
I've run black --target-version py38 mirdata/ tests/, but the formatting check still fails.
I see that @genisplaja is facing the same issues as above in PR-576.
The readthedocs.org:mirdata check fails with an error when importing pandas. I see that openmic2018 uses pandas, so I'm wondering whether this has happened before. This is the traceback:

WARNING: autodoc: failed to import module 'baf' from module 'mirdata.datasets'; the following exception was raised:
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/sphinx/ext/autodoc/importer.py", line 70, in import_module
    return importlib.import_module(modname)
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/checkouts/577/mirdata/datasets/baf.py", line 120, in <module>
    import pandas as pd
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import (
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/compat/__init__.py", line 15, in <module>
    from pandas.compat.numpy import (
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/compat/numpy/__init__.py", line 7, in <module>
    from pandas.util.version import Version
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/util/__init__.py", line 1, in <module>
    from pandas.util._decorators import (  # noqa
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/util/_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly  # noqa
  File "/home/docs/checkouts/readthedocs.org/user_builds/mirdata/envs/577/lib/python3.7/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
TypeError: numpy.dtype is not a type object

dagett · 2023-02-25T20:59:31Z

The failing test in test_egfxset.py looks like it is due to the recent librosa release; see #578.

#578

* Fix tox for formatting test
* Pin black version to 23.1.0
* Upgrade librosa version and ensure python3.6 compatibility
* Black formatting with new 23.1.0 version
* Fixing egfxset expected return value
* Mock pandas import at sphinx autodoc
* Fix black version for python3.6
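
The "Mock pandas import at sphinx autodoc" item corresponds to Sphinx's `autodoc_mock_imports` option, which makes autodoc substitute a stub for the listed modules instead of importing them. A minimal sketch, assuming the setting goes in the project's Sphinx conf.py (the exact file location and the full list of mocked modules used in mirdata are not shown in this thread):

```python
# Sphinx conf.py (fragment; location in the repository is an assumption)

# Modules listed here are replaced by mock objects while autodoc imports
# the documented package, so the real pandas (and its compiled extensions)
# is never loaded during the documentation build.
autodoc_mock_imports = ["pandas"]
```

With pandas mocked, importing mirdata.datasets.baf during the docs build no longer reaches pandas' compiled modules, which is where the `TypeError: numpy.dtype is not a type object` in the traceback was raised.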

codecov · 2023-03-07T09:34:32Z

Codecov Report

Merging #577 (0bf86c5) into master (c9fd249) will increase coverage by 0.00%.
The diff coverage is 98.03%.

@@           Coverage Diff            @@
##           master     #577    +/-   ##
========================================
  Coverage   96.85%   96.86%            
========================================
  Files          55       57     +2     
  Lines        6705     6818   +113     
========================================
+ Hits         6494     6604   +110     
- Misses        211      214     +3
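
The coverage percentages in the table follow from hits divided by lines. A small sketch that reproduces them from the numbers reported above (the helper name `coverage_pct` is made up for this example; the 98.03% diff coverage is computed by Codecov from figures not shown here, so it is not reproduced):

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100 * hits / lines, 2)


# Hits and misses add up to the total number of tracked lines.
assert 6494 + 211 == 6705  # master
assert 6604 + 214 == 6818  # after merging #577

master = coverage_pct(6494, 6705)  # 96.85
merged = coverage_pct(6604, 6818)  # 96.86
print(master, merged)
```

The unrounded change is below 0.01 percentage points, which is consistent with the report's "increase coverage by 0.00%".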

Guillem Cortès and others added 10 commits February 16, 2023 16:17

baf index files

83b5c17

update index

193317e

baf dataloader first commit

6449901

pre-alpha version of baf loader. Seems to be working with simple testing

d723e97

BAF toy version for testing

173f7f1

trim audios to 1 second length

c0561fa

add tests

aa51a86

Loader passing tests

c7acf39

Add docs

0ce0f8c

Rename make index script

d5873f6

guillemcortes changed the title ~~Adding loader for BAF~~ [WIP] Adding loader for BAF Feb 23, 2023

guillemcortes added 7 commits February 23, 2023 17:32

Test fix

dfe0af0

Ignore test load_matches and other fix

5c2c836

Fix local test

6445622

Fix tox for formatting test

aee4330

Remove REMOTES

326d428

black correct formatting --target py38

674a3b6

REMOTES as None

45137e4

guillemcortes changed the title ~~[WIP] Adding loader for BAF~~ Adding loader for BAF Feb 24, 2023

guillemcortes changed the title ~~Adding loader for BAF~~ [WIP] Adding loader for BAF Feb 24, 2023

guillemcortes added 2 commits February 24, 2023 16:26

mypy tests fix and other

c2cf704

black formatting

c952166

dagett mentioned this pull request Feb 25, 2023

test_egfxset.py is failing due to librosa API change #578

Closed

guillemcortes added 2 commits February 27, 2023 16:05

Fixing librosa version to 0.9.2 due to #578 issue @dagett

50b74a4

Mocking pandas import for sphinx autodocs

ee609a9

guillemcortes changed the title ~~[WIP] Adding loader for BAF~~ Adding loader for BAF Feb 28, 2023

guillemcortes and others added 2 commits March 7, 2023 09:26

baf index files

ee19f90

guillemcortes added 20 commits March 7, 2023 09:48

update index

495a41a

baf dataloader first commit

3ffbc54

pre-alpha version of baf loader. Seems to be working with simple testing

746aed7

BAF toy version for testing

882c2c5

trim audios to 1 second length

f8a7d84

add tests

3e5d0d0

Loader passing tests

f632403

Add docs

d5c7a9c

Rename make index script

a3b9e5c

Test fix

37f490f

Ignore test load_matches and other fix

bab979d

Fix local test

add9465

Remove REMOTES

398b884

black correct formatting --target py38

58033ed

REMOTES as None

e75b509

mypy tests fix and other

2b1d191

black formatting

e0379bc

Fixing librosa version to 0.9.2 due to #578 issue @dagett

3caa6f8

Mocking pandas import for sphinx autodocs

e2d62df

Merge branch 'master' of github.com:guillemcortes/mirdata

f6fcb6c

guillemcortes added 2 commits March 7, 2023 11:51

Better test coverage

034bc31

Remove unused import

0bf86c5

guillemcortes closed this by deleting the head repository Mar 7, 2023
