Generalize copy-paste dataset functions to utils? #293

rabitt · 2020-10-19T21:19:08Z

dataset.validate(), dataset.load() and dataset.track_ids() are identical in every loader. How can we generalize this?

Option 1 - lambdas

We define lambda functions in e.g. utils, which get instantiated in each dataset module, e.g.
track_ids = utils.track_ids(DATA.index)
where utils.track_ids is itself a lambda function.

pros: same API, less code to copy paste
cons: still copy pasting code, not very nice coding style
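
A minimal sketch of how Option 1 could look. The name utils_track_ids stands in for utils.track_ids, and the SimpleNamespace stand-in for DATA and the index contents are made up for illustration; none of this is existing loader code.

```python
from types import SimpleNamespace

# Hypothetical helper that would live in utils: given an index, it returns
# the zero-argument function a loader exposes as `track_ids`.
utils_track_ids = lambda index: (lambda: list(index.keys()))

# Stand-in for a loader module's DATA object (index contents are invented).
DATA = SimpleNamespace(index={"track_1": {}, "track_2": {}})

# The one line that every loader module would still have to repeat.
track_ids = utils_track_ids(DATA.index)
```

The user-facing call stays track_ids(), but the assignment line is still duplicated in every loader, which is the "still copy pasting" con above.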

Option 2 - reverse the api

utils.validate('orchset') rather than orchset.validate()

pros: no code copy-pasting, better code style
cons: changes the API, introduces some inconsistency: some things are dataset-major (e.g. orchset.Track), others are function-major (utils.validate('orchset'))
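
A rough sketch of the function-major style, assuming utils keeps a registry from dataset name to index. The INDEXES registry, the existing_files argument and the "file is present" check are simplifications invented for this example, not the real validation logic.

```python
# Hypothetical registry that utils would need: dataset name -> index.
# Index contents are invented for illustration.
INDEXES = {
    "orchset": {"track_1": "audio/track_1.wav"},
}


def validate(dataset_name, existing_files):
    """Return the track ids whose file is not in `existing_files`.

    Simplified stand-in for validation: it only checks presence.
    """
    index = INDEXES[dataset_name]
    return [tid for tid, path in index.items() if path not in existing_files]


# Function-major call: the dataset is an argument, not the namespace.
missing = validate("orchset", existing_files=set())
```

Here the function is written once, but callers have to switch from orchset.validate() to validate('orchset'), which is the API change listed under cons.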

Option 3 - split it all up

* move validate to the top level (utils.validate('orchset'))
* remove load altogether because it's just a wrapper
* keep track ids as they are

pros: simplifies the current api
cons: not very consistent

Option 4 - create a Dataset class

There's a big discussion about this in #225.

pros: helps standardize everything
cons: adds complexity, top-level API change
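
A minimal sketch of what a shared Dataset class could look like, so that validate, load and track_ids are written once. The constructor arguments, the presence-only validation and the dict returned by load are assumptions for illustration; this is not the design discussed in #225.

```python
class Dataset:
    """Hypothetical container holding the logic every loader repeats."""

    def __init__(self, name, index):
        self.name = name
        self._index = index  # track id -> relative file path (invented layout)

    def track_ids(self):
        """Return all track ids in the index."""
        return list(self._index.keys())

    def load(self):
        """Return a dict of track id -> path (stand-in for Track objects)."""
        return {tid: self._index[tid] for tid in self.track_ids()}

    def validate(self, existing_files):
        """Return track ids whose file is missing (presence check only)."""
        return [
            tid for tid, path in self._index.items()
            if path not in existing_files
        ]


# Each loader would then only supply its own name and index.
orchset = Dataset("orchset", index={"track_1": "audio/track_1.wav"})
```

With this shape the calls stay dataset-major (orchset.validate(...), orchset.track_ids()), at the cost of the extra class noted under cons.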


nkundiushuti · 2020-10-20T10:01:46Z

Besides solving this issue, the dataset object would help with other features: sampling, generators. Maybe it's worth the effort. We could split the work between us and modify the existing loaders.

magdalenafuentes · 2020-10-20T14:10:06Z

Yeah, after discussing a lot yesterday we decided to go for the big change and add the dataset object. It will simplify the code a lot, increase consistency, and change the user API only a bit. We decided to base the implementation on Vincent's great idea in #219.

rabitt · 2020-10-20T20:49:38Z

@nkundiushuti take a look at #296 and let me know what you think about the newly proposed API. The Dataset class can of course be extended in the future to include sampling etc., as you mention! I started with just porting the existing functionality, and if it seems solid we can add to it.

rabitt · 2020-11-03T22:07:58Z

Done in #296

rabitt added the priority label Oct 19, 2020

rabitt assigned rabitt and magdalenafuentes Oct 19, 2020

rabitt mentioned this issue Oct 20, 2020

Dataset object #296

Merged

8 tasks

rabitt added this to the 0.3 milestone Oct 23, 2020

rabitt closed this as completed Nov 3, 2020
