Refactor DataLayer into DataSource and DataProcessing layers #148

Closed
sergeyk opened this issue Feb 24, 2014 · 7 comments

@sergeyk
Contributor

sergeyk commented Feb 24, 2014

Right now, DataLayer does image-specific processing of LevelDB data.
It should be separated into LevelDBDataSourceLayer and ImageProcessingLayer.
That way, other data sources can more easily be added: HDF5, image directories, CSV, etc.
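
A minimal sketch of the proposed split, written in Python for brevity rather than in Caffe's C++. Every name here (`InMemorySource`, `ScaleAndMeanSubtract`, `read_processed`) is hypothetical and only illustrates the idea that a source yields raw records while image-specific processing lives in a separate, reusable component:

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical record type: (raw pixel values, label).
Record = Tuple[List[float], int]


class InMemorySource:
    """Stand-in for a LevelDB/HDF5/CSV source: it only yields raw records."""

    def __init__(self, records: Sequence[Record]) -> None:
        self._records = list(records)

    def __iter__(self) -> Iterator[Record]:
        return iter(self._records)


class ScaleAndMeanSubtract:
    """Stand-in for image-specific processing; it knows nothing about the source."""

    def __init__(self, scale: float, mean: float) -> None:
        self.scale = scale
        self.mean = mean

    def __call__(self, record: Record) -> Record:
        values, label = record
        return [(v - self.mean) * self.scale for v in values], label


def read_processed(source, processing) -> List[Record]:
    """Apply one processing component to the records of any iterable source."""
    return [processing(record) for record in source]


source = InMemorySource([([2.0, 4.0], 0), ([6.0, 8.0], 1)])
processing = ScaleAndMeanSubtract(scale=0.5, mean=2.0)
processed = read_processed(source, processing)
print(processed)
```

Swapping in another source would then mean writing one more class that yields records, with the processing code left untouched.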

@kloudkl
Contributor

kloudkl commented Feb 25, 2014

Would you like to create the first DataSourceLayer for the HDF5DataLayer in #147?

@kloudkl
Contributor

kloudkl commented Feb 25, 2014

But for backward compatibility, and to avoid breaking existing user code, we probably need to keep both versions for a while and deprecate the old one after an announced maintenance period. Shall we set a version release schedule from now on?

@shelhamer
Member

@kloudkl we're still in a "break it now so it sets right" stage of development, so we aren't ready for versioned releases, but we'll do our best to make the transition comfortable.

@sergeyk
Contributor Author

sergeyk commented Feb 25, 2014

@kloudkl no, I think #147 should be merged before this refactoring.

@kloudkl
Contributor

kloudkl commented Feb 25, 2014

Let me clarify: I did not mean that #147 should be refactored before being merged.

Switching easily among different data sources using the same processing layer will greatly reduce code redundancy and is a feature that many users would embrace wholeheartedly. What I asked for is a concrete example of the design that @sergeyk was thinking about.

@kloudkl
Contributor

kloudkl commented Mar 11, 2014

Not everything has to be a separate layer, considering the significant success of #128. The DataProcessing steps fit more naturally into a processing pipeline, which should stay inside the refactored data layer to avoid redundant memory copies.

Similarly, the data source should be a plain field of the data layer, to avoid creating too many kinds of layers. The data source would be defined in the proto and instantiated by a data source factory. The major methods of the DataSource base class that I can think of are has_next_batch, next_batch, load, and save. Any better ideas?
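
To make the design above concrete, here is a rough sketch in Python (Caffe itself is C++, so this is an illustration only). The four method names come from the comment; everything else is an assumption: the `ListDataSource` toy class, the dict standing in for the proto definition, the `create_data_source` factory, the `DataLayer.forward` method, and in particular the semantics of `load` and `save`, which the comment does not specify.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class DataSource(ABC):
    """Hypothetical base class exposing the four methods named above."""

    @abstractmethod
    def load(self) -> None:
        """Open the underlying storage and rewind to the first batch."""

    @abstractmethod
    def save(self) -> None:
        """Persist the current contents (assumed semantics)."""

    @abstractmethod
    def has_next_batch(self) -> bool:
        """Return True while unread items remain."""

    @abstractmethod
    def next_batch(self) -> List[Any]:
        """Return the next batch; the last one may be shorter."""


class ListDataSource(DataSource):
    """Toy source backed by a list, standing in for LevelDB or HDF5."""

    def __init__(self, items: List[Any], batch_size: int) -> None:
        self._stored = list(items)  # plays the role of on-disk storage
        self._items: List[Any] = []
        self._batch_size = batch_size
        self._cursor = 0

    def load(self) -> None:
        self._items = list(self._stored)
        self._cursor = 0

    def save(self) -> None:
        self._stored = list(self._items)

    def has_next_batch(self) -> bool:
        return self._cursor < len(self._items)

    def next_batch(self) -> List[Any]:
        if not self.has_next_batch():
            raise IndexError("no batches left")
        batch = self._items[self._cursor:self._cursor + self._batch_size]
        self._cursor += self._batch_size
        return batch


# Registry mapping the "type" string in the config to a constructor.
_REGISTRY: Dict[str, Callable[..., DataSource]] = {"list": ListDataSource}


def create_data_source(config: Dict[str, Any]) -> DataSource:
    """Factory: build a source from a dict standing in for the proto definition."""
    params = dict(config)
    kind = params.pop("type")
    if kind not in _REGISTRY:
        raise ValueError(f"unknown data source type: {kind}")
    return _REGISTRY[kind](**params)


class DataLayer:
    """Holds the source as a plain field and runs the processing pipeline itself."""

    def __init__(self, source_config: Dict[str, Any], pipeline) -> None:
        self.source = create_data_source(source_config)
        self.pipeline = list(pipeline)
        self.source.load()

    def forward(self) -> List[Any]:
        batch = self.source.next_batch()
        for step in self.pipeline:  # processing stays inside the layer
            batch = [step(x) for x in batch]
        return batch


layer = DataLayer(
    {"type": "list", "items": [1, 2, 3, 4, 5], "batch_size": 2},
    pipeline=[lambda x: x * 10],
)
batches = []
while layer.source.has_next_batch():
    batches.append(layer.forward())
print(batches)
```

Under these assumptions, adding a new kind of source means registering one more constructor; no new layer type is needed.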

@futurely

A series of recent collaborative efforts to refactor the data layers has completed this task.

@sguada sguada closed this as completed Nov 28, 2014