Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate the reusable data processors from the data layers #244

Closed
wants to merge 7 commits into from
Closed

Separate the reusable data processors from the data layers #244

wants to merge 7 commits into from

Conversation

kloudkl
Copy link
Contributor

@kloudkl kloudkl commented Mar 19, 2014

This PR will resolve #148 and take over #196.

@kloudkl
Copy link
Contributor Author

kloudkl commented Mar 19, 2014

Welcome comments on the API.

explicit MemoryDataSink(const DataSinkParameter& param);
virtual ~MemoryDataSink();

void SaveNextbatch(const shared_ptr<Blob<Dtype> > data) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SaveNextBatch (capitalize B, and change other occurrences as well)

@jeffdonahue
Copy link
Contributor

This looks great! Could you update imagenet.prototxt with how you anticipate that looking with the new data source/processing/sink API?

@sguada
Copy link
Contributor

sguada commented Mar 19, 2014

@kloudkl I'm also working in #148, will create a PR soon, and then we can compare both.

I think data_sinks, should be separated of data_sources.

@Yangqing I don't know how protobuf handles changing enum types over time. I'm not sure if we should use enum type for the different kinds of data_sources and data_processing. What would happen if we add a new one. Would the new protobuf compatible with the old one?

@jeffdonahue
Copy link
Contributor

@sguada, I believe the enum type will work fine (I'm also using it for the new LayerTypes in #219), we just can't change any of the ID numbers once we release. So we should always use the "next available" ID number in the enum (increment by 1 the largest ID ever used in the enum) and never reuse a number even if we decide to retire a particular data source or processing type.

data_blob {
data_source {
name: "ilvsrc12_train_leveldb"
leveldb {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add param with type: leveldb. In that case is clear that only one type is possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@kloudkl kloudkl changed the title New data layer Separate the resuable data processors from the data layers Mar 21, 2014
@kloudkl
Copy link
Contributor Author

kloudkl commented Mar 21, 2014

There is no need to re-invent the data layers into data source. Only the data processors need to be extracted. A base class InputLayer will be added to simplify integration of the processors with the data layers. There will also be OutputLayer correspondingly.

@shelhamer shelhamer changed the title Separate the resuable data processors from the data layers Separate the reusable data processors from the data layers Mar 22, 2014
@kloudkl
Copy link
Contributor Author

kloudkl commented Aug 27, 2014

Solved by #954.

@kloudkl kloudkl closed this Aug 27, 2014
lukeyeager added a commit to lukeyeager/caffe that referenced this pull request Oct 3, 2016
[TravisCI] google/protobuf renamed the 3.0 branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants