Can you use python to train a network from scratch? #360
Comments
See #294. That PR is usable but has a bug that I will fix shortly (after which it will be ready for merge). If you really don't care about speed, you can also write your own SGD in Python using the interface as-is (just write to the [...]). Indeed, documentation is currently poor; it should improve along with #311.
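For readers wondering what "write your own SGD in Python" could look like, here is a minimal sketch of a momentum SGD update written in plain Python. It is not taken from the PR or from Caffe's solver code. The `params`, `grads`, and `history` dictionaries of float lists are hypothetical stand-ins for wherever the network's weights and gradients actually live (in pycaffe one would presumably read and write the parameter blobs instead), and the toy objective exists only to exercise the loop.

```python
def sgd_step(params, grads, history, lr=0.01, momentum=0.9):
    """Apply one momentum SGD update to every parameter, in place.

    params, grads, history: dicts mapping a (hypothetical) layer name
    to a flat list of floats. history holds the running update and
    must start at zero.
    """
    for name, values in params.items():
        g = grads[name]
        h = history[name]
        for i in range(len(values)):
            # Blend the previous update with the current scaled gradient.
            h[i] = momentum * h[i] + lr * g[i]
            # Write the new value back into the same list.
            values[i] -= h[i]


# Toy problem (an assumption for illustration): minimize (w - 3)^2.
params = {"w": [0.0]}
history = {"w": [0.0]}

for _ in range(200):
    # Gradient of (w - 3)^2 with respect to w is 2 * (w - 3).
    grads = {"w": [2.0 * (params["w"][0] - 3.0)]}
    sgd_step(params, grads, history, lr=0.1, momentum=0.5)
```

The update is done in place on purpose: when the parameters are buffers owned by the network, rebinding a name to a new list or array would leave the network's own weights untouched.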
Thanks for the quick response! I looked at #294 and it looks like that will solve my problem (i.e. solver = caffe.SGDSolver('solver.prototxt') doesn't need a preexisting net). I went ahead and started using that PR, but I'm having trouble getting the memory layer to work. To test things out, I tried modifying the lenet example to use a memory layer instead of a data layer. I just changed the first layer's type from DATA to MEMORY_DATA in the train prototxt. But running caffe.SGDSolver('lenet_solver.prototxt') results in a weird error: I0425 12:44:44.903637 26146 net.cpp:111] conv1 -> conv1 Am I initializing the memory layer incorrectly?
You'll also need to add a
Thanks for the help longjon (in addition to the great PR)! The layer seems to be working as intended, though see #381 for some issues regarding one of its use cases. Cheers!
Thanks again @longjon for teaching Python to train models.
The hints here also answer the older issue #135.
How should I feed data to SGDSolver in the Python interface?
@erogol Please use the mailing list for support questions.
Hi @zergylord @longjon. I am working on semantic segmentation following the paper by @longjon and @shelhamer. I have already adapted the prototxt files you provided to my needs, and they work as intended. However, the mean accuracy fluctuates between 42% and 48%, while overall accuracy is around 86% after 5800 iterations with learning rate 1e-14. Any pointers on this other than increasing the number of iterations? Also, I feel that the network is too deep for my purposes and therefore takes a long time to train, so I want to train my own network using Caffe's Python interface. My dataset is simple, with 4 categories; each image is guaranteed to have a white/grey background and a single object belonging to one of the 4 categories. I read the above posts and modified my prototxt to use the memory data layer. Below is the first snippet of my data layer as advised by @longjon:
I'm trying to train a model completely in python (the training process is interactive enough to justify this I think), but the only way I've seen to get a network into python is by calling:
caffe.Net(model_def_file, pretrained_model)
But this presumes you already have a model to work with. Is there a way to create a new model in python? Or if not, is there an easy way to create a model file without training it on anything (to use it as the pretrained_model argument)?
I hope this isn't a silly question; I've tried figuring it out myself, but the documentation for the Python wrapper is pretty sparse.