What is the lowest loss value that can be reached? #9
Comments
What batch size are you using? Without the batch size, the step number says nothing about how far you've gone. According to the author of YOLO, he used a pretty powerful machine, and training has two stages, with the first stage (training the convolutional layers with average pooling) taking about a week. So be patient if you're not far from the beginning.

Training a deep net is more of an art than a science. My suggestion is to first train your model on a small dataset to see whether it can overfit the training set; if it can't, there is a problem to solve before proceeding. Note that due to the data augmentation built into the code, the loss can't really reach 0.0.

I've trained a few configs with my code, and the loss shrinks well from > 10.0 to around 0.5 or below (the parameters C, B, S are not relevant, since the loss is averaged across the output tensor). I usually start with the default learning rate 1e-5 and a batch size of 16 or even 8 to speed up the loss until it stops decreasing and becomes unstable. Then I decrease the learning rate to 1e-6 and increase the batch size to 32 and then 64 whenever the loss gets stuck (and testing still does not give good results).

You can switch to another adaptive learning-rate algorithm (e.g. Adadelta, Adam, etc.) if you are familiar with them by editing the relevant part of the code. You can also look at the learning-rate policy the YOLO author used, inside the .cfg files. Best of luck
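The staged schedule described above can be sketched as a small helper. This is purely illustrative: the stage boundaries and values mirror the numbers in the comment, not anything in the darkflow code itself.

```python
def training_stage(stage):
    """Return (learning_rate, batch_size) for a staged schedule:
    start fast with a small batch, then stabilize with a lower
    learning rate and larger batches once the loss plateaus.
    Values are the ones mentioned in the comment above."""
    schedule = [
        (1e-5, 16),  # stage 0: default LR, small batch to move fast
        (1e-6, 32),  # stage 1: loss plateaus / unstable -> lower LR
        (1e-6, 64),  # stage 2: loss stuck again -> larger batch
    ]
    # Clamp to the last stage once the schedule is exhausted
    return schedule[min(stage, len(schedule) - 1)]

print(training_stage(0))  # (1e-05, 16)
print(training_stage(5))  # (1e-06, 64)
```

In practice you would advance the stage manually (or on a plateau-detection criterion) and rebuild the optimizer with the new learning rate.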
@thtrieu What a nice suggestion! I encountered similar issues too, and found that pre-trained weights can really help. Also, the quality and quantity of the data itself is really important, especially when training a YOLO-style network; it is just too hard to converge well otherwise. I am still struggling with this.
@thtrieu Thank you~ In my first round of training, the batch size was 12. I get your point about being patient. My final goal is to find the bounding boxes of objects that are not in ImageNet, so I am training without a pre-trained model. Thanks again!
Just a friendly ping. I've finished training a YOLO model for 4 classes; if you are interested, I will write some notes about the training process.
@thtrieu Yes, I am looking forward to it. |
I have updated the code through many cycles since then, which will affect the scaling of the loss value, but the mechanism is the same. Here are my notes:
Good luck, I'd love to hear updates from your training.
@thtrieu I ran fine-tuning on the tiny-yolo-voc model, but the loss value is approximately 6, not 1.5–1.7.
I don't have much experience with YOLOv2; maybe @ryansun1900 does. Here is why YOLOv2's loss is much higher than that of v1: the output volume of v2 is much larger than that of v1.
So far, I don't have much experience training on large datasets either.
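The size difference can be checked with quick arithmetic. The numbers below are the standard published configurations for YOLOv1 and YOLOv2 on PASCAL VOC (they are not taken from this thread); since the loss is averaged over the output tensor, a much larger output volume changes the loss scale.

```python
# YOLOv1 on VOC: 7x7 grid, 2 boxes per cell, 20 classes
S1, B1, C = 7, 2, 20
v1_volume = S1 * S1 * (B1 * 5 + C)   # 7 * 7 * 30

# YOLOv2 on VOC: 13x13 grid, 5 anchor boxes, 20 classes,
# each anchor predicts (x, y, w, h, objectness) + class scores
S2, A = 13, 5
v2_volume = S2 * S2 * A * (5 + C)    # 13 * 13 * 125

print(v1_volume, v2_volume)          # 1470 21125
```

So v2's output tensor is roughly 14x larger than v1's, which is consistent with seeing a higher raw loss for the same quality of fit.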
thanks for the good tips :) |
Hi @thtrieu, can you please explain what you mean by increasing the depth? How do we do it, by changing something in the cfg file? I am training for 9 classes with YOLOv2 and have created a cfg file called yolov2-tiny-9c.cfg. So do I make changes in this file or in the original yolov2-tiny.cfg file?
I'm training a model for 1 class with yolov3-tiny.cfg. The training set is 6800 JPEGs with 1 to 24 objects each, normalized to 720 pixels in height but variable width. Batch size 24, subdivisions 2, image size 512x512, learning rate 0.0015, max batches 450000. Although mAP is high (about 98%), the average loss is still above 0.5. I guess the model is fully trained at iteration 31500, because beyond that point mAP is stable at 0.98 (98%). My doubt is: do I think the model is overfit because it does not generalize well, or does it not generalize well because the average loss is still high?
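One way to frame the question above: overfitting shows up as a gap between training and held-out metrics, not as a high absolute loss value. A minimal illustrative check (the threshold and function name are mine, purely for illustration):

```python
def looks_overfit(train_map, val_map, gap_threshold=0.05):
    """Heuristic: a model that scores much better on training data
    than on held-out data is overfitting, regardless of how high
    or low the raw loss is. The 0.05 gap threshold is arbitrary."""
    return (train_map - val_map) > gap_threshold

print(looks_overfit(0.98, 0.75))  # True: large train/val gap
print(looks_overfit(0.98, 0.96))  # False: generalizes fine
```

So the direction of reasoning should be: evaluate mAP on a held-out set, compare it to training mAP, and judge overfitting from that gap rather than from the absolute loss.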
Hey, can you tell me how to print a chart like this while training your model?
I think he's using AlexeyAB's repo, which has GUI support.
I want the complete loss-function computation, as I am having trouble understanding it.
Do not include the dont_show parameter in the training command.
Hi, I have trained a yolo-small model to step 4648, but most of the loss values are greater than 1.0, and the test results are not very good. I want to know how low the loss value can get, and could you please share some key training parameters, e.g. learning rate, training time, final loss value, and so on?
I train the model on an iMac (4 GHz Intel Core i7, 16 GB memory), in CPU mode.
Thank you!