Training / Converting Model #4

Open
brianantonelli opened this issue Oct 17, 2017 · 7 comments

@brianantonelli

First off, thank you for putting this repo together!

In preparation for training my own YOLO model and converting it to TF, I wanted to prove out the pattern by taking the TinyYOLO VOC weights/config and recreating the frozen memmapped graph that you provide in this repo. After going through the Darkflow docs as well as TensorFlow for Mobile Poets, I came up with the following process for converting the graph. The graph loads and runs on my iPhone; however, it seems to randomly identify non-objects (typically 20-30 per second), and the application inevitably runs out of memory and crashes. I'll provide the steps I took for converting the graph below. Could you please share how you created your graph or provide some feedback?

BTW, the first big red flag I see is when I freeze the TinyYOLO VOC weights using Darkflow my graph is 61MB, but yours is 180MB?!
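One rough way to read that size gap is to turn each file size into an approximate parameter count. The sketch below assumes both graphs are dominated by 32-bit float weights and that "MB" means 10**6 bytes; neither is confirmed in this thread, and `estimate_param_count` is a hypothetical helper, not part of Darkflow or TensorFlow.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights are stored as 32-bit floats


def estimate_param_count(size_mb):
    """Rough parameter count for a graph file of size_mb megabytes (10**6 bytes)."""
    return size_mb * 1_000_000 // BYTES_PER_FLOAT32


mine = estimate_param_count(61)     # graph frozen from tiny-yolo-voc.weights
theirs = estimate_param_count(180)  # graph shipped in the repo

print(mine, theirs, round(theirs / mine, 2))
```

Under those assumptions the two files differ by roughly a factor of three in parameter count, which would suggest two different network definitions rather than the same network packed differently.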

Here's my process:

Grab the TinyYOLO VOC weights and config (I've used PJ's config as well as yours; there are a few differences):

wget https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/tiny-yolo-voc.cfg
wget https://pjreddie.com/media/files/tiny-yolo-voc.weights

Next I validate that the weights/configs work before freezing:

flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --gpu 0.9 --json

Next I freeze the graph and validate it still identifies the objects (everything works great):

flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --savepb --gpu 0.9
flow --pbLoad built_graph/tiny-yolo-voc.pb --metaLoad built_graph/tiny-yolo-voc.meta --imgdir sample_img/ --json --gpu 0.9

Then I optimize the graph for inference and test again (still works):

bazel-bin/tensorflow/python/tools/optimize_for_inference \
--input=/yoloz/tiny/built_graph/tiny-yolo-voc.pb \
--output=/yoloz/tiny/built_graph/optimized_graph.pb \
--input_names=input \
--output_names=output \
--frozen_graph=True

Then I quantize and round the graph and validate (still works, but some accuracy drops as expected):

bazel-bin/tensorflow/tools/quantization/quantize_graph \
--input=/yoloz/tiny/built_graph/optimized_graph.pb \
--output=/yoloz/tiny/built_graph/rounded_graph.pb \
--output_node_names=output \
--mode=weights_rounded

Finally I enable memory mapping in the graph:

bazel-bin/tensorflow/contrib/util/convert_graphdef_memmapped_format \
--in_graph=/yoloz/tiny/built_graph/rounded_graph.pb \
--out_graph=/yoloz/tiny/built_graph/mmapped_graph.pb

I've tried cutting out steps such as quantizing and memory mapping, but I still run into the same issue where it just randomly identifies non-existent objects. Here's an example from the iOS logs of it just seeing tons of non-existent objects:

48 boxes
cow 0.26528638082959 (-716.969760456415, -15.0664010845897, 2253.67373242581, 124.151942685188)
person 60.9147208802824 (406.176437143808, 247.405798095825, 109.771308640561, 79.8200096378618)
sheep 39.6371332656176 (701.706673330065, 173.24449355272, 162.208712026626, 0.201273832795577)
aeroplane 14.7475192095906 (422.813935954874, -672.132180748015, 0.435470956695015, 1851.95653674813)
bottle 34.4575969911148 (-3691.35341221254, 230.732243466384, 7407.03830472379, 118.209991155339)
bottle 15.252358611848 (297.912561621646, 253.622715234777, 103.666117939307, 156.487766640486)
person 2.02568703175324 (-1223.67370049141, 403.496686780806, 2378.22481550453, 17.7921825504054)
person 46.802289779278 (1029.42971074394, 59.8810085550473, 0.0603049292979604, 6.32803476229502)
person 5.02492657361699 (-242.108751411802, -179.995665610159, 768.943885578703, 172.626105837104)
bottle 8.12876115292863 (1.31502027136079, 112.241214824463, 83.795759190395, 137.321676756878)
bottle 17.4926418818899 (-2195.20377769767, 44.7168849437375, 5036.41495845117, 109.233492394215)
bottle 10.0492816390524 (-348.11391411164, -823.872093321802, 1358.45449526415, 976.078075378695)
horse 11.7777611150507 (-5199.91062956631, 209.564810465238, 10445.2793049104, 193.588701516892)
sofa 2.90329754699323 (-580.55927439087, -1201.57429169997, 605.32719492025, 2319.642035421)
sheep 2.31771116209555 (335.281302477022, -265.169598244711, 303.744002610247, 615.620747373669)
bus 10.6498479187201 (-24099.0875846469, -55.1337276379371, 48149.893631693, 531.781720515113)
person 2.18346198169684 (245.163570981687, -659.499791728503, 452.623426099545, 547.663504813206)
tvmonitor 0.347630301506097 (250.864021547363, 230.171159402745, 178.092564045225, 107.932911465985)
bottle 9.95446473408515 (578.688593486094, 225.090434384003, 29.7744054641031, 5.09407076904225)
cow 9.39130423078825 (751.656599451878, 494.282862500461, 1.09127926019101, 0.0376114695453585)
cow 16.8039449513635 (-636.942771411139, -4146.12791663897, 105.520065759871, 8539.38077769234)
cow 15.3530701945051 (10.8899058785902, -2004.09817131406, 28.3976730276271, 4464.49136457081)
cow 6.18686360683682 (-1872.86129217465, 317.97697050232, 4741.10130713052, 5.93505212236634)
cow 41.8923066953403 (-5090.64552303113, -433.732989685124, 11922.7837276564, 842.91523490483)
person 7.01425882730575 (196.017857693368, 10.762791441239, 0.29155214652136, 133.058044741693)
cow 21.7848502025759 (-215.860215658783, -249.177102701316, 267.580321846654, 392.007555619633)
tvmonitor 5.30925632140537 (528.546262539762, 472.202107788961, 27.6917343533151, 152.522967437084)
tvmonitor 5.94156905219603 (565.858886985312, -920.189522495108, 318.533761354198, 2161.87640869381)
bus 8.98683072242511 (-6645.9226178394, 139.156692264233, 14659.6244466323, 374.677935877221)
cow 8.49134918033451 (432.454714344806, -2111.24702880055, 1.80568608554533, 4125.33471449469)
cow 23.8006362877428 (473.681844977678, -9128.2631528842, 239.033443917267, 18220.9025116129)
cow 32.4949977552751 (138.42400573552, -18035.3928847779, 387.916661080797, 36465.6768464101)
person 8.77725236878837 (502.371345131116, 210.051040886286, 42.5160061576193, 20.3697629750902)
diningtable 4.05980503916737 (-2256.01118752883, 343.210714264253, 5658.08545452834, 55.0627622079917)
tvmonitor 10.4713313523678 (-1818.03733710679, 194.896162589751, 4932.72937689972, 177.472580818184)
tvmonitor 12.2691757717801 (-18778.2914532181, 128.871683567611, 37271.2059504048, 29.0849889945601)
pottedplant 14.22723577094 (-28173.1173244905, -208.326447582406, 56501.6760140867, 546.753047487035)
tvmonitor 13.8996401643171 (-11994.0013712802, 370.284081045691, 24092.6758117422, 19.6677094960988)
bottle 24.2488610989835 (-195.202285100031, 385.567696758806, 1132.05653811156, 12.4024115628028)
person 17.197114360953 (-451.579067209645, -418.735144345297, 0.0300114537946303, 1134.27591314155)
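A small script can make the problem in this log easier to quantify. The sketch below parses lines of the shape shown above and flags boxes that cannot fit in the frame. It assumes the tuple is (x, y, width, height) in pixels and uses a hypothetical 1280x720 frame; neither is stated in the thread, and `parse_box` and `is_plausible` are illustrative names.

```python
import re

# Assumed line shape, taken from the log above: "<label> <score> (<x>, <y>, <w>, <h>)"
BOX_LINE = re.compile(r"^(\w+) ([-\d.]+) \(([^)]*)\)$")


def parse_box(line):
    """Return (label, score, (x, y, w, h)) or None if the line does not match."""
    match = BOX_LINE.match(line.strip())
    if match is None:
        return None
    label, score, numbers = match.groups()
    x, y, w, h = (float(value) for value in numbers.split(","))
    return label, float(score), (x, y, w, h)


def is_plausible(box, frame_w, frame_h):
    """True if the box origin and size fit inside a frame of the given size."""
    x, y, w, h = box
    return 0 <= x <= frame_w and 0 <= y <= frame_h and 0 < w <= frame_w and 0 < h <= frame_h


lines = [
    "bus 10.6498479187201 (-24099.0875846469, -55.1337276379371, 48149.893631693, 531.781720515113)",
    "bottle 15.252358611848 (297.912561621646, 253.622715234777, 103.666117939307, 156.487766640486)",
]
FRAME_W, FRAME_H = 1280, 720  # hypothetical frame size
for line in lines:
    label, score, box = parse_box(line)
    print(label, is_plausible(box, FRAME_W, FRAME_H))
```

Boxes tens of thousands of pixels wide point at the decoding step reading the output tensor with the wrong layout, rather than at a slightly inaccurate model.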

Any help you can provide is greatly appreciated! Thanks! 👍

@KleinYuan
Owner

KleinYuan commented Oct 19, 2017

@brianantonelli hmmm. That seems weird.
My only question is: which YOLO version are you using?
The smaller one is tiny-yolo v1, I believe, and it seems you are using the second version.
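One reason the version would matter: the two YOLO generations emit output tensors of different sizes, so post-processing written for one cannot read the other. The sketch below compares the lengths, assuming the usual VOC settings (v1: 7x7 grid, 2 boxes per cell; v2: 13x13 grid, 5 anchors; 20 classes in both). These values come from the standard configs, not from this thread, so check them against the cfg actually in use.

```python
CLASSES = 20  # Pascal VOC

# YOLO v1: each grid cell holds the class scores plus (x, y, w, h, confidence) per box.
V1_SIDE, V1_BOXES = 7, 2
v1_len = V1_SIDE * V1_SIDE * (CLASSES + V1_BOXES * 5)

# YOLO v2: each anchor in each cell holds (x, y, w, h, objectness) plus the class scores.
V2_SIDE, V2_ANCHORS = 13, 5
v2_len = V2_SIDE * V2_SIDE * V2_ANCHORS * (5 + CLASSES)

print(v1_len, v2_len)
```

If the app's decoder expects the first layout and the graph produces the second, the values it reads as coordinates and confidences would be effectively arbitrary.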

@macro-dadt

@brianantonelli did you resolve this problem? I faced the same issue

@macro-dadt

macro-dadt commented Nov 16, 2017

@KleinYuan could you give me a step-by-step guide to making the .pb file for use in this repo?
I'm using Darkflow, as you suggested. I have spent a couple of days on it but still have not had any success. Thanks in advance!

@KleinYuan
Owner

KleinYuan commented Nov 17, 2017

@macro-dadt
Sure, I am happy to do that.
Before that, could you elaborate on which specific step you got stuck at?
Darkflow should be very straightforward; the only trick is to make sure you use the correct version of tiny-yolo (v1).

@macro-dadt

macro-dadt commented Nov 17, 2017

Thank you for your quick reply.
This is what I did:

  • In cfg/v1 I created my own cfg named "yolo-tiny2c" (copied from yolo-tiny4c, changing only classes=4 -> classes=2 and output=686 -> output=588)
  • create VOC dataset with labelImg
  • train with my own dataset
    flow --model cfg/v1/yolo-tiny2c.cfg --load bin/tiny-yolo-voc.weights --train --annotation train/v1/Annotations --dataset train/v1/Images
  • saved the graph and weights to a protobuf file
    flow --model cfg/v1/yolo-tiny2c.cfg --load bin/tiny-yolo-voc.weights --savepb
  • tested with some images by loading the .pb and .meta files. So far everything works great.
  • Next, freeze the graph
    bazel-bin/tensorflow/python/tools/freeze_graph \
        --input_graph=/Users/macro/yolo-tiny2c.pb \
        --input_checkpoint=/Users/macro/yolo-tiny2c.pb-1000 \
        --output_node_names=output \
        --input_binary \
        --output_graph=graph.pb
    I got this error:
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 51: invalid start byte
    I use TensorFlow 1.3.0 and Python 3.6.
    Did I do something wrong? I'm very new to this area, please help me!
    Thanks a lot in advance
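The cfg edit in the first step can be checked with a little arithmetic. The sketch below assumes the v1-style output length is side * side * (classes + num * 5), with a 7x7 grid and 2 boxes per cell; the formula reproduces both values quoted above, but `yolo_v1_cfg_output` itself is a hypothetical helper, not a Darkflow function.

```python
def yolo_v1_cfg_output(classes, side=7, num=2):
    """Length of the final layer: class scores plus 5 values per box, per grid cell."""
    return side * side * (classes + num * 5)


print(yolo_v1_cfg_output(4))  # value used by yolo-tiny4c
print(yolo_v1_cfg_output(2))  # value needed for a 2-class cfg
```

Running the same function for any other class count gives the `output=` value to put in the cfg.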

@KleinYuan
Owner

KleinYuan commented Nov 18, 2017

@macro-dadt this error usually occurs when you use a different version of TensorFlow to freeze a graph. Have you tried different versions? Also, the TensorFlow under bazel-bin may not be the same version as your default Python's TensorFlow, which ran the previous steps.
It seems to be a Darkflow issue.
However, if I were you, I would just hack the Darkflow function that saves the model so that it retrieves the tensors you need. It's just TensorFlow, right? You can do it easily, buddy.
Example: https://github.com/KleinYuan/cnn/blob/master/tools/graph_freezer.py#L20
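Independently of the version question, the error text itself says that some step tried to read binary data as UTF-8 text. Which file that was is not stated in the thread. A minimal sketch for checking whether a given file's bytes are valid UTF-8 follows; `first_non_utf8_offset` is a hypothetical helper and the byte strings are synthetic stand-ins, not real graph files.

```python
def first_non_utf8_offset(data):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


text_like = b'node { name: "input" }'        # stand-in for a text-format graph
binary_like = b"\x0a\x05input" + b"\xff\xfe"  # stand-in for binary content

print(first_non_utf8_offset(text_like))
print(first_non_utf8_offset(binary_like))
```

To apply it to a real file, read it with `open(path, "rb").read()` and pass the bytes in. A non-None result means the file is binary and should not be handed to anything expecting text.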

@groot-1313

I used the commands listed by @brianantonelli to convert my yolo v1 model (.cfg and .weights) into a frozen graph and its meta file
flow --model tiny-yolo-voc.cfg --load tiny-yolo-voc.weights --savepb --gpu 0.9

I have 2 questions:

  1. How do I process the output, which is 588 values long (2 classes, side 7 x 7, 2 boxes per cell)?
  2. How can I port this frozen graph for use with OpenCV? I get the following error:

cv2.error: OpenCV(3.4.2) /io/opencv/modules/dnn/src/dnn.cpp:401: error: (-2:Unspecified error) Can't create layer "truediv" of type "RealDiv" in function 'getLayerInstance'

If anybody has faced either of these problems, or knows a way to solve them, I would really appreciate the help.
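For the first question, a minimal decoding sketch follows. It assumes the flat output is laid out as all class probabilities first, then all box confidences, then all box coordinates, with x and y as offsets inside the cell and w and h stored as square roots of the relative size. That layout is an assumption about v1-style post-processing, not something confirmed in this thread, so verify it against the code that produced the graph. `decode_yolo_v1` is a hypothetical name.

```python
def decode_yolo_v1(output, side=7, boxes=2, classes=2, threshold=0.2):
    """Decode a flat v1-style output into (class_id, score, cx, cy, w, h) tuples.

    Coordinates are returned relative to the whole image (0..1).
    """
    cells = side * side
    expected = cells * (classes + boxes * 5)
    if len(output) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(output)))

    prob_end = cells * classes           # assumed layout: probabilities first
    conf_end = prob_end + cells * boxes  # then confidences, then coordinates
    probs = output[:prob_end]
    confs = output[prob_end:conf_end]
    coords = output[conf_end:]

    detections = []
    for cell in range(cells):
        row, col = divmod(cell, side)
        for b in range(boxes):
            conf = confs[cell * boxes + b]
            base = (cell * boxes + b) * 4
            x, y, w, h = coords[base:base + 4]
            cx = (col + x) / side  # cell offset -> image-relative centre
            cy = (row + y) / side
            bw, bh = w * w, h * h  # undo the assumed square-root encoding
            for c in range(classes):
                score = conf * probs[cell * classes + c]
                if score >= threshold:
                    detections.append((c, score, cx, cy, bw, bh))
    return detections


# Synthetic output: one confident class-1 box in the centre cell (row 3, col 3).
fake = [0.0] * 588
fake[24 * 2 + 1] = 1.0                         # class 1 probability, cell 24
fake[98 + 24 * 2] = 0.9                        # confidence of box 0 in cell 24
fake[196 + 24 * 2 * 4:196 + 24 * 2 * 4 + 4] = [0.5, 0.5, 0.5, 0.5]
result = decode_yolo_v1(fake)
print(result)
```

In practice the detections would then go through non-maximum suppression and be scaled to pixel coordinates; both steps are omitted here to keep the layout logic visible.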
