memory leaks of mftraining, cntraining #516

Closed
junmocklee opened this issue Dec 1, 2016 · 12 comments

Comments

@junmocklee

Hello,
I suspect there are memory leaks in the training tools.
I only added "#include <vld.h>" (Visual Leak Detector) and ran the tools under the VS2013 debugger with command-line arguments.
Here is my tif/box file pair.
https://drive.google.com/file/d/0B2tu51tmJ0FvaFNnWDFNLS1lUUU/view?usp=sharing
https://drive.google.com/file/d/0B2tu51tmJ0FvdkZaWWRQVDNuMkE/view?usp=sharing
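
For readers who have not used Visual Leak Detector, the setup described above amounts to one extra include in the tool's source file. The following is a minimal sketch, assuming VLD is installed and on the include path. `DuplicateLabel` is a hypothetical helper and not Tesseract code, and the `__has_include` guard is an addition so the snippet also compiles where VLD is absent (it needs a compiler newer than VS2013).

```cpp
// Visual Leak Detector (VLD) is an MSVC-only tool. The guard keeps this file
// compilable on other toolchains, where the include is simply skipped.
#if defined(_MSC_VER) && defined(__has_include)
#if __has_include(<vld.h>)
#include <vld.h>  // including the header is what switches leak reporting on
#endif
#endif

#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a heap copy of `text`.
// The caller owns the result and must release it with std::free().
char* DuplicateLabel(const char* text) {
  const std::size_t size = std::strlen(text) + 1;  // include the terminator
  char* copy = static_cast<char*>(std::malloc(size));
  if (copy != nullptr) {
    std::memcpy(copy, text, size);
  }
  return copy;
}
```

If a caller never frees the returned buffer, a leak detector would be expected to list that block at exit together with the call stack of the allocation, which is the kind of report quoted in this thread.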

No memory leaks are reported for tesseract.exe or unicharset_extractor.exe.
Memory leaks are reported at the following locations:

  • mftraining
    classify\cluster.cpp (2493): mftraining.exe!MultipleCharSamples()
    training\commontraining.cpp (770): mftraining.exe!SetUpForFloat2Int()
    cutil\bitvec.cpp (92): mftraining.exe!NewBitVector()
    classify\cluster.cpp (1898): mftraining.exe!ComputeChiSquared()
    training\mftraining.cpp (157): mftraining.exe!ClusterOneConfig()

  • cntraining
    training\commontraining.cpp (421): cntraining.exe!ReadTrainingSamples()
    classify\cluster.cpp (949): cntraining.exe!ComputePrototypes()
    classify\cluster.cpp (2493): cntraining.exe!MultipleCharSamples()
    training\commontraining.cpp (852): cntraining.exe!AddToNormProtosList()
    classify\cluster.cpp (2335): cntraining.exe!NewChiStruct()
    classify\cluster.cpp (1903): cntraining.exe!ComputeChiSquared()
    classify\cluster.cpp (1610): cntraining.exe!NewSimpleProto()
    classify\cluster.cpp (1546): cntraining.exe!NewEllipticalProto()

...and others. (same functions at different lines)

In my case, 15 memory leaks in mftraining and 21 memory leaks in cntraining are detected.

My environment
OS: Windows 7
IDE: Visual Studio 2013
tesseract sources: 3.05 branch commit 5750e72

Could anyone fix these?

@junmocklee
Author

On commit 8af3629, most of these memory leaks seem to have been fixed. Thanks!
Only 3 memory leaks are still reported:

---------- Block 66889 at 0x02839ED0: 24 bytes ----------
Leak Hash: 0x0F20FB0D, Count: 1, Total 24 bytes
Call Stack (TID 6980):
MSVCR120D.dll!malloc()
d:\tesseract_prj\161212\tesseract\cutil\emalloc.cpp (52): mftraining.exe!Emalloc() + 0xC bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (2335): mftraining.exe!NewChiStruct() + 0x7 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1898): mftraining.exe!ComputeChiSquared() + 0x17 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1777): mftraining.exe!MakeBuckets() + 0x2A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1705): mftraining.exe!GetBuckets() + 0x1A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1017): mftraining.exe!MakePrototype() + 0x2A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (947): mftraining.exe!ComputePrototypes() + 0x11 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (523): mftraining.exe!ClusterSamples() + 0xD bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (136): mftraining.exe!ClusterOneConfig() + 0xE bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (294): mftraining.exe!main() + 0x22 bytes
f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c (466): mftraining.exe!mainCRTStartup()
kernel32.dll!BaseThreadInitThunk() + 0x12 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x63 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x36 bytes
Data:
02 00 CD CD CD CD CD CD 8D ED B5 A0 F7 C6 B0 3E ........ .......>
9F FF 8F 99 8A A1 3B 40 ......;@ ........

---------- Block 66890 at 0x02839F28: 8 bytes ----------
Leak Hash: 0x4EC07CBF, Count: 1, Total 8 bytes
Call Stack (TID 6980):
MSVCR120D.dll!operator new()
d:\tesseract_prj\161212\tesseract\cutil\structures.cpp (36): mftraining.exe!new_cell() + 0x12 bytes
d:\tesseract_prj\161212\tesseract\cutil\oldlist.cpp (326): mftraining.exe!push() + 0x5 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1903): mftraining.exe!ComputeChiSquared() + 0x15 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1777): mftraining.exe!MakeBuckets() + 0x2A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1705): mftraining.exe!GetBuckets() + 0x1A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (1017): mftraining.exe!MakePrototype() + 0x2A bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (947): mftraining.exe!ComputePrototypes() + 0x11 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (523): mftraining.exe!ClusterSamples() + 0xD bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (136): mftraining.exe!ClusterOneConfig() + 0xE bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (294): mftraining.exe!main() + 0x22 bytes
f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c (466): mftraining.exe!mainCRTStartup()
kernel32.dll!BaseThreadInitThunk() + 0x12 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x63 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x36 bytes
Data:
D0 9E 83 02 00 00 00 00 ........ ........

---------- Block 66839 at 0x0283A3C8: 1 bytes ----------
Leak Hash: 0x8E75036D, Count: 1, Total 1 bytes
Call Stack (TID 6980):
MSVCR120D.dll!malloc()
d:\tesseract_prj\161212\tesseract\cutil\emalloc.cpp (52): mftraining.exe!Emalloc() + 0xC bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (2493): mftraining.exe!MultipleCharSamples() + 0xC bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (983): mftraining.exe!MakePrototype() + 0x1B bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (947): mftraining.exe!ComputePrototypes() + 0x11 bytes
d:\tesseract_prj\161212\tesseract\classify\cluster.cpp (523): mftraining.exe!ClusterSamples() + 0xD bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (136): mftraining.exe!ClusterOneConfig() + 0xE bytes
d:\tesseract_prj\161212\tesseract\training\mftraining.cpp (294): mftraining.exe!main() + 0x22 bytes
f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c (466): mftraining.exe!mainCRTStartup()
kernel32.dll!BaseThreadInitThunk() + 0x12 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x63 bytes
ntdll.dll!RtlInitializeExceptionChain() + 0x36 bytes
Data:
01 ........ ........

@Shreeshrii
Collaborator

Please tag as 3.05.

@junmocklee
Author

It seems like non-contributors cannot tag labels or milestones.

@Shreeshrii
Collaborator

@stweil You have fixed many memory leaks. Does it cover these?
Can this issue be closed?

@stweil
Contributor

stweil commented May 18, 2018

@junmocklee, do you still see any memory leaks?

@junmocklee
Author

@stweil On commit 0a93ad2, the leaks mentioned above seem to have been fixed, but a new memory leak is reported.

---------- Block 1 at 0x0000000000308480: 128 bytes ----------
Leak Hash: 0x38298368, Count: 1, Total 128 bytes
Call Stack (TID 13424):
ucrtbased.dll!malloc()
f:\dd\vctools\crt\vcstartup\src\heap\new_array.cpp (16): mftraining.exe!operator new
d:\tesseract_prj\180521\tesseract\src\ccutil\genericvector.h (678): mftraining.exe!GenericVector<tesseract::StringParam * __ptr64>::reserve() + 0x2A bytes
d:\tesseract_prj\180521\tesseract\src\ccutil\genericvector.h (694): mftraining.exe!GenericVector<tesseract::StringParam * __ptr64>::double_the_size()
d:\tesseract_prj\180521\tesseract\src\ccutil\genericvector.h (793): mftraining.exe!GenericVector<tesseract::StringParam * __ptr64>::push_back()
d:\tesseract_prj\180521\tesseract\src\ccutil\params.h (198): mftraining.exe!tesseract::StringParam::StringParam() + 0x16 bytes
d:\tesseract_prj\180521\tesseract\src\training\commontraining.cpp (57): mftraining.exe!`dynamic initializer for 'FLAGS_configfile''() + 0x45 bytes
ucrtbased.dll!initterm() + 0x5D bytes
f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl (223): mftraining.exe!__scrt_common_main_seh()
f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl (296): mftraining.exe!__scrt_common_main()
f:\dd\vctools\crt\vcstartup\src\startup\exe_main.cpp (17): mftraining.exe!mainCRTStartup()
kernel32.dll!BaseThreadInitThunk() + 0xD bytes
ntdll.dll!RtlUserThreadStart() + 0x1D bytes

This memory leak is reported by mftraining, but unicharset_extractor and cntraining also report memory leaks in the same function.
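
The stack above shows the allocation happening inside the dynamic initializer of `FLAGS_configfile`, that is, before `main()` runs, while a `StringParam` constructor pushes into a vector. A common pattern that produces exactly this shape of report is a global object that registers itself in a process-wide list. The sketch below illustrates that pattern only. `ExampleFlag`, `Registry` and `FLAGS_example_configfile` are hypothetical names, and how Tesseract actually owns its parameter vector is not shown in this thread, so the never-deleted registry here is an assumption chosen to make the effect visible.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a flag object that registers itself on construction.
class ExampleFlag {
 public:
  ExampleFlag(const char* name, const char* value)
      : name_(name), value_(value) {
    // The vector grows on the heap here, before main() has started.
    Registry().push_back(this);
  }

  const std::string& name() const { return name_; }
  const std::string& value() const { return value_; }

  // Constructed on first use and never deleted, so it is valid no matter in
  // which order global objects are initialized. The price is one allocation
  // that stays alive until the process exits.
  static std::vector<ExampleFlag*>& Registry() {
    static std::vector<ExampleFlag*>* registry =
        new std::vector<ExampleFlag*>();
    return *registry;
  }

 private:
  std::string name_;
  std::string value_;
};

// Dynamic initializer: the constructor runs during static initialization.
static ExampleFlag FLAGS_example_configfile("example_configfile", "");
```

Because every tool that links the same flag definitions runs the same initializer, a report of this kind would appear in each of them, which would fit the observation that several tools show the leak in the same function.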

My environment has changed:
OS: Windows 7
IDE: Visual Studio 2015
tesseract sources: commit 0a93ad2

@amitdo
Collaborator

amitdo commented Sep 20, 2018

@stweil, was the latest issue reported here solved?

@stweil
Contributor

stweil commented Sep 20, 2018

I don't know. @junmocklee, could you please post the command line used for one of the commands that show the memory leak, so I can reproduce it here? Or is it already fixed?

@zdenop
Contributor

zdenop commented Oct 18, 2018

Valgrind on Linux shows the following for cntraining and mftraining:

valgrind --leak-check=full --show-leak-kinds=all mftraining -F font_properties -U unicharset -O eng_test.unicharset eng_test.font.exp0.tr
==986== Memcheck, a memory error detector
==986== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==986== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==986== Command: mftraining -F font_properties -U unicharset -O eng_test.unicharset eng_test.font.exp0.tr
==986==
Read shape table shapetable of 3 shapes
Reading eng_test.font.exp0.tr ...
Warning: no protos/configs for Joined in CreateIntTemplates()
Warning: no protos/configs for |Broken|0|1 in CreateIntTemplates()
Done!
==986==
==986== HEAP SUMMARY:
==986==     in use at exit: 49 bytes in 4 blocks
==986==   total heap usage: 68,820 allocs, 68,816 frees, 2,815,622 bytes allocated
==986==
==986== 1 bytes in 1 blocks are still reachable in loss record 1 of 4
==986==    at 0x4C2E08F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==986==    by 0x40970C: Emalloc(int) (emalloc.cpp:33)
==986==    by 0x55F4427: MultipleCharSamples(CLUSTERER*, sample*, float) (cluster.cpp:2374)
==986==    by 0x55F50FC: MakePrototype(CLUSTERER*, CLUSTERCONFIG*, sample*) (cluster.cpp:945)
==986==    by 0x55F5322: ComputePrototypes(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:911)
==986==    by 0x55F5945: ClusterSamples(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:516)
==986==    by 0x4050EC: ClusterOneConfig (mftraining.cpp:113)
==986==    by 0x4050EC: main (mftraining.cpp:268)
==986==
==986== 8 bytes in 1 blocks are still reachable in loss record 2 of 4
==986==    at 0x4C2E08F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==986==    by 0x798EA78: ??? (in /usr/lib64/libgomp.so.1.0.0)
==986==    by 0x799E006: ??? (in /usr/lib64/libgomp.so.1.0.0)
==986==    by 0x798D129: ??? (in /usr/lib64/libgomp.so.1.0.0)
==986==    by 0x400FA79: call_init.part.0 (in /lib64/ld-2.26.so)
==986==    by 0x400FB85: _dl_init (in /lib64/ld-2.26.so)
==986==    by 0x4000ED9: ??? (in /lib64/ld-2.26.so)
==986==    by 0x7: ???
==986==    by 0x1FFF0003E2: ???
==986==    by 0x1FFF0003ED: ???
==986==    by 0x1FFF0003F0: ???
==986==    by 0x1FFF000400: ???
==986==
==986== 16 bytes in 1 blocks are still reachable in loss record 3 of 4
==986==    at 0x4C2E6FF: operator new(unsigned long) (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==986==    by 0x409C87: push(list_rec*, void*) (oldlist.cpp:286)
==986==    by 0x55F3CDC: ComputeChiSquared(unsigned short, double) (cluster.cpp:1826)
==986==    by 0x55F3DE2: MakeBuckets(DISTRIBUTION, unsigned int, double) (cluster.cpp:1704)
==986==    by 0x55F408E: GetBuckets(CLUSTERER*, DISTRIBUTION, unsigned int, double) (cluster.cpp:1635)
==986==    by 0x55F519D: MakePrototype(CLUSTERER*, CLUSTERCONFIG*, sample*) (cluster.cpp:978)
==986==    by 0x55F5322: ComputePrototypes(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:911)
==986==    by 0x55F5945: ClusterSamples(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:516)
==986==    by 0x4050EC: ClusterOneConfig (mftraining.cpp:113)
==986==    by 0x4050EC: main (mftraining.cpp:268)
==986==
==986== 24 bytes in 1 blocks are still reachable in loss record 4 of 4
==986==    at 0x4C2E08F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==986==    by 0x40970C: Emalloc(int) (emalloc.cpp:33)
==986==    by 0x55F3A4F: NewChiStruct(unsigned short, double) (cluster.cpp:2225)
==986==    by 0x55F3CA7: ComputeChiSquared(unsigned short, double) (cluster.cpp:1822)
==986==    by 0x55F3DE2: MakeBuckets(DISTRIBUTION, unsigned int, double) (cluster.cpp:1704)
==986==    by 0x55F408E: GetBuckets(CLUSTERER*, DISTRIBUTION, unsigned int, double) (cluster.cpp:1635)
==986==    by 0x55F519D: MakePrototype(CLUSTERER*, CLUSTERCONFIG*, sample*) (cluster.cpp:978)
==986==    by 0x55F5322: ComputePrototypes(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:911)
==986==    by 0x55F5945: ClusterSamples(CLUSTERER*, CLUSTERCONFIG*) (cluster.cpp:516)
==986==    by 0x4050EC: ClusterOneConfig (mftraining.cpp:113)
==986==    by 0x4050EC: main (mftraining.cpp:268)
==986==
==986== LEAK SUMMARY:
==986==    definitely lost: 0 bytes in 0 blocks
==986==    indirectly lost: 0 bytes in 0 blocks
==986==      possibly lost: 0 bytes in 0 blocks
==986==    still reachable: 49 bytes in 4 blocks
==986==         suppressed: 0 bytes in 0 blocks
==986==
==986== For counts of detected and suppressed errors, rerun with: -v
==986== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
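
Valgrind classifies a block as "still reachable" when a pointer to it still exists at exit, so all 49 bytes above were never freed but were also never lost. One pattern that yields such records is a cache kept in a function-local static, filled on first use and kept for the life of the process. The sketch below shows that pattern in isolation. Whether `ComputeChiSquared` and `MultipleCharSamples` work this way is an assumption based on the stacks, and `CachedCompute`, `SlowCompute` and `SlowCallCount` are hypothetical names.

```cpp
#include <cmath>
#include <vector>

struct CachedResult {
  int degrees;
  double alpha;
  double value;
};

// Counts how often the slow path runs; present only to make caching visible.
static int g_slow_calls = 0;

// Hypothetical placeholder for an expensive numeric routine.
static double SlowCompute(int degrees, double alpha) {
  ++g_slow_calls;
  return degrees * std::sqrt(alpha);
}

// Returns a previously computed value when the same arguments were seen before.
double CachedCompute(int degrees, double alpha) {
  // Allocated once and never deleted: the static pointer keeps the vector
  // (and its buffer) reachable until the process exits.
  static std::vector<CachedResult>* cache = new std::vector<CachedResult>();
  for (const CachedResult& entry : *cache) {
    if (entry.degrees == degrees && entry.alpha == alpha) {
      return entry.value;
    }
  }
  const double value = SlowCompute(degrees, alpha);
  cache->push_back({degrees, alpha, value});
  return value;
}

int SlowCallCount() { return g_slow_calls; }
```

Memory held this way does not grow with repeated calls on the same inputs, which is one reason such records are usually treated as harmless. Freeing the cache explicitly before exit would be the way to silence the report.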

@stweil
Contributor

stweil commented Oct 26, 2018

That Valgrind output looks good.

@zdenop
Contributor

zdenop commented Oct 26, 2018

So can we close the issue?

@stweil
Contributor

stweil commented Oct 26, 2018

Yes. I'll close it now.

@stweil stweil closed this as completed Oct 26, 2018