encoding/csv: memory consumption is huge (1.2GB) when parsing a big (450MB) csv file #8059
Comments
Thanks for the report, claudiu.garba. I tried to reproduce this on my OS X machine but could not. I duplicated small.txt until it was 1560200 lines long and ran your sample program over it. The terminal backtrace consumed a lot of memory, but the program itself did not. Is it possible that your memory numbers include the terminal/stdout buffer? I also put in calls to runtime.ReadMemStats during the main loop. On my machine, I see MemStats.Alloc fairly stable around 60k and MemStats.Sys completely stable at 2885880 bytes. If it's not the terminal buffer consuming memory, could you report some memstats numbers here? Also, would you check whether there's a particular section of your large csv file that's required to reproduce this? Status changed to WaitingForReply.
Experiencing something similar. I have a binary file that I read into a byte array. The binary file is 2 GB. When it's loaded into memory, the process memory usage is over 6 GB. After about 5 minutes, the memory usage drops down. I wrote a simple app to reproduce this and can do so each time.
I added runtime.GC() after the printf statement, with no effect. If I call debug.FreeOSMemory(), the memory is released after a few seconds. Version: go version go1.6.2 darwin/amd64
@carlfn, your case is not similar. You're using
Okay, I'll create a new issue.
I tested the code from the issue with a version of small.txt (see issue) repeated to 1 million lines, but cannot reproduce the problem. Memory usage/allocation reported by the Go runtime and max RSS reported by the OS (using time -v) stay below 10MB. Only my terminal shows a noticeable memory increase while writing. This is in line with what @josharian reported back in 2014. My numbers are from both Go 1.8.1 and tip on a notebook running Linux with an amd64 Intel processor. Since the problem cannot be reproduced and there is no specific TODO here, I suggest this issue should be closed.
Did you by chance check whether Russ's observation about LazyQuotes holds?
Russ's observation seems to still hold. If you wrap a field in quotes and put something between the closing quote and the comma, the Reader will read more and more data into memory until it finds EOF or a quote followed by a comma. I agree that this is a problem, but the small.txt in the issue doesn't even contain any quotes, so we can't be sure if this is what the author meant. Regardless of whether the LazyQuotes behaviour is a problem that should be fixed (e.g. by giving an optional limit on field size) or not (I think it is), I think that this should be handled in its own issue and that this issue should be closed.
Ok, I'll close this now. Will you go ahead and file a new issue, please?
Filed #20169 for this |
by claudiu.garba: