Memory grows without limit #4112
BTW: Block cache is totally disabled |
|
Can you paste the full call stack of the allocation? |
No, this is all the information I have. I am trying to find the allocation in the code as well. |
|
I’ve found the reason for the leak. It is not exactly a leak: I set the size of the cache to zero. |
Hi @siying. I am having exactly the same issue. It is a serious one. The cache is simply growing. Call stack captured using jemalloc, see the attachment. My configuration is:
Is there some workaround without the need to disable the cache? |
There is this bugfix message: is it related to this issue? I was not able to find any change related to the block table between 5.14.0 and 5.14.1. Which changeset fixes this bug? |
Executed the same benchmark on the latest RocksDB version. Still seeing memory growth. (Profiler output columns: Bytes Used, Count, Symbol Name.) |
My solution was to use the recommendation here: https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters (I have a huge database and require a small memory footprint). Table files use a cache whose capacity is driven by max_open_files. When I used a big value it grew a lot over time, because my index is around 5MB per file (and I had 5k files of 200MB each). When I used a smaller value, query performance went down. Using a two-level index helps here, because the direct impact on the table cache is small and the indices are cached in the block cache, which can be controlled. I have also increased the block size, which made my index even smaller. @toktarev your test is not setting max_open_files, so it is effectively unlimited (4M files). That is probably why it just grows. Try setting max_open_files=50 (just for a test). I guess it will stop growing after some time. |
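For reference, here is a minimal C++ sketch of the configuration described in the comment above (two-level/partitioned index and filters, index and filter blocks charged to the block cache, a larger data block size, and a small max_open_files). The function name and the concrete numbers are illustrative, not values taken from this thread:

```cpp
#include "rocksdb/filter_policy.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"

// Sketch only: numbers are illustrative, not tuned recommendations.
rocksdb::Options MakePartitionedIndexOptions() {
  rocksdb::BlockBasedTableOptions table_options;
  // Two-level (partitioned) index plus partitioned full filters.
  table_options.index_type =
      rocksdb::BlockBasedTableOptions::kTwoLevelIndexSearch;
  table_options.partition_filters = true;
  table_options.filter_policy.reset(
      rocksdb::NewBloomFilterPolicy(10, /*use_block_based_builder=*/false));
  table_options.metadata_block_size = 4096;  // target size of index partitions
  // Keep index/filter partitions in the block cache so their memory is bounded.
  table_options.cache_index_and_filter_blocks = true;
  table_options.cache_index_and_filter_blocks_with_high_priority = true;
  table_options.pin_l0_filter_and_index_blocks_in_cache = true;
  table_options.block_size = 32 * 1024;  // bigger data blocks -> smaller index

  rocksdb::Options options;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
  options.max_open_files = 50;  // the small value suggested for the test above
  return options;
}
```

With this kind of setup the per-file footprint in the table cache stays small, while the index partitions compete for block cache space like ordinary data blocks.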
@koldat it is clear that max_open_files reduces the corresponding cache capacity. But if I set it small and add |
@toktarev sure, you will see performance degradation. You have to choose how you want to achieve your result. In the ideal case everything fits in memory; in other cases you have to tune. What I am doing:
Can you please run the test (max_open_files=50) and confirm that it is not a memory leak? (Original issue) |
@koldat sure, I'll test it a bit later (a bit busy right now). I don't understand why RocksDB keeps file blocks in cache. I'd like to ask them to revise their memory management and free the memory allocated in blocks after each flush or read operation. This is 360K cached blocks which just consume memory for nothing. |
@toktarev It makes a lot of sense to keep it in memory. Let me describe it with an example: you are calling a GET operation
Every table file checked for a GET has to do this. Let me describe the worst scenario (nothing opened and nothing cached).
Now imagine you have to do this for all the table files that possibly contain that key. That is a lot of work. The solution I used (two-level index) seems brilliant (for my use case!). The difference is that it loads only the first-level index into memory. That index is much smaller, so you can keep a much higher number of files open. The second level (the index partitions) then goes through the block cache, using the standard caching and eviction algorithms. That means that if you have enough memory for caching, it is used and things are fast; when you start to starve on memory, performance will go down. RocksDB is mostly about tuning. There is no single ideal way to use it, but if you use it correctly there is hardly anything faster; on the other hand, incorrect usage can make it really bad. Personally, I do not see anything wrong in the memory management now that I have started to understand what it does, provided the test proves there is no leak. Anyway, kudos to the authors! |
There is no leak. But I still see at least 1 problem: I can't use https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters |
My custom application loads a huge amount of data into RocksDB (about 70G). After some time I see: -rw-r--r-- 1 ubuntu ubuntu 118410341 Jul 18 15:19 000121.sst and the process consumes more than 70G of RAM. The corresponding column family is optimized for lookup, and partitioned filters and index don't work there. |
I see only 17 files and 70G of RAM consumed. |
It is hard to say. Anyway, it looks strange, because the listed files are around 2GB in total. That would mean compression generates files 35 times smaller (strange). It also strangely correlates with your data size. Are you sure that you do not have an issue (memory leak) in your own application? |
Our profiling indicates there is a huge leak accumulating within a couple of days, with the process growing by up to 60 GB a day. The process usually recycles. I am seeing multiple stacks that have a common theme of a BlockFetcher trying to insert the block into the cache, etc. The below happens during Get()/MultiGet(), at the end of compaction, etc. The first stack below is the most prolific, raking up many GBs. My suspicion is that it may have something to do with the recent LRU cache changes, though I'm not sure. This does not happen in 5.6.1.
|
@koldat which release or commit are you running while seeing the leak? CC @maysamyabandeh. One of the previous heap profiles shows the partitioned index took most of the memory. |
@yuslepukhin Get()/MultiGet() is not supposed to go this path. Maybe it's the compaction path? Can you paste the whole stack so that we can figure out where it is from? |
@siying Thanks for responding. There are multiple stacks that show up in my profiling. Let me gather the top ones and I will post them here. @maysamyabandeh The partitioned index also caught my attention. |
This covers practically all of it. |
I confirm
This causes a lot of allocations and problems |
I am using the 5.12 branch with our JNI modifications. I can say that the two-level index maybe "hides" the leak, because the memory growth is not noticeable (so I thought it had been incorrect usage). Most of the memory was taken by compaction and by seek and seekPrev (see picture). Seek hits ReadBlockContents, as does compaction. I guess that maybe the index cache does not evict indices for files that are deleted. |
We switched all column families to the two-level index. We still observe memory growth on intensive read operations. |
@koldat what's your block cache size? Data blocks from a deleted file are not necessarily being deleted immediately. Eventually they will be evicted based on LRU. If your actual block cache usage is larger than capacity, then it's a problem. Otherwise, it is still expected. |
For those who run on C++, or who can get the block cache statistics some other way: what are your readings of Cache::GetCapacity(), Cache::GetUsage() and Cache::GetPinnedUsage()? |
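For anyone wanting to answer that question from C++, a minimal sketch, assuming you kept the shared_ptr to the cache you installed in BlockBasedTableOptions::block_cache (the helper name is illustrative):

```cpp
#include <cstdio>
#include <memory>
#include "rocksdb/cache.h"

// `block_cache` is assumed to be the same shared_ptr<rocksdb::Cache> that was
// installed in BlockBasedTableOptions::block_cache when the DB was opened.
void DumpBlockCacheStats(const std::shared_ptr<rocksdb::Cache>& block_cache) {
  std::printf("capacity     : %zu bytes\n", block_cache->GetCapacity());
  std::printf("usage        : %zu bytes\n", block_cache->GetUsage());
  std::printf("pinned usage : %zu bytes\n", block_cache->GetPinnedUsage());
  // If usage/pinned usage keep climbing far beyond capacity, something is
  // holding cache handles (e.g. live iterators); if they stay at or below
  // capacity, the growth is coming from outside the block cache.
}
```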
Actually, it is a very big surprise for me that a C++ application has such problems with memory. I understand when a Java application dies due to frequent GCs and so on... but a C++ application written in Silicon Valley, hmmm... |
Hello Oleg. I can't share all steps we did (commercial secret). |
@koldat please take a look at this change. Of course it is not a big deal, but the quality of software = a set of well-written "not a big deal" pieces. |
I am sorry, but I do not know anything about your "lock" issue, and it is not related to this one, so I would not mix them. I was commenting on the memory usage that I was also fighting with at the start (due to my lack of knowledge). As you have your own view on that, I am not able to help more here, as I am just a happy user of this library. If someone wants help with memory configuration, please create a new issue with fresh data. I am unsubscribing from this one. |
@koldat, ok up to you. Good luck ! |
Thank you guys for trying to help.
Also I've tried to play with these settings: https://github.com/facebook/rocksdb/wiki/Partitioned-Index-Filters#how-to-use-it |
Oleg, did you try glibc + MALLOC_ARENA_MAX=2 ? |
Yes, I saw some decrease in memory consumption, but it didn't help in general. |
FYI, this appears to be a duplicate of #3216 (I posted some notes there). I can add that the memory leak appears to be linearly associated with a file descriptor leak; I can count file descriptors by saying Resolved: see my own comment #4112 (comment) immediately below. (In my case, the compacted size of my db is 500 MBytes. Continuous editing of this DB will blow up RAM use to 200 GBytes in an hour or two. There appears to be about 50 MBytes of RAM use per |
Update: after this comment (#3216 (comment)) I came to realize that I had made a complete newbie mistake in my C++ code: RocksDB iterators are NOT smart pointers that self-delete when they go out of scope. They must be explicitly deleted! Upon fixing this complete-newbie mistake, all my disk and RAM usage problems went away! Wow! I suggest that anyone else reading this take a good close look at their iterators, and review |
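A hypothetical scan helper illustrating the fix described above; wrapping the raw Iterator* in a std::unique_ptr makes the delete automatic (the function name and loop body are placeholders):

```cpp
#include <memory>
#include "rocksdb/db.h"
#include "rocksdb/iterator.h"
#include "rocksdb/options.h"

// NewIterator() returns a raw pointer that the caller owns, so wrap it in
// std::unique_ptr instead of letting it leak when it goes out of scope.
void ScanAll(rocksdb::DB* db) {
  rocksdb::ReadOptions read_options;
  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(read_options));
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    // ... use it->key() / it->value() ...
  }
  // The unique_ptr deletes the iterator here, releasing the pinned blocks and
  // SST files (and file descriptors) the iterator was keeping alive.
}
```

The same discipline applies to any RocksDB object handed back as a raw pointer; in the Java bindings the analogous step is closing every RocksObject (for example, RocksIterator) when you are done with it.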
hi guys, any update on this? |
I am not sure what you are referring to specifically. We are working on limiting memory usage and updates on that will be in the release notes ("HISTORY.md"). |
Thank you, Facebook team. My complaints were not useless. |
Just FYI -- my application https://github.com/cculianu/fulcrum, which makes heavy use of RocksDB, runs great on Linux. But on macOS and on Windows it leaks memory. On Linux, memory usage is rock solid at 800-900MB even if the process is up for a month. On Windows or macOS it grows and grows, so much that after a few days it's at 7-9 GB of consumption and still growing. I'm pretty sure the leak is inside RocksDB; the rest of the app is tight. Sadly, I lack the tools on macOS or Windows to actually run things like valgrind, etc. So yes -- kindly do search for leaks on macOS and/or Windows. I suspect you have plenty. |
We had problems with runs on Linux as well, but after 1-2 weeks of stress testing. |
I suspect that the root cause is bad interaction with the internals of the libc allocator; at least, the fix using MALLOC_ARENA points to this. Please correct me if I am wrong. |
Well, I tried with and without jemalloc on both Windows and macOS. Note that macOS, I believe, has a superior allocator to the standard glibc one; I rarely see fragmentation issues on macOS. I have no idea if it's fragmentation or something else... really, I do not. Curious that on Linux it's fine. |
Hi @toktarev, were you able to find any solution for this? I'm using RocksJni (8.5.4) and I'm seeing constant growth in memory. All the things mentioned here were checked. My DB size on disk is 5 GB. The block cache size is 10 GB with strict limits. Memtables should take no more than 1 GB. I tried changing the MAX_ARENA config. Then I tried using jemalloc / tcmalloc as well. But nothing worked. I'm also closing all RocksObjects in the code. From my observations, the issue is in the read flow only. I've been stuck on this for weeks now. Can someone please help? |
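Not an answer from the maintainers, but a hedged C++ sketch (the Java API exposes equivalent options and properties) of two things worth checking in a situation like this: whether index/filter blocks and table readers are actually bounded, and which internal consumer the DB itself reports as growing. The 10 GB and 500 values and the function names are illustrative assumptions; the property strings are the stock RocksDB ones:

```cpp
#include <cinttypes>
#include <cstdio>
#include "rocksdb/cache.h"
#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"

rocksdb::Options MakeBoundedReadOptions() {
  rocksdb::Options options;
  rocksdb::BlockBasedTableOptions table_options;
  // 10 GB cache; strict_capacity_limit=true makes inserts fail rather than
  // letting the cache overshoot its capacity.
  table_options.block_cache =
      rocksdb::NewLRUCache(10ull << 30, /*num_shard_bits=*/-1,
                           /*strict_capacity_limit=*/true,
                           /*high_pri_pool_ratio=*/0.5);
  // Charge index and filter blocks to the same cache so they are bounded too.
  table_options.cache_index_and_filter_blocks = true;
  table_options.cache_index_and_filter_blocks_with_high_priority = true;
  table_options.pin_l0_filter_and_index_blocks_in_cache = true;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
  // Bound the table-reader memory (index/filter metadata kept per open file).
  options.max_open_files = 500;
  return options;
}

void DumpMemoryConsumers(rocksdb::DB* db) {
  for (const char* prop : {"rocksdb.block-cache-usage",
                           "rocksdb.block-cache-pinned-usage",
                           "rocksdb.cur-size-all-mem-tables",
                           "rocksdb.estimate-table-readers-mem"}) {
    uint64_t value = 0;
    if (db->GetIntProperty(prop, &value)) {
      std::printf("%s = %" PRIu64 "\n", prop, value);
    }
  }
}
```

Logging those four properties over time usually shows whether the growth is in the cache, the memtables, or the table readers, or whether it sits entirely outside RocksDB's own accounting (allocator fragmentation, leaked iterators, and similar).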
For what it’s worth I’ve had great success by using jemalloc along with the following ENV var to get jemalloc to not use thread-local caches for allocated RAM (which wastes memory since each thread has effectively its own heap)
|
Hi @areyohrahul, I understand your pain and remember my own pain when I worked on this problem. But I don't know why it is not a solution for you. I can give you two pieces of advice here:
Otherwise you will just experience pain, guessing at the heap like reading coffee grounds. |
Hi @cculianu, how did you end up using jemalloc? Did you have to compile RocksDB again with some special flag? Or did you just change some setting for Java? |
Oh, I'm a C++ guy... I don't use Java. There is a way to "force" the Java runtime process to use jemalloc without its knowledge, via some LD_PRELOAD magic. If you install jemalloc, you can use LD_PRELOAD to force it. See this guide: https://github.com/jemalloc/jemalloc/wiki/Getting-Started You can try that -- this assumes the java runtime uses |
Thanks @cculianu, this helps a lot. Will try this out. |
Hey @toktarev, I tried changing the MAX_ARENA config but it didn't work. Can you share how you analysed the source of the problem and how exactly you changed it? Also, do you use JNI or CPP? |
JNI (Java + CPP). I remember that MAX_ARENA fixed memory growth in long runs. |
Expected behavior
Process consumes about 10 megabytes
Actual behavior
Memory grows without limit
Steps to reproduce the behavior
Run this code:
https://pastebin.com/Ch8RhsSB
Sorry, RocksDB team, but this is a huge problem.
This is a trivial test and I expect it to work like a finite state machine:
populate memory - flush - re-use memory.
Instead, I see the memory grow.