Review ArrayByteBufferPool eviction #11371
Comments
@sbordet all very good points I think. More thoughts inline
If we are using the THREAD_ID strategy, how are we getting large numbers of buffers in a bucket? Is it that each thread is sometimes using many buffers of the same size (say 10), so that with 100 active threads we get 1000 buffers of that size? Or is it only on servers where large numbers of threads have been configured and each only uses 1 or 2? To avoid the CompoundPool, the default size would need to be 256. As we seldom run with more than 200 threads, that would be fine if each thread is only using 1 buffer of each size. Is this less critical if we don't do findOldestEntry, or are we still iterating on acquire anyway?
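As a rough illustration of the sizing question (this is not Jetty's code; the class and method names are invented), a THREAD_ID-style strategy maps each releasing thread to a fixed slot, so the pool needs roughly as many slots as there are threads holding one buffer each, and there is no meaningful notion of an "oldest" entry:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical sketch of a THREAD_ID-style strategy: each thread maps to a
// stable slot, so ~200 threads holding one buffer per bucket fit in 256 slots,
// while 100 threads holding 10 buffers each (1000 buffers) would not.
class ThreadIdSlots
{
    private final AtomicReferenceArray<ByteBuffer> slots;

    ThreadIdSlots(int size)
    {
        slots = new AtomicReferenceArray<>(size);
    }

    int indexForCurrentThread()
    {
        // Thread id modulo capacity: a stable per-thread slot, no timestamps needed.
        return (int)(Thread.currentThread().getId() % slots.length());
    }

    ByteBuffer tryAcquire()
    {
        return slots.getAndSet(indexForCurrentThread(), null);
    }

    boolean tryRelease(ByteBuffer buffer)
    {
        return slots.compareAndSet(indexForCurrentThread(), null, buffer);
    }
}
```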
Totally agree. Best to evict any unused buffer. I don't think there is any concept of "hot buffers" when it comes to eviction. Or is this really about finding the oldest bucket size, so we evict buffers of sizes that are not used any more? That kind of makes sense. So should we instead keep stats on which buckets are used more frequently, and derive an eviction policy that computes a weight based on usage and size to determine which buckets are evicted from? Ultimately, if we were confident of the configuration of the pool, then we would not need eviction. If we knew the buffer sizes most frequently used, we could pool X of those buffers and leave everything else unpooled. The problem with this is that we don't know the perfect configuration and we are trying to heuristically arrive there.
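As a straw man for the "weight based on usage and size" idea (all names and the formula below are invented for illustration, not an existing Jetty API), one could score each bucket by how much idle memory it holds per recent acquire, and evict from the coldest, fattest bucket first:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical straw man: rank buckets by idle bytes per recent acquire and
// evict from the highest-scoring (rarely used but memory-heavy) bucket first.
record BucketStats(int capacity, int idleEntries, long recentAcquires)
{
    double evictionWeight()
    {
        long idleBytes = (long)capacity * idleEntries;
        // +1 avoids division by zero; rarely used buckets holding many bytes score highest.
        return (double)idleBytes / (recentAcquires + 1);
    }

    static Optional<BucketStats> nextToEvictFrom(List<BucketStats> buckets)
    {
        return buckets.stream().max(Comparator.comparingDouble(BucketStats::evictionWeight));
    }
}
```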
We have to evict during acquire because we over-provision the number of entries. Even with a fixed bucket size, if we used every entry from every bucket we might be too large. That means we need to check the size as we acquire, because we may have just populated an entry. Perhaps a better way is to use the max size to exactly populate the number of entries of the largest bucket size. E.g. if the maxBucketCapacity is 1MB and the pool maxCapacity is 1024MB, then we configure a maxBucketSize of 1024 for the 1MB bucket. Then all the smaller buckets have to be populated by taking an entry from the larger bucket, so a 1MB entry can be acquired by the 512KB bucket and used to provide 2 entries. I guess the problem here is that if we have a lot of small buffers and a lot of big buffers, and nothing in between, then there may initially be a lot of allocating of intermediary entries just so they can be split into smaller entries. Anyway, the idea is to come up with a pool where we do not over-provision the number of entries - so if an entry exists, we know we can allocate a buffer for it without violating the maxCapacity constraint. With such a pool we would never need to evict, but we might decide to do so occasionally to rebalance the sizes. I.e. if we have a lot of idle small buffers, we could let them go to make a larger bucket entry available for allocation. Also, shrinking over time is probably a good thing if buffers are not being used. Maybe we'd be better off avoiding dynamic configuration of the pool and just making it report stats that would help improve its static configuration.
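A minimal sketch of the exact-provisioning arithmetic described above (illustrative only; the names are made up and this is not a proposal for actual Jetty code): the largest bucket is budgeted from the pool's max capacity, and smaller buckets obtain entries by splitting a larger entry, so pooled bytes can never exceed the budget.

```java
import java.nio.ByteBuffer;

// Illustration of the exact-provisioning idea: entries are budgeted against the
// largest bucket, and smaller buckets are fed by splitting a larger entry,
// so the sum of pooled entry capacities can never exceed maxCapacity.
class ExactProvisioning
{
    // E.g. maxCapacity = 1024 MiB and maxBucketCapacity = 1 MiB -> 1024 entries.
    static int maxEntriesForLargestBucket(long maxCapacity, int maxBucketCapacity)
    {
        return (int)(maxCapacity / maxBucketCapacity);
    }

    // Split one larger entry into two halves (e.g. 1 MiB -> 2 x 512 KiB),
    // keeping the total pooled bytes constant.
    static ByteBuffer[] split(ByteBuffer larger)
    {
        int half = larger.capacity() / 2;
        larger.clear();
        ByteBuffer first = larger.slice(0, half);
        ByteBuffer second = larger.slice(half, half);
        return new ByteBuffer[]{first, second};
    }
}
```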
My current thinking is that we try to avoid eviction as much as possible.... and then make it quick and simple.
The other thought that comes to mind is that we should again validate our assumption that pooling buffers is worthwhile. It may be that a minimal THREAD_ID based pool that just handles the release-and-then-quickly-reacquire scenario is sufficient. Note that the maxCapacity is not in any way a limit on memory used, as we will dynamically allocate if there is no pooled buffer. It is essentially a limit on the minimum memory used in idle periods.
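To make the "minimal pool" idea concrete, here is a hedged sketch (invented names, not actual or proposed Jetty code) of a per-thread, one-buffer-per-size cache that only serves the release-and-then-quickly-reacquire pattern; anything else falls through to a plain allocation:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal pool: each thread keeps at most one buffer per capacity,
// so a release followed quickly by an acquire on the same thread hits the cache,
// and everything else is a plain allocation.
class ThreadLocalBufferCache
{
    private final ThreadLocal<Map<Integer, ByteBuffer>> cache = ThreadLocal.withInitial(HashMap::new);

    ByteBuffer acquire(int capacity)
    {
        ByteBuffer cached = cache.get().remove(capacity);
        if (cached != null)
        {
            cached.clear();
            return cached;
        }
        return ByteBuffer.allocateDirect(capacity);
    }

    void release(ByteBuffer buffer)
    {
        // Keep at most one buffer per capacity; otherwise let it be garbage collected.
        cache.get().putIfAbsent(buffer.capacity(), buffer);
    }
}
```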
* Eviction is now performed on release(), rather than acquire().
* Memory accounting is done on release(), rather than acquire(). This is because we were always exceeding the maxMemory on acquire(), by returning a non-pooled buffer. We only need to account for what is idle in the pool, and that is done more efficiently on release(), and it is leak-resistant (i.e. if the buffer is not returned, the memory is already not accounted for, keeping the pool consistent).
* Released entries now give precedence to Concurrent.Entry, rather than Queued.Entry, so the queued pool is always kept at minimum size.

Signed-off-by: Simone Bordet <simone.bordet@gmail.com>
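A simplified model of the accounting scheme the commit message describes (the names are invented and this is not the actual ArrayByteBufferPool code): the counter only tracks idle pooled bytes, it grows on release and shrinks on acquire or eviction, and eviction runs on release once the idle budget is exceeded.

```java
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model of release-time accounting: only idle (pooled) bytes are
// counted, so a buffer that is never released simply never re-enters the
// accounting, and acquire never has to evict.
class ReleaseAccountingPool
{
    private final Queue<ByteBuffer> idle = new ConcurrentLinkedQueue<>();
    private final AtomicLong idleBytes = new AtomicLong();
    private final long maxIdleBytes;

    ReleaseAccountingPool(long maxIdleBytes)
    {
        this.maxIdleBytes = maxIdleBytes;
    }

    ByteBuffer acquire(int capacity)
    {
        ByteBuffer pooled = idle.poll();
        if (pooled != null)
        {
            // Whatever leaves the idle queue is no longer pooled memory.
            idleBytes.addAndGet(-pooled.capacity());
            if (pooled.capacity() >= capacity)
            {
                pooled.clear();
                return pooled;
            }
        }
        // No accounting here: a newly allocated buffer is not idle pool memory.
        return ByteBuffer.allocateDirect(capacity);
    }

    void release(ByteBuffer buffer)
    {
        idle.offer(buffer);
        long total = idleBytes.addAndGet(buffer.capacity());
        // Evict on release: shed idle buffers until back under the budget.
        while (total > maxIdleBytes)
        {
            ByteBuffer evicted = idle.poll();
            if (evicted == null)
                break;
            total = idleBytes.addAndGet(-evicted.capacity());
        }
    }
}
```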
Jetty version(s)
12+
Description
ArrayByteBufferPool eviction needs to be reviewed because the current algorithm is too naive.
Problems:
* `maxBucketSize` is `Integer.MAX_VALUE`, which means that it always creates a `CompoundPool` with possibly a large number of entries in the `QueuedPool`. The entries are not pre-created because it's a `ConcurrentLinkedQueue`, but it may be populated to large values (say for 1kiB buffers, 1M of them will occupy 1GiB). This becomes costly when it's time to evict, because we would need to copy all the entries in a list and iterate to remove the excess memory.
* `findOldestEntry`? Since the pool is configured with `ConcurrentPool.StrategyType.THREAD_ID`, there basically is no "oldest", so it does not make sense to keep times associated with entries and iterate over them to find the oldest; any entry would do.
* The `while` loop in `evict()` may loop multiple times trying to clear memory, as the inner loop on the bucket may not free any memory (as the "oldest" entry may now be in use and cannot be removed).
* `entry.remove()` marks the entry as removed, returning `false` if the entry is in use. So it will eventually be removed when all the releases are performed. However, returning `false` makes `evict()` think the memory has not been reclaimed (it will be, but later), therefore continuing the loop and possibly thrashing many entries. And the memory counters in `ArrayByteBufferPool` are not updated.
* Entries can be removed from the `Pool`, but the memory counters in `ArrayByteBufferPool` are not updated.
* `acquire()` performs eviction, with the idea that if releases are missing, we never clear memory. However, the cost of eviction is paid upfront all the times. Take writes, for example: we pay the cost acquiring the response headers/hpack buffer, while we could avoid that cost and only pay it on release, when the response bytes are travelling to the client.
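To make the `entry.remove()` point above concrete, here is a simplified model of the kind of loop being described (not the real Jetty code; the interface and method names are invented): when `remove()` only marks an in-use entry and returns `false`, the loop believes nothing was reclaimed and keeps marking further entries, while an external memory counter is never updated for the deferred removals.

```java
import java.util.Iterator;
import java.util.List;

// Simplified model of the problematic pattern: remove() marks an in-use entry
// and returns false (actual removal is deferred to the last release), so this
// loop keeps going and may mark ("thrash") many entries without the pool-level
// memory counter ever reflecting the deferred removals.
class NaiveEvictLoop
{
    interface Entry
    {
        int capacity();

        // Marks the entry as removed; returns false if it is in use,
        // in which case actual removal happens at the last release.
        boolean remove();
    }

    static long evict(List<Entry> entries, long excess)
    {
        long reclaimed = 0;
        Iterator<Entry> iterator = entries.iterator();
        while (reclaimed < excess && iterator.hasNext())
        {
            Entry entry = iterator.next();
            if (entry.remove())
            {
                iterator.remove();
                reclaimed += entry.capacity();
            }
            // else: the memory will be freed later, but this loop does not
            // count it, so it continues and marks yet more entries.
        }
        return reclaimed;
    }
}
```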