Use a single index Results when querying across blocks #1469
Labels: area:db, area:index, P: High, T: Memory Utilization, T: Optimization, T: Perf
Right now, when we query across many index blocks, we allocate a results map for each block, run the per-block queries in parallel with each one filling its own results map, and then merge all the results maps together at the end. This makes parallelization easy and straightforward, but since in many workloads the same documents appear in every block, memory utilization scales linearly with the number of index blocks. This is problematic for queries that return a large number of documents.
To address this issue, we will use a single index Results that we push down into all the block queries. That way, each document is allocated in memory only once, regardless of the number of blocks being queried. This will significantly improve memory utilization, but it will make the results map heavily contended. To address that, instead of acquiring the lock each time we add a single document, each block query will use pooled slices to temporarily gather a configurable number of documents, then acquire the lock once and add them all as a batch.
This should give us significantly reduced memory utilization without causing too much contention on the lock.