Cached string decoder for map keys (CachedKeyDecoder) #54
Conversation
Force-pushed 2c6e1fc → 97eb530
Force-pushed 97eb530 → 3dd2893
Fascinating idea! 👏 👏 👏 I'll merge the string caching into decoders, but there are some improvements and changes needed.

**The reuse of Decoder**

If one would like to decode in the fastest way, they should reuse the instance of `Decoder`. And, I think …

**Default behavior**

I'd like to use caching by default, so it should be merged into `Decoder`.

**Configuration of …**
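To illustrate the "reuse of Decoder" point above, here is a minimal usage sketch. It assumes the library's `Decoder` class exposes a `decode()` method; treat it as a hedged example rather than the exact public API:

```ts
import { encode, Decoder } from "@msgpack/msgpack";

// Reuse one Decoder instance so its key cache persists across calls.
const decoder = new Decoder();

const payloads = [encode({ id: 1, name: "a" }), encode({ id: 2, name: "b" })];
for (const payload of payloads) {
  // After the first decode, map keys like "id" and "name" can be
  // served from the cache instead of being UTF-8 decoded again.
  console.log(decoder.decode(payload));
}
```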
Sure, this is just a prototype solution that we use; it's not intended to be merged directly, and it requires changes and tests. I think we can allow the user to pass a string decoder via options, so a custom decoder can be supplied if needed. What do you think about it?
I am not sure how to properly implement `cacheSize` yet. Maybe I can add … The current version tracks the number of hits for each key and orders keys by hit count, so the most frequently used keys are found faster.
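Roughly, that hit-count ordering might look like this (a simplified sketch with hypothetical names, not the PR's exact code):

```ts
interface KeyCacheRecord {
  readonly bytes: Uint8Array; // raw bytes of the encoded key
  readonly str: string;       // the decoded key string
  hits: number;               // how many times this key was found
}

function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// On a hit, bump the counter and keep records sorted by hit count,
// so the hottest keys are compared first on the next lookup.
function findCachedKey(records: KeyCacheRecord[], bytes: Uint8Array): string | null {
  for (const record of records) {
    if (bytesEqual(record.bytes, bytes)) {
      record.hits++;
      records.sort((a, b) => b.hits - a.hits);
      return record.str;
    }
  }
  return null;
}
```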
Lol @gfx, I've just improved performance even more :D Now it's at 176660 ops/s.
Amazing! 😆 `cacheSize` does not need to be exact; it's just a hint.
You mean … BTW, I have another idea (sharing encoders and decoders) that is not directly related to this PR: #56
I'd like to proceed with this patch this week. FYI: Chrome 76 is going to be released at the end of this month, and it includes a JSON.parse performance improvement.
Force-pushed 838b277 → 8be4057
```ts
if (records.length >= this.maxLengthPerKey) {
  // `records` is full!
  // Evict by setting `record` at a randomized position.
  records[(Math.random() * records.length) | 0] = record;
}
```
I have replaced the original hit-count-based sorting algorithm with a random-picking algorithm to reduce the overhead in `#get()`.
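That keeps the lookup a plain linear scan with no bookkeeping; something like the following (a sketch reusing the helpers from the earlier hit-count example, not the exact merged code):

```ts
// A cache hit now costs only the byte comparison: no counter update,
// no re-sorting. Eviction (shown above) overwrites a random slot.
function get(records: KeyCacheRecord[], bytes: Uint8Array): string | null {
  for (const record of records) {
    if (bytesEqual(record.bytes, bytes)) {
      return record.str;
    }
  }
  return null;
}
```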
Codecov Report
```
@@            Coverage Diff             @@
##           master      #54      +/-   ##
==========================================
+ Coverage   98.16%   98.25%   +0.08%
==========================================
  Files          15       16       +1
  Lines         927      972      +45
  Branches      189      197       +8
==========================================
+ Hits          910      955      +45
  Misses         17       17
```
Continue to review full report at Codecov.
Benchmark on NodeJS/v12.7.0
Now, this PR is finished. cc: @sergeyzenchenko
```diff
@@ -59,6 +62,9 @@ export class Decoder {
   headByte = HEAD_BYTE_REQUIRED;
   readonly stack: Array<StackState> = [];
+
+  // TODO: parameterize this property.
+  readonly cachedKeyDecoder = sharedCachedKeyDecoder;
```
I'm thinking about the interface to customize a key decoder, so it's not public yet.
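One possible shape for such an interface (purely hypothetical: the names `KeyDecoder`, `canBeCached`, and the options field are my guesses, not a committed API):

```ts
interface KeyDecoder {
  // Whether a key of this byte length should go through the cache at all.
  canBeCached(byteLength: number): boolean;

  // Decode (or look up) a map key from the raw input bytes.
  decode(bytes: Uint8Array, inputOffset: number, byteLength: number): string;
}

// The Decoder could then accept a custom implementation, e.g.:
//   new Decoder({ keyDecoder: mySharedCachedKeyDecoder });
```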
Check this out, @gfx. We are using it in our project.
A big part of all strings are map keys, and most of the time the total number of unique keys is small.
For example, in our payload of 260k objects, 67% of all strings are keys, with only about 100 unique values.
I've tried caching these values and skipping string decoding for keys entirely.
The cache uses the byte representation of the key string as its lookup key and stores the actual decoded key value. So instead of decoding the key string, we just need to find the key value in the cache.
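In other words, the decoder checks the cache before doing any UTF-8 work; roughly like this (a simplified sketch with hypothetical helper names):

```ts
interface KeyCache {
  find(bytes: Uint8Array): string | null;      // look up by raw bytes
  store(bytes: Uint8Array, str: string): void; // remember bytes -> string
}

function decodeMapKey(cache: KeyCache, bytes: Uint8Array): string {
  // Fast path: these exact bytes were seen before, reuse the string.
  const cached = cache.find(bytes);
  if (cached !== null) {
    return cached;
  }
  // Slow path: decode once, then cache for subsequent keys.
  const str = new TextDecoder().decode(bytes);
  cache.store(bytes, str);
  return str;
}
```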
Results before:
After:
I am not sure that we should include it directly in this library, but maybe it can be added as an optional speedup that the user can enable if required.
It works better if the Decoder is reused, because the cache needs to be warmed up first.
Let me know what you think about it.