Makes request headers lowercase in English locale. #29419
Closed
@adityasrini
Rather than adding the two String.toLowerCase(Locale.ENGLISH) calls, which requires two changes, you should replace the new HashMap() with a new TreeMap(String.CASE_INSENSITIVE_ORDER).
I don't necessarily agree with this. We don't need a sorted map here; we just need to make sure that keys are case-insensitive. That is one of the properties of TreeMap when used in this way, but we don't need its other properties, which also affect the data structure that is used internally to store entries. Makes sense?
@javanna
Sorry, my suggestion should have been to replace the `new HashSet()` with `new TreeSet(String.CASE_INSENSITIVE_ORDER)` at line 433, and remove the two `toLowerCase(Locale.ENGLISH)` additions. Obviously, mentioning TreeMap was nonsense.
I don't necessarily agree with this either. We would be introducing a sorted data structure where we don't need ordering, only case-insensitive lookups. Not sure it is a good trade-off.
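For reference, a minimal sketch of the two options being compared. The class and method names are hypothetical and not taken from the PR; only `HashSet`, `TreeSet`, `String.CASE_INSENSITIVE_ORDER` and `toLowerCase(Locale.ENGLISH)` come from the discussion.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not code from the pull request.
final class HeaderNameSets {

    // Option A (the PR's approach): normalise the name, then use a plain HashSet.
    // Returns false when the name was already present in any letter case.
    static boolean addLowercased(Set<String> seen, String headerName) {
        return seen.add(headerName.toLowerCase(Locale.ENGLISH));
    }

    // Option B (the reviewer's suggestion): let the set's comparator ignore case.
    static Set<String> newCaseInsensitiveSet() {
        return new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    }

    public static void main(String[] args) {
        Set<String> hashed = new HashSet<>();
        System.out.println(addLowercased(hashed, "Content-Type")); // true
        System.out.println(addLowercased(hashed, "content-type")); // false (duplicate)

        Set<String> tree = newCaseInsensitiveSet();
        System.out.println(tree.add("Content-Type")); // true
        System.out.println(tree.add("CONTENT-TYPE")); // false (duplicate)
    }
}
```

Both variants reject the second spelling of the same header. Option A needs the lowercase call at every place that touches the set, while option B changes only the constructor but swaps the underlying data structure.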
@javanna
The internal structure of TreeSet doesn't matter, any more than the internals of HashSet matter.
HashSets also arrange their keys. It might not be alphabetical, but the buckets mechanism has its own ordering scheme when it allocates keys into a chain of buckets, and who cares?
All that matters is that the *Set allows us to determine whether some key already exists in it.
Maybe in this case it won't make a huge difference, but `HashSet` is backed by a `HashMap` while `TreeSet` is backed by a `TreeMap`. `TreeMap` is implemented using a red-black tree, which is very different from `HashMap`, the reason being that the former has to re-balance the tree to make it possible to iterate through entries in their natural ordering. We are using a set here, though, just to quickly check whether it already contains something, and we never iterate through it. Conceptually, I still prefer the two calls to lowercase over using a red-black tree just because TreeSet allows us to make strings case-insensitive. Hopefully I explained what I meant. It would be interesting to measure which of the two solutions is faster, but I'm not sure it's worth it in this case given that this is probably not a hotspot.
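A small sketch of the difference described here, assuming three made-up header names. The class and its methods are hypothetical; the point is only that the tree-backed set maintains a sorted order that a pure membership check never uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; the header names are sample data.
final class SetOrderingSketch {
    static final List<String> NAMES = Arrays.asList("x-opaque-id", "Content-Type", "accept");

    // Hash-backed set: names are lowercased before insertion; no ordering is kept.
    static Set<String> hashBacked() {
        Set<String> set = new HashSet<>();
        for (String name : NAMES) {
            set.add(name.toLowerCase(Locale.ENGLISH));
        }
        return set;
    }

    // Tree-backed set: the comparator ignores case and also keeps entries sorted.
    static Set<String> treeBacked() {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(NAMES);
        return set;
    }

    public static void main(String[] args) {
        // The only operation the thread cares about: membership.
        System.out.println(hashBacked().contains("accept")); // true
        System.out.println(treeBacked().contains("ACCEPT")); // true
        // Iterating the tree-backed set yields case-insensitive sorted order.
        System.out.println(new ArrayList<>(treeBacked())); // [accept, Content-Type, x-opaque-id]
    }
}
```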
Today TreeMap is a red-black tree, tomorrow it might change; it doesn't matter. All we care about is that it keeps the contract that allows case-insensitive lookups of keys. We can be sure they are close enough to the same speed: the JDK collections guys know this is important, and they aren't going to make one 10x slower than the other.
A HashMap turns into a tree when it gets really full anyway, which basically means the JDK guys don't care, so why should you?
https://stackoverflow.com/questions/30164087/how-does-java-8s-hashmap-degenerate-to-balanced-trees-when-many-keys-have-the-s
And a HashMap also has to rebalance when it hits some threshold. To solve that, we simply call the constructor that takes an initial size, so in both cases the rebalance never happens.
That's right, so why talk about something that never happens?
Except that enough of this is, because HM and TM are used everywhere and will be compiled by HotSpot. If one is compiled, so will be the other, and vice versa.
Why? This path takes 100s/100s of cycles vs billions for a complete request; it doesn't matter. Even if one is 2x or 10x slower, the total request time will be basically the same.
+1 to keeping a hash set. Moving to a tree set would make operations perform in `O(log(size))` rather than constant time. Even if this set would usually be small so that it wouldn't matter, we have seen in the past that users are sometimes very creative when it comes to pushing the system to its boundaries.
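To make the `O(log(size))` point concrete, here is a rough sketch that counts how many comparator calls a single `TreeSet.contains` needs. The class name, key format and sizes are assumptions for illustration; it is not a benchmark and says nothing about wall-clock time.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; counts comparisons, not time.
final class TreeLookupCost {

    // Builds a case-insensitive TreeSet of `size` made-up header names and
    // returns the number of comparator calls made by one contains() lookup.
    static long comparisonsForOneLookup(int size) {
        long[] counter = new long[1];
        Comparator<String> counting = (a, b) -> {
            counter[0]++;
            return String.CASE_INSENSITIVE_ORDER.compare(a, b);
        };
        Set<String> names = new TreeSet<>(counting);
        for (int i = 0; i < size; i++) {
            names.add("x-header-" + i);
        }
        counter[0] = 0; // ignore the comparisons spent on insertion
        names.contains("X-HEADER-0");
        return counter[0];
    }

    public static void main(String[] args) {
        for (int size : new int[] {16, 1024, 65536}) {
            System.out.println(size + " entries: "
                    + comparisonsForOneLookup(size) + " comparisons for one lookup");
        }
    }
}
```

A lookup walks a single root-to-node path, so the count is bounded by the tree height, which grows with the logarithm of the set size. A `HashSet` lookup instead hashes the key and, with well-distributed hashes, inspects a small number of entries regardless of size.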