Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC: precompute embeded string literals hash code #553

Closed
wants to merge 1 commit into from

Conversation

casperisfine
Copy link

Unclear how much worth it, it would be, but with embeded strings we often have some space left in the slot, which we can use to store the string Hash code.

It's probably only worth it for string literals, as they are the ones likely to be used as hash keys.

We could even ony do this for string literal actually used as hash keys, as seen by the compiler.

It is also unclear what the best layout would be. We currently store it right after the string terminator as to make it easy/fast to compute, but ideally it would be at a fixed position, however this means creating another RString union, which would complexify the code base significantly.

It's also unclear to us whether trying to respect alignment would lead to a performance gain or not.

hash = 10.times.to_h do |i|
  [i, i]
end

dyn_sym = "dynamic_symbol".to_sym
hash[:some_symbol] = 1
hash[dyn_sym] = 1
hash["small"] = 2
hash["frozen_string_literal"] = 2

Benchmark.ips do |x|
  x.report("symbol") { hash[:some_symbol] }
  x.report("dyn_symbol") { hash[:some_symbol] }
  x.report("small_lit") { hash["small"] }
  x.report("frozen_lit") { hash["frozen_string_literal"] }
  x.compare!(order: :baseline)
end

Before:

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.392M i/100ms
          dyn_symbol     2.440M i/100ms
           small_lit     2.155M i/100ms
          frozen_lit     2.010M i/100ms
Calculating -------------------------------------
              symbol     24.175M (± 1.7%) i/s -    122.002M in   5.048306s
          dyn_symbol     24.345M (± 1.6%) i/s -    122.019M in   5.013400s
           small_lit     21.252M (± 2.1%) i/s -    107.744M in   5.072042s
          frozen_lit     20.095M (± 1.3%) i/s -    100.489M in   5.001681s

Comparison:
              symbol: 24174848.1 i/s
          dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error
           small_lit: 21252403.2 i/s - 1.14x  slower
          frozen_lit: 20094766.0 i/s - 1.20x  slower

After:

ruby 3.4.0dev (2024-04-08T07:20:15Z interned-string-ha.. 76efed65bd) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.400M i/100ms
          dyn_symbol     2.405M i/100ms
           small_lit     2.308M i/100ms
          frozen_lit     2.268M i/100ms
Calculating -------------------------------------
              symbol     23.528M (± 6.9%) i/s -    117.584M in   5.033231s
          dyn_symbol     23.777M (± 4.7%) i/s -    120.231M in   5.071734s
           small_lit     23.066M (± 2.9%) i/s -    115.376M in   5.006947s
          frozen_lit     22.729M (± 1.1%) i/s -    115.693M in   5.090700s

Comparison:
              symbol: 23527823.6 i/s
          dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error
           small_lit: 23065535.3 i/s - same-ish: difference falls within error
          frozen_lit: 22729351.6 i/s - same-ish: difference falls within error

cc @etiennebarrie

compile.c Outdated
@@ -10593,7 +10593,7 @@ iseq_compile_each0(rb_iseq_t *iseq, LINK_ANCHOR *const ret, const NODE *const no
lit = rb_str_freeze(lit);
}
else {
lit = rb_fstring(lit);
lit = rb_fstring(rb_str_precompute_hash(rb_str_freeze(lit)));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: the compiler should always feed pre-frozen strings to rb_fstring as to save a dup.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, it could use rb_enc_interned_str to save a bunch of allocs.

string.c Outdated Show resolved Hide resolved
@casperisfine
Copy link
Author

Note to self, we're precomputing in the compiler, we'd need to handle loading cached ISeq too.

@casperisfine
Copy link
Author

@XrXr: shared https://bugs.ruby-lang.org/issues/15331, but seems like that patch was lazily computing the hashcode, here we eagerly do it during compilation, so I think we benefit more from it.

Unclear how much worth it, it would be, but with embeded
strings we often have some space left in the slot, which
we can use to store the string Hash code.

It's probably only worth it for string literals, as they
are the ones likely to be used as hash keys.

We could even ony do this for string literal actually
used as hash keys, as seen by the compiler.

It is also unclear what the best layout would be. We
currently store it right after the string terminator
as to make it easy/fast to compute, but ideally it would
be at a fixed position, however this means creating another
`RString` union, which would complexify the code base significantly.

It's also unclear to us whether trying to respect alignment would
lead to a performance gain or not.

```ruby
hash = 10.times.to_h do |i|
  [i, i]
end

dyn_sym = "dynamic_symbol".to_sym
hash[:some_symbol] = 1
hash[dyn_sym] = 1
hash["small"] = 2
hash["frozen_string_literal"] = 2

Benchmark.ips do |x|
  x.report("symbol") { hash[:some_symbol] }
  x.report("dyn_symbol") { hash[:some_symbol] }
  x.report("small_lit") { hash["small"] }
  x.report("frozen_lit") { hash["frozen_string_literal"] }
  x.compare!(order: :baseline)
end
```

Before:

```
ruby 3.3.0 (2023-12-25 revision 5124f9a) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.392M i/100ms
          dyn_symbol     2.440M i/100ms
           small_lit     2.155M i/100ms
          frozen_lit     2.010M i/100ms
Calculating -------------------------------------
              symbol     24.175M (± 1.7%) i/s -    122.002M in   5.048306s
          dyn_symbol     24.345M (± 1.6%) i/s -    122.019M in   5.013400s
           small_lit     21.252M (± 2.1%) i/s -    107.744M in   5.072042s
          frozen_lit     20.095M (± 1.3%) i/s -    100.489M in   5.001681s

Comparison:
              symbol: 24174848.1 i/s
          dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error
           small_lit: 21252403.2 i/s - 1.14x  slower
          frozen_lit: 20094766.0 i/s - 1.20x  slower
```

After:

```
ruby 3.4.0dev (2024-04-08T07:20:15Z interned-string-ha.. 76efed6) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.400M i/100ms
          dyn_symbol     2.405M i/100ms
           small_lit     2.308M i/100ms
          frozen_lit     2.268M i/100ms
Calculating -------------------------------------
              symbol     23.528M (± 6.9%) i/s -    117.584M in   5.033231s
          dyn_symbol     23.777M (± 4.7%) i/s -    120.231M in   5.071734s
           small_lit     23.066M (± 2.9%) i/s -    115.376M in   5.006947s
          frozen_lit     22.729M (± 1.1%) i/s -    115.693M in   5.090700s

Comparison:
              symbol: 23527823.6 i/s
          dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error
           small_lit: 23065535.3 i/s - same-ish: difference falls within error
          frozen_lit: 22729351.6 i/s - same-ish: difference falls within error
```

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants