Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a partially colliding hash code #525

retronym · 2018-06-22T03:47:45Z

import scala.collection.immutable

object ChampPerformance {
  var size: Int = 10000
  var existingKeys: Array[Any] = _
  var missingKeys: Array[Any] = _
  existingKeys = (0 to size).map(i => new Tracer((i % 4) match {
    case 0 => i.toString
    case 1 => i.toChar
    case 2 => i.toDouble
    case 3 => i.toInt
  })).toArray
  missingKeys = (size to 2 * size).toArray
  
  def main(args: Array[String]): Unit = {
    check(immutable.ChampHashMap())
    check(immutable.HashMap())
  }
  def check(map: immutable.Map[Any, Any]) = {
    val values = existingKeys.map(x => (x, ""))
    val map1 = map ++ values
    equalsCounter = 0; hashCodeCounter = 0
    var i = 0;
    while (i < size) {
      assert(map1.contains(existingKeys(i)))
      assert(!map1.contains(missingKeys(i)))
      i += 1
    }
    println(map1.getClass.getName)
    println("equalsCounter = " + equalsCounter)
    println("hashCodeCounter = " + hashCodeCounter)
  }

  var equalsCounter = 0
  var hashCodeCounter = 0
  class Tracer[A](val a: A) {
    override def equals(obj: scala.Any): Boolean = {
      obj match {
        case t1: Tracer[_] =>
          equalsCounter +=1
          a == t1.a
        case _ => false
      }
    }
    override def hashCode(): Int = {
      hashCodeCounter += 1
      a.##
    }
  }
}

scala.collection.immutable.ChampHashMap
equalsCounter = 2342
hashCodeCounter = 20000
scala.collection.immutable.HashMap$HashTrieMap
equalsCounter = 16
hashCodeCounter = 20000

This test uses poorly distributed hash codes of small numbers. ChampHashMap (like immutable.HashMap before it) uses the folllowing hash code improvement:

  def improve(hcode: Int): Int = {
    var h: Int = hcode + ~(hcode << 9)
    h = h ^ (h >>> 14)
    h = h + (h << 4)
    h ^ (h >>> 10)
  }

  def computeHash(key: Any): Int =
    improve(key.##)

Contrast this with the version used in mutable.HashMap:

    /**
      * [ ... ]
      * The goal is to distribute across bins as well as possible even if a hash code has low entropy at some bits.
      * <p/>
      * OLD VERSION - quick, but bad for sequence 0-10000 - little entropy in higher bits - since 2003
      * {{{
      * var h: Int = hcode + ~(hcode << 9)
      * h = h ^ (h >>> 14)
      * h = h + (h << 4)
      * h ^ (h >>> 10)
      * }}}
      * the rest of the computation is due to SI-5293
      */
    protected final def improve(hcode: Int, seed: Int): Int = rotateRight(byteswap32(hcode), seed)

Those calls to equals occur in map1.contains(missingKeys(i)).

`ChampHashMap` calls `key.equals(existingKey)` even those the hash codes differ.

map1.contains(new Tracer(10283)), which calls Tracer(10283).equals(Tracer(4634.0)) (Tracer(4634.0) is an existing element in the map).

Notice the similarities in the hash codes:

scala> improve(4634.0.##).toBinaryString
res14: String = 11111101011001010110111001011000

scala> improve(10283.##).toBinaryString
res15: String = 11111010010100000110111001011000

scala> (improve(4634.0.##) ^  improve(10283.##)).toBinaryString
res16: String = 111001101010000000000000000

Leads to:

Comparison with `immutable.HashMap`

The old implementation first checked that the hash code of the located tree node is identical to the hash code of the key:

The text was updated successfully, but these errors were encountered:

retronym · 2018-06-22T04:10:11Z

@msteindorfer Is this a fundamental trade-off made by the CHAMP design, or a bug in the current implementation?

Your paper discusses this different reasons that lookup could be slower, but haven't found a discussion of how many calls to == are needed.

We have increased data locality at the cost of a more expensive
index calculation that requires an extra bitmap and
additional bit-level operations. This directly influences the
runtime performance of lookup, insertion, and deletion. Each
of these operations now requires a case distinction with, in
worst case, two separate lookups in the two distinct bitmaps,
to decide in which group an element is present. Because we
compactly store both groups in a single array, we need to
perform offset-based indexing when accessing the second
group of sub-node references. Both the offset and the index
calculations require more bit-level operations on the datamap
and nodemap, as explained below.

retronym · 2018-06-22T04:35:25Z

I suppose we'd need to add a hashes: Array[Int] to MapNode to store the hash codes. With a bit of work, this could be compacted a bit to just store the part of the hash code that isn't encoded implicitly in the address of the node within the tree (IIUC, that encodes 5 bits per level).

msteindorfer · 2018-06-22T12:16:30Z

Hi @retronym, there a discussion in the paper how many calls to hash code and == are made, also in comparison to how it scores to Scala. The section is about caching / memoizing hash codes of keys; there I'm comparing CHAMP and MEMCHAMP (the latter is CHAMP memory layout with cached key hashes similar as Scala does).

The higher amount of == come from the negative lookups, that's the only operation where CHAMP has a slight disadvantage over the previous implementation.

I suppose we'd need to add a hashes: Array[Int] to MapNode to store the hash codes. With a bit of work, this could be compacted a bit to just store the part of the hash code that isn't encoded implicitly in the address of the node within the tree (IIUC, that encodes 5 bits per level).

There's also a reference implementation of MEMCHAMP available that shows how to add the hashes array. However, you cannot restrict the information to 5 bits per level, you'll need to store full 32 bit integers at each level, since nodes can move levels in the tree (or alternatively rely on rehashing when they move).

I hope that's enough pointers to get you started. I'm currently at PLDI and try get back to you later if you have remaining questions.

retronym · 2018-07-25T07:04:36Z

Here's my WIP to implement hash code caching.

https://github.com/scala/scala/compare/c0b038c...retronym:topic/champ-hash-code-cache?expand=1

retronym · 2018-07-25T22:38:53Z

That patch is still buggy.

Intermittently, e.g.:

[error] Test scala.tools.nsc.backend.jvm.opt.InlinerTest.tupleElimination failed: java.lang.AssertionError: assertion failed:
[error] Expected: List(Op(ICONST_3), Op(ICONST_4), Op(IADD), Op(IRETURN))
[error] Actual  : List(TypeOp(NEW, scala/Tuple2$mcII$sp), Op(DUP), Op(ICONST_3), Op(ICONST_4), VarOp(ISTORE, 3), VarOp(ISTORE, 2), VarOp(ILOAD, 2), VarOp(ILOAD, 3), Invoke(INVOKESPECIAL, scala/Tuple2$mcII$sp, <init>, (II)V, false), VarOp(ASTORE, 1), VarOp(ILOAD, 2), VarOp(ALOAD, 0), VarOp(ALOAD, 1), Invoke(INVOKEVIRTUAL, C, t_2, (Lscala/Tuple2;)Ljava/lang/Object;, false), Invoke(INVOKESTATIC, scala/runtime/BoxesRunTime, unboxToInt, (Ljava/lang/Object;)I, false), Op(IADD), Op(IRETURN)), took 0.212 sec
[error]     at scala.Predef$.assert(Predef.scala:276)
[error]     at scala.tools.testing.BytecodeTesting$.assertSameCode(BytecodeTesting.scala:220)
[error]     at scala.tools.testing.BytecodeTesting$.assertSameCode(BytecodeTesting.scala:218)
[error]     at scala.tools.nsc.backend.jvm.opt.InlinerTest.tupleElimination(InlinerTest.scala:1359)
[error]     ...
[info] Test scala.tools.nsc.backend.jvm.opt.InlinerTest.inlineNoInlineOverride started

retronym changed the title ~~Investigate ChampHashMap performance regression with small-integer hash codes~~ Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a colliding hash code Jun 22, 2018

retronym added t:collections prio:blocker labels Jun 22, 2018

retronym self-assigned this Jun 22, 2018

retronym added this to the 2.13 milestone Jun 22, 2018

SethTisue mentioned this issue Jun 27, 2018

Make sure HashSet is at least as performant as in the old collections scala/collection-strawman#442

Closed

retronym mentioned this issue Jul 26, 2018

Cache full key hash codes in ChampMap scala/scala#6975

Merged

2 tasks

retronym modified the milestones: 2.13.1, 2.13.0-M5 Aug 8, 2018

szeiger closed this as completed in scala/scala#6975 Aug 14, 2018

joshlemer mentioned this issue Jan 18, 2019

VectorMap should cooperate with im/mutable HashMap/Sets/VectorMaps during addAll/removeAll/concat/removedAll etc. scala/bug#11326

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a partially colliding hash code #525

Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a partially colliding hash code #525

retronym commented Jun 22, 2018 •

edited

Loading

retronym commented Jun 22, 2018

retronym commented Jun 22, 2018

msteindorfer commented Jun 22, 2018

retronym commented Jul 25, 2018 •

edited

Loading

retronym commented Jul 25, 2018

Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a partially colliding hash code #525

Investigate ChampHashMap performance regression: more calls to equals when looking up an missing element with a partially colliding hash code #525

Comments

retronym commented Jun 22, 2018 • edited Loading

ChampHashMap calls key.equals(existingKey) even those the hash codes differ.

Comparison with immutable.HashMap

retronym commented Jun 22, 2018

retronym commented Jun 22, 2018

msteindorfer commented Jun 22, 2018

retronym commented Jul 25, 2018 • edited Loading

retronym commented Jul 25, 2018

retronym commented Jun 22, 2018 •

edited

Loading

`ChampHashMap` calls `key.equals(existingKey)` even those the hash codes differ.

Comparison with `immutable.HashMap`

retronym commented Jul 25, 2018 •

edited

Loading