This repository has been archived by the owner on Dec 22, 2021. It is now read-only.

New immutable hash set and map #342

Merged: 1 commit, Feb 5, 2018


msteindorfer
Contributor

This pull request contains reimplementations of immutable HashSet and HashMap using Compressed Hash-Array Mapped Prefix-trees (CHAMP). A prototype and preliminary evaluation of CHAMP data structures in collection-strawman was already discussed in issue #192.

The new implementations (ChampHashSet and ChampHashMap) currently exist next to the HashMap and HashSet. By default, immutable.Map and immutable.Set now pick up the CHAMP versions, but I also implemented a JVM flag (-Dstrawman.collection.immutable.useBaseline=true) to default to the current HashSet and HashMap implementations. I assume that only one hash-map/set will ship in the final version of the collections, but for now a flag helps with comparing the trade-offs and performance characteristics of the current and the new data structures.
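
As a rough illustration of how such a switch could be read at runtime, here is a minimal sketch. The property name is the one given above; `BaselineFlag`, `Key` and `parse` are hypothetical names, and the parsing rule (only an explicit "true" selects the baseline) is an assumption, not necessarily what the PR does.

```scala
object BaselineFlag {
  // Property name taken from the description above.
  final val Key = "strawman.collection.immutable.useBaseline"

  // Hypothetical helper: only an explicit "true" selects the baseline implementations.
  def parse(raw: Option[String]): Boolean =
    raw.exists(_.equalsIgnoreCase("true"))

  // Reads the JVM system property, e.g. set with -Dstrawman.collection.immutable.useBaseline=true.
  def useBaseline: Boolean = parse(sys.props.get(Key))
}
```

Keeping the parsing separate from the property lookup makes the rule easy to check without touching global JVM state.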

The data structures contained in this pull request represent a first basic re-implementation of HashSet and HashMap. They should be functionally complete at this point; however, there are still parts and operations that can benefit from (further) optimizations. I plan to continue working on those parts that still need attention, but in the meantime I request a review of the current state.

Preliminary performance numbers of the new CHAMP data structures were presented in issue #192; those characteristics didn't change with the Scala re-implementation. (I can post further results here upon request, but you might want to give them a spin yourself.) Overall, the CHAMP data structures significantly lower memory footprints and significantly improve all iteration-based operations and equality checks. Lookups slow down, but insertion and deletion seem to benefit.

Note that the CHAMP design / implementation differs from the current hashed data structures by not memoizing the hash codes of the individual elements (which may change the performance of certain workloads). If necessary, CHAMP's design allows memoization of the element hash codes to be added modularly (at the expense of some memory savings); details are discussed in the OOPSLA'15 paper. I plan to further explore the memoized approach in the context of collection-strawman in the near future.

I'm looking forward to receiving feedback on the current implementation, to further improve the code based on your reviews.

@msteindorfer
Contributor Author

Seems that I still have to sign the CLA, in order to make the Travis build happy; will do so later on.

Contributor

@julienrf left a comment

Thanks a lot @msteindorfer!

This looks great! I haven’t reviewed the implementation details specific to the CHAMP data structure, but only the way you use the collections framework.

*/

@SerialVersionUID(2L)
private[immutable] sealed case class ChampHashMap[K, +V](val rootNode: MapNode[K, V], val cachedJavaHashCode: Int, val cachedSize: Int)
Contributor

I would be in favor of making the type public. (the primary constructor can stay private, though)

Contributor

Also, it should not be a case class, just a plain class, because we don’t want to have the case class generated equals and hashCode.

Contributor Author

Done.

with StrictOptimizedIterableOps[(K, V), Iterable /* ChampHashMap */, ChampHashMap[K, V]]
with Serializable {

override def iterableFactory = List
Contributor

In my opinion it’s better to omit the override keyword when we only “implement” a member. That prevents us from mistakenly overriding something.
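
To illustrate the point with a small, self-contained sketch (`Shape` and `Square` are made-up names, unrelated to the collections code):

```scala
trait Shape {
  def area: Double            // abstract member
  def name: String = "shape"  // concrete member
}

final class Square(side: Double) extends Shape {
  // Implementing an abstract member: `override` is optional, so it is left out.
  // If `area` ever became concrete in Shape, this line would stop compiling
  // instead of silently replacing the inherited behaviour.
  def area: Double = side * side

  // Replacing a concrete member: `override` is mandatory here.
  override def name: String = "square"
}
```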

Contributor Author

Done.

case _ => super.equals(that)
}

override def hashCode(): Int = collection.Set.unorderedHash(toIterable, "Map".##)
Contributor

We should already inherit the same implementation from collection.Map, so this line seems unnecessary.

Contributor Author

Done.


val BitPartitionMask = (1 << BitPartitionSize) - 1

val MaxDepth = ceil(HashCodeLength.asInstanceOf[Double] / BitPartitionSize).asInstanceOf[Int]
Contributor

ceil(HashCodeLength.toDouble / BitPartitionSize).toInt?
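
For reference, a minimal sketch of the suggested form. The values 32 and 5 are assumptions (32-bit hash codes split into 5-bit chunks); the diff excerpt here does not show the actual definitions, and `TrieConstants` is a made-up name.

```scala
object TrieConstants {
  // Assumed values, not taken from the diff.
  final val HashCodeLength = 32
  final val BitPartitionSize = 5

  // Numeric conversions instead of casts: toDouble / toInt.
  val MaxDepth: Int = math.ceil(HashCodeLength.toDouble / BitPartitionSize).toInt

  // Integer-only alternative that avoids floating point altogether.
  val MaxDepthIntOnly: Int = (HashCodeLength + BitPartitionSize - 1) / BitPartitionSize
}
```

With the assumed values both expressions evaluate to 7.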

Contributor Author

Done.


val SizeMoreThanOne = 2

def $mask(hash: Int, shift: Int): Int = (hash >>> shift) & BitPartitionMask
Contributor

It is not a good idea to start an identifier with the $ character, because the Scala compiler also uses this prefix for compiler-generated identifiers.

Contributor Author

The $ prefix was a workaround for compiler issues that I had, but maybe you can recommend a better solution.

Without the $ prefix, the statement val mask = mask(elementHash, shift) will fail compilation:

[error] ChampHashSet.scala:160: recursive value mask needs type
[error]     val mask = mask(elementHash, shift)
[error]                ^
[error] one error found

Even when using a type annotation as in val mask: Int = mask(elementHash, shift), the compiler cannot infer that I want to call the mask function:

[error] ChampHashSet.scala:160: Int does not take parameters
[error]     val mask: Int = mask(elementHash, shift)
[error]                         ^
[error] one error found

Any suggestions apart from fully qualifying the method (e.g., with SetNode.mask)?

Contributor

One solution is to use a different name for $mask but a name that has no dollar sign in it. Like maskValue, for instance. Otherwise the solution that consists in fully qualifying the method is good too.
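
Both options can be sketched as follows. The errors above arise because a local `val mask` is already in scope inside its own initializer, so an unqualified `mask(...)` refers to the val rather than the method. `SetNodeSketch`, `maskFrom` and the two `index…` methods are hypothetical names, and the partition size of 5 is an assumption.

```scala
object SetNodeSketch {
  final val BitPartitionSize = 5 // assumed value
  final val BitPartitionMask = (1 << BitPartitionSize) - 1

  // Option 1: give the method a name that cannot clash with the local val.
  def maskFrom(hash: Int, shift: Int): Int = (hash >>> shift) & BitPartitionMask

  def indexRenamed(elementHash: Int, shift: Int): Int = {
    val mask = maskFrom(elementHash, shift)
    mask
  }

  // Option 2: keep the name `mask` and qualify the call so that it
  // resolves to the method instead of the local val being defined.
  def mask(hash: Int, shift: Int): Int = (hash >>> shift) & BitPartitionMask

  def indexQualified(elementHash: Int, shift: Int): Int = {
    val mask = SetNodeSketch.mask(elementHash, shift)
    mask
  }
}
```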

with StrictOptimizedIterableOps[A, ChampHashSet, ChampHashSet[A]]
with Serializable {

def iterableFactory = ChampHashSet
Contributor

Would you mind adding an explicit type annotation?

def iterableFactory: IterableFactory[ChampHashSet] = ChampHashSet

Contributor Author

Done.

case _ => false
}

override def hashCode(): Int = collection.Set.unorderedHash(toIterable, "Set".##)
Contributor

This is not necessary.

Contributor Author

Agree. I'm removing it here, but typically when overriding equals I also add an explicit override of hashCode to avoid hash-code contract related issues. This is also advised in http://www.artima.com/pins1ed/object-equality.html.

Contributor Author

Done.

(this eq set) ||
(set canEqual this) &&
(toIterable.size == set.size) &&
(this subsetOf set)
Contributor

This branch is not necessary (provided that you don’t use a case class anymore)


val BitPartitionMask = (1 << BitPartitionSize) - 1

val MaxDepth = ceil(HashCodeLength.asInstanceOf[Double] / BitPartitionSize).asInstanceOf[Int]
Contributor

Here too you should use toDouble and toInt rather than asInstanceOf.

You could also move the code that ChampHashMap and ChampHashSet have in common into a separate place so that it is not repeated (like we do with the Hashing thing).

Contributor Author

Thanks for the hint. I was aware that there's similar code between both implementations that should be factored out. It's still on my TODO list. However, I preferred to expose the current implementation earlier to the community to get feedback.

(this.content.size == node.content.size) &&
(this.content.filterNot(element0 => node.content.contains(element0))).isEmpty
case _ => false
}
Contributor

Where is this used?

Contributor Author

Are you referring to the equals method? It's part of the structural equality implementation and gets invoked in the ChampHashSet#equals in the expression (this.rootNode == set.rootNode). So, whenever you compare two sets that have hash collisions, then this method is executed.

Contributor

Yes, the equals method. Thanks for the clarification. So why don’t you need to also implement hashCode in that case?

Contributor Author

BitmapIndexedSetNode and HashCollisionSetNode are private to the file and are never hashed or put into any hashed data structures; that's why I omitted the hashCode implementation. These classes represent the trie encoding of the data structure. On the other hand, equals is called, as shown before, to compare two trie data structures faster.

Contributor

Excellent, thanks for the clarification! Maybe you can put a comment in the file about that?

Contributor Author

Done.

@julienrf
Contributor

@msteindorfer you will also have to sign the Scala CLA: https://travis-ci.org/scala/collection-strawman/builds/329365881#L464-L469

sderosiaux added a commit to sderosiaux/every-single-day-i-tldr that referenced this pull request Jan 16, 2018
def empty[K, V]: MapNode[K, V] =
EmptyMapNode.asInstanceOf[MapNode[K, V]]

val TupleLength = 2
Member

We should not expose these constants publicly. They could also be made into compile time constants.

private final val TupleLength = 2 // final, no type annotation, callers will constant fold this in.
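
A small sketch of the difference, for illustration only (`MapNodeSketch` and the method names are hypothetical): a `final val` with a literal right-hand side and no type annotation is a constant value definition that the compiler may inline at use sites, whereas an annotated val is an ordinary field.

```scala
object MapNodeSketch {
  // Constant value definition: final, no type annotation, literal right-hand side.
  private final val TupleLength = 2

  // With a type annotation this is a regular field, not a compile-time constant.
  private val AnnotatedTupleLength: Int = 2

  // Both give the same result; only the first is eligible for constant folding.
  def payloadSlots(payloadArity: Int): Int = TupleLength * payloadArity
  def payloadSlotsAnnotated(payloadArity: Int): Int = AnnotatedTupleLength * payloadArity
}
```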

Contributor Author

Thanks. I "finalized" all the constants. (I assumed that vals in companion objects are final anyway, but it seems I was wrong there.)

I agree in not exposing these constants, but then it would suffice to make the companion object private (e.g., private[immutable] object MapNode {}) instead of each constant, right?

@szeiger szeiger requested a review from Ichoran January 17, 2018 15:55
@msteindorfer
Contributor Author

Thanks @julienrf and @retronym for early comments; I'll factor those in.

@julienrf
Contributor

It looks like Dotty doesn’t like what you wrote :( https://travis-ci.org/scala/collection-strawman/builds/330072462#L1269

@msteindorfer
Contributor Author

@julienrf how can I build collection-strawman locally against dotty to work out the quirks? What's the corresponding sbt command?

@julienrf
Contributor

Modify the build.sbt file to use dotty by default:

scalaVersion := dotty.value // instead of "2.13.0-M2"

@julienrf
Contributor

That being said, it’s likely to be a bug in Dotty (the compiler crashed instead of reporting an error)

with StrictOptimizedIterableOps[(K, V), Iterable /* ChampHashMap */, ChampHashMap[K, V]]
with Serializable {

def iterableFactory: IterableFactoryLike[List] = List
Contributor

I would use the default immutable.Iterable here:

def iterableFactory: IterableFactory[Iterable] = Iterable

(currently the default Iterable happens to be List but that could change…)

(this eq node) ||
(this.hash == node.hash) &&
(this.content.size == node.content.size) &&
(this.content.filterNot(keyValueTuple => node.content.contains(keyValueTuple))).isEmpty
Contributor

This could be optimized even further by using forall:

this.content.forall(node.content.contains)
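
A self-contained sketch of the two variants, using plain `Vector`s as a stand-in for the collision node's `content` (the object and method names are made up). `forall` stops at the first missing element and allocates no intermediate collection, while `filterNot` always builds one. The size-plus-subset check only implies equality when neither side contains duplicates, which is assumed here.

```scala
object CollisionEquality {
  // Original shape: builds an intermediate collection, then checks that it is empty.
  def sameElementsViaFilter[A](xs: Vector[A], ys: Vector[A]): Boolean =
    xs.size == ys.size && xs.filterNot(x => ys.contains(x)).isEmpty

  // Suggested shape: short-circuits on the first element missing from ys.
  def sameElementsViaForall[A](xs: Vector[A], ys: Vector[A]): Boolean =
    xs.size == ys.size && xs.forall(x => ys.contains(x))
}
```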

(this eq node) ||
(this.hash == node.hash) &&
(this.content.size == node.content.size) &&
(this.content.filterNot(element0 => node.content.contains(element0))).isEmpty
Contributor

You can apply the same optimization here.

@julienrf
Contributor

Great, now the CI is green!

I found a couple more issues or possible improvements.

Then, to merge it we need the following:

  • replace the dollar prefixed identifiers with something else (e.g. mask0, mkMask, etc.)
  • add unit tests (in the test/junit/ module)
  • squash all the commits into a single one

Does that work for you?

@msteindorfer
Contributor Author

@julienrf, yes, that works for me. I'll probably go over it in the weekend, and also apply some further enhancements by factoring out commonalities.

Regarding the tests, I have to see how best to integrate them with collection-strawman. I currently test the data structures with a property-based test suite that lives outside strawman (i.e., one that is based on Capsule's tests). I have to see whether to port them to strawman or reference them as an external dependency in the testing phase.

@julienrf
Contributor

You can straightforwardly port the junit tests to our test/junit module.

For the property based tests, we already have some tests that use scalacheck, in the test/scalacheck module, you can try to port your tests in that module, and I can help you if needed.

@fommil fommil mentioned this pull request Jan 21, 2018
@msteindorfer
Contributor Author

@julienrf the pull request now also contains the junit and scalacheck tests. Regarding squashing of commits: do I really have to do it myself? As far as I know, the person who merges a PR in GitHub can choose to squash all commits into one.

@adriaanm

Over in scala/scala, we typically ask that you squash and reformat / copy-edit the commit message so that it reads nicely. This is super important especially for new, tricky stuff. The maintainers send their thanks from the future.

@msteindorfer
Contributor Author

@adriaanm understood; makes sense. Then I'll take care myself of squashing the commits and preparing a nice and elaborate commit message.

this.hash == hash && content.find(key == _._1).isDefined

override def contains[V1 >: V](key: K, value: V1, hash: Int, shift: Int): Boolean =
this.hash == hash && content.find(payload => key == payload._1 && value == payload._2).isDefined
Contributor

You can simplify find(p).isDefined by exists(p)
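
For instance, with a small stand-in for the collision node's `content` (the object, the sample data and the method names are illustrative only):

```scala
object CollisionLookup {
  // Stand-in for the key/value payload of a hash-collision node.
  val content: Vector[(String, Int)] = Vector("a" -> 1, "b" -> 2)

  // Original shape: find(p).isDefined
  def containsKeyVerbose(key: String): Boolean =
    content.find(key == _._1).isDefined

  // Simplified shape: exists(p)
  def containsKey(key: String): Boolean =
    content.exists(key == _._1)

  def containsEntry(key: String, value: Int): Boolean =
    content.exists { case (k, v) => key == k && value == v }
}
```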


def tuple[KV](keyValue: KV): (KV, KV) = tuple(keyValue, keyValue)

def tuple[K, V](key: K, value: V): (K, V) = Tuple2.apply(key, value)
Contributor

Why don’t you just write (key, value) to make a tuple? Alternatively, you can use the key -> value syntax.


override def equals(other: Any): Boolean = other match {
case that: DummyValue =>
(this eq that) || hash == that.hash && value == that.value
Contributor

Alternatively, you can define DummyValue as a case class and not worry about hashCode and equals (they will be defined to implement value equality).

BTW, in your implementation DummyValue(1, 2) != DummyValue(2, 2) although DummyValue(1, 2).## == DummyValue(2, 2).##, is that expected?

Contributor Author

Thanks for the question; yes, this is intended behaviour in order to create values for testing hash collisions. I documented my intent and renamed the class to CustomHashInt in order to make this clear.
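
A minimal sketch of such a test value, reconstructed from the equals fragment shown above (the exact class in the PR may differ): the hash code is supplied by the caller, so distinct values can be forced into the same bucket.

```scala
// Test helper sketch: `hash` is chosen by the test, independently of `value`.
final class CustomHashInt(val value: Int, val hash: Int) {
  override def hashCode: Int = hash

  override def equals(other: Any): Boolean = other match {
    case that: CustomHashInt =>
      (this eq that) || (hash == that.hash && value == that.value)
    case _ => false
  }
}
```

Two instances with different values but the same `hash` are unequal yet collide, which is the situation a hash-collision node has to handle.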

private def convertToScalaMapAndCheckHashCode(input: ChampHashMap[K, V]) =
HashMap.from(input).hashCode == input.hashCode

property("convertToJavaMapAndCheckEquality") = forAll { (input: ChampHashMap[K, V]) =>
Contributor

Your implementation does not convert to Java

property("containsAfterInsert") = forAll { (inputValues: HashMap[K, V]) =>
var constructedMap = ChampHashMap.empty[K, V]

inputValues.foreach(item => constructedMap += item)
Contributor

You could also write:

val constructedMap = ChampHashMap.empty[K, V] ++ inputValues

var constructedMap = ChampHashMap.empty[K, V]

inputValues.foreach(item => constructedMap += item)
inputValues.forall(keyValueTuple => constructedMap.get(keyValueTuple._1).get == keyValueTuple._2)
Contributor

Instead of myOption.get == something you can write myOption.contains(something):

inputValues.forall { case (key, value) => constructedMap.get(key).contains(value) }
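
Putting the two suggestions together, the property body could look roughly like this. The sketch uses the standard immutable `Map` as a stand-in for `ChampHashMap`, and `RoundTrip` / `containsAfterInsert` are illustrative names.

```scala
object RoundTrip {
  def containsAfterInsert[K, V](inputValues: Map[K, V]): Boolean = {
    // Build the map in one expression instead of a var plus foreach.
    val constructedMap = Map.empty[K, V] ++ inputValues

    // Option#contains avoids the unsafe .get and reads as a single check.
    inputValues.forall { case (key, value) => constructedMap.get(key).contains(value) }
  }
}
```

Unlike `.get == value`, `contains` simply yields false for a missing key instead of throwing.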

Contributor

@julienrf left a comment

Hey Michael, thanks for the update! I’ve read again your code and added some more comments (hopefully that’s the last time!).


protected[this] def newSpecificBuilder(): Builder[(K, V), ChampHashMap[K, V]] = ChampHashMap.newBuilder()

override def knownSize: Int = cachedSize
Contributor

You might also want to override size (the default implementation tests whether knownSize is defined or not, so you would just save that comparison…)

Contributor

Also, isEmpty and nonEmpty could be overridden for performance too (the default implementation creates an Iterator and then checks whether that iterator is empty or not)
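
Something along these lines, sketched against the Scala 2.13 standard library rather than the strawman traits (`SizedBag` is a made-up class; the real overrides would live in ChampHashMap):

```scala
// Assumes Scala 2.13 or later, where Iterable declares knownSize.
final class SizedBag[A](elems: Vector[A]) extends Iterable[A] {
  private val cachedSize: Int = elems.length

  def iterator: Iterator[A] = elems.iterator

  // All three read the cached field directly and never create an iterator.
  override def knownSize: Int = cachedSize
  override def size: Int = cachedSize
  override def isEmpty: Boolean = cachedSize == 0
}
```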


assert(TupleLength * payloadArity + nodeArity == content.length)
assert(Range(0, TupleLength * payloadArity).forall(i => !content(i).isInstanceOf[MapNode[_, _]]))
assert(Range(TupleLength * payloadArity, content.length).forall(i => content(i).isInstanceOf[MapNode[_, _]]))
Contributor

These assertions will be evaluated by user code unless they use a specific compiler option. Is it possible to move them to tests instead?

}

@SerialVersionUID(2L)
private[this] final class BitmapIndexedMapNode[K, +V](val dataMap: Int, val nodeMap: Int, val content: Array[Any]) extends MapNode[K, V] {
Contributor

I’m surprised that this code compiles, I have no idea what this is bound to here…

@msteindorfer
Contributor Author

@julienrf I addressed the comments you made above, added further performance optimizations, and also squashed all commits of this PR into a single commit with a descriptive commit message summarizing this effort. I'm looking forward to having this PR merged.

ys = ys.init
}
}
// // TODO: currently disabled, since it does not finish
Contributor

BTW do you have any idea on why this benchmark doesn’t finish?

Contributor Author

@msteindorfer, Jan 30, 2018

Yes, I do. It's a performance issue since the code calls twice per iteration last (once explicitly, and one time hidden in init) and last causes an iteration through the whole collection (front-to-back). This performance issue is solved with the CHAMP data structures, since I implemented a backwards iterator that avoids iterating front-to-back for obtaining the last element.
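
The shape of the problem can be sketched with a plain `List` as a stand-in (the benchmark itself operates on the hash collections; `DrainFromBack` and its methods are illustrative names). When `last` and `init` each walk the whole collection, a loop that calls both once per element does quadratic work overall.

```scala
object DrainFromBack {
  // Mirrors the benchmark loop: last and init are both O(n) on a List,
  // so draining n elements this way costs O(n^2).
  def drain[A](xs: List[A]): List[A] = {
    var ys = xs
    val out = List.newBuilder[A]
    while (ys.nonEmpty) {
      out += ys.last
      ys = ys.init
    }
    out.result()
  }

  // Same result in a single backwards pass.
  def drainReversed[A](xs: List[A]): List[A] = xs.reverseIterator.toList
}
```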

Contributor Author

Do you guys run the benchmarks regularly? I wondered why this problem wasn't spotted earlier on.

Contributor

No, we run them occasionally, and we always select a subset of the benchmarks to run.

@julienrf
Contributor

I’ve run the benchmarks comparing ChampHashSet, the current HashSet and the new (strawman) HashSet. I don’t see a significant performance improvement with ChampHashSet. It seems that with a small number of elements the operations are even significantly slower (at least 15%). The operations that were benchmarked are: +, ++, contains and foreach (lower is better, ChampHashSet is in blue):

[Charts, one per benchmark: expand_incl, expand_concat, access_contains, traverse_foreach]

Sorry, the resolution of these charts is not great, and sometimes we have huge confidence intervals that make it even harder to compare the lines. So, here are the raw numbers:

[info] ChampHashSetBenchmark.access_contains         0  avgt    8          6.236 ±        0.032  ns/op
[info] ChampHashSetBenchmark.access_contains         1  avgt    8          6.853 ±        0.111  ns/op
[info] ChampHashSetBenchmark.access_contains         2  avgt    8          6.986 ±        0.046  ns/op
[info] ChampHashSetBenchmark.access_contains         3  avgt    8          7.165 ±        0.075  ns/op
[info] ChampHashSetBenchmark.access_contains         4  avgt    8          6.861 ±        0.064  ns/op
[info] ChampHashSetBenchmark.access_contains         7  avgt    8          6.900 ±        0.050  ns/op
[info] ChampHashSetBenchmark.access_contains         8  avgt    8          6.992 ±        0.034  ns/op
[info] ChampHashSetBenchmark.access_contains        15  avgt    8          7.265 ±        0.079  ns/op
[info] ChampHashSetBenchmark.access_contains        16  avgt    8          7.288 ±        0.071  ns/op
[info] ChampHashSetBenchmark.access_contains        17  avgt    8          7.215 ±        0.122  ns/op
[info] ChampHashSetBenchmark.access_contains        39  avgt    8          8.409 ±        0.211  ns/op
[info] ChampHashSetBenchmark.access_contains       282  avgt    8         12.589 ±        0.066  ns/op
[info] ChampHashSetBenchmark.access_contains      4096  avgt    8         22.031 ±        0.246  ns/op
[info] ChampHashSetBenchmark.access_contains    131070  avgt    8         79.957 ±        7.523  ns/op
[info] ChampHashSetBenchmark.access_contains   7312102  avgt    8        158.706 ±        2.592  ns/op
[info] ChampHashSetBenchmark.expand_concat           0  avgt    8        107.528 ±        0.219  ns/op
[info] ChampHashSetBenchmark.expand_concat           1  avgt    8        106.645 ±        1.101  ns/op
[info] ChampHashSetBenchmark.expand_concat           2  avgt    8        107.542 ±        1.033  ns/op
[info] ChampHashSetBenchmark.expand_concat           3  avgt    8        109.034 ±        1.201  ns/op
[info] ChampHashSetBenchmark.expand_concat           4  avgt    8        171.628 ±        1.813  ns/op
[info] ChampHashSetBenchmark.expand_concat           7  avgt    8        171.738 ±        1.633  ns/op
[info] ChampHashSetBenchmark.expand_concat           8  avgt    8        184.490 ±        1.996  ns/op
[info] ChampHashSetBenchmark.expand_concat          15  avgt    8        176.020 ±        0.716  ns/op
[info] ChampHashSetBenchmark.expand_concat          16  avgt    8        178.534 ±        1.900  ns/op
[info] ChampHashSetBenchmark.expand_concat          17  avgt    8        175.380 ±        0.187  ns/op
[info] ChampHashSetBenchmark.expand_concat          39  avgt    8        239.542 ±        5.136  ns/op
[info] ChampHashSetBenchmark.expand_concat         282  avgt    8        163.482 ±        2.445  ns/op
[info] ChampHashSetBenchmark.expand_concat        4096  avgt    8        370.849 ±        4.703  ns/op
[info] ChampHashSetBenchmark.expand_concat      131070  avgt    8      16152.696 ±      386.466  ns/op
[info] ChampHashSetBenchmark.expand_concat     7312102  avgt    8    3777731.790 ±   144570.973  ns/op
[info] ChampHashSetBenchmark.expand_incl             0  avgt    8        100.942 ±        9.752  ns/op
[info] ChampHashSetBenchmark.expand_incl             1  avgt    8        108.150 ±        3.199  ns/op
[info] ChampHashSetBenchmark.expand_incl             2  avgt    8        106.084 ±       15.663  ns/op
[info] ChampHashSetBenchmark.expand_incl             3  avgt    8        103.523 ±        9.471  ns/op
[info] ChampHashSetBenchmark.expand_incl             4  avgt    8        106.206 ±       14.739  ns/op
[info] ChampHashSetBenchmark.expand_incl             7  avgt    8        104.710 ±        2.238  ns/op
[info] ChampHashSetBenchmark.expand_incl             8  avgt    8        107.246 ±        1.771  ns/op
[info] ChampHashSetBenchmark.expand_incl            15  avgt    8        107.552 ±        2.219  ns/op
[info] ChampHashSetBenchmark.expand_incl            16  avgt    8        101.858 ±        9.444  ns/op
[info] ChampHashSetBenchmark.expand_incl            17  avgt    8        106.726 ±       16.356  ns/op
[info] ChampHashSetBenchmark.expand_incl            39  avgt    8        104.217 ±        9.565  ns/op
[info] ChampHashSetBenchmark.expand_incl           282  avgt    8        118.461 ±       14.059  ns/op
[info] ChampHashSetBenchmark.expand_incl          4096  avgt    8        154.850 ±        3.235  ns/op
[info] ChampHashSetBenchmark.expand_incl        131070  avgt    8        259.687 ±        1.231  ns/op
[info] ChampHashSetBenchmark.expand_incl       7312102  avgt    8        487.682 ±      412.740  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        0  avgt    8         43.745 ±        0.427  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        1  avgt    8         48.500 ±        0.392  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        2  avgt    8         52.881 ±        0.630  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        3  avgt    8         56.015 ±        1.029  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        4  avgt    8         60.206 ±        0.526  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        7  avgt    8         71.548 ±        0.585  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        8  avgt    8         83.231 ±        0.621  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       15  avgt    8        104.153 ±        0.607  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       16  avgt    8        105.566 ±        1.062  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       17  avgt    8        108.205 ±        0.591  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       39  avgt    8        305.665 ±        1.447  ns/op
[info] ChampHashSetBenchmark.traverse_foreach      282  avgt    8       1780.425 ±        5.820  ns/op
[info] ChampHashSetBenchmark.traverse_foreach     4096  avgt    8      83784.526 ±      588.810  ns/op
[info] ChampHashSetBenchmark.traverse_foreach   131070  avgt    8    1461159.588 ±     6765.814  ns/op
[info] ChampHashSetBenchmark.traverse_foreach  7312102  avgt    8  325856087.729 ± 10539975.279  ns/op
[info] HashSetBenchmark.access_contains              0  avgt    8          4.477 ±        0.095  ns/op
[info] HashSetBenchmark.access_contains              1  avgt    8          5.256 ±        0.036  ns/op
[info] HashSetBenchmark.access_contains              2  avgt    8          5.617 ±        0.038  ns/op
[info] HashSetBenchmark.access_contains              3  avgt    8          6.631 ±        0.208  ns/op
[info] HashSetBenchmark.access_contains              4  avgt    8          7.103 ±        0.088  ns/op
[info] HashSetBenchmark.access_contains              7  avgt    8          6.325 ±        0.075  ns/op
[info] HashSetBenchmark.access_contains              8  avgt    8          6.260 ±        0.082  ns/op
[info] HashSetBenchmark.access_contains             15  avgt    8          6.109 ±        0.119  ns/op
[info] HashSetBenchmark.access_contains             16  avgt    8          6.675 ±        0.129  ns/op
[info] HashSetBenchmark.access_contains             17  avgt    8          6.189 ±        0.195  ns/op
[info] HashSetBenchmark.access_contains             39  avgt    8          6.619 ±        0.121  ns/op
[info] HashSetBenchmark.access_contains            282  avgt    8         12.080 ±        0.032  ns/op
[info] HashSetBenchmark.access_contains           4096  avgt    8         28.555 ±        0.327  ns/op
[info] HashSetBenchmark.access_contains         131070  avgt    8         83.053 ±        7.043  ns/op
[info] HashSetBenchmark.access_contains        7312102  avgt    8        170.539 ±       40.044  ns/op
[info] HashSetBenchmark.expand_concat                0  avgt    8         39.395 ±        0.278  ns/op
[info] HashSetBenchmark.expand_concat                1  avgt    8         52.199 ±        0.609  ns/op
[info] HashSetBenchmark.expand_concat                2  avgt    8         67.308 ±        0.878  ns/op
[info] HashSetBenchmark.expand_concat                3  avgt    8         67.835 ±        0.439  ns/op
[info] HashSetBenchmark.expand_concat                4  avgt    8         96.095 ±        1.603  ns/op
[info] HashSetBenchmark.expand_concat                7  avgt    8         95.645 ±        0.364  ns/op
[info] HashSetBenchmark.expand_concat                8  avgt    8         95.974 ±        1.834  ns/op
[info] HashSetBenchmark.expand_concat               15  avgt    8         96.299 ±        0.991  ns/op
[info] HashSetBenchmark.expand_concat               16  avgt    8         96.856 ±        0.451  ns/op
[info] HashSetBenchmark.expand_concat               17  avgt    8         96.476 ±        0.911  ns/op
[info] HashSetBenchmark.expand_concat               39  avgt    8        115.998 ±        1.482  ns/op
[info] HashSetBenchmark.expand_concat              282  avgt    8        141.953 ±        2.515  ns/op
[info] HashSetBenchmark.expand_concat             4096  avgt    8        341.729 ±        5.950  ns/op
[info] HashSetBenchmark.expand_concat           131070  avgt    8      16330.529 ±      477.040  ns/op
[info] HashSetBenchmark.expand_concat          7312102  avgt    8    3803089.961 ±   132253.926  ns/op
[info] HashSetBenchmark.expand_incl                  0  avgt    8         82.069 ±        3.359  ns/op
[info] HashSetBenchmark.expand_incl                  1  avgt    8         84.063 ±        4.646  ns/op
[info] HashSetBenchmark.expand_incl                  2  avgt    8         83.594 ±        1.130  ns/op
[info] HashSetBenchmark.expand_incl                  3  avgt    8         82.564 ±        3.493  ns/op
[info] HashSetBenchmark.expand_incl                  4  avgt    8         83.586 ±        1.594  ns/op
[info] HashSetBenchmark.expand_incl                  7  avgt    8         84.173 ±        5.281  ns/op
[info] HashSetBenchmark.expand_incl                  8  avgt    8         82.272 ±        3.677  ns/op
[info] HashSetBenchmark.expand_incl                 15  avgt    8         83.844 ±        5.754  ns/op
[info] HashSetBenchmark.expand_incl                 16  avgt    8         83.933 ±        3.660  ns/op
[info] HashSetBenchmark.expand_incl                 17  avgt    8         82.993 ±        2.462  ns/op
[info] HashSetBenchmark.expand_incl                 39  avgt    8         84.749 ±        5.665  ns/op
[info] HashSetBenchmark.expand_incl                282  avgt    8         93.434 ±        5.356  ns/op
[info] HashSetBenchmark.expand_incl               4096  avgt    8        150.460 ±        1.181  ns/op
[info] HashSetBenchmark.expand_incl             131070  avgt    8        235.027 ±        7.598  ns/op
[info] HashSetBenchmark.expand_incl            7312102  avgt    8        497.252 ±      653.047  ns/op
[info] HashSetBenchmark.traverse_foreach             0  avgt    8          0.520 ±        0.008  ns/op
[info] HashSetBenchmark.traverse_foreach             1  avgt    8          2.732 ±        0.004  ns/op
[info] HashSetBenchmark.traverse_foreach             2  avgt    8          9.132 ±        0.765  ns/op
[info] HashSetBenchmark.traverse_foreach             3  avgt    8         12.673 ±        0.260  ns/op
[info] HashSetBenchmark.traverse_foreach             4  avgt    8         16.486 ±        0.004  ns/op
[info] HashSetBenchmark.traverse_foreach             7  avgt    8         27.846 ±        0.366  ns/op
[info] HashSetBenchmark.traverse_foreach             8  avgt    8         39.598 ±       11.728  ns/op
[info] HashSetBenchmark.traverse_foreach            15  avgt    8         74.148 ±        0.313  ns/op
[info] HashSetBenchmark.traverse_foreach            16  avgt    8         79.084 ±        0.320  ns/op
[info] HashSetBenchmark.traverse_foreach            17  avgt    8         65.460 ±        0.238  ns/op
[info] HashSetBenchmark.traverse_foreach            39  avgt    8        207.134 ±        7.643  ns/op
[info] HashSetBenchmark.traverse_foreach           282  avgt    8       1576.726 ±      160.751  ns/op
[info] HashSetBenchmark.traverse_foreach          4096  avgt    8     117894.293 ±     5896.952  ns/op
[info] HashSetBenchmark.traverse_foreach        131070  avgt    8    1670029.025 ±    11685.383  ns/op
[info] HashSetBenchmark.traverse_foreach       7312102  avgt    8  476171057.167 ± 14743420.991  ns/op
[info] ScalaHashSetBenchmark.access_contains         0  avgt    8          4.209 ±        0.081  ns/op
[info] ScalaHashSetBenchmark.access_contains         1  avgt    8          4.995 ±        0.047  ns/op
[info] ScalaHashSetBenchmark.access_contains         2  avgt    8          6.032 ±        0.030  ns/op
[info] ScalaHashSetBenchmark.access_contains         3  avgt    8          6.081 ±        0.022  ns/op
[info] ScalaHashSetBenchmark.access_contains         4  avgt    8          6.554 ±        0.085  ns/op
[info] ScalaHashSetBenchmark.access_contains         7  avgt    8          6.690 ±        0.081  ns/op
[info] ScalaHashSetBenchmark.access_contains         8  avgt    8          6.061 ±        0.100  ns/op
[info] ScalaHashSetBenchmark.access_contains        15  avgt    8          6.315 ±        0.035  ns/op
[info] ScalaHashSetBenchmark.access_contains        16  avgt    8          6.515 ±        0.078  ns/op
[info] ScalaHashSetBenchmark.access_contains        17  avgt    8          7.279 ±        0.030  ns/op
[info] ScalaHashSetBenchmark.access_contains        39  avgt    8          7.055 ±        0.034  ns/op
[info] ScalaHashSetBenchmark.access_contains       282  avgt    8         11.902 ±        0.055  ns/op
[info] ScalaHashSetBenchmark.access_contains      4096  avgt    8         25.226 ±        0.944  ns/op
[info] ScalaHashSetBenchmark.access_contains    131070  avgt    8         82.289 ±        2.735  ns/op
[info] ScalaHashSetBenchmark.access_contains   7312102  avgt    8        231.426 ±      326.284  ns/op
[info] ScalaHashSetBenchmark.expand_concat           0  avgt    8         39.597 ±        0.446  ns/op
[info] ScalaHashSetBenchmark.expand_concat           1  avgt    8         71.625 ±        0.812  ns/op
[info] ScalaHashSetBenchmark.expand_concat           2  avgt    8         91.491 ±        0.107  ns/op
[info] ScalaHashSetBenchmark.expand_concat           3  avgt    8         80.431 ±        0.943  ns/op
[info] ScalaHashSetBenchmark.expand_concat           4  avgt    8        114.223 ±        1.234  ns/op
[info] ScalaHashSetBenchmark.expand_concat           7  avgt    8         83.610 ±        0.983  ns/op
[info] ScalaHashSetBenchmark.expand_concat           8  avgt    8        116.004 ±        0.946  ns/op
[info] ScalaHashSetBenchmark.expand_concat          15  avgt    8         86.436 ±        1.466  ns/op
[info] ScalaHashSetBenchmark.expand_concat          16  avgt    8        112.823 ±        1.381  ns/op
[info] ScalaHashSetBenchmark.expand_concat          17  avgt    8        117.358 ±        1.047  ns/op
[info] ScalaHashSetBenchmark.expand_concat          39  avgt    8         88.397 ±        1.081  ns/op
[info] ScalaHashSetBenchmark.expand_concat         282  avgt    8        139.193 ±        0.537  ns/op
[info] ScalaHashSetBenchmark.expand_concat        4096  avgt    8        408.057 ±        6.733  ns/op
[info] ScalaHashSetBenchmark.expand_concat      131070  avgt    8      20127.153 ±       53.960  ns/op
[info] ScalaHashSetBenchmark.expand_concat     7312102  avgt    8    4353952.687 ±   726346.192  ns/op
[info] ScalaHashSetBenchmark.expand_incl             0  avgt    8         90.066 ±        0.998  ns/op
[info] ScalaHashSetBenchmark.expand_incl             1  avgt    8         87.978 ±        1.458  ns/op
[info] ScalaHashSetBenchmark.expand_incl             2  avgt    8         89.384 ±        1.755  ns/op
[info] ScalaHashSetBenchmark.expand_incl             3  avgt    8         89.757 ±        2.066  ns/op
[info] ScalaHashSetBenchmark.expand_incl             4  avgt    8         91.056 ±        2.285  ns/op
[info] ScalaHashSetBenchmark.expand_incl             7  avgt    8         90.586 ±        2.600  ns/op
[info] ScalaHashSetBenchmark.expand_incl             8  avgt    8         89.426 ±        2.117  ns/op
[info] ScalaHashSetBenchmark.expand_incl            15  avgt    8         91.577 ±        2.091  ns/op
[info] ScalaHashSetBenchmark.expand_incl            16  avgt    8         91.270 ±        4.091  ns/op
[info] ScalaHashSetBenchmark.expand_incl            17  avgt    8         88.580 ±        1.009  ns/op
[info] ScalaHashSetBenchmark.expand_incl            39  avgt    8         93.450 ±        1.554  ns/op
[info] ScalaHashSetBenchmark.expand_incl           282  avgt    8        106.064 ±        2.391  ns/op
[info] ScalaHashSetBenchmark.expand_incl          4096  avgt    8        164.663 ±        6.260  ns/op
[info] ScalaHashSetBenchmark.expand_incl        131070  avgt    8        268.381 ±       14.521  ns/op
[info] ScalaHashSetBenchmark.expand_incl       7312102  avgt    8       1623.461 ±     6663.576  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        0  avgt    8          0.555 ±        0.015  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        1  avgt    8          3.231 ±        1.350  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        2  avgt    8          9.641 ±        0.044  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        3  avgt    8         12.636 ±        0.062  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        4  avgt    8         16.511 ±        0.071  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        7  avgt    8         27.772 ±        0.101  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach        8  avgt    8         35.272 ±        9.712  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach       15  avgt    8         58.959 ±        2.570  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach       16  avgt    8         63.977 ±       10.026  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach       17  avgt    8         67.039 ±        0.757  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach       39  avgt    8        206.569 ±        3.630  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach      282  avgt    8       1589.302 ±      145.765  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach     4096  avgt    8     119922.077 ±      472.984  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach   131070  avgt    8    1775649.160 ±    33109.705  ns/op
[info] ScalaHashSetBenchmark.traverse_foreach  7312102  avgt    8  466304250.750 ±  9287956.410  ns/op

@msteindorfer
Contributor Author

@julienrf thanks. Could you also a) run the memory benchmarks, and b) run a more extended set of set operations? I'll report back in more detail later on.

@julienrf
Contributor

julienrf commented Jan 30, 2018

Here is the result of the memory benchmark:

memory-footprint

ChampHashSet is 25% smaller than our current hashset.

@msteindorfer
Contributor Author

@julienrf I started looking into your reports and I'm trying to run and replicate things locally.

So far I've looked into the foreach benchmark, and there the story is simple: HashSet overrides and specializes foreach, but ChampHashSet did not (my oversight, now fixed). The benchmark was therefore comparing a push-based operation (HashSet) against a pull-based operation (ChampHashSet).

After adding an override of foreach to ChampHashSet, the performance characteristics look different: they suggest that ChampHashSet performs equally well on small inputs and improves by approximately 30-50% for larger inputs (not visible in the logarithmic chart, but in the results table below).
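To make the push/pull distinction concrete, here is a minimal sketch. It assumes a simplified, hypothetical `Node` class holding `Long` elements and child nodes; it is not the actual ChampHashSet or HashSet code. In the push-based `foreach` the node drives the traversal and calls `f` directly, while in the pull-based variant the caller drives it through an `Iterator`.

```scala
// Hypothetical, simplified trie node for illustration only;
// the real ChampHashSet internals are laid out differently.
final class Node(val elems: Array[Long], val children: Array[Node]) {

  // Push-based: the node walks its own arrays and calls f on each element.
  def foreach[U](f: Long => U): Unit = {
    var i = 0
    while (i < elems.length) { f(elems(i)); i += 1 }
    var j = 0
    while (j < children.length) { children(j).foreach(f); j += 1 }
  }

  // Pull-based: the caller asks for elements one at a time via hasNext/next.
  def iterator: Iterator[Long] =
    elems.iterator ++ children.iterator.flatMap(_.iterator)
}
```

A default `foreach` that is implemented on top of `iterator` pays for the iterator state on every element, which is the kind of difference the benchmark was picking up.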

traverse_foreach

[info] Benchmark                                (size)  Mode  Cnt          Score          Error  Units
[info] ChampHashSetBenchmark.traverse_foreach        0  avgt    8          1.136 ±        0.044  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        1  avgt    8          5.351 ±        0.129  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        2  avgt    8          9.220 ±        0.153  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        3  avgt    8         12.700 ±        0.143  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        4  avgt    8         19.035 ±        0.200  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        7  avgt    8         26.989 ±        0.813  ns/op
[info] ChampHashSetBenchmark.traverse_foreach        8  avgt    8         30.264 ±        0.610  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       15  avgt    8         55.379 ±        1.762  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       16  avgt    8         58.770 ±        1.089  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       17  avgt    8         62.071 ±        1.220  ns/op
[info] ChampHashSetBenchmark.traverse_foreach       39  avgt    8        211.488 ±        2.246  ns/op
[info] ChampHashSetBenchmark.traverse_foreach      282  avgt    8       1578.036 ±       40.569  ns/op
[info] ChampHashSetBenchmark.traverse_foreach     4096  avgt    8      45923.356 ±     1139.294  ns/op
[info] ChampHashSetBenchmark.traverse_foreach   131070  avgt    8    1282488.890 ±    37963.425  ns/op
[info] ChampHashSetBenchmark.traverse_foreach  7312102  avgt    8  360457156.750 ±  5656166.684  ns/op
[info] HashSetBenchmark.traverse_foreach             0  avgt    8          0.618 ±        0.014  ns/op
[info] HashSetBenchmark.traverse_foreach             1  avgt    8          2.867 ±        0.048  ns/op
[info] HashSetBenchmark.traverse_foreach             2  avgt    8          9.550 ±        0.223  ns/op
[info] HashSetBenchmark.traverse_foreach             3  avgt    8         13.307 ±        0.277  ns/op
[info] HashSetBenchmark.traverse_foreach             4  avgt    8         17.587 ±        0.392  ns/op
[info] HashSetBenchmark.traverse_foreach             7  avgt    8         29.453 ±        0.391  ns/op
[info] HashSetBenchmark.traverse_foreach             8  avgt    8         33.421 ±        0.742  ns/op
[info] HashSetBenchmark.traverse_foreach            15  avgt    8         62.318 ±        2.371  ns/op
[info] HashSetBenchmark.traverse_foreach            16  avgt    8         66.377 ±        1.577  ns/op
[info] HashSetBenchmark.traverse_foreach            17  avgt    8         71.330 ±        1.843  ns/op
[info] HashSetBenchmark.traverse_foreach            39  avgt    8        217.915 ±        4.885  ns/op
[info] HashSetBenchmark.traverse_foreach           282  avgt    8       1625.634 ±       36.022  ns/op
[info] HashSetBenchmark.traverse_foreach          4096  avgt    8      64074.659 ±      761.284  ns/op
[info] HashSetBenchmark.traverse_foreach        131070  avgt    8    1966419.305 ±    57587.043  ns/op
[info] HashSetBenchmark.traverse_foreach       7312102  avgt    8  485281057.604 ± 72040957.021  ns/op

@msteindorfer
Contributor Author

Regarding the memory results, I found out that the MemoryBenchmark is flawed and vastly under-approximates space savings.

The memory benchmark counts the total amount of memory allocated, so it measures the total retained heap size of the data structures, including all the boxed (!) Long references stored inside. The goal of the benchmark shouldn't be to measure how much memory is occupied by boxed numbers; it should instead separate the concerns and measure the memory overhead that the data structure itself (without its payload) consumes. Find below my own measurements:

/*** RETAINED SIZE (SET) --> VAST UNDERAPPROXIMATION OF SPACE SAVINGS ***/

[size =       8]      456 bytes    ScalaHashSet    Footprint{Objects=18, References=17, Primitives=[int x 26]}
[size =       8]      288 bytes    ChampHashSet    Footprint{Objects=11, References=10, Primitives=[int x 20]}

[size =    2048]   138216 bytes    ScalaHashSet    Footprint{Objects=5494, References=5493, Primitives=[int x 7542]}
[size =    2048]    89088 bytes    ChampHashSet    Footprint{Objects=3447, References=3446, Primitives=[int x 5496]}

[size = 1048576] 69270776 bytes    ScalaHashSet    Footprint{Objects=2751608, References=2751607, Primitives=[int x 3800057]}
[size = 1048576] 44108024 bytes    ChampHashSet    Footprint{Objects=1703160, References=1703159, Primitives=[int x 2751610]}


/*** OVERHEAD (SET) --> PRECISE MEASUREMENT NOT COUNTING BOXED NUMBERS ***/

[size =       8]      264 bytes    ScalaHashSet    Footprint{Objects=10, References=17, Primitives=[int x 10]}
[size =       8]       96 bytes    ChampHashSet    Footprint{Objects=3, References=10, Primitives=[int x 4]}

[size =    2048]    89064 bytes    ScalaHashSet    Footprint{Objects=3446, References=5493, Primitives=[int x 3446]}
[size =    2048]    39936 bytes    ChampHashSet    Footprint{Objects=1399, References=3446, Primitives=[int x 1400]}

[size = 1048576] 44108000 bytes    ScalaHashSet    Footprint{Objects=1703159, References=2751607, Primitives=[int x 1703159]}
[size = 1048576] 18945248 bytes    ChampHashSet    Footprint{Objects=654711, References=1703159, Primitives=[int x 654712]}

For sets, even in the case of the vast under-approximation the savings are ~35%, whereas the precise measurement shows savings beyond 50%.
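The percentages can be re-derived from the set figures listed above. The sketch below uses two hypothetical helpers, `savingPercent` and `payloadBytes` (names chosen here for illustration), and plugs in the size = 2048 rows.

```scala
object FootprintMath {
  // Relative saving of `champ` over `baseline`, as a percentage.
  def savingPercent(baseline: Long, champ: Long): Double =
    100.0 * (baseline - champ) / baseline

  // Bytes attributable to the payload: retained size minus structural overhead.
  def payloadBytes(retained: Long, overhead: Long): Long =
    retained - overhead

  def main(args: Array[String]): Unit = {
    // size = 2048, retained size: ScalaHashSet vs ChampHashSet
    println(savingPercent(138216L, 89088L)) // roughly 35.5
    // size = 2048, overhead only: ScalaHashSet vs ChampHashSet
    println(savingPercent(89064L, 39936L))  // roughly 55.2
    // The payload is identical for both structures: 49152 bytes each
    println(payloadBytes(138216L, 89064L))
    println(payloadBytes(89088L, 39936L))
  }
}
```

Both structures carry the same 49152 bytes of payload (24 bytes per element at this size), so including it in the measurement inflates both totals equally and shrinks the relative difference.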

For maps, the flaw in the measurement is even more severe, since it counts double the number of boxed Long references:

/*** RETAINED SIZE (MAP) --> VAST UNDERAPPROXIMATION OF SPACE SAVINGS ***/

[size =       8]      712 bytes    ScalaHashMap    Footprint{Objects=26, References=49, Primitives=[int x 26]}
[size =       8]      320 bytes    ChampHashMap    Footprint{Objects=11, References=18, Primitives=[int x 20]}

[size =    2048]   203752 bytes    ScalaHashMap    Footprint{Objects=7542, References=13685, Primitives=[int x 7542]}
[size =    2048]    96616 bytes    ChampHashMap    Footprint{Objects=3447, References=5494, Primitives=[int x 5496]}

[size = 1048576] 102821144 bytes   ScalaHashMap    Footprint{Objects=3800057, References=6945403, Primitives=[int x 3800057]}
[size = 1048576]  48083992 bytes   ChampHashMap    Footprint{Objects=1703160, References=2751608, Primitives=[int x 2751610]}



/*** OVERHEAD (MAP) --> PRECISE MEASUREMENT NOT COUNTING BOXED NUMBERS ***/

[size =       8]      520 bytes    ScalaHashMap    Footprint{Objects=18, References=49, Primitives=[int x 10]}
[size =       8]      128 bytes    ChampHashMap    Footprint{Objects=3, References=18, Primitives=[int x 4]}

[size =    2048]   154600 bytes    ScalaHashMap    Footprint{Objects=5494, References=13685, Primitives=[int x 3446]}
[size =    2048]    47464 bytes    ChampHashMap    Footprint{Objects=1399, References=5494, Primitives=[int x 1400]}

[size = 1048576] 77658368 bytes    ScalaHashMap    Footprint{Objects=2751608, References=6945403, Primitives=[int x 1703159]}
[size = 1048576] 22921216 bytes    ChampHashMap    Footprint{Objects=654711, References=2751608, Primitives=[int x 654712]}

Once again, even in the case of the vast under-approximation the space savings for maps are ~50%, whereas the precise measurement shows space savings of 3-4x.

@msteindorfer
Contributor Author

I hope the comments above help to clarify some of the concerns raised earlier. Looking into the other issues will take me some time; the earliest I can start is the weekend, and I will report back afterwards. I'm also going to add the foreach optimization to ChampHashMap and then update the PR.

@julienrf
Contributor

@msteindorfer Thanks a lot for your answers! Could you please help us fix our memory benchmark?

@julienrf
Contributor

Would you mind adding the following benchmarks to HashSetBenchmark.scala, ChampHashSetBenchmark.scala and ScalaHashSetBenchmark.scala?

  @Benchmark
  def traverse_subsetOf(bh: Blackhole): Unit = bh.consume(xs.subsetOf(xs))

  @Benchmark
  def traverse_equals(bh: Blackhole): Unit = bh.consume(xs == xs) 

var zs: ChampHashSet[Long] = _
var zipped: ChampHashSet[(Long, Long)] = _
var randomIndices: scala.Array[Int] = _
def fresh(n: Int) = ChampHashSet((1 to n).map(_.toLong): _*)
Member

It might be worthwhile to vary the key types during the benchmark to prevent the JIT from having an unrealistically easy time inlining the hashing/equality (for Double keys, that goes through BoxesRunTime.{equals,hashCode}).

This is a general problem for our collection benchmarks, but it is particularly important for hash-structures IMO.
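One way to vary the key types could look like the sketch below. It is only an illustration: `mixedKeys` and `freshMixed` are hypothetical helpers, it uses the standard library `Set` rather than the strawman collections, and whether it changes the benchmark results has not been measured here. The Double keys are offset by 0.5 so that they never compare equal to a Long key under Scala's numeric equality.

```scala
object MixedKeys {
  // Hypothetical helper: n distinct keys cycling through three runtime types,
  // so the hashCode/equals call sites see more than one receiver class.
  def mixedKeys(n: Int): Vector[Any] =
    Vector.tabulate[Any](n) { i =>
      i % 3 match {
        case 0 => i.toLong          // boxed as java.lang.Long
        case 1 => i.toDouble + 0.5  // boxed as java.lang.Double
        case _ => s"key-$i"         // java.lang.String
      }
    }

  // Builds a set from the mixed keys, analogous to `fresh(n)` above.
  def freshMixed(n: Int): Set[Any] = mixedKeys(n).toSet
}
```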

@julienrf julienrf added this to the 0.10.0 milestone Jan 31, 2018
@julienrf
Contributor

I’ve run the benchmarks for == and subsetOf.
Given that we have a regression in the strawman (see #380), we should focus on ChampHashSet vs ScalaHashSet. In the case of ==, ChampHashSet is in general slower for small sets (fewer than 10 elements); otherwise it is in general faster (up to 2 times faster). In the case of subsetOf, ChampHashSet is always slower (roughly 2 times slower). @msteindorfer Do you think it would be possible to optimize subsetOf?

traverse_equals

traverse_subsetof

The numbers:

[info] Benchmark                                 (size)  Mode  Cnt           Score          Error  Units
[info] ChampHashSetBenchmark.traverse_equals          0  avgt    8           2.436 ±        0.080  ns/op
[info] ChampHashSetBenchmark.traverse_equals          1  avgt    8          19.668 ±        0.342  ns/op
[info] ChampHashSetBenchmark.traverse_equals          2  avgt    8          20.693 ±        0.199  ns/op
[info] ChampHashSetBenchmark.traverse_equals          3  avgt    8          22.091 ±        0.274  ns/op
[info] ChampHashSetBenchmark.traverse_equals          4  avgt    8          22.939 ±        0.192  ns/op
[info] ChampHashSetBenchmark.traverse_equals          7  avgt    8          25.442 ±        0.150  ns/op
[info] ChampHashSetBenchmark.traverse_equals          8  avgt    8          26.585 ±        0.124  ns/op
[info] ChampHashSetBenchmark.traverse_equals         15  avgt    8          33.311 ±        0.067  ns/op
[info] ChampHashSetBenchmark.traverse_equals         16  avgt    8          34.177 ±        0.232  ns/op
[info] ChampHashSetBenchmark.traverse_equals         17  avgt    8          34.947 ±        0.048  ns/op
[info] ChampHashSetBenchmark.traverse_equals         39  avgt    8         395.422 ±        2.428  ns/op
[info] ChampHashSetBenchmark.traverse_equals        282  avgt    8        2096.663 ±       10.208  ns/op
[info] ChampHashSetBenchmark.traverse_equals       4096  avgt    8       76481.807 ±     2367.321  ns/op
[info] ChampHashSetBenchmark.traverse_equals     131070  avgt    8     4115686.241 ±    64039.706  ns/op
[info] ChampHashSetBenchmark.traverse_equals    7312102  avgt    8   524410413.188 ± 42841549.924  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        0  avgt    8          45.589 ±        0.626  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        1  avgt    8          59.710 ±        0.700  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        2  avgt    8          71.103 ±        0.401  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        3  avgt    8          81.623 ±        0.516  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        4  avgt    8          66.582 ±        0.247  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        7  avgt    8          75.811 ±        0.698  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf        8  avgt    8          77.553 ±        0.775  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf       15  avgt    8         241.365 ±        1.028  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf       16  avgt    8         253.249 ±        2.505  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf       17  avgt    8         265.326 ±        1.682  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf       39  avgt    8         523.205 ±        5.191  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf      282  avgt    8        5903.059 ±      930.418  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf     4096  avgt    8      301245.615 ±     7649.100  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf   131070  avgt    8     5159882.821 ±    49655.862  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf  7312102  avgt    8  1261335431.500 ± 26853010.349  ns/op
[info] HashSetBenchmark.traverse_equals               0  avgt    8           2.415 ±        0.002  ns/op
[info] HashSetBenchmark.traverse_equals               1  avgt    8          12.655 ±        0.057  ns/op
[info] HashSetBenchmark.traverse_equals               2  avgt    8          26.789 ±        0.235  ns/op
[info] HashSetBenchmark.traverse_equals               3  avgt    8          40.711 ±        0.465  ns/op
[info] HashSetBenchmark.traverse_equals               4  avgt    8          39.059 ±        0.126  ns/op
[info] HashSetBenchmark.traverse_equals               7  avgt    8          77.667 ±        0.354  ns/op
[info] HashSetBenchmark.traverse_equals               8  avgt    8          63.410 ±        0.387  ns/op
[info] HashSetBenchmark.traverse_equals              15  avgt    8         106.194 ±        0.615  ns/op
[info] HashSetBenchmark.traverse_equals              16  avgt    8         112.619 ±        0.537  ns/op
[info] HashSetBenchmark.traverse_equals              17  avgt    8         119.017 ±        1.074  ns/op
[info] HashSetBenchmark.traverse_equals              39  avgt    8         460.248 ±        4.330  ns/op
[info] HashSetBenchmark.traverse_equals             282  avgt    8        5554.693 ±     1444.887  ns/op
[info] HashSetBenchmark.traverse_equals            4096  avgt    8      411126.959 ±    14160.384  ns/op
[info] HashSetBenchmark.traverse_equals          131070  avgt    8     7017826.772 ±    46663.504  ns/op
[info] HashSetBenchmark.traverse_equals         7312102  avgt    8  1553849050.000 ± 65732729.114  ns/op
[info] HashSetBenchmark.traverse_subsetOf             0  avgt    8           4.173 ±        0.088  ns/op
[info] HashSetBenchmark.traverse_subsetOf             1  avgt    8          16.445 ±        2.088  ns/op
[info] HashSetBenchmark.traverse_subsetOf             2  avgt    8          44.382 ±        1.726  ns/op
[info] HashSetBenchmark.traverse_subsetOf             3  avgt    8          46.305 ±        0.627  ns/op
[info] HashSetBenchmark.traverse_subsetOf             4  avgt    8          49.042 ±        0.298  ns/op
[info] HashSetBenchmark.traverse_subsetOf             7  avgt    8          67.006 ±        2.214  ns/op
[info] HashSetBenchmark.traverse_subsetOf             8  avgt    8          75.293 ±        3.569  ns/op
[info] HashSetBenchmark.traverse_subsetOf            15  avgt    8         115.733 ±        0.742  ns/op
[info] HashSetBenchmark.traverse_subsetOf            16  avgt    8         124.428 ±        5.296  ns/op
[info] HashSetBenchmark.traverse_subsetOf            17  avgt    8         128.865 ±        2.476  ns/op
[info] HashSetBenchmark.traverse_subsetOf            39  avgt    8         599.804 ±        5.912  ns/op
[info] HashSetBenchmark.traverse_subsetOf           282  avgt    8        7125.025 ±       35.233  ns/op
[info] HashSetBenchmark.traverse_subsetOf          4096  avgt    8      430797.394 ±      310.871  ns/op
[info] HashSetBenchmark.traverse_subsetOf        131070  avgt    8     7000676.737 ±   133763.414  ns/op
[info] HashSetBenchmark.traverse_subsetOf       7312102  avgt    8  1592357061.875 ± 21602188.882  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          0  avgt    8           2.582 ±        0.602  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          1  avgt    8           3.566 ±        0.014  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          2  avgt    8           9.544 ±        0.445  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          3  avgt    8          11.270 ±        0.306  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          4  avgt    8          13.393 ±        0.009  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          7  avgt    8          21.796 ±        4.993  ns/op
[info] ScalaHashSetBenchmark.traverse_equals          8  avgt    8          22.707 ±        0.018  ns/op
[info] ScalaHashSetBenchmark.traverse_equals         15  avgt    8          38.874 ±        0.114  ns/op
[info] ScalaHashSetBenchmark.traverse_equals         16  avgt    8          41.292 ±        0.034  ns/op
[info] ScalaHashSetBenchmark.traverse_equals         17  avgt    8          46.056 ±        8.489  ns/op
[info] ScalaHashSetBenchmark.traverse_equals         39  avgt    8         257.681 ±       26.093  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        282  avgt    8        2369.643 ±        9.317  ns/op
[info] ScalaHashSetBenchmark.traverse_equals       4096  avgt    8      172101.372 ±    10396.896  ns/op
[info] ScalaHashSetBenchmark.traverse_equals     131070  avgt    8     4523441.023 ±    37951.404  ns/op
[info] ScalaHashSetBenchmark.traverse_equals    7312102  avgt    8   683153542.563 ± 10142044.622  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        0  avgt    8           2.339 ±        0.002  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        1  avgt    8           3.118 ±        0.007  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        2  avgt    8           9.916 ±        0.984  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        3  avgt    8          13.375 ±        2.847  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        4  avgt    8          14.672 ±        0.063  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        7  avgt    8          22.450 ±        0.099  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf        8  avgt    8          25.005 ±        0.114  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf       15  avgt    8          43.116 ±        0.175  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf       16  avgt    8          45.573 ±        0.215  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf       17  avgt    8          48.388 ±        1.135  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf       39  avgt    8         256.066 ±        0.300  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf      282  avgt    8        2346.471 ±       44.361  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf     4096  avgt    8      164178.688 ±    23711.466  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf   131070  avgt    8     5005333.275 ±  1138396.788  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf  7312102  avgt    8   680744157.813 ±  6961026.653  ns/op

@msteindorfer
Contributor Author

msteindorfer commented Feb 2, 2018

I've had a look at == and can say that it now performs much better across the whole spectrum (small and big inputs). The reason it underperformed earlier for small inputs was that I was using a for loop with a range object, and now it's using a more low-level loop.
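As an illustration of the two loop styles, here is a minimal sketch comparing two `Long` arrays element by element. The method names are hypothetical and this is not the ChampHashSet equality code. In Scala, a `for` comprehension over `0 until n` desugars to `Range.foreach` with a closure, while the `while` version keeps a plain index variable and can exit early.

```scala
object LoopStyles {
  // for loop over a Range: desugars to (0 until n).foreach(i => ...).
  def sameElementsFor(a: Array[Long], b: Array[Long]): Boolean = {
    if (a.length != b.length) return false
    var same = true
    for (i <- 0 until a.length) if (a(i) != b(i)) same = false
    same
  }

  // Low-level loop: no Range object, no closure, early exit on mismatch.
  def sameElementsWhile(a: Array[Long], b: Array[Long]): Boolean = {
    if (a.length != b.length) return false
    var i = 0
    while (i < a.length) {
      if (a(i) != b(i)) return false
      i += 1
    }
    true
  }
}
```

How much of the Range overhead the JIT removes depends on the JVM and the call site, so the gain is best confirmed with a benchmark such as traverse_equals.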

traverse_equals

[info] Benchmark                               (size)  Mode  Cnt          Score           Error  Units
[info] ChampHashSetBenchmark.traverse_equals        0  avgt    8          2.644 ±         0.044  ns/op
[info] ChampHashSetBenchmark.traverse_equals        1  avgt    8          6.571 ±         0.064  ns/op
[info] ChampHashSetBenchmark.traverse_equals        2  avgt    8          6.993 ±         0.131  ns/op
[info] ChampHashSetBenchmark.traverse_equals        3  avgt    8          7.435 ±         0.212  ns/op
[info] ChampHashSetBenchmark.traverse_equals        4  avgt    8          8.742 ±         0.098  ns/op
[info] ChampHashSetBenchmark.traverse_equals        7  avgt    8         11.038 ±         0.197  ns/op
[info] ChampHashSetBenchmark.traverse_equals        8  avgt    8         11.666 ±         0.241  ns/op
[info] ChampHashSetBenchmark.traverse_equals       15  avgt    8         13.615 ±         0.292  ns/op
[info] ChampHashSetBenchmark.traverse_equals       16  avgt    8         14.107 ±         0.427  ns/op
[info] ChampHashSetBenchmark.traverse_equals       17  avgt    8         14.454 ±         0.562  ns/op
[info] ChampHashSetBenchmark.traverse_equals       39  avgt    8         99.312 ±         2.987  ns/op
[info] ChampHashSetBenchmark.traverse_equals      282  avgt    8        933.774 ±        25.141  ns/op
[info] ChampHashSetBenchmark.traverse_equals     4096  avgt    8      42601.942 ±       467.884  ns/op
[info] ChampHashSetBenchmark.traverse_equals   131070  avgt    8    1842035.390 ±     43830.379  ns/op
[info] ChampHashSetBenchmark.traverse_equals  7312102  avgt    8  538171149.938 ±  14328909.058  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        0  avgt    8          2.630 ±         0.057  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        1  avgt    8          3.780 ±         0.100  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        2  avgt    8         10.593 ±         0.119  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        3  avgt    8         12.671 ±         0.334  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        4  avgt    8         14.600 ±         0.396  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        7  avgt    8         20.975 ±         0.533  ns/op
[info] ScalaHashSetBenchmark.traverse_equals        8  avgt    8         23.295 ±         0.304  ns/op
[info] ScalaHashSetBenchmark.traverse_equals       15  avgt    8         38.379 ±         0.540  ns/op
[info] ScalaHashSetBenchmark.traverse_equals       16  avgt    8         41.129 ±         0.575  ns/op
[info] ScalaHashSetBenchmark.traverse_equals       17  avgt    8         43.305 ±         0.892  ns/op
[info] ScalaHashSetBenchmark.traverse_equals       39  avgt    8        270.204 ±         6.723  ns/op
[info] ScalaHashSetBenchmark.traverse_equals      282  avgt    8       2549.605 ±        66.514  ns/op
[info] ScalaHashSetBenchmark.traverse_equals     4096  avgt    8      83515.407 ±      2679.428  ns/op
[info] ScalaHashSetBenchmark.traverse_equals   131070  avgt    8   13131542.320 ±    255304.336  ns/op
[info] ScalaHashSetBenchmark.traverse_equals  7312102  avgt    8  660486112.500 ± 189072332.316  ns/op
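
For readers who want to put the traverse_equals numbers side by side, the following small sketch computes the ratio of the Scala 2.12 HashSet score to the ChampHashSet score for a few sizes. The four rows are copied from the table above; the object and method names are hypothetical.

```scala
object Speedup {
  // (size, ChampHashSet ns/op, Scala 2.12 HashSet ns/op), copied from the table.
  val rows: Seq[(Int, Double, Double)] = Seq(
    (17, 14.454, 43.305),
    (4096, 42601.942, 83515.407),
    (131070, 1842035.390, 13131542.320),
    (7312102, 538171149.938, 660486112.500)
  )

  // A ratio above 1.0 means the CHAMP implementation took less time per operation.
  def speedup(champ: Double, baseline: Double): Double = baseline / champ

  def main(args: Array[String]): Unit =
    rows.foreach { case (size, champ, baseline) =>
      println(s"size $size: " + "%.2f".format(speedup(champ, baseline)) + "x")
    }
}
```

The ratios come out at roughly 3.0x, 2.0x, 7.1x and 1.2x. The largest size should be read with care, since the reported error of the Scala 2.12 measurement (± 189072332.316 ns/op) is larger than the difference between the two scores.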

@julienrf
Contributor

julienrf commented Feb 2, 2018

Wow, this is awesome, your implementation is an order of magnitude faster than the current one!

@msteindorfer
Contributor Author

msteindorfer commented Feb 2, 2018

@julienrf, yes, I can also work on subsetOf and optimize that operation; however, I think we should separate this task from this PR. My goal for this PR was to implement a hash-set and hash-map substitute that covers the same operations, at no less than the same optimization level, as the current strawman.collection.Hash(Set|Map), ignoring all operations that have not yet been ported from the collections of Scala 2.12.

PS: In the meantime, I have been prototyping a specialized subsetOf operation that improves a lot on the strawman regression but isn't yet up to speed with the previous collections (at least at smaller sizes). I would still need more time to do this properly, since it's not just a port but a complete rewrite based on the CHAMP design (which is more complex than an original HAMT).

traverse_subsetof

@julienrf
Contributor

julienrf commented Feb 2, 2018

If you think you would need to spend a substantial amount of time optimizing the operations, then I’m happy to merge this PR as soon as possible and let you open other PRs for performance improvements. Since we want our code to compile with Dotty, we are a bit constrained and cannot use the return keyword in inlined methods. Can you arrange your code to not use return?
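
As a rough illustration of the requested rewrite, the sketch below shows a search loop with an early `return` next to an equivalent version that folds the exit condition into the loop. The names are hypothetical and the methods are ordinary (not inlined) ones, so both compile; the second form is the one that avoids `return` altogether.

```scala
object NoReturn {
  // Early exit via `return`: the style the reviewer asks to avoid.
  def containsWithReturn(xs: Array[Int], x: Int): Boolean = {
    var i = 0
    while (i < xs.length) {
      if (xs(i) == x) return true
      i += 1
    }
    false
  }

  // Same behaviour without `return`: the `found` flag ends the loop.
  def containsNoReturn(xs: Array[Int], x: Int): Boolean = {
    var found = false
    var i = 0
    while (!found && i < xs.length) {
      found = xs(i) == x
      i += 1
    }
    found
  }
}
```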

@msteindorfer
Contributor Author

Just saw the build not succeeding on Dotty, will fix the case with return.

The reimplementations are based upon Compressed Hash-Array Mapped Prefix-trees (CHAMP); see the paper "Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections" by Steindorfer and Vinju (OOPSLA'15) for more details and descriptions of low-level performance optimizations (a pre-print of the paper is available at https://michael.steindorfer.name/publications/oopsla15.pdf). This commit closes scala#192.
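
For orientation, the core indexing trick of hash-array mapped tries can be sketched as follows: each trie level consumes 5 bits of the hash, a 32-bit bitmap records which of the 32 slots are occupied, and the position in the compressed array is the number of occupied slots below the one being looked up. This is a simplified sketch with hypothetical names, not the code of ChampHashSet; a CHAMP node keeps two such bitmaps (one for inlined elements, one for sub-nodes), and the same computation applies to each.

```scala
object ChampIndexing {
  final val BitPartitionSize = 5
  final val BitPartitionMask = (1 << BitPartitionSize) - 1 // 0x1F

  // The 5-bit slice of the hash used at the trie level given by `shift`.
  def mask(hash: Int, shift: Int): Int = (hash >>> shift) & BitPartitionMask

  // A single bit marking that slice inside a 32-bit bitmap.
  def bitpos(mask: Int): Int = 1 << mask

  // Index into the compressed array: count occupied slots below `bitpos`.
  def index(bitmap: Int, bitpos: Int): Int =
    Integer.bitCount(bitmap & (bitpos - 1))
}
```

For example, with a bitmap of `0x16` (slots 1, 2 and 4 occupied), slot 4 maps to array index 2 because two lower slots are occupied.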

The new implementations (i.e., ChampHashSet and ChampHashMap) currently exist next to the previous HashMap and HashSet. By default, immutable.Map and immutable.Set now pick up the CHAMP data structures. A JVM flag (-Dstrawman.collection.immutable.useBaseline=true) allows switching back to the previous HashSet and HashMap implementations for testing. Note that the flag and the previous HashSet and HashMap implementations will be removed in the final version of collection-strawman, but for the time being they remain to support comparing the different trade-offs and performance characteristics of the current and the new data structures.
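
A flag of this kind can be read from JVM system properties. The sketch below is an assumption about how such a switch might be wired, not the code from this commit: only the property name comes from the text above, while the object and method names are hypothetical.

```scala
object BaselineFlag {
  // Property name as given in the commit message.
  val PropertyName = "strawman.collection.immutable.useBaseline"

  // java.lang.Boolean.getBoolean is true only when the system property
  // exists and equals "true" (ignoring case); otherwise it is false.
  def useBaseline: Boolean = java.lang.Boolean.getBoolean(PropertyName)

  // Hypothetical selector: which implementation a default factory would pick.
  def defaultSetName: String =
    if (useBaseline) "HashSet" else "ChampHashSet"
}
```

Started with `-Dstrawman.collection.immutable.useBaseline=true`, `BaselineFlag.defaultSetName` yields `"HashSet"`; without the flag it yields `"ChampHashSet"`.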

Preliminary performance numbers of the new CHAMP data structures were presented in issue scala#192. Overall one can summarize that the CHAMP data structures significantly lower memory footprints and significantly improve all iteration-based operations and equality checks. Basic operations such as lookup, insertion, and deletion may slow down. The current state of the reimplementation does not optimize for hash-collisions yet.

Note that the CHAMP design / implementation differs from the previous immutable hashed data structures by not memoizing the hash codes of the individual elements (which may change the performance of certain workloads). If necessary, CHAMP's design allows memoized hash codes of the individual elements to be added modularly (at the expense of some memory savings). Details are discussed in the paper mentioned above.
@msteindorfer
Contributor Author

I implemented and added a specialized subsetOf operation for the ChampHashSet that restores performance, at least in the benchmark, to the level of the Scala 2.12 HashSet (see chart and benchmark results below). You can find the source code of the re-implementation here: https://github.com/msteindorfer/collection-strawman/blob/new-immutable-hash-set-and-map/collections/src/main/scala/strawman/collection/immutable/ChampHashSet.scala#L406-L448

Note that, with respect to the performance regression discussed in issue #380, the CHAMP design may have to recalculate hash codes (for a fraction of the elements), since the elements' hash codes are not cached in the leaf nodes. See https://github.com/msteindorfer/collection-strawman/blob/new-immutable-hash-set-and-map/collections/src/main/scala/strawman/collection/immutable/ChampHashSet.scala#L433. This is desired behaviour: it enables significant memory savings while sacrificing only a little runtime performance.
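
The trade-off can be made concrete with a small sketch that counts how often an element's hash code is computed. All names here are hypothetical and the two leaf classes are simplified stand-ins, not the node types used in ChampHashSet: a leaf that stores the hash pays one extra Int per element but computes it once, whereas a leaf that stores only the element recomputes the hash whenever it is needed.

```scala
object HashMemoization {
  // Hypothetical key that counts calls to hashCode.
  final class CountingKey(val id: Int) {
    var hashCalls = 0
    override def hashCode: Int = { hashCalls += 1; id }
    override def equals(other: Any): Boolean = other match {
      case that: CountingKey => that.id == id
      case _                 => false
    }
  }

  // Memoizing leaf: the hash is computed once and kept in a field.
  final class MemoizedLeaf(val key: CountingKey) {
    val hash: Int = key.hashCode
  }

  // Non-memoizing leaf: no extra field, hash recomputed on each access.
  final class PlainLeaf(val key: CountingKey) {
    def hash: Int = key.hashCode
  }
}
```

Reading `hash` twice from a `MemoizedLeaf` triggers one `hashCode` call on the key; reading it twice from a `PlainLeaf` triggers two.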

traverse_subsetof

[info] Benchmark                                     (size)  Mode  Cnt           Score          Error  Units
[info] ChampHashSetBenchmark.traverse_subsetOf            0  avgt    8           2.622 ±        0.064  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            1  avgt    8           9.222 ±        0.143  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            2  avgt    8          13.522 ±        0.220  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            3  avgt    8          17.507 ±        0.146  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            4  avgt    8          21.109 ±        0.412  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            7  avgt    8          30.665 ±        0.530  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf            8  avgt    8          33.800 ±        0.726  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf           15  avgt    8          54.292 ±        1.043  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf           16  avgt    8          58.151 ±        1.889  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf           17  avgt    8          60.593 ±        0.758  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf           39  avgt    8         249.486 ±        4.531  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf          282  avgt    8        2374.091 ±       47.854  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf         4096  avgt    8       85736.235 ±     1314.793  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf       131070  avgt    8     1998867.701 ±    15026.344  ns/op
[info] ChampHashSetBenchmark.traverse_subsetOf      7312102  avgt    8   656077694.875 ±  7421465.605  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 0  avgt    8           4.638 ±        0.171  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 1  avgt    8          18.206 ±        0.536  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 2  avgt    8          38.303 ±        1.424  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 3  avgt    8          43.705 ±        0.846  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 4  avgt    8          50.182 ±        1.003  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 7  avgt    8          68.212 ±        1.351  ns/op
[info] HashSetBenchmark.traverse_subsetOf                 8  avgt    8          74.523 ±        1.024  ns/op
[info] HashSetBenchmark.traverse_subsetOf                15  avgt    8         117.449 ±        1.250  ns/op
[info] HashSetBenchmark.traverse_subsetOf                16  avgt    8         124.805 ±        3.099  ns/op
[info] HashSetBenchmark.traverse_subsetOf                17  avgt    8         130.846 ±        2.091  ns/op
[info] HashSetBenchmark.traverse_subsetOf                39  avgt    8         634.184 ±       19.118  ns/op
[info] HashSetBenchmark.traverse_subsetOf               282  avgt    8        7439.831 ±      228.265  ns/op
[info] HashSetBenchmark.traverse_subsetOf              4096  avgt    8      227119.531 ±     4258.048  ns/op
[info] HashSetBenchmark.traverse_subsetOf            131070  avgt    8     8033112.073 ±   176616.379  ns/op
[info] HashSetBenchmark.traverse_subsetOf           7312102  avgt    8  1921663113.500 ± 16915702.182  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            0  avgt    8           2.678 ±        0.216  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            1  avgt    8           3.396 ±        0.073  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            2  avgt    8          12.775 ±        0.415  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            3  avgt    8          12.507 ±        0.211  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            4  avgt    8          17.367 ±        0.457  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            7  avgt    8          22.913 ±        0.428  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf            8  avgt    8          25.775 ±        1.085  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf           15  avgt    8          45.069 ±        1.049  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf           16  avgt    8          47.518 ±        0.641  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf           17  avgt    8          50.418 ±        0.927  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf           39  avgt    8         270.092 ±        6.716  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf          282  avgt    8        2549.277 ±       75.918  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf         4096  avgt    8       82163.910 ±     1833.250  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf       131070  avgt    8     4683206.120 ±   185428.804  ns/op
[info] ScalaHashSetBenchmark.traverse_subsetOf      7312102  avgt    8   864977047.938 ± 22122103.897  ns/op

Contributor

@julienrf julienrf left a comment

This is great @msteindorfer!

@Ichoran
Contributor

Ichoran commented Feb 5, 2018

I haven't checked it in depth but the benchmarks look great, and once we have it in we can catch any remaining correctness issues more easily. (I'll run collections-laws on it.)

@julienrf julienrf merged commit 1797dfa into scala:master Feb 5, 2018
@julienrf
Contributor

julienrf commented Feb 5, 2018

Let’s merge it so that we can move forward!
