Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/gc: optimize switch statements with string literal cases #10000

Closed
jonhoo opened this issue Feb 25, 2015 · 4 comments
Closed

cmd/gc: optimize switch statements with string literal cases #10000

jonhoo opened this issue Feb 25, 2015 · 4 comments

Comments

@jonhoo
Copy link

jonhoo commented Feb 25, 2015

Consider the following benchmark:

package main

import (
    "fmt"
    "testing"
)

var words = []string{"super", "califragi", "listic", "expiali", "docius"}
var wlen = len(words)

func BenchmarkSwitch(b *testing.B) {
    m := 0
    for i := 0; i < b.N; i++ {
        switch words[i%wlen] {
        case "super":
            m++
        case "califragi":
            m++
        case "listic":
            m++
        case "expiali":
            m++
        case "docius":
            m++
        }
    }
    fmt.Println(m)
}

func BenchmarkIf(b *testing.B) {
    m := 0
    for i := 0; i < b.N; i++ {
        w := words[i%wlen]
        if w == "super" {
            m++
        } else if w == "califragi" {
            m++
        } else if w == "listic" {
            m++
        } else if w == "expiali" {
            m++
        } else if w == "docius" {
            m++
        }
    }
    fmt.Println(m)
}

On my machine, this gives

$ go test -bench . -benchtime 30s
BenchmarkSwitch
2000000000          29.2 ns/op
BenchmarkIf
2000000000          22.9 ns/op
ok      x   109.375s

Despite the switch statement having more opportunity for optimization, it is consistently slower than the if variant. After a brief discussion on golang-nuts, it was pointed out that Go will move to a binary search for switch statements with more than three constant cases. Firstly, this threshold seems too low, as runtime.eqstring (because of length checks) is often an order of magnitude faster than runtime.cmpstring. Second, other optimizations can probably also be implemented here by the compiler. @josharian mentions a few in the golang-nuts thread:

I think that there can be significant improvement here, e.g. using diverge indices and lengths for the binary search followed by a final call to eqstring.

@josharian josharian self-assigned this Feb 26, 2015
@josharian
Copy link
Contributor

To expand on my last comment:

I think that we should first sort (and binary search) by string length. It is readily available in the string header, registerizable, and cheap to compare.

If we wanted to optimize further, instead of sorting (and searching) strings of the same length using cmpstring, we could instead further sort (and binary search) by the indices at which the strings vary--effectively doing something like walking a trie.

I think that using the string length is worth doing right way. It should be pretty simple and a quick win. The fancier approach I'm less sure about, at least initially, since cmpstring is written in pretty heavily optimized assembly.

@josharian josharian changed the title runtime: optimizing switch statements with constant cases cmd/gc: optimize switch statements with string literal cases Feb 26, 2015
@josharian
Copy link
Contributor

Started, but waiting for switch code cleanup before mailing to avoid churn.

@josharian
Copy link
Contributor

The benchmark here has some quirks:

  • The % operator is a significant portion of the execution time of the benchmark.
  • It is averaging the time taken by different paths into the switch statement, which hides variability by index.
  • By using string literals throughout, eqstring gets to short-circuit by doing a pointer equality check rather than having to actually compare bytes. The effect is not significant here, because the strings are short, but it is present. (This suggests an optimization opportunity for cmpstring. I will investigate separately whether doing that is worthwhile.)

Here is a set of benchmarks that provides more visibility into exactly what is going on:

package main

import "testing"

var words = [...]string{"super", "califragi", "listic", "expiali", "docius"}

func benchmarkSwitch(b *testing.B, s string) {
    m := 0
    for i := 0; i < b.N; i++ {
        switch s {
        case "super":
            m++
        case "califragi":
            m++
        case "listic":
            m++
        case "expiali":
            m++
        case "docius":
            m++
        }
    }
}

func BenchmarkSwitch0(b *testing.B) { benchmarkSwitch(b, words[0]) }
func BenchmarkSwitch1(b *testing.B) { benchmarkSwitch(b, words[1]) }
func BenchmarkSwitch2(b *testing.B) { benchmarkSwitch(b, words[2]) }
func BenchmarkSwitch3(b *testing.B) { benchmarkSwitch(b, words[3]) }
func BenchmarkSwitch4(b *testing.B) { benchmarkSwitch(b, words[4]) }

func BenchmarkSwitchNewStr0(b *testing.B) { benchmarkSwitch(b, string([]byte(words[0]))) }
func BenchmarkSwitchNewStr1(b *testing.B) { benchmarkSwitch(b, string([]byte(words[1]))) }
func BenchmarkSwitchNewStr2(b *testing.B) { benchmarkSwitch(b, string([]byte(words[2]))) }
func BenchmarkSwitchNewStr3(b *testing.B) { benchmarkSwitch(b, string([]byte(words[3]))) }
func BenchmarkSwitchNewStr4(b *testing.B) { benchmarkSwitch(b, string([]byte(words[4]))) }

func benchmarkIf(b *testing.B, s string) {
    m := 0
    for i := 0; i < b.N; i++ {
        if s == "super" {
            m++
        } else if s == "califragi" {
            m++
        } else if s == "listic" {
            m++
        } else if s == "expiali" {
            m++
        } else if s == "docius" {
            m++
        }
    }
}

func BenchmarkIf0(b *testing.B) { benchmarkIf(b, words[0]) }
func BenchmarkIf1(b *testing.B) { benchmarkIf(b, words[1]) }
func BenchmarkIf2(b *testing.B) { benchmarkIf(b, words[2]) }
func BenchmarkIf3(b *testing.B) { benchmarkIf(b, words[3]) }
func BenchmarkIf4(b *testing.B) { benchmarkIf(b, words[4]) }

func BenchmarkIfNewStr0(b *testing.B) { benchmarkIf(b, string([]byte(words[0]))) }
func BenchmarkIfNewStr1(b *testing.B) { benchmarkIf(b, string([]byte(words[1]))) }
func BenchmarkIfNewStr2(b *testing.B) { benchmarkIf(b, string([]byte(words[2]))) }
func BenchmarkIfNewStr3(b *testing.B) { benchmarkIf(b, string([]byte(words[3]))) }
func BenchmarkIfNewStr4(b *testing.B) { benchmarkIf(b, string([]byte(words[4]))) }

Here's the result of running these benchmarks:

benchmark                  old ns/op

BenchmarkIf0               3.36
BenchmarkIf1               4.45
BenchmarkIf2               5.22
BenchmarkIf3               5.56
BenchmarkIf4               10.5

BenchmarkIfNewStr0         5.26
BenchmarkIfNewStr1         7.19
BenchmarkIfNewStr2         7.23
BenchmarkIfNewStr3         7.47
BenchmarkIfNewStr4         12.4

BenchmarkSwitch0           9.56
BenchmarkSwitch1           8.73
BenchmarkSwitch2           9.38
BenchmarkSwitch3           8.66
BenchmarkSwitch4           7.99

BenchmarkSwitchNewStr0     11.3
BenchmarkSwitchNewStr1     11.1
BenchmarkSwitchNewStr2     11.0
BenchmarkSwitchNewStr3     10.3
BenchmarkSwitchNewStr4     11.0

@josharian
Copy link
Contributor

Mailed golang.org/cl/7698

@golang golang locked and limited conversation to collaborators Jun 25, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants