Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate porting to go/analysis #376

Closed
dominikh opened this issue Jan 2, 2019 · 40 comments
Closed

Investigate porting to go/analysis #376

dominikh opened this issue Jan 2, 2019 · 40 comments

Comments

@dominikh
Copy link
Owner

dominikh commented Jan 2, 2019

Check if it is a) feasible b) worthwhile to migrate to the go/analysis framework.

@dominikh dominikh added started Issues we've started working on research labels Jan 2, 2019
@dominikh dominikh self-assigned this Jan 2, 2019
@dominikh
Copy link
Owner Author

dominikh commented Jan 3, 2019

Conflicts with: #340
Conflicts with: anything PTA based (such as our planned errcheck)
Depends on: a redesign of how facts in go/analysis work (will post more details soon)

@ainar-g
Copy link
Contributor

ainar-g commented Mar 18, 2019

Just curious, what is the status here? You've written in other issues that you plan to do it. What are the perks that we can expect?

@dominikh
Copy link
Owner Author

Status: In progress. There are still some issues in go/analysis (primarily golang/go#29616) that need resolving. I have a local branch that ports a good chunk of checks to the new framework.

Perks: Primarily interoperability. You will be able to use your favourite check runner (vet, staticcheck, golangci-lint, ...) and compile a version of it that incorporates all the various checks. For something like golangci-lint it means that it'll have to do less (hopefully none) patching of tools like staticcheck to make them reuse work.

Furthermore, go/analysis forces analyses into a specific way of thinking (modular analyses), which enables some optimizations, such as memorizing build facts on disk, as well as incremental checking. This should drastically lower CPU time for repeated runs, and lower peak memory usage since we don't need to maintain the entire type graph and AST in memory all at once.

@codesuki
Copy link

When this is done it will integrate with Bazel's rules_go, which is great!
https://github.com/bazelbuild/rules_go/blob/master/go/nogo.rst
Thank you!

@dominikh
Copy link
Owner Author

dominikh commented Apr 23, 2019

Status update: the majority of the port is done. All checks (except U1000) use the go/analysis framework. We have our own go/analysis runner (which will be used by staticcheck) which greatly reduces memory usage (where master needs 4.7 GB and 22s to check all of std, the port needs 450 MB and 23s17s) and can utilize caching to avoid from-source loading of dependencies (which significantly cuts down on the time needed to check individual packages).

unused does not fit into the analysis framework. Instead, it will require custom logic in the runner. I've designed the API in a way that should make it easily usable by other runners such as golangci-lint. Documentation on that will follow.

The current hope is to finalize the port in June, give it intense internal testing, then release a version for public testing, before finally merging it into master, some time in July or August.

Raw numbers for people who are into that:

Packages master (GOGC=100) incr (no cache) (GOGC=100) incr (cache) (GOGC=100) master (GOGC=50) incr (no cache) (GOGC=50) incr (cache) (GOGC=50)
std 22.24s 4,751,624 K 16.59s 441,900 K 16.98s 461,476 K 24.83s 4,377,792 K 21.02s 341,580 K 24.17s 357,500 K
std 21.23s 4,685,608 K 18.16s 438,556 K 17.19s 438,608 K 25.02s 4,370,436 K 24.06s 349,572 K 22.26s 344,656 K
std 22.36s 4,722,756 K 17.11s 455,968 K 17.52s 436,264 K 24.19s 4,380,668 K 25.24s 357,336 K 20.62s 370,380 K
net/http 3.27s 667,500 K 4.06s 229,748 K 1.50s 196,600 K 4.04s 590,820 K 4.95s 180,200 K 1.74s 149,844 K
net/http 3.39s 667,216 K 3.96s 218,056 K 1.49s 175,144 K 4.06s 593,552 K 4.62s 175,428 K 1.66s 149,376 K
net/http 3.06s 672,516 K 3.88s 193,736 K 1.50s 198,144 K 3.93s 590,112 K 4.51s 182,072 K 1.67s 148,568 K
trivial 2.63s 388,432 K 3.21s 146,252 K 0.32s 30,424 K 2.34s 370,836 K 3.64s 109,948 K 0.36s 29,344 K
trivial 1.98s 381,656 K 3.11s 110,064 K 0.32s 32,264 K 2.48s 372,612 K 3.72s 96,060 K 0.37s 27,328 K
trivial 2.64s 379,704 K 3.06s 115,628 K 0.32s 29,312 K 2.28s 370,600 K 3.79s 120,872 K 0.36s 28,124 K

Edit: More improvements, new numbers.

@dominikh
Copy link
Owner Author

@ianthehat has a version of gopls that integrates staticcheck at https://github.com/ianthehat/tools/tree/staticcheck – it uses the go/analysis rewrite.

@dominikh
Copy link
Owner Author

The majority of the work has been completed as of c7b3dbf – I encourage everyone to run the incr branch and report any bugs you find.

@ainar-g
Copy link
Contributor

ainar-g commented Apr 26, 2019

Holy Commit Message, Batman!

I see you removed unused from cmd/. Does this mean that the whole program mode is gone? And if it is so, do you have anything against someone maybe forking it from 2019.1.1?


Edit: Or, you know, I could actually read the message, heh.

We've deleted the gosimple, unused and megacheck binaries. They had
already been deprecated and there was little point in porting them to
the new framework. With the removal of the unused binary there is
currently no way to use its whole program mode. This will be rectified
in a follow-up commit which adds said mode as its own check in
staticcheck.

@dominikh
Copy link
Owner Author

No need for forking; quoting the holy commit message:

With the removal of the unused binary there is currently no way to use its whole program mode. This will be rectified in a follow-up commit which adds said mode as its own check in staticcheck.

@RaduBerinde
Copy link

RaduBerinde commented Apr 26, 2019

(Edit by dominikh: fixed.)

I ran statickcheck from incr against cockroachdb, it did use a lot less memory, but it took ~15 minutes (I'm using go 1.11.4):

	Command being timed: "staticcheck github.com/cockroachdb/cockroach/pkg/..."
	User time (seconds): 1118.63
	System time (seconds): 3.81
	Percent of CPU this job got: 118%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 15:43.32
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1918664
	Average resident set size (kbytes): 0

@dominikh
Copy link
Owner Author

@RaduBerinde Can you answer the following questions?

  1. Was that run on your local system or CI? Something else?
  2. What CPU does the system use? How many cores are available to the process?
  3. Did you export GOMAXPROCS?
  4. Was the system under other load while running staticcheck?
  5. What are the timings for staticcheck std?
  6. What are the timings for staticcheck std when using the master branch instead of incr?

118% CPU utilization seem suspiciously low, making me wonder if this is a very slow or single-cored CPU, or if the system was already under load.

@RaduBerinde
Copy link

  1. Local system.
  2. It's a fairly beefy CPU: i7-8750H (6 cores, 12 threads). I noticed that for a little while in the beginning there was heavy multi-core usage, but then there was a single core with high utilization. htop was showing two of the staticcheck std processes using a lot of CPU, the rest had little usage.
  3. No
  4. Very little load
  5. staticcheck std takes 8 seconds
Command exited with non-zero status 1
	Command being timed: "staticcheck std"
	User time (seconds): 51.51
	System time (seconds): 0.63
	Percent of CPU this job got: 659%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:07.90
  1. on master, staticcheck std takes 12 seconds:
	Command being timed: "staticcheck std"
	User time (seconds): 79.67
	System time (seconds): 3.36
	Percent of CPU this job got: 640%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.97

Bonus: on master against cockroach, takes 50 seconds:

	Command being timed: "staticcheck github.com/cockroachdb/cockroach/pkg/..."
	User time (seconds): 397.17
	System time (seconds): 13.80
	Percent of CPU this job got: 787%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:52.22

@dominikh
Copy link
Owner Author

dominikh commented Apr 26, 2019

I'd love to test this myself, but as is tradition at this point, cockroachdb doesn't actually build for me (https://play.golang.org/p/tos6sXux_V4) – are there any other projects with similarly bad performance characteristics?

Edit: apparently the thing that fails to compile is not actually important and staticcheck happily processes github.com/cockroachdb/cockroach/pkg/... and runs into the same low CPU utilization that you encountered. I'll debug it and get back to you.

@RaduBerinde
Copy link

If you run into any other problems, this is a guide on what is needed: https://github.com/cockroachdb/cockroach/blob/master/CONTRIBUTING.md#getting-and-building

@dominikh
Copy link
Owner Author

@RaduBerinde I've pushed a fix, please try again.

The concrete issue was that one package (github.com/cockroachdb/cockroach/pkg/sql/pgwire/hba) triggered an edge case in an algorithm used by staticcheck's VRP, causing it to run for a very long time. All other packages scheduled for analysis at the time were blocked on dependencies, which explains the low CPU usage.

@RaduBerinde
Copy link

Nice, you rock. It now runs in 50s and uses 2.4GB, which is great!

@dominikh
Copy link
Owner Author

@ainar-g 00c4802 adds back support for unused's whole program mode. Please give it a spin.

@ainar-g
Copy link
Contributor

ainar-g commented Apr 27, 2019

(Edit by dominikh: fixed.)

@dominikh It mostly spins really well! Except then I give it a file that doesn't exist, it panics:

$ staticcheck.incr -unused.whole-program x.go
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x9279e9]

goroutine 1 [running]:
honnef.co/go/tools/unused.(*Checker).results(0xc000161b20, 0xc00004c700, 0x7f452a196008, 0x0)
        /home/ainar/go/src/honnef.co/go/tools/unused/unused.go:596 +0x4f9
honnef.co/go/tools/unused.(*Checker).Result(0xc000161b20, 0x0, 0x0, 0x0)
        /home/ainar/go/src/honnef.co/go/tools/unused/unused.go:521 +0x2c8
honnef.co/go/tools/lint.(*Linter).Lint(0xc0001ed948, 0xc00016b8d0, 0xc00009a080, 0x1, 0x1, 0x4c9f75, 0xc0000c2040, 0xc0000c20aa,
 0x2, 0x2)
        /home/ainar/go/src/honnef.co/go/tools/lint/lint.go:169 +0x1f61
honnef.co/go/tools/lint/lintutil.Lint(0xc000184000, 0x71, 0x80, 0xc000158a90, 0x1, 0x1, 0xc00009a080, 0x1, 0x1, 0xc00016bdc8, ..
.)
        /home/ainar/go/src/honnef.co/go/tools/lint/lintutil/util.go:312 +0x2c0
honnef.co/go/tools/lint/lintutil.ProcessFlagSet(0xc000184000, 0x71, 0x80, 0xc000158a90, 0x1, 0x1, 0xc000171200)
        /home/ainar/go/src/honnef.co/go/tools/lint/lintutil/util.go:201 +0xb6a
main.main()
        /home/ainar/go/src/honnef.co/go/tools/cmd/staticcheck/staticcheck.go:37 +0x56b

@RaduBerinde
Copy link

RaduBerinde commented Apr 27, 2019

(Edit by dominikh: fixed.)

I ran 00c4802 with -unused.whole-program on cockroach. It is only 10s slower so that's great. But it returns a ton of bogus errors. Many are functions that are used through interfaces. There are also some egregious cases:

/go2/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:38:6: func FatalOnPanic is unused (U1001)
/go2/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:147:6: func WarningfDepth is unused (U1001)
/go2/src/github.com/cockroachdb/cockroach/pkg/util/log/log.go:200:6: func FatalfDepth is unused (U1001)
radu (master) ~/roach2/pkg # ag FatalOnPanic
util/log/log.go
34:// FatalOnPanic recovers from a panic and exits the process with a
38:func FatalOnPanic() {

server/server.go
1972:	defer log.FatalOnPanic()

radu (master) ~/roach2/pkg # ag WarningfDepth
util/grpcutil/log.go
81:	log.WarningfDepth(context.TODO(), 2, "", args...)
88:	log.WarningfDepth(context.TODO(), 2, "", args...)
96:		log.WarningfDepth(context.TODO(), 2, format, args...)

util/log/log.go
142:// WarningfDepth logs to the WARNING and INFO logs, offsetting the caller's
147:func WarningfDepth(ctx context.Context, depth int, format string, args ...interface{}) {

server/debug/pprofui/ui.go
42:	log.WarningfDepth(pprofCtx(context.Background()), 1, "%s", msg)

storage/raft.go
75:	log.WarningfDepth(r.ctx, 1, "", v...)
79:	log.WarningfDepth(r.ctx, 1, format, v...)

@dominikh
Copy link
Owner Author

@ainar-g Fixed in 6f1f542
@RaduBerinde Fixed in aa43d95

Thank you for your extensive testing!

@RaduBerinde
Copy link

Awesome! Tried it again, much better. Still see some false positives:
/go2/src/github.com/cockroachdb/cockroach/pkg/sql/tests/data.go:70:6: func CreateKVInterleavedTable is unused (U1001) <- this one is used in a test in another package:

# ag CreateKVInterleavedTable
sql/tests/data.go
68:// CreateKVInterleavedTable is like CreateKVTable, but it interleaves table
70:func CreateKVInterleavedTable(t *testing.T, sqlDB *gosql.DB, numRows int) {

sql/drop_test.go
685:	tests.CreateKVInterleavedTable(t, sqlDB, numRows)
1015:	tests.CreateKVInterleavedTable(t, sqlDB, numRows)

Similar:

/go2/src/github.com/cockroachdb/cockroach/pkg/testutils/testcluster/testcluster.go:504:24: func (*TestCluster).WaitForSplitAndReplication is unused (U1001)

# ag WaitForSplitAndReplication
sql/ambiguous_commit_test.go
165:		if err := tc.WaitForSplitAndReplication(tableStartKey.Load().([]byte)); err != nil {

testutils/testcluster/testcluster.go
501:// WaitForSplitAndReplication waits for a range which starts with
504:func (tc *TestCluster) WaitForSplitAndReplication(startKey roachpb.Key) error {

@ainar-g
Copy link
Contributor

ainar-g commented Apr 28, 2019

@dominikh Confirming, the panic is gone, and the whole program mode still works on my small projects. Will check on bigger projects next week.

@ainar-g
Copy link
Contributor

ainar-g commented Apr 29, 2019

@dominikh On a bigger project I, too, have issues with interfaces. I have the following code in tests:

type errTimeout struct{}
func (errTimeout) Error() string { return "timeout" }
func (errTimeout) Timeout() bool { return true }

// …

err := &url.Error{Err: errTimeout{}}

// …

switch err := err.(type) {
case nil:
	return false
case *url.Error:
	return err != nil && err.Timeout()
default:
	return false
}

(*url.Error).Timeout uses an unexported interface to check against timeout errors:

type timeout interface {
	Timeout() bool
}

The new Whole Program Mode seems to not dive deep enough to notice that. It's a minor annoyance, to be fair, and I am actually surprised that the old unused dived that deep.

@dominikh
Copy link
Owner Author

@ainar-g I don't think we can handle this case with the new framework, not unless we load all dependencies from source code (which would defeat a lot of the performance gains). For dependencies that we don't need to load from source, we collect interfaces from export data. An unexported type may not show up in export data.

A larger concern is that the check may yield different results depending on if the dependency was loaded from source or not, i.e. if we have cached them before. I'll definitely have to investigate that.

@ainar-g
Copy link
Contributor

ainar-g commented Apr 30, 2019

As I've said, it's not a big deal for me. Just don't forget to mention that in the release notes, so that you have something to point people to when the issues start coming in :-)

RaduBerinde's case seems different though.

@dominikh
Copy link
Owner Author

dominikh commented May 3, 2019

@RaduBerinde Your issues should be fixed as of a577b01 (the fix itself being 2f699b8 but subsequent commits fixing other problems.) Please let me know if that addresses all your false positives.

@dominikh
Copy link
Owner Author

I will merge incr into master by the end of this week, unless any new issues get reported.

@RaduBerinde
Copy link

Oops, I somehow missed your previous message. I'm going to try it ASAP. I tried to go install at 76464b7 but I get:

$ go install honnef.co/go/tools/cmd/staticcheck
# honnef.co/go/tools/facts
facts/deprecated.go:136:27: pass.AllObjectFacts undefined (type *analysis.Pass has no field or method AllObjectFacts)
facts/deprecated.go:139:27: pass.AllPackageFacts undefined (type *analysis.Pass has no field or method AllPackageFacts)
facts/purity.go:171:27: pass.AllObjectFacts undefined (type *analysis.Pass has no field or method AllObjectFacts)

@dominikh
Copy link
Owner Author

@RaduBerinde you'll need to update your version of golang.org/x/tools

@RaduBerinde
Copy link

Ah, thanks, sorry I didn't dig far enough. Still seeing a couple of issues:

 staticcheck -unused.whole-program github.com/cockroachdb/cockroach/pkg/...  | ag -v "\.gw\.go|\.pb\.go|sqlDollar" |& tee /tmp/staticcheck

/go2/src/github.com/cockroachdb/cockroach/pkg/ccl/workloadccl/format/sstable.go:33:6: func ToTableDescriptor is unused (U1001)

This is used in the same file:

pkg/ccl/workloadccl/format/sstable.go
31:// ToTableDescriptor returns the corresponding TableDescriptor for a workload
33:func ToTableDescriptor(
61:	tableDesc, err := ToTableDescriptor(t, tableID, ts)

A similar one:

/go2/src/github.com/cockroachdb/cockroach/pkg/ccl/workloadccl/format/sstable.go:105:6: type addSSTableSender is unused (U1001)

 70   var ssts addSSTableSender

And these are part of a pflag.Value interface, this type is used as an instance of that:

/go2/src/github.com/cockroachdb/cockroach/pkg/cli/flags_util.go:39:24: func (*localityList).Type is unused (U1001)
/go2/src/github.com/cockroachdb/cockroach/pkg/cli/flags_util.go:42:24: func (*localityList).String is unused (U1001)
/go2/src/github.com/cockroachdb/cockroach/pkg/cli/flags_util.go:52:24: func (*localityList).Set is unused (U1001)

@RaduBerinde
Copy link

Also, a random question, would you say that this new version of staticcheck includes everything that github.com/golang/lint/golint checks? Wondering if there's value in running both.

@RaduBerinde
Copy link

The last ones went away after I added var _ pflag.Value = &localityList{} (which is good practice anyway).

@dominikh
Copy link
Owner Author

dominikh commented May 10, 2019

The ToSSTable issue is caused by a flaw in unused's graph construction, caused by packages existing multiple times thanks to test variants. Will fix.

Haven't looked into the pflag.Value issue yet; the functions shouldn't get flagged if *localityList is ever passed to a function accepting a pflag.Value as an argument.

As for golint: a subset of golint's checks is implemented in staticcheck. I think the vast majority of checks in golint are too noisy, so they're not part of staticcheck. You can find the full list at http://staticcheck.dev/docs/checks – they're in the ST category of checks.

@ainar-g
Copy link
Contributor

ainar-g commented May 11, 2019

@dominikh If Go Modules are still out of the question, I think there need to be some kind of instructions to properly build HEAD or other another branch, to exclude this form of “Oh, your foo has a wrong version.” errors.

@dominikh
Copy link
Owner Author

@ainar-g you'd think that years after Go's release people would know about go get -u?

@ainar-g
Copy link
Contributor

ainar-g commented May 11, 2019

@dominikh IIRC, go get -u doesn't know about branches other than master[1], which is what my comment primarily was about. Unless incr is the last branch you ever want to make and get tested by other people :-) A “no” is a “no” then.

[1]: Ironically, unless working in module mode, where you can do go get -u honnef.co/go/tools/cmd/staticcheck@incr.

@dominikh
Copy link
Owner Author

@ainar-g I'd be happy to explain my point of view, but this issue isn't the right place for that. Feel free to hit me up on IRC or Slack if you want to hear more.

@dominikh
Copy link
Owner Author

@RaduBerinde I have pushed more fixes for unused. Note that var _ pflag.Value = &localityList{} shouldn't be necessary, we automatically mark the methods as used because they implement pflag.Value. This was just broken in some cases before.

When I run staticcheck -unused.whole-program against github.com/cockroachdb/cockroach/pkg/..., the only false positives I could spot are due to #476. Aside from that, it only reports the Code* constants in sql/pgwire/pgerror, and two actually unused types.

I hope I've managed to unbreak all of unused now. For reference, the latest commit as of writing is 4187a5f.

@RaduBerinde
Copy link

Awesome, thank you so much! Yeah, we already have whitelists for the ones you mentioned. The reflect case we have I don't think can be solved - the function names that are called through reflection come from external testfiles.

@dominikh
Copy link
Owner Author

incr has been merged into master. Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants