Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make go-fuzz-build work with modules (for #195) #263

Closed
wants to merge 1 commit into from
Closed

make go-fuzz-build work with modules (for #195) #263

wants to merge 1 commit into from

Conversation

kjk
Copy link
Contributor

@kjk kjk commented Aug 28, 2019

Empirically, this change makes go-fuzz-build work with modules.

For testing I'm using https://github.com/gomarkdown/markdown package which uses go modules (i.e. has go.mod file).

Testing steps:

  • just to make sure, rm -rf ${GOPATH}/src/github.com/gomarkdown/markdown. Things might accidentally work if this is present
  • git clone https://github.com/gomarkdown/markdown.git to a directory that is outside of GOPATH e.g. ~/src/markdown
  • cd ~/src/markdown

Before this change we get the following error:

markdown kjk$ go-fuzz-build -x .
markdown.go:7:2: cannot find package "github.com/gomarkdown/markdown/ast" in any of:
	/usr/local/Cellar/go/1.12.9/libexec/src/github.com/gomarkdown/markdown/ast (from $GOROOT)
	/Users/kjk/go/src/github.com/gomarkdown/markdown/ast (from $GOPATH)

The error comes from c.loadPkg(pkg) // load and typecheck pkg step, although all the heavy lifting is done by /x/tools library.

This is not surprising. go-fuzz-build forces GO111MODULE=off which means that the tooling doesn't understand go.mod file. Since github.com/gomarkdown/markdown uses github.com/gomarkdown/markdown/ast, the tooling is trying to find that code in GOPATH but it's not there, so it fails.

After this change, which merely sets GO111MODULE=auto, both go-fuzz-build -x github.com/gomarkdown/markdown and go-fuzz-build -x . complete the build.

This is not unexpected - we tell the tools to figure out modules vs. GOPATH and act accordingly.

I've also had a version that checked if go.mod file exists and setting GO111MODULE=off or GO111MODULE=on based on that. It also worked.

I can't say this will work on every module-enabled repo but it does fix the problem on the one I've tested.

cc @yevgenypats

@yevgenypats
Copy link
Contributor

cc @dvyukov @josharian @mvdan

@yevgenypats
Copy link
Contributor

yevgenypats commented Aug 28, 2019

Thanks @kjk!

Well I guess more testing is needed though this fix looks quite trivial so if this really works this is awesome:)

I'll do some testing myself as well. please feel free to comment on issues you are aware of with this fix.

@mvdan
Copy link
Contributor

mvdan commented Aug 28, 2019

This needs tests :)

@kjk
Copy link
Contributor Author

kjk commented Aug 28, 2019

@mvdan do you have any specific test in mind?

I could add test2 directory with go.mod and code to .travis.yml that builds it with go-fuzz-build.

@mvdan
Copy link
Contributor

mvdan commented Aug 28, 2019

No specific test in mind; just that this one-line change is subtle and will likely break some users, so we should have extensive tests covering multiple realistic scenarios. For example:

  • in GOPATH
  • in GOPATH, with a go.mod present
  • outside GOPATH
  • outside GOPATH, with a go.mod present

You should check which of these worked before the change and after, and see if any regressions are introduced.

@yevgenypats
Copy link
Contributor

I can confirm that this works on https://github.com/coredns/coredns with go modules enabled.

before jumping and writing complicated tests my suggestion is the following:

use an "experimental" BETA_GO111MODULES_ENABLED flag which will be an opt-in one so only users who specify this flag will get this commit other users will have the old behaviour. This way we will be able to test it on an extensive number of projects before writing tests and making this default behaviour.

Also we can add this experimental flag to the documentation so other people will be able to experiment with this.

Let me know what you think

@mvdan
Copy link
Contributor

mvdan commented Aug 28, 2019

I think before adding a custom env var, a better option would be to simply leave the existing GO111MODULE env var as-is. People who want to use GOPATH mode already set it up, just like people who want to use modules leave it alone or turn it on. After all, this env var is already shifting more towards being on by default with each Go release.

@yevgenypats
Copy link
Contributor

@mvdan This fine by me. it just that it might break behaviour for users who didn't explicitly set GO111MODULE=off.

@yevgenypats
Copy link
Contributor

Also @dominikh brought this comment https://github.com/dvyukov/go-fuzz/blob/master/go-fuzz-build/main.go#L626 to my attention which I'm not sure what it means but maybe @josharian @dvyukov knows how this affects go-modules

@dominikh
Copy link

So, there are a couple interesting things about how go-fuzz-build works with this change. go-fuzz-build constructs a workdir that contains a copy of the package under test and all of its dependencies. This currently always constructs a complete GOPATH, not a module. This is probably(tm) fine. If go/packages has already resolved all the module dependencies, then we don't need a go.mod to resolve dependencies correctly.

go/packages in module mode also takes care of the module path. If we have a go.mod with module example.org/foo/bar and a package baz in said module, then go-fuzz-build will create the following directory in the workdir: gopath/src/example.org/foo/bar/baz – this works fine.

What doesn't work is a fun interaction with go.mod and working in GOPATH mode. Even when working in GOPATH, and even when GO111MODULE=off, go build still considers go.mod for the module path.

An example is the nats-io/nats-server project. Said project has a go.mod saying module github.com/nats-io/nats-server/v2 – however, the directory containing said go.mod file is located at $GOPATH/src/github.com/nats-io/nats-server and there is no v2 directory. Yet, imports do include the v2 component. For example, $GOPATH/src/github.com/nats-io/nats-server/server/server.go imports the package github.com/nats-io/nats-server/v2/logger. In the presence of the go.mod file, go build maps this import path to $GOPATH/src/github.com/nats-io/nats-server/logger, even though we're not using modules. This behavior is critical for interoperability of modules and GOPATH.

However, when go-fuzz-build processes the same package, it doesn't know about the v2 component, and copies the dependency to gopath/src/github.com/nats-io/nats-server/logger in its workdir. Subsequently, go build cannot find the dependency, as it a) looks for v2 b) doesn't have any go.mod to tell it otherwise.

To observe the interaction of GOPATH, go build and go.mod, you can do this: in GOPATH mode, go get github.com/nats-io/nats-server – then cd into $GOPATH/src/github.com/nats-io/nats-server and delete the go.mod file. Try to run go build and see it fail.

@kjk
Copy link
Contributor Author

kjk commented Aug 28, 2019

@dominikh go-fuzz or not, people should not be mixing up working inside GOPATH and go.mod.

Let's not fall into "perfect is the enemy of good" trap.

We know go-fuzz doesn't work at all with modules, which is a big (and growing) number of Go projects.

We know that no-one with really deep knowledge of go-fuzz code and finer points of x/tools/packages and how it handles GOPATH vs. modules is planning to do a really thorough investigation any time soon.

Not merging a fix for a large number of cases (that is easy to verify as working) because it might not cover all cases is "perfect is is the enemy of good".

The absolute worst thing that can happen is: we merge, it breaks people, they report it in the bug tracker, we revert it.

I expect that it'll fix the issue for a great number of people and if there are cases where it doesn't work, people will report them and we can investigate them.

@yevgenypats
Copy link
Contributor

I'm not an expert on the nuances of all permutations of GOPATH and go modules. But I'm totally in favour of getting the ball rolling by merging this commit (with either experimental flag or removing the hardcoded environment variable completely). And of course keeping an eye on any issues that arise.

@thepudds
Copy link
Collaborator

thepudds commented Aug 28, 2019

@kjk I suspect copying a go.mod file found in the target package into the work dir would probably help, and probably worth at least a quick try.

When in GOPATH mode, cmd/go pays attention to the existence of a go.mod file to trigger a compatibility mode called "minimal module awareness" that causes cmd/go to usually do the right thing regarding finding a /v2 or /vN in an import path, so it might "just work".

Second, one minor clarification on this comment:

go-fuzz or not, people should not be mixing up working inside GOPATH and go.mod.

It is fairly common for people to work inside GOPATH and use go.mod. That's one of the main reasons you can set GO111MODULE=on, which enables module mode even when inside GOPATH.

One approach would be to have the README say GO111MODULE=on is not supported with go-fuzz, but that might confuse people.

Alternatively, it might work to tweak the current approach in the PR of always setting GO111MODULE=auto in basePackagesConfig to instead be conditional on whether or not GO111MODULE=on is already set in the environment (and pass in GO111MODULE=on if so). That could in theory mean the user's setting of GO111MODULE=on would be respected for that portion of the process.

Of my two comments here, the first about copying go.mod is probably more important.

Personally, I'm all for incremental progress, but might be worth briefly and quickly evaluating 1-2 small tweaks to the proposed fix here, but of course, others can/should weigh in on that.

@thepudds
Copy link
Collaborator

Also, one additional related comment / FYI is that making it easier to work in GOPATH with modules is one of the changes for modules coming in the near-term with Go 1.13, where Go 1.13 is adjusting the meaning of GO111MODULE=auto to allow someone to use modules in GOPATH by default based on the presence of a go.mod file (without changing any environment variable settings). As such, it will likely become more common for people to work in GOPATH with modules with the release of Go 1.13.

@dominikh
Copy link

@dominikh go-fuzz or not, people should not be mixing up working inside GOPATH and go.mod.

It's very much intended that a Go module can be worked on from inside GOPATH. That's why go build respects the module path defined in go.mod even when working in GOPATH mode in the first place. It drives adoption of modules. I can turn my project into a module so that others can benefit from it being a module, but I am not forced into using modules myself.

Let's not fall into "perfect is the enemy of good" trap.

Nobody said that we shouldn't merge this. I documented the short-comings of the patch, and mvdan asked for test cases, so that we have clear indication of which configurations work, which didn't, and to prove that there are no regressions.

I don't care if if my concern gets fixed before or after the merge, but it should be documented and it should be fixed.

We know that no-one with really deep knowledge of go-fuzz code and finer points of x/tools/packages and how it handles GOPATH vs. modules is planning to do a really thorough investigation any time soon.

I mean… considering you tell people who know the finer points of go/packages that their concerns aren't valid...

FWIW, I don't think the change can break any existing use-cases. Just don't claim that it's a complete solution. And please don't get all defensive because people engage in discussion and code review.

@thepudds
Copy link
Collaborator

thepudds commented Aug 28, 2019

The current change is doing the following in basePackagesConfig:

	cfg.Env = append(os.Environ(), "GO111MODULE=auto")

A simpler change that might work better might be just using os.Environ() there, without explicitly setting GO111MODULE=auto there, with the intent of just using whatever the user has (or has not) set for GO111MODULE. (Or maybe equivalently, perhaps don't set cfg.Env there, given packages.Config seems to say if Env is nil, the current environment is used).

That might work better, and might align better with people's expectations. It also might possibly be a safer change, especially with the coming Go 1.13 change that redefines GO111MODULE=auto.

It would mean dependencies are found here based on whatever the user has currently set for GO111MODULE. (Just to spell it out -- if the user has GO111MODULE=auto set or has not set GO111MODULE at all, then it would in effect be using GO111MODULE=auto there, which is the default for GO111MODULE in both Go 1.12 and Go 1.13. If the user has GO111MODULE=off set, it would be using GO111MODULE=off there. And if user has GO111MODULE=on set, it would be using GO111MODULE=on there).

That said, there are many permutations to think through, so maybe that is not right.

However, could be worth a quick try.

@josharian
Copy link
Collaborator

Note that before we set GO111MODULE=off, it was at its default value of auto, and that caused problems in google/oss-fuzz#2188. That was why we started setting it to off, recently. I am thus a bit skeptical that this change is actually moving the ball forwards in any meaningful way. At a minimum, it would be good to confirm that setting it back to auto won't break oss-fuzz.

@yevgenypats
Copy link
Contributor

@josharian what about my suggestion of an experimental flag? as this fix a lot of problems in repos that use go-modules which currently cannot use go-fuzz at all.

@josharian
Copy link
Collaborator

@josharian what about my suggestion of an experimental flag?

I dunno. We could do something like -module=x, where x defaults to off. Or use the existing value of the GO111MODULE envvar if it has been set explicitly.

But I'd rather actually fix the underlying issue. Until we do that, there will still be a steady stream of support needs, only it'll be more complicated to deal with them, since there are more knobs that people might have fiddled with.

@kjk
Copy link
Contributor Author

kjk commented Aug 29, 2019

How about -mod flag that would internally set GO111MODULE=on ?

It won't regress anything and will make it possible to work with modules if explicitly set.

I also tested a variant that would check if go.mod exists in current directory and GO111MODULE on/off based on that. It also worked. Using auto seemed cleaner but that would be fine too.

Or we can remove setting GO111MODULE internally and just tell people: if it doesn't work for you, set GO111MOUDLE=off, which should also work for google/oss-fuzz (if I'm reading the thread there correctly).

But I'd rather actually fix the underlying issue

Not sure how actionable this is.

For the module-based project I tested, the issue was that x/packages was ignoring go.mod because we explicitly tell it to. Not doing that fixes the problem. To me that is underlying issue.

What are success criteria for accepting a fix for #195?

@josharian
Copy link
Collaborator

How about -mod flag that would internally set GO111MODULE=on ?

I would rather simply respect any existing GO111MODULE envvar that is set (use os.LookupEnv). Then if later we remove that logic, no one's scripts or invocations have to change, and we aren't left with a vestigial flag.

What are success criteria for accepting a fix for #195?

For any successful invocation of go build, an equivalent invocation of go-fuzz-build should also succeed and should build the same code (after instrumentation) as go build did.

Switching to go/packages (a few months ago) helped a lot with the "same code" problem. We clearly haven't solved the "also succeed" problem. It may be as simple as synthesizing a go.mod. I don't know exactly; finding out for sure is half of the effort in fixing it properly. The other half is writing tests.

I hope this helps clarify.

@mholt
Copy link

mholt commented Sep 4, 2019

We are anxiously awaiting this to be merged for use in fuzzing Caddy. Please let us know if we can help with this at all!

@yevgenypats
Copy link
Contributor

@thepudds am I correct that you are trying to pull this off? do you need any help?

@thepudds
Copy link
Collaborator

thepudds commented Sep 8, 2019

Hello @yevgenypats, I was able to make some progress, and I have something that seems to be working following what I think is a reasonable approach, with reasonable test coverage... but we'll have to see what the others think. ;-)

I'll hopefully send a more detailed update in the next 2-3 days? There is one more bit of history I want to double-check first. (I don't want to send the conversation in a circle or otherwise eat up time from people if I am misremembering something...).

@yevgenypats
Copy link
Contributor

@thepudds awesome!!

@josharian
Copy link
Collaborator

Given that #271 was merged, and #274 is under way, I am going to close this PR.

@josharian josharian closed this Oct 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants