Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/go: add an option to sequentially run tests #61318

Open
howardjohn opened this issue Jul 12, 2023 · 6 comments
Open

proposal: cmd/go: add an option to sequentially run tests #61318

howardjohn opened this issue Jul 12, 2023 · 6 comments
Milestone

Comments

@howardjohn
Copy link
Contributor

It is a common pattern to want to run Go tests in sequence rather than in parallel. Within a package, this is controlled by t.Parallel() and -test.parallel. However, across packages there is a different control: -p N (typically -p 1 for this use case).

This is commonly used for integration tests which are depending on exclusive access to some resource outside of the test.

However, this comes at a huge cost: -p 1 does not control just test parallelism, but all compiler actions. This means all compiling and linking is done sequentially. In our project, this has caused a 10x increase in test execution times - with -p 1, it takes 8m25s; without, 50s.

I would like a way to run test packages sequentially, while still benefiting from the parallelism in building.

Ecosystem usage

I did a rough analysis of usage of go test -p 1 across Github, and found a number of large projects doing the same. I use stars just as a rough measure of the project scope.

Code Link Stars
https://github.com/ory/hydra/blob/425c977a3aa4e519762582e0be4b1adf5043f383/.docker/Dockerfile-hsm#L33 14k
https://github.com/letsencrypt/boulder/blob/b090ffbd2ea01407ad6e580a9cea78db6548d942/README.md?plain=1#L187 4.7k
https://github.com/mongodb/mongo-go-driver/blob/f8b88fc241a91c32455f61c57557921ba7d17d2d/Makefile#L106 7.5k
https://github.com/netlify/gotrue/blob/3fabd3fc5a6fb51eb54cd88831a380a4c18a2eff/Makefile#L30 3.5k
https://github.com/netlify/gotrue/blob/3fabd3fc5a6fb51eb54cd88831a380a4c18a2eff/Makefile#L30 4.2k
https://github.com/atomix/atomix/blob/6decbb2531c39254979ff1f10e65e00c6e1dca99/controller/Makefile#L26 2.3k
https://github.com/Shopify/ghostferry/blob/3dc7ace25fad9c0e9c5c5f68c8de70af2bd08d46/Makefile#L66 600
https://github.com/vulcand/vulcand/blob/16e7d32893f7629e1e9b8fae8a0d9023b7cd338f/Makefile#L11 3k
https://github.com/gliderlabs/logspout/blob/818dd8260e52d2c148280d86170bdf5267b5c637/Makefile#L65 4.6k
https://github.com/istio/istio/blob/1cc44eed185626832487f56d74b0f18284d52b63/tests/integration/tests.mk#L84 33.3k
https://github.com/nuclio/nuclio/blob/ef5de182e9a22ebd48b68ad043a423bdf63cac90/Makefile#L689 4.9k
https://github.com/ory/kratos/blob/41b7c51c1c6b3bdff9e9ea8bb5e455e3c15c5256/Makefile#L79 8.9k

If you search "go run tests sequentually", basically all sources will tell you to use -p 1. This includes AI chatbots, popular stack overflow answers, blogs, and even some Go core maintainer recommendations.

Alternatives

  • Run multiple Go commands (such as go test ./a && go test ./b). This is not great; we lose parallelism in building.
  • Pre-build all the binaries (such as go test -c -o tests/ ./...; go test -p 1 ./...). This is demonstrated as effective here. However, it is highly limited unless proposal: cmd/go: support naming compiled tests with full qualified name #61199 is resolved as well.
  • Pre-compile only the binaries, demonstrated here. Basically we warm the build cache. This is better than nothing, but still much worse due to linking done sequentially. In our project, naive approach is 8min, this is 4min, while optimal is ~50s.
  • Put synchronization into the test binary itself. This is challenging and ineffective. Things like mutex cannot be used as it is cross-binary, so we need an external locking system. Even with that, it isn't effective. Tests will start and wait, blocking a worker. This results in something similar to this issue, which showed at least 2x worse test times.
  • Rework the tests to use one package or allow parallel execution. This is great if you can, but its a big ask.
@ianlancetaylor ianlancetaylor changed the title cmd/go: Add an option to sequentially run tests proposal: cmd/go: add an option to sequentially run tests Jul 12, 2023
@gopherbot gopherbot added this to the Proposal milestone Jul 12, 2023
@ianlancetaylor ianlancetaylor added the GoCommand cmd/go label Jul 12, 2023
@ianlancetaylor
Copy link
Contributor

CC @bcmills @matloob

@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

To clarify, you are looking for a way to run each package test one at a time?
What about the tests inside a given package that are using t.Parallel?
Do you want to disable that parallelism too?

@howardjohn
Copy link
Contributor Author

To clarify, you are looking for a way to run each package test one at a time?

Yes, only one package should execute at a time during go test -somenewflag ./.... Similar to go test -p 1 ./..., but without the side effect of compilation also being serialized.

What about the tests inside a given package that are using t.Parallel? Do you want to disable that parallelism too?

I think keeping inner-package parallelism is fine and could be controlled with the existing -test.parallel

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2023

Put synchronization into the test binary itself. This is challenging and ineffective. Things like mutex cannot be used as it is cross-binary, so we need an external locking system.

Compare #33974, which would add a lockedfile.Mutex that could be used by such tests.

@JeremyLoy
Copy link

At work, we have a mix of NodeJS and Golang services. Those services communicated with a database, and as such have tests that exercise queries against a Dockerized database. As that is a shared resource, tests that hit the database must be serialized, while tests that operate purely in memory are allowed to be parallelized.

The NodeJS services have a pretty easy way of solving this using a custom Jest runner, which I'll link below. Our way of doing this in Golang is by putting all of the tests that use an external resource (like a database) in a singular package so that they run sequentially. This causes the test files to be located not alongside the source files though, which is a shame.

In the NodeJS way, the stream of tests that must be sequential is defined by filename. That isn't very robust and is error prone.

I think there should be a way to do this in Go using a testing.T method, something like t.Mutex("aKeyToIdentifyResource"). That way each test is self contained, and the package/file/folder organization isn't impacted by which tests need exclusive access to a particular resource. Multiple distinct resources can exist as well.

import { Config } from "@jest/types";
import TestRunner, { Test, TestEvents, TestRunnerContext, TestWatcher, UnsubscribeFn } from "jest-runner";

export default class CustomTestRunner {
    public supportsEventEmitters = true;

    private parallelRunner: ParallelTestRunner;
    private sequentialRunner: SequentialTestRunner;

    constructor(globalConfig: Config.GlobalConfig, context: TestRunnerContext) {
        this.parallelRunner = new ParallelTestRunner(globalConfig, context);
        this.sequentialRunner = new SequentialTestRunner({ ...globalConfig, maxWorkers: 1 }, context);
    }

    on<Name extends keyof TestEvents>(
        eventName: Name,
        listener: (eventData: TestEvents[Name]) => void | Promise<void>,
    ): UnsubscribeFn {
        const parallelUnsubscribe = this.parallelRunner.on(eventName, listener);
        const sequentialUnsubscribe = this.sequentialRunner.on(eventName, listener);

        return () => {
            parallelUnsubscribe();
            sequentialUnsubscribe();
        };
    }

    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        await Promise.all([
            this.sequentialRunner.runTests(tests, watcher),
            this.parallelRunner.runTests(tests, watcher),
        ]);
    }
}

class ParallelTestRunner extends TestRunner {
    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        const isParallelizeable = (test: Test) => !test.path.includes(".sequential.");
        const parallelizeableTests = tests.filter(isParallelizeable);

        await super.runTests(parallelizeableTests, watcher, { serial: false });
    }
}

class SequentialTestRunner extends TestRunner {
    async runTests(tests: Test[], watcher: TestWatcher): Promise<void> {
        const isSequentialTest = (test: Test) => test.path.includes(".sequential.");
        const sequential = tests.filter(isSequentialTest);

        // { serial: false } triggers the parallel test execution leveraging workers,
        // but along with { maxWorkers: 1, workerIdleMemoryLimit: ... } configuration
        // this sole worker would be killed and restarted if it exceeds the memory threshold.
        await super.runTests(sequential, watcher, { serial: false });
    }
}

@matloob
Copy link
Contributor

matloob commented May 2, 2024

cc @samthanawalla

I think we should think harder about if there's a way to better support having the individual tests synchronize access to the external resource. I wonder if there's a better way to act on tests being blocked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Incoming
Development

No branches or pull requests

7 participants