[wasm] Wasm.Build.Tests - split helix workload to be one per config #49559

Closed
wants to merge 17 commits into from

Member

@radical radical commented Mar 12, 2021

No description provided.

radical added 13 commits March 11, 2021 17:35
- Essentially, we want to share builds wherever possible. Example cases:

    - Same build, but run with different hosts like v8/chrome/safari, as
      separate test runs
    - Same build but run with different command line arguments

- Sharing builds especially helps when we are AOT'ing, which is slow!

- This is done by caching the builds with the key:

    `public record BuildArgs(string ProjectName, string Config, bool AOT, string ProjectFileContents, string? ExtraBuildArgs);`

- Also, `SharedBuildClassFixture` is added, so that the builds can be
  cleaned up after all the tests in a particular class have finished
  running.

- Each test run gets a randomly generated test id. This is used for
  creating:
  1. build paths, like `artifacts/bin/Wasm.Build.Tests/net6.0-Release/browser-wasm/xharness-output/logs/n1xwbqxi.ict`
  2. and the logs from running with xharness, e.g. for Chrome, in
     `artifacts/bin/Wasm.Build.Tests/net6.0-Release/browser-wasm/xharness-output/logs/n1xwbqxi.ict/Chrome/`

- Split `WasmBuildAppTest.cs` into `BuildTestBase.cs` and
  `MainWithArgsTests.cs`.
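
The build sharing described in this list can be sketched roughly as follows. This is an illustrative Python stand-in, not the C# code from the test project: only the field names of `BuildArgs` come from the commit message, while `BuildCache`, `fake_build`, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class BuildArgs:
    """Python stand-in for the C# record that is used as the cache key."""
    project_name: str
    config: str
    aot: bool
    project_file_contents: str
    extra_build_args: Optional[str] = None


class BuildCache:
    """Hypothetical cache: runs the (slow) build once per distinct BuildArgs."""

    def __init__(self, build: Callable[[BuildArgs], str]) -> None:
        self._build = build
        self._outputs: Dict[BuildArgs, str] = {}
        self.build_count = 0

    def get_or_build(self, args: BuildArgs) -> str:
        # Frozen dataclasses compare and hash by value, like a C# record.
        if args not in self._outputs:
            self._outputs[args] = self._build(args)
            self.build_count += 1
        return self._outputs[args]

    def clear(self) -> None:
        """What a class-level fixture could call after the last test in a class."""
        self._outputs.clear()


def fake_build(args: BuildArgs) -> str:
    # Hypothetical build step: returns a made-up output directory name.
    return f"{args.project_name}-{args.config}-{'aot' if args.aot else 'noaot'}"


cache = BuildCache(fake_build)

# Two test runs (say, v8 and Chrome) asking for the same build share one result.
first = cache.get_or_build(BuildArgs("main_args", "Release", True, "<Project />"))
second = cache.get_or_build(BuildArgs("main_args", "Release", True, "<Project />"))

# Changing any field of the key, here AOT, triggers a separate build.
third = cache.get_or_build(BuildArgs("main_args", "Release", False, "<Project />"))
```

Because the key includes the project file contents and the extra build arguments, two tests only share a build when every input to the build is identical.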
For AOT we generate `pinvoke-table.h` in the obj directory. But there is
one present in the runtime pack too.

In my earlier changes the order in which these were passed as include
search paths was changed to:

`"-I/runtime/pack/microsoft.netcore.app.runtime.browser-wasm/Release/runtimes/browser-wasm/native/include/wasm" "-Iartifacts/obj/mono/Wasm.Console.Sample/wasm/Release/browser-wasm/wasm/"`

.. which meant that the one from the runtime pack took precedence, and
got used. So, fix the order!

And change the property names to indicate where they are sourced from.
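
Why the order matters can be shown with a small simulation of include lookup. This is a sketch that assumes the usual compiler behaviour of searching `-I` directories in the order given; `resolve_include` and the in-memory `files` set are hypothetical helpers, and the two directories are the ones quoted above.

```python
from typing import Iterable, Optional, Set

RUNTIME_PACK = ("/runtime/pack/microsoft.netcore.app.runtime.browser-wasm/"
                "Release/runtimes/browser-wasm/native/include/wasm")
OBJ_DIR = "artifacts/obj/mono/Wasm.Console.Sample/wasm/Release/browser-wasm/wasm/"


def resolve_include(header: str, search_dirs: Iterable[str],
                    files: Set[str]) -> Optional[str]:
    """Return the first directory/header combination that exists."""
    for directory in search_dirs:
        candidate = directory.rstrip("/") + "/" + header
        if candidate in files:
            return candidate
    return None


# Both locations contain a pinvoke-table.h, as described above.
files = {
    RUNTIME_PACK + "/pinvoke-table.h",
    OBJ_DIR.rstrip("/") + "/pinvoke-table.h",
}

# Runtime pack first: the header shipped in the pack shadows the generated one.
pack_first = resolve_include("pinvoke-table.h", [RUNTIME_PACK, OBJ_DIR], files)

# Obj directory first: the header generated for this AOT build is picked up.
obj_first = resolve_include("pinvoke-table.h", [OBJ_DIR, RUNTIME_PACK], files)
```

With the obj directory listed first, the lookup returns the generated header, which is the outcome the fix is after.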
The environment variable is set on helix. During local testing it can be
useful when using a locally built xharness.
This is done via the environment variable `WBT_TestConfigsToUse`, which takes a
comma-separated value.

- Also adds a general mechanism to surface environment variables
  prefixed with `WBT_` in `CommonSettings` class
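
The environment variable handling might look roughly like this. It is a Python sketch rather than the actual `CommonSettings` class: the helper names are hypothetical, and the sample value `Debug, Release` and the behaviour for an unset variable (an empty list) are assumptions.

```python
from typing import Dict, List, Mapping

PREFIX = "WBT_"


def wbt_settings(environ: Mapping[str, str]) -> Dict[str, str]:
    """Collect every WBT_-prefixed variable, keyed by the name without the prefix."""
    return {name[len(PREFIX):]: value
            for name, value in environ.items()
            if name.startswith(PREFIX)}


def parse_test_configs(environ: Mapping[str, str]) -> List[str]:
    """Split WBT_TestConfigsToUse on commas, dropping blanks and stray spaces."""
    raw = wbt_settings(environ).get("TestConfigsToUse", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


# A plain dict stands in for the process environment; os.environ works the same way.
sample_env = {"WBT_TestConfigsToUse": "Debug, Release", "PATH": "/usr/bin"}

settings = wbt_settings(sample_env)
configs = parse_test_configs(sample_env)
```

Passing the mapping in, instead of reading the process environment directly, keeps the parsing easy to exercise in a test.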

ghost commented Mar 12, 2021

Tagging subscribers to this area: @tarekgh, @safern
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: radical
Assignees: -
Labels:

area-System.Globalization

Milestone: -

@radical radical added arch-wasm WebAssembly architecture and removed area-System.Globalization labels Mar 12, 2021
@radical radical force-pushed the wasm-helix-parallel branch from 3dc5702 to 8855c0c Compare March 13, 2021 03:48
@radical radical force-pushed the wasm-helix-parallel branch from 8855c0c to 69c17da Compare March 13, 2021 05:56
Member

safern commented Mar 15, 2021

Question: I see that with this change the browser job is sending two helix jobs with the same Azure DevOps test run. Is that correct? Should that be only one helix job?

  Job 36c6166c-8e84-4dc1-8767-3990a61f251e on Ubuntu.1804.Amd64.Open is completed with 2 finished work items.
  Stopping Azure Pipelines Test Run net6.0-Browser-Release-wasm-Mono_Release-buildwasmapps-Ubuntu.1804.Amd64.Open
  Job c7a3149d-b184-4901-8394-930e055e2e77 on Ubuntu.1804.Amd64.Open is completed with 2 finished work items.
  Stopping Azure Pipelines Test Run net6.0-Browser-Release-wasm-Mono_Release-buildwasmapps-Ubuntu.1804.Amd64.Open

Or are they missing something on the test run name like AOT or a test run mode?

Member Author

radical commented Mar 15, 2021

> Question: I see that with this change the browser job is sending two helix jobs with the same Azure DevOps test run. Is that correct? Should that be only one helix job?
>
> Or are they missing something on the test run name like AOT or a test run mode?

I want to split the test run so they run in parallel - for Debug, and Release. I tried submitting them as two helix work items in the same job, but that seemed to not run in parallel. So, I tried sending them as separate jobs. Does that sound correct?

I can modify the names, so it's clearer what they are for.

Member

safern commented Mar 15, 2021

> Does that sound correct?

Yes that sounds correct.

> I can modify the names, so it's clearer what they are for.

It would be great to include the configuration in the test run name, so that it is clear which configuration a failure belongs to in the test failures tab.

Also, I noticed that the "new" work items take 23 mins to run. Is there a way we can make them more granular so that they are faster? The problem with such long work items is that we are not taking much advantage of the helix infrastructure, which is meant to parallelize as many tests or work items as it can across multiple agents.

Member Author

radical commented Mar 15, 2021

> Also, I noticed that the "new" work items take 23 mins to run. Is there a way we can make them more granular so that they are faster? The problem with such long work items is that we are not taking much advantage of the helix infrastructure, which is meant to parallelize as many tests or work items as it can across multiple agents.

Yep, and I plan to do exactly that in follow-up PRs, in a way that reduces the changes needed to the helix proj files. This one is to get the timings back to something reasonable, so we can enable the Wasm.Build tests again.

Member Author

radical commented Mar 16, 2021

@safern I'm looking at the logs for this - https://dev.azure.com/dnceng/public/_build/results?buildId=1040170&view=logs&jobId=108d2c4a-8a62-5a58-8dad-8e1042acc93c&j=108d2c4a-8a62-5a58-8dad-8e1042acc93c&t=568f884b-cc12-5fd3-e7fe-790b5ac403f4 .

(raw log with timestamps: https://dev.azure.com/dnceng/9ee6d478-d288-47f7-aacc-f6e6d082ae6d/_apis/build/builds/1040170/logs/1779)

There are four jobs submitted:

  1. Debug wasm.build.tests
  2. Release wasm.build.tests
  3. normal (running with v8, I think)
  4. wasmtestonbrowser (running with chrome)
  - wasm.build tests submission
2021-03-15T23:07:41.1142549Z   Sent Helix Job; see work items at https://helix.dot.net/api/jobs/22c04ee1-52bb-40cf-b859-5ec490aba39b/workitems?api-version=2019-06-17
2021-03-15T23:07:41.1336632Z   Sent Helix Job; see work items at https://helix.dot.net/api/jobs/a32b297a-d4a1-46c0-8ae4-c631e096edd2/workitems?api-version=2019-06-17
  - library tests submission
2021-03-15T23:07:55.4873872Z   Sent Helix Job; see work items at https://helix.dot.net/api/jobs/ef5ecd49-671c-4d2c-a48f-e262dec9ba9d/workitems?api-version=2019-06-17
2021-03-15T23:07:55.5659204Z   Sent Helix Job; see work items at https://helix.dot.net/api/jobs/e5a3970e-87c2-4b3e-8a30-fc6c4330fa59/workitems?api-version=2019-06-17

They get submitted at roughly the same time. Then (3), and (4) complete after ~23mins.

2021-03-15T23:31:37.9796887Z   Job ef5ecd49-671c-4d2c-a48f-e262dec9ba9d on Ubuntu.1804.Amd64.Open is completed with 216 finished work items.
2021-03-15T23:34:58.3293216Z   Job e5a3970e-87c2-4b3e-8a30-fc6c4330fa59 on Ubuntu.1804.Amd64.Open is completed with 216 finished work items.

And (1) and (2) complete ~42mins after submission:

2021-03-15T23:49:57.6478348Z   Job 22c04ee1-52bb-40cf-b859-5ec490aba39b on Ubuntu.1804.Amd64.Open is completed with 2 finished work items.
2021-03-15T23:50:57.5418212Z   Job a32b297a-d4a1-46c0-8ae4-c631e096edd2 on Ubuntu.1804.Amd64.Open is completed with 2 finished work items.

But looking at the logs for the wasm build test runs, those seem to complete in roughly 25mins each. What could be causing them to take an extra 17mins to return from helix?
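
The turnaround figures can be rechecked directly from the timestamps quoted above. This is a small Python sketch: the timestamps and job ids are copied from the log excerpts, the 25 minute run time is the rough figure mentioned in this comment, and the helper names are hypothetical.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a log timestamp such as 2021-03-15T23:07:41.1142549Z."""
    head, frac = ts.rstrip("Z").split(".")
    # strptime's %f accepts at most 6 fractional digits; the log has 7.
    return datetime.strptime(f"{head}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f")


def minutes_between(sent: str, completed: str) -> float:
    return (parse_ts(completed) - parse_ts(sent)).total_seconds() / 60


# (sent, completed) pairs from the log, keyed by the first part of the job id.
jobs = {
    "22c04ee1": ("2021-03-15T23:07:41.1142549Z", "2021-03-15T23:49:57.6478348Z"),
    "a32b297a": ("2021-03-15T23:07:41.1336632Z", "2021-03-15T23:50:57.5418212Z"),
    "ef5ecd49": ("2021-03-15T23:07:55.4873872Z", "2021-03-15T23:31:37.9796887Z"),
    "e5a3970e": ("2021-03-15T23:07:55.5659204Z", "2021-03-15T23:34:58.3293216Z"),
}

turnaround = {job: round(minutes_between(sent, done), 1)
              for job, (sent, done) in jobs.items()}

# Rough run time of each wasm.build work item, per its console log.
REPORTED_RUN_MINUTES = 25
overhead = {job: round(turnaround[job] - REPORTED_RUN_MINUTES, 1)
            for job in ("22c04ee1", "a32b297a")}

for job, minutes in turnaround.items():
    print(f"{job}: {minutes} min from submission to completion")
```

This gives about 42.3 and 43.3 minutes for the two wasm.build jobs, and about 23.7 and 27.0 minutes for the two library test jobs, which leaves roughly 17 to 18 minutes per wasm.build job that the 25 minute run time does not account for.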

Console log for (1), and (2):
a32b297ad4a146c08a
22c04ee152bb40cfb8

Am I doing something wrong, or is this expected?

Member Author

radical commented Mar 16, 2021

> I want to split the test run so they run in parallel - for Debug, and Release. I tried submitting them as two helix work items in the same job, but that seemed to not run in parallel. So, I tried sending them as separate jobs. Does that sound correct?

I was talking to @steveisok, and he said that submitting these as separate work items should make them run in parallel too. But his hypothesis is that in this case, since one job would have only two work items, maybe helix is doing some kinda "optimization" to just run them sequentially? Is that correct, @safern ?

And to be clear, the current PR does two separate job submissions, instead of two work items in the same job.

Member Author

radical commented Apr 7, 2021

Closing this, because I plan to split the test runs more, and in a different manner.

@radical radical closed this Apr 7, 2021
@radical radical deleted the wasm-helix-parallel branch April 7, 2021 15:36
@ghost ghost locked as resolved and limited conversation to collaborators May 7, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
Labels
arch-wasm WebAssembly architecture
3 participants