
Add summary diagnostics to spec test gen #2926

Merged
merged 1 commit into ethereum:dev from spec-diagnostis on Jul 15, 2022

Conversation

@ralexstokes (Member) commented Jun 27, 2022

Fixes #2919.

Due to the current architecture of the spec test gen, it was most straightforward to simply write the stats for each generator to a local file.

Remaining todo:

  • collect all stats into one top-level report (see the sketch below)
  • test it out
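To make the approach concrete, here is a minimal sketch of the "write the stats for each generator to a local file" step plus the top-level aggregation from the todo list. The key names follow the diagnostics.json pasted later in this thread; the function names, file layout, and signatures are assumptions for illustration, not the PR's actual code.

```python
import json
from pathlib import Path


def write_diagnostics(output_dir: Path, collected: int, generated: int,
                      skipped: int, durations: list, test_identifiers: list) -> None:
    # Write this generator's stats next to its output vectors.
    diagnostics = {
        "collected_test_count": collected,
        "generated_test_count": generated,
        "skipped_test_count": skipped,
        "durations": durations,
        "test_identifiers": test_identifiers,
    }
    with open(output_dir / "diagnostics.json", "w") as f:
        json.dump(diagnostics, f, indent=2)


def collect_diagnostics(root: Path) -> dict:
    # Merge every per-generator diagnostics.json into one top-level report
    # (the remaining todo above).
    report = {"collected_test_count": 0, "generated_test_count": 0,
              "skipped_test_count": 0, "durations": [], "test_identifiers": []}
    for path in root.glob("**/diagnostics.json"):
        with open(path) as f:
            d = json.load(f)
        report["collected_test_count"] += d["collected_test_count"]
        report["generated_test_count"] += d["generated_test_count"]
        report["skipped_test_count"] += d["skipped_test_count"]
        report["durations"].extend(d["durations"])
        report["test_identifiers"].extend(d["test_identifiers"])
    return report
```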

@ralexstokes ralexstokes force-pushed the spec-diagnostis branch 5 times, most recently from 26ad566 to fb4a848 on June 29, 2022 16:46
@ralexstokes ralexstokes marked this pull request as ready for review June 29, 2022 23:24
@hwwhww (Contributor) left a comment


lgtm 👍
could you share the diagnostics.json result file of a full testgen?

@@ -216,6 +227,28 @@ def output_part(out_kind: str, name: str, fn: Callable[[Path, ], None]):
if span > TIME_THRESHOLD_TO_PRINT:
summary_message += f" in {span} seconds"
print(summary_message)
diagnostics = {
Contributor commented on this diff:

should we have an assertion to verify if generated_test_count + test_identifiers == collected_test_count?

@ralexstokes (Member, Author) replied:

something like this doesn't hurt, but I am also not convinced these assertions hold -- in particular, I was considering this for the skipped test count, and essentially collected == generated + skipped did NOT hold... haven't looked into it beyond that
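For reference, a hedged sketch of the consistency check being discussed; given the author's observation that collected == generated + skipped did not hold in practice, this version only warns rather than asserting. All names here are illustrative, not code from the PR.

```python
def check_counts(diagnostics: dict) -> None:
    # Soft consistency check on the counts in diagnostics.json. The thread
    # above notes that collected == generated + skipped did NOT always hold,
    # so a warning is safer than a hard assertion.
    collected = diagnostics["collected_test_count"]
    generated = diagnostics["generated_test_count"]
    skipped = diagnostics["skipped_test_count"]
    if collected != generated + skipped:
        print(f"warning: collected ({collected}) != "
              f"generated ({generated}) + skipped ({skipped})")
```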

@ralexstokes (Member, Author) commented:

{
  "collected_test_count": 17909,
  "generated_test_count": 17646,
  "skipped_test_count": 255,
  "durations": [
    "0.8 seconds",
    "1.39 seconds",
    "36.66 seconds",
    "43.24 seconds",
    "56.74 seconds",
    "76.23 seconds",
    "83.08 seconds",
    "45.07 seconds",
    "97.03 seconds",
    "113.93 seconds",
    "60.32 seconds",
    "138.4 seconds",
    "167.47 seconds",
    "65.53 seconds",
    "0.01 seconds",
    "0.01 seconds",
    "0.01 seconds",
    "166.4 seconds",
    "128.07 seconds",
    "136.8 seconds",
    "124.1 seconds",
    "123.87 seconds",
    "138.34 seconds",
    "229.12 seconds",
    "431.12 seconds",
    "504.37 seconds",
    "517.68 seconds",
    "411.34 seconds",
    "262.53 seconds",
    "217.94 seconds",
    "343.05 seconds",
    "389.64 seconds",
    "412.88 seconds",
    "492.91 seconds",
    "673.31 seconds",
    "612.24 seconds",
    "901.78 seconds",
    "521.62 seconds",
    "595.08 seconds",
    "956.12 seconds",
    "650.26 seconds",
    "41.64 seconds",
    "601.12 seconds",
    "1145.61 seconds",
    "81.86 seconds",
    "1160.32 seconds",
    "90.79 seconds",
    "762.62 seconds",
    "899.42 seconds",
    "898.18 seconds",
    "1224.85 seconds",
    "1712.58 seconds",
    "1199.03 seconds",
    "917.17 seconds",
    "3987.39 seconds",
    "4047.06 seconds",
    "1065.26 seconds",
    "2003.51 seconds",
    "5823.07 seconds",
    "1738.97 seconds",
    "3333.55 seconds",
    "87.37 seconds",
    "42.55 seconds",
    "62.76 seconds",
    "55.19 seconds",
    "2.55 seconds",
    "1.82 seconds",
    "1329.05 seconds",
    "1248.06 seconds",
    "0.0 seconds",
    "15.32 seconds"
  ]
}

I have omitted the test_identifiers field as it is just a very long list of test names.

here is a sample of those entries:

  "mainnet::bellatrix::transition::core::pyspec_tests::transition_with_leaking_pre_fork",
  "minimal::altair::transition::core::pyspec_tests::transition_with_one_fourth_slashed_active_validators_pre_fork",
  "minimal::bellatrix::transition::core::pyspec_tests::transition_with_one_fourth_slashed_active_validators_pre_fork",
  "minimal::altair::transition::core::pyspec_tests::transition_with_attester_slashing_right_after_fork",

@djrtwo (Contributor) left a comment

run_generator is a bit long of a function 😅
we could probably break it up into a number of better-named internal functions to define the steps more clearly, but don't need to do so now.

lgtm
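To illustrate the suggestion above (not a change made in this PR), run_generator could delegate to smaller, better-named internal helpers along these lines. Every helper name, signature, and body below is a hypothetical stub, assumed purely for the sake of the sketch.

```python
from pathlib import Path


def _parse_generator_args(runner_name: str) -> Path:
    # Hypothetical: parse CLI options for this generator, return its output dir.
    return Path("tests") / runner_name


def _generate_test_cases(test_providers, output_dir: Path) -> dict:
    # Hypothetical: run every provider and collect counts plus durations.
    return {"generated_test_count": 0, "skipped_test_count": 0, "durations": []}


def _write_diagnostics(output_dir: Path, stats: dict) -> None:
    # Hypothetical: persist the collected stats next to the generated vectors.
    print(f"would write {stats} to {output_dir / 'diagnostics.json'}")


def run_generator(runner_name: str, test_providers) -> None:
    # The entry point stays, but each step gets its own clearly named helper.
    output_dir = _parse_generator_args(runner_name)
    stats = _generate_test_cases(test_providers, output_dir)
    _write_diagnostics(output_dir, stats)
```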

@djrtwo djrtwo merged commit d49c98c into ethereum:dev Jul 15, 2022
@ralexstokes ralexstokes deleted the spec-diagnostis branch July 15, 2022 15:54
Development
Successfully merging this pull request may close these issues: Add diagnostics to spec test output
3 participants