*: test command v1.2 update #451
Conversation
Sorry for the long wait and long review :)
- First off, I have to address moving this command out of `alpha`. I think it sets us up for a problem. Either we: 1) make no breaking changes to these commands from here on out (e.g. changing flag names or defaults); 2) keep making breaking changes to flags etc. and respect semantic versioning, incrementing the major version every time we do so, which hampers our velocity; or 3) keep making minor breaking changes to these commands without respecting SemVer, shipping them in minor releases. That last option, I believe, would damage our reputation as a dev team, particularly if we break someone's app/DV/script because they blindly bumped to a new minor version and charon failed to start because it was passed a now-unknown flag. None of these three options sounds like a strategically good decision for us, given we aim to iterate on and improve this feature quickly. What do you think we should do: pick one of these options, or something else?
- I think on the run/prepare folder page, we should consider pasting sample output for each command. When someone is just browsing, it would help them understand the commands quickly without running them. If that made the page too long, it might be the moment to break each section into tabs: `Run with bash`, `Run with docker`, `Example Output`. Wdyt? (This could also go on the troubleshooting page, if we embed URIs in the console output so people land on the correct anchor tag to view their results and how to 'troubleshoot' them, i.e. compare them to what we imply a decent result looks like. That might be better than putting it in this guide? idk)
- I don't love that in these test commands we have `--endpoints`, `--beacon-endpoints`, `--beacon-node-endpoint`, and `--mev-beacon-node-endpoint`, all meaning one or more BN URLs, while on `charon run` we use `--beacon-node-endpoints`. Imo that's too many varieties. I think the prefixing is an interesting way to organise flags, but the fewer varieties of these flags, the fewer annoying user issues from people assuming the wrong flag name. Let's see how few we can cut it down to; imo I would try to get as close as possible to just `--beacon-node-endpoints` for all of them (see the first sketch at the end of this comment).
- We should also add these commands to the charon CLI reference page.
- For the MEV test:
  - When I attempted the load test first, it hung on my machine (I think it might have been due to the .io relay). Now, 4 of the 6 defaults fail on the `createmultipleblocks` test; it took 4m11s.
  - Do we need 6 digits of precision for milliseconds? Tbh as a human I probably need 0 digits beyond milliseconds; the variance is probably much larger than tenths or hundredths of a ms. Maybe the JSON could stay more precise. Fewer digits would make the output easier to read.
  - I fear a little how people might perceive the naming "mev relay load test"; it might sound a bit DoS-y. More thoughts on how we structure this command follow.
  - We should allow you to run the mev test the v1.1.2 way even if you don't have a BN handy (that, or we put an Obol BN / public default BN in there, but I remember we've discussed that we don't want to do that). I think many people don't have a BN easily available, and I don't want to hinder UX too much in that case; the simple test is likely 'good enough'.
  - What if we adjusted this command so that if you optionally pass the BN flag, we do the better test, and we don't call it a load test? E.g.: `--beacon-node-endpoints` If included, allows this command to ask relays for real MEV bids rather than simple status checks. Improves the realism of the test; increases the duration to a number of minutes.
  - Relatedly, asking for multiple blocks might smooth the result, but I think you said it increases test time a lot, because you need real matches between relays and proposers to get bids, and some are rare. I would be open to just 'load testing' by default if we're given a BN endpoint, but only asking for a single block, not the 3-block test; then, optionally, people could use --load-test-blocks to get longer, smoother data if they want to run the command for a few hours or something. (Imo I'd rename it to something like `--number-of-payloads` Increases the accuracy of the test by asking for multiple payloads. Increases test duration. (Default: 1).) See the second sketch at the end of this comment.
- Not in this PR, but so I don't forget, I want to suggest a tweak to the description of the `test all` command on the charon side:
  > Run test suites to evaluate your DV peers, your beacon node(s), your validator client, your MEV relays, your machine's performance and internet connectivity. Verify that Charon can validate effectively on the tested setup.
Happy to chat through this feedback synchronously if you want!
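
To make the flag-unification bullet concrete, here's a rough sketch of the surface I'd aim for. This is a proposal only; apart from `charon run --beacon-node-endpoints`, the flag spellings below are illustrative, not the current CLI:

```bash
# Proposal sketch: one plural flag name everywhere, matching charon run.
charon run \
  --beacon-node-endpoints="http://localhost:5052"

charon test all \
  --beacon-node-endpoints="http://localhost:5052,http://localhost:5053" \
  --mev-endpoints="https://boost-relay.flashbots.net"
```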
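
And for the MEV bullet, a sketch of the proposed optional-BN behaviour; `--number-of-payloads` is the hypothetical name suggested above, and the other flags are likewise illustrative:

```bash
# Proposal sketch. Without a beacon node, fall back to the simple
# (v1.1.2-style) relay connectivity checks:
charon test mev \
  --mev-endpoints="https://boost-relay.flashbots.net"

# With a beacon node, ask relays for real bids; the hypothetical
# --number-of-payloads (default 1) trades longer duration for smoother results:
charon test mev \
  --mev-endpoints="https://boost-relay.flashbots.net" \
  --beacon-node-endpoints="http://localhost:5052" \
  --number-of-payloads=3
```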
docs/run/prepare/test-command.md
--beacon-endpoints="https://ethereum-holesky-beacon-api.publicnode.com,https://ethereum-sepolia-beacon-api.publicnode.com" \
--beacon-load-test \
--mev-endpoints="https://0xac6e77dfe25ecd6110b8e780608cce0dab71fdd5ebea22a16c0205200f2f8e2e3ad3b71d3499c54ad14d6c21b41a37ae@boost-relay.flashbots.net" \
--mev-beacon-node-endpoint="https://ethereum-beacon-api.publicnode.com" \
Suggested change:
- --mev-beacon-node-endpoint="https://ethereum-beacon-api.publicnode.com" \
+ --mev-beacon-node-endpoints="https://ethereum-beacon-api.publicnode.com" \
Is this to get the proposer schedule, to ask the relay for a block? Can we just use the `--beacon-endpoints` flag if it is present, and do the older version of the mev test if we are given none at all?
> Is this to get the proposer schedule to ask the relay for a block?
It is used for two things:
- Getting the latest block, so we can use its hash when we query the MEV relay (the relay requires it).
- Fetching the proposers for an epoch, so we know which validators to query the MEV relay for.

Roughly, the two calls look like the sketch below.
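
A minimal illustration of those two calls using the standard Beacon API and the relay Builder API via curl. This is not charon's actual code; the beacon node URL and epoch are placeholders for the sketch:

```bash
BN="http://localhost:5052"   # placeholder beacon node URL

# 1) Latest block: its execution payload block hash is the parent_hash the relay expects.
PARENT_HASH=$(curl -s "$BN/eth/v2/beacon/blocks/head" \
  | jq -r '.data.message.body.execution_payload.block_hash')
echo "parent_hash: $PARENT_HASH"

# 2) Proposer duties for an epoch: which validator (pubkey) proposes in which slot.
EPOCH=123456   # placeholder epoch
curl -s "$BN/eth/v1/validator/duties/proposer/$EPOCH" | jq '.data[0]'

# With a slot and pubkey from those duties, the relay bid request is then:
#   GET <relay>/eth/v1/builder/header/{slot}/{parent_hash}/{pubkey}
```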
> can we just use the --beacon-endpoints flag if present

Yes, we can use it, and I assume you also want it for `test mev` and not only for `test all`. However, it will cause:
- Users being unable to test rare MEV with `test all`, as it will always get stuck finding a block.
- Misleading users, as we will make absolutely no use of multiple beacon nodes.
> Users being unable to test rare MEV with test all as it will always get stuck finding a block.

Hmmm, because we would try the 'good mev test' and thus not test the rare relays? I guess that's a reason not to join them, but I would also consider a flag on `test all` like `--mev-quick-test` or something to do the non-real way. I can't say that's necessarily easier than omitting a different flag, though.
> Misleading users as we will make absolutely no use of multiple beacon nodes.

If we do add `s`s to the `beacon-node-endpoint` flags in these places, we can create an issue to throw in the backlog about trying all beacon nodes with a pick list, like @gsora developed for the --fallback-beacon-nodes feature. Then, if someone takes deep offence at the misleading name, we can point to the issue and say sorry, WIP?
> If we do add `s`s to the `beacon-node-endpoint` flags in these places, we can create an issue to throw in the backlog about trying all beacon nodes with a pick list, like @gsora developed for the --fallback-beacon-nodes feature. Then, if someone takes deep offence at the misleading name, we can point to the issue and say sorry, WIP?

Okay, but... in the MEV tests we don't test or care about the BN's performance. How fast it delivers the payloads we ask for doesn't change the outcome of the MEV test at all; it tests purely the relay. And that is on purpose, to remove the bias that the BN might add. That said, I'm expecting any healthy BN with average performance to be able to answer the once-per-slot and once-per-epoch requests I send to it.
Made some suggestions; happy enough to meet you in the middle on the naming of the flags. Suggest we add back in the `alpha` if that's been changed on the charon code side at this point. :)
I've rebased the PR on main.
Summary
In version v1.2 we are improving the test command by a lot. This PR adds that to the docs.
Details
run/prepare/test-command.md
- Add the `charon test all` command
- Rename `performance` to `infra`
adv/troubleshooting/test-command.md
- New section that explains the reason for each test failing. A link to it will be added at the end of running `charon test <TEST>`.