op-conductor,op-node: allow system to select port, make op-node wait for conductor endpoint #12863

protolambda · 2024-11-07T07:48:30Z

Description

Make the conductor actually listen on the host it is configured with, rather than always on 0.0.0.0.
Make the conductor optionally default to the address it binds to, rather than a preconfigured address, when advertising itself as a peer.
Add a conductor CLI option to change the server address it advertises, so it does not have to use the address it binds to, in case you bind it to an IP that doesn't resolve otherwise. If unspecified, it defaults to the address it is listening on.
Make op-node wait for the conductor RPC to become available. In regular operation from the CLI it is available immediately, but it may take longer in an e2e test, where the conductor is started after the op-node. The conductor RPC client is lazy-loaded: initialization runs on first use of an RPC method, rather than on creation of the RPC client.

Tests

E2E system now sets up conductors without trying to reserve ports. This prevents flakes where endpoints aren't actually reserved, or where endpoints are not up and running yet.

Note: TestSequencerFailover_ActiveSequencerDown still seems to flake due to a leadership transfer timeout, possibly because the 1-second timeout is quite low for CI on limited resources. So I bumped the conductor RPC timeout to 5 seconds.

And I am leaving the op-conductor logging on debug level, so we can look at any further flakes if they do happen.

op-conductor/consensus/raft.go

op-e2e/system/conductor/sequencer_failover_setup.go

mslipper

looks good to me, but would like someone with more conductor experience to opine before merging.

tynes · 2024-11-07T16:28:25Z

I think this code should be moved over to the infra repo given the infra team owns the code in the infra repo and this repo is mostly owned by the protocol team

axelKingsley · 2024-11-07T17:41:23Z

axelKingsley · 2024-11-07T17:41:48Z

(My bad, I misclicked when reading this review and closed it momentarily)

zhwrd

one change suggestion, otherwise lgtm

op-conductor/consensus/raft.go

zhwrd · 2024-11-07T18:42:25Z

> I think this code should be moved over to the infra repo given the infra team owns the code in the infra repo and this repo is mostly owned by the protocol team

Not sure it's about which team owns what code; imo we really need e2e tests for conductor to run when op-stack changes. We actually need to expand the e2e test suite for conductor, since a few regressions introduced in op-stack could have been caught earlier. I'm also considering versioning conductor with the rest of the op-stack since they are pretty tightly coupled, but tbd.

protolambda added 4 commits November 7, 2024 14:30

op-conductor,op-node: allow system to select port, make op-node wait for conductor endpoint

6fb38af

op-conductor,op-node: debugging conductor test

7f28116

op-conductor: more debugging

9d2335a

op-e2e: increase conductor timeout

45c4852

protolambda requested review from zhwrd, 0x00101010 and mslipper November 7, 2024 09:09

mslipper reviewed Nov 7, 2024

op-conductor/consensus/raft.go (resolved)

mslipper reviewed Nov 7, 2024

op-e2e/system/conductor/sequencer_failover_setup.go (resolved)

mslipper approved these changes Nov 7, 2024

axelKingsley closed this Nov 7, 2024

axelKingsley reopened this Nov 7, 2024

zhwrd approved these changes Nov 7, 2024

op-conductor/consensus/raft.go (resolved)

op-conductor/consensus/raft.go (resolved)

protolambda added this pull request to the merge queue Nov 8, 2024

Merged via the queue into develop with commit 5662448 Nov 8, 2024
49 checks passed

protolambda deleted the conductor-network-setup-fix branch November 8, 2024 12:58
