[Merged by Bors] - Use `BeaconProcessor` for API requests #4462

paulhauner · 2023-07-04T07:59:56Z

Issue Addressed

NA

Proposed Changes

Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the BeaconProcessor. This achieves:

Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time).
Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued).
Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests).

Presently there are two levels of priorities:

Priority::P0
- The beacon processor will prioritise these above everything other than importing new blocks.
- Roughly all validator-sensitive endpoints.
Priority::P1
- The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things.
- Everything that's not Priority::P0

The --http-enable-beacon-processor false flag can be supplied to revert back to the old behaviour of spawning new tokio tasks for each request:

        --http-enable-beacon-processor <BOOLEAN>
            The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to
            "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API
            responses will be executed immediately. [default: true]

New CLI Flags

I added some other new CLI flags:

        --beacon-processor-aggregate-batch-size <INTEGER>
            Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may
            reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile
            network. [default: 64]
        --beacon-processor-attestation-batch-size <INTEGER>
            Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU
            usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network.
            [default: 64]
        --beacon-processor-max-workers <INTEGER>
            Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource
            consumption. Reducing the value may result in decreased resource usage and diminished performance. The
            default value is the number of logical CPU cores on the host.
        --beacon-processor-reprocess-queue-len <INTEGER>
            Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent
            messages from being dropped while lower values may help protect the node from becoming overwhelmed.
            [default: 12288]

I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with.

Additional Info

I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷

The validator/register validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a BeaconProcessor worker to just wait for the builder API HTTP response, so we spawn a new tokio task to wait for a builder response.

I've added an optimisation for the GET beacon/states/{state_id}/validators/{validator_id} endpoint in efbabe3. That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

*Replaces #4434. It is identical, but this PR has a smaller diff due to a curated commit history.* ## Issue Addressed NA ## Proposed Changes This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). ## Additional Info This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see #4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.

paulhauner · 2023-08-08T07:34:27Z

bors r+

## Issue Addressed NA ## Proposed Changes Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` ## New CLI Flags I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. ## Additional Info I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

bors · 2023-08-08T08:06:24Z

Build failed:

release-tests-windows

paulhauner · 2023-08-08T09:55:02Z

bors r+

## Issue Addressed NA ## Proposed Changes Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` ## New CLI Flags I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. ## Additional Info I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

bors · 2023-08-08T10:08:23Z

Build failed:

release-tests-ubuntu

michaelsproul · 2023-08-08T23:29:58Z

third time lucky

bors retry

## Issue Addressed NA ## Proposed Changes Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` ## New CLI Flags I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. ## Additional Info I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

bors · 2023-08-09T00:36:06Z

Pull request successfully merged into unstable.

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

## Issue Addressed Closes #3404 (mostly) ## Proposed Changes - Remove all uses of Warp's `and_then` (which backtracks) in favour of `then` (which doesn't). - Bump the priority of the `POST` method for `v2/blocks` to `P0`. Publishing a block needs to happen quickly. - Run the new SSZ POST endpoints on the beacon processor. I think this was missed in between merging #4462 and #4504/#4479. - Fix a minor issue in the validator registrations endpoint whereby an error from spawning the task on the beacon processor would be dropped. ## Additional Info I've tested this manually and can confirm that we no longer get the dreaded `Unsupported endpoint version` errors for queries like: ``` $ curl -X POST -H "Content-Type: application/json" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: WeakSubjectivityConflict", "stacktraces": [] } ``` ``` $ curl -X POST -H "Content-Type: application/octet-stream" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: invalid SSZ: OffsetOutOfBounds(572530811)", "stacktraces": [] } ``` ``` $ curl "http://localhost:5052/eth/v2/validator/blocks/7067595" {"code":400,"message":"BAD_REQUEST: invalid query: Invalid query string","stacktraces":[]} ``` However, I can still trigger it by leaving off the `Content-Type`. We can re-test this aspect with #4575.

## Issue Addressed Closes #4245 ## Proposed Changes - If an SSE channel fills up, send a comment instead of terminating the stream. - Add a CLI flag for scaling up the SSE buffer: `--http-sse-capacity-multiplier N`. ## Additional Info ~~Blocked on #4462. I haven't rebased on that PR yet for initial testing, because it still needs some more work to handle long-running HTTP threads.~~ - [x] Add CLI flag tests.

*Replaces sigp#4434. It is identical, but this PR has a smaller diff due to a curated commit history.* NA This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see sigp#4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.

NA Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](sigp@efbabe3). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

Closes sigp#3404 (mostly) - Remove all uses of Warp's `and_then` (which backtracks) in favour of `then` (which doesn't). - Bump the priority of the `POST` method for `v2/blocks` to `P0`. Publishing a block needs to happen quickly. - Run the new SSZ POST endpoints on the beacon processor. I think this was missed in between merging sigp#4462 and sigp#4504/sigp#4479. - Fix a minor issue in the validator registrations endpoint whereby an error from spawning the task on the beacon processor would be dropped. I've tested this manually and can confirm that we no longer get the dreaded `Unsupported endpoint version` errors for queries like: ``` $ curl -X POST -H "Content-Type: application/json" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: WeakSubjectivityConflict", "stacktraces": [] } ``` ``` $ curl -X POST -H "Content-Type: application/octet-stream" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: invalid SSZ: OffsetOutOfBounds(572530811)", "stacktraces": [] } ``` ``` $ curl "http://localhost:5052/eth/v2/validator/blocks/7067595" {"code":400,"message":"BAD_REQUEST: invalid query: Invalid query string","stacktraces":[]} ``` However, I can still trigger it by leaving off the `Content-Type`. We can re-test this aspect with sigp#4575.

Closes sigp#4245 - If an SSE channel fills up, send a comment instead of terminating the stream. - Add a CLI flag for scaling up the SSE buffer: `--http-sse-capacity-multiplier N`. ~~Blocked on sigp#4462. I haven't rebased on that PR yet for initial testing, because it still needs some more work to handle long-running HTTP threads.~~ - [x] Add CLI flag tests.

*Replaces sigp#4434. It is identical, but this PR has a smaller diff due to a curated commit history.* NA This PR moves the scheduling logic for the `BeaconProcessor` into a new crate in `beacon_node/beacon_processor`. Previously it existed in the `beacon_node/network` crate. This addresses a circular-dependency problem where it's not possible to use the `BeaconProcessor` from the `beacon_chain` crate. The `network` crate depends on the `beacon_chain` crate (`network -> beacon_chain`), but importing the `BeaconProcessor` into the `beacon_chain` crate would create a circular dependancy of `beacon_chain -> network`. The `BeaconProcessor` was designed to provide queuing and prioritized scheduling for messages from the network. It has proven to be quite valuable and I believe we'd make Lighthouse more stable and effective by using it elsewhere. In particular, I think we should use the `BeaconProcessor` for: 1. HTTP API requests. 1. Scheduled tasks in the `BeaconChain` (e.g., state advance). Using the `BeaconProcessor` for these tasks would help prevent the BN from becoming overwhelmed and would also help it to prioritize operations (e.g., choosing to process blocks from gossip before responding to low-priority HTTP API requests). This PR is intended to have zero impact on runtime behaviour. It aims to simply separate the *scheduling* code (i.e., the `BeaconProcessor`) from the *business logic* in the `network` crate (i.e., the `Worker` impls). Future PRs (see sigp#4462) can build upon these works to actually use the `BeaconProcessor` for more operations. I've gone to some effort to use `git mv` to make the diff look more like "file was moved and modified" rather than "file was deleted and a new one added". This should reduce review burden and help maintain commit attribution.

NA Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves: 1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time). 1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued). 1. Allows the BN prioritise HTTP requests with respect to messages coming from the P2P network (i.e., proiritise importing gossip blocks rather than serving API requests). Presently there are two levels of priorities: - `Priority::P0` - The beacon processor will prioritise these above everything other than importing new blocks. - Roughly all validator-sensitive endpoints. - `Priority::P1` - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill things. - Everything that's not `Priority::P0` The `--http-enable-beacon-processor false` flag can be supplied to revert back to the old behaviour of spawning new `tokio` tasks for each request: ``` --http-enable-beacon-processor <BOOLEAN> The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API responses will be executed immediately. [default: true] ``` I added some other new CLI flags: ``` --beacon-processor-aggregate-batch-size <INTEGER> Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-attestation-batch-size <INTEGER> Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network. [default: 64] --beacon-processor-max-workers <INTEGER> Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource consumption. Reducing the value may result in decreased resource usage and diminished performance. The default value is the number of logical CPU cores on the host. --beacon-processor-reprocess-queue-len <INTEGER> Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent messages from being dropped while lower values may help protect the node from becoming overwhelmed. [default: 12288] ``` I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the Github runners only have 2 cores and there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with. I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses, I guess this is what we signed up for 🤷 The `validator/register` validator has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response. I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](sigp@efbabe3). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.

Closes sigp#3404 (mostly) - Remove all uses of Warp's `and_then` (which backtracks) in favour of `then` (which doesn't). - Bump the priority of the `POST` method for `v2/blocks` to `P0`. Publishing a block needs to happen quickly. - Run the new SSZ POST endpoints on the beacon processor. I think this was missed in between merging sigp#4462 and sigp#4504/sigp#4479. - Fix a minor issue in the validator registrations endpoint whereby an error from spawning the task on the beacon processor would be dropped. I've tested this manually and can confirm that we no longer get the dreaded `Unsupported endpoint version` errors for queries like: ``` $ curl -X POST -H "Content-Type: application/json" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: WeakSubjectivityConflict", "stacktraces": [] } ``` ``` $ curl -X POST -H "Content-Type: application/octet-stream" --data @block.json "http://localhost:5052/eth/v2/beacon/blocks" | jq { "code": 400, "message": "BAD_REQUEST: invalid SSZ: OffsetOutOfBounds(572530811)", "stacktraces": [] } ``` ``` $ curl "http://localhost:5052/eth/v2/validator/blocks/7067595" {"code":400,"message":"BAD_REQUEST: invalid query: Invalid query string","stacktraces":[]} ``` However, I can still trigger it by leaving off the `Content-Type`. We can re-test this aspect with sigp#4575.

Closes sigp#4245 - If an SSE channel fills up, send a comment instead of terminating the stream. - Add a CLI flag for scaling up the SSE buffer: `--http-sse-capacity-multiplier N`. ~~Blocked on sigp#4462. I haven't rebased on that PR yet for initial testing, because it still needs some more work to handle long-running HTTP threads.~~ - [x] Add CLI flag tests.

paulhauner added blocked work-in-progress PR is a work-in-progress labels Jul 4, 2023

paulhauner force-pushed the api-uses-bp branch from cc40f1e to 1f9a7a0 Compare July 5, 2023 03:00

paulhauner mentioned this pull request Jul 5, 2023

[Merged by Bors] - Move the BeaconProcessor into a new crate #4435

Closed

michaelsproul mentioned this pull request Jul 6, 2023

Spurious Error whilst producing block when using block relays #4473

Closed

paulhauner added 22 commits July 11, 2023 09:30

Start adding HTTP queues

238837b

Start wiring up beacon processor

e911287

Start adding TaskSpawner

da96733

Refactor to avoid repetition

5e19518

Add async fns to task spawner

be7f32f

Start wiring task spawner into routes

c5f159e

Wire task_spawner into all routes

5914e36

Add P1 queue to BP

799d20c

Start simplifying TaskSpawner

a7b259d

Futher simplify TaskSpawner

10dc153

Hold the TestRuntime in tests to prevent BP shutdown

b01b595

Dedup runtime shutdown

1ea991c

Bump syncing and health endpoints to P0

eb2be2c

Bypass the task_spawner for node/version

8a76593

Add BeaconProcessorConfig

1157568

Add CLI flags

e03017d

Set max_workers to 4 in sims

596df18

Add batch size CLI flags

15ed85d

Add -len suffix to flags.

f576b94

Tidy help text

90a8f8a

Bump max workers to 8

1d84f94

Remove unused import

9133265

paulhauner force-pushed the api-uses-bp branch from a164a51 to 9133265 Compare July 10, 2023 23:31

Merge branch 'unstable' into api-uses-bp

4e6b868

paulhauner added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Aug 8, 2023

bors bot changed the title ~~Use BeaconProcessor for API requests~~ [Merged by Bors] - Use BeaconProcessor for API requests Aug 9, 2023

bors bot closed this Aug 9, 2023

michaelsproul mentioned this pull request Aug 10, 2023

[Merged by Bors] - Improve HTTP API error messages + tweaks #4595

Closed

michaelsproul mentioned this pull request Aug 18, 2023

[Merged by Bors] - Implement expected withdrawals endpoint #4390

Closed

michaelsproul mentioned this pull request Sep 19, 2023

[BUG] Node lost synchronisation during 20 minutes #4747

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Merged by Bors] - Use `BeaconProcessor` for API requests #4462

[Merged by Bors] - Use `BeaconProcessor` for API requests #4462

paulhauner commented Jul 4, 2023 •

edited

Loading

paulhauner commented Aug 8, 2023

bors bot commented Aug 8, 2023

paulhauner commented Aug 8, 2023

bors bot commented Aug 8, 2023

michaelsproul commented Aug 8, 2023

bors bot commented Aug 9, 2023

[Merged by Bors] - Use BeaconProcessor for API requests #4462

[Merged by Bors] - Use BeaconProcessor for API requests #4462

Conversation

paulhauner commented Jul 4, 2023 • edited Loading

Issue Addressed

Proposed Changes

New CLI Flags

Additional Info

paulhauner commented Aug 8, 2023

bors bot commented Aug 8, 2023

paulhauner commented Aug 8, 2023

bors bot commented Aug 8, 2023

michaelsproul commented Aug 8, 2023

bors bot commented Aug 9, 2023

[Merged by Bors] - Use `BeaconProcessor` for API requests #4462

[Merged by Bors] - Use `BeaconProcessor` for API requests #4462

paulhauner commented Jul 4, 2023 •

edited

Loading