Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CLI] Add support for running postgres in a container to the local testnet + rest of PRs in stack #10271

Merged
merged 8 commits into from
Oct 11, 2023

Conversation

banool
Copy link
Contributor

@banool banool commented Sep 27, 2023

Stack

Description

This PR adds support for:

  • Spinning up postgres in a container.
  • Using postgres already running in the host system.

This is required to run indexer processors.

I still need to double check that this works on a fresh system, I have a suspicion that if postgres isn't installed things will fail with some error to the effect of "postgres shared library missing".

I also need to make sure I'm not reintroducing the OpenSSL dep.

This depends on aptos-labs/aptos-indexer-processors#146.

Note: To avoid using undesired diesel features in the CLI I had to move the feature declarations from the workspace level to the crate level. Crate level feature declarations are additive, they don't replace the features used at the workspace level.

Test Plan

Postgres in a container

cargo run -p aptos -- node run-local-testnet --assume-yes --force-restart --with-indexer-api --postgres-user dport
Node API is ready. Endpoint: http://0.0.0.0:8080/
Postgres is ready. Endpoint: postgres://dport@127.0.0.1:5433/local_testnet
Transaction stream is ready. Endpoint: http://0.0.0.0:50051/
Faucet is ready. Endpoint: http://127.0.0.1:8081/
psql postgres://dport@127.0.0.1:5433/local_testnet
psql (14.9 (Homebrew))
Type "help" for help.

local_testnet=# \dn
List of schemas
  Name  | Owner
--------+-------
 public | dport
(1 row)

Once the CLI shuts down so does the postgres container.

Host postgres

cargo run -p aptos -- node run-local-testnet --assume-yes --force-restart --with-indexer-api --postgres-user dport --use-host-postgres
Postgres is ready. Endpoint: postgres://dport@127.0.0.1:5432/local_testnet
psql postgres://dport@127.0.0.1:5433/local_testnet
psql (14.9 (Homebrew))
Type "help" for help.

local_testnet=# \dn
List of schemas
  Name  | Owner
--------+-------
 public | dport
(1 row)

Further Testing

Now that I've rebased all the other PRs in the stack into this PR, I will do some additional testing. I'm using this invocation:

cargo run -p aptos -- node run-local-testnet --assume-yes --force-restart --with-indexer-api

Does it work on:

  • ARM MacOS: Yes!!
  • x86 Ubuntu 22: Yes!!
  • x86 Windows (host system, NT, MSVC): Not working, but due to an unrelated error that occurs on main too: https://gist.github.com/banool/135ce49c0662832446778c6268b68d62.
  • x86 Windows (Ubuntu 22 WSL): Yes!!
  • CLI running inside container from x86 Linux tools image running on ARM MacOS: Yes!!

For the last of these, I ran it like this:

DOCKER_DEFAULT_PLATFORM=linux/amd64 docker run -it -v /var/run/docker.sock:/var/run/docker.sock --network host aptos-core/tools:from-local aptos node run-local-testnet --with-indexer-api

@banool banool changed the base branch from main to banool/local-testnet-structure September 27, 2023 14:39
@banool banool force-pushed the banool/local-testnet-postgres branch from 56436ab to fcb0465 Compare September 27, 2023 14:52
@banool banool marked this pull request as ready for review September 27, 2023 15:39
@banool banool force-pushed the banool/local-testnet-structure branch from f59af3e to 0ccc2b3 Compare September 28, 2023 11:04
@banool banool force-pushed the banool/local-testnet-postgres branch from fcb0465 to fe141c9 Compare September 28, 2023 11:12
Copy link
Contributor

@gregnazario gregnazario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, comments on CLI commands may be a little long.

keep in mind that the first line will show in -h and the rest of the lines will show in --help, I don't know what the cutoff is.

@banool banool force-pushed the banool/local-testnet-postgres branch from fe141c9 to 25fe203 Compare September 28, 2023 13:48
@banool banool force-pushed the banool/local-testnet-structure branch from 0ccc2b3 to 41609e9 Compare September 28, 2023 17:20
@banool banool force-pushed the banool/local-testnet-postgres branch from 25fe203 to ec1bb8f Compare September 28, 2023 17:20
@banool banool force-pushed the banool/local-testnet-structure branch from 41609e9 to 982bb39 Compare September 29, 2023 08:17
@banool banool force-pushed the banool/local-testnet-postgres branch from ec1bb8f to a20df49 Compare September 29, 2023 08:17
@banool banool force-pushed the banool/local-testnet-structure branch from 982bb39 to 09eae5f Compare September 29, 2023 10:03
@banool banool force-pushed the banool/local-testnet-postgres branch from a20df49 to 0166c9f Compare September 29, 2023 10:03
@banool banool force-pushed the banool/local-testnet-structure branch from 09eae5f to 12597d6 Compare September 29, 2023 18:48
@banool banool force-pushed the banool/local-testnet-postgres branch from 0166c9f to 543cdac Compare September 29, 2023 18:48
@banool banool marked this pull request as draft September 29, 2023 22:37
@banool
Copy link
Contributor Author

banool commented Sep 29, 2023

Turns out diesel relies on pq-sys, which wraps libpq. It isn't acceptable to add this as a dep for the CLI, we have no way to ensure users have it installed and even if we did, it feels like an unnecessary dep for users who aren't using the local testnet / this feature of the local testnet.

I'm investigating alternatives here: https://aptos-org.slack.com/archives/C04PRP1K1FZ/p1696019475810529.

@banool
Copy link
Contributor Author

banool commented Oct 1, 2023

I switched to diesel-async. This solves it for this PR but the processors still use the mainline diesel postgres library, which uses libpg internally. To address this, I will make use of this branch for now in that PR: aptos-labs/aptos-indexer-processors#146. We'll discuss whether we want to actually adopt this PR or if I have to use an alternative, like implementing diesel::Connection on https://github.com/sfackler/rust-postgres and using that.

@banool banool force-pushed the banool/local-testnet-postgres branch 2 times, most recently from 446a5fb to 57b3c28 Compare October 1, 2023 17:56
@banool banool force-pushed the banool/local-testnet-structure branch from 12597d6 to 46b2dc2 Compare October 1, 2023 22:36
@banool banool force-pushed the banool/local-testnet-postgres branch from 57b3c28 to 126f8d8 Compare October 1, 2023 22:36
@banool banool force-pushed the banool/local-testnet-structure branch from 46b2dc2 to 7a9ed32 Compare October 2, 2023 11:19
@banool banool force-pushed the banool/local-testnet-postgres branch from 126f8d8 to 7f3c143 Compare October 2, 2023 11:19
@banool banool marked this pull request as ready for review October 2, 2023 11:36
Copy link
Contributor

@davidiw davidiw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if you don't have docker installed?

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@banool banool force-pushed the banool/local-testnet-postgres branch 3 times, most recently from 6149aee to 8b5b6bf Compare October 6, 2023 14:15
@banool
Copy link
Contributor Author

banool commented Oct 6, 2023

Landing this is currently blocked: https://aptos-org.slack.com/archives/C03N83P7QUC/p1691767293160939. I need to update the version of serde we use but doing so breaks some kind of old codegen / format gen we have.

@banool
Copy link
Contributor Author

banool commented Oct 11, 2023

This change to Bollard means we can keep using the version of serde we're using now: fussybeaver/bollard#343, which helps us circumvent this issue: #10424.

@banool
Copy link
Contributor Author

banool commented Oct 11, 2023

I tested that the Docker stuff still works with this fork of Bollard, let's land.

@banool banool enabled auto-merge (rebase) October 11, 2023 17:28
@banool banool force-pushed the banool/local-testnet-postgres branch from 8b5b6bf to 3e429af Compare October 11, 2023 17:34
@banool banool changed the title [CLI] Add support for running postgres in a container to the local testnet [CLI] Add support for running postgres in a container to the local testnet + rest of PRs in stack Oct 11, 2023
@banool banool force-pushed the banool/local-testnet-postgres branch from bf7f740 to 608251b Compare October 11, 2023 18:56
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions
Copy link
Contributor

✅ Forge suite compat success on aptos-node-v1.6.2 ==> 608251b382935f3c09535f91d0473bdd366dfe40

Compatibility test results for aptos-node-v1.6.2 ==> 608251b382935f3c09535f91d0473bdd366dfe40 (PR)
1. Check liveness of validators at old version: aptos-node-v1.6.2
compatibility::simple-validator-upgrade::liveness-check : committed: 3048 txn/s, latency: 8145 ms, (p50: 8700 ms, p90: 10300 ms, p99: 10800 ms), latency samples: 155480
2. Upgrading first Validator to new version: 608251b382935f3c09535f91d0473bdd366dfe40
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1769 txn/s, latency: 16079 ms, (p50: 19000 ms, p90: 22100 ms, p99: 22600 ms), latency samples: 92000
3. Upgrading rest of first batch to new version: 608251b382935f3c09535f91d0473bdd366dfe40
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 1391 txn/s, submitted: 1402 txn/s, expired: 10 txn/s, latency: 16252 ms, (p50: 18600 ms, p90: 23600 ms, p99: 29500 ms), latency samples: 89058
4. upgrading second batch to new version: 608251b382935f3c09535f91d0473bdd366dfe40
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 3438 txn/s, latency: 8849 ms, (p50: 9900 ms, p90: 11700 ms, p99: 12500 ms), latency samples: 144420
5. check swarm health
Compatibility test for aptos-node-v1.6.2 ==> 608251b382935f3c09535f91d0473bdd366dfe40 passed
Test Ok

@github-actions
Copy link
Contributor

✅ Forge suite realistic_env_max_load success on 608251b382935f3c09535f91d0473bdd366dfe40

two traffics test: inner traffic : committed: 8200 txn/s, latency: 4787 ms, (p50: 4500 ms, p90: 5700 ms, p99: 10800 ms), latency samples: 3542800
two traffics test : committed: 100 txn/s, latency: 2222 ms, (p50: 2100 ms, p90: 2600 ms, p99: 8300 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.222, avg: 0.199", "QsPosToProposal: max: 0.181, avg: 0.149", "ConsensusProposalToOrdered: max: 0.619, avg: 0.578", "ConsensusOrderedToCommit: max: 0.527, avg: 0.499", "ConsensusProposalToCommit: max: 1.137, avg: 1.077"]
Max round gap was 1 [limit 4] at version 1260858. Max no progress secs was 4.366688 [limit 10] at version 1260858.
Test Ok

@banool banool merged commit 6b7072c into main Oct 11, 2023
46 checks passed
@banool banool deleted the banool/local-testnet-postgres branch October 11, 2023 19:29
@github-actions
Copy link
Contributor

✅ Forge suite framework_upgrade success on aptos-node-v1.5.1 ==> 608251b382935f3c09535f91d0473bdd366dfe40

Compatibility test results for aptos-node-v1.5.1 ==> 608251b382935f3c09535f91d0473bdd366dfe40 (PR)
Upgrade the nodes to version: 608251b382935f3c09535f91d0473bdd366dfe40
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 4321 txn/s, latency: 4824 ms, (p50: 4800 ms, p90: 7700 ms, p99: 8200 ms), latency samples: 242000
5. check swarm health
Compatibility test for aptos-node-v1.5.1 ==> 608251b382935f3c09535f91d0473bdd366dfe40 passed
Test Ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants