feat: add js-libp2p perf tests #244

Merged
merged 39 commits into from
Aug 25, 2023

Conversation

maschad
Member

@maschad maschad commented Jul 26, 2023

@mxinden
Member

mxinden commented Jul 28, 2023

Great to see @maschad. Thank you for the work. Let me know once this is ready for a review. Also, in case you don't have permissions to trigger the libp2p perf GitHub Action, I am happy to do it for you.

Member

@mxinden mxinden left a comment

Thank you for the work here. Once #249 is merged I can trigger the CI action.

perf/README.md (outdated, resolved)
perf/impl/js-libp2p/v0.45/Makefile (outdated, resolved)
perf/impl/js-libp2p/v0.45/Makefile (outdated, resolved)
perf/impl/js-libp2p/v0.45/Makefile (outdated, resolved)
perf/impl/js-libp2p/v0.45/Makefile (outdated, resolved)
perf/runner/terraform.tfstate (outdated, resolved)
perf/terraform.tfstate (outdated, resolved)
perf/terraform/configs/local/terraform.tf (outdated, resolved)
perf/runner/src/versions.ts (outdated, resolved)
@mxinden
Member

mxinden commented Aug 7, 2023

@maschad please ping me once this is ready for another review.

@maschad maschad marked this pull request as ready for review August 7, 2023 16:53
@maschad maschad requested a review from mxinden August 7, 2023 16:54
@mxinden
Member

mxinden commented Aug 10, 2023

I triggered the libp2p perf GitHub Action for the libp2p/test-plans perf/js-libp2p branch:

https://github.com/libp2p/test-plans/actions/runs/5818528858/job/15775124342

@maschad do I understand correctly that the branch on libp2p/test-plans was created by you? The origin of this pull request is maschad/test-plans. How about using libp2p/test-plans only by closing this pull request and replacing it with a new one off of libp2p/test-plans? That would avoid confusion.

@mxinden
Member

mxinden commented Aug 10, 2023

Test run (see link above) failed with:

=== Starting client js-libp2p/v0.45/tcp
bash: line 1: ./impl/js-libp2p/v0.45/perf: Permission denied

The perf file seems to not be marked as executable:

ls -la

total 16
drwxrwxr-x 2 mxinden mxinden 4096 Aug 10 10:43 .
drwxrwxr-x 3 mxinden mxinden 4096 Aug 10 10:43 ..
-rw-rw-r-- 1 mxinden mxinden  699 Aug 10 10:43 Makefile
-rw-rw-r-- 1 mxinden mxinden  876 Aug 10 10:43 perf

@maschad
Member Author

maschad commented Aug 10, 2023

> @maschad do I understand correctly that the branch on libp2p/test-plans was created by you? The origin of this pull request is maschad/test-plans. How about using libp2p/test-plans only by closing this pull request and replacing it with a new one off of libp2p/test-plans? That would avoid confusion.

Thanks for running the CI, @mxinden. I don't have write access to the repo, so the only branch I have been using is maschad/test-plans:perf/js-libp2p.

> The perf file seems to not be marked as executable:

Thanks for helping me resolve this. It was a file permissions issue, resolved via chmod +x.
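The failure and the fix can be reproduced in isolation. The following is a minimal sketch, not the change pushed to this branch: it uses a scratch directory and a placeholder `perf` file (both assumptions for illustration) to show how the missing execute bit is detected and set. The commented `git update-index --chmod=+x` line is one way to record the mode bit in a git checkout so that the fix survives a fresh clone on CI.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so nothing in a real checkout is touched.
workdir="$(mktemp -d)"
script="$workdir/perf"    # placeholder for ./impl/js-libp2p/v0.45/perf

printf '#!/usr/bin/env bash\necho "perf placeholder"\n' > "$script"
chmod 644 "$script"       # reproduce a mode without any execute bit

if [ ! -x "$script" ]; then
  echo "not executable yet: $script"
fi

chmod +x "$script"        # the fix mentioned in the thread
perf_output="$("$script")"
echo "$perf_output"

# In a git checkout the mode bit also has to be committed, for example:
#   git update-index --chmod=+x impl/js-libp2p/v0.45/perf
#   git commit -m "mark perf as executable"
```

Running `ls -la` again after the `chmod` should show the execute bits on `perf`.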

@mxinden
Member

mxinden commented Aug 11, 2023

> > The perf file seems to not be marked as executable:

> Thanks for helping me resolve this, was a file permissions issue, resolved via chmod +x

But you didn't push the change, right @maschad?

@maschad
Member Author

maschad commented Aug 11, 2023

> But you didn't push the change, right @maschad?

Correct, I'm debugging another issue and then will push the changes.

@maschad maschad marked this pull request as draft August 11, 2023 15:55
@maschad
Member Author

maschad commented Aug 14, 2023

@mxinden I've run this locally and generated these benchmark results. When you have a chance, could you please trigger a CI run? Thanks.

@maschad
Member Author

maschad commented Aug 14, 2023

Once the CI run is successful and js-libp2p protocol perf 1.1.0 is merged and published, I can follow up with a PR that makes builds faster by installing and running only the perf package (as opposed to installing and building the entire monorepo).

@p-shahi
Member

p-shahi commented Aug 21, 2023

@mxinden I saw you pushed some changes. Did you execute a perf run as well, and is it approved for merge now?

@mxinden
Member

mxinden commented Aug 21, 2023

The NodeJS installation issue is resolved with #267. My bad for the broken #266.

> I saw you pushed some changes

I reverted your changes to the NodeJS installation instructions. They are no longer needed.

NodeJS installation is now properly in place. See js-libp2p perf test being executed in https://github.com/libp2p/test-plans/actions/runs/5923420241/job/16060336976.


Unfortunately perf runs are failing consistently, namely on rust-libp2p/v0.52/tcp. See e.g. https://github.com/libp2p/test-plans/actions/runs/5923420241/job/16060336976.

The rust-libp2p server is unable to bind to the 4001 TCP port:

$ cat server.log 
Error: 

Caused by:
    0: Address already in use (os error 98)
    1: Address already in use (os error 98)

and thus the Rust client fails to connect:

[2023-08-21T09:17:05.031Z INFO  perf] start benchmark: custom
Error: Outgoing connection error to None: Transport([("/ip4/54.212.203.99/tcp/4001", Other(Custom { kind: Other, error: Transport(Right(Left(Right(Apply(ClientUpgrade(Custom { kind: Other, error: Failed })))))) }))])

The problem is that the previous js-libp2p NodeJS process is still running and bound to TCP port 4001:

$ sudo netstat -tulnep | grep 4001
tcp        0      0 0.0.0.0:4001            0.0.0.0:*               LISTEN      1000       96570      72416/node          

@maschad could it be that the js-libp2p server is not reacting to kill $PID?

@maschad
Member Author

maschad commented Aug 21, 2023

> @maschad could it be that the js-libp2p server is not reacting to kill $PID?

That's correct. The node process has a separate PID from the bash executable, and it was the bash script's PID being written to pidfile and subsequently being killed.
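A small sketch of the PID mismatch described here, under stated assumptions: `sleep` stands in for the node server, and the wrapper script, file names, and `CHILD_PID_FILE` variable are hypothetical rather than taken from the runner. It shows that killing the wrapper's PID leaves the child running.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
child_pid_file="$workdir/child.pid"

# Hypothetical wrapper: starts a long-running child and waits for it.
cat > "$workdir/wrapper.sh" <<'EOF'
#!/usr/bin/env bash
sleep 30 &                      # stand-in for the node server
echo "$!" > "$CHILD_PID_FILE"   # recorded only so the demo can inspect the child
wait
EOF

CHILD_PID_FILE="$child_pid_file" bash "$workdir/wrapper.sh" &
wrapper_pid=$!                  # the PID a runner would write to its pidfile

# Wait until the wrapper has recorded the child's PID.
until [ -s "$child_pid_file" ]; do sleep 0.1; done
child_pid="$(cat "$child_pid_file")"

kill "$wrapper_pid"             # terminates the bash wrapper only
wait "$wrapper_pid" 2>/dev/null || true

if kill -0 "$child_pid" 2>/dev/null; then
  child_survived=yes
else
  child_survived=no
fi
echo "wrapper=$wrapper_pid child=$child_pid child_survived=$child_survived"

kill "$child_pid" 2>/dev/null || true   # clean up the orphaned child
```

The two PIDs differ, and the child is still alive after the wrapper has been terminated, which matches a server that keeps a port bound after `kill $PID`.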

As an alternative, I have modified the runner to kill the process listening on port 4001, rather than writing the PID to a pidfile, reading it back, and then killing it. I have run this a few times on my machine successfully.

@p-shahi
Member

p-shahi commented Aug 22, 2023

Results of run: https://github.com/libp2p/test-plans/actions/runs/5932647404
benchmark results committed to perf/js-libp2p branch: 84295bc

Kills the node child process when the `perf` script is killed. Does not depend
on specific ports. Keeps the idiosyncrasies of the script local to the script.
@p-shahi
Member

p-shahi commented Aug 22, 2023

Is this ready for merge?

> > @maschad could it be that the js-libp2p server is not reacting to kill $PID?

> That's correct. The node process has a separate PID from the bash executable, and it was the bash script's PID being written to pidfile and subsequently being killed.

> I have modified the runner to kill the process running on port 4001 as an alternative, rather than writing the PID to a pidfile, then reading that PID and then killing it. I have run this a few times on my machine successfully.

Drive-by comment here: why not get the PID of the node process within the perf shell script itself? That seems better than assuming we have to kill the process listening on port 4001, which could change in the future.

@mxinden
Member

mxinden commented Aug 22, 2023

I suggest containing the cleanup of the NodeJS process within the perf bash script. I implemented this in 6d4c137. It is currently running on https://github.com/libp2p/test-plans/actions/runs/5939464620/job/16105895040. Unless @maschad you object I will push that commit here on green.
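One way to contain the cleanup inside the script is a shell `trap`. The sketch below is an assumption about the general shape of such a fix, not the contents of 6d4c137: `sleep` stands in for the node process, and the script name and `CHILD_PID_FILE` variable are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
child_pid_file="$workdir/child.pid"

# Hypothetical perf-style wrapper that cleans up its own child.
cat > "$workdir/perf.sh" <<'EOF'
#!/usr/bin/env bash
sleep 30 &                          # stand-in for the node server
child=$!

cleanup() {
  kill "$child" 2>/dev/null || true # stop the child if it is still running
  wait "$child" 2>/dev/null || true # reap it so no zombie is left behind
}
trap cleanup EXIT                   # runs on any exit of this script
trap 'exit 143' TERM                # turn SIGTERM into a regular exit
trap 'exit 130' INT                 # same for Ctrl-C

echo "$child" > "$CHILD_PID_FILE"   # recorded only so the demo can inspect the child
wait "$child"
EOF

CHILD_PID_FILE="$child_pid_file" bash "$workdir/perf.sh" &
wrapper_pid=$!

# Wait until the wrapper has recorded the child's PID.
until [ -s "$child_pid_file" ]; do sleep 0.1; done
child_pid="$(cat "$child_pid_file")"

kill "$wrapper_pid"                 # the runner only knows the wrapper's PID
wait "$wrapper_pid" 2>/dev/null || true

if kill -0 "$child_pid" 2>/dev/null; then
  child_alive=yes
else
  child_alive=no
fi
echo "child_alive=$child_alive"
```

Because the cleanup lives in the script, the runner can keep killing the PID it recorded and does not need to know which port the server listens on.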

@maschad
Member Author

maschad commented Aug 22, 2023

> drive by comment here: why not get the PID of node process within the perf shell script itself? seems better than assuming that we have to kill the process listening on port 4001 - that could change in the future?

That was my original course of action, but it's not an assumption: the port is set by the nohup command that runs the server, and it's unlikely that would change.

It's more efficient to kill the process listening on the port, since that was predetermined to be the server and not another process, as opposed to writing the PID to a file, reading that file, and then deleting it.

> Unless @maschad you object I will push that commit here on green.

I don't necessarily object but I would like to understand the preference for doing it that way.

@p-shahi p-shahi mentioned this pull request Aug 22, 2023
26 tasks
@maschad
Member Author

maschad commented Aug 22, 2023

> I suggest containing the cleanup of the NodeJS process within the perf bash script. I implemented this in 6d4c137. It is currently running on libp2p/test-plans/actions/runs/5939464620/job/16105895040. Unless @maschad you object I will push that commit here on green.

I spoke to @p-shahi on the maintainers call on 22-08-23 and he explained why that's his preference. I don't object to it, so please feel free to go ahead and push that commit so that this can be merged.

@mxinden
Member

mxinden commented Aug 23, 2023

@maschad
Member Author

maschad commented Aug 23, 2023

Latest perf run: libp2p/test-plans/actions/runs/5949260069/job/16134709657

Thanks @mxinden looks good. This is ready for merge from my end.

@mxinden
Member

mxinden commented Aug 23, 2023

@mxinden
Member

mxinden commented Aug 23, 2023

To view the test results on the dashboard:

https://observablehq.com/@libp2p-workspace/performance-dashboard?branch=perf%2Fjs-libp2p#branch

Latency median of 1s seems off. Is this expected @maschad? Maybe a unit conversion error?

@maschad
Member Author

maschad commented Aug 23, 2023

> Latency median of 1s seems off. Is this expected @maschad? Maybe a unit conversion error?

This is what I have observed in my test runs. On average I have seen latencies of 1.03 seconds when running the test-plans perf benchmark on my machine, and I think the issue may be related to known performance issues with our yamux implementation.

I don't think it's a conversion error, given that the conversion from milliseconds to seconds is correct.

I've run the perf package locally in two terminals over 100 iterations and averaged 0.02 seconds, but that of course excludes connection establishment latency.

@mxinden
Member

mxinden commented Aug 25, 2023

Adding @achingbrain and @MarcoPolo here for visibility.

Summary: we are measuring a median of 1s when establishing a js-libp2p TCP+Plaintext+Yamux connection, sending an empty request, and receiving an empty response. For comparison, the median latency between the machines is 60ms and an https (TCP+TLS) request takes a median of 180ms (3 round-trips).
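To put these figures side by side, the arithmetic below expresses each latency in network round-trips. It is only a back-of-the-envelope sketch using the medians quoted in this summary; the variable and function names are illustrative, and treating the 1s figure as a multiple of the 60ms latency is an assumption, not a measured breakdown.

```shell
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C   # keep '.' as the decimal separator in awk output

rtt_ms=60          # median latency between the machines
https_ms=180       # median https (TCP+TLS) request
js_libp2p_ms=1000  # median js-libp2p TCP+Plaintext+Yamux request

# Divide a latency in milliseconds by the round-trip time.
to_rtts() { awk -v t="$1" -v r="$rtt_ms" 'BEGIN { printf "%.1f", t / r }'; }

https_rtts="$(to_rtts "$https_ms")"
js_libp2p_rtts="$(to_rtts "$js_libp2p_ms")"

echo "https:     ${https_rtts} round-trips"
echo "js-libp2p: ${js_libp2p_rtts} round-trips"
```

Under that assumption the https request corresponds to 3.0 round-trips and the js-libp2p request to roughly 16.7.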

See dashboard below for higher resolution.

https://observablehq.com/@libp2p-workspace/performance-dashboard?branch=perf%2Fjs-libp2p#branch


Moving forward here anyway. In my eyes, measuring in itself already provides value.

Any related fixes can go into follow-up pull requests.

@mxinden mxinden merged commit c39bfb9 into libp2p:master Aug 25, 2023
mxinden added a commit that referenced this pull request Sep 1, 2023
This reverts commit c39bfb9.

Reverting due to #291. Can revert the revert once resolved.
mxinden added a commit that referenced this pull request Sep 1, 2023
This reverts commit c39bfb9.

Reverting due to #291. Can revert the revert once resolved.
maschad added a commit to maschad/test-plans that referenced this pull request Sep 6, 2023