Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI migration #534

Merged
merged 11 commits into from
Aug 3, 2023
Merged

CLI migration #534

merged 11 commits into from
Aug 3, 2023

Conversation

tegefaulkes
Copy link
Contributor

@tegefaulkes tegefaulkes commented Jul 20, 2023

Description

In this PR we need to take the CLI part of Polykey and move it into it's own package that depends on Polykey. This means moving the src/bin directory as well as the tests/bin and integration tests. Any CI jobs related to building and testing the built executables need to be moved along with this.

Any package dependencies relating directly to CLI need to be moved as well.

This needs to be rebased on top of #525 consistently.

Issues Fixed

Tasks

  1. Move all CLI and Bin related code into a new repo and package called Polykey-CLI.
    • a. New repo needs to be created, use the ts-demo-lib as a base
    • b. All CLI code needs to be transplanted.
    • c. All Bin related tests need to be transplanted.
    • d. All CLI and integration CI jobs need to be transplanted.
  2. Remove all CLI and bin related code from this Repo.
    • b. All CLI code needs to be removed.
    • c. All Bin related tests need to be removed.
    • d. All CLI and integration CI jobs need to be removed.

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@tegefaulkes tegefaulkes self-assigned this Jul 20, 2023
@ghost
Copy link

ghost commented Jul 20, 2023

👇 Click on the image for a new way to code review

Review these changes using an interactive CodeSee Map

Legend

CodeSee Map legend

@tegefaulkes
Copy link
Contributor Author

Looks like there's a Polykey-CLI repo that already exists at https://github.com/MatrixAI/Polykey-CLI. I'll just work from there.

@CMCDragonkai
Copy link
Member

CI deployment of testnet should be migrated to Polykey-CLI as well.

Mainly because the CLI is what is used as an application to deploy.

So we cannot actually deploy anything from here. That's the app to this library.

@CMCDragonkai CMCDragonkai self-requested a review August 1, 2023 06:09
@CMCDragonkai
Copy link
Member

Make sure that this PR fixes the staging CI/CD too, so that staging CI should be passing when this is merged. @tegefaulkes

@CMCDragonkai CMCDragonkai mentioned this pull request Aug 1, 2023
2 tasks
@tegefaulkes
Copy link
Contributor Author

Some tests are just going to fail until stage 2 agent migration is done. I'll have to skip these.

@tegefaulkes
Copy link
Contributor Author

It may be useful to add a new release job publish an pre-release version on NPN and only check build and linting. We may want to do a pre-release without running through all the tests.

This will be useful for testing the Polykey-CLI CI jobs without a fully working version of Polykey.

@tegefaulkes
Copy link
Contributor Author

Oh, I think this is already a thing?

@CMCDragonkai
Copy link
Member

It may be useful to add a new release job publish an pre-release version on NPN and only check build and linting. We may want to do a pre-release without running through all the tests.

This will be useful for testing the Polykey-CLI CI jobs without a fully working version of Polykey.

Not sure what you are talking about here? There's already prereleases.

@tegefaulkes
Copy link
Contributor Author

The following tests have been disabled for now, pending work and fixes in agent migration stage 2. I'll add a comment in that PR as well.

  • validation/index.test.ts:372 'manipulate this when using function expressions'
  • vaults/VaultInternal.test.ts:187 'can change commits and preserve the log with no intermediate vault mutation'
  • vaults/VaultInternal.test.ts:236 'can change to the latest commit'
  • vaults/VaultInternal.test.ts:427 'concurrent read operations allowed'
  • vaults/VaultManager.test.ts:473 'with remote agents'

@tegefaulkes tegefaulkes mentioned this pull request Aug 2, 2023
11 tasks
@tegefaulkes
Copy link
Contributor Author

CI is failing on client domain tests. It's core dumping with Reached heap limit Allocation failed - JavaScript heap out of memory.

<--- Last few GCs --->
[346:0x3928ac0]   194494 ms: Mark-sweep 1884.5 (2091.2) -> 1879.1 (2092.5) MB, 1051.8 / 0.2 ms  (average mu = 0.338, current mu = 0.224) allocation failure; scavenge might not succeed
[346:0x3928ac0]   195667 ms: Mark-sweep 1891.0 (2095.0) -> 1880.5 (2097.2) MB, 1055.4 / 0.1 ms  (average mu = 0.229, current mu = 0.100) allocation failure; scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory
 1: 0xab0218 node::Abort() [node]
 2: 0x99f518  [node]
 3: 0xcc91b0 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xcc957b v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xeaf5c5  [node]
 6: 0xec3549 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
 7: 0xea016a v8::internal::HeapAllocator::AllocateRawWithLightRetrySlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 8: 0xea1514 v8::internal::HeapAllocator::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
 9: 0xe80e2b v8::internal::Factory::AllocateRaw(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [node]
10: 0xe79764 v8::internal::FactoryBase<v8::internal::Factory>::NewRawTwoByteString(int, v8::internal::AllocationType) [node]
11: 0x1174095 v8::internal::String::SlowFlatten(v8::internal::Isolate*, v8::internal::Handle<v8::internal::ConsString>, v8::internal::AllocationType) [node]
12: 0x101205b v8::internal::CompilationCacheTable::LookupScript(v8::internal::Handle<v8::internal::CompilationCacheTable>, v8::internal::Handle<v8::internal::String>, v8::internal::LanguageMode, v8::internal::Isolate*) [node]
13: 0xd7564a v8::internal::CompilationCacheScript::Lookup(v8::internal::Handle<v8::internal::String>, v8::internal::ScriptDetails const&, v8::internal::LanguageMode) [node]
14: 0xd835e0  [node]
15: 0xd846aa v8::internal::Compiler::GetSharedFunctionInfoForScript(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::ScriptDetails const&, v8::ScriptCompiler::CompileOptions, v8::ScriptCompiler::NoCacheReason, v8::internal::NativesFlag) [node]
16: 0xce02f6 v8::ScriptCompiler::CompileUnboundInternal(v8::Isolate*, v8::ScriptCompiler::Source*, v8::ScriptCompiler::CompileOptions, v8::ScriptCompiler::NoCacheReason) [node]
17: 0xa9fbdc node::contextify::ContextifyScript::New(v8::FunctionCallbackInfo<v8::Value> const&) [node]
18: 0xd2c51e v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [node]
19: 0xd2d0f7  [node]
20: 0xd2d3f4 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
21: 0x16bf4f9  [node]
/tmp/nix-shell-19-0/rc: line 3:   335 Aborted                 (core dumped) npm test -- --ci --coverage tests/client/handlers/agentLockAll.test.ts tests/client/handlers/agentStatus.test.ts tests/client/handlers/agentStop.test.ts tests/client/handlers/agentUnlock.test.ts tests/client/handlers/gestaltsActionsSetUnsetGetByIdentity.test.ts tests/client/handlers/gestaltsActionsSetUnsetGetByNode.test.ts tests/client/handlers/gestaltsDiscoveryByIdentity.test.ts tests/client/handlers/gestaltsDiscoveryByNode.test.ts tests/client/handlers/gestaltsGestaltGetByIdentity.test.ts tests/client/handlers/gestaltsGestaltGetByNode.test.ts tests/client/handlers/gestaltsGestaltList.test.ts tests/client/handlers/gestaltsGestaltTrustByIdentity.test.ts tests/client/handlers/gestaltsGestaltTrustByNode.test.ts tests/client/handlers/identitiesAuthenticatedGet.test.ts tests/client/handlers/identitiesAuthenticate.test.ts tests/client/handlers/identitiesClaim.test.ts tests/client/handlers/identitiesInfoConnectedGet.test.ts tests/client/handlers/identitiesInfoGet.test.ts tests/client/handlers/identitiesInvite.test.ts tests/client/handlers/identitiesProvidersList.test.ts tests/client/handlers/identitiesTokenPutDeleteGet.test.ts tests/client/handlers/keysCertsChainGet.test.ts tests/client/handlers/keysCertsGet.test.ts tests/client/handlers/keysEncryptDecrypt.test.ts tests/client/handlers/keysKeyPairRenew.test.ts tests/client/handlers/keysKeyPairReset.test.ts tests/client/handlers/keysKeyPair.test.ts tests/client/handlers/keysPasswordChange.test.ts tests/client/handlers/keysPublicKey.test.ts tests/client/handlers/keysSignVerify.test.ts tests/client/handlers/nodesAdd.test.ts tests/client/handlers/nodesClaim.test.ts tests/client/handlers/nodesFind.test.ts tests/client/handlers/nodesPing.test.ts tests/client/handlers/notificationsClear.test.ts tests/client/handlers/notificationsRead.test.ts tests/client/handlers/notificationsSend.test.ts tests/client/handlers/vaultsClone.test.ts tests/client/handlers/vaultsCreateDeleteList.test.ts tests/client/handlers/vaultsLog.test.ts tests/client/handlers/vaultsPermissionSetUnsetGet.test.ts tests/client/handlers/vaultsPull.test.ts tests/client/handlers/vaultsRename.test.ts tests/client/handlers/vaultsScan.test.ts tests/client/handlers/vaultsSecretsEdit.test.ts tests/client/handlers/vaultsSecretsMkdir.test.ts tests/client/handlers/vaultsSecretsNewDeleteGet.test.ts tests/client/handlers/vaultsSecretsNewDirList.test.ts tests/client/handlers/vaultsSecretsRename.test.ts tests/client/handlers/vaultsSecretsStat.test.ts tests/client/handlers/vaultsVersion.test.ts

I ran the tests locally for that domain with --logHeapUsage and got the following results

121 MB | PASS  handlers/agentStop.test.ts (6.491 s, 121 MB heap size)
107 MB | PASS  handlers/nodesAdd.test.ts (6.773 s, 107 MB heap size)
94  MB | PASS  handlers/vaultsVersion.test.ts (9.188 s, 94 MB heap size)
175 MB | PASS  handlers/identitiesInfoConnectedGet.test.ts (175 MB heap size)
202 MB | PASS  handlers/keysKeyPairRenew.test.ts (202 MB heap size)
157 MB | PASS  handlers/notificationsSend.test.ts (157 MB heap size)
186 MB | PASS  handlers/notificationsRead.test.ts (14.466 s, 186 MB heap size)
181 MB | PASS  handlers/keysKeyPairReset.test.ts (181 MB heap size)
239 MB | PASS  handlers/nodesClaim.test.ts (239 MB heap size)
183 MB | PASS  handlers/gestaltsGestaltTrustByIdentity.test.ts (15.664 s, 183 MB heap size)
247 MB | PASS  handlers/agentStatus.test.ts (247 MB heap size)
236 MB | PASS  handlers/gestaltsGestaltTrustByNode.test.ts (16.37 s, 236 MB heap size)
145 MB | PASS  handlers/vaultsLog.test.ts (17.971 s, 145 MB heap size)
226 MB | PASS  handlers/identitiesInfoGet.test.ts (226 MB heap size)
115 MB | PASS  handlers/vaultsSecretsEdit.test.ts (115 MB heap size)
245 MB | PASS  handlers/vaultsSecretsNewDeleteGet.test.ts (245 MB heap size)
190 MB | PASS  handlers/vaultsSecretsRename.test.ts (190 MB heap size)
142 MB | PASS  handlers/gestaltsDiscoveryByIdentity.test.ts (142 MB heap size)
184 MB | PASS  handlers/vaultsSecretsNewDirList.test.ts (184 MB heap size)
206 MB | PASS  handlers/gestaltsDiscoveryByNode.test.ts (206 MB heap size)
290 MB | PASS  handlers/vaultsSecretsMkdir.test.ts (290 MB heap size)
143 MB | PASS  handlers/vaultsCreateDeleteList.test.ts (143 MB heap size)
169 MB | PASS  handlers/vaultsSecretsStat.test.ts (169 MB heap size)
192 MB | PASS  handlers/vaultsPermissionSetUnsetGet.test.ts (192 MB heap size)
230 MB | PASS  handlers/identitiesAuthenticatedGet.test.ts (230 MB heap size)
265 MB | PASS  handlers/keysKeyPair.test.ts (265 MB heap size)
265 MB | PASS  authenticationMiddleware.test.ts (265 MB heap size)
332 MB | PASS  handlers/nodesPing.test.ts (332 MB heap size)
196 MB | PASS  handlers/identitiesClaim.test.ts (196 MB heap size)
233 MB | PASS  handlers/nodesFind.test.ts (233 MB heap size)
208 MB | PASS  handlers/identitiesAuthenticate.test.ts (208 MB heap size)
133 MB | PASS  handlers/vaultsRename.test.ts (133 MB heap size)
307 MB | PASS  handlers/identitiesProvidersList.test.ts (307 MB heap size)
291 MB | PASS  timeoutMiddleware.test.ts (291 MB heap size)
367 MB | PASS  handlers/agentUnlock.test.ts (367 MB heap size)
226 MB | PASS  handlers/keysCertsChainGet.test.ts (226 MB heap size)
248 MB | PASS  handlers/identitiesTokenPutDeleteGet.test.ts (248 MB heap size)
267 MB | PASS  handlers/identitiesInvite.test.ts (267 MB heap size)
172 MB | PASS  handlers/gestaltsGestaltList.test.ts (172 MB heap size)
325 MB | PASS  handlers/agentLockAll.test.ts (325 MB heap size)
336 MB | PASS  handlers/keysCertsGet.test.ts (336 MB heap size)
399 MB | PASS  handlers/gestaltsActionsSetUnsetGetByNode.test.ts (399 MB heap size)
280 MB | PASS  handlers/keysEncryptDecrypt.test.ts (280 MB heap size)
302 MB | PASS  handlers/gestaltsGestaltGetByIdentity.test.ts (302 MB heap size)
257 MB | PASS  handlers/gestaltsActionsSetUnsetGetByIdentity.test.ts (257 MB heap size)
207 MB | PASS  handlers/gestaltsGestaltGetByNode.test.ts (207 MB heap size)
335 MB | PASS  handlers/vaultsClone.test.ts (335 MB heap size)
357 MB | PASS  handlers/keysPasswordChange.test.ts (357 MB heap size)
290 MB | PASS  handlers/vaultsScan.test.ts (290 MB heap size)
311 MB | PASS  handlers/vaultsPull.test.ts (311 MB heap size)
372 MB | PASS  handlers/keysSignVerify.test.ts (372 MB heap size)
431 MB | PASS  handlers/keysPublicKey.test.ts (431 MB heap size)

// compared to the following domains

// websockets
PASS  tests/websockets/WebSocket.test.ts (31.137 s, 126 MB heap size)

// RPC
 PASS  tests/rpc/utils/utils.test.ts (77 MB heap size)
 PASS  tests/rpc/RPC.test.ts (6.853 s, 148 MB heap size)
 PASS  tests/rpc/RPCClient.test.ts (8.891 s, 110 MB heap size)
 PASS  tests/rpc/RPCServer.test.ts (9.071 s, 87 MB heap size)
 PASS  tests/rpc/utils/middleware.test.ts (9.367 s, 98 MB heap size)

// keys
PASS  tests/keys/KeyRing.test.ts (53 MB heap size)
PASS  tests/keys/utils/jwk.test.ts (70 MB heap size)
PASS  tests/keys/utils/generate.test.ts (48 MB heap size)
PASS  tests/keys/utils/webcrypto.test.ts (68 MB heap size)
PASS  tests/keys/utils/asymmetric.test.ts (77 MB heap size)
PASS  tests/keys/utils/recoveryCode.test.ts (93 MB heap size)
PASS  tests/keys/utils/random.test.ts (71 MB heap size)
PASS  tests/keys/utils/pem.test.ts (82 MB heap size)
PASS  tests/keys/utils/hash.test.ts (5.249 s, 73 MB heap size)
PASS  tests/keys/utils/symmetric.test.ts (7.547 s, 77 MB heap size)
PASS  tests/keys/utils/x509.test.ts (11.999 s, 71 MB heap size)
PASS  tests/keys/CertManager.test.ts (40.556 s, 125 MB heap size)

We can see that the client RPC is using a lot more memory compared to the others. I'll need to dig into why this is happening.

I also ran the tests with --detectLeaks. Running keys domain with this resulted in no problems. Running client domain with this resulted in all but 3 tests failing. So it's likely a memory leak here. But the threashold for a leak here is pretty small. It could just be tests not fully cleaning up before completing.

  ● Test suite failed to run

    EXPERIMENTAL FEATURE!
    Your test suite is leaking memory. Please ensure all references are cleaned.

    There is a number of things that can leak memory:
      - Async operations that have not finished (e.g. fs.readFile).
      - Timers not properly mocked (e.g. setInterval, setTimeout).
      - Keeping references to the global scope.

      at onResult (node_modules/@jest/core/build/TestScheduler.js:188:18)
      at node_modules/@jest/core/build/TestScheduler.js:300:17
      at node_modules/emittery/index.js:311:13
          at Array.map (<anonymous>)
      at Emittery.emit (node_modules/emittery/index.js:309:23)

Related to this is one of the notifications test is holding the process open.

@tegefaulkes
Copy link
Contributor Author

Eyeballing htop shows that vaults and keys domain tests use up about 1GB-2GB, while client domain is using 4GB-5GB.

@tegefaulkes
Copy link
Contributor Author

The open handle is coming from js-quic there are also reported leaks in QUICClient and QUICStream tests.

@tegefaulkes
Copy link
Contributor Author

Note, using { timer: new Timer({ delay: 500 }) }, for the ctx will result in a 'leak' since timer is not cleaned up or resolved before the test ends. Not technically a problem but will by caught by jests's --detectLeaks.

@CMCDragonkai
Copy link
Member

The receiver of the context is suppose to clean it up. This is something that's handled by js-contexts. If you just create a Timer with nothing receiving it, then yes it will leak. Which means you should use js-contexts in these scenarios or make sure to clean up manually.

@CMCDragonkai
Copy link
Member

Remember whoever created the object is supposed to destroy it. So whoever created the timer needs to clear the timer.

@CMCDragonkai
Copy link
Member

Js-quic handle is just dlopen and that wouldn't be a memory leaks.

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Aug 3, 2023

Ok, from what I can tell, then the tests are run in band with --logHeapUsage we see a steady increase in memory usage to about 1.2GB for the tests/client tests. I think part of the problem here is that we run multiple files, each with their own environment. things that are normally a singleton in the global scope are duplicated between each file taking up extra memory. Across 52 tests this takes up a fair amount of memory.

I think there are 3 things we can do to help.

  1. Combine tests into larger files so global state is better shared. We can do this by domain and keep things reasonably clustered,
  2. Modify the test script to run jest with --expose-gc I think this allows jest to better clean up between runs. I did see some improvement in heap usage, especially with tests combined within a file.
  3. Worst case we can split the tests/client domain into multiple CI jobs. I don't think this is needed on top of 1 and 2 but we'll see.

I'm going to move forward with 1. and 2.

@tegefaulkes
Copy link
Contributor Author

Applying .1 and .2 results in 1/4th the heap usage now, totalling 300MB. So this should resolve the problem.

@CMCDragonkai
Copy link
Member

I don't like the idea of combining test files. That just increases the amount of cross-over interference and bugs.

A better idea would be the reduce the concurrency... Memory should be garbage collected between test files. If they aren't that's a problem.

@tegefaulkes
Copy link
Contributor Author

How would we reduce concurrency? Everything already runs in a single band in CI.

As for gabage collection between test files, that's up to jest isn't it? Even using node --expose-gc ./node_modules/.bin/jest so jest can trigger gc doesn't fully fix the problem.

I'll make a new issue for further investigation into the memory issue.

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Aug 3, 2023

New issue created at #542

@CMCDragonkai
Copy link
Member

Jest is just running nodejs processes. Nodejs should be GCing whatever is no longer necessary. It's unlikely jest is deliberately keeping memory around for no reason. It's far more likely that a memory leak is caused by introduction of the new code, and incorrect resource creation/destruction procedures during testing. It's a spiky system at scale.

@CMCDragonkai
Copy link
Member

Well the fact that it runs single band convinces me even more that memory leaks is not caused by anything underlying. It's the new code and the new tests.

@tegefaulkes
Copy link
Contributor Author

CI has passed.

@tegefaulkes tegefaulkes force-pushed the feature-cli_migration branch from 51bd7c4 to a4e8311 Compare August 3, 2023 06:49
@tegefaulkes
Copy link
Contributor Author

This can be merged when CI passes again.

README.md Show resolved Hide resolved
@CMCDragonkai
Copy link
Member

Should we merge both at the same time? MatrixAI/Polykey-CLI#2

I'd like to make sure we don't lose anything in the extraction.

fc.jsonValue can generate valid json that won't be equal to itself after a serialize/parse. This is normal for json but was breaking the tests sometimes.
@tegefaulkes
Copy link
Contributor Author

I need a published version, preferably not an alpha one to test the CI in Polykey-CLI. I can easily compare diffs to double what was removed from Polykey to include in Polykey-CLI. I think right now that's just some ReadMe stuff I removed here.

@CMCDragonkai
Copy link
Member

You need a published version of Polykey itself right? So CI needs to pass for that to occur. The CI will do an auto-publish when it's all done.

@CMCDragonkai
Copy link
Member

The only thing necessary is to push up a pre-release tag. In fact I believe this is possible just even in the feature branch, and not the staging branch.

Is there anything else that needs to be done here?

Please ensure there's a list of tasks in the MatrixAI/Polykey-CLI#2 that tracks all that must be done there and preserved/factored out from here including README.

@tegefaulkes
Copy link
Contributor Author

Except for the pre-release tag, everything is done here. I'll do a tag really quick.

@tegefaulkes
Copy link
Contributor Author

The prepatch pipeline succeeded. It seems like it triggered a merge of staging to master. I'm not sure we want that to happen? considering it was initiated from a feature branch.

@tegefaulkes tegefaulkes marked this pull request as ready for review August 3, 2023 07:58
@tegefaulkes
Copy link
Contributor Author

I'll merge this now.

@tegefaulkes tegefaulkes merged commit 3d16e2b into staging Aug 3, 2023
@CMCDragonkai
Copy link
Member

Try doing prereleases from staging branch in the future.

@CMCDragonkai CMCDragonkai added the r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management label Aug 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management
Development

Successfully merging this pull request may close these issues.

3 participants