Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

polykey async-init fixes #281

Closed
18 tasks done
tegefaulkes opened this issue Oct 29, 2021 · 43 comments
Closed
18 tasks done

polykey async-init fixes #281

tegefaulkes opened this issue Oct 29, 2021 · 43 comments
Assignees
Labels
development Standard development procedure Action that must be executed r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy

Comments

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Oct 29, 2021

For addressing point 1. #194 (comment)

Problem regarding the usage of src/config.ts and PolykeyAgent.ts. The PK_INGRESS_PORT has to be propagated from env variables to the ReverseProxy.createReverseProxy().

However we noticed during the integration of js-async-init, some things weren't done exactly the way it's supposed to be.

In PolykeyAgent, the structure at the end of createPolykeyAgent should be:

    const pk = new Polykey({ ... });
    await pk.start({ ... });
    return pk;

The reason to have an asynchronous createX static constructor is to call the start and ensure that constructed instances are started already. If this wasn't intended, then you only needed the StartStop decorator, not the CreateDestroyStartStop decorator.

The fact that PolykeyAgent wasn't done this way, may mean that other classes that are using the CreateDestroyStartStop pattern are also doing it incorrectly.

Now because the createX static methods are meant to call the asynchronous start, this means start parameters/options must be available in the createX static methods. The ReverseProxy has these parameters:

  public async start({
    ingressHost = '0.0.0.0' as Host,
    ingressPort = 0 as Port,
    serverHost,
    serverPort,
    tlsConfig,
  }: {
    ingressHost?: Host;
    ingressPort?: Port;
    serverHost: Host;
    serverPort: Port;
    tlsConfig: TLSConfig;
  }): Promise<void> {

These need to be replicated to the createReverseProxy.

@tegefaulkes

Note the difference between constructor parameters and start parameters. In the recent work on EFS and js-db, it's better to put parameters into the constructor if it is expected that they will stick around as class properties. Only if the parameters are temporary to the asynchronous start, then you should have them in the start method. This means there are some changes required to the PK domains that haven't yet been done.

In ReverseProxy, there are some cases where start does require parameters, because for example ingressPort can be set to 0, and then it has be resolved subsequently to the actual port, and that's why it is in the start method.

Tasks

  1. Review all uses of CreateDestroyStartStop decorator and ensure that their createX functions are calling start
  2. Move any parameter that is meant to be stored as a class property and doesn't require asynchronous operations on them to the constructor instead of start.
  3. Propagate all remaining start parameters to the createX functions so that when calling create you can parameterise the asynchronous creation.
    • order of parameters: required constructor, required start, optional constructor, optional start
  4. Change to using CDSS when there is underlying state, destroy should destroy the underlying state, while stop doesn't, remember destroy is propagated from the top level
  5. CreateDestroyStartStop should be using 3 exceptions: ErrorXRunning, ErrorXDestroyed, and ErrorXRunning, refer to the parameter name of the decorators to see what to use
  6. Specify default values in parameters inside the function signature like { x = false }: { x: boolean; }, then for encapsulated dependencies use x = x ?? await X.createX() style. This should reduce the line noise.
  7. For CDSS pattern, make sure default values are on the createX but not the constructor. In some cases we will put defaults also on the constructor if we expect the class to be also be constructed without the asynchronous create like for `
  8. The info logging should be standardised for createX, destroy, start and stop. Always do this (even if the methods are noops you should still log, this is important to debug the matroska model):
    // in the createX
    logger.info(`Creating ${this.name}`);
    logger.info(`Created ${this.name}`);
    // for the other methods
    logger.info(`Starting ${this.constructor.name}`);
    logger.info(`Started ${this.constructor.name}`);
    // repeat for stop and destroy
  9. Child loggers should be propagated with class name logger.getChild(Class.name).
  10. Due to recursive dependency for PolykeyAgent instance in agentService, the GRPC related code may need to be SS pattern and not CDSS, check this and investigate. See polykey async-init fixes #281 (comment)
  11. Setting WorkerManager should be dynamic, see src/PolykeyAgent.ts.
  12. The difference between encapsulated dependencies and just external references is whether the dependency is optional. If optional, then it is an encapsulated dependency, if required, then it is an external reference. So you don't manage it's lifecycle, which means you don't call create or start.
  13. The createX should be the dual of destroy. So then the domain managed state should be "created" at createX. And destroy should destroy it. However start() should always be called at the end of createX so that way the proper starting operations can be done. However don't use the recursive option for utils.mkdirExists polykey async-init fixes #281 (comment), but do use recursive option for destroy and also force...
  14. Propagate the start, stop, and destroy into encapsulated dependencies but not required dependencies.
  15. Make sure to test all of this in accordance to the matroska/babushka doll.
  16. update GRPCClient to be inline with the async-init lib updates.
  17. Update relevent getters to use @ready decoration.
  18. Incorporate readiness testing to each one see tests/sessions/Session.ts.
@tegefaulkes
Copy link
Contributor Author

Review all uses of CreateDestroyStartStop decorator and ensure that their createX functions are calling start

  1. create a list of all the domains using CreateDestroyStartStop.
  2. determine if any domains are missing CreateDestroyStartStop pattern.
  3. make sure all domains on the list are calling start() inside the createX function.

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Oct 29, 2021

Referencing the diagram from https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/205#note_688426664.

list of domains using CreateDestroyStartStop.

  • FwdProxy
  • RevProxy
  • NodeManager
  • PolykeyAgent
  • PolykeyClient
  • GRPCServer
  • Session - not converted.
  • GRPCClient - not converted

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Oct 29, 2021

Checking all relevent domains are calling start when being created.

  • FwdProxy
  • RevProxy
  • NodeManager
  • PolykeyAgent
  • PolykeyClient
  • GRPCServer
  • Session
  • GRPCClient
    • GRPCClientAgent
    • GRPCClientClient

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Oct 29, 2021

To finish the last two items on the above list I need to convert them to using CreateDestroyStartStop first.

  • Session
  • GRPCClient This is an odd case. it was already converted. the interface was defined on this but the decoration with @CreateDestroyStartStop was done when it was extended for the client and the agent version of it.

@CMCDragonkai
Copy link
Member

It's also PolykeyAgent. It wasn't propagating the start and stop and create and destroy.

@tegefaulkes
Copy link
Contributor Author

I'm not sure we should have the networking modules start at creation. It means NodeMnanager can't be started before the services, GRPC and proxies are started. but also we're getting a dependency cycle where the clientService depends on PolykeyAgent, Polykeyagent depends on nodeManager and nodemanager depends on the service.
PolykeyAgent<-agentService<-grpcServer<-rev,fwdProxy<-NodeManager<-PolykeyAgent.

Since the networking modules are the most expensive to start up it makes sense that they and by extension PolykeyAgent should start independent of creation.

@CMCDragonkai
Copy link
Member

@tegefaulkes did you make any WIP commits on the MR: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213 or was there some unpushed work?

@CMCDragonkai
Copy link
Member

The reason clientService depends on PolykeyAgent is due to all the agent-related commands right? Lke start and stop?

There may be a way to abstract this the there's circularity here.

@CMCDragonkai
Copy link
Member

The CreateDestroy that is applied on KeyManager is also problematic.

The CreateDestroy should not have a create method. Only a createX static method should exist.

See the tests that use CreateDestroy, there's no mention of create method.

@CMCDragonkai
Copy link
Member

See this: https://gist.github.com/CMCDragonkai/1dbf5069d9efc11585c27cc774271584. The gist explains the different uses of CreateDestroy.

@joshuakarp
Copy link
Contributor

Moving the async-init fixes todo list from the top-level description of #194. Most of these points are already in the task list in this issue, but they're a little more fleshed out here:

  1. Fix usage of the CreateDestroyStartStop decorator, as per Testnet Node Deployment (testnet.polykey.io) #194 (comment) (please read through this comment carefully). This will involve:

    • - Review all uses of CreateDestroyStartStop decorator and ensure that their createX functions are calling start
      For example:
      public static async createPolykeyAgent( ... ) {
        ...
        const pk = new Polykey({ ... });
        await pk.start({ ... });
        return pk;
      }

    That is, the assumption is that a class using the CreateDestroyStartStop pattern must return a started instance after the call to createX. If this wasn't intended, then you only needed the StartStop decorator, not the CreateDestroyStartStop decorator.

    • - Move any parameter that is meant to be stored as a class property and doesn't require asynchronous operations on them to the constructor instead of start.
    • - Propagate all remaining start parameters to the createX functions so that when calling create you can parameterise the asynchronous creation.
      For example, the ReverseProxy should be something like:
    static async createReverseProxy({
      ingressHost = '0.0.0.0' as Host,
      ingressPort = 0 as Port,
      serverHost,
      serverPort,
      tlsConfig,
      connConnectTime = 20000,
      connTimeoutTime = 20000,
      logger,
    }: {
      connConnectTime?: number;
      connTimeoutTime?: number;
      logger?: Logger;
    }): Promise<ReverseProxy> {
      const logger_ = logger ?? new Logger('ReverseProxy');
      ...
      const revProxy = new ReverseProxy( ... );
      await revProxy.start( ... );
      return revProxy;
    }
    
    constructor({
      connConnectTime,
      connTimeoutTime,
      logger,
    }: {
      connConnectTime: number;
      connTimeoutTime: number;
      logger: Logger;
    }) {
      logger.info('Creating Reverse Proxy');
      this.logger = logger;
      this.connConnectTime = connConnectTime;
      this.connTimeoutTime = connTimeoutTime;
      this.logger.info('Created Reverse Proxy');
    }
    
    public async start({
      ingressHost = '0.0.0.0' as Host,
      ingressPort = 0 as Port,
      serverHost,
      serverPort,
      tlsConfig,
    }: {
      ingressHost?: Host;
      ingressPort?: Port;
      serverHost: Host;
      serverPort: Port;
      tlsConfig: TLSConfig;
    })
    • - Ensure that createX is calling start on any encapsulated dependencies too. Ensure that their lifecycles are also managed in stop and destroy (i.e. call their respective stop/destroy functions here).
      • Note the distinction between an encapsulated dependency and an external reference
      • An encapsulated dependency is any optional parameter in createX/start (because if they aren't passed, we have to create them ourselves). The class is expected to manage the lifecycle of these dependencies.
      • An external reference is any required parameter. Therefore, no management of lifecycle is required (it's managed elsewhere - do not call start/stop/create/destroy on these references).
    • - Incorporate these changes into testing
    • - Create sequence diagram in accordance with the matroshka/babushka doll (i.e. clearly showing how and when everything is created and where they're injected in the lifecycle of Polykey)

@CMCDragonkai
Copy link
Member

The current MR https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213 shows how to use CreateDestroyStartStop for src/sessions/Session.ts.

Also in our meeting today we realised that we needed to make the networking related domains StartStop instead of CreateDestroyStartStop because of a loop occuring where the PolykeyAgent instance has to be accessed by the clientService.ts in order to stop/destroy the agent instance.

This can only be achieved by passing this from the start method of PolykeyAgent as the instance is not yet available when you call createX static methods of the dependencies of PolykeyAgent.

For classes using StartStop, they don't have a createX method, instead relying on the upstream start methods to propagate start calls.

This may apply to ForwardProxy, ReverseProxy, GRPCServer and more. This requires some prototyping to check what the case is.

@CMCDragonkai CMCDragonkai added the development Standard development label Nov 1, 2021
@CMCDragonkai
Copy link
Member

@CMCDragonkai
Copy link
Member

From this comment: https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213#note_721681210

Regarding CreateDestroyStartStop. For both Session and SessionManager you need:

  • at least StartStop because you want to be able to stop without destroying underlying state
  • at least CreateDestroy becuase you want to be able to destroy the underlying state

THEREFORE you need both!

It would indicate that a number of places using CreateDestroy should be using CreateDestroyStartStop as long as the underlying state is something we may want to destroy. For Session and SessionManager this may be the case, but for things like notifications, acl, having a separate destroy from stop can be a convenience.

@CMCDragonkai
Copy link
Member

I noticed this call:

    await utils.mkdirExists(fs, nodePath);

In the PolykeyAgent.createPolykeyAgent.

Now the problem is that usually I've been creating the directories in the start.

But with asynchronous creation, this is important if the directory being created is needed by the dependencies.

So the start is called at the very end, this is something that is relevant.

On the otherhand, if each dependency won't really need the state until they themselves are also calling start, then it should work to only keep the directory creation in the start.

This could mean that side effects don't actually occur until you call start. It's just that the createX methods call start at the very end which propagates all the way down.

I'm trying this out in https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213

@CMCDragonkai
Copy link
Member

Note that I removed the recursive option since they should only make the directory assuming the parent directory already exists. This is a good safe option.

@CMCDragonkai
Copy link
Member

  • KeyManager - should be CreateDestroyStartStop because the underlying keys state can be destroyed separately from being stopped

@CMCDragonkai
Copy link
Member

One of the things I had to do is figure out the order of parameters for propagating start to createX.

So now I've done:

  • required constructor parameters
  • required start parameters
  • optional constructor parameters
  • optional start parameters

Usually there should be no required start parameters...

@CMCDragonkai CMCDragonkai added the procedure Action that must be executed label Nov 9, 2021
@CMCDragonkai
Copy link
Member

@tegefaulkes I've updated the task list regarding the async fixes.

@CMCDragonkai
Copy link
Member

I noticed this call:

    await utils.mkdirExists(fs, nodePath);

In the PolykeyAgent.createPolykeyAgent.

Now the problem is that usually I've been creating the directories in the start.

But with asynchronous creation, this is important if the directory being created is needed by the dependencies.

So the start is called at the very end, this is something that is relevant.

On the otherhand, if each dependency won't really need the state until they themselves are also calling start, then it should work to only keep the directory creation in the start.

This could mean that side effects don't actually occur until you call start. It's just that the createX methods call start at the very end which propagates all the way down.

I'm trying this out in https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213

I changed my mind on this, this should actually be in the createX. Because dependency starts may require it.

@tegefaulkes
Copy link
Contributor Author

Fixed up the networking depencency cycle and cleaned up PolykeyAgent.ts

@tegefaulkes
Copy link
Contributor Author

tegefaulkes commented Nov 9, 2021

Checking Review all uses of CreateDestroyStartStop decorator and ensure that their createX functions are calling start
domains:

  • polykeyAgent
  • PolykeyClient
  • ACL
  • GRPCClientAgent
  • GRPCClientClient
  • GRPCClient
  • NodeConnection
  • NodeManager
  • Schema
  • Session
  • SessionManager

@CMCDragonkai
Copy link
Member

Does the networking dependency cycle involve GRPCClient? It's currently still CDSS?

@CMCDragonkai
Copy link
Member

Seems like it doesn't. The GRPCServer is now StartStop.

But it seems like GRPCClient should be CreateDestroy instead because there's no relevant destruction here.

@CMCDragonkai
Copy link
Member

Hmm the solution to the async init problem on the abstract class is simply not to have the decorator on the abstract class. Instead decorator on GRPCClientClient and GRPCClientAgent and that should be enough. Plus it's indeed impossible to decorate abstract constructors. I tried it out.

@CMCDragonkai
Copy link
Member

Actually to fix the problem of using async init for the GRPCClient we must follow this: MatrixAI/js-async-init#1 (comment)

Basically the GRPCClient should not be wrapped in a decorator, only the child classes should be, or only the parent class.

Furthermore, the GRPCClient or derived should only be using CreateDestroy not CreateDestroyStartStop, and not StartStop. Only GRPCServer needs the StartStop treatment.

@tegefaulkes
Copy link
Contributor Author

Do we really need readiness testing for all of the domains? since its provided via the asyc-ini library we should assume it works? is there a need to replicate the test for each domain?

@CMCDragonkai
Copy link
Member

Just a copy of this test that exists in Session.test.ts:

  test('session readiness', async () => {
    const session = await Session.createSession({
      sessionTokenPath: path.join(dataDir, 'token'),
      logger
    });
    await expect(session.destroy()).rejects.toThrow(sessionErrors.ErrorSessionRunning);
    // should be a noop
    await session.start();
    await session.stop();
    await session.destroy();
    await expect(session.start()).rejects.toThrow(sessionErrors.ErrorSessionDestroyed);
    await expect(session.readToken()).rejects.toThrow(sessionErrors.ErrorSessionNotRunning);
    await expect(session.writeToken('abc' as SessionToken)).rejects.toThrow(sessionErrors.ErrorSessionNotRunning);
  });

That's a CDSS test. CD should be even more simpler.

We can create a standard template and copy paste it for each one.

It's in order to ensure that the exceptions is being abided by. A quick sanity test for create destroy related work too.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Nov 11, 2021

And no you don't need to test every single method. Just like a one or 2 methods. If there's a class with MANY methods, then just 1 or 2 methods. The simple ones. This test is not attempting to test every single method.

Although if there's a limited set of actual "ready" methods, then it can be useful there. See how the expectation is that the method throws an exception.

@CMCDragonkai
Copy link
Member

@tegefaulkes I made a mistake on the logger default in static.

It should be this.name there, example:

  static async createSession({
    sessionTokenPath,
    fs = require('fs'),
    logger = new Logger(this.name),
    sessionToken,
    fresh = false,
  }:{

this.name is used in static, but this.constructor.name is used in instance methods.

@tegefaulkes
Copy link
Contributor Author

Fixed the logger defaults.

@CMCDragonkai
Copy link
Member

Awesome work @tegefaulkes!

Last thing to fix is the usage of these:


[nix-shell:~/Projects/js-polykey/src]$ ag 'new Logger\(this.constructor.name\)'
keys/KeyManager.ts
58:    logger = new Logger(this.constructor.name),

vaults/VaultManager.ts
76:    logger = new Logger(this.constructor.name),

vaults/VaultInternal.ts
41:    logger = new Logger(this.constructor.name),

gestalts/GestaltGraph.ts
53:    logger = new Logger(this.constructor.name),

acl/ACL.ts
26:    logger = new Logger(this.constructor.name),

notifications/NotificationsManager.ts
60:    logger = new Logger(this.constructor.name),

identities/providers/github/GitHubProvider.ts
38:    this.logger = logger ?? new Logger(this.constructor.name);

identities/IdentitiesManager.ts
35:    logger = new Logger(this.constructor.name),

nodes/NodeGraph.ts
50:    logger = new Logger(this.constructor.name),

nodes/NodeConnection.ts
62:    logger = new Logger(this.constructor.name),

nodes/NodeManager.ts
67:    logger = new Logger(this.constructor.name),

schema/Schema.ts
22:    logger = new Logger(this.constructor.name),

sigchain/Sigchain.ts
59:    logger = new Logger(this.constructor.name),

discovery/Discovery.ts
36:    logger = new Logger(this.constructor.name),

agent/GRPCClientAgent.ts
23:    logger = new Logger(this.constructor.name),

All of those should be this.name if they are in the static constructor.

@CMCDragonkai
Copy link
Member

Ah I see you've done it.

@CMCDragonkai
Copy link
Member

@CMCDragonkai
Copy link
Member

Something was missed, unnecessary start calls.

For example:

    db = await DB.createDB({
      dbPath,
      logger,
      crypto: makeCrypto(keyManager),
    });
    await db.start();

The db.start is no longer required, since it is expected that when you create it asynchronously, it is started.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Nov 15, 2021

Might need to make an exception regarding session where it is being used by GRPCClientClient. In this case we have a optional dependency that isn't then constructed and managed internally because it is truly optional. If it is not set, it's not used.

@CMCDragonkai
Copy link
Member

Removed the unnecessary db.start now from tests. Some are still needed because there are tests, that test the stop and start of the db.

@CMCDragonkai
Copy link
Member

There are some parts of code still checking whether the domains are running or not. This is no longer necessary due to async init creation. We always expect they to be running by the time we are using start. Especially for this:

    if (!this.db.running) {
      throw new dbErrors.ErrorDBNotRunning();
    }

Will be removing these as I see these.

@CMCDragonkai
Copy link
Member


[nix-shell:~/Projects/js-polykey/src]$ ag 'this.db.running'
nodes/NodeManager.ts
134:    if (!this.db.running) {

sigchain/Sigchain.ts
98:    if (!this.db.running) {

Other places which has this.

@CMCDragonkai
Copy link
Member

Note that protected methods do not need the @ready decorator, it is only applied on external methods.

@CMCDragonkai
Copy link
Member

Should have a documented checklist on all the constraints on using async-init. This should be going to documentation.

@CMCDragonkai
Copy link
Member

CMCDragonkai commented Nov 23, 2021

I've noticed all of the network/Connection* classes haven't been integrated with async-init pattern.

Since these are managed internally by the proxies, I'm leaving them out so we can merge for now. However a new issue will be created for this. #293

@CMCDragonkai
Copy link
Member

This has been done with the merge of https://gitlab.com/MatrixAI/Engineering/Polykey/js-polykey/-/merge_requests/213.

Left over issues are #293 and larger scale refactoring of nodes in #225.

Developer documentation should be written here regarding this usage.

@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy labels Jul 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development procedure Action that must be executed r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:core activity 3 Peer to Peer Federated Hierarchy
Development

No branches or pull requests

3 participants