Data stores, Collections, and Buffers #2190
Comments
@jyeshe @josephjclark I've updated the issue description for you guys to take a look at and think about before we chat next.
Summary of discussion with @stuartc:

We've agreed that support for the collections API will mostly take the form of a regular adaptor. A standalone adaptor, not common.

To make this work, Runs will need to support two adaptors for each step. As it happens, the runtime already supports this, so we just need to ensure support in the CLI, worker and engine.

The Lightning contract does not need to change: the Worker will append the collections adaptor to the run spec. Lightning can later explicitly send an array of adaptors if it wants to (this will be important if the user wants to pick a collections provider from a list).

The collections API will use the run token to authorise all incoming requests. I.e., when calling out to the collections API, you must include an API token. The Worker will attach the run token to

The run token is totally safe to send to the job code. While the run token can be used to load credentials and dataclips from Lightning, it can only be used with a web socket API, and to connect to the web socket you need a separate worker token. So it's basically useless to you.

We can, however, add further security by separating out state and configuration into two objects. The signature for an operation would become

This approach is entirely compatible with the CLI. Users will need to explicitly specify the collections adaptor (ie,

So, here's the engineering work in JS land:
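To make the "Worker appends the collections adaptor to the run spec" idea concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the run spec shape (`steps`, `adaptor`, `adaptors`), the function names, the placeholder adaptor name, and the use of a Bearer `Authorization` header for the run token are not taken from the actual Worker or Lightning code.

```javascript
// Hypothetical sketch: names and shapes here are illustrative, not the real Worker API.
const COLLECTIONS_ADAPTOR = '@openfn/language-collections'; // placeholder adaptor name

// Return a copy of the run spec in which every step lists the collections
// adaptor alongside its own adaptor(s). The input object is not mutated,
// so whatever Lightning sent stays unchanged.
function appendCollectionsAdaptor(runSpec, collectionsAdaptor = COLLECTIONS_ADAPTOR) {
  return {
    ...runSpec,
    steps: runSpec.steps.map((step) => {
      // Accept either a single `adaptor` string or an `adaptors` array
      const existing = step.adaptors ?? (step.adaptor ? [step.adaptor] : []);
      const adaptors = existing.includes(collectionsAdaptor)
        ? existing
        : [...existing, collectionsAdaptor];
      // Drop the singular `adaptor` key in favour of the `adaptors` array
      const { adaptor, ...rest } = step;
      return { ...rest, adaptors };
    }),
  };
}

// Build request headers carrying the run token.
// The Bearer scheme is an assumption; the thread only says a token must be included.
function collectionsHeaders(runToken) {
  return { Authorization: `Bearer ${runToken}` };
}

const spec = { id: 'run-1', steps: [{ id: 'a', adaptor: '@openfn/language-http' }] };
console.log(appendCollectionsAdaptor(spec).steps[0].adaptors);
```

Because the function already accepts an `adaptors` array, the same sketch would keep working if Lightning later sends the array explicitly; the Worker would then only add the collections adaptor when it is missing.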
@stuartc any thoughts on expiration of records? Data retention stuff? A burn after reading option?
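To help frame that question, here is a rough sketch of what per-record expiry and a burn-after-reading flag could look like. It uses an in-memory `Map` purely as a stand-in for the real store, and the class name, option names (`ttlMs`, `burnAfterReading`) and semantics are all hypothetical, not a proposal for the actual API.

```javascript
// Illustrative only: an in-memory stand-in for a collection with two retention options.
class EphemeralCollection {
  // `now` is injectable so expiry can be exercised without waiting on a real clock
  constructor(now = () => Date.now()) {
    this.now = now;
    this.records = new Map();
  }

  // ttlMs: the record expires this many ms after being set (undefined = never)
  // burnAfterReading: the record is deleted the first time it is read
  set(key, value, { ttlMs, burnAfterReading = false } = {}) {
    const expiresAt = ttlMs === undefined ? Infinity : this.now() + ttlMs;
    this.records.set(key, { value, expiresAt, burnAfterReading });
  }

  get(key) {
    const record = this.records.get(key);
    if (!record) return undefined;
    // Lazily evict expired records on read
    if (this.now() >= record.expiresAt) {
      this.records.delete(key);
      return undefined;
    }
    if (record.burnAfterReading) this.records.delete(key);
    return record.value;
  }
}
```

Expiry here is checked lazily on read; a database-backed version would more likely need a periodic cleanup job, which is one of the performance questions still to be settled.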
We want to add the ability for Job code to be able to access a common datastore during execution.
The current proposal is that we leverage our existing Postgres database, but we must remain eagle-eyed about performance implications, as most of this data will not be indexable.
This will require work on both Lightning and the worker code.
Worker
Lightning
Grouped Deliverables
This is a rough bounding of features we can take on individually:
Job
Worker
Controller
Collections
Still TBD