Permanent storage for Argo's data #745
Comments
It looks like a Kubernetes API extension could handle reconciling the state of Argo workflows with some external storage (such as a MySQL DB, GCS, etc.):
Hi @vicaire, that is correct. Long-term storage of workflows in etcd is not a very scalable approach. In fact, ideally I would like to add some GC settings/options in the controller to simply delete workflows after some time. We have thought about long-term persistence of workflows, and have come to the conclusion that it needs to be something done outside the purview of the workflow-controller. Internally, we have described something like an archiver service, which watches for completed workflows and simply dumps the workflow payload into a database/S3/etc. It could additionally perform GC on workflows it has already archived. argo-ui would have to be taught about the secondary location in order to present a unified view of workflows. I do think there may be some controller work to do proper archiving of logs. See #454 for some thoughts around this.
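A minimal sketch of what such an archiver service could look like, assuming the Python `kubernetes` client and `boto3` packages are available; the namespace, bucket name, and the set of "completed" phases are placeholders for illustration, not anything Argo ships:

```python
# Sketch of an external "archiver": watch for completed Workflows and dump
# their payload to S3. Assumes the `kubernetes` and `boto3` Python packages;
# the bucket and namespace names below are placeholders.
import json

import boto3
from kubernetes import client, config, watch

config.load_kube_config()  # or load_incluster_config() when running in-cluster
api = client.CustomObjectsApi()
s3 = boto3.client("s3")

COMPLETED_PHASES = {"Succeeded", "Failed", "Error"}

w = watch.Watch()
for event in w.stream(api.list_namespaced_custom_object,
                      group="argoproj.io", version="v1alpha1",
                      namespace="default", plural="workflows"):
    wf = event["object"]
    phase = wf.get("status", {}).get("phase")
    if phase in COMPLETED_PHASES:
        name = wf["metadata"]["name"]
        # Archive the full workflow payload; once stored, the object can be
        # safely garbage collected from etcd by a separate step.
        s3.put_object(Bucket="my-workflow-archive",
                      Key=f"workflows/{name}.json",
                      Body=json.dumps(wf))
```

In this shape the archiver stays entirely outside the workflow-controller, which matches the "outside the purview of the controller" idea described above.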
Thanks, Jesse. That makes sense.
@jessesuen re:
Is there a ticket for this effort? I'd like to look into this as well if it's not already underway. I ended up just writing a cronjob that deletes workflows older than N days, which works, but it would be nice if it were something inherent to the system.
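For reference, the kind of cleanup described above could be sketched roughly as follows, assuming the Python `kubernetes` client; the namespace and the 7-day retention window are placeholder values, and the script would typically run on a schedule (e.g. from a Kubernetes CronJob):

```python
# Rough sketch of "delete workflows older than N days". Assumes the
# `kubernetes` Python package; namespace and retention are placeholders.
from datetime import datetime, timedelta, timezone

from kubernetes import client, config

RETENTION = timedelta(days=7)

config.load_incluster_config()
api = client.CustomObjectsApi()

wfs = api.list_namespaced_custom_object(
    group="argoproj.io", version="v1alpha1",
    namespace="default", plural="workflows")

cutoff = datetime.now(timezone.utc) - RETENTION
for wf in wfs["items"]:
    finished = wf.get("status", {}).get("finishedAt")
    if not finished:
        continue  # still running (or no status yet); leave it alone
    finished_at = datetime.strptime(
        finished, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    if finished_at < cutoff:
        api.delete_namespaced_custom_object(
            group="argoproj.io", version="v1alpha1",
            namespace="default", plural="workflows",
            name=wf["metadata"]["name"],
            body=client.V1DeleteOptions())
```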
jessesuen@, that looks like a nice way to do it. The libraries used to implement controllers (watch APIs, queues) should make it easy to monitor the workflows, save them, and garbage collect them.

What are your thoughts about K8s API extensions for this purpose (https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#api-server-aggregation)? It looks like an aggregated API server could replace the controller/CRD and take care of (optionally) persisting all the data in a DB in addition to etcd. The UI/CLI could then call this API extension and get a complete view (listing what is in the DB and in etcd) instead of just a view of what has not yet been garbage collected from etcd.

Alternatively, what about having the current CRD controller itself optionally save to permanent storage and garbage collect? (One issue seems to be that the list call would still only return what is stored in etcd.)
Workflow persistence is actually back on the table and is targeted for the next release, v2.4. However, the API server work (to leverage persistence) is scheduled for a later release, v2.5.
Hey guys, I was just playing with the idea of a workflow object TTL, with the obvious trade-off that all workflow data is essentially lost after the object gets cleaned up by the TTL controller mechanism. My question here is: if the workflow data does have a place to live 'permanently', in a database for example, are you planning to have the Prometheus metrics reflect the stats gathered in this more permanent database, or should those continue to reflect the objects currently present in the key/value store in etcd? I would like it if I could have Prometheus getting its metrics from the database, but I can also see the utility of having some metrics that show the current state in etcd...
Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST
It is great to be able to see the workflow history using "argo list" and to dig into more details using "argo get" and "argo logs".
If I understand correctly, however, all this data is stored in the Kubernetes key-value store (etcd), which is not intended to be permanent storage.
What solution would you recommend for exporting the workflow execution history to permanent storage? Rather than exporting all the data at once, I am looking for a solution that exports the data as it is generated, so that it can be analysed in near real time.
Would it make sense for Argo to support this as a feature? For instance, I would provide a MySQL database, and Argo would populate it as workflows are executed and make progress (a rough sketch of what such an export might look like follows after this post).
Thanks!
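As one illustration of the near-real-time export requested above, a small watcher could mirror every workflow update into a relational table as events arrive. The sketch below uses Python with the standard-library `sqlite3` module standing in for MySQL; the table name and columns are hypothetical, not an existing Argo schema:

```python
# Illustration only: mirror workflow updates into a relational table as they
# happen, so the history survives garbage collection from etcd. sqlite3 stands
# in for MySQL here; the table and columns are hypothetical.
import json
import sqlite3

from kubernetes import client, config, watch

config.load_kube_config()
api = client.CustomObjectsApi()

db = sqlite3.connect("workflows.db")
db.execute("""CREATE TABLE IF NOT EXISTS workflow_history (
                  name        TEXT PRIMARY KEY,
                  namespace   TEXT,
                  phase       TEXT,
                  started_at  TEXT,
                  finished_at TEXT,
                  payload     TEXT)""")

w = watch.Watch()
for event in w.stream(api.list_cluster_custom_object,
                      group="argoproj.io", version="v1alpha1",
                      plural="workflows"):
    wf = event["object"]
    status = wf.get("status", {})
    # Upsert on every event so the row always reflects the latest state.
    db.execute(
        "INSERT OR REPLACE INTO workflow_history VALUES (?, ?, ?, ?, ?, ?)",
        (wf["metadata"]["name"], wf["metadata"]["namespace"],
         status.get("phase"), status.get("startedAt"),
         status.get("finishedAt"), json.dumps(wf)))
    db.commit()
```

A real implementation would presumably also need to handle watch reconnects and deletions, but the basic flow (watch, upsert, commit) is what the feature request describes.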