
Permanent storage for Argo's data #745

Closed
vicaire opened this issue Feb 15, 2018 · 7 comments


vicaire commented Feb 15, 2018

Is this a BUG REPORT or FEATURE REQUEST?: FEATURE REQUEST

It is great to be able to see the workflow history using "argo list" and to dig into more details using "argo get" and "argo logs".

If I understand correctly, however, all this data is stored in the Kubernetes key-value store, which is not intended to be permanent storage.

What solution would you recommend for exporting the workflow execution history to permanent storage? Rather than exporting all the data at once, I am looking for a solution that exports the data as it is being generated, so that it can be analysed in near real time.

Would it make sense for Argo to support this as a feature? For instance, I would provide a MySQL database, and Argo would populate it as workflows execute and make progress.

Thanks!


vicaire commented Feb 16, 2018

It looks like a Kubernetes API extension could handle reconciling the state of Argo workflows with some external storage (such as a MySQL DB, GCS, etc.):

https://github.com/kubernetes-incubator/apiserver-builder/blob/master/docs/concepts/api_building_overview.md#reconciliation

jessesuen (Member) commented

Hi @vicaire, that is correct. Long-term storage of workflows in etcd is not a very scalable approach. In fact, ideally I would like to add some GC settings/options in the controller to simply delete workflows after some time.

We have thought about long-term persistence of workflows, and have come to the conclusion that it needs to be done outside the purview of the workflow-controller. Internally, we have described something like an archiver service, which watches for completed workflows and simply dumps the workflow payload into a database/S3/etc. It could additionally perform GC on workflows it has already archived. argo-ui would have to be taught about the secondary location in order to present a unified view of workflows.
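
Purely as an illustration (not an existing Argo component), a minimal Go sketch of such an archiver might look like the following, assuming a recent client-go dynamic client; the kubeconfig path, namespace, and the archive() sink are placeholders:

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

// GVR of the Argo Workflow custom resource.
var workflowGVR = schema.GroupVersionResource{
	Group:    "argoproj.io",
	Version:  "v1alpha1",
	Resource: "workflows",
}

// archive is a placeholder for the real sink (MySQL, S3, GCS, ...).
func archive(payload []byte) error {
	fmt.Printf("archiving %d bytes\n", len(payload))
	return nil
}

func main() {
	// Placeholder kubeconfig path and namespace.
	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		log.Fatal(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Watch Workflow objects and hand the completed ones to the archiver.
	w, err := client.Resource(workflowGVR).Namespace("default").Watch(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for event := range w.ResultChan() {
		wf, ok := event.Object.(*unstructured.Unstructured)
		if !ok {
			continue
		}
		phase, _, _ := unstructured.NestedString(wf.Object, "status", "phase")
		if phase != "Succeeded" && phase != "Failed" && phase != "Error" {
			continue // only completed workflows get archived
		}
		payload, err := json.Marshal(wf.Object)
		if err != nil {
			log.Printf("marshal %s: %v", wf.GetName(), err)
			continue
		}
		if err := archive(payload); err != nil {
			log.Printf("archive %s: %v", wf.GetName(), err)
		}
	}
}
```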

I do think there may be some controller work to do proper archiving of logs. See #454 for some thoughts around this.


vicaire commented Feb 17, 2018

Thanks Jesse. That makes sense.


joshes commented Aug 9, 2018

@jessesuen re:

In fact, ideally I would like to add some GC settings/options in the controller to simply delete workflows after some time.

Is there a ticket for this effort? I'd like to look into this as well if it's not already underway. I ended up just writing a cronjob that deletes workflows older than N days, which works, but it would be nice if it were something inherent to the system. A rough sketch of that idea is below.
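
This is not the actual cronjob described above, just a minimal Go sketch of the same cleanup idea using client-go's dynamic client; the kubeconfig path, namespace, and seven-day retention window are placeholder assumptions:

```go
package main

import (
	"context"
	"log"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholder kubeconfig path, namespace, and retention window.
	const namespace = "default"
	maxAge := 7 * 24 * time.Hour

	cfg, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		log.Fatal(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	workflows := client.Resource(schema.GroupVersionResource{
		Group:    "argoproj.io",
		Version:  "v1alpha1",
		Resource: "workflows",
	}).Namespace(namespace)

	// List all workflows and delete the ones older than the retention window.
	list, err := workflows.List(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	cutoff := time.Now().Add(-maxAge)
	for _, wf := range list.Items {
		if wf.GetCreationTimestamp().Time.Before(cutoff) {
			if err := workflows.Delete(context.Background(), wf.GetName(), metav1.DeleteOptions{}); err != nil {
				log.Printf("delete %s: %v", wf.GetName(), err)
			}
		}
	}
}
```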


vicaire commented Aug 11, 2018

@jessesuen,

That looks like a nice way to do it. The libraries used to implement controllers (watch APIs, queues) should make it easy to monitor the workflows, save them and garbage collect them.

What are your thoughts on using K8s API extensions for this purpose (https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#api-server-aggregation)? It looks like one could replace the controller/CRD and take care of (optionally) persisting all the data in a DB in addition to etcd. The UI/CLI could then call this API extension and get a complete view (listing what is in both the DB and etcd) instead of just a view of what has not yet been garbage collected from etcd.

Alternatively, what about having the current CRD controller itself optionally save to permanent storage and garbage collect? (One issue seems to be that the list call would still only return what is stored in etcd.)
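
As a toy illustration of the "complete view" idea (not any existing Argo API), a merged listing might deduplicate by UID and prefer the live copy from etcd over the archived one; the WorkflowRecord type here is hypothetical:

```go
package main

import "fmt"

// WorkflowRecord is a hypothetical, minimal projection of a workflow
// used only for listing; it is not an Argo type.
type WorkflowRecord struct {
	UID   string
	Name  string
	Phase string
}

// mergeViews combines live workflows (from etcd) with archived ones
// (from the database), deduplicating by UID and preferring the live copy.
func mergeViews(live, archived []WorkflowRecord) []WorkflowRecord {
	seen := make(map[string]bool, len(live))
	merged := make([]WorkflowRecord, 0, len(live)+len(archived))
	for _, wf := range live {
		seen[wf.UID] = true
		merged = append(merged, wf)
	}
	for _, wf := range archived {
		if !seen[wf.UID] {
			merged = append(merged, wf)
		}
	}
	return merged
}

func main() {
	live := []WorkflowRecord{{UID: "a1", Name: "build-42", Phase: "Running"}}
	archived := []WorkflowRecord{
		{UID: "a1", Name: "build-42", Phase: "Running"}, // still in etcd, skipped
		{UID: "b2", Name: "build-41", Phase: "Succeeded"},
	}
	for _, wf := range mergeViews(live, archived) {
		fmt.Println(wf.Name, wf.Phase)
	}
}
```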

@edlee2121 edlee2121 added this to the V2.3 milestone Aug 29, 2018
@alexmt alexmt modified the milestones: v2.3, v2.4 Jan 25, 2019
@sarabala1979 sarabala1979 self-assigned this Apr 11, 2019
@sarabala1979 sarabala1979 added the type/feature Feature request label Apr 17, 2019
jessesuen (Member) commented

Workflow persistence is actually back on the table, and is targeted for the next release v2.4.

However, the API server work (to leverage that persistence) is scheduled for a later release, v2.5.


agnewp commented Aug 23, 2019

Hey guys, I was just playing with the idea of a workflow object TTL, with the obvious trade-off that all workflow data is essentially lost once the object gets cleaned up by the TTL controller mechanism. My question: if the workflow data does have a place to live permanently, in a database for example, are you planning to have the Prometheus metrics reflect the stats gathered in that more permanent database, or should those continue to reflect the objects currently present in the key/value store in etcd? I would like Prometheus to get its metrics from the database, but I can also see the utility of having some metrics that show the current state in etcd.
