[EPIC] Extend IOPointers to store values in addition to keys #214

Open
4 of 6 tasks
shreyashankar opened this issue Sep 4, 2021 · 1 comment
Comments

@shreyashankar
Collaborator

shreyashankar commented Sep 4, 2021

Context: the current IOPointer abstraction stores only a key, i.e., a string "pointer" to the data, such as features.csv or model.joblib. Because we store nothing about the data itself, we currently can't do anything with its contents. If we stored the data, we could do many things, including the following:

  • Compare current values to historical ComponentRuns' values
  • Identify whether files have been tampered with outside of ComponentRuns
  • Have more fine-grained tracing (record-level)

Storing the data in its entirety may be expensive. For now we will store a hash of the data, which gets us one step closer to storing the data itself. Even hashing may be complex.
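To make the hashing idea concrete, here is a minimal sketch, assuming the data lives in a file on disk and that SHA-256 over the raw bytes is an acceptable hash. `HashedPointer`, `hash_file`, `make_pointer`, and `has_changed` are hypothetical names used only for illustration; they are not part of the existing IOPointer API.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class HashedPointer:
    """Hypothetical stand-in for an IOPointer that also carries a value hash."""

    name: str        # the key, e.g. "features.csv"
    value_hash: str  # hex SHA-256 digest of the file contents


def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_pointer(path: str) -> HashedPointer:
    """Build a pointer whose key is the file name and whose value is the content hash."""
    return HashedPointer(name=os.path.basename(path), value_hash=hash_file(path))


def has_changed(pointer: HashedPointer, path: str) -> bool:
    """Return True if the file's current contents no longer match the stored hash."""
    return hash_file(path) != pointer.value_hash
```

Comparing a stored `value_hash` against a fresh `hash_file` result covers the first two bullets above (comparison with historical values and tamper detection), but only at the whole-file level. Record-level tracing would need something finer-grained.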

Issues

In the future, we will incorporate an IOPointer "tag" model to store information about fine-grained tracing (i.e., PK values will be tags). This tagging is out of scope for the current project.

@shreyashankar shreyashankar added L Large task, maybe somewhat dreading (multiple day & refactor) epic labels Sep 4, 2021
@shreyashankar shreyashankar self-assigned this Sep 4, 2021
@shreyashankar shreyashankar removed the L Large task, maybe somewhat dreading (multiple day & refactor) label Sep 4, 2021
@shreyashankar
Collaborator Author

Goal: Have all this done by September 15 EOD
