Configure S3 to store Harvester Records #4335
Labels: H2.0/Harvest-Runner (Harvest Source Processing for Harvesting 2.0)
Comments
btylerburton added the H2.0/Harvest-General (General Harvesting 2.0 Issues) and H2.0/controller labels on May 26, 2023
The PUT object work is currently included in the near-complete work in the extract ticket #4257.
hkdctol moved this from 📟 Sprint Backlog [7] to 🏗 In Progress [8] in the data.gov team board on Jul 20, 2023
hkdctol moved this from 🏗 In Progress [8] to 📟 Sprint Backlog [7] in the data.gov team board on Jul 20, 2023
robert-bryson moved this from 📟 Sprint Backlog [7] to 🏗 In Progress [8] in the data.gov team board on Aug 11, 2023
Moving back to new dev, as this work was superseded by the Airflow work.
btylerburton changed the title from WIP Setup Interface for S3 to WIP Use S3 to store XCom objects on Nov 27, 2023
btylerburton changed the title from WIP Use S3 to store XCom objects to Configure S3 to store XCom objects on Dec 6, 2023
btylerburton changed the title from Configure S3 to store XCom objects to Configure S3 to store Harvester Records on Feb 16, 2024
btylerburton added the H2.0/Harvest-Runner (Harvest Source Processing for Harvesting 2.0) label and removed the H2.0/Airflow label on Feb 16, 2024
User Story
In order to perform operations on the data records reliably, data.gov wants an interface to interact with S3 or the localstack equivalent.
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
sourceId = UUID generated by controller on creation of new harvest source
jobId = UUID generated by controller when a new job is initiated
recordId = UUID generated by extract service to track status of record within pipeline
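As an illustration of how these three IDs compose into the object key, a hedged sketch (the `extract` feature name is an assumption, not from this ticket):

```python
import uuid

# Per the definitions above; values here are freshly generated for illustration.
source_id = str(uuid.uuid4())  # generated by the controller when the harvest source is created
job_id = str(uuid.uuid4())     # generated by the controller when a new job is initiated
record_id = str(uuid.uuid4())  # generated by the extract service for each record

# Key schema: <feature>/<sourceId>/<jobId>/<recordId>
key = f"extract/{source_id}/{job_id}/{record_id}"

# Everything under the <feature>/<sourceId>/<jobId>/ prefix belongs to one job run,
# which is what a GET-by-jobId query can rely on.
job_prefix = key.rsplit("/", 1)[0] + "/"
```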
GIVEN I would like to save the extracted record from a harvest source
AND I have a prefix defined by the schema
<feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to PUT that object using that prefix
GIVEN I would like to retrieve a previously saved record from S3
AND I have a prefix defined by the schema
<feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to GET the object associated with that prefix
GIVEN I would like to delete a previously saved record
AND I have a prefix defined by the schema
<feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to DELETE the object associated with that prefix
GIVEN I would like to query the count of added/updated/deleted records with a <jobId>
AND I have a prefix defined by the schema
<feature>/<sourceId>/<jobId>
THEN I want a utility to GET the objects associated with that <jobId> and return them.
Background
[Any helpful contextual notes or links to artifacts/evidence, if needed]
Data.gov would like all Boto / S3 references contained within a single module so that any upgrades to the service would happen simultaneously.
Security Considerations (required)
[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]
Sketch
<feature>/<sourceId>/<jobId>/<recordId>
) and value (the record to store)