Configure S3 to store Harvester Records #4335

btylerburton · 2023-05-26T22:34:21Z

User Story

In order to perform operations on the data records reliably, Data.gov wants an interface to interact with S3 or the LocalStack equivalent.

Acceptance Criteria

[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]

sourceId = UUID generated by controller on creation of new harvest source
jobId = UUID generated by controller when a new job is initiated
recordId = UUID generated by extract service to track status of record within pipeline

GIVEN I would like to save the extracted record from a harvest source
AND I have a prefix defined by the schema <feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to PUT that object using that prefix

GIVEN I would like to retrieve a previously saved record from S3
AND I have a prefix defined by the schema <feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to GET the object associated with that prefix

GIVEN I would like to delete a previously saved record
AND I have a prefix defined by the schema <feature>/<sourceId>/<jobId>/<recordId>
THEN I want a utility to DELETE the object associated with that prefix

GIVEN I would like to query the count of added/updated/deleted records with a <jobId>
AND I have a prefix defined by the schema <feature>/<sourceId>/<jobId>
THEN I want a utility to GET the objects associated with that <jobId> and return them.
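
A minimal sketch of how the key schema in these criteria could be built, assuming Python and the standard `uuid` module. The function names `job_prefix` and `record_key`, the validation rules, and the trailing slash on the job prefix are illustrative assumptions, not existing datagov code.

```python
import uuid


def _as_uuid_str(value) -> str:
    """Return the canonical string form of a UUID given as an object or string.

    Raises ValueError if the value is not a well-formed UUID.
    """
    if isinstance(value, uuid.UUID):
        return str(value)
    return str(uuid.UUID(str(value)))


def job_prefix(feature: str, source_id, job_id) -> str:
    """Build <feature>/<sourceId>/<jobId>/ for listing every record of one job.

    The trailing slash is an assumption of this sketch; it keeps a prefix
    listing from matching keys that merely start with the same characters.
    """
    if not feature or "/" in feature:
        raise ValueError("feature must be a non-empty name without '/'")
    return f"{feature}/{_as_uuid_str(source_id)}/{_as_uuid_str(job_id)}/"


def record_key(feature: str, source_id, job_id, record_id) -> str:
    """Build <feature>/<sourceId>/<jobId>/<recordId> for a single record."""
    return job_prefix(feature, source_id, job_id) + _as_uuid_str(record_id)
```

Because every record key starts with its job prefix, the fourth scenario (fetching all records for a `<jobId>`) reduces to listing objects under `job_prefix(...)`.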

Background

[Any helpful contextual notes or links to artifacts/evidence, if needed]

Data.gov would like all Boto / S3 references contained within a single module so that any upgrade to the service can be made in one place.

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

Create a module to interact with the S3 client
Create helper methods to abstract away all details except the name (<feature>/<sourceId>/<jobId>/<recordId>) and the value (the record to store)
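
One way the sketch above might look in Python, assuming boto3 as the client library and JSON-serializable records. The names `make_s3_client`, `RecordStore`, and its methods are hypothetical, as is the choice to store records as JSON; only the boto3 calls (`put_object`, `get_object`, `delete_object`, and the `list_objects_v2` paginator) are real API.

```python
import json
from typing import Any, Dict, Iterator, Optional


def make_s3_client(endpoint_url: Optional[str] = None):
    """Create a boto3 S3 client; pass endpoint_url to target LocalStack."""
    import boto3  # imported lazily so tests can inject a fake client instead

    return boto3.client("s3", endpoint_url=endpoint_url)


class RecordStore:
    """Hypothetical single home for all S3 calls made by the harvester."""

    def __init__(self, client, bucket: str):
        self._client = client
        self._bucket = bucket

    def put_record(self, name: str, value: Any) -> None:
        """PUT the record under <feature>/<sourceId>/<jobId>/<recordId>."""
        body = json.dumps(value).encode("utf-8")
        self._client.put_object(Bucket=self._bucket, Key=name, Body=body)

    def get_record(self, name: str) -> Any:
        """GET the record previously stored under the given name."""
        response = self._client.get_object(Bucket=self._bucket, Key=name)
        return json.loads(response["Body"].read())

    def delete_record(self, name: str) -> None:
        """DELETE the record stored under the given name."""
        self._client.delete_object(Bucket=self._bucket, Key=name)

    def list_record_names(self, prefix: str) -> Iterator[str]:
        """Yield every key under a prefix such as <feature>/<sourceId>/<jobId>/."""
        paginator = self._client.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=self._bucket, Prefix=prefix):
            # Pages for an empty prefix carry no "Contents" entry.
            for obj in page.get("Contents", []):
                yield obj["Key"]

    def get_job_records(self, prefix: str) -> Dict[str, Any]:
        """Return all records under a job prefix, keyed by object name."""
        return {name: self.get_record(name) for name in self.list_record_names(prefix)}
```

Passing the client in, rather than creating it inside the class, is a design choice of this sketch: the same helpers would work against AWS, against LocalStack (for example `make_s3_client("http://localhost:4566")`, assuming LocalStack's default edge port), or against an in-memory fake in unit tests.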

rshewitt · 2023-05-31T15:28:47Z

The PUT operation is already included in the work nearing completion in the extract ticket #4257.

robert-bryson · 2023-08-25T16:16:49Z

Moving back to New Dev, as this work has been superseded by the Airflow work.

btylerburton added this to data.gov team board May 26, 2023

btylerburton moved this to 📟 Sprint Backlog [7] in data.gov team board May 26, 2023

btylerburton added H2.0/Harvest-General General Harvesting 2.0 Issues H2.0/controller labels May 26, 2023

btylerburton assigned FuhuXia May 30, 2023

btylerburton changed the title ~~WIP Setup Interface for S3~~ Setup Interface for S3 May 30, 2023

hkdctol moved this from 📟 Sprint Backlog [7] to 🏗 In Progress [8] in data.gov team board Jul 20, 2023

hkdctol moved this from 🏗 In Progress [8] to 📟 Sprint Backlog [7] in data.gov team board Jul 20, 2023

robert-bryson moved this from 📟 Sprint Backlog [7] to 🏗 In Progress [8] in data.gov team board Aug 11, 2023

robert-bryson assigned robert-bryson and unassigned FuhuXia Aug 11, 2023

robert-bryson moved this from 🏗 In Progress [8] to New Dev in data.gov team board Aug 25, 2023

robert-bryson removed their assignment Sep 5, 2023

btylerburton changed the title ~~Setup Interface for S3~~ WIP Setup Interface for S3 Nov 27, 2023

btylerburton changed the title ~~WIP Setup Interface for S3~~ WIP Use S3 to store XCom objects Nov 27, 2023

btylerburton assigned FuhuXia and unassigned FuhuXia Nov 27, 2023

btylerburton changed the title ~~WIP Use S3 to store XCom objects~~ Configure S3 to store XCom objects Dec 6, 2023

btylerburton removed the H2.0/Harvest-General General Harvesting 2.0 Issues label Dec 13, 2023

btylerburton moved this to 🧊 Icebox in data.gov team board Feb 16, 2024

btylerburton changed the title ~~Configure S3 to store XCom objects~~ Configure S3 to store Harvester Records Feb 16, 2024

btylerburton added H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0 and removed H2.0/Airflow labels Feb 16, 2024
