Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#172884867] Automatic user data delete on GDPR right to erasure claim #62

Merged
merged 100 commits into from
Jul 28, 2020

Conversation

balanza
Copy link
Contributor

@balanza balanza commented Jul 16, 2020

This PR introduces into this service the ability to autonomously start the job that backup and deletes user data on user's request.

It depends on this to be merged: pagopa/io-backend#691

Done

  • trigger on user-data-processing collection
  • orchestrator workflow with user's session lock/unlock
  • api interaction with Session Api endpoints for locking/unlocking
  • start delete data activity
  • wait for 7 days before effectively delete data
  • handle request cancelling
  • make wait time an env parameter (for testing purpose)

Things worth looking

Env

Variable name Description type
SESSION_API_URL Internal URL of the BACKEND API used to handle session lock/unlock requests string
SESSION_API_KEY service access key for the session API string
USER_DATA_BACKUP_CONTAINER_NAME Name of the storage container in which user data is backuped before being permanently deleted string
USER_DATA_DELETE_DELAY_DAYS How many days to wait when a user asks for cancellation before effectively delete her data number
UserDataBackupStorageConnection Storage connection string for GDPR user data storage string

Feature flags

Variable name Description default
FF_ENABLE_USER_DATA_DOWNLOAD Users' GDPR data access claims are processed true
FF_ENABLE_USER_DATA_DELETE Users' GDPR right to erasure claims are processed true

Notable Changes

  • introduced a utils/fetch module with pre-built timeout fetch api.
  • introduced a utils/appinsightsEvens module with Application Insights helpers.
  • introduced XyzDeletableModels that extend Xyzs models by adding deleting operations (example: ProfileDeletableModel extends ProfileModel). Such modules are located in utils/extensions.

userDataBackup: IBlobServiceInfo;
fiscalCode: FiscalCode;
}) => {
return executeRecursiveBackupAndDelete<RetrievedProfile>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor style note, remove the curly braces and the return statement

userDataBackup: IBlobServiceInfo;
notification: RetrievedNotification;
}): TaskEither<DataFailure, RetrievedNotification> => {
return sequenceT(taskEitherSeq)<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general if you have a single return statement, just avoid the code block

* @param makeBackupBlobName takes an item and construct a name for the backup blob
* @param iterator an iterator of every result from the db
*/
const executeRecursiveBackupAndDelete = <T>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to generalize the recoursive logic and separate it from the specific use case (I.e the accumulators)

.mapLeft(toDocumentDeleteFailure)
.chain(_ => fromEither(_).mapLeft(toDocumentDeleteFailure)),
// recursive step
executeRecursiveBackupAndDelete<T>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this basically a reduce operation on the iteration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's basically this, but I didn't get to fit it in at the time of writing

.chain(e => fromEither(fromOption(undefined)(e)))
.foldTaskEither<DataFailure, Option<MessageContent>>(
_ => {
// unfortunately, a document not found is threated like a query error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that the user may get a subset of his data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in case of an error while fetching data. The rationale is: we don't want to impede the operation in case of a missing blob (which is a case of inconsistence of data, and should never happen).

@balanza balanza force-pushed the 172884867-automatic-delete branch from 2c212b9 to d9377f4 Compare July 24, 2020 12:41
@balanza
Copy link
Contributor Author

balanza commented Jul 24, 2020

what's the expected string value to obtain 'false' ?

Anything different from 1

@gunzip gunzip merged commit a1bb2b1 into master Jul 28, 2020
@gunzip gunzip deleted the 172884867-automatic-delete branch July 28, 2020 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants