Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate adding a distributed lock for ingest #2843

Open
anna-parker opened this issue Sep 19, 2024 · 2 comments
Open

Investigate adding a distributed lock for ingest #2843

anna-parker opened this issue Sep 19, 2024 · 2 comments
Labels
backend related to the loculus backend component deployment Code changes targetting the deployment infrastructure ingest Ingest pipeline

Comments

@anna-parker
Copy link
Contributor

We saw some suspicious duplication of ingest data again in preview instances (see #2653). We saw this before due to concurrency of ingest jobs. We thought that adding

replicas: 1
  strategy:
    type: Recreate

to the ingest deployment would prevent concurrency but maybe this isn't enough.

Some ideas how we could add a distributed lock:

  1. Redis with Redlock: Pods can acquire a distributed lock from Redis before communicating with another pod.
  2. Add a lock endpoint to the backend which would make the backend add a lock to the backend, sth along the lines of:
  • Add a lock table: with username (maybe also PID) and time of lock
  • Add a lock endpoint that adds an entry to the lock table for that user and otherwise returns an error if there is already an entry in the lock table for that user
  • Have ingest call the lock endpoint before calling get-original-metadata. Only continue with submission if the call is successful.
  • Have the backend clean up the lock table by removing old locks, e.g. older than 30min
@anna-parker anna-parker added ingest Ingest pipeline backend related to the loculus backend component deployment Code changes targetting the deployment infrastructure labels Sep 19, 2024
@theosanderson
Copy link
Member

I wouldn't necessarily proceed on this before we understand the issue by e.g. seeing the concurrency in logs

@theosanderson
Copy link
Member

We identified a likely cause of duplicate ingest (resolved by #2846). We can likely close this Issue without action if issues do not resurface in next few weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend related to the loculus backend component deployment Code changes targetting the deployment infrastructure ingest Ingest pipeline
Projects
None yet
Development

No branches or pull requests

2 participants