`update_lock` not race-condition-safe #168

timobrembeck · 2023-02-22T22:19:39Z

`update_lock` uses the `threading` library to avoid race conditions between the asynchronous post-save signal listeners:

django-linkcheck/linkcheck/__init__.py

Lines 4 to 5 in 6d933de

    
    # A global lock, showing whether linkcheck is busy
    update_lock = threading.Lock()

django-linkcheck/linkcheck/listeners.py

Line 77 in a6599a6

    with update_lock:

However, when Django is deployed in a production environment using multiple processes, this might not be race-condition-safe since each process has its own lock.
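To illustrate, here is a minimal sketch. It assumes each worker process imports the module on its own, as separately started web server workers would. The child process creates its own `update_lock` and can acquire it even while the parent holds its lock. `CHILD_CODE` and the function name are made up for this example and are not part of linkcheck.

```python
import subprocess
import sys
import threading

# Module-level lock, analogous to linkcheck's update_lock.
update_lock = threading.Lock()

# Hypothetical stand-in for a second worker process that imports the module
# independently and therefore constructs its own, unrelated lock object.
CHILD_CODE = (
    "import threading\n"
    "update_lock = threading.Lock()\n"
    "print(update_lock.acquire(blocking=False))\n"
)


def child_acquires_while_parent_holds() -> bool:
    """Return True if a separate process can take its lock while we hold ours."""
    with update_lock:
        result = subprocess.run(
            [sys.executable, "-c", CHILD_CODE],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print(child_acquires_while_parent_holds())
```

The parent's lock never blocks the child, so two processes could both enter the critical section at the same time.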

An alternative might be django-db-mutex (https://django-db-mutex.readthedocs.io/), which achieves the synchronization via a separate database model. Some environments may offer more performant solutions (e.g. a Redis cache), but the least common denominator is probably the database.
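The general idea of a database-backed mutex can be sketched with the standard library alone. This is not django-db-mutex's implementation, only an illustration under assumptions: the `mutex` table, `LockHeldError`, and `db_lock` are hypothetical names, and `sqlite3` stands in for whatever database the project uses. A unique key makes the insert fail when another connection already holds the lock.

```python
import sqlite3
from contextlib import contextmanager


class LockHeldError(Exception):
    """Raised when another connection already holds the named lock."""


def init_lock_table(conn: sqlite3.Connection) -> None:
    # One row per held lock; the primary key enforces mutual exclusion.
    conn.execute("CREATE TABLE IF NOT EXISTS mutex (lock_id TEXT PRIMARY KEY)")
    conn.commit()


@contextmanager
def db_lock(conn: sqlite3.Connection, lock_id: str):
    """Hold the lock for the duration of the block, or fail immediately."""
    try:
        conn.execute("INSERT INTO mutex (lock_id) VALUES (?)", (lock_id,))
        conn.commit()
    except sqlite3.IntegrityError as exc:
        # The row already exists, so someone else holds the lock.
        conn.rollback()
        raise LockHeldError(lock_id) from exc
    try:
        yield
    finally:
        # Release by deleting the row again.
        conn.execute("DELETE FROM mutex WHERE lock_id = ?", (lock_id,))
        conn.commit()
```

Because the lock state lives in the database, it is visible to every process that connects to it. Note that this sketch has no expiry, so a crashed process would leave a stale row behind.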


claudep · 2023-02-23T20:27:56Z

I think adding a dependency on django-db-mutex is a reasonable way to solve this.

timobrembeck · 2023-02-25T17:05:14Z

Ah, I just noticed that this package does not implement a waiting lock; it just raises an exception when the lock is currently in use. Adding a retry mechanism (by putting the failed task back in the queue) would also be possible, but I wonder whether there is an easier way?
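One possible shape for such a retry mechanism is sketched below. It polls with a timeout rather than re-queueing the task. `FailFastLock` and `LockBusy` are hypothetical stand-ins for a lock that raises instead of waiting; they are not the django-db-mutex API.

```python
import threading
import time
from contextlib import contextmanager


class LockBusy(Exception):
    """Raised by FailFastLock when the lock is already held."""


class FailFastLock:
    """Hypothetical stand-in for a lock that raises instead of waiting."""

    def __init__(self) -> None:
        self._lock = threading.Lock()

    def acquire(self) -> None:
        if not self._lock.acquire(blocking=False):
            raise LockBusy()

    def release(self) -> None:
        self._lock.release()


@contextmanager
def waiting(lock: FailFastLock, timeout: float = 5.0, interval: float = 0.05):
    """Retry acquire() until it succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            lock.acquire()
            break
        except LockBusy:
            if time.monotonic() >= deadline:
                raise  # give up and let the caller decide what to do
            time.sleep(interval)
    try:
        yield
    finally:
        lock.release()
```

Polling trades some latency and extra acquire attempts for simplicity. Re-queueing the failed task would avoid blocking the worker, but it needs a task queue to be in place.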

multiprocessing.Lock() seems to be unsuitable because it's hard to share the lock resource between the processes that are spawned by the web server, right?

Do you think we could somehow abuse select_for_update() for this task?
If I understand it correctly, this would only be supported by the PostgreSQL, Oracle, and MySQL database backends. Is that a problem?

Another alternative seems to be django-cache-lock, but the caveat is that all dependent applications would have to use a caching backend (and the LocMemCache is not shared between processes, so this is a non-trivial setup)...

claudep · 2023-02-26T12:10:04Z

I think it will be hard to find a process-safe locking mechanism working in all situations. I have no better solution to suggest for now. django-db-mutex with a retry mechanism or extracting the django-db-mutex functionality we need might be the way to go.

timobrembeck mentioned this issue Feb 26, 2023

Migrate from threading.Lock() to django-db-mutex #170

Closed

This was referenced Oct 16, 2023

Link checker recognizes a link to Google Map as internal link digitalfabrik/integreat-cms#2465

Closed

Hide links from old versions in link checker digitalfabrik/integreat-cms#2466

Closed

timobrembeck mentioned this issue Oct 27, 2023

Meta: 🚀 Upstream Contributions digitalfabrik/integreat-cms#2493

Open

timobrembeck mentioned this issue Nov 15, 2023

Perform link replacement as background task digitalfabrik/integreat-cms#2558

Merged
