
Expiration for all locks #393

Closed

JacobEvelyn opened this issue May 14, 2019 · 9 comments

@JacobEvelyn

Is your feature request related to a problem? Please describe.

I find it confusing that lock expiration doesn't allow me to expire locks after some long period of time (for instance, 2 days) when using strategies other than until_expired. I'm worried that if something goes wrong and a lock isn't released (or a job is suddenly killed mid-execution and it has a runtime lock), those locks will linger forever and the same job can never again be enqueued.

Based on past issues it looks like this has happened before (for example: #277), so I think this concern is well-founded.

Describe the solution you'd like

I'd like lock_expiration to simply be the Redis TTL that all locks use, without any fanciness. (For backwards compatibility, this can also be called something else if that's clearer.) Honestly I think it even makes sense to have this behavior enabled by default for all jobs, since a lock that's been present for many days is probably the sign of a bug and should be released.
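
For illustration, here is a minimal sketch of the requested semantics, assuming lock_expiration were applied as a plain Redis TTL on every lock key regardless of lock strategy (the job class name is made up, and 2.days.to_i relies on ActiveSupport):

class SyncSubscriptionsJob
  include Sidekiq::Worker

  # Desired behavior: the lock key itself gets a 2-day TTL,
  # so even an orphaned lock disappears on its own.
  sidekiq_options lock: :until_executed, lock_expiration: 2.days.to_i
end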

Describe alternatives you've considered

We're using Sidekiq Pro's super_fetch reliability feature which addresses the same issue from a different angle, but I think it's still valid to want locks to disappear after some long amount of time.

@mhenrixon
Owner

The problems described are due to bugs in the scripts. This will not be a problem in the near future, as there will be a cleanup running on Sidekiq.on(:heartbeat). The locking mechanism is also getting some serious love in #387, which will further reduce this issue.

I'll hold off on your suggestion until that work is complete, but I will leave this issue open until we see whether the problem is solved in a better way or not.
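
For orientation, registering work on that Sidekiq lifecycle event looks roughly like this (a sketch only; reap_orphaned_locks is a hypothetical stand-in for the gem's internal cleanup, not an actual API):

Sidekiq.configure_server do |config|
  # :heartbeat fires periodically on the server process, so a reaper
  # hooked here can regularly sweep stale lock keys.
  config.on(:heartbeat) do
    SidekiqUniqueJobs.reap_orphaned_locks # hypothetical method name
  end
end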

@mhenrixon
Owner

This should be fixed now that #387 got merged. No release yet; I still have a few things to work through first.

@amitsaxena

@mhenrixon this has been happening a lot to us in production recently and is breaking our crons (which use https://github.com/ondrejbartas/sidekiq-cron). It would be nice to be able to disregard a lock after a configurable amount of time. What's the recommended way to achieve this, instead of manually deleting the locks? If there isn't one, I'll have to write a cron that scans Redis for lock-related keys and clears out the old ones. This has been causing us a lot of pain recently.

@mhenrixon
Owner

@amitsaxena though it might not be a global setting, lock_expiration is used for this purpose. In v7 there is also the orphan reaper that cleans out stale locks.
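
For reference, the v7 reaper is configured globally, along these lines (option names as documented for v7; verify against your installed version):

SidekiqUniqueJobs.configure do |config|
  config.reaper          = :ruby  # Ruby-based reaper; :lua is also documented
  config.reaper_count    = 1_000  # max number of locks inspected per run
  config.reaper_interval = 600    # seconds between runs
end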

@amitsaxena

amitsaxena commented Nov 21, 2019

Ahh! That seems to be missing from the master README. Anyway, @mhenrixon, I tried it and it doesn't seem to work as expected. We have this in the job:
sidekiq_options queue: :subscription, retry: false, lock: :until_executed, lock_expiration: 1.minute.to_i, on_conflict: :raise

So I am expecting that if a lock has been in Redis for more than 1 minute and I schedule a new job, the job should get added to Sidekiq. That doesn't happen; an exception is raised instead. I also tried enqueuing the job and then deleting it directly from Redis so that the lock stays behind (and is orphaned), but the new job still doesn't get added. We are using v6.0.15.

@aharpervc

> @amitsaxena though it might not be a global setting, lock_expiration is used for this purpose. In v7 there is also the orphan reaper that cleans out stale locks.

I'm a little confused; can you clarify how to use lock_expiration in v6? Aside from this feature, is there any way to find and expire orphaned locks in v6 at all?

@amitsaxena

@aharpervc I couldn't find anything that works (in v6), so I had to write a script that looks for lock-related Redis keys and deletes them.

@aharpervc

> @aharpervc I couldn't find anything that works (in v6), so I had to write a script that looks for lock-related Redis keys and deletes them.

Sounds useful, are you able to share it?

@amitsaxena

amitsaxena commented Sep 5, 2020

Something along the lines of:

class CleanStaleLocks
  include Sidekiq::Worker

  def perform
    # Iterate over every unique-job digest the gem knows about
    SidekiqUniqueJobs::Digests.all.each do |digest|
      delete(digest) if stale?(digest)
    end
  end

  private

  # $redis is assumed to be a connected Redis client
  def delete(digest)
    $redis.del("#{digest}:EXISTS")
    $redis.del("#{digest}:GRABBED")
    # Check for the correct key name in redis and use below
    $redis.srem("project_name:sidekiq:unique:keys", digest)
  end

  def stale?(digest)
    jid = $redis.get("#{digest}:EXISTS")
    # The lock is legitimate if its job is still sitting in the queue
    return false if Sidekiq::Queue.new("queue_name_with_unique_jobs").find_job(jid)

    grabbed = $redis.hgetall("#{digest}:GRABBED")
    # HGETALL returns an empty hash (not nil) when the key is missing
    raise "Expected redis key not found" if grabbed.empty?

    # A missing jid entry coerces to epoch, so orphaned locks also count as stale
    locked_at = Time.at(grabbed[jid].to_f)
    # Treat the lock as stale if it is older than 1 hour (needs ActiveSupport)
    (Time.now - locked_at) > 1.hour.to_i
  end
end

You will need to tweak the code depending on which queues could have unique jobs in your case. I'd then run it as a cron job every hour (or more frequently), e.g. with sidekiq-cron as sketched below.
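
Since this thread already uses sidekiq-cron, scheduling the worker could look like this (a sketch; the schedule name and hourly interval are arbitrary choices):

# e.g. in config/initializers/sidekiq.rb
Sidekiq::Cron::Job.create(
  name:  "clean_stale_locks - hourly",
  cron:  "0 * * * *",
  class: "CleanStaleLocks"
)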
