Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pessimistic lock errors with the heartbeat #350

Closed
danielschnee opened this issue Sep 16, 2024 · 9 comments · Fixed by #351
Closed

Pessimistic lock errors with the heartbeat #350

danielschnee opened this issue Sep 16, 2024 · 9 comments · Fixed by #351

Comments

@danielschnee
Copy link

danielschnee commented Sep 16, 2024

I get a lot of pessimistic lock errors. Something around every minute because of the heartbeat.

The error:

Locking a record with unpersisted changes is not supported. Use `save` to persist the changes, or `reload` to discard them explicitly. Changed attributes: "last_heartbeat_at".

Backtrace:

 activerecord (7.2.1) lib/active_record/locking/pessimistic.rb:72:in `lock!'
 activerecord (7.2.1) lib/active_record/locking/pessimistic.rb:96:in `block in with_lock'
 activerecord (7.2.1) lib/active_record/connection_adapters/abstract/transaction.rb:616:in `block in within_new_transaction'
 activesupport (7.2.1) lib/active_support/concurrency/null_lock.rb:9:in `synchronize'
 activerecord (7.2.1) lib/active_record/connection_adapters/abstract/transaction.rb:613:in `within_new_transaction'
 activerecord (7.2.1) lib/active_record/connection_adapters/abstract/database_statements.rb:361:in `transaction'
 activerecord (7.2.1) lib/active_record/transactions.rb:234:in `block in transaction'
 activerecord (7.2.1) lib/active_record/connection_adapters/abstract/connection_pool.rb:395:in `with_connection'
 activerecord (7.2.1) lib/active_record/connection_handling.rb:296:in `with_connection'
 activerecord (7.2.1) lib/active_record/transactions.rb:233:in `transaction'
 activerecord (7.2.1) lib/active_record/transactions.rb:354:in `transaction'
 activerecord (7.2.1) lib/active_record/locking/pessimistic.rb:95:in `with_lock'
 solid_queue (0.9.0) lib/solid_queue/processes/registrable.rb:56:in `heartbeat'
 solid_queue (0.9.0) lib/solid_queue/processes/registrable.rb:41:in `block (2 levels) in launch_heartbeat'
 activesupport (7.2.1) lib/active_support/execution_wrapper.rb:91:in `wrap'
 solid_queue (0.9.0) lib/solid_queue/app_executor.rb:7:in `wrap_in_app_executor'
 solid_queue (0.9.0) lib/solid_queue/processes/registrable.rb:41:in `block in launch_heartbeat'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:24:in `block in execute'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `block in synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:22:in `execute'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/timer_task.rb:339:in `execute_task'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:24:in `block in execute'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `block in synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:22:in `execute'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/ivar.rb:170:in `safe_execute'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/scheduled_task.rb:298:in `process_task'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/timer_set.rb:166:in `block (2 levels) in process_tasks'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:359:in `run_task'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:350:in `block (3 levels) in create_worker'
 <internal:kernel>:187:in `loop'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `block (2 levels) in create_worker'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `catch'
 concurrent-ruby (1.3.4) lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `block in create_worker'

Is there something i can do on my side?

Running:

  • solid_queue 0.9.0 with its own db
  • rails 7.2.1
  • ruby 3.3.5
  • sqlite for all dbs

What else do you need from me?

@rosa
Copy link
Member

rosa commented Sep 16, 2024

Oh, that's a strange one! Let me look into it.

@rosa
Copy link
Member

rosa commented Sep 16, 2024

Do you get any errors before this, when Solid Queue starts, when registering any of the processes (dispatchers, workers, etc.)? You should get some lines like this:

SolidQueue-0.9.0 Register Supervisor (10.0ms)  pid: 34424, hostname: "Rosas-Air-M2.localdomain", process_id: 6, name: "supervisor-e232055b690c3c5d3e63"
SolidQueue-0.9.0 Fail claimed jobs (3.6ms)  job_ids: [], process_ids: []
SolidQueue-0.9.0 Started Supervisor (36.5ms)  pid: 34424, hostname: "Rosas-Air-M2.localdomain", process_id: 6, name: "supervisor-e232055b690c3c5d3e63"
SolidQueue-0.9.0 Prune dead processes (19.1ms)  size: 0
SolidQueue-0.9.0 Register Dispatcher (43.3ms)  pid: 34425, hostname: "Rosas-Air-M2.localdomain", process_id: 7, name: "dispatcher-8c9a168af2f75adbde63"
SolidQueue-0.9.0 Started Dispatcher (44.5ms)  pid: 34425, hostname: "Rosas-Air-M2.localdomain", process_id: 7, name: "dispatcher-8c9a168af2f75adbde63", polling_interval: 1, batch_size: 500, concurrency_maintenance_interval: 600
SolidQueue-0.9.0 Register Worker (42.5ms)  pid: 34427, hostname: "Rosas-Air-M2.localdomain", process_id: 8, name: "worker-426c95702eccb5c40ffd"
SolidQueue-0.9.0 Started Worker (43.4ms)  pid: 34427, hostname: "Rosas-Air-M2.localdomain", process_id: 8, name: "worker-426c95702eccb5c40ffd", polling_interval: 0.1, queues: "default", thread_pool_size: 5
SolidQueue-0.9.0 Register Worker (44.3ms)  pid: 34426, hostname: "Rosas-Air-M2.localdomain", process_id: 9, name: "worker-cd0acd29813c42fb1214"
SolidQueue-0.9.0 Started Worker (45.2ms)  pid: 34426, hostname: "Rosas-Air-M2.localdomain", process_id: 9, name: "worker-cd0acd29813c42fb1214", polling_interval: 0.1, queues: "background", thread_pool_size: 3
SolidQueue-0.9.0 Register Scheduler (48.1ms)  pid: 34428, hostname: "Rosas-Air-M2.localdomain", process_id: 10, name: "scheduler-7d5a0693d55b7a6b85d0"
SolidQueue-0.9.0 Started Scheduler (53.9ms)  pid: 34428, hostname: "Rosas-Air-M2.localdomain", process_id: 10, name: "scheduler-7d5a0693d55b7a6b85d0", recurring_schedule: ["periodic_store_result"]

I have a suspicion of what this might be about...

@danielschnee
Copy link
Author

danielschnee commented Sep 16, 2024

Local:
image

Running it with the puma plugin.

@danielschnee
Copy link
Author

From the CI:
image

@rosa
Copy link
Member

rosa commented Sep 16, 2024

Thank you! That's very helpful 🙏

@danielschnee
Copy link
Author

danielschnee commented Sep 16, 2024

If that helps.

Locally its running fine:
image

On Prod it produces that error:
image

@rosa
Copy link
Member

rosa commented Sep 16, 2024

Huh, ok, that's unexpected and changed what I thought this was about 😅 😅 Did you get any errors of other kind before the lock errors started?

@danielschnee
Copy link
Author

Ok. Found another thing, It happens after some time running.

Now i get the same error on local:
image

@rosa
Copy link
Member

rosa commented Sep 16, 2024

Aha! Now everything makes sense 😆 Ok, I'll get this fixed.

rosa added a commit that referenced this issue Sep 16, 2024
For example, in case of a previous heartbeat failed because
of a DB issue (with SQLite depending on configuration, a
`BusyException` is not rare) and we still have the unpersisted
value in `last_heartbeat_at`, which means that `with_lock` will result
in:
```
RuntimeError: Locking a record with unpersisted changes is not supported
```

Fixes #350
rosa added a commit that referenced this issue Sep 16, 2024
For example, in case of a previous heartbeat failed because
of a DB issue (with SQLite depending on configuration, a
`BusyException` is not rare) and we still have the unpersisted
value in `last_heartbeat_at`, which means that `with_lock` will result
in:
```
RuntimeError: Locking a record with unpersisted changes is not supported
```

Fixes #350
@rosa rosa closed this as completed in #351 Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants