Hello - we encountered an issue recently where we had a Hangfire instance that had collected a large number of failed jobs (over 6,000) due to some issues in our system. Luckily this was a staging environment, so they were accumulating unnoticed, but when we tried to load the dashboard it was incredibly slow to perform any action. This issue did not appear to affect the creation or execution of jobs themselves, thankfully, but our monitoring tooling is showing that the top 3 slowest queries in our system currently are all Hangfire dashboard / cleanup related:
```sql
SELECT j.id Id, j.invocationdata InvocationData, j.arguments Arguments,
       j.createdat CreatedAt, j.expireat ExpireAt, ? FetchedAt,
       j.statename StateName, s.reason StateReason, s.data StateData
FROM hangfire.job j
LEFT JOIN hangfire.state s ON j.stateid = ?.id
WHERE j.statename = @StateName
ORDER BY j.id DESC
LIMIT @Limit OFFSET @Offset
```

```sql
SELECT statename State, COUNT(id) Count
FROM hangfire.job
WHERE statename IS NOT ?
GROUP BY statename;

SELECT COUNT(*) FROM hangfire.server;

SELECT SUM(value) FROM (
    SELECT SUM(value) FROM hangfire.counter WHERE key = ?
    UNION ALL
    SELECT SUM(value) FROM hangfire.aggregatedcounter WHERE key = ?
) c;

SELECT SUM(value) FROM (
    SELECT SUM(value) FROM hangfire.counter WHERE key = ?
    UNION ALL
    SELECT SUM(value) FROM hangfire.aggregatedcounter WHERE key = ?
) c;

SELECT COUNT(*) FROM hangfire.set WHERE key = ?;
```

```sql
DELETE FROM hangfire.job
WHERE id IN (SELECT id FROM hangfire.job WHERE expireat < NOW() LIMIT ?)
```
Perhaps there are missing indexes, or existing indexes need adjustment, to handle the situation where a large number of failed jobs accumulates?
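For what it's worth, here is a rough sketch of the kind of indexes I'd guess might help, assuming this is the Hangfire.PostgreSql storage (the lowercase `hangfire` schema, `NOW()`, and `LIMIT/OFFSET` in the captured queries suggest PostgreSQL). The index names below are hypothetical, and I haven't verified these against the schema the library ships with — they just mirror the predicates in the slow queries above:

```sql
-- Hypothetical sketch only, not verified against Hangfire.PostgreSql's
-- actual schema or migrations.

-- The dashboard's paging query filters on statename and orders by id DESC;
-- a composite index matching that shape could let PostgreSQL walk the index
-- instead of scanning and sorting all failed jobs.
CREATE INDEX CONCURRENTLY ix_hangfire_job_statename_id
    ON hangfire.job (statename, id DESC);

-- The expiration-cleanup DELETE selects by expireat; an index on that
-- column could avoid repeated full scans of hangfire.job.
CREATE INDEX CONCURRENTLY ix_hangfire_job_expireat
    ON hangfire.job (expireat);
```

Running `EXPLAIN (ANALYZE, BUFFERS)` on the captured queries before and after would confirm whether these (or something like them) actually change the plans.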
We have wiped out the current state on our staging server for now, but we have backups for that time period if you need any more specific information.
Thanks in advance for any help you can provide and for all you do to maintain this library - it's greatly appreciated!