Reschedule pending tasks during startup #168
Codecov Report
@@              Coverage Diff               @@
##           development    #168     +/-   ##
=============================================
+ Coverage      88.79%   89.34%   +0.54%
=============================================
  Files             42       43       +1
  Lines           2302     2486     +184
=============================================
+ Hits            2044     2221     +177
- Misses           258      265       +7
Cool 😎 👍
aquadoggo/src/db/stores/task.rs
Outdated
let task_row = query_as::<_, TaskRow>(
    "
    SELECT
        name,
        document_id,
        document_view_id
    FROM
        tasks
    ",
)
.fetch_optional(&self.pool)
.await
.map_err(|err| SqlStorageError::Transaction(err.to_string()))?;

// If yes, we are already done here
if task_row.is_some() {
    return Ok(());
}
We could spare this query with a unique constraint on the tasks table, couldn't we?
Yes, no 😅 When the pending tasks are loaded into the queue during rescheduling, they are not being removed from the database (yet). The worker itself will fire on_update with a new TaskStatus::Pending as soon as the task gets queued, which leads to a new row in the tasks table and ultimately to a duplicate.
The worker itself makes sure that no duplicates exist in the queues, but that's not the case with the database, which is why I've added that check before insertion.
I guess one could take the rows out of the database after we run get_tasks and before we queue them up. Or we need a new Factory::reschedule method for rescheduling tasks, one that does not fire TaskStatus::Pending? 🤔
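To make the "check before insertion" idea concrete, here is a minimal in-memory sketch. It is not the store from this PR: `InMemoryTaskStore`, its methods and the example task values are hypothetical names for illustration, and a `Vec` stands in for the tasks table.

```rust
/// Hypothetical stand-in for one row of the `tasks` table.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TaskRow {
    name: String,
    document_id: Option<String>,
    document_view_id: Option<String>,
}

/// Hypothetical in-memory store (assumption: the real store runs SQL instead).
#[derive(Default)]
struct InMemoryTaskStore {
    rows: Vec<TaskRow>,
}

impl InMemoryTaskStore {
    /// Inserts the task unless an identical row already exists.
    /// Returns `true` only if a new row was written.
    fn insert_task(&mut self, task: &TaskRow) -> bool {
        if self.rows.contains(task) {
            // Already persisted, e.g. because a rescheduled task fired
            // a second "pending" update after a restart.
            return false;
        }
        self.rows.push(task.clone());
        true
    }

    /// Removes the task once it is reported as "completed".
    /// Returns `true` if a row was removed.
    fn remove_task(&mut self, task: &TaskRow) -> bool {
        let before = self.rows.len();
        self.rows.retain(|row| row != task);
        self.rows.len() != before
    }
}

/// Walks through the rescheduling scenario and returns the number of
/// rows that exist after the second "pending" update.
fn rows_after_reschedule() -> usize {
    let mut store = InMemoryTaskStore::default();
    let task = TaskRow {
        name: "reduce".to_string(),
        document_id: Some("doc_1".to_string()),
        document_view_id: None,
    };
    store.insert_task(&task); // first "pending" update
    store.insert_task(&task); // "pending" fired again after rescheduling
    store.rows.len()
}
```

Because the insert is idempotent, the second "pending" update is a no-op and the row count stays at one. Note that Rust's `==` treats `None == None` as equal, which SQL's `=` does not do for NULL, so the SQL version of this check needs NULL-safe comparisons.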
aquadoggo/src/db/stores/task.rs
Outdated
DELETE FROM
    tasks
WHERE
    name IS $1
I've never seen IS used for comparison and couldn't find anything in the PostgreSQL manual. Does it actually work? The standard operator is = and I only know IS as part of something like IS DISTINCT FROM.
Yes, I know! Funnily it didn't work when I used = .. and using IS it suddenly did. Maybe because there are NULL values involved?
Indeed, I made some experiments, and this works (I changed name to =, the others stay with IS):
DELETE FROM
    tasks
WHERE
    name = $1
    AND document_id IS $2
    AND document_view_id IS $3
Changing all of them to = fails, as document_id and document_view_id can be NULL and we need to match that exactly to fulfil the query.
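The behaviour described above follows from SQL's three-valued logic: `=` yields "unknown" whenever one side is NULL, and a WHERE clause only keeps rows whose predicate is definitely true. SQLite's `IS` is a NULL-safe comparison (PostgreSQL spells the same thing `IS NOT DISTINCT FROM`). The sketch below models this in plain Rust, with `Option` standing in for nullable columns. The function names and the example row are made up for illustration, and nothing here talks to a database.

```rust
/// One row or one set of bind parameters: (name, document_id, document_view_id).
type Row = (&'static str, Option<&'static str>, Option<&'static str>);

/// SQL-style `a = b`: the result is unknown (`None`) if either side is NULL.
fn sql_eq(a: Option<&str>, b: Option<&str>) -> Option<bool> {
    match (a, b) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// SQLite-style `a IS b`: NULL-safe, so NULL IS NULL is true.
fn sql_is(a: Option<&str>, b: Option<&str>) -> bool {
    a == b
}

/// `WHERE name = $1 AND document_id = $2 AND document_view_id = $3`.
/// The conjunction is only true if every comparison is definitely true.
fn matches_with_eq(row: &Row, params: &Row) -> bool {
    [
        sql_eq(Some(row.0), Some(params.0)),
        sql_eq(row.1, params.1),
        sql_eq(row.2, params.2),
    ]
    .iter()
    .all(|result| *result == Some(true))
}

/// `WHERE name = $1 AND document_id IS $2 AND document_view_id IS $3`.
fn matches_with_is(row: &Row, params: &Row) -> bool {
    row.0 == params.0 && sql_is(row.1, params.1) && sql_is(row.2, params.2)
}

/// Counts how many rows a DELETE would hit with the given matcher.
fn count_matches(rows: &[Row], params: &Row, matcher: fn(&Row, &Row) -> bool) -> usize {
    rows.iter().filter(|row| matcher(row, params)).count()
}
```

For a row such as `("reduce", Some("doc_1"), None)` and identical bind parameters, `matches_with_eq` finds nothing because the `document_view_id` comparison is unknown, while `matches_with_is` finds the row.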
Hey hey! Great work 👍 this feels like an elegant/simple (as possible) solution to a tricky problem. Nice to have it in place already!
I had one comment on a method name but the rest is goood.
name TEXT NOT NULL,
document_id TEXT NULL,
document_view_id TEXT NULL,
PRIMARY KEY (name, document_id, document_view_id)
I just stumbled over this primary key with nullable columns. If I understand correctly, this is actually only possible in SQLite:
According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a bug in some early versions, this is not the case in SQLite. Unless the column is an INTEGER PRIMARY KEY or the table is a WITHOUT ROWID table or a STRICT table or the column is declared NOT NULL, SQLite allows NULL values in a PRIMARY KEY column. SQLite could be fixed to conform to the standard, but doing so might break legacy applications. Hence, it has been decided to merely document the fact that SQLite allows NULLs in most PRIMARY KEY columns.
Beautiful! That's a very elegant solution, thanks @cafca for the improvements! And I'm learning something new about
Summary
Factory to send status updates ("pending", "completed") of tasks to external subscribers, they can subscribe via the on_update method
tasks SQL table, and removes them again if they are "completed"
Notes
test-utils ✨ )
SendError. It made testing a little bit annoying when you forgot to not drop your subscribers on the broadcast channel
to_string by causing bugs with it and added a note about it here
Closing #162
📋 Checklist
CHANGELOG.md