Transactions logic upgrade #10014
@symbolist is this ready for review?
@macdiesel Not yet.
@@ -52,7 +52,6 @@ class Command(TrackedCommand):
help="If True, try to transfer certificate items to the new course.")
)

@transaction.commit_manually
No need for this.
Nit: Maybe just change this to `@transaction.atomic` and remove the `commit_on_success` context mgr below?
Actually, I have already replaced `commit_on_success` with `atomic` below. The reason for keeping it inside the loop is that the change for each student should be committed as it happens. Otherwise an error would revert all of the work. Or do you think we should be doing that?
@symbolist @doctoryes Since the transaction isn't in a try block, does that mean that if the command fails on a user partway through, the database would end up in an inconsistent state because nothing catches the exception? You would end up with all the students up to that point transferred but the remainder not transferred.
Also, would this fail on a rerun, since some students have already been transferred to the new course?
We have two routes of action here:
1. Add the try/except, migrate the students that raise no errors, and report the ones that were not moved.
2. Fail the entire transaction and keep all students in a consistent state.
I think that #2 is better, as it would allow us to troubleshoot and fix the issue without any inconsistencies in the data.
@macdiesel Regarding "if command fails on a user part way through the transaction": `atomic()` will 1. catch the exception, 2. roll back the transaction, and 3. re-raise the exception. And yes, that means some of the students would have been transferred and the others not. But this is how it currently works too.
Actually, in Django 1.4 `commit_manually` is only used to disable automatic transaction management for views. Commands do not need it. So it's not doing anything right now either.
https://docs.djangoproject.com/en/1.4/topics/db/transactions/#django.db.transaction.commit_manually
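A minimal sketch of the two behaviors discussed in this thread. It uses the standard library's `sqlite3` module rather than Django so that it runs on its own. The `atomic` helper is a hypothetical stand-in that mimics the catch, roll back, re-raise sequence described above; it is not Django's implementation, and the table, student names, and function names are made up for illustration.

```python
import sqlite3
from contextlib import contextmanager

STUDENTS = ["alice", "bob", "carol", "dave"]  # hypothetical data


@contextmanager
def atomic(conn):
    """Stand-in for an atomic block: commit on success, roll back and re-raise on error."""
    conn.execute("BEGIN")
    try:
        yield
    except Exception:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")


def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/COMMIT are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE enrollment (student TEXT PRIMARY KEY, course TEXT)")
    conn.executemany("INSERT INTO enrollment VALUES (?, 'old')", [(s,) for s in STUDENTS])
    return conn


def transfer(conn, student):
    # Simulate a failure partway through the run.
    if student == "carol":
        raise ValueError("cannot transfer carol")
    conn.execute("UPDATE enrollment SET course = 'new' WHERE student = ?", (student,))


def transfer_per_student(conn):
    # One transaction per student: earlier students stay transferred after a failure.
    for student in STUDENTS:
        with atomic(conn):
            transfer(conn, student)


def transfer_all_or_nothing(conn):
    # One transaction around the whole loop: a failure reverts every student.
    with atomic(conn):
        for student in STUDENTS:
            transfer(conn, student)


def transferred(conn):
    rows = conn.execute("SELECT student FROM enrollment WHERE course = 'new' ORDER BY student")
    return [row[0] for row in rows]
```

In both variants the exception still propagates to the caller; the difference is only in which rows remain committed afterwards.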
@doctoryes I created a branch, reverted @nedbat's commit and then created this PR against it. So all the migration is visible in this PR. I thought it would give us a picture of all of the transaction related logic in one place.
I do not know how this all plays out in the larger scheme of how students and courses are handled in edX. I realize we would be changing the behavior of this command, but I would rather see us keep the state of our data consistent.
Let me pose this question: which will be easier to track down in the future?
1. A command that fails to move all the students and provides a clear and concise error message.
2. A command that succeeds and maybe writes something to stdout or stderr that someone may or may not read; then later, when someone realizes that not all the students were moved, they call in support or devops. Then someone has to comb through the data looking for those students and dig through the code to figure out what happened.
I just want to see us strive for consistency in both our data and our program behavior.
/soapbox
Hey - let me on that soapbox! 😄
Totally agree that 1) is better code/procedure. Just worried about the "hey - this command changed - rollback the release!" scenario. If we're willing to take on some of that risk (maybe release notes would help?), let's do it.
I have added a `transaction.atomic` on the command itself. This is a rarely used command so it shouldn't be an issue.
See below.
dog_stats_api.increment('instructor_task.subtask.update_exception')
raise
else:
TASK_LOG.debug("about to commit....")
transaction.commit()
The `atomic` context manager takes care of the commit/rollback.
@@ -494,7 +494,6 @@ def update_subtask_status(entry_id, current_task_id, new_subtask_status, retry_c
_release_subtask_lock(current_task_id)


@transaction.commit_manually
Why not just use the `@transaction.atomic` decorator here and avoid the reformatting below?
BTW, this change from the deprecated `commit_manually` to `atomic` will fix a few bulk email `False is not true` failures.
Yes, I'll do that.
Updated below.
PHEW! I've finished my first pass and left comments. Some points:
transaction.commit()
kwargs = {'ccx': ccx, 'location': block.location, 'field': name}
override = CcxFieldOverride.objects.get(**kwargs)
else:
Instead of trying to `create` an object first and doing a `get` on an `IntegrityError`, let's just do a `get` first.
@symbolist, would this increase the likelihood of race conditions?
@adampalay Yes, to an extent. But this function is only called from 3 Instructor dashboard CCX tab views, so I doubt we will see many parallel requests causing `IntegrityError`s. If we do, we can refactor the views a bit so that we can put `outer_atomic` around this whole function, or maybe shift to `READ COMMITTED` for the views which do create `IntegrityError`s.
I have put in a ticket for this: https://openedx.atlassian.net/browse/TNL-3614
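The get-first approach and the race it leaves open can be sketched as follows. This is an illustration with the standard library's `sqlite3` module, not the PR's code (which goes through the Django ORM); the `override` table and the `get_or_create_override` helper are hypothetical names. The `IntegrityError` branch shows one possible way to handle a concurrent insert landing between the lookup and the insert.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    # The UNIQUE constraint is what turns a duplicate insert into an IntegrityError.
    conn.execute(
        "CREATE TABLE override (ccx TEXT, field TEXT, value TEXT, UNIQUE (ccx, field))"
    )
    return conn


def _lookup(conn, ccx, field):
    return conn.execute(
        "SELECT value FROM override WHERE ccx = ? AND field = ?", (ccx, field)
    ).fetchone()


def get_or_create_override(conn, ccx, field, default):
    """Return (value, created). Look up first; insert only on a miss."""
    row = _lookup(conn, ccx, field)
    if row is not None:
        return row[0], False
    try:
        # `with conn` commits on success and rolls back if the insert fails.
        with conn:
            conn.execute("INSERT INTO override VALUES (?, ?, ?)", (ccx, field, default))
    except sqlite3.IntegrityError:
        # Another writer inserted the row after our lookup; read its value instead.
        return _lookup(conn, ccx, field)[0], False
    return default, True
```

With few parallel requests the `except` branch should rarely run, which matches the expectation stated in the comment above.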
transaction.commit()
return cls.objects.get(course_id=course_id, checkpoint_location=checkpoint_location)
checkpoint, __ = cls.objects.get_or_create(course_id=course_id, checkpoint_location=checkpoint_location)
return checkpoint
I have changed the view below to use `read_committed` so we shouldn't need this anymore.
@doctoryes @macdiesel I have addressed all comments. Besides that, I have:
This is ready for another review.
@adampalay Here is the PR I mentioned today. If you would like to discuss any of the changes, please ping me.
Nit: There's no docstring or comments explaining why the
In other words, it should be clear what's deprecated.
Also, do you plan to create a ticket for
A few nits and follow-ons - but no blockers for merging: 👍

Arguments:
using (str): the name of the database.
read_committed (bool): Whether to use read committed isolation level.
@doctoryes I have added more detailed documentation to `outer_atomic()`.
@doctoryes I have added the tests you suggested as well. Yes, I have created a ticket for replacing the remaining instances of
try:
connection.rollback()
except Error:
# An error during rollback means that something
Will we want to know if the rollback didn't work?
I'll put in some logging statements here. Though ideally we want to get rid of this `commit_on_success` decorator before we deploy the upgrade. Right now it is only being used in `courseware/grades.py` and on the progress page view. I have created a separate story for it.
@doctoryes @adampalay I have also eliminated the rest of the instances of A bit of context: The reason for adding
@doctoryes Can you review the changes in the last two commits?
👍 😎
Closing and creating a new PR against the upgrade branch.
https://openedx.atlassian.net/browse/TNL-3428
As part of the Django upgrade, we need to move to the transaction framework that was introduced in Django 1.6.
https://docs.djangoproject.com/en/1.8/topics/db/transactions/
This Stack Overflow answer explains the differences in atomicity and durability between the two APIs: http://stackoverflow.com/a/24108491
The different cases I came across, and how they are being handled, are as follows:
The most important constraint to keep in mind is that we cannot have commit() or rollback() calls inside an atomic block. This means that if a function makes either of these calls, any view that can call it cannot be wrapped in atomic.
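To make the constraint concrete, here is a small self-contained sketch. In Django itself, calling `transaction.commit()` while an `atomic` block is active raises `TransactionManagementError`; the code below only imitates that rule on top of the standard library's `sqlite3` so it can run without Django. The `Connection` wrapper, the local `TransactionManagementError` class, `save_checkpoint`, and `view` are all hypothetical names.

```python
import sqlite3
from contextlib import contextmanager


class TransactionManagementError(Exception):
    """Local stand-in for the error Django raises in this situation."""


class Connection:
    """Hypothetical wrapper that forbids manual commits inside an atomic block."""

    def __init__(self):
        # Autocommit mode: transactions exist only between explicit BEGIN and COMMIT.
        self._conn = sqlite3.connect(":memory:", isolation_level=None)
        self.in_atomic_block = False

    def execute(self, sql, params=()):
        return self._conn.execute(sql, params)

    def commit(self):
        if self.in_atomic_block:
            raise TransactionManagementError("commit() is forbidden inside an atomic block")
        if self._conn.in_transaction:
            self._conn.execute("COMMIT")

    @contextmanager
    def atomic(self):
        self._conn.execute("BEGIN")
        self.in_atomic_block = True
        try:
            yield
        except Exception:
            self.in_atomic_block = False
            self._conn.execute("ROLLBACK")
            raise
        else:
            self.in_atomic_block = False
            self._conn.execute("COMMIT")


def save_checkpoint(conn, name):
    # A helper written in the old style: it commits by hand.
    conn.execute("INSERT INTO checkpoint (name) VALUES (?)", (name,))
    conn.commit()


def view(conn):
    # Wrapping the caller in atomic makes the helper's manual commit illegal.
    with conn.atomic():
        save_checkpoint(conn, "inside")
```

Calling `save_checkpoint` on its own works, while calling it through `view` raises and rolls back the insert, which is why such helpers either lose their manual commits or keep their callers out of `atomic`.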