RFC: Add prepare hooks for creating short-time locks #4

Open
AndrewDryga opened this issue Nov 15, 2017 · 6 comments

Comments

@AndrewDryga
Member

The idea came from Azure Architecture guidelines:

Placing a short-term timeout-based lock on each resource that's required to complete an operation, and obtaining these resources in advance, can help increase the likelihood that the overall activity will succeed. The work should be performed only after all the resources have been acquired. All actions must be finalized before the locks expire.

@AndrewDryga
Member Author

Right now this can be achieved by adding a separate stage whose transaction code would create the lock and whose compensation would remove it.
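A minimal sketch of that workaround, assuming hypothetical `acquire_lock/1` and `release_lock/1` helpers backed by whatever lock store you use (a DB row, Redis, etc.):

```elixir
# Sketch of the workaround described above: a dedicated stage whose
# transaction acquires the lock and whose compensation releases it.
# `acquire_lock/1`, `release_lock/1`, and `do_work/2` are illustrative
# names, not part of Sage.
Sage.new()
|> Sage.run(
  :lock,
  fn _effects_so_far, _attrs ->
    # Transaction: acquire the lock; the returned value becomes the
    # stage's effect and is passed to the compensation on failure.
    acquire_lock("resource-id")
  end,
  fn lock, _effects_so_far, _attrs ->
    # Compensation: release the lock when a later stage fails.
    release_lock(lock)
    :ok
  end
)
|> Sage.run(:work, &do_work/2)
```

One caveat with this approach is that the lock is only released when a compensation fires or when you add a final stage to release it, which is exactly the ceremony the proposed helpers would remove.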

@michalmuskala
Collaborator

Maybe something like:

Sage.new
|> Sage.lock(:foo, &lock_foo/1, &unlock_foo/1)
...
|> Sage.unlock(:foo)

Implicitly, all locks are released at the end of the execution if they weren't explicitly released.

@AndrewDryga
Member Author

It can be extended even further: we could create support libraries that allow starting a transaction on a remote server by acquiring a lock and then committing that transaction when the lock is released (or rolling it back either explicitly or on a timeout). This is somewhat similar to Ecto.Adapters.SQL.Sandbox; however, I need to investigate whether that is really possible with a raw PostgreSQL adapter.

@tlvenn

tlvenn commented Jun 23, 2018

This seems like a pretty good idea and I would want to integrate it with redlock (https://github.com/lyokato/redlock), but I have a question about how it plays with the Ecto transaction.

It seems like it would be a good idea for the lock to be acquired first, and only then the transaction started.

If you do it the other way around, it would somewhat defeat the purpose of the lock: the current transaction would not be able to read the most up-to-date version of whatever resource you are locking, and depending on your transaction isolation level, it would lead to a serialization error. Of course, the assumption here is that the resource is managed by Ecto; maybe that's a wrong assumption and the whole point of the lock is to manage external resources.

I guess one could argue that to deal with Ecto resources, you can leverage either optimistic locking with Ecto or a strong isolation level (serializable). The issue with that approach is when you have costly side effects that you don't really want to compensate unless you have a good reason to.

A Stripe charge is a good example: refunds used to be free but now they are not, and depending on the payment source, you can't necessarily just create an authorization initially and only capture it at the very end.

So, for example, in our system we have cash vouchers, and when a user uses one to partially pay for an order, the system has to make sure two orders are not being processed at the same time with the same cash voucher; otherwise one transaction would fail and end up with a refund to compensate the charge.

Right now I am manually acquiring a lock on the cash_voucher using redlock before executing my Sage pipeline, but there is some ceremony around capturing the mutex and making sure I always release it at the end.

I was kind of hoping that this RFC could potentially help me out, but the premise would be that the lock is acquired before the Ecto transaction starts.
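The ceremony described above can be sketched roughly as follows. The `Redlock.lock/2` and `Redlock.unlock/2` calls are illustrative of a lock/unlock-style API; check the redlock library for its exact functions and return values, and `sage/0` stands in for however the pipeline is built:

```elixir
# Rough sketch: acquire a distributed lock *before* the Sage pipeline
# (and therefore before its Ecto transaction) starts, and always
# release it afterwards. The Redlock API shown here is an assumption.
def pay_order(order, voucher) do
  resource = "cash_voucher:#{voucher.id}"

  case Redlock.lock(resource, _ttl_seconds = 10) do
    {:ok, mutex} ->
      try do
        sage() |> Sage.execute(%{order: order, voucher: voucher})
      after
        # Release even if the pipeline raises.
        Redlock.unlock(resource, mutex)
      end

    :error ->
      # Another process holds the lock on this voucher.
      {:error, :voucher_locked}
  end
end
```

Because the lock is taken before `Sage.execute/2` runs, a second transaction on the same voucher blocks (or fails fast) before any side effect is created, which is the property the comment above is after.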

@AndrewDryga
Member Author

AndrewDryga commented Jun 23, 2018

@tlvenn I think you can create a lock even now by writing a transaction that acquires it and a compensation that releases it. That way the lock acquisition would be wrapped in a transaction (if you use Ecto), and the lock would be released when the compensation runs; it should even play nicely with retries.

The RFC is not well thought out yet, but the original idea was to create a helper that internally still uses transactions and compensations, and another one to release the previously created lock just by its stage name. We would probably still need to store the locks' state separately, but I'd like to avoid that, because as the Sage executor state grows, it becomes more error-prone.
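One way the proposed helpers could look, building on the earlier suggestion in this thread. This is purely illustrative: `Sage.lock/4` and `Sage.release_lock/2` do not exist today, and the stage callbacks are hypothetical names:

```elixir
# Hypothetical API sketch for the helpers discussed above.
# Sage.lock/4 would register a stage whose transaction acquires the
# lock and whose compensation releases it; Sage.release_lock/2 would
# release it early by stage name. Any lock still held when execution
# finishes would be released implicitly.
Sage.new()
|> Sage.lock(:voucher_lock, &acquire_voucher_lock/2, &release_voucher_lock/3)
|> Sage.run(:charge, &charge_card/2, &refund_card/3)
|> Sage.release_lock(:voucher_lock)
|> Sage.run(:deliver, &deliver_order/2)
```

Note that this sketch keeps the lock inside the pipeline (and thus inside the Ecto transaction when one is used), which is exactly the ordering concern raised in the comments above.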

I hope that helps; if not, let's elaborate to make sure the design fits real-world needs :).

@tlvenn

tlvenn commented Jun 23, 2018

Thanks for the feedback @AndrewDryga. Assuming we are talking about a resource managed by Ecto (essentially something you have in your database), if you claim the lock within the transaction, then by the time the lock is released by the other TX that held it, you will either get a phantom read or, depending on your TX isolation level, a serialization error.

If you use UPDATE, DELETE, or SELECT FOR UPDATE, and therefore implicitly create a lock within the DB, that's a different story; but the database I am using (CockroachDB), for example, does not support this: there is no lock whatsoever.

CockroachDB operates in serializable mode by default, so if there were no side effects I would not need to bother with a distributed lock: the second TX would be signaled to give it another try, which would re-execute the fn passed to Repo.transaction (created by Sage.execute) and would just fail.

Now, taking into account that there are side effects, I don't think I have many options other than a distributed lock that I acquire before I start the transaction, so that by the time the second TX is opened, it can read the most up-to-date state of that resource and fail if needed before any side effect has a chance to be created.
