Feature Name: Support SELECT FOR UPDATE
Status: draft
Start Date: 2017-10-17
Authors: Rebecca Taft
RFC PR: (PR # after acceptance of initial draft)
Cockroach Issue: #6583

Summary

Support the SELECT ... FOR UPDATE SQL syntax, which locks rows returned by the SELECT statement. Several potential customers have asked for this feature, and it would also get us closer to feature parity with Postgres. The easiest way to implement this in CockroachDB would be by setting row-level "dummy" intents, but another option would be to set span-level locks.

Motivation

As described in this issue, SELECT ... FOR UPDATE is not standard SQL, but many databases now support it, including Postgres. Several third party products such as the Quartz Scheduler, OpenJPA and Liquibase also rely on this feature, preventing some potential customers from switching to CockroachDB.

In some cases, SELECT ... FOR UPDATE is required to maintain correctness when running CockroachDB in SNAPSHOT mode. SELECT ... FOR UPDATE is essentially a no-op when running in SERIALIZABLE mode, but it may still be useful for controlling lock ordering and avoiding deadlocks.

Many implementations of this feature also include options to control whether or not to wait on locks. SELECT ... FOR UPDATE NOWAIT is one option, which causes the query to return an error if it is unable to immediately lock all target rows. This is useful for latency-critical situations, and could also be useful for auto-retrying transactions in CockroachDB. SELECT ... FOR UPDATE SKIP LOCKED is another option, which returns only the rows that could be locked immediately, and skips over the others.
This option returns an inconsistent view of the data, but may be useful for cases when multiple workers are trying to process data in the same table as if it were a queue of tasks.
The default behavior of SELECT ... FOR UPDATE is for the transaction to block if some of the target rows are already locked by another transaction.

Guide-level explanation

The Postgres Documentation describes this feature as it is supported by Postgres. As shown, the syntax of the locking clause has the form

FOR lock_strength [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ]

where lock_strength can be one of

UPDATE
NO KEY UPDATE
SHARE
KEY SHARE

For our initial implementation in CockroachDB, we will likely simplify this syntax to

FOR UPDATE

i.e., no variation in locking strength, no specified tables, and no options for avoiding waiting on locks. Using FOR UPDATE will result in locking the rows returned by the SELECT query with exclusive locks. As described above, this feature alone is useful because it helps maintain correctness when running CockroachDB in SNAPSHOT mode, and serves as a tool for optimization when running in SERIALIZABLE mode.

For example, consider the following transaction:

BEGIN;

SELECT * FROM employees WHERE name = 'John Smith' FOR UPDATE;

...

UPDATE employees SET salary = 50000 WHERE name = 'John Smith';

COMMIT;

will lock the rows of all employees named John Smith at the beginning of the transaction. As a result, the UPDATE employees ... statement at the end of the transaction will not need to acquire any additional locks. Note that FOR UPDATE will have no effect if it is used in a stand-alone query that is not part of any transaction.

Reference-level explanation

This section provides more detail about how and why the CockroachDB implementation of the locking clause will differ from Postgres.

With the current model of CockroachDB, it is not possible to support the locking strengths NO KEY UPDATE or KEY SHARE because these options require locking at a sub-row granularity. It is also not clear that CockroachDB can support SHARE, because there is currently no such thing as a "read intent". UPDATE can be supported by marking the affected rows with dummy write intents.

By default, if FOR UPDATE is used in Postgres without specifying tables (without the OF table_name [, ...] clause), Postgres will lock all rows returned by the SELECT query. The OF table_name [, ...] clause enables locking only the rows in the specified tables. To lock different tables with different strengths or different options, Postgres users can string multiple locking clauses together.
For example,

SELECT * from employees e, departments d, companies c
WHERE e.did = d.id AND d.cid = c.id
AND c.name = `Cockroach Labs`
FOR UPDATE OF employees SKIP LOCKED
FOR SHARE OF departments NOWAIT

locks rows in the employees table that satisfy the join condition with an exclusive lock, and skips over rows that are already locked by another transaction. It also locks rows in the departments table that satisfy the join condition with a shared lock, and returns an error if it cannot lock all of the rows immediately. It does not lock the companies table.

Implementing this flexibility in CockroachDB for use of different tables and different options may be excessively complicated, and it's not clear that our customers actually need it. To avoid spending too much time on this, as mentioned above, we will probably just implement the most basic functionality in which clients use FOR UPDATE to lock the rows returned by the query. Initially we won't include the SKIP LOCKED or NOWAIT options, but it may be worth implementing these at some point.

Another potential difference between the Postgres and CockroachDB implementations relates to the precision of locking. In general, Postgres locks exactly the rows returned by the query, and no more. There are a few examples given in the documentation where that's not the case. For example, SELECT ... LIMIT 5 OFFSET 5 FOR UPDATE may lock up to 10 rows even though only 5 rows are returned. It may be more difficult to be precise with locking in CockroachDB, since locking happens at the KV layer, and predicates may be applied later. This should not affect correctness, but could affect performance if there is high contention.

Detailed design

There are a number of changes that will need to be implemented in order to support FOR UPDATE.

Update the parser to support the syntax in SELECT statements.
Update the KV API to include new messages ScanForUpdate and ReverseScanForUpdate
Update the KV and storage layers to mimic processing of Scan and ReverseScan and set dummy write intents on every row touched.

As described by Ben in issue #6583, git grep -i reversescan provides an idea of the scope of the change. The bulk of the changes would consist of implementing new ScanForUpdate and ReverseScanForUpdate calls in the KV API. These would work similarly to regular scans, but A) would be flagged as read/write commands instead of read-only and B) after performing the scan, they'd use MVCCPut to write back the values that were just read (that's not the most efficient way to do things, but Ben thinks it's the right way to start since it will have the right semantics without complicating the backwards-compatibility story). Then the SQL layer would use these instead of plan Scan/ReverseScan when FOR UPDATE has been requested.

There was some discussion about whether we really needed new API calls, but Ben pointed out that making it possible to write on Scan requests would make debugging a nightmare.

Drawbacks

It seems that there is sufficient demand for some form of the FOR UPDATE syntax. However, there are pros and cons to implementing or not implementing certain features.

It will be a lot of work to implement all of the features supported by Postgres.
This is probably not worth our time since it's not clear these features will actually get used. We can easily add them later if needed.
If we use the approach of setting intents on every row returned by the SELECT, it could hurt performance if we are selecting a large range. But setting intents on every row will allow us to more easily implement the SKIP LOCKED option in the future. It's also easier to implement than the alternatives discussed below.

Rationale and Alternatives

The proposed solution is to lock rows by writing dummy write intents on each row. However, another alternative is to lock an entire Range if the SELECT statement would return the majority of rows in the range. This is similar to the approach suggested by the Revert Command RFC.

The advantage of locking entire ranges is that it would significantly improve the performance for large scans compared to setting intents on individual rows. The downside is that this feature is not yet implemented, so it would be significantly more effort than using simple row-level intents. It's also not clear that customers would use FOR UPDATE with large ranges, so this may be an unneeded performance optimization. Furthermore, range-level locks would not support the SKIP LOCKED option very well, so if we want to add SKIP LOCKED later, we would probably need to revert to row-level intents for these queries.

One advantage of implementing range-level locking is that we could reuse this feature for other applications such as point-in-time recovery. However, in the interest of getting something working sooner rather than later, I believe row-level intents make more sense at this time.

Unresolved questions

Should we move forward with implementing only FOR UPDATE at first, or are some of the other features essential as well?
Should we stick with row-level intents or implement range-level locking?
Should we be concerned about locking precision (i.e., lock exactly the rows returned by query and no more)?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

20171024_select_for_update.md

20171024_select_for_update.md

Summary

Motivation

Guide-level explanation

Reference-level explanation

Detailed design

Drawbacks

Rationale and Alternatives

Unresolved questions

Files

20171024_select_for_update.md

Latest commit

History

20171024_select_for_update.md

File metadata and controls

Summary

Motivation

Guide-level explanation

Reference-level explanation

Detailed design

Drawbacks

Rationale and Alternatives

Unresolved questions