feat: Implementation of client side statements that return #1046

ankiaga · 2023-12-04T04:27:18Z

Implementation of SHOW_COMMIT_TIMESTAMP and SHOW_READ_TIMESTAMP client statements

conventional-commit-lint-gcf · 2023-12-04T04:44:03Z

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

olavloite · 2023-12-04T07:50:49Z

google/cloud/spanner_dbapi/client_side_statement_executor.py

+        parsed_statement.client_side_statement_type
+        == ClientSideStatementType.SHOW_COMMIT_TIMESTAMP
+    ):
+        if connection.is_closed:


Maybe move this to the start of this function. I think that we want to do this for any type of statement.
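A minimal sketch of the suggestion above: check for a closed connection once, before any statement-specific branch. The `StatementType` enum, the `execute` function, and the `commit_timestamp` attribute are hypothetical stand-ins, not the PR's code. The error message and the `InterfaceError` name are taken from the diff hunks quoted elsewhere in this review.

```python
from enum import Enum, auto


class InterfaceError(Exception):
    """Stand-in for the DB-API InterfaceError used by the library."""


class StatementType(Enum):
    # Hypothetical subset of client-side statement types.
    COMMIT = auto()
    ROLLBACK = auto()
    SHOW_COMMIT_TIMESTAMP = auto()


CONNECTION_CLOSED_ERROR = "This connection is closed"


def execute(connection, statement_type):
    # Guard once at the top so every statement type gets the same check.
    if connection.is_closed:
        raise InterfaceError(CONNECTION_CLOSED_ERROR)
    if statement_type == StatementType.COMMIT:
        return connection.commit()
    if statement_type == StatementType.ROLLBACK:
        return connection.rollback()
    if statement_type == StatementType.SHOW_COMMIT_TIMESTAMP:
        # Hypothetical attribute, used only to keep the sketch short.
        return connection.commit_timestamp
    raise ValueError(f"Unsupported statement type: {statement_type}")
```

With the guard at the top, adding a new statement type cannot accidentally skip the closed-connection check.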

olavloite · 2023-12-04T07:53:47Z

google/cloud/spanner_dbapi/client_side_statement_parser.py

@@ -23,6 +23,12 @@
 RE_BEGIN = re.compile(r"^\s*(BEGIN|START)(TRANSACTION)?", re.IGNORECASE)
 RE_COMMIT = re.compile(r"^\s*(COMMIT)(TRANSACTION)?", re.IGNORECASE)
 RE_ROLLBACK = re.compile(r"^\s*(ROLLBACK)(TRANSACTION)?", re.IGNORECASE)
+RE_SHOW_COMMIT_TIMESTAMP = re.compile(
+    r"^\s*(SHOW VARIABLE COMMIT_TIMESTAMP)", re.IGNORECASE


Will this regex accept show variable commit timestamp? And show\nvariable\ncommit\ntimestamp?

Hopefully not because commit_timestamp is one token :-) But as to spacing elsewhere, good point.

Changed regex for spacing
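For illustration, a whitespace-tolerant pattern could look like the sketch below. The regex actually committed in the PR is not shown in this thread, so the pattern, its name, and the helper function here are assumptions.

```python
import re

# Hypothetical pattern: any run of whitespace (spaces, tabs, newlines) may
# separate the keywords, but COMMIT_TIMESTAMP must remain a single token.
RE_SHOW_COMMIT_TIMESTAMP_SKETCH = re.compile(
    r"^\s*SHOW\s+VARIABLE\s+COMMIT_TIMESTAMP\s*;?\s*$", re.IGNORECASE
)


def is_show_commit_timestamp(sql: str) -> bool:
    """Return True if sql is a SHOW VARIABLE COMMIT_TIMESTAMP statement."""
    return RE_SHOW_COMMIT_TIMESTAMP_SKETCH.match(sql) is not None
```

Using `\s+` between keywords accepts newlines and repeated spaces, while a literal underscore keeps `commit timestamp` (two tokens) from matching.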

olavloite · 2023-12-04T07:55:32Z

google/cloud/spanner_dbapi/connection.py

-            bool: True if Spanner transaction started, False otherwise.
-        """
+    def inside_transaction(self):
+        """Deprecated property which won't be supported in future versions.


nit:

Suggested change

"""Deprecated property which won't be supported in future versions.

"""Deprecated: This property may be removed in a future release.

If this is deprecated, should it be tagged @deprecated?

(Python does support stacking decorators. Order matters; they are applied in the reverse of the order in which they are listed. @deprecated deprecates a function and @property turns a function into a property, so probably the deprecation should be listed last so that it's evaluated first, on the function rather than its derived property.)

Thanks, added @deprecated.
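A small sketch of the decorator ordering described above. The `deprecated` decorator here is a hand-written stand-in (the decorator the PR actually uses is not shown in this thread), and `SketchConnection` is a hypothetical class.

```python
import functools
import warnings


def deprecated(reason):
    """Hypothetical stand-in for a real @deprecated decorator."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(reason, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


class SketchConnection:
    def __init__(self):
        self._transaction_started = False

    # Decorators apply bottom-up: @deprecated wraps the plain function first,
    # then @property turns the wrapped function into a property.
    @property
    @deprecated("inside_transaction may be removed in a future release.")
    def inside_transaction(self):
        return self._transaction_started
```

Reading `SketchConnection().inside_transaction` emits a `DeprecationWarning` and still returns the underlying value.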

olavloite · 2023-12-04T13:44:46Z

google/cloud/spanner_dbapi/connection.py

@@ -405,13 +397,15 @@ def begin(self):
        Marks the transaction as started.

        :raises: :class:`InterfaceError`: if this connection is closed.
-        :raises: :class:`OperationalError`: if there is an existing transaction that has begin or is running
+        :raises: :class:`OperationalError`: if there is an existing transaction
+        that has begin or is running


(I know that it's not a change in this PR, but I just noticed it now):

Suggested change

that has begin or is running

that has been started

olavloite · 2023-12-04T13:59:29Z

tests/system/test_dbapi.py

@@ -152,6 +153,87 @@ def test_begin_client_side(self, shared_instance, dbapi_database):
        conn3.close()
        assert got_rows == [updated_row]

+    def test_commit_timestamp_client_side(self):


nit:

Suggested change

def test_commit_timestamp_client_side(self):

def test_commit_timestamp_client_side_transaction(self):

olavloite · 2023-12-04T14:01:29Z

tests/system/test_dbapi.py

+        self._cursor.execute("SELECT * FROM contacts")
+        self._cursor.execute("SELECT * FROM contacts")


I agree that this test should run two selects, but the results of both should also be consumed and verified to contain what we expect.

Also, it would be good to verify that we can see the read timestamp already after executing the first query, and that it stays the same after the second query and after the commit.

aseering · 2023-12-04T15:07:54Z

google/cloud/spanner_dbapi/client_side_statement_executor.py

+        connection.rollback()
+        return None
+    if (
+        parsed_statement.client_side_statement_type


If there are going to be lots of these ifs and they'll all be long due to the long variable name, you might consider doing statement_type = parsed_statement.client_side_statement_type at the top of this block of ifs. Then you can use the shorter name throughout here.

You could also do something like Type = ClientSideStatementType. Aliasing a class is a little less common, though. And, I think, probably not necessary to get the line length consistently under the length limit?

Related, very minor thing:

In Python, everything is interpreted and there's no JIT compiler doing type inference, so every time you do a.b, the Python interpreter has to actually interpret that logic at runtime. Specifically -- go look up the type of a, look up and run code associated with any overrides on a's . operator (for example properties are implemented under the hood as an overload to the dot operator that checks whether the name of the field that you're looking up is the same as the name of a property on the class and, if so, replaces that lookup with a call to the underlying getter(/setter(/deleter)) function), then if still applicable go look up b in a's member dictionary.

As a result:

a.b a.b a.b a.b (... lots of times ...)

is actually slightly-but-measurably slower than

c = a.b c c c c (... lots of times ...)

In most other languages, unless a pointer dereference is required (which is still cheap), nested member-variable access is zero cost at runtime; all of this resolution is sorted out at compile time or by a JIT compiler.

The performance difference is not usually enough (and Python code is usually not performance-sensitive enough) that it matters. But it gently nudges Python to tend to use simple variables rather than complex nested structures.

Defined a variable statement_type. Thanks for the explanation, I was not aware of it. I will try to keep this in mind in future code.
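A sketch of the local-alias suggestion, with stand-in types. The enum values, the `ParsedStatement` dataclass, and the `describe` function are illustrative and not the library's definitions.

```python
import enum
from dataclasses import dataclass


class ClientSideStatementType(enum.Enum):
    # Subset of the statement types named in this review; values are arbitrary.
    COMMIT = 1
    ROLLBACK = 2
    SHOW_COMMIT_TIMESTAMP = 3
    SHOW_READ_TIMESTAMP = 4


@dataclass
class ParsedStatement:
    client_side_statement_type: ClientSideStatementType


def describe(parsed_statement: ParsedStatement) -> str:
    # Resolve the attribute once; every branch below reuses the local name,
    # which keeps lines short and avoids repeated attribute lookups.
    statement_type = parsed_statement.client_side_statement_type
    if statement_type == ClientSideStatementType.COMMIT:
        return "commit"
    if statement_type == ClientSideStatementType.ROLLBACK:
        return "rollback"
    if statement_type == ClientSideStatementType.SHOW_COMMIT_TIMESTAMP:
        return "show commit timestamp"
    if statement_type == ClientSideStatementType.SHOW_READ_TIMESTAMP:
        return "show read timestamp"
    raise ValueError(f"Unsupported statement type: {statement_type}")
```

The main benefit here is readability; as noted above, the speed difference rarely matters in practice.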

aseering · 2023-12-04T15:36:32Z

google/cloud/spanner_dbapi/client_side_statement_parser.py

@@ -23,6 +23,12 @@
 RE_BEGIN = re.compile(r"^\s*(BEGIN|START)(TRANSACTION)?", re.IGNORECASE)
 RE_COMMIT = re.compile(r"^\s*(COMMIT)(TRANSACTION)?", re.IGNORECASE)
 RE_ROLLBACK = re.compile(r"^\s*(ROLLBACK)(TRANSACTION)?", re.IGNORECASE)
+RE_SHOW_COMMIT_TIMESTAMP = re.compile(
+    r"^\s*(SHOW VARIABLE COMMIT_TIMESTAMP)", re.IGNORECASE


Hopefully not because commit_timestamp is one token :-) But as to spacing elsewhere, good point.

aseering · 2023-12-04T15:39:59Z

google/cloud/spanner_dbapi/connection.py

-            bool: True if Spanner transaction started, False otherwise.
-        """
+    def inside_transaction(self):
+        """Deprecated property which won't be supported in future versions.


If this is deprecated, should it be tagged @deprecated?

(Python does support stacking decorators. Order matters; they are applied in the reverse of the order in which they are listed. @deprecated deprecates a function and @property turns a function into a property, so probably the deprecation should be listed last so that it's evaluated first, on the function rather than its derived property.)

aseering · 2023-12-04T15:56:14Z

google/cloud/spanner_v1/snapshot.py

        self._read_request_count += 1
        self._execute_sql_count += 1

+        if self._read_only and not transaction_id_set:
+            peek = next(iterator)


Isn't this advancing the iterator by one?

It looks to me like line 485 is effectively rewinding the iterator? So this does implement the peek idiom (assuming no exception is thrown between here and there), though I'd agree that it's confusing.

Reading the implementation of _restart_on_unavailable, though: Do I read correctly that it's implemented by buffering the full resultset in memory? I'm curious if you know why we do that rather than yielding results as they arrive and, in case of ServiceUnavailable, just replay to the point where we left off and keep streaming? The semantics would be slightly different, so there might be a good reason for this choice.

(I started reading that function because I was wondering if this was a custom iterator and if we could add a peek() method to it. It looks to me like the current algorithm would in fact support peek() just fine (because it buffers all results in memory anyway), but it's not a custom iterator so it wouldn't be trivial to add such a method. The more-itertools library contains a peekable wrapper that would do this for us, though that would add a library dependency.)
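For comparison, a dependency-free way to peek at an iterator is to read one item and chain it back in front. This is only a sketch of the general idiom; it is not how the PR or `_restart_on_unavailable` implements it, and the `peek` helper name is an assumption.

```python
import itertools


def peek(iterator):
    """Return (first_item, iterator_that_still_yields_first_item).

    Raises StopIteration if the iterator is empty.
    """
    first = next(iterator)
    # Put the consumed item back in front of the remaining items.
    return first, itertools.chain([first], iterator)
```

The caller must continue with the returned iterator rather than the original one, since the original has already advanced by one item.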

aseering · 2023-12-04T16:01:12Z

google/cloud/spanner_v1/snapshot.py

        self._read_request_count += 1
        self._execute_sql_count += 1

+        if self._read_only and not transaction_id_set:
+            peek = next(iterator)


Will this fail if the query returns no results? (IIRC next() with no second/default argument throws a StopIteration exception or something if there's nothing left to return?) Should it?

There would be one result (a PartialResultSet object) in which the metadata field is present and values is an empty list.
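As background for the question above: `next()` on an exhausted iterator raises `StopIteration` unless a default is passed as the second argument. The helper below is a hypothetical illustration of the defensive form, not code from the PR.

```python
_EMPTY = object()  # sentinel so that a legitimate None item is not confused with "empty"


def first_or_none(iterator):
    """Return the first item of iterator, or None if it yields nothing."""
    first = next(iterator, _EMPTY)  # the default suppresses StopIteration
    return None if first is _EMPTY else first
```

If the stream always yields at least one PartialResultSet, as the reply states, the bare `next(iterator)` call is safe; the default form only matters if that guarantee could ever be broken.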

olavloite · 2023-12-07T12:05:43Z

google/cloud/spanner_dbapi/client_side_statement_executor.py

+        return _get_streamed_result_set(
+            ClientSideStatementType.SHOW_COMMIT_TIMESTAMP.name,
+            TypeCode.TIMESTAMP,
+            connection._transaction.committed,


What happens if _transaction is None? Or if the transaction has not yet committed? Will this then return a result set containing a single row/column containing a NULL? Or something else?

Made the change to return an empty result set in both cases.
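A sketch of the "empty result set in both cases" behavior, using plain lists instead of the streamed result set that the PR builds with `_get_streamed_result_set`. The function name and the stand-in connection objects are hypothetical.

```python
from types import SimpleNamespace


def commit_timestamp_rows(connection):
    """Return [[timestamp]] if a commit timestamp exists, otherwise no rows."""
    transaction = connection._transaction
    # Case 1: no transaction at all. Case 2: transaction not yet committed.
    if transaction is None or transaction.committed is None:
        return []
    return [[transaction.committed]]


# Illustrative stand-ins for the three possible connection states.
no_transaction = SimpleNamespace(_transaction=None)
uncommitted = SimpleNamespace(_transaction=SimpleNamespace(committed=None))
committed = SimpleNamespace(_transaction=SimpleNamespace(committed="2023-12-07T12:05:43Z"))
```

Returning zero rows rather than a single NULL row lets callers tell "no timestamp available" apart from an actual value.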

olavloite · 2023-12-07T12:06:20Z

google/cloud/spanner_dbapi/client_side_statement_executor.py

+        return _get_streamed_result_set(
+            ClientSideStatementType.SHOW_READ_TIMESTAMP.name,
+            TypeCode.TIMESTAMP,
+            connection._snapshot._transaction_read_timestamp,


Same here as above? What happens if _snapshot is None?

Same response as above

olavloite · 2023-12-07T12:13:12Z

google/cloud/spanner_dbapi/connection.py

+    "This method is non-operational as transaction has not been started at " "client."
+)
+SPANNER_TRANSACTION_NOT_STARTED_WARNING = (
+    "This method is non-operational as transaction has not been started at " "spanner."


I don't think we should distinguish between these two in any warnings that we return to the user. Whether or not a transaction has been started on Spanner is essentially an implementation detail.

(Note: Internally, we can make this difference. But I don't think we should communicate it like that to a user.)

Removed this warning as it is no longer needed.

olavloite · 2023-12-07T12:14:49Z

google/cloud/spanner_dbapi/connection.py

-        made atleast one call to Spanner. Property client_transaction_started
-        would always be true if this is true as transaction has to start first
-        at clientside than at Spanner
+    def ddl_statements(self):


Is this a new property? If so, any specific reason that we are adding this in this PR?

IntelliJ suggested it, so I created it. Let me know if I should revert it.

I would in that case at least make it internal (so let it start with an underscore). We should be as conservative as possible when it comes to adding public methods and properties to the interface.

Removed the property, as there is no difference between accessing a property and a field in Python when both start with an underscore, and IntelliJ gives the same warning either way.

olavloite · 2023-12-07T12:28:37Z

google/cloud/spanner_dbapi/connection.py

        if not self._client_transaction_started:
            warnings.warn(
                CLIENT_TRANSACTION_NOT_STARTED_WARNING, UserWarning, stacklevel=2
            )
            return
+        if not self._spanner_transaction_started:


Same as above: I don't think we want to make this difference, but also this should be supported:

begin; rollback;

Removed the warning and added the system test test_begin_and_rollback to test the mentioned use case

olavloite · 2023-12-07T12:50:32Z

tests/system/test_dbapi.py

+        assert len(got_rows[0]) == 1
+        assert len(self._cursor.description) == 1
+        assert self._cursor.description[0].name == "SHOW_READ_TIMESTAMP"
+        assert isinstance(got_rows[0][0], DatetimeWithNanoseconds)


Can we add another query to this test, and then verify that the next query gives us a new read timestamp?

olavloite · 2023-12-11T12:57:00Z

google/cloud/spanner_dbapi/client_side_statement_executor.py

+
+CONNECTION_CLOSED_ERROR = "This connection is closed"
+TRANSACTION_NOT_STARTED_WARNING = (
+    "This method is non-operational as transaction has not been started."


nit:

Suggested change

"This method is non-operational as transaction has not been started."

"This method is non-operational as a transaction has not been started."

olavloite · 2023-12-11T12:58:30Z

google/cloud/spanner_dbapi/connection.py

@@ -35,7 +36,7 @@


 CLIENT_TRANSACTION_NOT_STARTED_WARNING = (
-    "This method is non-operational as transaction has not started"
+    "This method is non-operational as transaction has not been started."


nit:

Suggested change

"This method is non-operational as transaction has not been started."

"This method is non-operational as a transaction has not been started."

Implementation of client side statements that return

3f3b8cb

ankiaga requested review from a team as code owners December 4, 2023 04:27

product-auto-label bot added size: l Pull request size is large. api: spanner Issues related to the googleapis/python-spanner API. labels Dec 4, 2023

ankiaga changed the title ~~Implementation of client side statements that return~~ feat: Implementation of client side statements that return Dec 4, 2023

Small fix

8b63b9c

ankiaga requested review from aseering and olavloite December 4, 2023 05:25

ankiaga added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 4, 2023

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 4, 2023

olavloite reviewed Dec 4, 2023

aseering reviewed Dec 4, 2023

Incorporated comments

2361dfb

ankiaga force-pushed the clientReturn branch from f5b704b to 2361dfb Compare December 6, 2023 06:19

ankiaga added 5 commits December 6, 2023 16:49

Added tests for exception in commit and rollback

72f6221

Fix in tests

27594e5

Skipping few tests from running in emulator

9108192

Few fixes

f688d54

Refactoring

03f42a2

ankiaga added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 7, 2023

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 7, 2023

olavloite reviewed Dec 7, 2023

Incorporated comments

e603dc3

olavloite approved these changes Dec 11, 2023

ankiaga and others added 2 commits December 11, 2023 22:59

Incorporating comments

500b513

Merge branch 'main' into clientReturn

00df2ef

ankiaga enabled auto-merge (squash) December 11, 2023 17:41

ankiaga added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 12, 2023

yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 12, 2023

ankiaga merged commit bb5fa1f into googleapis:main Dec 12, 2023
16 of 17 checks passed

release-please bot mentioned this pull request Dec 12, 2023

chore(main): release 3.41.0 #1009

Merged