Include adapter response info in execution results #2961

kwigley · 2020-12-16T17:22:37Z

resolves #2747

Description

This PR adds an unstructured dictionary to results to track info parsed from query execution.

rename get_status -> get_response in SQLConnectionManager this is preference honestly, this may cause breaking changes in other adapters
execute now returns a tuple of adapter response data (this was previously a status string) and the agate table
updated the execution result we store in the context
write this new unstructured dictionary to a field named adapter_response in run_results.json and sources.json

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

jtcohen6

This is looking great!

A few thoughts about naming:

we already use the word status in the result object, to indicate dbt's interpretation of the response, so we probably shouldn't repeat it here as well
the dict is only populated if the query succeeds
today, we populate the dict only for materializations (models, seeds), i.e. DDL/DML statements, though someday we may want to report bytes processed by BigQuery tests (which are just select statements)

With that in mind, I'm thinking about this dict as a way of reporting the database's "API response" to dbt's successful "request" (materialization DDL/DML). An adapter_response is comprised of a code, optional attributes, and then a message that (based on the adapter's logic) is either recorded verbatim or synthesized from those attributes.

"adapter_response": {
    "message": "INSERT 0 2",
    "code": "INSERT",
    "rows_affected": 2
}

"adapter_response": {
    "message": "MERGE (2.0 rows, 16.0 Bytes processed)",
    "code": "MERGE",
    "rows_affected": 2,
    "bytes_processed": 16
}

"adapter_response": {
    "message": "CREATE VIEW",
    "code": "CREATE VIEW"
}

"adapter_response": {
    "message": "OK",
    "code": "OK"
}

"adapter_response": {
    "message": "SUCCESS 1",
    "code": "SUCCESS",
    "rows_affected": 1
}

What do you think?

plugins/bigquery/dbt/adapters/bigquery/connections.py

jtcohen6 · 2020-12-16T23:55:40Z

plugins/postgres/dbt/adapters/postgres/connections.py

-        return cursor.statusmessage
+    def get_status(cls, cursor) -> ExecutionStatus:
+        message = str(cursor.statusmessage)
+        rows = cursor.rowcount


This can be a weird number: it looks like CREATE VIEW returns a value of -1? Not sure if we should intervene here, given that we're pulling it right off the connection

I don't think we should intervene, well actually I don't feel comfortable intervening since I don't know all the possibilities that could be returned by cursor API.

plugins/postgres/dbt/adapters/postgres/connections.py

jtcohen6 · 2020-12-17T00:02:40Z

core/dbt/task/run.py

+            message=str(result.status),
+            adapter_query_status=adapter_query_status


This effectively duplicates the message in run_results.json, nested within the result dict and also within the adapter_query_status dict. Are we doing this to avoid a breaking change for plugins / users that expect to print message as a key to RunResult?

yes, that is correct. The message field on the run result could also contain error message so I think to be consistent it makes sense to keep this field. We could, however, remove the message field nested in the adapter response when we write out run results.

We could, however, remove the message field nested in the adapter response when we write out run results.

ok cool, I think this is the way to go, since message is more of a "dbt thing" and may or may not map to a specific adapter response value

sounds good! I'll get this in and mark up for dev review

jtcohen6 · 2020-12-17T00:22:35Z

plugins/snowflake/dbt/adapters/snowflake/connections.py

-        return "{} {}".format(state, cursor.rowcount)
+        return ExecutionStatus(
+            message="{} {}".format(state, cursor.rowcount),
+            rows=cursor.rowcount,


Snowflake always reports back SUCCESS 1 for create view + create table statements, even if the resulting view or table has many more rows. I don't think it's really our place to intervene here, though, since we're pulling this value right off the connection

same as above..

kwigley · 2020-12-17T01:49:20Z

@jtcohen6 woo! thanks so much for taking a look at this! I hadn't quite fleshed out what is involved here, but your comments definitely help guide what is left to do! 100% agree with your naming suggestions

kwigley · 2020-12-22T18:12:02Z

core/dbt/adapters/sql/connections.py

                elapsed=(time.time() - pre)
            )

            return connection, cursor

    @abc.abstractclassmethod
-    def get_status(cls, cursor: Any) -> str:
+    def get_response(cls, cursor: Any) -> Union[AdapterResponse, str]:


This is something that other adapter plugins may need to account for, this might cause a breaking change.

Example: this is override in dbt-spark, will need to update this

We can call this out in the release notes. I'd like to reorganize + clean up the changelog before final release, anyway

got it! I'll add a breaking section next time I break something :) please should tap me if you want a hand cleaning the changelog up

jtcohen6

lgtm! run results just get better and better

cla-bot bot added the cla:yes label Dec 16, 2020

kwigley changed the title ~~Add adapter specific query stats~~ [WIP] Add adapter specific query stats Dec 16, 2020

jtcohen6 reviewed Dec 17, 2020

View reviewed changes

This was referenced Dec 17, 2020

BigQuery: bytes processed should be logged for all dbt commands that perform queries #2747

Closed

[CT-1635] Populate adapter_response for tests + sources #2964

Closed

first pass at adding query stats, naming tbd

dddf1bc

kwigley force-pushed the feature/add-adapter-query-stats branch from 0370f26 to 122bd07 Compare December 21, 2020 18:24

update naming

aa3bdfe

kwigley force-pushed the feature/add-adapter-query-stats branch from 122bd07 to aa3bdfe Compare December 21, 2020 18:35

update api, fix tests, add placeholder for test/source results

bf4ee4f

jtcohen6 mentioned this pull request Dec 22, 2020

Add note about adapter_response in run_results.json, sources.json dbt-labs/docs.getdbt.com#495

Merged

3 tasks

kwigley requested review from jtcohen6 and gshank December 22, 2020 18:04

kwigley changed the title ~~[WIP] Add adapter specific query stats~~ Include adapter response info in execution results Dec 22, 2020

kwigley marked this pull request as ready for review December 22, 2020 18:05

update changelog

71bcf9b

kwigley commented Dec 22, 2020

View reviewed changes

gshank approved these changes Dec 22, 2020

View reviewed changes

jtcohen6 approved these changes Dec 22, 2020

View reviewed changes

kwigley merged commit 421aaab into dev/kiyoshi-kuromiya Dec 22, 2020

kwigley deleted the feature/add-adapter-query-stats branch December 22, 2020 18:57

jtcohen6 mentioned this pull request Jan 5, 2021

Breaking changes for plugin maintainers dbt-labs/docs.getdbt.com#503

Merged

3 tasks

jtcohen6 mentioned this pull request Jan 14, 2021

V0.19.0 dbt-msft/dbt-sqlserver#81

Merged

kwigley mentioned this pull request Jan 14, 2021

Add info from pyodbc cursor to adapter response dbt-labs/dbt-spark#142

Closed

vondras mentioned this pull request Jan 27, 2021

dbt version 0.19.0rc1 introduced a breaking change: SQLConnectionManager.get_status renamed to get_response fabrice-etanchaud/dbt-dremio#9

Closed

jtcohen6 mentioned this pull request Mar 3, 2021

Auditing row counts from dbt materializations #3142

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Include adapter response info in execution results #2961

Include adapter response info in execution results #2961

kwigley commented Dec 16, 2020 •

edited

Loading

jtcohen6 left a comment

jtcohen6 Dec 16, 2020

kwigley Dec 21, 2020

jtcohen6 Dec 17, 2020

kwigley Dec 21, 2020

jtcohen6 Dec 21, 2020

kwigley Dec 21, 2020

jtcohen6 Dec 17, 2020

kwigley Dec 21, 2020

kwigley commented Dec 17, 2020

kwigley Dec 22, 2020

jtcohen6 Dec 22, 2020

kwigley Dec 22, 2020

jtcohen6 left a comment

		message=str(result.status),
		adapter_query_status=adapter_query_status

Include adapter response info in execution results #2961

Include adapter response info in execution results #2961

Conversation

kwigley commented Dec 16, 2020 • edited Loading

Description

Checklist

jtcohen6 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kwigley commented Dec 17, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jtcohen6 left a comment

Choose a reason for hiding this comment

kwigley commented Dec 16, 2020 •

edited

Loading