V0.19.0rc2 #40

Merged
48 commits merged into master from the v0.19.0rc2 branch on Feb 17, 2021

Conversation

@dataders (Contributor) commented Jan 15, 2021

see changelog

@dataders (Contributor, Author)

ok THIS is the error I've been getting locally... tox was doing some caching thing that I don't understand...

Normally with dbt-synapse I don't ever see those %s injected... that's a dbt-sqlserver thing... @mikaelene @jtcohen6, any ideas? I'm clueless.

2021-01-15 09:45:42.224188 (Thread-1): On seed.dbt_test_project.added: 
            insert into "************"."dbt_test_210115094526928496909386"."added" (id, name, some_date) values
            (%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s)
        ....
2021-01-15 09:45:42.224337 (Thread-1): Database error: ('The SQL contains 0 parameter markers, but 60 parameters were supplied', 'HY000')
2021-01-15 09:45:42.224455 (Thread-1): On seed.dbt_test_project.added: ROLLBACK
2021-01-15 09:45:42.224562 (Thread-1): On seed.dbt_test_project.added: Close
2021-01-15 09:45:42.225497 (Thread-1): Database Error in seed added (data/added.csv)
  ('The SQL contains 0 parameter markers, but 60 parameters were supplied', 'HY000')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/dbt/adapters/sqlserver/connections.py", line 179, in exception_handler
    yield
  File "/usr/local/lib/python3.7/site-packages/dbt/adapters/sqlserver/connections.py", line 350, in add_query
    cursor.execute(sql, bindings)
pyodbc.ProgrammingError: ('The SQL contains 0 parameter markers, but 60 parameters were supplied', 'HY000')

@dataders mentioned this pull request Jan 18, 2021
@dataders (Contributor, Author)

I'm very much clueless here. Still getting the empty table name issue for ts_snapshot.sql that is fixed in dbt-core and dbt-sqlserver...

@jtcohen6, if you have time this week, would you mind taking this branch for a spin and running pytest locally? also happy to team up and tackle this together synchronously if you'd like.

@mikaelene (Contributor)

It looks like the seeds fail. The tests first load the database with data from the seed files, then execute the tests. The failure is that it tries to execute a snapshot on an empty table. If you add all the tests back to the spec file, I suspect you'll see that every one of them fails.

@jtcohen6 (Contributor)

Here's some best-guesswork from me. The immediate error is related to the fact that dbt seed seems to be using ordinal Python-style sqlparams (%s) instead of qmark-style sqlparams (?); pyodbc only supports the latter (hence the sqlserver seed implementation). So in the most recent failed CI test, we see this error:

insert into "************"."dbt_test_210118095114296000933407"."added" (id, name, some_date) values
            (%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s),(%s,%s,%s)
Database error: ('The SQL contains 0 parameter markers, but 60 parameters were supplied', 'HY000')
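
(As a standalone illustration of the placeholder mismatch, and not code from this repo: the connection string and table below are made up, but the same error is easy to reproduce directly with pyodbc.)

import pyodbc

conn = pyodbc.connect("DSN=my_synapse;UID=user;PWD=secret")  # hypothetical DSN
cursor = conn.cursor()

row = (1, "alpha", "2021-01-01")

# Fails: pyodbc never treats %s as a parameter marker, so it sees a statement
# with 0 markers and raises "The SQL contains 0 parameter markers, but 3
# parameters were supplied", the same error as in the log above.
# cursor.execute("insert into added (id, name, some_date) values (%s, %s, %s)", row)

# Works: pyodbc's paramstyle is 'qmark', so placeholders must be written as ?.
cursor.execute("insert into added (id, name, some_date) values (?, ?, ?)", row)
conn.commit()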

It looks to be using the default dbt seed materialization, which has %s sqlparams, instead of the synapse version:

https://github.com/dbt-msft/dbt-synapse/blob/b6d2015aa72ef0c3302ea193c0ea09417a88efa5/dbt/include/synapse/macros/materializations/seed/seed.sql#L18-L37

So it does seem like an issue with properly dispatching synapse__load_csv_rows, though I'm not currently sure why. It feels like your best line of inquiry would be to check out this branch locally, pip install -e . into a virtualenv, and try running dbt seed.

Otherwise, I'm just observing that:

  • The synapse CI job is only running two tests because you commented out the rest (713620f)
  • The setup code looks different from successful past CI tests, both in this repo and for dbt-sqlserver, because of cb48c2b

@dataders (Contributor, Author)

It looks like the seeds fail. The tests first load the database with data from the seed files, then execute the tests. The failure is that it tries to execute a snapshot on an empty table. If you add all the tests back to the spec file, I suspect you'll see that every one of them fails.

@mikaelene I just re-enabled the previous tests and I'm still getting the same error. Very strange.

@jtcohen6 (Contributor)

This T-SQL syntax is unfamiliar to me. What should this look like?

EXEC('create view dbt_test_210119001001306735066970.cc_all_snapshot_temp_view as


select *,
  
CONVERT(VARCHAR(32), HashBytes(''MD5'', 
  coalesce(CONVERT(varchar, id, 121), '''')  + ''|'' + 

  coalesce(CONVERT(varchar, CONVERT(DATETIME2, ''2021-01-19 00:12:40.323000''), 121), '''') 
), 2)
as dbt_scd_id,
  CONVERT(DATETIME2, ''2021-01-19 00:12:40.323000'') as dbt_updated_at,
  CONVERT(DATETIME2, ''2021-01-19 00:12:40.323000'') as dbt_valid_from,
  nullif(CONVERT(DATETIME2, ''2021-01-19 00:12:40.323000''), CONVERT(DATETIME2, ''2021-01-19 00:12:40.323000'')) as dbt_valid_to
from (
  
) sbq


');

@dataders (Contributor, Author)

This T-SQL syntax is unfamiliar to me. What should this look like?

@jtcohen6 I had a breakthrough thanks to your prompting and realized that I hadn't yet purged dbt/include/synapse/macros/materializations/snapshot/snapshot.sql of:

  • synapse__build_snapshot_staging_table() and
  • {% materialization snapshot, adapter='synapse' %}.

So now I've done that, but the place where I'm stuck is that insert_cols_csv is empty here (I've logged it to make sure). My next step is to investigate by copying the snapshot materialization back into dbt-synapse so that I can add log statements (unless you can think of something else).

For reference, here's a Gist of all the SQL called by test_dbt_snapshot_strategy_check_cols.

https://github.com/dbt-msft/dbt-synapse/blob/eba3b3f0f096b5ba014274b4469dcdfe61b72adf/dbt/include/synapse/macros/materializations/snapshot/snapshot_merge.sql#L1-L17

@jtcohen6 (Contributor) commented Jan 20, 2021

You're right, there doesn't seem to be anything here:

insert into "************"."dbt_test_210120093751790640424568"."cc_date_snapshot" (
                  
                  )

I'm guessing that the issue is adapter.get_columns_in_relation() here, which effectively runs this query (copied from CircleCI logs):

SELECT
          column_name,
          data_type,
          character_maximum_length,
          numeric_precision,
          numeric_scale
      FROM
          (select
              ordinal_position,
              column_name,
              data_type,
              character_maximum_length,
              numeric_precision,
              numeric_scale
          from INFORMATION_SCHEMA.COLUMNS
          where table_name = '#cc_all_snapshot__dbt_tmp'
            and table_schema = 'dbt_test_210120093751790640424568') cols

Do temp tables show up in INFORMATION_SCHEMA.COLUMNS in Synapse? I don't think they do.
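
(Side note, not code from this PR: the contrast is easy to see in a bare pyodbc session. For a session-scoped temp table, the INFORMATION_SCHEMA lookup in the current database typically comes back empty, while selecting zero rows and reading the cursor metadata still yields the column names, which is essentially what get_columns_in_query relies on. The connection string below is made up.)

import pyodbc

conn = pyodbc.connect("DSN=my_synapse;UID=user;PWD=secret")  # hypothetical DSN
cursor = conn.cursor()

cursor.execute("create table #cc_all_snapshot__dbt_tmp (id int, name varchar(50))")

# 1. The metadata-catalog route (the INFORMATION_SCHEMA query above): temp tables
#    live outside the user database, so this typically returns no rows.
cursor.execute(
    "select column_name from INFORMATION_SCHEMA.COLUMNS where table_name = ?",
    ("#cc_all_snapshot__dbt_tmp",),
)
print(cursor.fetchall())  # [] -- hence the empty column list in the insert

# 2. The route get_columns_in_query effectively takes: run a zero-row select and
#    read the column names from the result-set metadata instead.
cursor.execute("select * from #cc_all_snapshot__dbt_tmp where 1 = 0")
print([col[0] for col in cursor.description])  # ['id', 'name']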

So I think your options are:

  1. Reimplement get_columns_in_relation(relation) so that it does something else when relation is a temp table:
{% macro synapse__get_columns_in_relation(relation) -%}
  
  {# temp tables need different handling #}
  {% if relation.identifier.startswith("#") %}
    {% set select_star %} select * from {{ relation }} {% endset %}
    {{ return(get_columns_in_query(select_star)) }}
  {% endif %}
  
  ... as normal ...

{% endmacro %}

get_columns_in_query('select * from ' ~ relation)
2. Update synapse__snapshot_merge_sql to override insert_cols by grabbing columns from the target instead, and hope the two line up (they should):

{% macro synapse__snapshot_merge_sql(target, source, insert_cols) -%} 
      {%- set target_columns = adapter.get_columns_in_relation(target)
                                   | rejectattr('name', 'equalto', 'dbt_change_type')
                                   | rejectattr('name', 'equalto', 'DBT_CHANGE_TYPE')
                                   | rejectattr('name', 'equalto', 'dbt_unique_key')
                                   | rejectattr('name', 'equalto', 'DBT_UNIQUE_KEY')
                                   | list -%}
      {% set quoted_target_columns = [] %}
      {% for column in target_columns %}
        {% do quoted_target_columns.append(adapter.quote(column.name)) %}
      {% endfor %}
      {%- set insert_cols_csv = quoted_target_columns | join(', ') -%} 
      {{ log('insert_cols_csv: ' ~ insert_cols_csv, info=True) }} 
        
       EXEC(' 
       ...

One other thing: you'll want this line
https://github.com/dbt-msft/dbt-synapse/blob/eba3b3f0f096b5ba014274b4469dcdfe61b72adf/dbt/include/synapse/macros/materializations/snapshot/snapshot_merge.sql#L11
To become

            and TMP.dbt_change_type in (''update'', ''delete'')

This enables the optional "hard delete" capture introduced in v0.19.0. (See dbt-labs/dbt-core#3004; it looks like dbt-sqlserver is now able to use the default version of this logic.)

@dataders (Contributor, Author)

Do temp tables show up in INFORMATION_SCHEMA.COLUMNS in Synapse? I don't think they do.

🤦 ❌ 💯 ❌ 💯 (i.e. 10K facepalms).

It's slowly coming back to me now. I've solved this exact problem before, but I can't remember or pinpoint my exact workaround yet. My working theory is that I made the staging table an actual table instead of a temp table. Thankfully 2020 Anders left Stack Overflow breadcrumbs so that 2021 Jeremy could help! Your x-ray vision has proved crucial.

For me, I think that your first suggestion is the more robust solution, in case more temp table troubles surface in the future. While it's a bit of unexpected work, lowering the code footprint of both dbt-sqlserver and dbt-synapse is well worth the effort as it will make long-term maintenance easier.

@dataders (Contributor, Author)

sort of blocked by: dbt-msft/dbt-sqlserver#89

@dataders marked this pull request as ready for review February 17, 2021 09:36
@dataders merged commit f647b28 into master Feb 17, 2021
@prdpsvs deleted the v0.19.0rc2 branch May 21, 2023 04:13