Issue #127 - Respect SQL Server's Parameter Limit #151

Merged · 6 commits · Jul 19, 2021
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,15 @@
# Changelog

### v0.20.0

#### features

- users can now declare a custom `max_batch_size` in the project configuration to set the batch size used by the seed file loader. [#127](https://github.com/dbt-msft/dbt-sqlserver/issues/127) and [#151](https://github.com/dbt-msft/dbt-sqlserver/pull/151) thanks [@jacobm001](https://github.com/jacobm001)

#### under the hood

- `sqlserver__load_csv_rows` now has a safety check provided by `calc_batch_size()` to ensure its insert statements won't exceed SQL Server's 2100-parameter limit. [#127](https://github.com/dbt-msft/dbt-sqlserver/issues/127) and [#151](https://github.com/dbt-msft/dbt-sqlserver/pull/151) thanks [@jacobm001](https://github.com/jacobm001)

### v0.19.2

#### fixes
9 changes: 9 additions & 0 deletions README.md
@@ -137,6 +137,15 @@ client_secret: clientsecret

### Seeds

By default, dbt-sqlserver loads seed files in batches of 400 rows. Every inserted value is bound as a parameter, so a batch requires `rows * columns` parameters; if that would exceed SQL Server's limit of 2100 parameters per statement, the adapter automatically reduces the batch size to the largest value that stays under the limit.

To set a different maximum batch size, set the variable `max_batch_size` in your project configuration:

```yaml
vars:
  max_batch_size: 200 # Any positive integer; the adapter still clamps the value to fit SQL Server's parameter limit.
```
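
For a concrete sense of the clamping rule, here is a minimal Python sketch of the logic the adapter applies (the function and constant names below are ours, for illustration only):

```python
# Minimal sketch (illustrative Python, not the adapter's actual code) of the
# batch-size clamping rule described above.
MAX_PARAMS = 2100  # SQL Server's per-statement parameter limit

def effective_batch_size(num_columns: int, max_batch_size: int) -> int:
    # Each row binds one parameter per column, so a batch of N rows
    # consumes num_columns * N parameters.
    if num_columns * max_batch_size < MAX_PARAMS:
        return max_batch_size
    # Otherwise fall back to the largest row count that stays under the limit.
    return MAX_PARAMS // num_columns

print(effective_batch_size(5, 400))   # 400 -- 5 * 400 = 2000 fits under 2100
print(effective_batch_size(30, 400))  # 70  -- 30 * 400 = 12000 does not, so 2100 // 30
```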

### Hooks

### Custom schemas
23 changes: 20 additions & 3 deletions dbt/include/sqlserver/macros/materializations/seed/seed.sql
@@ -1,7 +1,23 @@
-{% macro sqlserver__basic_load_csv_rows(model, batch_size, agate_table) %}
+{% macro calc_batch_size(num_columns,max_batch_size) %}
+    {#
+        SQL Server allows for a max of 2100 parameters in a single statement.
+        Check if the max_batch_size fits with the number of columns, otherwise
+        reduce the batch size so it fits.
+    #}
+    {% if num_columns * max_batch_size < 2100 %}
+        {% set batch_size = max_batch_size %}
+    {% else %}
+        {% set batch_size = (2100 / num_columns)|int %}
+    {% endif %}
+
+    {{ return(batch_size) }}
+{% endmacro %}
+
+{% macro sqlserver__basic_load_csv_rows(model, max_batch_size, agate_table) %}
     {% set cols_sql = get_seed_column_quoted_csv(model, agate_table.column_names) %}
-    {% set bindings = [] %}
 
+    {% set batch_size = calc_batch_size(cols_sql|length, max_batch_size) %}
+    {% set bindings = [] %}
     {% set statements = [] %}
 
     {% for chunk in agate_table.rows | batch(batch_size) %}
@@ -34,5 +50,6 @@
 {% endmacro %}
 
 {% macro sqlserver__load_csv_rows(model, agate_table) %}
-    {{ return(sqlserver__basic_load_csv_rows(model, 200, agate_table) )}}
+    {% set max_batch_size = var("max_batch_size", 400) %}
+    {{ return(sqlserver__basic_load_csv_rows(model, max_batch_size, agate_table) )}}
 {% endmacro %}
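
As background on why the cap matters (a hedged illustration, not code from this PR): the seed loader emits multi-row `INSERT ... VALUES` statements in which every cell is a bound parameter, so one statement consumes `rows * columns` parameters, and SQL Server rejects any request with more than 2100 of them. A sketch of that general shape:

```python
# Hedged illustration: a parameterized multi-row INSERT of the general shape
# the seed loader emits. The helper and names below are ours, not the adapter's.
def build_insert(table: str, columns: list[str], num_rows: int) -> str:
    row = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(row for _ in range(num_rows))
    return f"insert into {table} ({', '.join(columns)}) values {values}"

sql = build_insert("my_seed", ["id", "name", "value"], num_rows=70)
# 3 columns * 70 rows = 210 bound parameters, well under the 2100 cap.
```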