diff --git a/README.md b/README.md
index 7d309ece9..9297c9833 100644
--- a/README.md
+++ b/README.md
@@ -168,7 +168,8 @@ The following configurations can be supplied to models run with the dbt-spark pl
 **Incremental Models**
 
 To use incremental models, specify a `partition_by` clause in your model config. The default incremental strategy used is `insert_overwrite`, which will overwrite the partitions included in your query. Be sure to re-select _all_ of the relevant
-data for a partition when using the `insert_overwrite` strategy. If a `partition_by` config is not specified, dbt will overwrite the entire table as an atomic operation, replacing it with new data of the same schema. This is analogous to `truncate` + `insert`.
+data for a partition when using the `insert_overwrite` strategy. If a `partition_by` config is not specified, dbt will simply
+append new data to the model, without overwriting any existing data.
 
 ```
 {{ config(
diff --git a/dbt/include/spark/macros/materializations/incremental.sql b/dbt/include/spark/macros/materializations/incremental.sql
index 6dd9e3fab..8b67188f4 100644
--- a/dbt/include/spark/macros/materializations/incremental.sql
+++ b/dbt/include/spark/macros/materializations/incremental.sql
@@ -1,8 +1,11 @@
 {% macro get_insert_overwrite_sql(source_relation, target_relation) %}
+  {%- set cols = config.get('partition_by', validator=validation.any[list, basestring]) -%}
+  {%- set insert = 'insert overwrite' if cols is not none else 'insert into' -%}
+
   {%- set dest_columns = adapter.get_columns_in_relation(target_relation) -%}
   {%- set dest_cols_csv = dest_columns | map(attribute='quoted') | join(', ') -%}
 
-  insert overwrite table {{ target_relation }}
+  {{ insert }} table {{ target_relation }}
   {{ partition_cols(label="partition") }}
   select {{dest_cols_csv}} from {{ source_relation.include(database=false, schema=false) }}
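
For context, a minimal model config that exercises each branch might look like the sketch below. The model body, the `date_day` partition column, and the `file_format` value are illustrative, not taken from this diff:

```
{{ config(
    materialized='incremental',
    partition_by=['date_day'],
    file_format='parquet'
) }}

select * from {{ ref('events') }}
```

With `partition_by` set as above, the macro keeps the existing `insert overwrite` behavior; removing the `partition_by` line from this config is what now selects the `insert into` (append) path.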
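And a sketch of the SQL the updated macro would render in each case, assuming a hypothetical target table `analytics.my_model` with columns `id`, `user_id`, and `date_day` (the `my_model__dbt_tmp` source name stands in for the temp relation the incremental materialization passes as `source_relation`):

```
-- partition_by=['date_day']: replaces only the partitions returned by the query
insert overwrite table analytics.my_model
partition (date_day)
select `id`, `user_id`, `date_day` from my_model__dbt_tmp

-- no partition_by: partition_cols renders nothing, and rows are appended
insert into table analytics.my_model
select `id`, `user_id`, `date_day` from my_model__dbt_tmp
```

On the append path no existing rows are touched, which matches the new README wording.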