Development (#87)
* added incremental model and test

* working code for incremental tests -- old unit test still works

* working code for incremental tests -- old unit test still works

* updated comments

* moved unit test incremental to its own method

* moved unit test incremental to its own method

* updates readme and version for incremental tests

* Enable macro unit tests (#79)

* Added all existing unit tests in macro form.

* Refactored the test blocks into macros blocks.
Replaced the original code inside the test blocks with the new macros.

* Adjusted the render logic for the input mapping.
This has to be done for the values and the keys. This is necessary so that the unit tests can be executed as macros.

* Added explanations to the README on how to setup the tests as macros.

---------

Co-authored-by: Leopold Gabelmann <leopold.gabelmann@waschbaer.de>

* fixes merge issues

* minor fixes

* oops

* thought i got rid of that already

* o i see

* adds exclude_columns

* adds exclude_columns to test

* bumps versions

---------

Co-authored-by: Petter Söderlund <petter.soderlund@avanza.se>
Co-authored-by: Leopold Gabelmann <23480006+LeopoldGabelmann@users.noreply.github.com>
Co-authored-by: Leopold Gabelmann <leopold.gabelmann@waschbaer.de>
4 people authored Jul 3, 2023
1 parent 15b08bf commit 5a4c311
Showing 8 changed files with 107 additions and 19 deletions.
37 changes: 36 additions & 1 deletion README.md
@@ -20,7 +20,7 @@ and test that the model produces the desired output (using another CSV seed).
1. Install this package by adding the following to your `packages.yml` file:
- ```yaml
- package: mjirv/dbt_datamocktool
version: [">=0.3.0"]
version: [">=0.3.2"]
```
2. Create your mocks: sample inputs for your models and the expected outputs of those models given the inputs.
- Save them to your seeds directory (usually `data/`; note that you can use any folder structure you would like within that directory)
@@ -196,3 +196,38 @@ _NOTE: currently only the MERGE strategy is supported, so `unit_test_incremental
this: ref('dmt__current_state_orders_2')
expected_output: ref('dmt__expected_stg_orders_2')
```

### Set the unit tests as macros

Unit tests can also be defined as macro calls. The drawback is that the tests are no longer all defined in one place, i.e. the yml file of the model. In return, it becomes easy to run a specific unit test, and selection is simpler when the tests run in a CI/CD pipeline, since all of them can be included or excluded via their folder path within `tests/`.

To set up a new test, create a file within the `tests/` folder like this:

```sql
{{ dbt_datamocktool.unit_test(
model = ref('stg_customers'),
input_mapping = {
source('jaffle_shop', 'raw_customers'): ref('dmt__raw_customers_1')
},
expected_output = ref('dmt__expected_stg_customers_1'),
) }}
```

The other configuration options, such as including only specific columns, can be added in the same way as in the yml files. If a specification consists of multiple items, it has to be set up explicitly as a dictionary. If a key contains multiple values, they have to be given as a list. A complete example looks like this:

```sql
{{ dbt_datamocktool.unit_test(
model = ref('<model_to_test>'),
input_mapping = {
ref('<input_one>'): ref('<replacement_one>'),
ref('<input_two>'): ref('<replacement_two>')
},
expected_output = ref('<expected_output>'),
name = '<Name of the unit test>',
description = '<Description of the unit test>',
compare_columns = ['<col_one>', '<col_two>'],
depends_on = [ref('<dependency_one>'), ref('<dependency_two>')],
)}}
```
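
Columns can also be left out of the comparison with `exclude_columns`, which the `unit_test` macro accepts alongside `compare_columns`. A minimal sketch, assuming the seeds from the first example and a `description` column that should be ignored (both are illustrative and depend on your project):

```sql
{#- Compare every column except `description`. -#}
{{ dbt_datamocktool.unit_test(
    model = ref('stg_customers'),
    input_mapping = {
        source('jaffle_shop', 'raw_customers'): ref('dmt__raw_customers_1')
    },
    expected_output = ref('dmt__expected_stg_customers_1'),
    exclude_columns = ['description']
) }}
```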

Currently, multiple tests cannot be defined in one file; each file may contain only one test macro call.
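
Incremental models can be tested the same way through `dbt_datamocktool.unit_test_incremental`, where the `"this"` key of the input mapping supplies the mocked current state of the model. A sketch, assuming the seed names from the yml example above and a `dmt__raw_orders_3` seed for the new rows; it lives in its own file under `tests/`, and only the MERGE strategy is supported, as noted earlier:

```sql
{#- Unit test for an incremental model; "this" is replaced by the mocked current state. -#}
{{ dbt_datamocktool.unit_test_incremental(
    model = ref('stg_orders'),
    input_mapping = {
        ref('raw_orders'): ref('dmt__raw_orders_3'),
        "this": ref('dmt__current_state_orders_2')
    },
    expected_output = ref('dmt__expected_stg_orders_2')
) }}
```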
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -2,7 +2,7 @@
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: "dbt_datamocktool"
version: "0.3.1"
version: "0.3.2"
config-version: 2

require-dbt-version: [">=1.3.0"] # weird behavior/errors with the ref() function before 0.19.2
9 changes: 9 additions & 0 deletions integration_tests/tests/unit/staging/test_stg_customers.sql
@@ -0,0 +1,9 @@
{#- Unit test with seeds as the input as well as the output. -#}
{{ dbt_datamocktool.unit_test(
model = ref('stg_customers'),
input_mapping = {
source('jaffle_shop', 'raw_customers'): ref('dmt__raw_customers_1')
},
expected_output = ref('dmt__expected_stg_customers_1'),
exclude_columns = ['description']
) }}
@@ -0,0 +1,9 @@
{#- Unit test with macro as the input and a model as the expected output. -#}
{{ dbt_datamocktool.unit_test(
model = ref('stg_customers'),
input_mapping = {
source('jaffle_shop', 'raw_customers'): "{{ dmt_raw_customers() }}"
},
expected_output = ref('dmt__expected_stg_customers_2'),
name = "This test is a unit test",
) }}
7 changes: 7 additions & 0 deletions integration_tests/tests/unit/staging/test_stg_orders.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{{ dbt_datamocktool.unit_test(
model = ref('stg_orders'),
input_mapping = {
ref('raw_orders'): ref('dmt__raw_orders_1')
},
expected_output = ref('dmt__expected_stg_orders_1'),
) }}
@@ -0,0 +1,9 @@
{#- Unit test for an incremental model that redefines `{{this}}`. -#}
{{ dbt_datamocktool.unit_test_incremental(
model = ref('stg_orders'),
input_mapping = {
ref('raw_orders'): ref('dmt__raw_orders_3'),
"this": ref('dmt__current_state_orders_2')
},
expected_output = ref('dmt__expected_stg_orders_2'),
) }}
37 changes: 24 additions & 13 deletions macros/dmt_get_test_sql.sql
@@ -1,19 +1,20 @@
{% macro get_unit_test_sql(model, input_mapping, depends_on) %}
{% set ns=namespace(
test_sql="(select 1) raw_code",
rendered_keys={},
rendered_mappings={},
graph_model=none
) %}

{% do dbt_datamocktool.__set_rendered_keys(ns, input_mapping.keys()) %}
{% do dbt_datamocktool.__set_rendered_mappings(ns, input_mapping) %}

{% if execute %}
{# inside an execute block because graph nodes aren't well-defined during parsing #}
{% set ns.graph_model = dbt_datamocktool.__get_graph_model(project_name, model.schema, model.name) %}
{% set ns.test_sql = ns.graph_model.raw_code %}

{% do dbt_datamocktool.__render_sql_and_replace_references(ns, input_mapping) %}

{# mock_model_relation is the mocked model name #}
{% set mock_model_relation = dbt_datamocktool._get_model_to_mock(
model, suffix=('_dmt_' ~ modules.datetime.datetime.now().strftime("%S%f"))
) %}
@@ -32,12 +33,12 @@
{% macro get_unit_test_incremental_sql(model, input_mapping, depends_on) %}
{% set ns=namespace(
test_sql="(select 1) raw_code",
rendered_keys={},
rendered_mappings={},
graph_model=none
) %}

{# doing this outside the execute block allows dbt to infer the proper dependencies #}
{% do dbt_datamocktool.__set_rendered_keys(ns, input_mapping.keys()) %}
{% do dbt_datamocktool.__set_rendered_mappings(ns, input_mapping) %}

{% if execute %}
{# inside an execute block because graph nodes aren't well-defined during parsing #}
@@ -55,7 +56,6 @@
{% set ns.test_sql = ns.test_sql|replace(this.dataset, model.dataset) %}
{% set ns.test_sql = ns.test_sql|replace(this.table, input_mapping.this) %}

{# mock_model_relation is the mocked model name #}
{% set mock_model_relation = dbt_datamocktool._get_model_to_mock(
model, suffix=('_dmt_' ~ modules.datetime.datetime.now().strftime("%S%f"))
) %}
@@ -124,9 +124,9 @@
{% do run_query(get_merge_sql(model, test_sql, dest_columns=dest_columns, unique_key=unique_key)) %}
{% endmacro %}

{% macro __set_rendered_keys(ns, keys) %}
{% for k in keys %}
{% do ns.rendered_keys.update({k: render("{{ " + k + " }}")}) %}
{% macro __set_rendered_mappings(ns, input_mapping) %}
{% for k, v in input_mapping.items() %}
{% do ns.rendered_mappings.update({k: render(v)}) %}
{% endfor %}
{% endmacro %}

@@ -144,9 +144,20 @@
{{ return(graph_model) }}
{% endmacro %}

{% macro __render_sql_and_replace_references(ns, input_mapping) %}
{% for k,v in input_mapping.items() %}
{# render the original sql and replacement key before replacing because v is already rendered when it is passed to this test #}
{% set ns.test_sql = render(ns.test_sql)|replace(ns.rendered_keys[k], v) %}
{% macro __render_sql_and_replace_references(ns, input_mapping) %}
{#- Replace the keys first, before the sql code is rendered -#}
{% for k, v in ns.rendered_mappings.items() %}
{% set ns.test_sql = ns.test_sql|replace("{{ "~render(k)~" }}", v) %}
{% endfor %}

{#- Render the original sql after all reference values are set according to the provided input
mapping. -#}
{% set ns.test_sql = render(ns.test_sql) %}

{#- Replace left over rendered keys with their reference values. This is only necessary, if the
unit test is defined within a macro, since then the input mapping is already rendered within
the macro itself.-#}
{% for k, v in ns.rendered_mappings.items() %}
{% set ns.test_sql = ns.test_sql|replace(k, v) %}
{% endfor %}
{% endmacro %}
16 changes: 12 additions & 4 deletions macros/dmt_unit_test.sql
@@ -1,14 +1,22 @@
{%- test unit_test(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) -%}
{%- macro unit_test(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) -%}
{%- set test_sql = dbt_datamocktool.get_unit_test_sql(model, input_mapping, depends_on)|trim -%}
{%- set test_report = dbt_datamocktool.test_equality(expected_output, name, compare_model=test_sql, compare_columns=compare_columns, exclude_columns=exclude_columns) -%}
{{ test_report }}
{%- endtest -%}
{%- endmacro -%}

{% test unit_test_incremental(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) %}
{%- test unit_test(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) -%}
{{ dbt_datamocktool.unit_test(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) }}
{%- endtest -%}

{% macro unit_test_incremental(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) %}
{%- set test_sql = dbt_datamocktool.get_unit_test_incremental_sql(model, input_mapping, depends_on)|trim -%}
{%- set test_report = dbt_datamocktool.test_equality(expected_output, name, compare_model=test_sql, compare_columns=compare_columns, exclude_columns=exclude_columns) -%}
{{ test_report }}
{% endtest %}
{% endmacro %}

{% test unit_test_incremental(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) %}
{{ dbt_datamocktool.unit_test_incremental(model, input_mapping, expected_output, name, description, compare_columns, exclude_columns, depends_on) }}
{% endtest %}

{%- macro test_equality(model, name, compare_model, compare_columns=[], exclude_columns=[]) -%}
