Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Er/3250 generic tests #4052

Merged
merged 16 commits into from
Oct 21, 2021
Merged

Er/3250 generic tests #4052

merged 16 commits into from
Oct 21, 2021

Conversation

emmyoop
Copy link
Member

@emmyoop emmyoop commented Oct 13, 2021

resolves #3250

Description

Adding ability to have generic tests in the /tests directory.

  • added new ParseFileType GenericTest and renamed Test to be SingularTest
  • added new GenericTestParser
  • parse generic tests in /tests directly after macros and before everything else.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change

@cla-bot cla-bot bot added the cla:yes label Oct 13, 2021
@emmyoop emmyoop requested a review from gshank October 13, 2021 16:23
@@ -44,6 +44,12 @@ class UnparsedMacro(UnparsedBaseNode, HasSQL):
resource_type: NodeType = field(metadata={'restrict': [NodeType.Macro]})


@dataclass
class UnparsedGenericTest(UnparsedBaseNode, HasSQL):
# TODO: should this be nodeType macro?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long run we're probably going to want different NodeTypes for singular test and generic test. I don't know if that's something that we want to do now or not though.


self.build_macro_resolver()
# Look at changed macros and update the macro.depends_on.macros
self.macro_depends_on()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to do this twice, once for macros and once for tests. It should be done just once, after both macros and tests are in the 'macros' dictionary.

project, files, project.test_paths, '.sql', ParseFileType.SingularTest, saved_files
)

project_files['GenericTestParser'] = read_files_for_parser(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a problem to me. Every file in the 'tests' directory is going to be added to both the SingularTestParser and the GenericTestParser. I'm not exactly sure what the implications are, but I think it will make partial parsing for these files impossible. I think we might have to peek into the files and allocate them to either Singular or Generic, not both. I'm going to think about this some more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the main thing I don't like about this approach. It does work as I understand the tests to be structured. Singular tests end up getting ignored by generic tests because they're not in a jinja test block. And I'm purposefully ignoring generic tests in the singular test code by specifically checking to see if they're in jinja blocks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File objects are saved in the 'files' dictionary under the file_id key, so the second run over the files would save a duplicated file object except with a differe parse_file_type.

We could use a different filetype for the generic tests, but longterm we may want to make "singular" tests look more like generic tests and put them in a file block, so that might not be the right path. We could also put them in a subdirectory of tests, like 'generic' (Jeremy's suggestion). We would also parse all of the tests when the test parser is processed, not at macro processing time, and have to separately do the 'macro_depends_on' for just tests. As long as the generic tests are in 'macros' prior to schema file processing it should be okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current vote is for a special directory, tests/generic. That's a little jank, but it feels like a step in the right direction, and I like it more than having generic tests defined in macros/ forever.

What Gerda and I were discussing is an eventual future where all tests are defined as blocks, some with "singular" construction and others with "generic" construction, and they could be intermixed in the same .sql files:

-- tests/my_tests.sql

{% test singular_not_null %}
  select *
  from {{ ref('orders') }}
  where id is null
{% endtest %}

{% test not_null(model, column_name) %}
  select *
  from {{ model }}
  where {{ column_name }} is null
{% endtest %}

There's history for this proposal (#2274) that goes pretty far back for all resource types (#184). Snapshots are the only real DAG node that ever made the switchover, from files to blocks. It's something we once imagined for v1.0, but deprioritized. It has the side-effect of making these queries harder, though certainly not impossible, to run in IDEs, since line/cursor location becomes important.

In the meantime, we can take an important step in that direction, without needing to get all the way there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tweaked it to look for a generic folder in test-paths. Is this more along the lines of how you were thinking it would work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks good. We do end up with some generic tests having a Macro parse _file_type and some with a GenericTest parse file type. I'm not sure that's a problem, but it is something to be watchful for.

@emmyoop emmyoop force-pushed the er/3250-generic-tests branch 2 times, most recently from 8ad1780 to 6ee9771 Compare October 18, 2021 13:31
@emmyoop emmyoop marked this pull request as ready for review October 18, 2021 15:46
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functional change here looks mostly good. Thanks for the flexibility in working toward a way to achieve this functionality ahead of v1.

I noticed that it was quite easy to trip a partial parsing error. Create any test in tests/generic/my_cool_test.sql, parse the project, edit the file in any way, and re-parse the project (with partial parsing enabled):

Encountered an error:
Compilation Error
  dbt found two macros named "test_my_cool_test" in the project "testy".
   To fix this error, rename or remove one of the following macros:
      - tests/generic/my_cool_test.sql
      - tests/generic/my_cool_test.sql

I only have one test there by that name, so it's an issue with not removing the test macro before scheduling it for re-parsing.

@gshank
Copy link
Contributor

gshank commented Oct 19, 2021

In core/dbt/parser/partial.py, the GenericTest filetype does not belong in the mssat_files. It needs to be handled with the 'Macro' files. There are a number of places in core/dbt/parser/partial.py where it checks for a ParseFileType of Macro. Those places also need to check for a ParseFileType of GenericTest. You could make a list of the two of them, like the mssat_files list.

@gshank
Copy link
Contributor

gshank commented Oct 19, 2021

It would probably be a good idea to add a partial parsing test with a test/generic test file to the 068_partial_parsing_tests.

Copy link
Contributor

@gshank gshank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@jtcohen6
Copy link
Contributor

The substance of this change looks good to me.

As above, I'm seeing an issue with partial parsing. It no longer raises an exception, but every time I edit a test defined in test/generic, I see:

Partial parsing enabled but an error occurred. Switching to a full re-parse.

That's a step in the right direction, but I'd like to resolve that exception before merging. I threw a debugger into the partial parsing code, and it looks like the exception I'm tripping is:

Traceback (most recent call last):
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/manifest.py", line 233, in load
    project_parser_files = self.partial_parser.get_parsing_files()
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/partial.py", line 156, in get_parsing_files
    self.update_in_saved(file_id)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/partial.py", line 238, in update_in_saved
    self.update_macro_in_saved(new_source_file, old_source_file)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/partial.py", line 322, in update_macro_in_saved
    self.handle_macro_file_links(old_source_file, follow_references=True)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/partial.py", line 445, in handle_macro_file_links
    self.recursively_gather_macro_references(unique_id, referencing_nodes)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/partial.py", line 418, in recursively_gather_macro_references
    for unique_id in self.macro_child_map[macro_unique_id]:
KeyError: 'test.testy.test_my_cool_test'

@emmyoop emmyoop merged commit f79a968 into main Oct 21, 2021
@emmyoop emmyoop deleted the er/3250-generic-tests branch October 21, 2021 16:42
@jtcohen6 jtcohen6 mentioned this pull request Oct 28, 2021
3 tasks
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
* removed overlooked breakpoint

* first pass

* save progress - singualr tests broken

* fixed to work with both generic and singular tests

* fixed formatting

* added a comment

* change to use /generic subfolder

* fix formatting issues

* fixed bug on code consolidation

* fixed typo

* added test for generic tests

* added changelog entry

* added logic to treat generic tests like macro tests

* add generic test to macro_edges

* fixed generic tests to match unique_ids

* fixed test

automatic commit by git-black, original commits:
  f79a968
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

define generic tests in configured tests-paths directory
3 participants