Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8295 unit testing artifacts #8477

Merged
merged 28 commits into from
Aug 29, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
67baf75
Move unit tests classes out of unparsed. Add unit_tests to Manifest
gshank Aug 17, 2023
30c44fc
Move initial unit test parsing to SchemaParser, separate out unit test
gshank Aug 17, 2023
91a8494
Create UnparsedUnitTestSuite, give UnitTestSuite BaseNode fields
gshank Aug 17, 2023
87ce1d9
Add attached_node to UnitTestSuite, plus throw error if it doesn't exist
gshank Aug 18, 2023
69e0e70
partial parsing of manifest.unit_tests
gshank Aug 18, 2023
242ba29
Simplify construction of unit testing manifest somewhat. Can be added
gshank Aug 23, 2023
b6a3f01
Merge branch 'unit_testing_feature_branch' into 8295-unit_testing_art…
gshank Aug 23, 2023
bb5e1fa
Callback to reset job_queue and manifest in unit test task
gshank Aug 23, 2023
0eb5b5a
Different node_selectors
gshank Aug 23, 2023
3940cab
remove "collection", setup method to call UnitTestManifestLoader. Not
gshank Aug 23, 2023
53fa22d
"compile" the manifest and use the "selected" from the job_queue to
gshank Aug 23, 2023
a7ab834
Comments and cleanup
gshank Aug 24, 2023
1f0db54
Move code constructing unit test nodes and input nodes
gshank Aug 24, 2023
e11d8ae
Start on unit-test implementation of test_name
gshank Aug 25, 2023
55cffef
use pseudo-test-path to set unit test node path, initial steps for
gshank Aug 25, 2023
a42110b
Switch to individual unit test cases in manifest.unit_tests
gshank Aug 25, 2023
9e13179
Add unit tests to graph, adjust selection code to work.
gshank Aug 28, 2023
7b7b9ce
Write out unit_tests in manifest.json
gshank Aug 28, 2023
6b6b609
Changie
gshank Aug 28, 2023
d94c89e
Write unit test manifest out to different name
gshank Aug 28, 2023
e11a47c
Fix test_manifest.py
gshank Aug 28, 2023
ad070d7
Update artifacts test
gshank Aug 28, 2023
03f65a0
Fix unit test partial parsing to use "model" instead of "name"
gshank Aug 28, 2023
0f679a1
Merge branch 'unit_testing_feature_branch' into 8295-unit_testing_art…
gshank Aug 28, 2023
d89fce6
Combine tests and unit tests in TestNameSelectorMethod
gshank Aug 29, 2023
1ef2be1
Use args.which passed in to write_manifest
gshank Aug 29, 2023
1f413b6
rename UnitTestCase to UnitTestDefinition
gshank Aug 29, 2023
0625003
Fix accidental insertion of traceback
gshank Aug 29, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230828-101825.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
kind: Features
body: Unit test manifest artifacts and selection
time: 2023-08-28T10:18:25.958929-04:00
custom:
Author: gshank
Issue: "8295"
2 changes: 0 additions & 2 deletions core/dbt/cli/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -879,14 +879,12 @@ def test(ctx, **kwargs):
@requires.project
@requires.runtime_config
@requires.manifest
@requires.unit_test_collection
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the reasoning for loading the unit test manifest (much prefer this naming to "collection"!) to the task as opposed to as a requires decorator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A requires decorator runs before the task starts. The unit test manifest can't be created until after selection is handled, which happens during task.run.

def unit_test(ctx, **kwargs):
"""Runs tests on data in deployed models. Run this after `dbt run`"""
task = UnitTestTask(
ctx.obj["flags"],
ctx.obj["runtime_config"],
ctx.obj["manifest"],
ctx.obj["unit_test_collection"],
)

results = task.run()
Expand Down
23 changes: 0 additions & 23 deletions core/dbt/cli/requires.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@
from dbt.profiler import profiler
from dbt.tracking import active_user, initialize_from_flags, track_run
from dbt.utils import cast_dict_to_dict_of_strings
from dbt.parser.unit_tests import UnitTestManifestLoader
from dbt.plugins import set_up_plugin_manager, get_plugin_manager

from click import Context
Expand Down Expand Up @@ -266,25 +265,3 @@ def wrapper(*args, **kwargs):
if len(args0) == 0:
return outer_wrapper
return outer_wrapper(args0[0])


def unit_test_collection(func):
"""A decorator used by click command functions for generating a unit test collection provided a manifest"""

def wrapper(*args, **kwargs):
ctx = args[0]
assert isinstance(ctx, Context)

req_strs = ["manifest", "runtime_config"]
reqs = [ctx.obj.get(req_str) for req_str in req_strs]

if None in reqs:
raise DbtProjectError("manifest and runtime_config required for unit_test_collection")

collection = UnitTestManifestLoader.load(ctx.obj["manifest"], ctx.obj["runtime_config"])

ctx.obj["unit_test_collection"] = collection

return func(*args, **kwargs)

return update_wrapper(wrapper, func)
3 changes: 3 additions & 0 deletions core/dbt/compilation.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,6 +96,7 @@ def _generate_stats(manifest: Manifest):
stats[NodeType.Macro] += len(manifest.macros)
stats[NodeType.Group] += len(manifest.groups)
stats[NodeType.SemanticModel] += len(manifest.semantic_models)
stats[NodeType.Unit] += len(manifest.unit_tests)

# TODO: should we be counting dimensions + entities?

Expand Down Expand Up @@ -196,6 +197,8 @@ def link_graph(self, manifest: Manifest):
self.link_node(exposure, manifest)
for metric in manifest.metrics.values():
self.link_node(metric, manifest)
for unit_test in manifest.unit_tests.values():
self.link_node(unit_test, manifest)

cycle = self.find_cycles()

Expand Down
1 change: 1 addition & 0 deletions core/dbt/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@
MANIFEST_FILE_NAME = "manifest.json"
SEMANTIC_MANIFEST_FILE_NAME = "semantic_manifest.json"
PARTIAL_PARSE_FILE_NAME = "partial_parse.msgpack"
UNIT_TEST_MANIFEST_FILE_NAME = "unit_test_manifest.json"
1 change: 1 addition & 0 deletions core/dbt/contracts/files.py
Original file line number Diff line number Diff line change
Expand Up @@ -231,6 +231,7 @@ class SchemaSourceFile(BaseSourceFile):
# node patches contain models, seeds, snapshots, analyses
ndp: List[str] = field(default_factory=list)
semantic_models: List[str] = field(default_factory=list)
unit_tests: List[str] = field(default_factory=list)
# any macro patches in this file by macro unique_id.
mcp: Dict[str, str] = field(default_factory=dict)
# any source patches in this file. The entries are package, name pairs
Expand Down
18 changes: 18 additions & 0 deletions core/dbt/contracts/graph/manifest.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@
SemanticModel,
SourceDefinition,
UnpatchedSourceDefinition,
UnitTestDefinition,
)
from dbt.contracts.graph.unparsed import SourcePatch, NodeVersion, UnparsedVersion
from dbt.contracts.graph.manifest_upgrade import upgrade_manifest_json
Expand Down Expand Up @@ -742,6 +743,7 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
disabled: MutableMapping[str, List[GraphMemberNode]] = field(default_factory=dict)
env_vars: MutableMapping[str, str] = field(default_factory=dict)
semantic_models: MutableMapping[str, SemanticModel] = field(default_factory=dict)
unit_tests: MutableMapping[str, UnitTestDefinition] = field(default_factory=dict)

_doc_lookup: Optional[DocLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
Expand Down Expand Up @@ -895,6 +897,7 @@ def deepcopy(self):
files={k: _deepcopy(v) for k, v in self.files.items()},
state_check=_deepcopy(self.state_check),
semantic_models={k: _deepcopy(v) for k, v in self.semantic_models.items()},
unit_tests={k: _deepcopy(v) for k, v in self.unit_tests.items()},
)
copy.build_flat_graph()
return copy
Expand Down Expand Up @@ -954,6 +957,7 @@ def writable_manifest(self) -> "WritableManifest":
parent_map=self.parent_map,
group_map=self.group_map,
semantic_models=self.semantic_models,
unit_tests=self.unit_tests,
)

def write(self, path):
Expand All @@ -972,6 +976,8 @@ def expect(self, unique_id: str) -> GraphMemberNode:
return self.metrics[unique_id]
elif unique_id in self.semantic_models:
return self.semantic_models[unique_id]
elif unique_id in self.unit_tests:
return self.unit_tests[unique_id]
else:
# something terrible has happened
raise dbt.exceptions.DbtInternalError(
Expand Down Expand Up @@ -1374,6 +1380,12 @@ def add_semantic_model(self, source_file: SchemaSourceFile, semantic_model: Sema
self.semantic_models[semantic_model.unique_id] = semantic_model
source_file.semantic_models.append(semantic_model.unique_id)

def add_unit_test(self, source_file: SchemaSourceFile, unit_test: UnitTestDefinition):
if unit_test.unique_id in self.unit_tests:
raise DuplicateResourceNameError(unit_test, self.unit_tests[unit_test.unique_id])
self.unit_tests[unit_test.unique_id] = unit_test
source_file.unit_tests.append(unit_test.unique_id)

# end of methods formerly in ParseResult

# Provide support for copy.deepcopy() - we just need to avoid the lock!
Expand Down Expand Up @@ -1401,6 +1413,7 @@ def __reduce_ex__(self, protocol):
self.disabled,
self.env_vars,
self.semantic_models,
self.unit_tests,
self._doc_lookup,
self._source_lookup,
self._ref_lookup,
Expand Down Expand Up @@ -1479,6 +1492,11 @@ class WritableManifest(ArtifactMixin):
description="Metadata about the manifest",
)
)
unit_tests: Mapping[UniqueID, UnitTestDefinition] = field(
metadata=dict(
description="The unit tests defined in the project",
)
)

@classmethod
def compatible_previous_versions(self):
Expand Down
3 changes: 3 additions & 0 deletions core/dbt/contracts/graph/manifest_upgrade.py
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,9 @@ def upgrade_manifest_json(manifest: dict, manifest_schema_version: int) -> dict:
manifest["groups"] = {}
if "group_map" not in manifest:
manifest["group_map"] = {}
# add unit_tests key
if "unit_tests" not in manifest:
manifest["unit_tests"] = {}
for metric_content in manifest.get("metrics", {}).values():
# handle attr renames + value translation ("expression" -> "derived")
metric_content = upgrade_ref_content(metric_content)
Expand Down
21 changes: 19 additions & 2 deletions core/dbt/contracts/graph/nodes.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,8 @@
UnparsedSourceDefinition,
UnparsedSourceTableDefinition,
UnparsedColumn,
UnparsedUnitTestOverrides,
UnitTestOverrides,
InputFixture,
)
from dbt.contracts.graph.node_args import ModelNodeArgs
from dbt.contracts.util import Replaceable, AdditionalPropertiesMixin
Expand Down Expand Up @@ -1055,7 +1056,22 @@ def test_node_type(self):
class UnitTestNode(CompiledNode):
resource_type: NodeType = field(metadata={"restrict": [NodeType.Unit]})
attached_node: Optional[str] = None
overrides: Optional[UnparsedUnitTestOverrides] = None
overrides: Optional[UnitTestOverrides] = None


@dataclass
class UnitTestDefinition(GraphNode):
model: str
attached_node: str
given: Sequence[InputFixture]
expect: List[Dict[str, Any]]
description: str = ""
overrides: Optional[UnitTestOverrides] = None
depends_on: DependsOn = field(default_factory=DependsOn)

@property
def depends_on_nodes(self):
return self.depends_on.nodes


# ====================================
Expand Down Expand Up @@ -1754,6 +1770,7 @@ class ParsedMacroPatch(ParsedPatch):
Exposure,
Metric,
SemanticModel,
UnitTestDefinition,
]

# All "nodes" (or node-like objects) in this file
Expand Down
56 changes: 28 additions & 28 deletions core/dbt/contracts/graph/unparsed.py
Original file line number Diff line number Diff line change
Expand Up @@ -671,34 +671,6 @@ def validate(cls, data):
raise ValidationError("Group owner must have at least one of 'name' or 'email'.")


@dataclass
class UnparsedInputFixture(dbtClassMixin):
input: str
rows: List[Dict[str, Any]] = field(default_factory=list)


@dataclass
class UnparsedUnitTestOverrides(dbtClassMixin):
macros: Dict[str, Any] = field(default_factory=dict)
vars: Dict[str, Any] = field(default_factory=dict)
env_vars: Dict[str, Any] = field(default_factory=dict)


@dataclass
class UnparsedUnitTestCase(dbtClassMixin):
name: str
given: Sequence[UnparsedInputFixture]
expect: List[Dict[str, Any]]
description: str = ""
overrides: Optional[UnparsedUnitTestOverrides] = None


@dataclass
class UnparsedUnitTestSuite(dbtClassMixin):
model: str # name of the model being unit tested
tests: Sequence[UnparsedUnitTestCase]


#
# semantic interfaces unparsed objects
#
Expand Down Expand Up @@ -773,3 +745,31 @@ def normalize_date(d: Optional[datetime.date]) -> Optional[datetime.datetime]:
dt = dt.astimezone()

return dt


@dataclass
class InputFixture(dbtClassMixin):
input: str
rows: List[Dict[str, Any]] = field(default_factory=list)


@dataclass
class UnitTestOverrides(dbtClassMixin):
macros: Dict[str, Any] = field(default_factory=dict)
vars: Dict[str, Any] = field(default_factory=dict)
env_vars: Dict[str, Any] = field(default_factory=dict)


@dataclass
class UnparsedUnitTestDefinition(dbtClassMixin):
name: str
given: Sequence[InputFixture]
expect: List[Dict[str, Any]]
description: str = ""
overrides: Optional[UnitTestOverrides] = None


@dataclass
class UnparsedUnitTestSuite(dbtClassMixin):
model: str # name of the model being unit tested
tests: Sequence[UnparsedUnitTestDefinition]
13 changes: 11 additions & 2 deletions core/dbt/graph/selector.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@ def can_select_indirectly(node):
"""
if node.resource_type == NodeType.Test:
return True
elif node.resource_type == NodeType.Unit:
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that this is to power selection for the unit test task, but does not actually include unit tests in other task execution because UnitTestNodes are not in the manifest. Is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case the objects with NodeType.Unit are actually the UnitTestCases in manifest.unit_tests, which are in the manifest. They're not included in other tasks because each task has a Selector which filters out other resource types. So unit test cases aren't included in the "run" task job_queue because it only executes NodeType.Model.

else:
return False

Expand Down Expand Up @@ -170,6 +172,8 @@ def _is_graph_member(self, unique_id: UniqueId) -> bool:
return metric.config.enabled
elif unique_id in self.manifest.semantic_models:
return True
elif unique_id in self.manifest.unit_tests:
return True
node = self.manifest.nodes[unique_id]

if self.include_empty_nodes:
Expand All @@ -195,6 +199,8 @@ def _is_match(self, unique_id: UniqueId) -> bool:
node = self.manifest.metrics[unique_id]
elif unique_id in self.manifest.semantic_models:
node = self.manifest.semantic_models[unique_id]
elif unique_id in self.manifest.unit_tests:
node = self.manifest.unit_tests[unique_id]
else:
raise DbtInternalError(f"Node {unique_id} not found in the manifest!")
return self.node_is_match(node)
Expand Down Expand Up @@ -240,8 +246,11 @@ def expand_selection(
)

for unique_id in self.graph.select_successors(selected):
if unique_id in self.manifest.nodes:
node = self.manifest.nodes[unique_id]
if unique_id in self.manifest.nodes or unique_id in self.manifest.unit_tests:
if unique_id in self.manifest.nodes:
node = self.manifest.nodes[unique_id]
elif unique_id in self.manifest.unit_tests:
node = self.manifest.unit_tests[unique_id] # type: ignore
if can_select_indirectly(node):
# should we add it in directly?
if indirect_selection == IndirectSelection.Eager or set(
Expand Down
Loading