Ensure MicrobatchModelRunner doesn't double compile batches
We were compiling the node for each batch _twice_. Beyond making microbatch
models more expensive than they needed to be, the double compilation itself
wasn't causing any issue. However, the first compilation happened _before_ we
had added the batch context information to the model node for the batch, so
models that try to access the `batch_context` information on the model would
blow up. We now skip that first compilation, in the same way SavedQuery nodes
skip compilation.
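To make the failure concrete, here is a minimal, illustrative sketch (plain Jinja, not dbt's actual rendering pipeline) of why compiling a template before its batch context exists blows up, while compiling after the context is attached works:

```python
# Illustrative only -- not dbt internals. A template that references batch
# context cannot render until that context has actually been provided.
from jinja2 import Environment, StrictUndefined, UndefinedError

env = Environment(undefined=StrictUndefined)
template = env.from_string(
    "select * from events where event_time >= '{{ batch_context.start }}'"
)

# Compiling before any batch context has been attached -> blows up.
try:
    template.render()
except UndefinedError as exc:
    print(f"compile before batch context is set fails: {exc}")

# Compiling after the batch context is attached succeeds.
print(template.render(batch_context={"start": "2024-11-01"}))
```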
QMalcolm committed Nov 27, 2024
1 parent 585fb04 commit 45daec7
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions core/dbt/task/run.py
@@ -341,6 +341,13 @@ def __init__(self, config, adapter, node, node_index: int, num_nodes: int):
        self.batches: Dict[int, BatchType] = {}
        self.relation_exists: bool = False

    def compile(self, manifest: Manifest):
        # The default compile function is _always_ called. However, we do our
        # compilation _later_ in `_execute_microbatch_materialization`. This
        # meant the node was being compiled _twice_ for each batch. To get around
        # this, we've overridden the default compile method to do nothing.
        return self.node

    def set_batch_idx(self, batch_idx: int) -> None:
        self.batch_idx = batch_idx

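For context, here is a rough sketch of the control flow this override relies on (hypothetical class and method names, not dbt's actual internals): the task framework always calls `compile()` on a runner before `execute()`, so a no-op override is what lets the real compilation wait until each batch's context has been attached.

```python
# Simplified, illustrative sketch of the runner control flow; names are
# hypothetical and do not match dbt's real classes.

class Node:
    def __init__(self):
        self.batch_context = None


class BaseRunner:
    def __init__(self, node):
        self.node = node

    def compile(self, node):
        # Stand-in for real compilation: fails if batch context is missing.
        if node.batch_context is None:
            raise RuntimeError("batch_context not set yet")
        return f"compiled with batch {node.batch_context}"

    def run(self):
        node = self.compile(self.node)  # the framework always compiles first
        return self.execute(node)


class MicrobatchRunner(BaseRunner):
    def __init__(self, node, batches):
        super().__init__(node)
        self.batches = batches

    def compile(self, node):
        # No-op: defer compilation until the batch context exists.
        return node

    def execute(self, node):
        results = []
        for batch in self.batches:
            node.batch_context = batch             # attach the batch context...
            results.append(super().compile(node))  # ...then compile, once per batch
        return results


print(MicrobatchRunner(Node(), ["2024-11-01", "2024-11-02"]).run())
```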
