Merge branch 'current' into patch-1
jairus-m authored Nov 9, 2024
2 parents 36269e1 + dc911a3 commit 071dcfa
Showing 38 changed files with 67,227 additions and 159 deletions.
15 changes: 0 additions & 15 deletions README.md
@@ -62,18 +62,3 @@ You can click a link available in a Vercel bot PR comment to see and review your

Advisory:
- If you run into a `fatal error: 'vips/vips8' file not found` error when you run `npm install`, you may need to run `brew install vips`. Warning: this one will take a while -- go ahead and grab some coffee!
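
Condensed as a shell sketch (macOS with Homebrew assumed):

```bash
# Install libvips first, then retry the install -- the brew step can take a while
brew install vips
npm install
```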

## Running the Cypress tests locally

Method 1: Utilizing the Cypress GUI
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress open` to open the Cypress GUI, choose `E2E Testing` as the Testing Type, then select your browser and click `Start E2E testing in {browser}`
5. Click on a test and watch it run!

Method 2: Running the Cypress E2E tests headlessly
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress run`
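
The same steps condense to the following shell sketch (assuming you cloned into `docs.getdbt.com`):

```bash
# Headless Cypress run, equivalent to Method 2 steps 1-4
cd docs.getdbt.com/website
npm install
npx cypress run
```
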
67 changes: 0 additions & 67 deletions contributing/developer-blog.md

This file was deleted.

11 changes: 2 additions & 9 deletions website/dbt-versions.js
@@ -15,7 +15,7 @@
*/
exports.versions = [
{
version: "1.9.1",
version: "1.10",
customDisplay: "Cloud (Versionless)",
},
{
@@ -74,12 +74,5 @@ exports.versionedPages = [
* @property {string} firstVersion The first version the category is visible in the sidebar
*/
exports.versionedCategories = [
{
category: "Model governance",
firstVersion: "1.5",
},
{
category: "Build your metrics",
firstVersion: "1.6",
},

];
2 changes: 1 addition & 1 deletion website/docs/docs/build/dimensions.md
@@ -67,7 +67,7 @@ semantic_models:
type: categorical
```
Dimensions are bound to the primary entity of the semantic model they are defined in. For example the dimensoin `type` is defined in a model that has `transaction` as a primary entity. `type` is scoped to the `transaction` entity, and to reference this dimension you would use the fully qualified dimension name i.e `transaction__type`.
Dimensions are bound to the primary entity of the semantic model they are defined in. For example, the dimension `type` is defined in a model that has `transaction` as a primary entity. `type` is scoped to the `transaction` entity, and to reference this dimension you would use the fully qualified dimension name, i.e. `transaction__type`.

MetricFlow requires that all semantic models have a primary entity. This is to guarantee unique dimension names. If your data source doesn't have a primary entity, you need to assign the entity a name using the `primary_entity` key. It doesn't necessarily have to map to a column in that table and assigning the name doesn't affect query generation. We recommend making these "virtual primary entities" unique across your semantic model. An example of defining a primary entity for a data source that doesn't have a primary entity column is below:
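
A minimal sketch of such a definition (the model, entity, and column names here are illustrative, not taken from the docs):

```yaml
semantic_models:
  - name: transactions
    model: ref('fact_transactions')   # illustrative model name
    # No natural primary-key column exists, so assign a virtual primary entity;
    # it doesn't need to map to a column and doesn't affect query generation.
    primary_entity: transaction
    entities:
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: type
        type: categorical
```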

2 changes: 1 addition & 1 deletion website/docs/docs/build/incremental-microbatch.md
@@ -8,7 +8,7 @@ id: "incremental-microbatch"

:::info Microbatch

The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and dbt Core v1.9. We have been developing it behind a flag to prevent unintended interactions with existing custom incremental strategies. To enable this feature, set the environment variable `DBT_EXPERIMENTAL_MICROBATCH` to `True` in your dbt Cloud environments or wherever you're running dbt Core.
The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and dbt Core v1.9. We have been developing it behind a flag to prevent unintended interactions with existing custom incremental strategies. To enable this feature, [set the environment variable](/docs/build/environment-variables#setting-and-overriding-environment-variables) `DBT_EXPERIMENTAL_MICROBATCH` to `True` in your dbt Cloud environments or wherever you're running dbt Core.
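
For dbt Core, enabling the flag can be as simple as exporting the variable before invoking dbt (shell sketch; the model name is hypothetical):

```bash
# Opt in to the beta microbatch strategy, then build the model
export DBT_EXPERIMENTAL_MICROBATCH=True
dbt run --select my_microbatch_model
```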

Read and participate in the discussion: [dbt-core#10672](https://github.com/dbt-labs/dbt-core/discussions/10672)

4 changes: 2 additions & 2 deletions website/docs/docs/build/incremental-models.md
@@ -212,11 +212,11 @@ Currently, `on_schema_change` only tracks top-level column changes. It does not

### Default behavior

This is the behavior if `on_schema_change: ignore`, which is set by default, and on older versions of dbt.
This is the behavior of `on_schema_change: ignore`, which is set by default.

If you add a column to your incremental model, and execute a `dbt run`, this column will _not_ appear in your target table.

Similarly, if you remove a column from your incremental model, and execute a `dbt run`, this column will _not_ be removed from your target table.
If you remove a column from your incremental model and execute a `dbt run`, `dbt run` will fail.

Instead, whenever the logic of your incremental model changes, execute a full-refresh run of both your incremental model and any downstream models.
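
A sketch of making the default explicit in a model's properties file (the model name is hypothetical):

```yaml
models:
  - name: my_incremental_model
    config:
      materialized: incremental
      on_schema_change: ignore   # default: added columns won't appear in the target table
```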

2 changes: 1 addition & 1 deletion website/docs/docs/build/incremental-strategy.md
@@ -27,7 +27,7 @@ Click the name of the adapter in the below table for more information about supp
| Data platform adapter | `append` | `merge` | `delete+insert` | `insert_overwrite` | `microbatch` <Lifecycle status="beta"/> |
|-----------------------|:--------:|:-------:|:---------------:|:------------------:|:-------------------:|
| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) |||| ||
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) |||| | |
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) |||| | |
| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | || |||
| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) ||| |||
| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) ||| || |
18 changes: 9 additions & 9 deletions website/docs/docs/build/measures.md
@@ -200,7 +200,7 @@ Parameters under the `non_additive_dimension` will specify dimensions that the m

```yaml
semantic_models:
- name: subscription_id
- name: subscriptions
description: A subscription table with one row per date for each active user and their subscription plans.
model: ref('your_schema.subscription_table')
defaults:
@@ -209,7 +209,7 @@
entities:
- name: user_id
type: foreign
primary_entity: subscription_table
primary_entity: subscription
dimensions:
- name: subscription_date
@@ -224,21 +224,21 @@
expr: user_id
agg: count_distinct
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
- name: mrr
description: Aggregate by summing all users' active subscription plans
expr: subscription_value
agg: sum
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
- name: user_mrr
description: Group by user_id to achieve each user's MRR
expr: subscription_value
agg: sum
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
window_groupings:
- user_id
@@ -255,15 +255,15 @@ We can query the semi-additive metrics using the following syntax:
For dbt Cloud:

```bash
dbt sl query --metrics mrr_by_end_of_month --group-by metric_time__month --order metric_time__month
dbt sl query --metrics mrr_by_end_of_month --group-by metric_time__week --order metric_time__week
dbt sl query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__month --order subscription__subscription_date__month
dbt sl query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__week --order subscription__subscription_date__week
```

For dbt Core:

```bash
mf query --metrics mrr_by_end_of_month --group-by metric_time__month --order metric_time__month
mf query --metrics mrr_by_end_of_month --group-by metric_time__week --order metric_time__week
mf query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__month --order subscription__subscription_date__month
mf query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__week --order subscription__subscription_date__week
```

import SetUpPages from '/snippets/_metrics-dependencies.md';
2 changes: 1 addition & 1 deletion website/docs/docs/build/metricflow-commands.md
@@ -259,7 +259,7 @@ Create a new query with MetricFlow and execute it against your data platform. Th
```bash
dbt sl query --metrics <metric_name> --group-by <dimension_name> # In dbt Cloud
dbt sl query --saved-query <name> # In dbt Cloud CLI
dbt sl query --saved-query <name> # In dbt Cloud

mf query --metrics <metric_name> --group-by <dimension_name> # In dbt Core

8 changes: 4 additions & 4 deletions website/docs/docs/build/metricflow-time-spine.md
@@ -150,7 +150,7 @@ final as (
select * from final
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

### Daily (BigQuery)
@@ -180,7 +180,7 @@ select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
@@ -265,7 +265,7 @@ final as (
select * from final
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
@@ -296,7 +296,7 @@ select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
23 changes: 0 additions & 23 deletions website/docs/docs/build/snapshots.md
@@ -390,29 +390,6 @@ snapshots:

</VersionBlock>

## Snapshot query best practices

This section outlines some best practices for writing snapshot queries:

- #### Snapshot source data
As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data. Your models should then select from these snapshots, treating them like regular data sources.

- #### Use the `source` function in your query
This helps when understanding <Term id="data-lineage">data lineage</Term> in your project.

- #### Include as many columns as possible
In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.

- #### Avoid joins in your snapshot query
Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.

- #### Limit the amount of transformation in your query
If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.

Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
* Selecting specific columns if the table is wide.
* Doing light transformation to get data into a reasonable shape, for example, unpacking a <Term id="json" /> blob to flatten your source data into columns.
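
A snapshot that follows these practices might look like the sketch below (YAML-style definition; the source and column names are hypothetical):

```yaml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')   # snapshot the raw source, no joins
    config:
      unique_key: id
      strategy: timestamp
      updated_at: updated_at   # keep transformation for downstream models
```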

## Snapshot meta-fields

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.