Merge branch 'current' into patch-1
jairus-m authored Nov 9, 2024
2 parents 36269e1 + dc911a3 commit 071dcfa
Showing 38 changed files with 67,227 additions and 159 deletions.
15 changes: 0 additions & 15 deletions README.md
@@ -62,18 +62,3 @@ You can click a link available in a Vercel bot PR comment to see and review your

Advisory:
- If you run into a `fatal error: 'vips/vips8' file not found` error when you run `npm install`, you may need to run `brew install vips`. Warning: this one will take a while -- go ahead and grab some coffee!
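
Condensed as a shell sketch (macOS with Homebrew assumed):

```bash
# Install libvips first, then retry the install -- the brew step can take a while
brew install vips
npm install
```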

## Running the Cypress tests locally

Method 1: Utilizing the Cypress GUI
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress open` to open the Cypress GUI, choose `E2E Testing` as the Testing Type, then select your browser and click `Start E2E testing in {browser}`
5. Click on a test and watch it run!

Method 2: Running the Cypress E2E tests headlessly
1. `cd` into the repo: `cd docs.getdbt.com`
2. `cd` into the `website` subdirectory: `cd website`
3. Install the required node packages: `npm install`
4. Run `npx cypress run`
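
The same steps condense to the following shell sketch (assuming you cloned into `docs.getdbt.com`):

```bash
# Headless Cypress run, equivalent to Method 2 steps 1-4
cd docs.getdbt.com/website
npm install
npx cypress run
```
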
67 changes: 0 additions & 67 deletions contributing/developer-blog.md

This file was deleted.

11 changes: 2 additions & 9 deletions website/dbt-versions.js
@@ -15,7 +15,7 @@
*/
exports.versions = [
{
version: "1.9.1",
version: "1.10",
customDisplay: "Cloud (Versionless)",
},
{
@@ -74,12 +74,5 @@ exports.versionedPages = [
* @property {string} firstVersion The first version the category is visible in the sidebar
*/
exports.versionedCategories = [
{
category: "Model governance",
firstVersion: "1.5",
},
{
category: "Build your metrics",
firstVersion: "1.6",
},

];
2 changes: 1 addition & 1 deletion website/docs/docs/build/dimensions.md
@@ -67,7 +67,7 @@ semantic_models:
type: categorical
```
Dimensions are bound to the primary entity of the semantic model they are defined in. For example the dimensoin `type` is defined in a model that has `transaction` as a primary entity. `type` is scoped to the `transaction` entity, and to reference this dimension you would use the fully qualified dimension name i.e `transaction__type`.
Dimensions are bound to the primary entity of the semantic model they are defined in. For example, the dimension `type` is defined in a model that has `transaction` as a primary entity. `type` is scoped to the `transaction` entity, and to reference this dimension you would use the fully qualified dimension name, i.e. `transaction__type`.

MetricFlow requires that all semantic models have a primary entity. This is to guarantee unique dimension names. If your data source doesn't have a primary entity, you need to assign the entity a name using the `primary_entity` key. It doesn't necessarily have to map to a column in that table and assigning the name doesn't affect query generation. We recommend making these "virtual primary entities" unique across your semantic model. An example of defining a primary entity for a data source that doesn't have a primary entity column is below:
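
A minimal sketch of such a definition (the model, entity, and column names here are illustrative, not taken from the docs):

```yaml
semantic_models:
  - name: transactions
    model: ref('fact_transactions')   # illustrative model name
    # No natural primary-key column exists, so assign a virtual primary entity;
    # it doesn't need to map to a column and doesn't affect query generation.
    primary_entity: transaction
    entities:
      - name: customer
        type: foreign
        expr: customer_id
    dimensions:
      - name: type
        type: categorical
```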

2 changes: 1 addition & 1 deletion website/docs/docs/build/incremental-microbatch.md
@@ -8,7 +8,7 @@ id: "incremental-microbatch"

:::info Microbatch

The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and dbt Core v1.9. We have been developing it behind a flag to prevent unintended interactions with existing custom incremental strategies. To enable this feature, set the environment variable `DBT_EXPERIMENTAL_MICROBATCH` to `True` in your dbt Cloud environments or wherever you're running dbt Core.
The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and dbt Core v1.9. We have been developing it behind a flag to prevent unintended interactions with existing custom incremental strategies. To enable this feature, [set the environment variable](/docs/build/environment-variables#setting-and-overriding-environment-variables) `DBT_EXPERIMENTAL_MICROBATCH` to `True` in your dbt Cloud environments or wherever you're running dbt Core.
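
For dbt Core, enabling the flag can be as simple as exporting the variable before invoking dbt (shell sketch; the model name is hypothetical):

```bash
# Opt in to the beta microbatch strategy, then build the model
export DBT_EXPERIMENTAL_MICROBATCH=True
dbt run --select my_microbatch_model
```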

Read and participate in the discussion: [dbt-core#10672](https://github.com/dbt-labs/dbt-core/discussions/10672)

4 changes: 2 additions & 2 deletions website/docs/docs/build/incremental-models.md
@@ -212,11 +212,11 @@ Currently, `on_schema_change` only tracks top-level column changes. It does not

### Default behavior

This is the behavior if `on_schema_change: ignore`, which is set by default, and on older versions of dbt.
This is the behavior of `on_schema_change: ignore`, which is set by default.

If you add a column to your incremental model, and execute a `dbt run`, this column will _not_ appear in your target table.

Similarly, if you remove a column from your incremental model, and execute a `dbt run`, this column will _not_ be removed from your target table.
If you remove a column from your incremental model and execute a `dbt run`, `dbt run` will fail.

Instead, whenever the logic of your incremental model changes, execute a full-refresh run of both your incremental model and any downstream models.
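
A sketch of making the default explicit in a model's properties file (the model name is hypothetical):

```yaml
models:
  - name: my_incremental_model
    config:
      materialized: incremental
      on_schema_change: ignore   # default: added columns won't appear in the target table
```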

2 changes: 1 addition & 1 deletion website/docs/docs/build/incremental-strategy.md
@@ -27,7 +27,7 @@ Click the name of the adapter in the below table for more information about supp
| Data platform adapter | `append` | `merge` | `delete+insert` | `insert_overwrite` | `microbatch` <Lifecycle status="beta"/> |
|-----------------------|:--------:|:-------:|:---------------:|:------------------:|:-------------------:|
| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) |||| ||
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) |||| | |
| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) |||| | |
| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | || |||
| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) ||| |||
| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) ||| || |
18 changes: 9 additions & 9 deletions website/docs/docs/build/measures.md
@@ -200,7 +200,7 @@ Parameters under the `non_additive_dimension` will specify dimensions that the m

```yaml
semantic_models:
- name: subscription_id
- name: subscriptions
description: A subscription table with one row per date for each active user and their subscription plans.
model: ref('your_schema.subscription_table')
defaults:
@@ -209,7 +209,7 @@
entities:
- name: user_id
type: foreign
primary_entity: subscription_table
primary_entity: subscription
dimensions:
- name: subscription_date
@@ -224,21 +224,21 @@
expr: user_id
agg: count_distinct
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
- name: mrr
description: Aggregate by summing all users' active subscription plans
expr: subscription_value
agg: sum
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
- name: user_mrr
description: Group by user_id to achieve each user's MRR
expr: subscription_value
agg: sum
non_additive_dimension:
name: metric_time
name: subscription_date
window_choice: max
window_groupings:
- user_id
@@ -255,15 +255,15 @@ We can query the semi-additive metrics using the following syntax:
For dbt Cloud:

```bash
dbt sl query --metrics mrr_by_end_of_month --group-by metric_time__month --order metric_time__month
dbt sl query --metrics mrr_by_end_of_month --group-by metric_time__week --order metric_time__week
dbt sl query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__month --order subscription__subscription_date__month
dbt sl query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__week --order subscription__subscription_date__week
```

For dbt Core:

```bash
mf query --metrics mrr_by_end_of_month --group-by metric_time__month --order metric_time__month
mf query --metrics mrr_by_end_of_month --group-by metric_time__week --order metric_time__week
mf query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__month --order subscription__subscription_date__month
mf query --metrics mrr_by_end_of_month --group-by subscription__subscription_date__week --order subscription__subscription_date__week
```

import SetUpPages from '/snippets/_metrics-dependencies.md';
2 changes: 1 addition & 1 deletion website/docs/docs/build/metricflow-commands.md
@@ -259,7 +259,7 @@ Create a new query with MetricFlow and execute it against your data platform. Th
```bash
dbt sl query --metrics <metric_name> --group-by <dimension_name> # In dbt Cloud
dbt sl query --saved-query <name> # In dbt Cloud CLI
dbt sl query --saved-query <name> # In dbt Cloud

mf query --metrics <metric_name> --group-by <dimension_name> # In dbt Core

8 changes: 4 additions & 4 deletions website/docs/docs/build/metricflow-time-spine.md
@@ -150,7 +150,7 @@ final as (
select * from final
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

### Daily (BigQuery)
@@ -180,7 +180,7 @@ select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
@@ -265,7 +265,7 @@ final as (
select * from final
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
@@ -296,7 +296,7 @@ select *
from final
-- filter the time spine to a specific range
where date_day > dateadd(year, -4, current_timestamp())
and date_hour < dateadd(day, 30, current_timestamp())
and date_day < dateadd(day, 30, current_timestamp())
```

</File>
23 changes: 0 additions & 23 deletions website/docs/docs/build/snapshots.md
@@ -390,29 +390,6 @@ snapshots:

</VersionBlock>

## Snapshot query best practices

This section outlines some best practices for writing snapshot queries:

- #### Snapshot source data
As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data. Your models should then select from these snapshots, treating them like regular data sources.

- #### Use the `source` function in your query
This helps when understanding <Term id="data-lineage">data lineage</Term> in your project.

- #### Include as many columns as possible
In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.

- #### Avoid joins in your snapshot query
Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.

- #### Limit the amount of transformation in your query
If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.

Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
* Selecting specific columns if the table is wide.
* Doing light transformation to get data into a reasonable shape, for example, unpacking a <Term id="json" /> blob to flatten your source data into columns.
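
A snapshot that follows these practices might look like the sketch below (YAML-style definition; the source and column names are hypothetical):

```yaml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')   # snapshot the raw source, no joins
    config:
      unique_key: id
      strategy: timestamp
      updated_at: updated_at   # keep transformation for downstream models
```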

## Snapshot meta-fields

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.