Discovery API: update docs #3894

Merged
merged 71 commits into from
Aug 15, 2023
Changes from 10 commits
0e39350
discovery-use-cases-and-examples: update queries to use bigints and n…
eddowh Aug 8, 2023
7eec141
discovery-querying: update queries to use bigints and add TODOs to re…
eddowh Aug 9, 2023
7b3b1fa
discovery-use-cases-and-examples: add TODOs on querying model constra…
eddowh Aug 9, 2023
99a097f
Update website/docs/docs/dbt-cloud-apis/discovery-querying.md
nghi-ly Aug 9, 2023
fcbea5e
Update ModelByEnv and Environment schema obj docs
nghi-ly Aug 9, 2023
6e146bd
Merge branch 'current' into meta-1482/update-discovery-api-docs
nghi-ly Aug 9, 2023
9db6960
Update jobs. Minor nits
nghi-ly Aug 9, 2023
8894408
Update website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx
nghi-ly Aug 9, 2023
5910903
Merge branch 'meta-1482/update-discovery-api-docs' of github.com:dbt-…
nghi-ly Aug 9, 2023
fd7bc98
Update website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx
nghi-ly Aug 9, 2023
1c7e291
Update website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx
nghi-ly Aug 9, 2023
4133c16
Fix callouts
nghi-ly Aug 9, 2023
4c76a3b
Update website/docs/docs/dbt-cloud-apis/schema-discovery-modelByEnv.mdx
nghi-ly Aug 9, 2023
2e5d4a4
Update website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx
nghi-ly Aug 9, 2023
6a31d25
Update website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examp…
nghi-ly Aug 9, 2023
953673a
Update wording
nghi-ly Aug 9, 2023
412cde3
Feedback from PM
nghi-ly Aug 9, 2023
078c55c
Update website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examp…
nghi-ly Aug 9, 2023
943f101
discovery-schema: Add 'Job' page
eddowh Aug 9, 2023
63dffb8
Merge branch 'meta-1482/update-discovery-api-docs' of github.com:dbt-…
eddowh Aug 9, 2023
7db7e06
Add Discovery API Job Schema to sidebar
eddowh Aug 9, 2023
bdc0432
This branch was auto-updated!
github-actions[bot] Aug 9, 2023
01168db
Merge branch 'meta-1482/update-discovery-api-docs' of github.com:dbt-…
eddowh Aug 9, 2023
d95f422
discovery-api: various changes
eddowh Aug 9, 2023
872154a
This branch was auto-updated!
github-actions[bot] Aug 9, 2023
a8ea7ca
discovery-api-schema: Display deprecation notices for legacy job endp…
eddowh Aug 9, 2023
3c23a3c
Merge branch 'meta-1482/update-discovery-api-docs' of github.com:dbt-…
eddowh Aug 9, 2023
e5bb587
discovery-api: Remove wrongly placed deprecation notices
eddowh Aug 9, 2023
24ae1bf
discovery-api-schema: Add 'Model' endpoint under 'Job'
eddowh Aug 9, 2023
9c96820
discovery-api-schema: Add 'Models' docs under 'Job'
eddowh Aug 9, 2023
7642ff1
discovery-api-schema: Add exposure docs under 'Job'
eddowh Aug 9, 2023
2a2e8ff
discovery-api-schema: Add metric docs under 'Job'
eddowh Aug 9, 2023
d02b9f3
discovery-api-schema: Add seed docs under 'Job'
eddowh Aug 9, 2023
b8ef15c
discovery-api-schema: Add source docs under 'Job'
eddowh Aug 9, 2023
ea307d2
discovery-api-schema: Add snapshot and test docs under 'Job'
eddowh Aug 10, 2023
b0239e9
discovery-api-schema: Add modelHistoricalRuns docs under 'Environment…
eddowh Aug 10, 2023
3d27843
discovery-api: Finalize TODOs
eddowh Aug 10, 2023
0ed5284
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
25b971a
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
3bcda6f
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
50b01aa
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
a93ce45
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
d7ed773
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
907d5b5
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
07a8800
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
53b76c3
This branch was auto-updated!
github-actions[bot] Aug 10, 2023
45cfb25
Updates to query page
nghi-ly Aug 10, 2023
24cd278
Update snippet for deprecation callout
nghi-ly Aug 10, 2023
9e78499
Updates to use cases and ex
nghi-ly Aug 10, 2023
b4a9973
Updates to schema pages
nghi-ly Aug 11, 2023
2679d5a
Nits for consistency
nghi-ly Aug 11, 2023
a6e6503
This branch was auto-updated!
github-actions[bot] Aug 11, 2023
8666e59
Update website/docs/docs/dbt-cloud-apis/discovery-use-cases-and-examp…
nghi-ly Aug 11, 2023
ab4dc09
discovery-use-cases-and-example: temporarily remove 'What’s the full …
eddowh Aug 11, 2023
82e3c87
This branch was auto-updated!
github-actions[bot] Aug 11, 2023
dc0ed7b
This branch was auto-updated!
github-actions[bot] Aug 12, 2023
b4dc9e2
This branch was auto-updated!
github-actions[bot] Aug 14, 2023
bc1b22e
This branch was auto-updated!
github-actions[bot] Aug 14, 2023
0bdd58c
This branch was auto-updated!
github-actions[bot] Aug 14, 2023
c23c3dc
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
e74a071
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
67acd2f
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
8e189e4
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
dca68e2
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
32d94d7
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
937ea5e
This branch was auto-updated!
github-actions[bot] Aug 15, 2023
9bca438
Revert hardcoded metadataUrls
eddowh Aug 15, 2023
f8b7735
Update website/docs/docs/dbt-cloud-apis/schema-discovery-environment-…
runleonarun Aug 15, 2023
02b273a
Merge branch 'current' into meta-1482/update-discovery-api-docs
runleonarun Aug 15, 2023
1f0b4f2
Update contributing/single-sourcing-content.md
runleonarun Aug 15, 2023
bb7f63d
Update website/docs/docs/dbt-cloud-apis/schema-discovery-environment-…
runleonarun Aug 15, 2023
146 changes: 79 additions & 67 deletions website/docs/docs/dbt-cloud-apis/discovery-querying.md
@@ -1,32 +1,32 @@
---
title: "Query the Discovery API"
id: "discovery-querying"
sidebar_label: "Query the Discovery API"
sidebar_label: "Query the Discovery API"
---

The Discovery API supports ad-hoc queries and integrations.. If you are new to the API, read the [Discovery API overview](/docs/dbt-cloud-apis/discovery-api) for an introduction.
The Discovery API supports ad-hoc queries and integrations. If you are new to the API, refer to [About the Discovery API](/docs/dbt-cloud-apis/discovery-api) for an introduction.

Use the Discovery API to evaluate data pipeline health and project state across runs or at a moment in time. dbt Labs provide a [GraphQL explorer](https://metadata.cloud.getdbt.com/graphql) for this API, enabling you to run queries and browse the schema.
Use the Discovery API to evaluate data pipeline health and project state across runs or at a moment in time. dbt Labs provides a [GraphQL explorer](https://metadata.cloud.getdbt.com/graphql) for this API, enabling you to run queries and browse the schema.

Since GraphQL describes the data in the API, the schema displayed in the GraphQL explorer accurately represents the graph and fields available to query.
Since GraphQL describes the data in the API, the schema displayed in the GraphQL explorer accurately represents the graph and fields available to query.

<Snippet path="metadata-api-prerequisites" />

## Authorization

Currently, authorization of requests takes place [using a service token](/docs/dbt-cloud-apis/service-tokens). dbt Cloud admin users can generate a Metadata Only service token that is authorized to execute a specific query against the Discovery API.

Once you've created a token, you can use it in the Authorization header of requests to the dbt Cloud Discovery API. Be sure to include the Token prefix in the Authorization header, or the request will fail with a `401 Unauthorized` error. Note that `Bearer` can be used instead of `Token` in the Authorization header. Both syntaxes are equivalent.
Once you've created a token, you can use it in the Authorization header of requests to the dbt Cloud Discovery API. Be sure to include the Token prefix in the Authorization header, or the request will fail with a `401 Unauthorized` error. Note that `Bearer` can be used instead of `Token` in the Authorization header. Both syntaxes are equivalent.

## Access the Discovery API
## Access the Discovery API

1. Create a [service account token](/docs/dbt-cloud-apis/service-tokens) to authorize requests. dbt Cloud Admin users can generate a _Metadata Only_ service token, which is authorized to execute a specific query against the Discovery API.

2. Find your API URL using the endpoint `https://metadata.{YOUR_ACCESS_URL}/graphql`.
2. Find your API URL using the endpoint `https://metadata.{YOUR_ACCESS_URL}/graphql`.

* Replace `{YOUR_ACCESS_URL}` with the appropriate [Access URL](/docs/cloud/about-cloud/regions-ip-addresses) for your region and plan. For example, if your multi-tenant region is North America, your endpoint is `https://metadata.cloud.getdbt.com/graphql`. If your multi-tenant region is EMEA, your endpoint is `https://metadata.emea.dbt.com/graphql`.

3. For specific query points, refer to the [schema documentation](/docs/dbt-cloud-apis/discovery-schema-model).
3. For specific query points, refer to the [schema documentation](/docs/dbt-cloud-apis/discovery-schema-model).


## Run queries using HTTP requests
@@ -36,7 +36,7 @@ You can run queries by sending a `POST` request to the `https://metadata.YOUR_AC
* `YOUR_TOKEN` in the Authorization header with your actual API token. Be sure to include the Token prefix.
* `QUERY_BODY` with a GraphQL query, for example `{ "query": "<query text>" }`
* `VARIABLES` with a dictionary of your GraphQL query variables, such as a job ID or a filter.
* `ENDPOINT` with the endpoint you're querying, such as environment.
* `ENDPOINT` with the endpoint you're querying, such as environment.

```shell
curl 'https://metadata.YOUR_ACCESS_URL/graphql' \
@@ -48,10 +48,13 @@ You can run queries by sending a `POST` request to the `https://metadata.YOUR_AC

Python example:

```py
response = requests.post('YOUR_ACCESS_URL',
headers={"authorization": "Bearer "+YOUR_TOKEN, "content-type": "application/json"},
json={"query": QUERY_BODY, "variables": VARIABLES})
```python
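import requests

# POST the GraphQL query to the Discovery API endpoint for your region and plan.
response = requests.post(
    'https://metadata.YOUR_ACCESS_URL/graphql',
    headers={"authorization": "Bearer " + YOUR_TOKEN, "content-type": "application/json"},
    json={"query": QUERY_BODY, "variables": VARIABLES},
)

metadata = response.json()['data'][ENDPOINT]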
```
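
As a dependency-free variant of the request above, the following sketch builds the same `POST` request with Python's standard library. The helper names `build_discovery_request` and `run_query` are hypothetical, and the access URL, token, query, and variables are placeholders you would supply yourself.

```python
import json
import urllib.request


def build_discovery_request(access_url, token, query, variables):
    """Build (but do not send) a POST request for the Discovery API.

    `access_url` is the region-specific Access URL, for example
    "cloud.getdbt.com" for the North America multi-tenant region.
    """
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://metadata.{access_url}/graphql",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(request):
    """Send the prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body before any network call is made.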

@@ -72,66 +75,72 @@ You can use the Discovery API to query data from the previous three months. For

## Run queries with the GraphQL explorer

You can run ad-hoc queries directly in the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and use the document explorer on the left-hand side, where you can see all possible nodes and fields.
You can run ad-hoc queries directly in the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and use the document explorer on the left-hand side, where you can see all possible nodes and fields.

Refer to the [Apollo explorer documentation](https://www.apollographql.com/docs/graphos/explorer/explorer) for setup and authorization info.
Refer to the [Apollo explorer documentation](https://www.apollographql.com/docs/graphos/explorer/explorer) for setup and authorization info.

1. Access the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and select fields you'd like query.
1. Access the [GraphQL API explorer](https://metadata.cloud.getdbt.com/graphql) and select the fields you'd like to query.

2. Go to **Variables** at the bottom of the explorer and replace any `null` fields with your unique values.

3. [Authenticate](https://www.apollographql.com/docs/graphos/explorer/connecting-authenticating#authentication) via Bearer auth with `YOUR_TOKEN`. Go to **Headers** at the bottom of the explorer and select **+New header**.

4. Select **Authorization** in the **header key** drop-down list and enter your Bearer auth token in the **value** field. Remember to include the Token prefix. Your header key should look like this `{"Authorization": "Bearer <YOUR_TOKEN>"}`.

TODO: Screenshot needs to be replaced with new one. If we want to show model historical runs, show `environment.applied.modelHistoricalRuns`

<br />

<Lightbox src="/img/docs/dbt-cloud/discovery-api/graphql_header.jpg" width="85%" title="Enter the header key and Bearer auth token values"/>

5. Run your query by pressing the blue query button in the top-right of the Operation editor (to the right of the query). You should see a successful query response on the right side of the explorer.

TODO: Screenshot needs to be replaced with new one. If we want to show model historical runs, show `environment.applied.modelHistoricalRuns`

<Lightbox src="/img/docs/dbt-cloud/discovery-api/graphql.jpg" width="85%" title="Run queries using the Apollo Server GraphQL explorer"/>

### Fragments

Use the [`..on`](https://www.apollographql.com/docs/react/data/fragments/) notation to query across lineage and retrieve results from specific node types.
Use the [`...on`](https://www.apollographql.com/docs/react/data/fragments/) notation to query across lineage and retrieve results from specific node types.

```graphql

environment(id: $environmentId) {
applied {
models(first: $first,filter:{uniqueIds:"MODEL.PROJECT.MODEL_NAME"}) {
edges {
node {
name
ancestors(types:[Model, Source, Seed, Snapshot]) {
... on ModelAppliedStateNode {
name
resourceType
materializedType
executionInfo {
executeCompletedAt
query ($environmentId: BigInt!, $first: Int!) {
environment(id: $environmentId) {
applied {
models(first: $first, filter: { uniqueIds: "MODEL.PROJECT.MODEL_NAME" }) {
edges {
node {
name
ancestors(types: [Model, Source, Seed, Snapshot]) {
... on ModelAppliedStateNestedNode {
name
resourceType
materializedType
executionInfo {
executeCompletedAt
}
}
}
... on SourceAppliedStateNode {
sourceName
name
resourceType
freshness {
maxLoadedAt
... on SourceAppliedStateNestedNode {
sourceName
name
resourceType
freshness {
maxLoadedAt
}
}
}
... on SnapshotAppliedStateNode {
name
resourceType
executionInfo {
executeCompletedAt
... on SnapshotAppliedStateNestedNode {
name
resourceType
executionInfo {
executeCompletedAt
}
}
}
... on SeedAppliedStateNode {
name
resourceType
executionInfo {
executeCompletedAt
... on SeedAppliedStateNestedNode {
name
resourceType
executionInfo {
executeCompletedAt
}
}
}
}
@@ -140,39 +149,39 @@ environment(id: $environmentId) {
}
}
}

```
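
Because each fragment returns different fields, client code has to handle ancestors of more than one shape. The sketch below is one way to do that in Python. It assumes the response mirrors the query above (`data.environment.applied.models.edges[].node.ancestors[]`); the function names `last_updated` and `summarize_ancestors` are hypothetical, and the code keys off which fields are present rather than any specific `resourceType` value.

```python
def last_updated(ancestor):
    """Return the most relevant timestamp for one ancestor node.

    Sources expose `freshness.maxLoadedAt`; models, snapshots, and seeds
    expose `executionInfo.executeCompletedAt` (per the fragments above).
    """
    freshness = ancestor.get("freshness")
    if freshness:
        return freshness.get("maxLoadedAt")
    execution_info = ancestor.get("executionInfo") or {}
    return execution_info.get("executeCompletedAt")


def summarize_ancestors(response):
    """Map each model name to a list of (ancestor name, timestamp) pairs."""
    models = response["data"]["environment"]["applied"]["models"]
    summary = {}
    for edge in models["edges"]:
        node = edge["node"]
        summary[node["name"]] = [
            (ancestor["name"], last_updated(ancestor))
            for ancestor in node["ancestors"]
        ]
    return summary
```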

### Pagination

Querying large datasets can impact performance on multiple functions in the API pipeline. Pagination eases the burden by returning smaller data sets one page at a time. This is useful for returning a particular portion of the dataset or the entire dataset piece-by-piece to enhance performance. dbt Cloud utilizes cursor-based pagination, which makes it easy to return pages of constantly changing data.
Querying large datasets can impact performance on multiple functions in the API pipeline. Pagination eases the burden by returning smaller data sets one page at a time. This is useful for returning a particular portion of the dataset or the entire dataset piece-by-piece to enhance performance. dbt Cloud utilizes cursor-based pagination, which makes it easy to return pages of constantly changing data.

Use the `PageInfo` object to return information about the page. The following fields are available:

- `startCursor` string type - corresponds to the first `node` in the `edge`.
- `endCursor` string type - corresponds to the last `node` in the `edge`.
- `hasNextPage` boolean type - whether there are more `nodes` after the returned results.
- `hasPreviousPage` boolean type - whether `nodes` exist before the returned results.
- `hasPreviousPage` boolean type - whether `nodes` exist before the returned results.

There are connection variables available when making the query:

- `first` integer type - will return the first 'n' `nodes` for each page, up to 500.
- `after` string type sets the cursor to retrieve `nodes` after. It's best practice to set the `after` variable with the object ID defined in the `endcursor` of the previous page.
- `after` string type - sets the cursor to retrieve `nodes` after. It's best practice to set the `after` variable to the object ID defined in the `endCursor` of the previous page.

The following example shows that we're returning the `first` 500 models `after` the specified Object ID in the variables. The `PageInfo` object will return where the object ID where the cursor starts, where it ends, and whether there is a next page.

The following example shows that we're returning the `first` 500 models `after` the specified Object ID in the variables. The `PageInfo` object returns the object ID where the cursor starts, the object ID where it ends, and whether there is a next page.
TODO: Update screenshot to use `$environmentId: BigInt!`

<Lightbox src="/img/paginate.png" width="75%" title="Example of pagination"/>

Here is a code example of the `PageInfo` object:

```graphql
pageInfo {
startCursor
endCursor
hasNextPage
}
totalCount # Total number of pages

startCursor
endCursor
hasNextPage
}
totalCount # Total number of records across all pages
```
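
The cursor loop described in this section can be sketched in Python as follows. This is a minimal illustration, not an official client: `iter_nodes` and the `fetch_page` callback are hypothetical names, and `fetch_page(first, after)` is assumed to run your query and return the connection object (the part of the response containing `edges` and `pageInfo`).

```python
MAX_PAGE_SIZE = 500  # The `first` variable accepts up to 500 nodes per page.


def iter_nodes(fetch_page, page_size=MAX_PAGE_SIZE):
    """Yield every node from a paginated connection, one page at a time.

    `fetch_page(first, after)` must return a dict with `edges` and `pageInfo`.
    """
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")

    after = None  # No cursor on the first request.
    while True:
        connection = fetch_page(page_size, after)
        for edge in connection["edges"]:
            yield edge["node"]

        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        # Resume after the last node of the page just read.
        after = page_info["endCursor"]
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP library and makes it straightforward to exercise with canned pages.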

### Filters
@@ -185,11 +194,13 @@ In the following example, we can see that we're filtering results to models that

Here is a code example that filters for models that have an error on their last run and tests that have failed:

```graphql
TODO: Update screenshot to use `$environmentId: BigInt!`

environment(id: $environmentId) {
```graphql
query ModelsAndTests($environmentId: BigInt!, $first: Int!) {
environment(id: $environmentId) {
applied {
models(first: $first, filter: {lastRunStatus:error}) {
models(first: $first, filter: { lastRunStatus: error }) {
edges {
node {
name
@@ -199,7 +210,7 @@ environment(id: $environmentId) {
}
}
}
tests(first: $first, filter: {status:"fail"}) {
tests(first: $first, filter: { status: "fail" }) {
edges {
node {
name
@@ -208,9 +219,10 @@ environment(id: $environmentId) {
}
}
}
}
}
}
}
}

```
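
Once a filtered query like the one above returns, the names of the failing resources can be pulled out of the response with a few lines of Python. This sketch assumes the response mirrors the query (`data.environment.applied` with `models` and `tests` connections, each node exposing `name`); `names_in` and `collect_failures` are hypothetical helper names.

```python
def names_in(connection):
    """Return the `name` of every node in a connection's edges."""
    return [edge["node"]["name"] for edge in connection["edges"]]


def collect_failures(response):
    """Group the names of errored models and failed tests from one response."""
    applied = response["data"]["environment"]["applied"]
    return {
        "models": names_in(applied["models"]),
        "tests": names_in(applied["tests"]),
    }
```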

## Related content