docs: Add charts to Performance Insights #7557

Merged 1 commit on Dec 19, 2023
5 changes: 4 additions & 1 deletion docs/pages/product/deployment/cloud.mdx
@@ -35,6 +35,7 @@ In Cube Cloud, you can:
API endpoints for the source code in the main branch, any other branch,
or any user-specific [development mode][ref-dev-mode] branch.
* Assign a [custom domain][ref-domains] to API endpoints of any deployment.
* Review [performance insights][ref-performance] and fine-tune deployments for better [scalability][ref-scalability].
* Set up account-wide [budgets][ref-budgets] to control resource consumption
and use [auto-suspension][ref-auto-sus] to reduce resource consumption of
non-production deployments.
@@ -51,4 +52,6 @@ In Cube Cloud, you can:
[ref-dev-mode]: /product/workspace/dev-mode
[ref-domains]: /product/deployment/cloud/custom-domains
[ref-auto-sus]: /product/deployment/cloud/auto-suspension
[ref-budgets]: /product/workspace/budgets
[ref-budgets]: /product/workspace/budgets
[ref-performance]: /product/workspace/performance
[ref-scalability]: /product/deployment/cloud/scalability
1 change: 1 addition & 0 deletions docs/pages/product/deployment/cloud/_meta.js
@@ -4,6 +4,7 @@ module.exports = {
"continuous-deployment": "Continuous deployment",
"custom-domains": "Custom domains",
"auto-suspension": "Auto-suspension",
"scalability": "Scalability",
"pricing": "Pricing",
"support": "Support",
"limits": "Limits"
58 changes: 3 additions & 55 deletions docs/pages/product/deployment/cloud/deployment-types.mdx
@@ -64,8 +64,8 @@ Production Clusters are designed to support high-availability production
workloads. Each cluster consists of several key components, starting with 2 Cube
API instances, 1 Cube Refresh Worker, and 2 Cube Store Routers, all of which run
on dedicated infrastructure. The cluster can automatically scale to meet the
needs of your workload by adding more components as necessary; check the
[Scalability section](#scalability) below.
needs of your workload by adding more components as necessary; check the page on
[scalability][ref-scalability] to learn more.

## Production multi-cluster

@@ -96,59 +96,6 @@ Cube Cloud routes traffic between clusters based on
Each cluster is billed separately, and all clusters can use auto-scaling to
match demand.

## Scalability

Cube Cloud also allows adding additional infrastructure to your deployment to
increase scalability and performance beyond what is available with each
Production Deployment.

### Cube Store Worker

Cube Store Workers are used to build and persist pre-aggregations. Each Worker
has a **maximum of 150GB** of storage; [additional Cube Store
workers][ref-limits] can be added to your deployment to both increase storage
space and improve pre-aggregation performance. A **minimum of 2** Cube Store
Workers is required for pre-aggregations; this can be adjusted. For a rough
estimate, it will take approximately 2 Cube Store Workers per 4 GB of
pre-aggregated data per day.

<InfoBox>

Idle workers will automatically hibernate after 10 minutes of inactivity, and
will not consume CCUs until they are resumed. Workers are resumed automatically
when Cube receives a query that should be accelerated by a pre-aggregation, or
when a scheduled refresh is triggered.

</InfoBox>

To change the number of Cube Store Workers in a deployment, go to the
deployment’s <Btn>Settings</Btn> screen, and open the <Btn>Configuration</Btn>
tab. From this screen, you can set the number of Cube Store Workers from the
dropdown:

<Screenshot
alt="Cube Cloud Deployment Settings page showing auto-scaling configuration options"
src="https://ucarecdn.com/3b39c56f-d553-4612-b4f0-07084cc4b742/"
/>

### Cube API Instance

With a Production Deployment, 2 Cube API Instances are included. That said, it
is very common to use more, and [additional API instances][ref-limits] can be
added to your deployment to increase the throughput of your queries. A rough
estimate is that 1 Cube API Instance is needed for every 5-10
requests-per-second served. Cube API Instances can also auto-scale as needed.

To change how many Cube API instances are available in the Production Cluster,
go to the deployment’s <Btn>Settings</Btn> screen, and open
the <Btn>Configuration</Btn> tab. From this screen, you can set the minimum and
maximum number of Cube API instances for a deployment:

<Screenshot
alt="Cube Cloud Deployment Settings page showing auto-scaling configuration options"
src="https://ucarecdn.com/3b39c56f-d553-4612-b4f0-07084cc4b742/"
/>

## Switching between deployment types

To switch a deployment's type, go to the deployment's <Btn>Settings</Btn> screen
@@ -161,3 +108,4 @@ and select from the available options:

[ref-conf-ref-ctx-to-app-id]: /reference/configuration/config#contexttoappid
[ref-limits]: /product/deployment/cloud/limits#resources
[ref-scalability]: /product/deployment/cloud/scalability
55 changes: 55 additions & 0 deletions docs/pages/product/deployment/cloud/scalability.mdx
@@ -0,0 +1,55 @@
# Scalability

Cube Cloud allows you to add infrastructure to your deployment to increase
scalability and performance beyond what is included with each Production
Deployment.

## Auto-scaling of API instances

With a Production Cluster, 2 Cube API Instances are included. That said, it
is very common to use more, and [additional API instances][ref-limits] can be
added to your deployment to increase query throughput. As a rough estimate,
1 Cube API Instance is needed for every 5-10 requests per second served.
Cube API Instances can also auto-scale as needed.
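As a back-of-the-envelope check, this estimate can be sketched as follows. This is illustrative arithmetic only, not an official sizing formula: `estimateApiInstances` is a hypothetical helper, and real throughput depends on query complexity and cache hit rate.

```javascript
// Rough capacity estimate: one Cube API Instance per 5-10 requests
// per second, never fewer than the 2 instances included with a
// Production Cluster.
function estimateApiInstances(requestsPerSecond, rpsPerInstance = 5) {
  // Default to the conservative end of the 5-10 rps range.
  return Math.max(2, Math.ceil(requestsPerSecond / rpsPerInstance));
}

console.log(estimateApiInstances(25)); // 25 rps at 5 rps/instance -> 5
```

Use the result as a starting point for the minimum auto-scaling limit, then adjust based on the <Btn>API instances</Btn> chart in Performance Insights.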

To change how many Cube API instances are available in the Production Cluster,
go to the deployment’s <Btn>Settings</Btn> screen, and open
the <Btn>Configuration</Btn> tab. From this screen, you can set the minimum and
maximum number of Cube API instances for a deployment:

<Screenshot
alt="Cube Cloud Deployment Settings page showing auto-scaling configuration options"
src="https://ucarecdn.com/3b39c56f-d553-4612-b4f0-07084cc4b742/"
/>

## Sizing Cube Store workers

Cube Store Workers are used to build and persist pre-aggregations. Each Worker
has a **maximum of 150 GB** of storage; [additional Cube Store
workers][ref-limits] can be added to your deployment to both increase storage
space and improve pre-aggregation performance. A **minimum of 2** Cube Store
Workers is required for pre-aggregations; this can be adjusted. As a rough
estimate, you will need approximately 2 Cube Store Workers per 4 GB of
pre-aggregated data built per day.
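This sizing guideline can be sketched as simple arithmetic. The helper below is hypothetical, not an official sizing formula; validate the result against the Cube Store saturation charts in Performance Insights.

```javascript
// Roughly 2 Cube Store Workers per 4 GB of pre-aggregated data built
// per day, with the required minimum of 2 workers. Illustrative only.
function estimateCubeStoreWorkers(preAggregatedGbPerDay) {
  const workersPer4Gb = 2;
  const estimate = Math.ceil(preAggregatedGbPerDay / 4) * workersPer4Gb;
  return Math.max(2, estimate);
}

console.log(estimateCubeStoreWorkers(10)); // 10 GB/day -> 6 workers
```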

<InfoBox>

Idle workers will automatically hibernate after 10 minutes of inactivity, and
will not consume CCUs until they are resumed. Workers are resumed automatically
when Cube receives a query that should be accelerated by a pre-aggregation, or
when a scheduled refresh is triggered.

</InfoBox>

To change the number of Cube Store Workers in a deployment, go to the
deployment’s <Btn>Settings</Btn> screen, and open the <Btn>Configuration</Btn>
tab. From this screen, you can set the number of Cube Store Workers from the
dropdown:

<Screenshot
alt="Cube Cloud Deployment Settings page showing auto-scaling configuration options"
src="https://ucarecdn.com/3b39c56f-d553-4612-b4f0-07084cc4b742/"
/>


[ref-limits]: /product/deployment/cloud/limits#resources
2 changes: 1 addition & 1 deletion docs/pages/product/workspace/_meta.js
@@ -8,7 +8,7 @@ module.exports = {
"pre-aggregations": "Pre-Aggregations",
"performance": "Performance Insights",
"access-control": "Access Control",
"sso": "Single Sign-On",
"sso": "Single Sign-on",
"budgets": "Budgets",
"preferences": "Preferences",
"cli": "CLI"
150 changes: 113 additions & 37 deletions docs/pages/product/workspace/performance.mdx
@@ -1,20 +1,16 @@
# Performance Insights

<WarningBox>

This page is work-in-progress.

</WarningBox>

The&nbsp;<Btn>Performance</Btn> page in Cube Cloud displays charts that help
analyze the performance of your deployment and fine-tune its configuration.
It's recommended to review Performance Insights when your workload changes
or when you face performance-related issues with your deployment.

<SuccessBox>

Performance Insights are available in Cube Cloud on
[all tiers](https://cube.dev/pricing).
Performance Insights are available in Cube Cloud on [Premium and above
tiers](https://cube.dev/pricing). Please contact us through the in-product
chat or check with your dedicated CSM to enable Performance Insights in
your account.

</SuccessBox>

@@ -25,60 +21,140 @@ Charts provide insights into different aspects of your deployment.
### API instances

The&nbsp;<Btn>API instances</Btn> chart shows the number of API instances
that served queries to the deployment over time.
that served queries to the deployment.

You can use this chart to **fine-tune the
[auto-scaling][ref-scalability-api] configuration of API instances**, e.g.,
increase the minimum and maximum number of API instances.

For example, the following chart shows a deployment with sensible auto-scaling
limits that don't need adjusting. The deployment only needs to sustain a few
infrequent load bursts per day, and auto-scaling to 3 API instances handles
them just fine:

<Screenshot src="https://ucarecdn.com/71de8978-8d3f-42cd-a32f-03daa73ad561/"/>

The next chart shows a deployment with auto-scaling limits that clearly need
adjustment. The load is so high that this deployment has to use at least
4-6 API instances most of the time, so it would be wise to increase the
minimum auto-scaling limit to 6 API instances:

<Screenshot src="https://ucarecdn.com/e5c074b0-e4d4-442e-af48-e50ec0f61963/"/>

When in doubt, consider using a higher minimum auto-scaling limit: when an
additional API instance starts, it needs some time to compile the data model
before it can serve requests. Over-provisioning API instances with a higher
minimum auto-scaling limit reduces the number of requests that have to wait
for the [data model compilation](#data-model-compilation).

Also, you can use this chart to **fine-tune the
[auto-suspension][ref-auto-sus] configuration**, e.g., by turning
auto-suspension off or increasing the auto-suspension threshold.
For example, the following chart shows a [Development
Instance][ref-dev-instance] deployment that is only accessed a few times
a day and automatically suspends after a short period of inactivity:

{/* TODO: Add screenshot */}
<Screenshot src="https://ucarecdn.com/9bf6760b-805c-413c-85fb-9402b48718cb/"/>

You can use this chart to fine-tune the auto-scaling configuration of API
instances, e.g., increase the minimum and maximum number of API instances.
The next chart shows a misconfigured [Production Cluster][ref-prod-cluster]
deployment that serves requests throughout the whole day but was configured
to auto-suspend with a very low threshold:

Also, you can use this chart to fine-tune the auto-suspension configuration,
e.g., by turning auto-suspension off or increasing the auto-suspension
threshold.
<Screenshot src="https://ucarecdn.com/2938ff51-0699-4f60-bba6-03a0132774f0/"/>

### Data sources
### Cache type

The&nbsp;<Btn>Requests by data source</Btn> chart shows the number of API
requests that were fulfilled by using cache or querying the upstream data
source over time. The&nbsp;<Btn>Avg. response time by data source</Btn>
shows the difference in the response time for
requests that hit the cache or go to the upstream data source.
The&nbsp;<Btn>Requests by cache type</Btn> chart shows the number of API
requests that were fulfilled by using pre-aggregations, in-memory cache,
or no cache (i.e., by querying the upstream data source). For example, the
following chart shows a deployment that fulfills about 50% of requests by
using pre-aggregations:

{/* TODO: Add screenshot (x2) */}
<Screenshot src="https://ucarecdn.com/fe784a74-edd5-44c0-803f-267237219b1d/"/>

The&nbsp;<Btn>Avg. response time by cache type</Btn> chart shows the difference
in the response time for requests that hit pre-aggregations, in-memory cache,
or no cache (i.e., the upstream data source). The next chart shows that
pre-aggregations usually provide sub-second response times while queries to
the data source take much longer:

<Screenshot src="https://ucarecdn.com/94ac15b6-a59c-4474-ba68-e07657d55d78/"/>

You can use these charts to see whether more requests could hit the cache
and get lower response times. In that case, **consider adding more
[pre-aggregations][ref-pre-aggregations] in Cube Store** or fine-tuning the
existing ones.
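For reference, here is a minimal sketch of what adding a pre-aggregation looks like in a JavaScript data model. The `orders` cube and all of its members are hypothetical; see the pre-aggregations documentation for the full syntax.

```javascript
// Hypothetical cube with a rollup pre-aggregation that lets Cube Store
// serve daily counts by status instead of querying the data source.
cube(`orders`, {
  sql_table: `orders`,

  measures: {
    count: { type: `count` },
  },

  dimensions: {
    status: { sql: `status`, type: `string` },
    created_at: { sql: `created_at`, type: `time` },
  },

  pre_aggregations: {
    orders_by_status: {
      measures: [CUBE.count],
      dimensions: [CUBE.status],
      time_dimension: CUBE.created_at,
      granularity: `day`,
    },
  },
});
```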

### Data model compilation

The&nbsp;<Btn>Requests by data model compilation</Btn> chart shows the
number of API requests that did or did not have to wait for the data model
compilation. The&nbsp;<Btn>Wait time for data model compilation</Btn> chart
compilation. For example, the following chart shows a deployment where only
a tiny fraction of requests require the data model to be compiled:

<Screenshot src="https://ucarecdn.com/022a6a71-121a-4b45-ba97-1b0fd2571556/"/>

The&nbsp;<Btn>Wait time for data model compilation</Btn> chart
shows the total time requests had to wait for the data model compilation.
The next chart shows that at certain points in time, requests had to wait
dozens of seconds while the data model was being compiled:

{/* TODO: Add screenshot */}
<Screenshot src="https://ucarecdn.com/520d7e4b-3838-48ae-b0aa-c988f588c3d7/"/>

You can use these charts to identify multitenancy misconfiguration,
fine-tune the auto-suspension configuration, or consider using a
multi-cluster deployment.
You can use these charts to **fine-tune the [auto-suspension][ref-auto-sus]
configuration** (e.g., turn it off or increase the threshold so that API
instances suspend less frequently), **identify [multitenancy][ref-multitenancy]
misconfiguration** (e.g., suboptimal bucketing via
[`context_to_app_id`][ref-context-to-app-id]), or
**consider using a [multi-cluster deployment][ref-multi-cluster]** to
distribute requests from different tenants across a number of Production
Cluster deployments.

### Cube Store

The&nbsp;<Btn>Saturation for queries by Cube Store workers</Btn> chart
shows if Cube Store workers are overloaded with serving **queries**. High
saturation for queries prevents Cube Store workers from fulfilling requests
and results in wait time displayed at the&nbsp;<Btn>Wait time for queries
by Cube Store workers</Btn> chart.
shows whether Cube Store workers are overloaded with serving **queries**.
High saturation for queries prevents Cube Store workers from fulfilling
requests and results in wait time displayed on the&nbsp;<Btn>Wait time for
queries by Cube Store workers</Btn> chart.

For example, the following chart shows a deployment that uses 4 Cube Store
workers and almost never lets them reach saturation, resulting in no wait
time for queries:

{/* TODO: Add screenshot */}
<Screenshot src="https://ucarecdn.com/9f33377e-ebf4-4227-9f49-a30b7f5bc04b/"/>

Similarly, the&nbsp;<Btn>Saturation for jobs by Cube Store workers</Btn>
and <Btn>Wait time for jobs by Cube Store workers</Btn> charts show whether
Cube Store workers are overloaded with serving **jobs**, i.e., building
pre-aggregations or performing internal tasks such as data compaction.

{/* TODO: Add screenshot */}
For example, the following chart shows a misconfigured deployment that uses
8 Cube Store workers and keeps them at full saturation for prolonged
intervals, resulting in huge wait times and, in the case of jobs, delayed
refresh of pre-aggregations:

<Screenshot src="https://ucarecdn.com/eb3f8897-5358-4e5b-8507-b10c122d6206/"/>

The next chart shows that oversaturated Cube Store workers might yield
hours of wait time for queries and jobs:

<Screenshot src="https://ucarecdn.com/14edcb1d-a22c-47f8-aef4-636c0d726fb2/"/>

You can use these charts to **fine-tune the [number of Cube Store
workers][ref-scalability-cube-store]** used by your deployment, e.g.,
increasing it until there's no saturation and no wait time for queries
and jobs.


You can use these charts to consider fine-tuning the number of Cube Store
workers used by your deployment.
[ref-scalability-api]: /product/deployment/cloud/scalability#auto-scaling-of-api-instances
[ref-scalability-cube-store]: /product/deployment/cloud/scalability#sizing-cube-store-workers
[ref-auto-sus]: /product/deployment/cloud/auto-suspension
[ref-dev-instance]: /product/deployment/cloud/deployment-types#development-instance
[ref-prod-cluster]: /product/deployment/cloud/deployment-types#production-cluster
[ref-multi-cluster]: /product/deployment/cloud/deployment-types#production-multi-cluster
[ref-pre-aggregations]: /product/caching/using-pre-aggregations
[ref-multitenancy]: /product/configuration/advanced/multitenancy
[ref-context-to-app-id]: /reference/configuration/config#context_to_app_id
2 changes: 1 addition & 1 deletion docs/pages/product/workspace/sso.mdx
@@ -3,7 +3,7 @@ redirect_from:
- /workspace/sso/
---

# Single Sign-On
# Single Sign-on

As an account administrator, you can manage how your team accesses Cube Cloud.
There are options to log in using email and password, a GitHub account, or a