Revert "Revert "Doc updates for compression on continuous aggregate feature"" (github#773)

* Revert "Revert "Doc changes for compression on continuous aggregates feature (#666)" (github#730)"

This reverts commit ebb417c.

* Update timescaledb/how-to-guides/continuous-aggregates/compression-on-continuous-aggregates.md

Co-authored-by: Ryan Booz <ryan@timescale.com>

* Update api/add_compression_policy.md

Co-authored-by: Nuno Santos <nuno@timescale.com>

Co-authored-by: Ryan Booz <ryan@timescale.com>
Co-authored-by: Jacob Prall <prall.jacob@gmail.com>
Co-authored-by: Nuno Santos <nuno@timescale.com>
4 people authored Feb 23, 2022
1 parent 2bd0921 commit 473671a
Showing 7 changed files with 136 additions and 32 deletions.
30 changes: 23 additions & 7 deletions api/add_compression_policy.md
@@ -2,18 +2,21 @@
Allows you to set a policy by which the system compresses a chunk
automatically in the background after it reaches a given age.

Note that compression policies can only be created on hypertables that already
have compression enabled, e.g., via the [`ALTER TABLE`][compression_alter-table] command
to set `timescaledb.compress` and other configuration parameters.
Note that compression policies can only be created on hypertables or continuous
aggregates that already have compression enabled. Use the [`ALTER TABLE`][compression_alter-table] command
to set `timescaledb.compress` and other configuration parameters for hypertables.
Use the [`ALTER MATERIALIZED VIEW`][compression_continuous-aggregate] command to
enable compression on continuous aggregates.

### Required Arguments
### Required arguments

|Name|Type|Description|
|---|---|---|
| `hypertable` |REGCLASS| Name of the hypertable|
| `hypertable` |REGCLASS| Name of the hypertable or continuous aggregate|
| `compress_after` | INTERVAL or INTEGER | The age after which the policy job compresses chunks|

The `compress_after` parameter should be specified differently depending on the type of the time column of the hypertable:
The `compress_after` parameter should be specified differently depending
on the type of the time column of the hypertable or continuous aggregate:
- For hypertables with TIMESTAMP, TIMESTAMPTZ, and DATE time columns: the time interval should be an INTERVAL type.
- For hypertables with integer-based timestamps: the time interval should be an integer type (this requires
the [integer_now_func][set_integer_now_func] to be set).
@@ -24,7 +27,14 @@ the [integer_now_func][set_integer_now_func] to be set).
|---|---|---|
| `if_not_exists` | BOOLEAN | Setting to true causes the command to fail with a warning instead of an error if a compression policy already exists on the hypertable. Defaults to false.|

### Sample Usage
<highlight type="important">
Compression policies on continuous aggregates should be set up so that they do
not overlap with refresh policies on continuous aggregates. This is due to a
current TimescaleDB limitation that prevents refresh of compressed regions of
continuous aggregates.
</highlight>

### Sample usage
Add a policy to compress chunks older than 60 days on the 'cpu' hypertable.

``` sql
@@ -37,6 +47,12 @@ Add a compress chunks policy to a hypertable with an integer-based time column:
SELECT add_compression_policy('table_with_bigint_time', BIGINT '600000');
```

Add a policy to compress chunks of a continuous aggregate called `cpu_weekly` that are
older than eight weeks:
``` sql
SELECT add_compression_policy('cpu_weekly', INTERVAL '8 weeks');
```
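
As a minimal sketch of how the requirements above fit together for a continuous aggregate: compression has to be enabled before a policy can be added. This assumes a continuous aggregate named `cpu_weekly` already exists (the name and the eight-week interval are taken from the example above and are illustrative only).

```sql
-- Assumption: a continuous aggregate named cpu_weekly already exists.
-- Enable compression first; add_compression_policy requires it.
ALTER MATERIALIZED VIEW cpu_weekly SET (timescaledb.compress = true);

-- Then compress chunks older than eight weeks.
-- if_not_exists => true avoids an error if a policy is already present.
SELECT add_compression_policy('cpu_weekly', INTERVAL '8 weeks', if_not_exists => true);
```

Using `if_not_exists` here is a design choice that makes the snippet safe to rerun, for example from a migration script.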

[compression_alter-table]: /api/:currentVersion:/compression/alter_table_compression/
[compression_continuous-aggregate]: /api/:currentVersion:/continuous-aggregates/alter_materialized_view/
[set_integer_now_func]: /hypertable/set_integer_now_func
17 changes: 14 additions & 3 deletions api/alter_materialized_view.md
@@ -18,17 +18,28 @@ ALTER MATERIALIZED VIEW <view_name> SET ( timescaledb.<option> = <value> [, ...
|---|---|---|
| `<view_name>` | TEXT | Name (optionally schema-qualified) of continuous aggregate view to be created.|

### Sample Usage
### Options
|Name|Description|
|-|-|
|timescaledb.materialized_only|Enable and disable real time aggregation|
|timescaledb.compress|Enable and disable compression|

### Sample usage
To disable real-time aggregates for a
continuous aggregate:

```sql
ALTER MATERIALIZED VIEW contagg_view SET (timescaledb.materialized_only = true);
```

The only option that currently can be modified with `ALTER
MATERIALIZED VIEW` is `materialized_only`. The other options
To enable compression for a continuous aggregate:

```sql
ALTER MATERIALIZED VIEW contagg_view SET (timescaledb.compress = true);
```

The only options that currently can be modified with `ALTER
MATERIALIZED VIEW` are `materialized_only` and `compress`. The other options
`continuous` and `create_group_indexes` can only be set when creating
the continuous aggregate.

6 changes: 4 additions & 2 deletions api/continuous_aggregates.md
@@ -11,7 +11,8 @@ Get metadata and settings information for continuous aggregates.
|`view_schema` | TEXT | Schema for continuous aggregate view |
|`view_name` | TEXT | User supplied name for continuous aggregate view |
|`view_owner` | TEXT | Owner of the continuous aggregate view|
|`materialized_only` | BOOLEAN | Return only materialized data when querying the continuous aggregate view. |
|`materialized_only` | BOOLEAN | Return only materialized data when querying the continuous aggregate view|
|`compression_enabled` | BOOLEAN | Is compression enabled for the continuous aggregate view?|
|`materialization_hypertable_schema` | TEXT | Schema of the underlying materialization table|
|`materialization_hypertable_name` | TEXT | Name of the underlying materialization table|
|`view_definition` | TEXT | `SELECT` query for continuous aggregate view|
@@ -27,11 +28,12 @@ view_schema | public
view_name | contagg_view
view_owner | postgres
materialized_only | f
compression_enabled | f
materialization_hypertable_schema | _timescaledb_internal
materialization_hypertable_name | _materialized_hypertable_2
view_definition | SELECT foo.a, +
| COUNT(foo.b) AS countb +
| FROM foo +
| GROUP BY (time_bucket('1 day', foo.a)), foo.a;

```
```
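
To check only whether compression is enabled, the new `compression_enabled` column can be selected on its own. The following is a sketch that assumes the view documented here is `timescaledb_information.continuous_aggregates` and reuses the `contagg_view` name from the sample output.

```sql
-- Assumption: contagg_view is the continuous aggregate from the sample above.
SELECT view_name, compression_enabled
FROM timescaledb_information.continuous_aggregates
WHERE view_name = 'contagg_view';
```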
12 changes: 8 additions & 4 deletions api/remove_compression_policy.md
@@ -4,11 +4,10 @@ If you need to remove the compression policy. To re-start policy-based compressi
### Required Arguments

|Name|Type|Description|
|---|---|---|
| `hypertable` | REGCLASS | Name of the hypertable the policy should be removed from.|

### Optional Arguments
|-|-|-|
|`hypertable`|REGCLASS|Name of the hypertable or continuous aggregate the policy should be removed from|

### Optional arguments
|Name|Type|Description|
|---|---|---|
| `if_exists` | BOOLEAN | Setting to true causes the command to fail with a notice instead of an error if a compression policy does not exist on the hypertable. Defaults to false.|
@@ -18,3 +17,8 @@ Remove the compression policy from the 'cpu' table:
``` sql
SELECT remove_compression_policy('cpu');
```

Remove the compression policy from the 'cpu_weekly' continuous aggregate:
``` sql
SELECT remove_compression_policy('cpu_weekly');
```
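
A hedged variant of the call above that uses the optional `if_exists` argument from the table, so the statement does not raise an error when no compression policy is present. The `cpu_weekly` name is the illustrative continuous aggregate from the previous example.

```sql
-- Assumption: cpu_weekly is the continuous aggregate from the example above.
-- if_exists => true avoids an error if no compression policy exists.
SELECT remove_compression_policy('cpu_weekly', if_exists => true);
```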
@@ -0,0 +1,63 @@
# Compression on continuous aggregates
Continuous aggregates are often used to store downsampled historical data.
The historical data is almost never modified or recomputed and is only used
for serving analytic queries. For this use case, it is often beneficial to
store the materialized data in compressed form to save on storage costs.
You can get these cost savings by enabling compression on continuous
aggregates.

Currently, TimescaleDB does not support refreshing compressed regions of a
continuous aggregate. To do this, you have to manually decompress
the compressed chunk and then execute a `refresh_continuous_aggregate` call.

## Enable compression on continuous aggregates
You can enable and disable compression on continuous aggregates by setting the
`compress` parameter when you alter the view.

<procedure>

### Enabling and disabling compression on continuous aggregates
1. For an existing continuous aggregate, at the `psql` prompt, enable
compression:
```sql
ALTER MATERIALIZED VIEW cagg_name set (timescaledb.compress = true);
```
1. Disable compression:
```sql
ALTER MATERIALIZED VIEW cagg_name set (timescaledb.compress = false);
```
</procedure>
The command to disable compression fails if there are compressed chunks associated with the
continuous aggregate. In this case, you need to decompress the chunks, and then
drop any compression policy on the continuous aggregate, before you disable
compression. For more detailed information, see the
[decompress chunks][decompress-chunks] section:
```sql
SELECT decompress_chunk(c, true) FROM show_chunks('cagg_name') c;
```

## Compression policies on continuous aggregates
Before setting up a compression policy on a continuous aggregate, you should
set up a refresh policy. The compression policy interval should be set so that
actively refreshed regions are not compressed. This is to prevent refresh
policies from failing. For example, consider a refresh policy like this:
```sql
SELECT add_continuous_aggregate_policy('cagg_name', start_offset => INTERVAL '30 days', end_offset => INTERVAL '1 day', schedule_interval => INTERVAL '1 hour');
```

With this kind of refresh policy, the `compress_after` parameter of the compression policy
must be greater than the `start_offset` parameter of the continuous aggregate policy:
```sql
SELECT add_compression_policy('cagg_name', compress_after=>'45 days'::interval);
```

After a chunk is compressed, manual refresh calls that attempt to refresh the
continuous aggregate's compressed region will fail with an error like this:

```sql
CALL refresh_continuous_aggregate('cagg_name', NULL, now() - '30 days'::interval );
ERROR: cannot update/delete rows from chunk "_hyper_3_3_chunk" as it is compressed
```
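
If a compressed region does need to be refreshed, the manual workaround described at the top of this page can be sketched as follows. This is a minimal sketch, not a definitive procedure: `cagg_name` is the placeholder name used throughout this page, and the sketch decompresses every chunk of the continuous aggregate for simplicity. An active compression policy could compress the chunks again later, so the policy may need to be removed or adjusted first.

```sql
-- Assumption: cagg_name is a continuous aggregate with compressed chunks.
-- Decompress all of its chunks; passing true as the second argument
-- avoids an error for chunks that are not compressed.
SELECT decompress_chunk(c, true) FROM show_chunks('cagg_name') c;

-- The refresh that previously failed can now be attempted again.
CALL refresh_continuous_aggregate('cagg_name', NULL, now() - INTERVAL '30 days');
```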

[decompress-chunks]: how-to-guides/compression/decompress-chunks.md
2 changes: 2 additions & 0 deletions timescaledb/how-to-guides/continuous-aggregates/index.md
@@ -13,6 +13,7 @@ only the data that has changed needs to be computed, not the entire dataset.
* [Drop data][cagg-drop] from your continuous aggregates.
* [Manage materialized hypertables][cagg-mat-hypertables].
* [Use real-time aggregates][cagg-realtime].
* [Compression with continuous aggregates][cagg-compression].
* [Troubleshoot][cagg-tshoot] continuous aggregates.


@@ -24,4 +25,5 @@ only the data that has changed needs to be computed, not the entire dataset.
[cagg-drop]: /how-to-guides/continuous-aggregates/drop-data
[cagg-mat-hypertables]: /how-to-guides/continuous-aggregates/materialized-hypertables
[cagg-realtime]: /how-to-guides/continuous-aggregates/real-time-aggregates
[cagg-compression]: /how-to-guides/continuous-aggregates/compression-on-continuous-aggregates
[cagg-tshoot]: /how-to-guides/continuous-aggregates/troubleshooting
38 changes: 22 additions & 16 deletions timescaledb/how-to-guides/continuous-aggregates/troubleshooting.md
@@ -12,13 +12,13 @@ with continuous aggregates.
* Copy this comment at the top of every troubleshooting page
-->

## Compression policies
## Retention policies
If you have hypertables that use a different retention policy to your continuous
aggregates, the retention policies are applied separately. The retention policy
on a hypertable determines how long the raw data is kept for. The retention
policy on a continuous aggregate determines how long the continuous aggregate is
kept for. For example, if you have a hypertable with a retention policy of a
week, but a continuous aggregate with a retention policy of a month, the raw
week and a continuous aggregate with a retention policy of a month, the raw
data is kept for a week, and the continuous aggregate is kept for a month.

## Insert irregular data into a continuous aggregate
@@ -50,13 +50,14 @@ be hard to refresh and would make more sense to isolate these columns in another
hypertable. Alternatively, you might create one hypertable per metric and
refresh them independently.

### New data is not shown in real-time aggregates
### Updates to previously materialized regions are not shown in real-time aggregates
If you have a time bucket that has already been materialized, the real-time
aggregate won't show the data that has been inserted, updated, or deleted. In
this worked example, `refresh_continuous_aggregate()` is called for the data
that is not going to change. When you need to change data that has already been
materialized, use `refresh_continuous_aggregate()` for the corresponding
buckets.
aggregate does not show data that has been inserted into, updated in, or deleted
from that bucket until the next `refresh_continuous_aggregate` call is executed.
The continuous aggregate is refreshed either when you manually call
`refresh_continuous_aggregate` or when a continuous aggregate policy is executed.
This worked example shows the expected behavior of continuous aggregates when
real-time aggregation is enabled.

Create and fill the hypertable:
```sql
@@ -87,7 +88,8 @@ INSERT INTO conditions (day, city, temperature) VALUES
('2021-06-27', 'Moscow', 31);
```

Create a real-time aggregate, but don't refresh the data:
Create a continuous aggregate but do not materialize any data. Note that real
time aggregation is enabled by default:
```sql
CREATE MATERIALIZED VIEW conditions_summary
WITH (timescaledb.continuous) AS
@@ -99,18 +101,21 @@ FROM conditions
GROUP BY city, bucket
WITH NO DATA;

-- The select query returns data because real-time aggregates are enabled. The query on
-- the continuous aggregate fetches data directly from the hypertable:
SELECT * FROM conditions_summary ORDER BY bucket;
city | bucket | min | max
--------+------------+-----+-----
Moscow | 2021-06-14 | 22 | 30
Moscow | 2021-06-21 | 31 | 34
```

Refresh the data:
Materialize data into the continuous aggregate:
```sql
CALL refresh_continuous_aggregate('conditions_summary', '2021-06-14', '2021-06-21');

-- The CAGG didn't change, that's expected
-- The select query returns the same data, as expected, but this time the data is
-- fetched from the underlying materialized table:
SELECT * FROM conditions_summary ORDER BY bucket;
city | bucket | min | max
--------+------------+-----+-----
@@ -125,8 +130,9 @@ SET temperature = 35
WHERE day = '2021-06-14' and city = 'Moscow';
```

The updated data is not yet visible in the continuous aggregate. Additionally,
INSERT and DELETE are not visible:
The updated data is not yet visible when you query the continuous aggregate. This
is because these changes have not been materialized. (Similarly, any
INSERTs or DELETEs would also not be visible.)
```sql
SELECT * FROM conditions_summary ORDER BY bucket;
city | bucket | min | max
@@ -135,7 +141,7 @@ SELECT * FROM conditions_summary ORDER BY bucket;
Moscow | 2021-06-21 | 31 | 34
```

Refresh the data again to see the updates:
Refresh the data again to update the previously materialized region:
```sql
CALL refresh_continuous_aggregate('conditions_summary', '2021-06-14', '2021-06-21');

@@ -159,8 +165,8 @@ aggregates like `SUM` and `AVG`. You can also use more complex expressions on
top of the aggregate functions, for example `max(temperature)-min(temperature)`.

However, aggregates using `ORDER BY` and `DISTINCT` cannot be used with
continuous aggregates since they are not possible to parallelize with
PostgreSQL. TimescaleDB does not currently support `FILTER` or `JOIN` clauses,
continuous aggregates since they cannot be parallelized with
PostgreSQL. TimescaleDB does not support `FILTER` or `JOIN` clauses,
or window functions in continuous aggregates.

[postgres-parallel-agg]: https://www.postgresql.org/docs/current/parallel-plans.html#PARALLEL-AGGREGATION