update docs
villebro committed Mar 2, 2022
1 parent dd77329 commit 9c5f209
Showing 3 changed files with 21 additions and 20 deletions.
2 changes: 1 addition & 1 deletion UPDATING.md
@@ -26,7 +26,7 @@ assists people when migrating to a new version.

### Breaking Changes

- [18976](https://github.com/apache/superset/pull/18976): A new `DEFAULT_CACHE_CONFIG_FUNC` parameter has been introduced in `config.py` which makes it possible to define a default cache config that will be used as the basis for all cache configs. When running the app in debug mode, the app will default to use `SimpleCache`; in other cases the default cache type will be `NullCache`. In addition, `DEFAULT_CACHE_TIMEOUT` has been deprecated and moved into `DEFAULT_CACHE_CONFIG_FUNC` (will be removed in Superset 2.0). For installations using Redis or other caching backends, it is recommended to set the default cache options in `DEFAULT_CACHE_CONFIG_FUNC` to ensure the primary cache is always used if new caches are added.
- [18976](https://github.com/apache/superset/pull/18976): A new `DEFAULT_CACHE_CONFIG` parameter has been introduced in `config.py` which makes it possible to define a default cache config that will be used as the basis for all cache configs. When running the app in debug mode, the app will default to use `SimpleCache`; in other cases the default cache type will be `NullCache`. In addition, `DEFAULT_CACHE_TIMEOUT` has been deprecated and moved into `DEFAULT_CACHE_CONFIG` (will be removed in Superset 2.0). For installations using Redis or other caching backends, it is recommended to set the default cache options in `DEFAULT_CACHE_CONFIG` to ensure the primary cache is always used if new caches are added.
- [17881](https://github.com/apache/superset/pull/17881): Previously simple adhoc filter values on string columns were stripped of enclosing single and double quotes. To fully support literal quotes in filters, both single and double quotes will no longer be removed from filter values.
- [17984](https://github.com/apache/superset/pull/17984): The default Flask `SECRET_KEY` has changed for security reasons. You should always override it with your own secret. Set `PREVIOUS_SECRET_KEY` (e.g. `PREVIOUS_SECRET_KEY = "\2\1thisismyscretkey\1\2\\e\\y\\y\\h"`) to your previous key and run `superset re-encrypt-secrets` to rotate your current secrets.
- [15254](https://github.com/apache/superset/pull/15254): Previously `QUERY_COST_FORMATTERS_BY_ENGINE`, `SQL_VALIDATORS_BY_ENGINE` and `SCHEDULED_QUERIES` were expected to be defined in the feature flag dictionary in the `config.py` file. These should now be defined as a top-level config, with the feature flag dictionary being reserved for boolean only values.
37 changes: 18 additions & 19 deletions docs/docs/installation/cache.mdx
@@ -7,37 +7,36 @@ version: 1

## Caching

Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching purposes. For security reasons,
there are two separate cache configs for Superset's own metadata (`CACHE_CONFIG`) and charting data queried from
connected datasources (`DATA_CACHE_CONFIG`). However, query results from SQL Lab are stored in another backend
called `RESULTS_BACKEND`. See [Async Queries via Celery](/docs/installation/async-queries-celery) for details.

Configuring caching is as easy as providing `CACHE_CONFIG` and `DATA_CACHE_CONFIG` in your
Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching purposes. Default caching options
can be set by overriding `DEFAULT_CACHE_CONFIG` in your `superset_config.py`. Unless overridden, the default
cache type is `SimpleCache` when running in debug mode and `NullCache` otherwise.

There are currently five separate cache configurations, which provide additional security and more granular customization options:
- Metadata cache (optional): `CACHE_CONFIG`
- Charting data queried from datasets (optional): `DATA_CACHE_CONFIG`
- SQL Lab query results (optional): `RESULTS_BACKEND`. See [Async Queries via Celery](/docs/installation/async-queries-celery) for details
- Dashboard filter state (required): `FILTER_STATE_CACHE_CONFIG`
- Explore chart form data (required): `EXPLORE_FORM_DATA_CACHE_CONFIG`

Configuring caching is as easy as providing a custom cache config in your
`superset_config.py` that complies with [the Flask-Caching specifications](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching).
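
As a rough illustration of such a config, the sketch below defines four of the Flask-Caching based settings in a `superset_config.py`. The helper `redis_cache`, the key prefixes, the five-minute metadata timeout and the Redis URL are assumptions made for this example, not values prescribed by Superset. `RESULTS_BACKEND` is left out because it is configured differently (see the Celery page linked above).

```python
# superset_config.py (illustrative sketch, not an official template)
from datetime import timedelta
from typing import Any, Dict

# Assumption: a local Redis instance; replace with your own URL.
REDIS_URL = "redis://localhost:6379/0"


def redis_cache(prefix: str, timeout_seconds: int) -> Dict[str, Any]:
    """Hypothetical helper that builds a Flask-Caching config dict for Redis."""
    return {
        "CACHE_TYPE": "RedisCache",
        "CACHE_DEFAULT_TIMEOUT": timeout_seconds,
        "CACHE_KEY_PREFIX": prefix,
        "CACHE_REDIS_URL": REDIS_URL,
    }


# Optional caches
CACHE_CONFIG = redis_cache("superset_metadata_", 60 * 5)
DATA_CACHE_CONFIG = redis_cache("superset_data_", 60 * 60 * 24)

# Required caches
FILTER_STATE_CACHE_CONFIG = redis_cache(
    "superset_filter_", int(timedelta(days=90).total_seconds())
)
EXPLORE_FORM_DATA_CACHE_CONFIG = redis_cache(
    "superset_explore_", int(timedelta(days=7).total_seconds())
)
```

Using a distinct `CACHE_KEY_PREFIX` per cache keeps entries from colliding when all caches share one Redis database.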

Flask-Caching supports various caching backends, including Redis, Memcached, SimpleCache (in-memory), or the
local filesystem.
local filesystem. Custom cache backends are also supported. See [here](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends) for specifics.

Note that the Dashboard and Explore caches are required: configuring the application with either of these caches set to `NullCache` will
cause the application to fail on startup. Also keep in mind that when running Superset in a multi-worker setup, a dedicated cache is required.
For this we recommend running either Redis or Memcached:

- Redis (recommended): we recommend the [redis](https://pypi.python.org/pypi/redis) Python package
- Memcached: we recommend using [pylibmc](https://pypi.org/project/pylibmc/) client library as
`python-memcached` does not handle storing binary data correctly.
- Redis: we recommend the [redis](https://pypi.python.org/pypi/redis) Python package

Both of these libraries can be installed using pip.
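
The startup requirement described above can be pictured with a small check like the one below. This is a minimal sketch and not Superset's actual startup code: the function names are hypothetical, and treating a missing `CACHE_TYPE` as `NullCache` is an assumption based on the non-debug default mentioned earlier.

```python
from typing import Any, Dict, List

# The two caches the text marks as required.
REQUIRED_CACHE_KEYS = ("FILTER_STATE_CACHE_CONFIG", "EXPLORE_FORM_DATA_CACHE_CONFIG")
# Flask-Caching accepts both the class name and the legacy short name.
NULL_CACHE_TYPES = {"NullCache", "null"}


def find_disabled_required_caches(config: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the required cache keys whose cache type is a null cache."""
    disabled = []
    for key in REQUIRED_CACHE_KEYS:
        # Assumption: a missing CACHE_TYPE falls back to NullCache.
        cache_type = config.get(key, {}).get("CACHE_TYPE", "NullCache")
        if cache_type in NULL_CACHE_TYPES:
            disabled.append(key)
    return disabled


def check_required_caches(config: Dict[str, Dict[str, Any]]) -> None:
    """Raise if any required cache is effectively disabled."""
    disabled = find_disabled_required_caches(config)
    if disabled:
        raise ValueError(f"Required caches must not be NullCache: {disabled}")
```

Running a check like this against your own `superset_config.py` values before deploying can surface a misconfiguration earlier than a failed application start.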

For chart data, Superset goes up a “timeout search path”, from a slice's configuration
to the datasource’s, the database’s, then ultimately falls back to the global default
defined in `DATA_CACHE_CONFIG`.

```
DATA_CACHE_CONFIG = {
    'CACHE_TYPE': 'redis',
    'CACHE_DEFAULT_TIMEOUT': 60 * 60 * 24, # 1 day default (in secs)
    'CACHE_KEY_PREFIX': 'superset_results',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0',
}
```
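
The "timeout search path" for chart data can be sketched as a first-match lookup. The function and parameter names below are hypothetical, and the sketch assumes that an unset timeout is represented as `None`; how Superset treats special values is not covered here.

```python
from typing import Optional


def resolve_cache_timeout(
    slice_timeout: Optional[int],
    datasource_timeout: Optional[int],
    database_timeout: Optional[int],
    global_default: int,
) -> int:
    """Return the first explicitly set timeout, walking from the most
    specific level (slice) to the least specific (global default)."""
    for candidate in (slice_timeout, datasource_timeout, database_timeout):
        if candidate is not None:
            return candidate
    # Nothing set on the slice, datasource or database: use the value
    # that would come from DATA_CACHE_CONFIG["CACHE_DEFAULT_TIMEOUT"].
    return global_default


# The slice has no timeout, so the datasource's 10 minutes applies.
timeout = resolve_cache_timeout(None, 600, 3600, 60 * 60 * 24)
```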

Custom cache backends are also supported. See [here](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends) for specifics.

Superset has a Celery task that will periodically warm up the cache based on different strategies.
To use it, add the following to the `CELERYBEAT_SCHEDULE` section in `config.py`:
2 changes: 2 additions & 0 deletions superset/config.py
@@ -592,12 +592,14 @@ def _try_json_readsha(filepath: str, length: int) -> Optional[str]:
# Cache for filters state (will be merged with DEFAULT_CACHE_CONFIG)
FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    # should the timeout be reset when retrieving a cached value
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}

# Cache for chart form data (will be merged with DEFAULT_CACHE_CONFIG)
EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()),
    # should the timeout be reset when retrieving a cached value
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}
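
The comments above say each config "will be merged with DEFAULT_CACHE_CONFIG". One plausible reading is a shallow dict merge in which the specific config's keys take precedence. The sketch below illustrates that reading only; `merge_cache_config` is a hypothetical helper, the `DEFAULT_CACHE_CONFIG` values are made up for the example, and the actual merge logic in Superset may differ.

```python
from datetime import timedelta
from typing import Any, Dict

CacheConfig = Dict[str, Any]


def merge_cache_config(default: CacheConfig, specific: CacheConfig) -> CacheConfig:
    """Return a new dict where keys from `specific` override `default`."""
    return {**default, **specific}


# Assumed example defaults pointing every cache at Redis.
DEFAULT_CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_REDIS_URL": "redis://localhost:6379/0",
    "CACHE_DEFAULT_TIMEOUT": 300,
}

FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
}

# Backend settings come from the default; the timeout comes from the
# filter state config because it is the more specific of the two.
effective = merge_cache_config(DEFAULT_CACHE_CONFIG, FILTER_STATE_CACHE_CONFIG)
```

Under this reading, setting the backend once in `DEFAULT_CACHE_CONFIG` is enough for the required caches to use it, which matches the recommendation in `UPDATING.md` for Redis installations.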
