This repository contains code to install Apache Superset onto a Kubernetes cluster through Helm. Specifically, this implementation includes configurations that allow Ingress to expose an HTTP/HTTPS route with a domain name, and that enable OAuth authentication.
The components of this implementation are as follows:
- Google Cloud Platform (GCP): Google's broad suite of cloud computing services.
- Kubernetes (K8s): K8s is an open-source system for automating deployment, scaling, and management of containerised applications. K8s can be deployed standalone, or provisioned through a third-party cloud service provider, for example as Google Kubernetes Engine (GKE).
- Helm: A package manager for K8s, where each package is referred to as a "chart". Charts are an abstracted collection of .yaml K8s config files which would otherwise have to be manually created and maintained for every installation of a containerised application.
- Apache Superset: An open-source visualisation tool and SQL IDE that is built atop Flask. It is a robust tool with a large variety of visualisations and connections to different databases.
The best practice for configuration is for the key-value pairs within the <chart dir>/template/<filename>.yaml config files to dynamically reference a values.yaml file, using the Go templating engine under the hood. This section discusses configuring the release of the Superset Helm chart to use Ingress and OAuth.
Input your Google key and Google secret into this segment of the my-values.yaml file.
extraSecretEnv:
  GOOGLE_KEY: <insert key ending with .apps.googleusercontent.com>
  GOOGLE_SECRET: <insert secret>
The <chart dir>/template/ingress.yaml config file already has the necessary key-value pairs to utilise the Go templating engine for dynamic referencing. All that remains is to input your hostname into this segment of the my-values.yaml file.
ingress:
  enabled: true
  annotations:
    acme.cert-manager.io/http01-edit-in-place: "true"
    cert-manager.io/cluster-issuer: letsencrypt-prod
    cert-manager.io/issue-temporary-certificate: "true"
    kubernetes.io/ingress.class: nginx
    meta.helm.sh/release-name: superset
  path: /
  pathType: Prefix
  hosts:
    - <insert hostname>
  tls:
    - hosts:
        - <insert hostname>
Input your OAuth home domain into this segment of the my-values.yaml file.
extraEnv:
  OAUTH_HOME_DOMAIN: <insert OAuth home domain>
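The three environment variables above are only useful once Superset's Python configuration reads them. The repository's own override is not shown here, so the following is a minimal sketch of how a superset_config.py could consume them, modelled on the Google example in the Flask-AppBuilder documentation. The function name google_oauth_providers is hypothetical, and the Google endpoint URLs and the scope are assumptions that should be checked against your own setup.

```python
import os


def google_oauth_providers(env=os.environ):
    """Build a Flask-AppBuilder style OAUTH_PROVIDERS list from environment variables.

    `env` is any mapping; it defaults to the process environment, which is
    where extraSecretEnv and extraEnv values end up inside the Superset pods.
    """
    return [
        {
            "name": "google",
            "icon": "fa-google",
            "token_key": "access_token",
            "remote_app": {
                "client_id": env.get("GOOGLE_KEY", ""),
                "client_secret": env.get("GOOGLE_SECRET", ""),
                # Endpoint URLs below are assumptions taken from the
                # Flask-AppBuilder Google example.
                "api_base_url": "https://www.googleapis.com/oauth2/v2/",
                "client_kwargs": {"scope": "email profile"},
                "request_token_url": None,
                "access_token_url": "https://accounts.google.com/o/oauth2/token",
                "authorize_url": "https://accounts.google.com/o/oauth2/auth",
                # "hd" asks Google to limit sign-in to one hosted domain.
                "authorize_params": {"hd": env.get("OAUTH_HOME_DOMAIN", "")},
            },
        }
    ]


# In a real superset_config.py this would sit at module level, next to:
#   from flask_appbuilder.security.manager import AUTH_OAUTH
#   AUTH_TYPE = AUTH_OAUTH
OAUTH_PROVIDERS = google_oauth_providers()
```

Reading the values through a mapping rather than calling os.environ directly keeps the function easy to exercise with test values outside the cluster.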
Go to Mapbox's website and sign up for a free account to generate an API token. Input your API token into this segment of the my-values.yaml file.
extraSecretEnv:
  MAPBOX_API_KEY: <insert mapbox api key>
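Once the segments above have been filled in, it may be worth confirming that no `<insert ...>` placeholder is left in the values file before deploying. The sketch below is one way to do that with the Python standard library only; the helper names find_placeholders and check_file are hypothetical and not part of this repository.

```python
import re

# Matches the placeholder convention used in my-values.yaml, e.g. <insert secret>.
PLACEHOLDER = re.compile(r"<insert[^>]*>")


def find_placeholders(text):
    """Return (line_number, placeholder) pairs for every unfilled placeholder."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((number, match.group(0)))
    return hits


def check_file(path="my-values.yaml"):
    """Print each leftover placeholder in the file and return True if none remain."""
    with open(path, encoding="utf-8") as handle:
        hits = find_placeholders(handle.read())
    for number, placeholder in hits:
        print(f"{path}:{number}: unfilled {placeholder}")
    return not hits
```

Calling check_file() from the directory that holds my-values.yaml lists any remaining placeholders with their line numbers and returns False if at least one is found.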
Assuming you already have a working K8s cluster with Helm installed, execute the following command in your CLI.
helm upgrade superset superset/superset --install --values my-values.yaml --namespace <insert namespace>
The current deployment is connected to an Apache Spark SQL database server.
To connect to other databases, please refer to the official Apache Superset docs to check for PyPI dependencies and connection string format.
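As an illustration of what a connection string looks like, the Superset documentation gives the Apache Spark SQL format as hive://hive@{hostname}:{port}/{database}, used with the pyhive driver. The helper below is a small sketch that assembles such a URI; the function name spark_sql_uri is hypothetical, and the default port of 10000 is an assumption based on the usual Thrift server default, so check both against your own server.

```python
def spark_sql_uri(hostname, database, port=10000, username="hive"):
    """Assemble a SQLAlchemy URI in the hive://user@host:port/database form.

    The default port and username are assumptions; adjust them to match
    the Spark Thrift server you are connecting to.
    """
    return f"hive://{username}@{hostname}:{port}/{database}"


# Example with a made-up hostname:
uri = spark_sql_uri("spark.example.internal", "default")
```

The resulting string is what goes into the SQLAlchemy URI field when adding a database in the Superset UI.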
Go to the Dashboards list page and, in the "Actions" column of the dashboard to be exported, click the "Export" button. The dashboard, its charts and its datasets will be exported together.
5.1 Go to the Dashboards list page and, in the top right corner, click the "Import Dashboards" button. The dashboards, their charts and their datasets will be imported together.
5.2 Choose the file to be imported and click the "IMPORT" button. The dashboards will then be imported.
Roles can be configured in Settings > List Roles > +.
Superset comes with several predefined roles as described in the official documentation.
The "Gamma" role can be used as a basic template for new roles. Thereafter, datasource access can be granted to a role to limit it to specific datasets, which in turn restricts what users with that role can view in dashboards.
helm uninstall superset --namespace batch11-dataops-playground
- Database connection error message:
For a user-friendly guide on how to use Superset, you can refer to this post :)