Velodrome is the dashboard, monitoring and metrics for Kubernetes Developer Productivity. It is hosted at:
It is comprised of three components:
- Grafana stack is the front-end website where users can visualize the metrics along with the back-end databases used to print those metrics. It has:
- an InfluxDB (a time-series database) to save precalculated metrics,
- a Prometheus instance to save poll-based metrics (more monitoring based)
- a Grafana instance to display graphs based on these metrics
- and an nginx to proxy all of these services in a single URL.
- A SQL Database containing a copy of the issues, events, and PRs in Github repositories. It is used for calculating statistics about developer productivity. It has the following components:
- Fetcher: fetches Github data and stores in a SQL database
- SQL Proxy: SQL Proxy deployment to Cloud SQL
- Transform: Transform SQL (Github db) into valuable metrics
- Other monitoring tools, only one for the moment:
- token-counter: Monitors RateLimit usage of your github tokens
Here is how the github statistics are communicating between each other:
=> pulls from
-> pushes to
* External components
Github* <= Fetcher -> Cloud SQL* <= Transform -> InfluxDb
One can set-up monitoring components in two different ways:
-
Push data directly into InfluxDb. Influx uses a SQL-like syntax and receives that data (there is no scraping). If you have events that you would like to push from time to time rather than reporting a current status, you should push to InfluxDB. Examples: build time, test time, etc ...
-
Data can be polled on a regular interval by Prometheus. Prometheus will scrape the data and measure the current state of something. This is much more useful for monitoring as you can see what is the health of a service at a given time.
As an example, the token counter measures the usage of our github-tokens, and has a new value every hour. We can push the new value to InfluxDB.
To disambiguate how each word is used, let's give a description of the naming convention used by velodrome:
- Organization: This has the same meaning as the Github Organization. This is
holding multiple repositories. e.g. In
github.com/istio/manager
, the organization isistio
. - Repository can be either the last part of the github repository URL (i.e. in
github.com/istio/manager
, it would bemanager
), or the fully qualified repository name:istio/manager
. - Project: A project describe a completely hermetic instance of the website for
a given team. A project can span across multiple organizations and multiple
repositories. e.g. The kubernetes project is made of repositories in the
kubernetes
organization, andkubernetes-incubator
.
Adding a new project is as simple as adding it to config.yaml.
Typically, add the name of your project, the list of repositories. Don't worry
about the public-ip
field as the IP will be created later. You can also leave
prometheus configuration if you don't need it initially.
There are new project specific deployments necessary, and they are described below.
config.py will generate all the deployments file for you. It reads
the configuration in config.yaml to generate deployments for each
project and/or repositories with proper labels. You can then use kubectl
labels help you select what you want to do exactly, for example:
./config.py # Generates the configuration and prints it on stdout
./config.py | kubectl apply -f - # Creates/Updates everything
./config.py | kubectl delete -f - # Deletes everything
./config.py | kubectl apply -f - -l project=kubernetes # Only creates/updates kubernetes
./config.py | kubectl apply -f - -l app=fetcher # Only creates/updates fetcher
- Make sure you create the secrets for SQL Proxy
- Make sure your github tokens are also in a secret:
kubectl create secret generic github-tokens --from-file=${TOKEN_FILE_1} --from-file=${TOKEN_FILE_2}
- Create secret for InfluxDB
- Deploy everything:
./config.py | kubectl apply -f - -l project=${NEW_PROJECT_NAME}
- Once the kubernetes service has its public IP, connect to the grafana instance, and add the default dashboard, star the dashboard, set-it as the default dashboard in the org preference.
- Set the static IP in the GCP project, and update
config.yaml
with its value. Potentially create a domain-name pointing to it.