This repository has been archived by the owner on May 26, 2020. It is now read-only.

Add Application metrics Section in User Docs #119

Closed
wants to merge 58 commits into from


sablumiah
Contributor

WHAT
Add a section on how to create application metrics.

WHY
This enables users to expose an application metrics endpoint, either through instrumentation or by using an exporter, and then create Prometheus alerts based on those metrics.

David Salgado and others added 6 commits February 28, 2019 12:33
This uses a much simpler example application, with
the minimum possible kubernetes configuration.
Some outdated values have been corrected (e.g. the
hostname at which the deployed application can be
accessed).
This may save the reader the hassle of following
the link to the ECR section and then coming back.
The user isn't configuring the cluster, just their
namespace. Similarly, we don't need to specify
that we're deploying the application 'to the
cluster'
@sablumiah sablumiah changed the title Add Application metrics section Add Application metrics Section in User Docs Mar 1, 2019
sablumiah and others added 5 commits March 1, 2019 15:54
Update the 'deploying an app.' walkthrough
This commit adds a brief explanation of how to
remove unneeded cluster resources that the user
created while working through sections of the
guide.

The 'Cleaning Up' section is added as the last
part of the 'Deploying an app' part of the user
docs.
Member

@jasonBirchall jasonBirchall left a comment

Can we move this to its own document?

@@ -93,6 +93,111 @@ kubectl -n <namespace> describe prometheusrules prometheus-custom-rules-<applica
## PrometheusRule examples
If you're struggling for ideas on how and which alerts to setup, please see some examples [here](https://github.com/ministryofjustice/cloud-platform-infrastructure/blob/master/terraform/cloud-platform-components/resources/prometheusrule-examples/application-alerts.yaml).

## Application Metrics
Member

Due to the volume of this entry, I would suggest this needs its own document.

@jennyd
Contributor

jennyd commented Mar 5, 2019

I think it's worth including a mention of an important consideration: if you're using a pre-forking web server (like unicorn or puma for Ruby, or gunicorn for Python) and have it configured to use multiple processes, then you need to use a Prometheus client library which supports exporting metrics from multiple processes. Not all the official clients do that - for example prometheus/client_ruby#9 has been open for a long time, although prometheus/client_ruby#95 looks promising so 🤞

If you don't use a library which supports this, then requests to /metrics could be served by any of the processes, each of which only knows about its own metrics, so Prometheus would see inconsistent data from one scrape to the next.
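
To illustrate the problem, here is a minimal sketch in plain Python. It is a simulation only and does not use any Prometheus client library: the `Worker` class, the `requests_total` counter, and the function names are all hypothetical stand-ins for forked server processes and their in-memory metrics.

```python
import random


class Worker:
    """Stands in for one forked server process with its own in-memory counter."""

    def __init__(self):
        self.requests_total = 0

    def handle_request(self):
        self.requests_total += 1


def simulate(num_workers=4, num_requests=1000, seed=0):
    """Spread incoming requests across workers, as a pre-forking server would."""
    rng = random.Random(seed)
    workers = [Worker() for _ in range(num_workers)]
    for _ in range(num_requests):
        rng.choice(workers).handle_request()
    return workers


def scrape_single_process(workers, rng):
    """A /metrics request answered by whichever worker happens to pick it up."""
    return rng.choice(workers).requests_total


def scrape_aggregated(workers):
    """What a multiprocess-aware client should report: the total across workers."""
    return sum(w.requests_total for w in workers)


if __name__ == "__main__":
    workers = simulate()
    rng = random.Random(1)
    # Each scrape may land on a different worker, so the values jump around.
    print("per-process scrapes:", [scrape_single_process(workers, rng) for _ in range(5)])
    # The aggregated value always equals the number of requests handled.
    print("aggregated scrape:", scrape_aggregated(workers))
```

Each per-process scrape reports only a fraction of the traffic, and which fraction depends on the worker that answers. Only the aggregated value matches the real request count, which is what a client library with multi-process support is meant to provide.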

David Salgado and others added 16 commits March 5, 2019 10:13
Our policy for the environments repo is for developers to create
branches with the changes they want, not forks.
Add a 'Cleaning Up' section to the user guide
* Add 'Step 4' (since we use 'Step 1-3' earlier)
* Use imperative mood, like the other steps
Describe the process of using `kubectl delete`
with a directory full of yaml files to delete all
named resources from the user's namespace using a
single command.
Explain `kubectl delete --filename` to delete
As I've had to run this a few times and have to keep googling it...
Note: the example text decodes to `Aladdin:open sesame` and is not
an actual secret
…mple

Add an example showing use of base64 to decode ECR secrets
For rails applications with force_ssl = true, probes need extra
http headers. Adding this explanation here will help developers
avoid problems with failing probes, without our help.
Rails 5 sends a 307, not a 301. We don't really care which
redirect we're avoiding.
Add description of using https headers for http liveness/readiness probes, where the application only responds to https.
If users are not in the MoJ/webops github team, they won't be able
to clone the helloworld demo app. via SSH.
Amend instructions to git clone via https, not ssh
People who are familiar with github pages will know how to go from

https://ministryofjustice.github.io/cloud-platform-user-docs

...to

https://github.com/ministryofjustice/cloud-platform-user-docs

For those who don't, this change provides instructions on how to
find the source repo and contribute to the guide.
David Salgado and others added 29 commits March 8, 2019 14:48
This makes things less confusing when using a
development jekyll server, while editing the
document. It also protects against the 
documentation repository being moved, in the 
future.
This requires approval from the cloud platform team, and takes a
while to apply once approved. So, it makes sense to tell the user
to do this first.
If the user tags their docker images with a
different version number (using the makefile in
the demo app), they will need to update the image
reference in the kubernetes deployment yaml files.
It's quite likely that someone will already have
deployed an instance of the demo app at

https://multi-container-demo.apps.cloud-platform-live-0.k8s.integration.dsd.io/

In that case, creating the ingress on the cluster
will fail. This warns the user to expect that
error.
…-app

Add a guide for deploying a multi-container app
Add instructions on how to contribute to the guide
This is a temporary measure to forewarn teams that live services
will be hosted on the live-1 cluster (although it's not quite
ready yet).
…ce/cloud-platform-user-docs into live0-deprecation-notice
Add a deprecation warning for the live-0 cluster
…ud-platform-user-docs into applicationmetrics
@sablumiah sablumiah closed this Mar 11, 2019
6 participants