The Product & Engineering Service Catalogue

Service Catalogue is a database of information from AWS, GitHub, Snyk, and other sources. It aims to provide a picture of the Guardian's estate, broken down by Product & Engineering (P&E) team.

Unlike Prism, which collects data from only a subset of AWS resources, Service Catalogue offers a more complete picture of production services, including resources that Prism doesn't know about.

Purpose

The Guardian has hundreds of EC2, Lambda, and other services in AWS, each built from one of thousands of GitHub repositories by one of many P&E teams.

Some of the questions Service Catalogue aims to answer include:

For P&E teams:
- Which services do I own?
- Which services follow DevX best practice/use tooling?
- Which repo do services come from?
- What is my service reliability? (time since last incident)

For the Developer Experience stream:
- What proportion of all services follow best practice/use tooling?
- What kinds of technologies are different streams using?
- Which teams are struggling with reliability and need more support?
- Which services belong to specific P&E product teams?

Pricing information is not yet available in Service Catalogue, so we're unable to answer questions such as:

- What does each service cost?
- What services are costing us the most money?

How does it work?

Service Catalogue has two parts:

- Data collection
- Data analysis

Data collection

We use CloudQuery to collect data from AWS, GitHub, Snyk, and other sources.

We've implemented CloudQuery as a set of ECS tasks, writing to a Postgres database. For more details, see CloudQuery implementation.
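
As a rough illustration of the collection step, the sketch below models a CloudQuery source and destination pair as TypeScript objects. It is not the project's actual configuration: the interfaces, the helper functions `postgresDestination` and `awsSource`, the placeholder version, the table name, and the connection string are all assumptions made for the example. The field names are based on the general shape of CloudQuery's plugin spec files and should be checked against the CloudQuery documentation.

```typescript
// Hypothetical, simplified shapes for CloudQuery source and destination specs.
interface DestinationSpec {
  kind: 'destination';
  spec: {
    name: string;
    path: string;
    version: string;
    spec: { connection_string: string };
  };
}

interface SourceSpec {
  kind: 'source';
  spec: {
    name: string;
    path: string;
    version: string;
    tables: string[];
    destinations: string[];
  };
}

// Placeholder only: pin this to whichever plugin version the project uses.
const PLACEHOLDER_VERSION = 'vX.Y.Z';

// Describes a Postgres destination that collection tasks write to.
function postgresDestination(connectionString: string): DestinationSpec {
  return {
    kind: 'destination',
    spec: {
      name: 'postgresql',
      path: 'cloudquery/postgresql',
      version: PLACEHOLDER_VERSION,
      spec: { connection_string: connectionString },
    },
  };
}

// Describes an AWS source that syncs the given tables into the destination.
function awsSource(tables: string[], destination: DestinationSpec): SourceSpec {
  return {
    kind: 'source',
    spec: {
      name: 'aws',
      path: 'cloudquery/aws',
      version: PLACEHOLDER_VERSION,
      tables,
      // A source refers to its destination by name.
      destinations: [destination.spec.name],
    },
  };
}

// Illustrative values; the connection string is a template, not a real one.
const destination = postgresDestination(
  'postgresql://<user>:<password>@<host>:5432/<database>',
);
const source = awsSource(['aws_lambda_functions'], destination);
```

In this sketch each source names the destination it writes to, which is one way several collection tasks could share a single Postgres database.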

Tip: To update CloudQuery, see Updating CloudQuery.

Data analysis

The data in Service Catalogue is analysed in two ways:

- Grafana, at https://metrics.gutools.co.uk
- AWS Lambda functions, for example RepoCop or data-audit
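
To make the Lambda-based analysis more concrete, here is a minimal sketch of the kind of summary such a function might compute: the proportion of repositories per team that follow best practice. The `RepoRow` shape, its column names, and the `summariseByTeam` function are hypothetical; they don't reflect the real Service Catalogue schema or RepoCop's implementation.

```typescript
// Hypothetical row shape: one row per repository, as a query against the
// Service Catalogue database might return it. Column names are illustrative.
interface RepoRow {
  repo: string;
  team: string;
  followsBestPractice: boolean;
}

interface TeamSummary {
  team: string;
  total: number;
  compliant: number;
  proportion: number; // compliant / total, between 0 and 1
}

// Groups rows by team and computes the share of compliant repositories.
function summariseByTeam(rows: RepoRow[]): TeamSummary[] {
  const counts = new Map<string, { total: number; compliant: number }>();
  for (const row of rows) {
    const entry = counts.get(row.team) ?? { total: 0, compliant: 0 };
    entry.total += 1;
    if (row.followsBestPractice) entry.compliant += 1;
    counts.set(row.team, entry);
  }
  // Every team in the map has at least one row, so total is never zero.
  return Array.from(counts.entries())
    .map(([team, { total, compliant }]) => ({
      team,
      total,
      compliant,
      proportion: compliant / total,
    }))
    .sort((a, b) => a.team.localeCompare(b.team));
}

// Made-up sample data for illustration.
const sample: RepoRow[] = [
  { repo: 'repo-a', team: 'team-one', followsBestPractice: true },
  { repo: 'repo-b', team: 'team-one', followsBestPractice: false },
  { repo: 'repo-c', team: 'team-two', followsBestPractice: true },
];

const summary = summariseByTeam(sample);
// summary[0] → { team: 'team-one', total: 2, compliant: 1, proportion: 0.5 }
// summary[1] → { team: 'team-two', total: 1, compliant: 1, proportion: 1 }
```

A per-team summary like this could feed a dashboard or a notification, but how the real Lambdas store and publish their results is described in their own READMEs.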

Support with production issues

Service Catalogue has a runbook that explains how to deal with common problems, respond to alerts, and perform useful operations such as triggering tasks manually.

How to run locally

Follow the instructions in the dev-environment README to run CloudQuery locally. Then follow the instructions in the repocop README to run RepoCop locally.

Architecture

The diagram below outlines the architecture of the major components of Service Catalogue.

flowchart TB
    DB[(Cloudquery Database)]
    snyk[Snyk Rest API]
    github[GitHub Rest API]
    cq[CloudQuery Batch Jobs]
    devxDev[Developer on the DevX team]
    dev[P&E Developer]
    repocop[Repocop Lambdas]
    aws[AWS APIs]

    snyk --> |Data from snyk populates Cloudquery tables|cq
    github --> |Data from dependabot populates Cloudquery tables|cq
    aws --> |Data from aws populates Cloudquery tables|cq
    cq --> |Cloudquery writes data to the DB|DB
    DB --> |1 - Cloudquery data is used to calculate departmental compliance with obligations|repocop
    repocop --> |2 - Repocop stores compliance information about repos as a table in the cloudquery DB|DB
    repocop --> |Repocop raises PRs to fix issues, that are reviewed by developers|dev
    Grafana --> |Compliance dashboards are used by DevX developers to track departmental progress towards obligations|devxDev
    repocop --> |Repocop sends notifications of events or warnings to teams via Anghammarad|Anghammarad
    Anghammarad --> |Anghammarad delivers messages to developers about changes to their systems|dev
    DB --> |Cloudquery data powers \n grafana dashboards|Grafana
    Grafana --> |Compliance dashboards are used by developers to track their team's progress towards obligations. They also have read access to raw cloudquery tables.|dev
