feat(telemetry): instrument rafiki #2299

Merged
merged 67 commits into main from 1921/telemetry-instrument-rafiki
Feb 22, 2024

Conversation

beniaminmunteanu
Member

@beniaminmunteanu beniaminmunteanu commented Dec 23, 2023

Changes proposed in this pull request

This PR introduces telemetry into Rafiki to observe and measure the growth and activities of the ILP Network. This will provide valuable insights for future improvements and developments. The implementation includes the following key features:

  • Rafiki instances are instrumented with OpenTelemetry to adhere to a standard format and to allow integrating account servicing entities (ASEs) to build their own telemetry.
  • Instrumented Rafiki communicates with an OpenTelemetry collector over the OpenTelemetry Protocol (OTLP) via gRPC.
  • Rafiki can send metrics to multiple OpenTelemetry collectors, defined in the .env file.
  • Telemetry is entirely optional and controlled through an environment variable.
  • Two counter metrics are introduced: one for counting the number of transactions and another for counting amounts being sent (packet level). These metrics are used to answer three key questions about the network:
    • What is the average amount that a transaction holds?
    • What is the total number of transactions?
    • How much money is being sent through the network?
  • A centralized Telemetry service is introduced, which includes a new RatesService. The RatesService reuses the existing Rafiki Rates service but queries a different URL for rates and uses a different cache TTL. This rates URL is also hosted on the AWS infrastructure.
  • The telemetry data collected by the OTEL collector is forwarded to Amazon Managed Service for Prometheus (AMP) and then used in Grafana Cloud to build our "Network insights" dashboards. See the public dashboard here
  • Privacy considerations have been a major focus of this PR. Full documentation of our privacy solution is included as an .md file, which describes how we implement Local Differential Privacy (LDP).
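
As a rough illustration of how the two counters relate to the three questions listed above, here is a minimal TypeScript sketch. It is not the code from this PR: the `Counter` class is an in-memory stand-in for an OpenTelemetry counter, and the names `transactionsTotal`, `transactionsAmount`, `recordTransaction`, `recordPacketAmount`, and `networkSummary` are hypothetical. It also assumes that amounts have already been converted to a single base currency before they are recorded.

```typescript
// In-memory stand-in for a monotonic counter (hypothetical; the PR itself
// uses OpenTelemetry counters that are exported to a collector).
class Counter {
  private total = 0

  add(value: number): void {
    if (value < 0) {
      throw new Error('A counter can only increase')
    }
    this.total += value
  }

  get value(): number {
    return this.total
  }
}

// Hypothetical metric names, chosen for illustration only.
const transactionsTotal = new Counter()
const transactionsAmount = new Counter()

// Count one transaction.
function recordTransaction(): void {
  transactionsTotal.add(1)
}

// Add the amount carried by one packet (assumed to be in the base currency).
function recordPacketAmount(amount: number): void {
  transactionsAmount.add(amount)
}

// Derive the three network-level answers from the two counters.
function networkSummary(): { count: number; volume: number; average: number } {
  const count = transactionsTotal.value
  const volume = transactionsAmount.value
  return { count, volume, average: count === 0 ? 0 : volume / count }
}

// First transaction, split over two packets.
recordTransaction()
recordPacketAmount(10)
recordPacketAmount(30)

// Second transaction, sent as a single packet.
recordTransaction()
recordPacketAmount(20)

console.log(networkSummary()) // { count: 2, volume: 60, average: 30 }
```

Because both metrics are plain counters, the average is derived at query time (total amount divided by total count) rather than stored as a third metric.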

Infrastructure

The infrastructure for this feature is built on AWS. We are using AWS ECS Fargate, with a cluster of custom ADOT (AWS Distro for OpenTelemetry) collectors. There is also a Network Load Balancer (NLB) in AWS that sits in front of the cluster and load balances the multiple ECS otel collector tasks. All these tasks then send data further downstream to Prometheus. The infrastructure and deployed resource details are documented separately to keep this PR focused on the Rafiki changes.

Context

Full telemetry DOCS

The goal of this PR is to provide a foundation for observing the ILP Network at the money transfer level. By instrumenting Rafiki with OpenTelemetry, we can collect data that will help us understand the network's growth and make informed decisions about future developments. The telemetry data is sent to a cluster of OpenTelemetry collectors in AWS, but the system is designed so that integrating ASEs can run their own collectors and build their own telemetry solutions.
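
To make the "several collectors" idea concrete, the sketch below fans one batch of metric points out to every configured collector and reports which deliveries succeeded. It is a simplified illustration under stated assumptions, not the mechanism used in this PR: the `MetricPoint` and `Exporter` shapes, `exportToAll`, `fakeExporter`, and the example URLs are all hypothetical, and the exporters here are in-memory stand-ins rather than real gRPC clients.

```typescript
// Shape of a metric data point in this sketch (hypothetical).
interface MetricPoint {
  name: string
  value: number
}

// Anything that can ship a batch of points to one collector.
interface Exporter {
  url: string
  export(points: MetricPoint[]): Promise<void>
}

// Send the same batch to every collector. Each failure is caught per
// exporter, so one unreachable collector does not block the others.
async function exportToAll(
  exporters: Exporter[],
  points: MetricPoint[]
): Promise<{ url: string; ok: boolean }[]> {
  return Promise.all(
    exporters.map(async (exporter) => {
      try {
        await exporter.export(points)
        return { url: exporter.url, ok: true }
      } catch {
        return { url: exporter.url, ok: false }
      }
    })
  )
}

// In-memory exporter standing in for a real collector client.
function fakeExporter(
  url: string,
  fail = false
): Exporter & { received: MetricPoint[] } {
  const received: MetricPoint[] = []
  return {
    url,
    received,
    async export(points: MetricPoint[]): Promise<void> {
      if (fail) {
        throw new Error(`collector ${url} unreachable`)
      }
      received.push(...points)
    }
  }
}

// One shared collector plus an ASE-run collector that happens to be down.
const sharedCollector = fakeExporter('http://collector.example:4317')
const aseCollector = fakeExporter('http://localhost:4317', true)

exportToAll(
  [sharedCollector, aseCollector],
  [{ name: 'transactions_total', value: 1 }]
).then((results) => console.log(results))
```

Isolating failures per collector means an ASE's own collector being offline would not affect delivery to the shared one, and vice versa.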

Testing

To test the telemetry feature, run the pnpm localenv:compose up command and everything will be set up for you. If you use the Rafiki peer-to-peer and cross-currency examples in Postman, you should see your metrics reflected on the dashboard after a few minutes.

You can also check out our tests in packages/backend/src/payment-method/ilp/connector/core/test/middleware/telemetry.test.ts and packages/backend/src/telemetry/meter.test.ts

closes #1913 #1923 #1921 #1956

Checklist

  • Related issues linked using fixes #number
  • Tests added/updated
  • Documentation added
  • Make sure that all checks pass
  • Postman collection updated


netlify bot commented Dec 23, 2023

Deploy Preview for brilliant-pasca-3e80ec ready!

Name Link
🔨 Latest commit 8135aac
🔍 Latest deploy log https://app.netlify.com/sites/brilliant-pasca-3e80ec/deploys/65d73113bfb113000824061d
😎 Deploy Preview https://deploy-preview-2299--brilliant-pasca-3e80ec.netlify.app/telemetry/integrating

@github-actions github-actions bot added type: tests Testing related pkg: backend Changes in the backend package. type: source Changes business logic type: localenv Local playground type: documentation (archived) Improvements or additions to documentation labels Dec 23, 2023
@beniaminmunteanu beniaminmunteanu self-assigned this Dec 25, 2023
@github-actions github-actions bot added the pkg: documentation Changes in the documentation package. label Jan 9, 2024
localenv/collector/otel-collector-config.yaml (outdated, resolved)
packages/backend/src/config/app.ts (outdated, resolved)
packages/backend/src/index.ts (outdated, resolved)
Contributor

@mkurapov mkurapov left a comment

Looking solid, thank you for adding those docs!

packages/backend/src/accounting/psql/service.ts (outdated, resolved)
localenv/cloud-nine-wallet/docker-compose.yml (outdated, resolved)
packages/backend/src/telemetry/transaction-amount.ts (outdated, resolved)
packages/backend/src/telemetry/mocks.ts (outdated, resolved)
packages/backend/src/telemetry/transaction-amount.ts (outdated, resolved)
packages/backend/src/telemetry/transaction-amount.ts (outdated, resolved)
@beniaminmunteanu beniaminmunteanu marked this pull request as ready for review January 30, 2024 06:20
@beniaminmunteanu
Member Author

beniaminmunteanu commented Jan 30, 2024

new docs section added for integrating your own otel collector
Please have a look.

Contributor

@mkurapov mkurapov left a comment

Will get back to this!

@njlie
Contributor

njlie commented Jan 30, 2024

How does one get access to the Grafana dashboard? I tried signing in with an account created using my Github account and I wasn't able to see it.

@beniaminmunteanu
Member Author

beniaminmunteanu commented Jan 31, 2024

How does one get access to the Grafana dashboard? I tried signing in with an account created using my Github account and I wasn't able to see it.

Either Sarah or I need to add your email to the Grafana Cloud instance. Unfortunately, since AMP wasn't a good fit for us, we can't use the existing AWS accounts.

Our subscription is currently limited to 3 users, so I've added just you for now (nathan@interledger.org) in case you want to see the Grafana admin side. @JoblersTune and I already take up the other 2 seats.

Another way to see what it looks like is through the public dashboard.
If you check out the branch locally and use the local playground to make transactions, you should see the activity on the public dashboard.

packages/backend/src/telemetry/transaction-amount.ts (outdated, resolved)
packages/backend/src/telemetry/service.ts (outdated, resolved)
packages/backend/src/telemetry/transaction-amount.ts (outdated, resolved)
packages/backend/src/app.ts (outdated, resolved)
packages/backend/src/tests/telemetry.ts (outdated, resolved)
packages/backend/src/telemetry/privacy.test.ts (outdated, resolved)
packages/backend/src/telemetry/service.ts (outdated, resolved)
mkurapov previously approved these changes Feb 15, 2024
Contributor

@mkurapov mkurapov left a comment

I think anything further can be added in a separate PR

packages/backend/src/tests/telemetry.ts (outdated, resolved)
mkurapov previously approved these changes Feb 15, 2024
@sabineschaller
Member

Tested it and works smoothly. Final question: Should this work on localenv?
@beniaminmunteanu @JoblersTune @AlexLakatos @mkurapov

@mkurapov
Contributor

@sabineschaller you mean enabled by default on localenv, correct?

I think it'll be good to have it on at rafiki.money (of course) but I don't think that would be particularly useful on localenv

@sabineschaller
Member

@mkurapov that's what I thought, too. It should only be enabled in production.

@beniaminmunteanu
Member Author

@sabineschaller @mkurapov
Updated such that telemetry is still enabled by default, but localenv docker-compose files turn it off.
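
For readers following along, the "enabled by default, switched off in localenv" behaviour could look roughly like the sketch below. This is an assumption-laden illustration rather than the actual config code in this PR: the variable names `ENABLE_TELEMETRY` and `TELEMETRY_COLLECTOR_URLS`, the `TelemetryConfig` shape, and `loadTelemetryConfig` are hypothetical.

```typescript
// Resolved telemetry settings (hypothetical shape).
interface TelemetryConfig {
  enabled: boolean
  collectorUrls: string[]
}

type Env = Record<string, string | undefined>

// Read telemetry settings from environment variables.
// Both variable names are assumptions made for this sketch.
function loadTelemetryConfig(env: Env): TelemetryConfig {
  const rawFlag = env.ENABLE_TELEMETRY

  // Telemetry stays on unless the flag is explicitly set to "false".
  const enabled =
    rawFlag === undefined ? true : rawFlag.trim().toLowerCase() !== 'false'

  // Comma-separated list of collector endpoints; blank entries are dropped.
  const collectorUrls = (env.TELEMETRY_COLLECTOR_URLS || '')
    .split(',')
    .map((url) => url.trim())
    .filter((url) => url.length > 0)

  return { enabled, collectorUrls }
}

// Default deployment: nothing set, so telemetry is on.
console.log(loadTelemetryConfig({}).enabled)

// localenv-style override: the compose file sets the flag to "false".
console.log(loadTelemetryConfig({ ENABLE_TELEMETRY: 'false' }).enabled)
```

Keeping the default in code and the opt-out in the docker-compose environment means production deployments need no extra configuration, while local playgrounds do not report test traffic.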

mkurapov previously approved these changes Feb 19, 2024
mkurapov previously approved these changes Feb 21, 2024
packages/backend/src/config/app.ts (outdated, resolved)
@beniaminmunteanu beniaminmunteanu merged commit 8100fa7 into main Feb 22, 2024
22 checks passed
@beniaminmunteanu beniaminmunteanu deleted the 1921/telemetry-instrument-rafiki branch February 22, 2024 11:59
Labels
pkg: backend Changes in the backend package. pkg: documentation Changes in the documentation package. type: documentation (archived) Improvements or additions to documentation type: localenv Local playground type: source Changes business logic type: tests Testing related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Telemetry] - Instrument Rafiki
9 participants