A custom exporter that collects traces from the Open Telekom Cloud CloudTrace service and loads them as graphs into a Neo4j database.
Cloud Trace Service (CTS) is an effective monitoring tool that lets users analyze their cloud resources by means of traces. A tracker is automatically generated when the service is started; it monitors access to all of the respective user's cloud resources through the generated traces. The monitoring logs can be stored long-term and cost-effectively in the Object Storage Service (OBS). CTS can also be used in conjunction with Simple Message Notification (SMN), allowing the user to receive a message when certain events occur.
This custom exporter takes a different route. It utilizes Knative Eventing to create a custom source (`cts_exporter`) that collects traces from CTS and forwards them, as CloudEvents, to an agnostic sink defined by an environment variable called `K_SINK`, as required by the Knative Eventing specifications for interconnecting microservices. In addition to `cts_exporter`, a custom sink (`neo4j_sink`) is provided that listens for those CloudEvents and loads them into a Neo4j database as graphs. You can bind `cts_exporter` to any other sink that conforms to the Knative specifications. You can find an example in the repo that uses `gcr.io/knative-releases/knative.dev/eventing/cmd/event_display` as a target sink; that is a demo Knative Eventing Service that simply logs the events to `os.Stdout`.
Neo4j is a highly acclaimed graph database management system developed by Neo4j, Inc. Unlike traditional relational databases that store data in tables, Neo4j is designed around the concept of storing and managing data as nodes and relationships. This structure is particularly well-suited for handling complex and interconnected data, making it easier to model, store, and query relationships directly.
Graph databases like Neo4j are based on graph theory and use graph structures with nodes, edges, and properties to represent and store data. In this context:
- Nodes represent entities (such as subjects, actions, resources, tenants & regions in the context of the CloudTrace domain).
- Relationships provide directed, named connections between nodes. Relationships can also have properties that provide more context about the connection (such as who performed an action, on which resource the action was performed, of which tenant that resource is a member, and in which region that tenant is located).
- Properties are key-value pairs attached to nodes and relationships, allowing the storage of additional information about those elements (such as unique identifiers for nodes, tenant and domain identifiers, the subject's name, etc.).
The graph generated for every CloudTrace record can be summarized by the following domain object:
An **ACTION** (login, logout, start an ECS instance etc.) is *PERFORMED_BY* a **SUBJECT** (user, agent etc.) and is *APPLIED_ON* a **RESOURCE** (ECS instance, CCE cluster etc.), resulting *WITH_STATUS* either **NORMAL**, **WARNING** or **INCIDENT** depending on the outcome of this **ACTION**. The **RESOURCE** is *MEMBER_OF* a **TENANT**, which is *LOCATED_AT* a specific **REGION**. The central element of this domain model is the **ACTION**.
Terms in BOLD signify a Node and those in ITALICS signify a Relationship.
Neo4j is widely used in various applications that require efficient analysis and querying of complex networks of data. Examples include social networks, recommendation engines, fraud detection, network and IT operations, and more. It offers a powerful query language called Cypher, specifically designed for working with graph data, enabling users to intuitively and efficiently retrieve and manipulate data within a graph structure.
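To make the domain model above concrete, here is a hypothetical Cypher query against the resulting graph. The node labels, relationship names and the `name` property are assumptions derived from the description above, so adjust them to the actual schema produced by `neo4j_sink`:

```cypher
// Find every action a given subject performed on resources of tenants
// located in eu-de, together with the outcome of each action.
MATCH (a:ACTION)-[:PERFORMED_BY]->(s:SUBJECT {name: 'some-user'}),
      (a)-[:APPLIED_ON]->(r:RESOURCE)-[:MEMBER_OF]->(t:TENANT),
      (t)-[:LOCATED_AT]->(:REGION {name: 'eu-de'}),
      (a)-[:WITH_STATUS]->(st)
RETURN a, r, t, st
```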
Use the `clouds.tpl` as a template, and fill in a `clouds.yaml` that contains all the relevant auth information for connecting to your Open Telekom Cloud tenant. `cts_exporter` requires the presence of this file.
```yaml
clouds:
  otc:
    profile: otc
    auth:
      username: '<USER_NAME>'
      password: '<PASSWORD>'
      ak: '<ACCESS_KEY>'
      sk: '<SECRET_KEY>'
      project_name: 'eu-de_<PROJECT_NAME>'
      user_domain_name: 'OTC0000000000xxxxxxxxxx'
      auth_url: 'https://iam.eu-de.otc.t-systems.com:443/v3'
    interface: 'public'
    identity_api_version: 3
```
Caution
`clouds.yaml` is already added to `.gitignore`, so there is no danger of leaking its sensitive contents in public!
Additionally, you need to set the following environment variables for `cts_exporter`:

- `OS_CLOUD`: the cloud profile to pick from your `clouds.yaml` file
- `OS_DEBUG`: whether to switch to debug mode, defaults to `false`
- `CTS_TRACKER`: the CTS tracker to hook on, defaults to `system`
- `CTS_FROM`: an integer value in minutes that signifies how far in the past to look for traces, as well as the interval between two consecutive queries, defaults to `5`
- `CTS_X_PNP`: whether to push the collected traces to a sink, defaults to `true`
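For a quick local run of `cts_exporter`, these variables could be exported in the shell before starting the binary; the values below simply mirror the documented defaults and the `otc` profile used elsewhere in this document:

```shell
# Example environment for a local cts_exporter run
export OS_CLOUD=otc        # profile name from clouds.yaml
export OS_DEBUG=false      # keep debug logging off
export CTS_TRACKER=system  # default CTS tracker
export CTS_FROM=5          # look back 5 minutes between queries
export CTS_X_PNP=true      # push collected traces to the sink
```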
Important
There are two additional environment variables that need to be addressed separately:

- `K_SINK`: the URL of the resolved sink
- `K_CE_OVERRIDES`: a JSON object that specifies overrides to the outbound event
If you choose to deploy `cts_exporter` as a plain Kubernetes `Deployment`, for testing purposes, using `deploy/manifests/cloudtrace-exporter-deployment.yaml`, you need to set the value of `K_SINK` explicitly yourself. This will not unfold the whole functionality, because the resource is deployed outside the realm of responsibility of the Knative reconcilers. As mentioned, this path is intended exclusively for quick tests.
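For that plain-`Deployment` test path, setting `K_SINK` by hand could look like the fragment below; the container name and sink URL are placeholders, not values taken from the repo's manifest:

```yaml
# Fragment of a Deployment pod spec: K_SINK is set manually here,
# since no Knative reconciler will inject it for a plain Deployment.
spec:
  containers:
    - name: cts-exporter  # placeholder container name
      env:
        - name: K_SINK
          value: "http://cloudtrace-neo4j-sink.default.svc.cluster.local"
```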
If you deploy `cts_exporter` as a `ContainerSource` or a `SinkBinding`, Knative will take care of the rest and inject an environment variable named `K_SINK` into your container by itself.
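As a sketch of that second path, a `SinkBinding` could look like the manifest below; all resource names are placeholders rather than the ones shipped in `deploy/manifests`. Knative resolves `spec.sink` to a URL and injects it as `K_SINK`, while `spec.ceOverrides.extensions` is serialized into `K_CE_OVERRIDES`:

```yaml
# Hypothetical SinkBinding: binds the exporter Deployment to the sink Service.
apiVersion: sources.knative.dev/v1
kind: SinkBinding
metadata:
  name: cloudtrace-exporter-binding
spec:
  subject:
    apiVersion: apps/v1
    kind: Deployment
    name: cloudtrace-exporter
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: cloudtrace-neo4j-sink
  ceOverrides:
    extensions:
      exporter: cts_exporter   # becomes K_CE_OVERRIDES: {"extensions":{"exporter":"cts_exporter"}}
```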
For `neo4j_sink` you need to set the following environment variables:

- `NEO4J_URI`: the Neo4j connection URI for your instance, defaults to `neo4j://localhost:7687`
- `NEO4J_USER`: the username to use for authentication
- `NEO4J_PASSWORD`: the password to use for authentication
Note
At the moment, the client wrapper around the Neo4j driver built into `neo4j_sink` supports only Basic Auth.
The project comes with a `Makefile` that takes care of everything for you, from building (using `ko`; neither a `Dockerfile` nor a Docker registry to push the generated container images is needed) to deployment on a Kubernetes cluster. The only thing you need, if you are not working inside the provided Dev Container, is a Kubernetes cluster already provisioned with the Knative Serving & Eventing artifacts, plus a Neo4j database instance whose endpoints are reachable from your Kubernetes pods.
Before deploying anything, you need to define:
- the values of the `cts_exporter` environment variables in `deploy/manifests/cloudtrace-exporter-configmap.yaml`, e.g.:

  ```yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: cloudtrace-exporter-config
    namespace: default
  data:
    OS_CLOUD: "otc"
    OS_DEBUG: "false"
    CTS_X_PNP: "true"
    CTS_FROM: "1"
  ```
- the values of the `neo4j_sink` environment variables in `deploy/manifests/cloudtrace-neo4j-sink-secrets.yaml`, e.g. (placeholder values; adjust the metadata to match the actual manifest):

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: cloudtrace-neo4j-sink-secrets
    namespace: default
  stringData:
    NEO4J_URI: "neo4j://localhost:7687"
    NEO4J_USER: "neo4j"
    NEO4J_PASSWORD: "<PASSWORD>"
  ```
You can (re)deploy the configuration (`ConfigMaps` and `Secrets`) of all workloads using one target:
make install-configuration
Note
The targets below will rebuild all the container images from code, redeploy configuration and then deploy our custom exporter and sink.
As mentioned earlier, you are given two options for deploying `cts_exporter` as a Knative workload; either as a `ContainerSource`:
make install-containersource
or as a `SinkBinding`:
make install-sinkbinding
Important
`neo4j_sink` will be deployed as a Knative `Service`, and its endpoint will serve as the value of the `K_SINK` environment variable that `cts_exporter` will push the collected CloudEvents to.
To tear everything down again:

make uninstall
Development comes with "batteries included" as well. You can either go ahead and start debugging straight on your local machine, or take advantage of the `.devcontainer.json` file found in the repo, which instructs any IDE that supports Dev Containers to set up an isolated containerized environment for you, with a Neo4j database included.
Working on your plain local host machine (no remote containers), requires the following:
- Assign values to the environment variables for both binaries, as mentioned earlier in this document
- Provide a Neo4j database instance. You can choose among a simple container, a Kubernetes workload or even the new Neo4j Desktop
- Have a Kubernetes cluster, already set up for Knative Serving & Eventing.
A Dev Container will be created with all the necessary prerequisites to get you started developing immediately. A container based on `mcr.microsoft.com/devcontainers/base:jammy` will be spawned with the following features pre-installed:
- Resource Monitor
- Git, Git Graph
- Docker in Docker
- Kubectl, Helm, Helmfile, K9s, KinD, Dive
- Bridge to Kubernetes Visual Studio Code Extension
- Latest version of Golang
A `postCreateCommand` (`.devcontainer/setup.sh`) will provision:

- a containerized Kubernetes cluster with 1 control-plane node, 3 worker nodes and a private registry, using KinD (the cluster manifest is in `.devcontainer/cluster.yaml`)
- a standalone Neo4j cluster (you can change that and get an HA cluster by increasing the value of `minimumClusterSize` in `.devcontainer/overrides.yaml`)
- the necessary resources for the Knative Serving & Eventing infrastructure
You can access Neo4j either internally within the cluster or externally from your container or from your local host.
If you want to access Neo4j internally from another pod of the cluster, you just need to consume the Kubernetes Service endpoint, which in our setup would be `neo4j://n4j-cluster.n4j-lb-neo4j.svc.cluster.local`.
As long as you are working with Visual Studio Code, you need to forward the three ports (`7473`, `7474` and `7687`) exposed by the `n4j-cluster-lb-neo4j` Service, so that your Neo4j database is accessible from your Dev Container environment.
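One way to do that is with `kubectl` from an integrated terminal; the namespace is left as a placeholder here, so substitute the one where the Neo4j release actually lives in your cluster:

```shell
# Forward the HTTPS, HTTP and Bolt ports of the Neo4j LoadBalancer Service.
# Replace <namespace> with the namespace of your Neo4j deployment.
kubectl port-forward -n <namespace> svc/n4j-cluster-lb-neo4j 7473 7474 7687
```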
Tip
You can just port-forward the Kubernetes Service ports straight from K9s in an integrated Visual Studio Code terminal, and Visual Studio Code will automatically pick up those ports and forward them to your local machine.