Commit

Permalink
Merge branch 'main' into collector_architecture
tiffany76 authored Feb 28, 2024
2 parents d9858b7 + aa893b3 commit 64865d0
Showing 35 changed files with 714 additions and 127 deletions.
2 changes: 1 addition & 1 deletion .gitmodules
@@ -1,7 +1,7 @@
[submodule "themes/docsy"]
path = themes/docsy
url = https://github.com/cncf/docsy.git
docsy-pin = v0.9.0
docsy-pin = v0.9.1
docsy-reminder = "Ensure that all tags from google/docsy are also present in cncf/docsy, otherwise add (push) them."
[submodule "content-modules/opentelemetry-specification"]
path = content-modules/opentelemetry-specification
2 changes: 1 addition & 1 deletion content/en/blog/2023/end-user-discussions-03.md
@@ -186,7 +186,7 @@ agent to the host metrics receiver for infrastructure monitoring.

**A:** It depends on the use cases:

- [Auto instrumentation](/docs/concepts/instrumentation/automatic/) options are
- [Auto instrumentation](/docs/concepts/instrumentation/zero-code/) options are
maturing in OTel; for example, the Java JAR agent takes care of instrumenting
[most libraries](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md#libraries--frameworks)
that are used by applications. Auto-instrumentation is also available for
2 changes: 1 addition & 1 deletion content/en/blog/2023/logs-collection/index.md
@@ -425,7 +425,7 @@ extend Yoda's code to do the following:
1. Once you have traces and logs ingested in a backend, try to correlate these
two telemetry signal types in the backend along with a frontend such as
Grafana.
1. Use [Automatic Instrumentation](/docs/concepts/instrumentation/automatic/) to
1. Use [Automatic Instrumentation](/docs/concepts/instrumentation/zero-code/) to
further enrich telemetry.

The community is currently working on the [Events API
109 changes: 109 additions & 0 deletions content/en/blog/2024/demo-skyscanner/index.md
@@ -0,0 +1,109 @@
---
title:
  "Making observability fun: How we increased engineers' confidence in incident
  management using a game"
linkTitle: Skyscanner using OTel Demo
date: 2024-02-26
author: >-
  [Jordi Bisbal Ansaldo](https://github.com/jordibisbal8) (Skyscanner)
cSpell:ignore: Ansaldo Bisbal Jordi runbooks Skyscanner upskilled Yankova
---

At [Skyscanner](https://www.skyscanner.net), as in many organizations, teams
tend to follow specific runbooks for individual failure modes. With modern and
complex distributed systems, this has the downside of most of the errors being
unknowns, which makes runbooks only partially applicable.

After migrating our telemetry data to the OpenTelemetry standards at Skyscanner,
we now have richer instrumentation and can rely on observability directly. As a
result, we are ready to adopt a new
[observability mindset](https://charity.wtf/2019/09/20/love-and-alerting-in-the-time-of-cholera-and-observability/),
which requires training our engineers to work effectively with the new
ecosystem. This allows them to react efficiently to any known or unknown issues,
even under pressure.

To achieve this, we believe that the best way to gain knowledge isn’t through
one-time viewings of documents or videos. Instead, it’s through practical
exercises that include situations with never-before-seen (or at least rarely
seen) problems. This helps the company reduce the time to mitigate an issue
(TTM), which is measured from the moment a first responder acknowledges the
incident until users stop suffering from it.

## Environment

To begin with, we need to set up an environment that demonstrates the best
practices for monitoring and debugging using OpenTelemetry instrumentation and
observability. For this, we propose the use of the official
[OpenTelemetry Demo](/docs/demo/), which is a realistic example of a distributed
system called Astronomy Shop. Thanks to the
[OpenTelemetry Protocol](/docs/specs/otlp/) (OTLP), it allows us to simply point
the standard OTLP exporter in the Collector to
[New Relic](https://newrelic.com/), our chosen observability platform at
Skyscanner which, like other platforms, is fully embracing open standards to
ingest telemetry data.
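
As a rough illustration of the exporter setup described above, a Collector
configuration could look like the following sketch. The endpoint, the `api-key`
header, and the `NEW_RELIC_LICENSE_KEY` environment variable are assumptions
made for illustration; they are not taken from our actual setup, so check your
vendor's documentation for the exact values.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  otlp:
    # Assumed vendor OTLP endpoint; replace with the one for your account/region.
    endpoint: https://otlp.nr-data.net:4317
    headers:
      # Assumed header name; the key is read from a hypothetical env variable.
      api-key: ${env:NEW_RELIC_LICENSE_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      exporters: [otlp]
```

Because the applications only speak OTLP to the Collector, switching to another
OTLP-capable backend should mostly be a matter of changing the exporter block.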

This system contains regressions that can be injected into the platform and
helps us demonstrate the importance of Service Level Objectives (SLOs),
tracing, logs, metrics, etc. For instance, we can observe traffic flow through
various components, as shown in the image below. Since part of the OpenTelemetry
ecosystem is open source, we can easily introduce any new features that will be
reviewed by OpenTelemetry contributors.

![Distributed tracing example in Astronomy shop](tracing-example.png)

## Observability game day

Once the environment is set up, we can introduce the Observability Game Day, an
initiative based on the Wheel of Misfortune practices that Google uses and
describes in the [Site Reliability Engineering book](https://sre.google/books/).

This game simulates a production incident, where a moderator known as the game
master (GM) conducts the session and someone from the audience spins the wheel
and explains an incident or outage. The participants are then divided into teams
and tasked with identifying and resolving the issue as quickly as possible. If
the solution is not optimal, the GM can help by introducing a new tool or view,
which gives a different perspective on how to tackle the incident (knowledge
sharing). This exercise can be repeated multiple times for different incidents.

![Wheel of misfortune example](wheel.png)

## Results

The Observability Game Day has already been completed by multiple Skyscanner
teams, with each team's observability expert (ambassador) running the session.
The participants have given extremely positive feedback: 90% of the respondents
say that after the Game Day, they feel more confident debugging production
systems and would love to have further sessions.

- Hugely valuable to run against real services and to compare and contrast
different debugging methods. I'm certain everyone, regardless of skill level,
will have got something out of the session - I know I did! Thank you for
taking the time to set this up and promoting it for us -
[Dominic Fraser](https://github.com/dominicfraser) (Senior Software Engineer)
- It is a really great (company-wide) initiative to get people upskilled in
observability and OpenTelemetry/New Relic and I personally found it very
useful, as well as a lot of fun! :D - Polly Yankova (Software Engineer)

In addition, we learned that:

1. OTLP makes it incredibly simple to integrate a standard application with an
observability vendor. Just point it to the right endpoint and job done.
2. Our winning teams relied primarily on tracing data to analyze regressions,
which helped them understand the root cause faster. Tracing FTW!
3. Front-end engineers found the Game Day lacked focus on client-side
observability, so we decided to contribute upstream (see next steps below).
This was my first contribution to the project, and it was a great experience!
Maintainers were very welcoming and helped me to test and release. Thanks!

## Next steps

The next action is to run sessions for all the engineering teams in the company
and convert them into a Skyscanner learning course. This way, the content can be
used during the onboarding process for new joiners or even reviewed at any time
as a refresher for those who have been in the company longer. In addition, after
observing common feedback, we identified that it would be beneficial to extend
the current incidents to include more front-end-specific ones, such as incidents
triggered by browser traffic. To achieve this, we have contributed to the
OpenTelemetry Demo and enabled these features for other interested parties. For
more information, please have a look at the
[raised PR](https://github.com/open-telemetry/opentelemetry-demo/pull/1345).
Binary file added content/en/blog/2024/demo-skyscanner/wheel.png
147 changes: 147 additions & 0 deletions content/en/blog/2024/kubecon-eu.md
@@ -0,0 +1,147 @@
---
title:
  Join us for OpenTelemetry Talks and Activities at KubeCon + CloudNativeCon
  Europe 2024
linkTitle: KubeCon EU '24
date: 2024-02-28
# prettier-ignore
cSpell:ignore: Aiven Alexandre Anusha Arbiv Beemer Benedikt Blanco Bongartz Chekuri Coralogix Cosmonic Dyrmishi Jiekun Joonas Kanal Kolachala Kowall Machado Magno Marcin Matej Mirabella Narapureddy Nenashev Oleg Oluwalolope Outshift Pismo Purvi Quwan Reddy Ridwan Rollouts Ryanair Skyscanner Sodkiewicz Soluções Srikanth Tecnológicas Yosef
author: '[Severin Neumann](https://github.com/svrnm) (Cisco)'
---

The OpenTelemetry project maintainers, members of the governance committee, and
technical committee are thrilled to be at [KubeCon + CloudNativeCon Europe][]
and at the co-located
[Observability Day](https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/co-located-events/observability-day/)
in Paris from March 19 - 22, 2024.

Read on to learn about all the things related to OpenTelemetry during KubeCon.

This post may be updated as we receive notice of other activities, so please
check it again right before KubeCon!

## KubeCon Talks and Maintainer Sessions

- **[OpenTelemetry: Project Updates, Next Steps, and AMA](https://sched.co/1R2mK)**<br>
by Severin Neumann, Cisco; Austin Parker, Honeycomb; Trask Stalnaker,
Microsoft; Daniel Gomez Blanco, Skyscanner; Alolita Sharma, Apple<br>
Wednesday, March 20 • 11:15 - 11:50
- **[Distributed Tracing with Jaeger and OpenTelemetry](https://sched.co/1YhfT)**<br>
by Pavol Loffay, Red Hat & Jonah Kowall, Aiven<br> Wednesday, March 20 •
12:10 - 12:45
- **[Disintegrated Telemetry: The Pains of Monitoring Asynchronous Workflows](https://sched.co/1YeNV)**<br>
by Johannes Tax, Grafana Labs<br> Wednesday, March 20 • 16:30 - 17:05
- **[From RUM to Front-End Observability with OpenTelemetry](https://sched.co/1YeOH)**<br>
by Purvi Kanal, Honeycomb<br> Thursday, March 21 • 11:00 - 11:35
- **[Tutorial: Exploring the Power of Distributed Tracing with OpenTelemetry on Kubernetes](https://sched.co/1YePA)**<br>
by Pavol Loffay & Benedikt Bongartz, Red Hat; Matej Gera, Coralogix; Anthony
Mirabella, AWS; Anusha Reddy Narapureddy, Apple<br> Thursday, March 21 •
14:30 - 16:00
- **[Prometheus and OpenTelemetry: Better Together](https://sched.co/1YePz)**<br>
by Adriana Villela, ServiceNow Cloud Observability & Reese Lee, New Relic<br>
Thursday, March 21 • 16:30 - 17:05
- **[Observable Feature Rollouts with OpenTelemetry and OpenFeature](https://sched.co/1YeSC)**<br>
by Daniel Dyla & Michael Beemer, Dynatrace<br> Friday, March 22 • 16:00 -
16:35

## Observability Day

_[Observability Day][] fosters collaboration, discussion, and knowledge sharing
of cloud-native observability projects_. This event will be held on March 19,
2024 from 9:00 - 17:35. There will be several sessions on OpenTelemetry as well:

- **[Welcome + Project Updates](https://sched.co/1YGT9)**<br> by Eduardo Silva,
FluentBit & Austin Parker, honeycomb.io<br> Tuesday, March 19th • 09:00 -
09:20
- **[Dude, Where’s My Error?: How OpenTelemetry Records Errors, and Why It Does It Like That](https://sched.co/1YFeM)**<br>
by Adriana Villela, ServiceNow Cloud Observability (formerly Lightstep) &
Reese Lee, New Relic<br> Tuesday, March 19th • 10:00 - 10:25
- **[How to Think About Instrumentation Overhead](https://sched.co/1YFfb)**<br>
by Jason Plumb, Splunk<br> Tuesday, March 19th • 11:05 - 11:30
- **[TTChat’s Story: Connect Metrics, Logs and Traces with eBPF](https://sched.co/1YFfe)**<br>
by Zhu Jiekun, Quwan<br> Tuesday, March 19th • 11:05 - 11:30
- **[Panel: OpenTelemetry: Realizing the Value of Open Standards](https://sched.co/1YFgW)**<br>
by Daniel Gomez Blanco, Skyscanner; Marcin Sodkiewicz, Ryanair; Iris Dyrmishi,
Miro; Hope Oluwalolope, Microsoft<br> Tuesday, March 19th • 12:15 - 12:50
- **[Telemetry Showdown: Fluent Bit Vs. OpenTelemetry Collector - a Comprehensive Benchmark Analysis](https://sched.co/1YFhI)**<br>
by Henrik Rexed, Dynatrace<br> Tuesday, March 19th • 13:30 - 13:55
- **[Monitoring Serverless Workloads with OpenTelemetry and Prometheus](https://sched.co/1YFhh)**<br>
by Ridwan Sharif, Google<br> Tuesday, March 19th • 14:05 - 14:30
- **[Observability at the Edge: Instrumenting WebAssembly with OpenTelemetry](https://sched.co/1YFik)**<br>
by Dan Norris & Joonas Bergius, Cosmonic<br> Tuesday, March 19th • 15:15 -
15:40
- **[Real-World Sampling – Lessons Learned After Reducing ~80% of Our O11y Costs](https://sched.co/1YFii)**<br>
by Juraci Paixão Kröhling, Grafana Labs & Alexandre Magno Prado Machado, Pismo
Soluções Tecnológicas<br> Tuesday, March 19th • 15:15 - 15:40
- **[⚡ Lightning Talk: Not Just Enterprise. Modern Java App CI/CD Observability with OTel, Quarkus and Gradle](https://sched.co/1YFin)**<br>
by Oleg Nenashev, WireMock<br> Tuesday, March 19th • 15:45 - 15:50
- **[Shift Into an Observability Mindset with OpenTelemetry](https://sched.co/1YFjB)**<br>
by Daniel Gomez Blanco, Skyscanner<br> Tuesday, March 19th • 15:45 - 16:15
- **[⚡ Lightning Talk: Federated Search Over Distributed Observability Data](https://sched.co/1YFjC)**<br>
by Kalyan Kolachala, Intuit<br> Tuesday, March 19th • 15:55 - 16:00
- **[⚡ Lightning Talk: Application Security Through the Lens of OpenTelemetry](https://sched.co/1YFf5)**<br>
by Yosef Arbiv, Outshift by Cisco<br> Tuesday, March 19th • 16:05 - 16:10
- **[Lazy Robots: Telemetry Buffering on Android](https://sched.co/1YFk3)**<br>
by Cesar Munoz, Elastic & Jason Plumb, Splunk<br> Tuesday, March 19th •
17:00 - 17:25
- **[OpAMP in Action: User Configurable Observability Pipelines](https://sched.co/1YFk6)**<br>
by Srikanth Chekuri, SigNoz<br> Tuesday, March 19th • 17:00 - 17:25

{{% alert title="Important access note" color="danger" %}}

You need an _in-person all-access_ pass for on-site access to **Observability
Day**. For details, see [KubeCon registration][]. If you have a virtual ticket,
you will be able to follow **Observability Day** through a live stream.

[kubecon registration]:
https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/register/

{{% /alert %}}

## OpenTelemetry Observatory

Drop by and say _"Hi!"_ at OpenTelemetry Observatory presented by Splunk in the
Expo Hall. This will be a place for informal chats, meetups, and other
discussions led by OpenTelemetry community members and maintainers. Check out
the schedule of activities [here](https://shorturl.at/qEUX1).

If you’d like to participate and lead a discussion or short presentation, reach
out to the
[OpenTelemetry End User Working Group](https://cloud-native.slack.com/archives/C01RT3MSWGZ)
to indicate your interest.

You can help us improve the project by sharing your thoughts and feedback about
your OpenTelemetry adoption, implementation, and usage.

To join a feedback session, book online below:

- [End User Feedback Sessions 1](https://calendly.com/otel-euwg/end-user-feedback-sessions-1?month=2024-03)
- [End User Feedback Sessions 2](https://calendly.com/otel-euwg/end-user-feedback-sessions-2?month=2024-03)
- [End User Feedback Sessions 3](https://calendly.com/otel-euwg/end-user-feedback-sessions-3?month=2024-03)
- [End User Feedback Sessions 4](https://calendly.com/otel-euwg/end-user-feedback-sessions-4?month=2024-03)
- [End User Feedback Sessions 5](https://calendly.com/otel-euwg/end-user-feedback-sessions-5?month=2024-03)

A maximum of 5 participants will join one SIG maintainer to provide feedback for
that SIG. Sessions will be recorded and posted on the
[OTel YouTube channel](https://youtube.com/@otel-official). The final SIG list
is still TBD, so check back here often!

We will create action items from your comments as appropriate. Check
[#otel-user-research][] in CNCF's Slack instance for results and action item
updates to come after KubeCon EU.

Back by popular demand! We'll be recording
[Humans of OTel interviews](/blog/2023/humans-of-otel/) at the OTel Observatory.
If you'd like to share your experiences as an OpenTelemetry practitioner or
maintainer, sign up for an interview session
[here](https://calendly.com/otel-euwg/humans-of-otel).

Come join us to listen, learn, and get involved in OpenTelemetry.

See you in Paris!

[#otel-user-research]: https://cloud-native.slack.com/archives/C01RT3MSWGZ
[KubeCon + CloudNativeCon Europe]:
https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/
[Observability Day]:
https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/co-located-events/observability-day/
27 changes: 27 additions & 0 deletions content/en/docs/collector/transforming-telemetry.md
@@ -169,6 +169,33 @@ processors:
k8sattributes/default:
```
## Setting a span status

**Processor**:
[transform processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor)

Use the transform processor to set a span's status. The following example sets
the span status to `Ok` when the `http.request.status_code` attribute is 400:

<!-- prettier-ignore-start -->

```yaml
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements:
        - set(status.code, STATUS_CODE_OK) where attributes["http.request.status_code"] == 400
```

<!-- prettier-ignore-end -->

You can also use the transform processor to modify the span name based on its
attributes or extract span attributes from the span name. For examples, see the
example
[config file](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/9b28f76c02c18f7479d10e4b6a95a21467fd85d6/processor/transformprocessor/testdata/config.yaml)
for the transform processor.
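
As a minimal sketch of the span-name case, the following configuration renames
a span from two of its attributes and wires the processor into a traces
pipeline. The attribute names, the `otlp` receiver, and the `debug` exporter
are assumptions chosen for illustration; adapt them to your own pipeline.

<!-- prettier-ignore-start -->

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Rename the span to "<method> <route>" only when both attributes exist.
          - set(name, Concat([attributes["http.request.method"], attributes["http.route"]], " ")) where attributes["http.request.method"] != nil and attributes["http.route"] != nil

exporters:
  debug:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [debug]
```

<!-- prettier-ignore-end -->

A processor has no effect until it is listed under `processors` in a pipeline,
which is why the `service` section is included here.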

## Advanced Transformations

More advanced attribute transformations are also available in the
6 changes: 3 additions & 3 deletions content/en/docs/concepts/components.md
@@ -12,7 +12,7 @@ OpenTelemetry is currently made up of several main components:
- [Language-specific API \& SDK implementations](#language-specific-api--sdk-implementations)
- [Instrumentation Libraries](#instrumentation-libraries)
- [Exporters](#exporters)
- [Automatic Instrumentation](#automatic-instrumentation)
- [Zero-Code Instrumentation](#zero-code-instrumentation)
- [Resource Detectors](#resource-detectors)
- [Cross Service Propagators](#cross-service-propagators)
- [Sampler](#sampler)
@@ -75,7 +75,7 @@ For more information, see

{{% docs/languages/exporters/intro %}}

### Automatic Instrumentation
### Zero-Code Instrumentation

If applicable, a language-specific implementation of OpenTelemetry will provide a
way to instrument your application without touching your source code. While the
@@ -84,7 +84,7 @@ OpenTelemetry API and SDK capabilities to your application. Additionally they
may add a set of Instrumentation Libraries and exporter dependencies.

For more information, see
[Instrumenting](/docs/concepts/instrumentation/automatic/).
[Zero-Code Instrumentation](/docs/concepts/instrumentation/zero-code/).

### Resource Detectors
