Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Negate condition is not working for multiple attributes in filter processor #34605

Open
meSATYA opened this issue Aug 12, 2024 · 16 comments
Open
Labels
bug Something isn't working processor/filter Filter processor

Comments

@meSATYA
Copy link

meSATYA commented Aug 12, 2024

Component(s)

processor/filter

What happened?

Description

I am using filter processor to drop the spans which doesn't match to some environments.
For example, I want to drop the spans where below 3 conditions should match.
attributes["checkpoint.log"] == nil and resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu"

But, it is not working as expected and also it is also filtering out span where platform.env is approval-nar,live-eu or live-nar along with the mentioned environments and it is dropping spans where resource.attributes["platform_env"] != "apr"

Steps to Reproduce

Expected Result

It should only filter out the spans having span attribute checkpoint.log and the resource attribute matches platform_env as apr and the resource attribute matches platform.env as approval-eu.

Actual Result

It is filtering out span where platform.env is approval-nar,live-eu or live-nar along with the mentioned environments and it is dropping spans where resource.attributes["platform_env"] != "apr"

Collector version

0.102.1

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      http:
        endpoint: ${env:MY_POD_IP}:4318
processors:
  filter/checkpoint-traces-preprod:
        error_mode: ignore
        traces:
          span:
            - attributes["checkpoint.log"] == nil and (resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu")

exporter:
  debug/checkpoint:
    verbosity: detailed


pipelines:
  traces/checkpoint-egress-preprod:
          exporters:
          - debug/checkpoint
          processors:
          - filter/checkpoint-traces-preprod
          - memory_limiter
          - batch
          receivers:
          - otlp

Log output

No response

Additional context

No response

@meSATYA meSATYA added bug Something isn't working needs triage New item requiring triage labels Aug 12, 2024
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the processor/filter Filter processor label Aug 12, 2024
@iblancasa
Copy link
Contributor

iblancasa commented Aug 12, 2024

Try increasing the log level to debug the issue: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor#troubleshooting

Maybe the problem is with the first condition.

@TylerHelmuth
Copy link
Member

@meSATYA and has precedence over or so I believe

attributes["checkpoint.log"] == nil and resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu"

is evaluated as

(attributes["checkpoint.log"] == nil and resource.attributes["platform_env"] != "apr") or resource.attributes["platform.env"] != "approval-eu"

Try:

attributes["checkpoint.log"] == nil and (resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu")

@iblancasa
Copy link
Contributor

@TylerHelmuth that's what he wrote in the OpenTelemetry configuration shared in the first comment.

@TylerHelmuth
Copy link
Member

Then I agree with your assessment to enable debug logging. The other option is to add a debug exporter with verbosity: detailed.

This issues often end up being caused by a mismatch between how the user thinks the data looks like and how the collector is actually seeing the data. Likely an attribute is named differently or is located somewhere else in the data hierarchy (like on a resource instead of a span or vise versa).

@meSATYA
Copy link
Author

meSATYA commented Aug 12, 2024

I enabled debug logging and here is the log snippet for few spans. As you can find here platform.env: Str(live-eu) is also exported to debug exporter.

2024-08-12 19:52:19.103	ScopeSpans SchemaURL: 
2024-08-12 19:52:19.103	     -> http.response.content.length: Int(8402)
2024-08-12 19:52:19.103	InstrumentationScope OpenTelemetry.Instrumentation.AspNetCore 1.0.0.0
2024-08-12 19:52:19.103	     -> http.response.content.type: Str(application/json; charset=utf-8)
2024-08-12 19:52:19.103	Span #0
2024-08-12 19:52:19.103	     -> RouteTemplate: Str(RegistrationService|GetuserByUuidAsync|)
2024-08-12 19:52:19.103	    Trace ID       : 7e396e9dd15fcf5a822c08e53708e15b
2024-08-12 19:52:19.103	    Parent ID      : 
2024-08-12 19:52:19.103	     -> _MS.ProcessedByMetricExtractors: Str((Name: X,Ver:'1.1'))
2024-08-12 19:52:19.103	    ID             : c19fb6239686ec52
2024-08-12 19:52:19.103	    Name           : GET
2024-08-12 19:52:19.103	     -> transfer_method: Str(insecure)
2024-08-12 19:52:19.103	    Kind           : Server
2024-08-12 19:52:19.103	    Start time     : 2024-08-12 19:52:16.8731499 +0000 UTC
2024-08-12 19:52:19.103	     -> http.dp_client_ip: Str(20.8.199.219)
2024-08-12 19:52:19.103	    End time       : 2024-08-12 19:52:16.9019822 +0000 UTC
2024-08-12 19:52:19.103	    Status code    : Ok
2024-08-12 19:52:19.103	Events:
2024-08-12 19:52:19.103	    Status message : 
2024-08-12 19:52:19.103	SpanEvent #0
2024-08-12 19:52:19.103	Attributes:
2024-08-12 19:52:19.103	     -> server.address: Str(northeurope)
2024-08-12 19:52:19.103	     -> Name: Checking if user already exists from cache and get it, remove cache. otherwise get from DB
2024-08-12 19:52:19.103	     -> server.port: Int(9912)
2024-08-12 19:52:19.103	     -> Timestamp: 2024-08-12 19:52:07.6756133 +0000 UTC
2024-08-12 19:52:19.103	     -> url.query: Str(?api-version=Redacted)
2024-08-12 19:52:19.103	     -> http.request.method: Str(GET)
2024-08-12 19:52:19.103	     -> DroppedAttributesCount: 0
2024-08-12 19:52:19.103	     -> url.scheme: Str(https)
2024-08-12 19:52:19.103	SpanEvent #1
2024-08-12 19:52:19.103	     -> url.path: Str(/api/registration/user/2669179c8db84b869c82f6868c5db5be/uuid)
2024-08-12 19:52:19.103	     -> network.protocol.version: Str(1.1)
2024-08-12 19:52:19.103	     -> Name: Removing user from cache
2024-08-12 19:52:19.103	     -> user_agent.original: Str(Java-http-client/11.0.23)
2024-08-12 19:52:19.103	     -> Timestamp: 2024-08-12 19:52:07.6769643 +0000 UTC
2024-08-12 19:52:19.103	     -> http.client_ip: Str(::ffff:20.223.85.168)
2024-08-12 19:52:19.103	     -> host.name: Str(generic00000O)
2024-08-12 19:52:19.103	     -> DroppedAttributesCount: 0
2024-08-12 19:52:19.103	     -> oid: Str(78f85355-989e-4dfe-9352-6e6c2bbb462e)
2024-08-12 19:52:19.103	     -> enduser.role: Str(Level.2.*,Registration.App.ReadWrite,Level.1.,.Level.3.*)
2024-08-12 19:52:19.103	SpanEvent #2
2024-08-12 19:52:19.103	     -> checkpoint.log: Str(get-useruuid-by-userid)
2024-08-12 19:52:19.103	     -> user.vid: Str(2669179c8db84b869c82f6868c5db5be)
2024-08-12 19:52:19.103	     -> Name: Getting all claims for a user
2024-08-12 19:52:19.103	     -> http.response.status_code: Int(200)
2024-08-12 19:52:19.103	     -> http.response.content.length: Int(35)
2024-08-12 19:52:19.103	     -> Timestamp: 2024-08-12 19:52:07.6777312 +0000 UTC
2024-08-12 19:52:19.103	     -> http.response.content.type: Str(application/json; charset=utf-8)
2024-08-12 19:52:19.103	     -> DroppedAttributesCount: 0
2024-08-12 19:52:19.103	     -> RouteTemplate: Str(RegistrationService|GetuserUuidByuserIdAsync|userId|)
2024-08-12 19:52:19.103	     -> _MS.ProcessedByMetricExtractors: Str((Name: X,Ver:'1.1'))
2024-08-12 19:52:19.103	Span #2
2024-08-12 19:52:19.103	     -> transfer_method: Str(insecure)
2024-08-12 19:52:19.103	     -> http.dp_client_ip: Str(20.8.199.213)
2024-08-12 19:52:19.103	    Trace ID       : 7502fe0700f88a66ec08e68c748ca98c
2024-08-12 19:52:19.103	Events:
2024-08-12 19:52:19.103	    Parent ID      : 9308b5039e840bbd
2024-08-12 19:52:19.103	SpanEvent #0
2024-08-12 19:52:19.103	     -> Name: Making request to get user details
2024-08-12 19:52:19.103	    ID             : 6c8353c76195822d
2024-08-12 19:52:19.103	     -> Timestamp: 2024-08-12 19:52:16.8823532 +0000 UTC
2024-08-12 19:52:19.103	    Name           : GET
2024-08-12 19:52:19.103	     -> DroppedAttributesCount: 0
2024-08-12 19:52:19.103	ResourceSpans #18
2024-08-12 19:52:19.103	    Kind           : Server
2024-08-12 19:52:19.103	Resource SchemaURL: 
2024-08-12 19:52:19.103	    Start time     : 2024-08-12 19:52:07.7083193 +0000 UTC
2024-08-12 19:52:19.103	Resource attributes:
2024-08-12 19:52:19.103	     -> telemetry.sdk.name: Str(opentelemetry)
2024-08-12 19:52:19.103	    End time       : 2024-08-12 19:52:07.7839798 +0000 UTC
2024-08-12 19:52:19.103	     -> telemetry.sdk.language: Str(dotnet)
2024-08-12 19:52:19.103	    Status code    : Ok
2024-08-12 19:52:19.103	     -> telemetry.sdk.version: Str(1.8.1)
2024-08-12 19:52:19.103	     -> deployment.environment: Str(live-eu)
2024-08-12 19:52:19.103	    Status message : 
2024-08-12 19:52:19.103	     -> host.name: Str(generic00000O)
2024-08-12 19:52:19.103	     -> service.name: Str(servicemanagement)
2024-08-12 19:52:19.103	Attributes:
2024-08-12 19:52:19.103	     -> service.version: Str(1.0.0+3e7fd537a75fa241c89ad47943ab7359dd479592)
2024-08-12 19:52:19.103	     -> server.address: Str(northeurope)
2024-08-12 19:52:19.103	     -> service.instance.id: Str(generic00000O)
2024-08-12 19:52:19.103	     -> os.type: Str(windows)
2024-08-12 19:52:19.103	     -> server.port: Int(9912)
2024-08-12 19:52:19.103	     -> os.description: Str(Microsoft Windows NT 10.0.20348.0)
2024-08-12 19:52:19.103	     -> os.name: Str(Windows)
2024-08-12 19:52:19.103	     -> url.query: Str(?api-version=Redacted)
2024-08-12 19:52:19.103	     -> os.version: Str(Windows NT 10.0.20348.0)
2024-08-12 19:52:19.103	     -> http.request.method: Str(GET)
2024-08-12 19:52:19.103	     -> git.commitid: Str(3e7fd537)
2024-08-12 19:52:19.103	     -> process.runtime.name: Str(.NET)
2024-08-12 19:52:19.103	     -> url.scheme: Str(https)
2024-08-12 19:52:19.103	     -> process.runtime.version: Str( 6.0.0-rtm.21522.10)
2024-08-12 19:52:19.103	     -> feature_flag.key.enableotelmetrics: Str(False)
2024-08-12 19:52:19.103	     -> url.path: Str(/api/registration/user/92f01f652b694232a31a9092b91f30ea/userServiceClaims)
2024-08-12 19:52:19.103	     -> feature_flag.key.etagenforcement: Str(False)
2024-08-12 19:52:19.103	     -> network.protocol.version: Str(1.1)
2024-08-12 19:52:19.103	     -> platform.env: Str(live-eu)

@TylerHelmuth
Copy link
Member

@meSATYA can you provide a unit tests or a sample of input data that could be used to reproduce this issue.

@meSATYA
Copy link
Author

meSATYA commented Aug 12, 2024

@TylerHelmuth , sure. Attached the export of the trace from Grafana Tempo where we store these traces. I don't have the input data to be particular as these events are generated from the services where otel is instrumented. Thanks.

Trace-08e15b-2024-08-12 20_08_58.json

@meSATYA
Copy link
Author

meSATYA commented Aug 12, 2024

I updated the filter to be more constraint about resource attribute or span attribute. But, there is no improvement and there is no event is exported where the attribute platform_env is present whereas we are receiving spans with checkpoint.log and platform_env attribute.

filter/checkpoint-traces-preprod:
        error_mode: ignore
        traces:
          span:
            - attributes["checkpoint.log"] == nil and (attributes["platform_env"] != "apr" or attributes["platform.env"] != "approval-eu" or resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu")

Screenshot of the debug logs with platform.env attribute.
image

@meSATYA
Copy link
Author

meSATYA commented Aug 12, 2024

I tried to reproduce the issue with some dummy and smaller trace data.

Here is the input data:

{
  "resourceSpans": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": {
              "stringValue": "checkpoint-test"
            }
          },
          {
            "key":"platform.env",
            "value":{
              "stringValue":"prd"
            }
        }
        ]
      },
      "scopeSpans": [
        {
          "scope": {
            "name": "my.library",
            "version": "1.0.0",
            "attributes": [
              {
                "key": "my.scope.attribute",
                "value": {
                  "stringValue": "some scope attribute"
                }
              }
            ]
          },
          "spans": [
            {
              "traceId": "5B8EFFF798038103D269B633813FC60C",
              "spanId": "EEE19B7EC3C1B174",
              "parentSpanId": "EEE19B7EC3C1B173",
              "name": "I'm a server span",
              "startTimeUnixNano": "1723491695000000000",
              "endTimeUnixNano": "1723491699000000000",
              "kind": 2,
              "attributes": [
                {
                  "key": "my.span.attr",
                  "value": {
                    "stringValue": "some value"
                  }
                },
                {
                    "key": "checkpoint.log",
                    "value": {
                        "stringValue": "execute-command"
                    }
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}

Here is the filter used:

filter/checkpoint-traces-preprod:
    error_mode: ignore
    traces:
      span:
        - attributes["checkpoint.log"] == nil and (resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu")

Here is the log output:

2024-08-12T21:08:00.182Z	info	TracesExporter	{"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 1}
2024-08-12T21:08:00.183Z	info	ResourceSpans #0
Resource SchemaURL:
Resource attributes:
     -> service.name: Str(checkpoint-test)
     -> platform.env: Str(prd)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope my.library 1.0.0
InstrumentationScope attributes:
     -> my.scope.attribute: Str(some scope attribute)
Span #0
    Trace ID       : 5b8efff798038103d269b633813fc60c
    Parent ID      : eee19b7ec3c1b173
    ID             : eee19b7ec3c1b174
    Name           : I'm a server span
    Kind           : Server
    Start time     : 2024-08-12 19:41:35 +0000 UTC
    End time       : 2024-08-12 19:41:39 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> my.span.attr: Str(some value)
     -> checkpoint.log: Str(execute-command)
	{"kind": "exporter", "data_type": "traces", "name": "debug"}

@meSATYA
Copy link
Author

meSATYA commented Aug 12, 2024

Hi @TylerHelmuth @iblancasa, I observed that the filter processor works only with one ottl negate condition, but not multiple.

filter/checkpoint-traces-preprod:
    error_mode: ignore
    traces:
      span:
        - attributes["checkpoint.log"] == nil
        - resource.attributes["platform_env"] != "apr"
2024-08-13 00:28:10 2024-08-12T22:28:10.980Z    info    TracesExporter  {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 1}
2024-08-13 00:28:10 2024-08-12T22:28:10.980Z    info    ResourceSpans #0
2024-08-13 00:28:10 Resource SchemaURL: 
2024-08-13 00:28:10 Resource attributes:
2024-08-13 00:28:10      -> service.name: Str(checkpoint-test)
2024-08-13 00:28:10      -> platform_env: Str(apr)
2024-08-13 00:28:10 ScopeSpans #0
2024-08-13 00:28:10 ScopeSpans SchemaURL: 
2024-08-13 00:28:10 InstrumentationScope my.library 1.0.0
2024-08-13 00:28:10 InstrumentationScope attributes:
2024-08-13 00:28:10      -> my.scope.attribute: Str(some scope attribute)
2024-08-13 00:28:10 Span #0
2024-08-13 00:28:10     Trace ID       : 5b8efff798038103d269b633813fc60c
2024-08-13 00:28:10     Parent ID      : eee19b7ec3c1b173
2024-08-13 00:28:10     ID             : eee19b7ec3c1b174
2024-08-13 00:28:10     Name           : I'm a server span
2024-08-13 00:28:10     Kind           : Server
2024-08-13 00:28:10     Start time     : 2024-08-12 19:43:15 +0000 UTC
2024-08-13 00:28:10     End time       : 2024-08-12 19:43:19 +0000 UTC
2024-08-13 00:28:10     Status code    : Unset
2024-08-13 00:28:10     Status message : 
2024-08-13 00:28:10 Attributes:
2024-08-13 00:28:10      -> my.span.attr: Str(some value)
2024-08-13 00:28:10      -> checkpoint.log: Str(execute-command)
2024-08-13 00:28:10     {"kind": "exporter", "data_type": "traces", "name": "debug"}

@iblancasa
Copy link
Contributor

@meSATYA can you change the value of error_mode to see if there is any kind of error while processing the OTTL expression?

@meSATYA
Copy link
Author

meSATYA commented Aug 13, 2024

@iblancasa, I tried with ignore, silent, & propagate, but for none of the cases it logs anything to the console when two or more negate condition is used.

For example,

filter/checkpoint-traces-preprod:
    error_mode: ignore
    traces:
      span:
        - attributes["checkpoint.log"] == nil
        - resource.attributes["platform_env"] != "apr"
        - resource.attributes["platform_env"] != "prd"

@TylerHelmuth
Copy link
Member

@meSATYA in the test scenario you described here the span is not dropped because it has an attribute named checkpoint.log with a value of execute-command. The filterprocessor is configured to check for attributes["checkpoint.log"] == nil which is not true for the input data you provided, so that data is kept.

If I update the condition to attributes["checkpoint.log"] != nil, then the data is dropped:

2024-08-13T13:09:45.993-0600    debug   ottl@v0.107.0/parser.go:394     condition evaluation result     {"kind": "processor", "name": "filter/checkpoint-traces-preprod", "pipeline": "traces", "condition": "attributes[\"checkpoint.log\"] != nil and (resource.attributes[\"platform_env\"] != \"apr\" or resource.attributes[\"platform.env\"] != \"approval-eu\")", "match": true, "TransformContext": {"resource": {"attributes": {"service.name": "checkpoint-test", "platform.env": "prd"}, "dropped_attribute_count": 0}, "scope": {"attributes": {"my.scope.attribute": "some scope attribute"}, "dropped_attribute_count": 0, "name": "my.library", "version": "1.0.0"}, "span": {"attributes": {"my.span.attr": "some value", "checkpoint.log": "execute-command"}, "dropped_attribute_count": 0, "dropped_events_count": 0, "dropped_links_count": 0, "end_time_unix_nano": 1723491699000000000, "events": [], "kind": "Server", "links": [], "name": "I'm a server span", "parent_span_id": "eee19b7ec3c1b173", "span_id": "eee19b7ec3c1b174", "start_time_unix_nano": 1723491695000000000, "status.code": "Unset", "status.message": "", "trace_id": "5b8efff798038103d269b633813fc60c", "trace_state": ""}, "cache": {}}}

Likewise, when you split the statements into separate statements here, the data will fail the attributes["checkpoint.log"] == nil condition, but pass the resource.attributes["platform_env"] != "apr" condition, so it gets dropped.

@meSATYA
Copy link
Author

meSATYA commented Aug 13, 2024

Hi @TylerHelmuth, no I don't want to drop the span where it holds the attribute checkpoint.log. Rather, I want to export those spans where this attribute is present and its value is not null.

To simplify the scenario here, I want to export data where it matches below conditions.

  1. The span must have checkpoint.log attribute without null value.
  2. Multiple types of environment values should match using negate condition.
  3. Multiple attributes should be supported in a single filter.

@TylerHelmuth
Copy link
Member

@meSATYA please supply 2 test inputs like you did here, one that should be dropped and one that should be kept.

In your previous test input, the value of checkpoint.log is not null, has an environment value of prd and is dropped by this filterprocessor:

processors:
  filter/checkpoint-traces-preprod:
    error_mode: ignore
    traces:
      span:
        - attributes["checkpoint.log"] != nil and (resource.attributes["platform_env"] != "apr" or resource.attributes["platform.env"] != "approval-eu")

Which is correct for the way the statement is written. platform_env=prd means that resource.attributes["platform_env"] != "apr" is evaluated to true, so both the left and the right of the and are true, so the condition is true, so the data is dropped.

If your goal is to drop data where checkpoint.log is not null and platform_env is either apr or approval-eu you'd do:

processors:
  filter/checkpoint-traces-preprod:
    error_mode: ignore
    traces:
      span:
        - attributes["checkpoint.log"] != nil and (resource.attributes["platform_env"] == "apr" or resource.attributes["platform.env"] == "approval-eu")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working processor/filter Filter processor
Projects
None yet
Development

No branches or pull requests

4 participants