Support isolated service node extraction in service-map #628

Closed
chenqi0805 opened this issue Nov 18, 2021 · 8 comments · Fixed by #2384
Labels: bug (Something isn't working), Priority-High

@chenqi0805
Collaborator

chenqi0805 commented Nov 18, 2021

Is your feature request related to a problem? Please describe.
The current StatefulServiceMapProcessor only extracts service-map relationships among internal services whose interactions (API calls) are provisioned. Some users want services that are invoked only by non-provisioned external client calls to also appear in the service-map model and be written to the otel-v1-apm-service-map index in the OpenSearch backend, so that they can be displayed by the trace-analytics dashboard.

Describe the solution you'd like
We will need to

  1. modify model and schema
    (1) ServiceMapRelationship model
    (2) service-map template: https://github.com/opensearch-project/data-prepper/blob/main/docs/schemas/trace-analytics/otel-v1-apm-service-map-index-template.md

  2. modify the stateful processing business logic in service-map plugin that adapts to the new model
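
A minimal sketch of what step 2 might look like, written in Python rather than the plugin's Java and assuming a simplified span shape with only `spanId`, `parentSpanId`, and `serviceName`. The function name `isolated_service_nodes` and the rule "a service is isolated if none of its spans has a parent in another service" are illustrative assumptions, not the actual StatefulServiceMapProcessor logic. The emitted records leave `destination` and `target` null; any other fields depend on the schema change in step 1.

```python
from typing import Dict, List, Optional, Set


def isolated_service_nodes(spans: List[Dict[str, str]]) -> List[Dict[str, Optional[str]]]:
    """Return one node-only record per service that has no cross-service edge."""
    # Map every span to the service that produced it.
    service_by_span = {span["spanId"]: span["serviceName"] for span in spans}

    # A service is "connected" if one of its spans has a parent span
    # that belongs to a different service (or is such a parent itself).
    connected: Set[str] = set()
    for span in spans:
        parent_service = service_by_span.get(span.get("parentSpanId") or "")
        if parent_service is not None and parent_service != span["serviceName"]:
            connected.add(parent_service)
            connected.add(span["serviceName"])

    all_services = sorted({span["serviceName"] for span in spans})
    # Hypothetical record shape: a node with no destination and no target.
    return [
        {"serviceName": name, "destination": None, "target": None}
        for name in all_services
        if name not in connected
    ]


# Illustrative data only: two communicating services and one standalone service.
example_spans = [
    {"spanId": "a1", "parentSpanId": "", "serviceName": "frontend"},
    {"spanId": "b1", "parentSpanId": "a1", "serviceName": "driver"},
    {"spanId": "c1", "parentSpanId": "", "serviceName": "sgps"},
]
nodes = isolated_service_nodes(example_spans)
```

Here only the standalone service would get a node-only record; the two services that call each other are already covered by the existing relationship extraction.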


Additional context

See opensearch-project/dashboards-observability#57 for a related, but distinct issue.

@cmanning09
Contributor

Thanks for mentioning the related issue @joshuali925. I have updated the issue with the context as well.

@dlvenable
Member

I created opensearch-project/dashboards-observability#57 to support external services in the service map. That concept is somewhat the opposite of what this current issue describes. However, they may have related solutions.

@sbayer55
Member

On 3/22/2022, a user on the Data Prepper forum ran into this issue, which made for a confusing introduction to Data Prepper. Link to thread.

dlvenable removed this from the v1.4 milestone May 10, 2022

@andrevcf

Any news on this? It was scheduled for v1.4 but removed from that release. When will it be addressed?

@dlvenable
Member

@andrevcf, I apologize that this bug will not be resolved in the upcoming 1.4.0 release. It was originally on the roadmap, but our efforts have turned toward supporting OpenSearch 2.0.0. At present, I am unable to say which version will address it. I'd be interested to hear how this bug is affecting you, to help us prioritize the issue.

@andrevcf

andrevcf commented May 20, 2022

@dlvenable, thanks for your answer.


Tested with:
OpenSearch 1.3.1
OpenSearch Dashboards 1.3.0 and 1.3.1
Data Prepper 1.4.0 and 1.3.0

If I have a single auto-instrumented Java application (for example Keycloak or another single-service WildFly application), the service map is not updated to show that application as a service, even though its traces arrive successfully.
It works when services communicate with other services, as with Jaeger HotROD.

Configuration I'm using for auto-instrumentation:
JAVA_OPTS=-javaagent:/javaagent/opentelemetry-javaagent.jar -Dotel.service.name=sgps -Dotel.traces.exporter=jaeger -Dotel.exporter.jaeger.endpoint=http://main-collector-collector.tracing:14250 -Dotel.metrics.exporter=none

Traces sent:
image
If I click on a trace ID, OpenSearch Dashboards crashes with this log in the console:
image

However, if I manually add the service like this (say "sgps" is the missing service):

POST otel-v1-apm-service-map/_doc
{
  "serviceName" : "sgps",
  "kind" : "SPAN_KIND_SERVER",
  "destination" : null,
  "target" : null,
  "traceGroupName" : null
}

Then it works, with a few errors in the console:
image

The single service then appears on the service map as a lone circle (sgps is the single service; the others belong to Jaeger HotROD):
image
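
The manual workaround above could also be scripted. Below is a minimal Python sketch using only the standard library; the base URL `http://localhost:9200` and the helper name `build_service_node_request` are assumptions for illustration, and authentication and TLS settings are left out. The sketch only builds the request; the final line that would send it is commented out.

```python
import json
import urllib.request


def build_service_node_request(
    service_name: str, base_url: str = "http://localhost:9200"
) -> urllib.request.Request:
    """Build the POST that adds a standalone service to the service-map index."""
    # Same document body as the manual workaround, with the service name as a parameter.
    body = {
        "serviceName": service_name,
        "kind": "SPAN_KIND_SERVER",
        "destination": None,
        "target": None,
        "traceGroupName": None,
    }
    return urllib.request.Request(
        url=f"{base_url}/otel-v1-apm-service-map/_doc",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_service_node_request("sgps")
# urllib.request.urlopen(request)  # would send the document; needs a reachable cluster
```

This remains a workaround: the document is written by hand and is not maintained by Data Prepper.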

Below is the trace (sanitized)

[
  {
    "_index": "otel-v1-apm-span-000003",
    "_type": "_doc",
    "_id": "0880001ae8815067",
    "_score": 7.3006973,
    "_source": {
      "traceId": "50f671b5db7464dff94e0c4ea34c78f0",
      "droppedLinksCount": 0,
      "kind": "SPAN_KIND_CLIENT",
      "droppedEventsCount": 0,
      "traceGroupFields": {
        "endTime": "2022-05-20T17:57:55.145782117Z",
        "durationInNanos": 1477237,
        "statusCode": 0
      },
      "traceGroup": "SELECT sanitized_teste.pg_prepared_xacts",
      "serviceName": "sgps",
      "parentSpanId": "",
      "spanId": "0880001ae8815067",
      "traceState": "",
      "name": "SELECT sanitized_teste.pg_prepared_xacts",
      "startTime": "2022-05-20T17:57:55.144304880Z",
      "links": [],
      "endTime": "2022-05-20T17:57:55.145782117Z",
      "droppedAttributesCount": 0,
      "durationInNanos": 1477237,
      "events": [],
      "instrumentationLibrary.version": "1.14.0-alpha",
      "span.attributes.thread@id": 114,
      "resource.attributes.process@pid": 143,
      "resource.attributes.host@arch": "amd64",
      "span.attributes.net@peer@name": "sanitizedpeername.com.br",
      "resource.attributes.telemetry@sdk@version": "1.14.0",
      "resource.attributes.service@name": "sgps",
      "status.code": 0,
      "instrumentationLibrary.name": "io.opentelemetry.jdbc",
      "span.attributes.net@peer@port": 5432,
      "resource.attributes.process@runtime@name": "OpenJDK Runtime Environment",
      "resource.attributes.opencensus@exporterversion": "Jaeger-opentelemetry-java",
      "resource.attributes.os@type": "linux",
      "resource.attributes.hostname": "sanitized",
      "span.attributes.db@connection_string": "postgresql://sanitizeddb.com.br:5432",
      "span.attributes.db@sql@table": "pg_prepared_xacts",
      "span.attributes.thread@name": "Periodic Recovery",
      "span.attributes.db@user": "sgps",
      "resource.attributes.telemetry@sdk@language": "java",
      "span.attributes.db@operation": "SELECT",
      "resource.attributes.host@name": "sanitized-855dc98c9d-9rkzd",
      "resource.attributes.process@runtime@description": "Red Hat, Inc. OpenJDK 64-Bit Server VM 16.0.1+9",
      "resource.attributes.process@executable@path": "/usr/lib/jvm/java-16-openjdk-16.0.1.0.9-3.rolling.el7.x86_64:bin:java",
      "span.attributes.db@name": "sanitized_teste",
      "resource.attributes.process@command_line": "/usr/lib/jvm/java-16-openjdk-16.0.1.0.9-3.rolling.el7.x86_64:bin:java -D[Standalone] -javaagent:/javaagent/opentelemetry-javaagent.jar -Dotel.service.name=sgps -Dotel.traces.exporter=jaeger -Dotel.exporter.jaeger.endpoint=http://main-collector-collector.tracing:14250 -Dotel.metrics.exporter=none -Xms164m -Xmx1024m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m --add-exports=java.desktop/sun.awt=ALL-UNNAMED --add-exports=java.naming/com.sun.jndi.ldap=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.security=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.management/javax.management=ALL-UNNAMED --add-opens=java.naming/javax.naming=ALL-UNNAMED -Dorg.jboss.boot.log.file=/opt/jboss/wildfly/standalone/log/server.log -Dlogging.configuration=file:/opt/jboss/wildfly/standalone/configuration/logging.properties",
      "span.attributes.db@system": "postgresql",
      "resource.attributes.ip": "10.205.97.36",
      "resource.attributes.process@runtime@version": "16.0.1+9",
      "resource.attributes.telemetry@sdk@name": "opentelemetry",
      "span.attributes.db@statement": "SELECT gid FROM pg_prepared_xacts where database = current_database()",
      "resource.attributes.container@id": "e56854fb25c56a7c038c91f393b13980c03aea16280d496b65fa3b6ef299ed36",
      "resource.attributes.telemetry@auto@version": "1.14.0",
      "resource.attributes.os@description": "Linux 5.4.0-1073-azure",
      "span.attributes.otel@scope@version": "1.14.0-alpha",
      "span.attributes.otel@scope@name": "io.opentelemetry.jdbc"
    }
  }
]
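
To check whether other services are affected in the same way, one could compare the service names found in span documents with those found in the service-map index. Below is a minimal sketch, assuming both sets of documents have already been fetched as lists of `_source` dictionaries; the function name `services_missing_from_map` and the sample documents are hypothetical.

```python
from typing import Dict, Iterable, List


def services_missing_from_map(
    span_docs: Iterable[Dict[str, object]],
    service_map_docs: Iterable[Dict[str, object]],
) -> List[str]:
    """List services that emit spans but have no entry in the service map."""
    span_services = {doc["serviceName"] for doc in span_docs}
    mapped_services = {doc["serviceName"] for doc in service_map_docs}
    return sorted(span_services - mapped_services)


# Illustrative documents only, reduced to the fields the check needs.
span_docs = [
    {"serviceName": "sgps", "kind": "SPAN_KIND_CLIENT", "parentSpanId": ""},
    {"serviceName": "frontend", "kind": "SPAN_KIND_SERVER", "parentSpanId": ""},
]
service_map_docs = [
    {"serviceName": "frontend", "kind": "SPAN_KIND_SERVER"},
]
missing = services_missing_from_map(span_docs, service_map_docs)
```

With these sample documents, `missing` contains only the standalone service.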

@andrevcf

Any update on this?

@Arithmetika

Just spent a whole day on this. When will this be solved?
