[Infra] infra services api #173875

neptunian · 2023-12-21T20:09:47Z

Summary

Creates a new endpoint within Infra that returns the services from the APM indices that are related to a given host through host.name. These services will be listed in the Host Detail view in another PR. The endpoint queries APM transaction metrics and APM logs to find the services.

Closes #171661

Test

The easiest way to test this API is to call it directly, using a host in our test cluster that has services attached to it.

URL: http://localhost:5601/api/infra/services
Example usage: http://localhost:5601/api/infra/services?from=now-15m&to=now&filters={"host.name":"gke-edge-oblt-edge-oblt-pool-5fbec7a6-nfy0"}&size=5

response:

{
    "services": [
        {
            "service.name": "productcatalogservice",
            "agent.name": "opentelemetry/go"
        },
        {
            "service.name": "frontend",
            "agent.name": "opentelemetry/nodejs"
        }
    ]
}
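
When calling the endpoint from a script rather than the browser address bar, the JSON in `filters` needs to be percent-encoded. The following is a minimal sketch of building the request URL; `buildServicesUrl` and `ServicesQuery` are hypothetical names used only for illustration and are not part of this PR.

```typescript
// Hypothetical helper (not part of the PR): builds the request URL for
// GET /api/infra/services from the query parameters shown above.
interface ServicesQuery {
  from: string;
  to: string;
  filters: Record<string, string>;
  size?: number;
}

function buildServicesUrl(baseUrl: string, query: ServicesQuery): string {
  const params = new URLSearchParams({
    from: query.from,
    to: query.to,
    // The filters object travels as a JSON string; URLSearchParams
    // percent-encodes the braces, quotes and colon.
    filters: JSON.stringify(query.filters),
  });
  if (query.size !== undefined) {
    params.set('size', String(query.size));
  }
  return `${baseUrl}/api/infra/services?${params.toString()}`;
}

const exampleUrl = buildServicesUrl('http://localhost:5601', {
  from: 'now-15m',
  to: 'now',
  filters: { 'host.name': 'gke-edge-oblt-edge-oblt-pool-5fbec7a6-nfy0' },
  size: 5,
});
console.log(exampleUrl);
```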

Follow up

Have APM server collect host.name as part of service_summary metrics and query that instead. Service summary aggregates transaction, error, log, and metric events into service-summary metrics. This would simplify the query.
Added apm-synthtrace to the metrics_ui API tests and created a follow-up PR to remove the code I needed to duplicate: [Infra] Make apm synthtrace kibana client a service available to functional and integration tests #175064

apmmachine · 2023-12-21T20:10:00Z

🤖 GitHub comments

Just comment with:

/oblt-deploy : Deploy a Kibana instance using the Observability test environments.
/oblt-deploy-serverless : Deploy a serverless Kibana instance using the Observability test environments.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

elasticmachine · 2024-01-18T21:00:16Z

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

neptunian · 2024-01-18T21:04:30Z

x-pack/plugins/infra/server/utils/route_validation.ts

+ * 2.0; you may not use this file except in compliance with the Elastic License
+ * 2.0.
+ */
+import type {


This adds route validation that can reject excess props. I think this is useful when unsupported query filters are passed in: they produce a validation error instead of being ignored, which is what io-ts runtime type validation will do. A lot of this was copied from a couple of other plugins, which seem to have copied from each other. I had to adjust it to handle the ExactType. Eventually I might add this to the kbn utils package.
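
To illustrate the idea of rejecting excess props, here is a minimal sketch written in plain TypeScript. It does not use io-ts and is not the `buildRouteValidationWithExcess` implementation from this PR; `validateNoExcess` and `ValidationResult` are hypothetical names.

```typescript
// Hypothetical sketch (not the PR's code): fail validation when the input
// carries keys that the schema does not know about, instead of ignoring them.
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

function validateNoExcess(
  input: Record<string, unknown>,
  allowedKeys: readonly string[]
): ValidationResult<Record<string, unknown>> {
  const excessKeys = Object.keys(input).filter((key) => !allowedKeys.includes(key));
  if (excessKeys.length > 0) {
    return {
      ok: false,
      errors: excessKeys.map((key) => `Excess key '${key}' found`),
    };
  }
  return { ok: true, value: input };
}

// A supported filter passes; an unsupported one is reported.
const accepted = validateNoExcess({ 'host.name': 'host-1' }, ['host.name']);
const rejected = validateNoExcess({ 'host.name': 'host-1', 'cloud.region': 'eu' }, ['host.name']);
console.log(accepted.ok, rejected.ok);
```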

x-pack/plugins/infra/common/http_api/host_details/get_infra_services.ts

sorenlouv · 2024-01-19T10:39:34Z

x-pack/plugins/infra/server/lib/host_details/get_services.ts

+  const result = await client<{}, ServicesAPIQueryAggregation>({
+    body,
+    index: [transaction, error, metric],
+  });


Afaict this is querying both transaction samples and transaction metrics. There could very well be billions of transaction samples for the past day, even on clusters of modest size, so this will quickly run into scaling issues.

I suggest using service transaction metrics (and the appropriate interval) where possible, and only using transaction samples as a fallback.

Another idea: the fastest way to get all service names would be the terms enum API. That comes with some big limitations compared to the normal Elasticsearch DSL. For instance, you won't be able to get the agent.name per service. It might still be faster to get the service names via the terms enum API, then fetch the agent names using a combination of the bulk API and terminate_after: 1.

... but at the end of the day, service transaction metrics probably provide a better balance between perf and DX.

@sqren good point on the scaling issues. But don't you think that APM should be responsible for determining the appropriate interval?

Maybe I'm missing something, but this doesn't call any APM APIs, does it? If it did call the APM services API, then yes.

@sqren Thanks, oversight on my part. Service transaction metrics don't collect the host name. Ideally, as Dario suggested, I'd like to query only the service_summary metrics, but those don't collect host.name either. If they did, I think that would really simplify things and we could avoid querying anything else. Is this something I could request from the APM Server team? In lieu of that, I'll avoid querying the transaction samples and focus on transaction metrics and logs. I've separated the queries out to target the transaction metricset. What do you think?

Querying transaction metrics sounds good for now. Just note that the plan is to remove host information from the transaction metrics, and instead have instance specific metrics. This will probably not happen anytime soon but when it does, this needs to be changed.

Related:

https://github.com/elastic/apm-dev/issues/1021

Change instance specific dimensions in aggregated metrics apm-server#11266

[APM] Use new host-specific transaction metrics #162392
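
The approach settled on in this thread (query transaction metrics filtered by host.name, then list the services) can be sketched roughly as the request body below. This is an illustration, not the query from the PR: `buildServicesQuery` is a hypothetical name, and the `metricset.name` and `@timestamp` fields and the nested terms aggregation for `agent.name` are assumptions about how such a query could be written.

```typescript
// Hypothetical sketch (not the PR's query): an Elasticsearch request body that
// lists service names for one host from transaction metric documents.
interface ServicesQueryParams {
  from: string;
  to: string;
  size: number;
  hostName: string;
}

function buildServicesQuery({ from, to, size, hostName }: ServicesQueryParams) {
  return {
    size: 0, // only the aggregation result is needed, not the hits
    query: {
      bool: {
        filter: [
          { term: { 'host.name': hostName } },
          // Assumption: transaction metrics are identified by metricset.name.
          { term: { 'metricset.name': 'transaction' } },
          { range: { '@timestamp': { gte: from, lte: to } } },
        ],
      },
    },
    aggs: {
      services: {
        terms: { field: 'service.name', size },
        aggs: {
          // One agent name per service is enough for the response shape.
          agent: { terms: { field: 'agent.name', size: 1 } },
        },
      },
    },
  };
}

const servicesQueryBody = buildServicesQuery({
  from: 'now-15m',
  to: 'now',
  size: 5,
  hostName: 'gke-edge-oblt-edge-oblt-pool-5fbec7a6-nfy0',
});
console.log(JSON.stringify(servicesQueryBody, null, 2));
```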

sorenlouv · 2024-01-19T10:53:23Z

x-pack/plugins/infra/server/routes/services/index.ts

+      validate: {
+        query: (q, res) => {
+          const [invalidResponse, parsedFilters] = validateStringAssetFilters(q, res);
+          if (invalidResponse) {
+            return invalidResponse;
+          }
+          if (parsedFilters) {
+            q.validatedFilters = parsedFilters;
+          }
+          return validate(q, res);
+        },
+      },
+    },
+    async (requestContext, request, response) => {
+      const [{ savedObjects }] = await libs.getStartServices();
+      const { from, to, size = 10, validatedFilters } = request.query;
+
+      try {
+        if (!validatedFilters) {
+          throw Boom.badRequest('Invalid filters');
+        }
+        const client = createSearchClient(requestContext, framework, request);
+        const soClient = savedObjects.getScopedClient(request);
+        const apmIndices = await libs.getApmIndices(soClient);
+        const services = await getServices(client, apmIndices, {
+          from,
+          to,
+          size,
+          filters: validatedFilters,
+        });
+        return response.ok({
+          body: ServicesAPIResponseRT.encode(services),
+        });
+      } catch (err) {
+        if (Boom.isBoom(err)) {
+          return response.customError({
+            statusCode: err.output.statusCode,
+            body: { message: err.output.payload.message },
+          });
+        }
+
+        return response.customError({
+          statusCode: err.statusCode ?? 500,
+          body: {
+            message: err.message ?? 'An unexpected error occurred',
+          },
+        });
+      }
+    }


There is a lot of boilerplate here, and it's hard to see what it has to do with this route. Most of the validation and error handling looks very generic. Shouldn't this be handled by the framework?

I cleaned this up. Validating a strict type of allowed filters made things a bit more complicated. TS can't infer that validatedFilters exists; it has a different type than the filters param, which is a string. Since it definitely exists (otherwise validateStringAssetFilters would have failed), I've used a type assertion.

x-pack/plugins/infra/server/routes/services/index.ts

x-pack/plugins/infra/server/utils/route_validation.ts

sorenlouv · 2024-01-19T11:37:44Z

x-pack/test/api_integration/apis/metrics_ui/services.ts

+        })
+        .expect(200);
+
+      const { services } = decodeOrThrow(ServicesAPIResponseRT)(response.body);


General suggestion: it's VERY useful to have a typed API client. For APM we have this, which makes it possible to call REST APIs and get typed responses back, with no custom parsing or explicit type annotations needed.
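
As a rough illustration of what a typed API client buys you, here is a minimal sketch. It is not the APM client referred to above; `RouteMap`, `Fetcher` and `createTypedClient` are hypothetical names, and the route entry simply mirrors the endpoint from this PR.

```typescript
// Hypothetical sketch (not the APM client): a route map ties each endpoint
// to its query and response types, so callers get typed results for free.
interface ServicesResponse {
  services: Array<{ 'service.name': string; 'agent.name'?: string }>;
}

interface RouteMap {
  'GET /api/infra/services': {
    query: { from: string; to: string; filters: string; size?: number };
    response: ServicesResponse;
  };
}

// The untyped transport, e.g. a thin wrapper around an HTTP library.
type Fetcher = (method: string, path: string, query: object) => Promise<unknown>;

function createTypedClient(fetcher: Fetcher) {
  return async function call<K extends keyof RouteMap>(
    endpoint: K,
    query: RouteMap[K]['query']
  ): Promise<RouteMap[K]['response']> {
    const [method, path] = endpoint.split(' ');
    // The single cast where the untyped transport meets the typed route map.
    return (await fetcher(method, path, query)) as RouteMap[K]['response'];
  };
}

// Usage with a fake transport that records the request and returns a canned body.
const recordedRequests: Array<{ method: string; path: string }> = [];
const fakeFetcher: Fetcher = async (method, path) => {
  recordedRequests.push({ method, path });
  return { services: [{ 'service.name': 'frontend', 'agent.name': 'opentelemetry/nodejs' }] };
};

const callApi = createTypedClient(fakeFetcher);
const pendingServices = callApi('GET /api/infra/services', {
  from: 'now-15m',
  to: 'now',
  filters: '{"host.name":"host-1"}',
});
pendingServices.then(({ services }) => console.log(services[0]['service.name']));
```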

neptunian · 2024-01-29T13:50:02Z

x-pack/plugins/infra/server/utils/route_validation.ts

+      | rt.InterfaceType<rt.Props>
+      | GenericIntersectionC
+      | rt.PartialType<rt.Props>
+      | rt.ExactC<any>,


Support the Exact type directly.

x-pack/plugins/infra/server/routes/services/index.ts

neptunian · 2024-02-02T12:49:48Z

@pzl @tomsonpl I haven't seen these autocommits before. Is this a new thing and something to ignore? I didn't modify any files in the osquery plugin which seems to be what triggered it.

neptunian · 2024-02-02T12:52:24Z

x-pack/plugins/infra/server/routes/services/lib/utils.ts

+import { RouteValidationError, RouteValidationResultFactory } from '@kbn/core/server';
+
+type ValidateStringAssetFiltersReturn = [{ error: RouteValidationError }] | [null, any];
+


This validation function makes sure the filters exist on the request and parses them; the shape of the filter object is then checked by the type validation.
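
The parse-then-validate step described here can be sketched as follows. This is a self-contained illustration, not the `validateStringAssetFilters` function from the PR: `parseStringFilters` and `ValidationFailure` are hypothetical stand-ins, and the Kibana `RouteValidationError` type is replaced by a plain object so the snippet runs on its own.

```typescript
// Hypothetical sketch (not the PR's code): check that `filters` is present,
// parse the JSON string, and return either an error or the parsed object.
interface ValidationFailure {
  error: { message: string };
}

// Mirrors the tuple shape used in the PR: [error] on failure, [null, value] on success.
type ParseFiltersResult = [ValidationFailure] | [null, Record<string, unknown>];

function parseStringFilters(query: { filters?: unknown }): ParseFiltersResult {
  if (typeof query.filters !== 'string') {
    return [{ error: { message: 'filters is required and must be a JSON string' } }];
  }
  try {
    const parsed: unknown = JSON.parse(query.filters);
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      return [{ error: { message: 'filters must be a JSON object' } }];
    }
    // The object's shape (allowed keys, value types) is left to the type validation step.
    return [null, parsed as Record<string, unknown>];
  } catch {
    return [{ error: { message: 'filters is not valid JSON' } }];
  }
}

console.log(parseStringFilters({ filters: '{"host.name":"host-1"}' }));
console.log(parseStringFilters({ filters: 'not json' }));
```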

neptunian · 2024-02-02T12:56:50Z

x-pack/test/api_integration/apis/metrics_ui/config.ts

+    apmSynthtraceEsClient: (context: InheritedFtrProviderContext) => Promise<ApmSynthtraceEsClient>;
+  };
+}
+export default async function createTestConfig({


Add the synthtrace client as a service to our test config.

crespocarlos

LGTM

kibana-ci · 2024-02-05T13:24:13Z

💚 Build Succeeded

Buildkite Build
Commit: b7d771c

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id	before	after	diff
`infra`	1419	1420	+1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id	before	after	diff
`infra`	1.3MB	1.3MB	+2.1KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id	before	after	diff
`infra`	99.9KB	99.9KB	+60.0B

History

💚 Build #190903 succeeded 720312b
💚 Build #190555 succeeded f473495
💚 Build #189880 succeeded 5931647
💚 Build #189733 succeeded b6b9bd2
💚 Build #189670 succeeded 76d7cd5
💛 Build #189476 was flaky 02e6017

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

neptunian added 2 commits January 2, 2024 08:49

draft of working services by host endpoint

d66dce4

type es agg response

d7fd655

neptunian force-pushed the 171661-infra-services-endpoint branch from 4279082 to d7fd655 Compare January 2, 2024 20:23

neptunian and others added 10 commits January 3, 2024 11:10

cleanup

e609054

make date required

589e4c8

Merge branch 'main' into 171661-infra-services-endpoint

4ac488b

change filters name

ad84ed4

add size

a66aa68

Merge branch '171661-infra-services-endpoint' of https://github.com/neptunian/kibana into 171661-infra-services-endpoint

7796b27

Merge branch 'main' into 171661-infra-services-endpoint

614bc18

add some basic tests, add synthtrace service to metrics api tests, add custom buildRouteValidationWithExcess

8b5ff6d

Merge branch 'main' into 171661-infra-services-endpoint

6523a29

Merge branch 'main' into 171661-infra-services-endpoint

67afdfa

neptunian marked this pull request as ready for review January 18, 2024 20:59

neptunian requested review from a team as code owners January 18, 2024 20:59

neptunian added the Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team label Jan 18, 2024

neptunian added the release_note:skip Skip the PR/issue when compiling release notes label Jan 18, 2024

neptunian changed the title ~~[Obs UX] infra services api~~ [Infra] infra services api Jan 18, 2024

neptunian and others added 8 commits January 24, 2024 15:05

remove unused import

45e0319

Merge branch '171661-infra-services-endpoint' of https://github.com/neptunian/kibana into 171661-infra-services-endpoint

aa40f99

add default size value

93f2af3

add test for logs only services

02e6017

Merge branch 'main' into 171661-infra-services-endpoint

76d7cd5

add support for ExactType schema in buildRouteValidationwithExcess and add unit tests

64802ab

Merge branch '171661-infra-services-endpoint' of https://github.com/neptunian/kibana into 171661-infra-services-endpoint

b6b9bd2

Merge branch 'main' into 171661-infra-services-endpoint

5931647

cauemarcondes reviewed Jan 30, 2024

x-pack/plugins/infra/server/routes/services/index.ts

neptunian and others added 2 commits January 31, 2024 12:25

Merge branch 'main' into 171661-infra-services-endpoint

e380d5e

[CI] Auto-commit changed files from 'node scripts/eslint --no-cache --fix'

f473495

neptunian requested a review from a team as a code owner January 31, 2024 18:10

neptunian requested review from pzl and tomsonpl January 31, 2024 18:10

pzl approved these changes Jan 31, 2024

Merge branch 'main' into 171661-infra-services-endpoint

720312b

Merge branch 'main' into 171661-infra-services-endpoint

b7d771c

crespocarlos approved these changes Feb 5, 2024

neptunian merged commit 6fc6950 into elastic:main Feb 5, 2024
17 checks passed

kibanamachine added v8.13.0 backport:skip This commit does not require backporting labels Feb 5, 2024
