
[Reporting] Instrument for PerformanceMetrics capture #46851

Closed

Conversation

@tsullivan (Member) commented Sep 27, 2019

Summary

Part of efforts towards #29281

This change came out of an opportunity that surfaced after TypeScriptifying most of Reporting and spending a lot of time looking into ways to improve Kibana Reporting performance.

The screenshotsObservable module in Reporting is actually a pretty good module from the perspective of its consumers. Currently, those consumers are the generatePdf and generatePng modules; however, as part of scheduled reports, we'll want to call screenshotsObservable from code that will be coming out soon.

If anyone has an interest in capturing Kibana page performance stats in automation, adding the tooling to the chain of observables in screenshots/index.ts would be a good way to do it. I recommend using this module, because using the same module that ships in our products is a way to dogfood our own work.

Previously, screenshotsObservable fired data to observers containing two fields: screenshots and timeRange. These were needed for generating PNG and PDF files with proper titles (the time range is part of the title). This PR adds an additional field called metrics, which includes the page URL, a timestamp, and the metrics data exposed by the Puppeteer API: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagemetrics
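
For illustration, here is a rough sketch of the kind of data the new field carries. The interface and helper names here are simplified placeholders, not the exact shape in the PR:

import { Page, Metrics } from 'puppeteer';

// Hypothetical names, for illustration only: the new field carries the page
// URL, a capture timestamp, and the raw counters returned by Puppeteer's
// page.metrics() (Timestamp, JSHeapUsedSize, TaskDuration, ...).
interface CapturedMetrics {
  url: string;
  timestamp: string;
  puppeteerMetrics: Metrics;
}

async function captureMetrics(page: Page): Promise<CapturedMetrics> {
  return {
    url: page.url(), // the Kibana URL the headless browser is visiting
    timestamp: new Date().toISOString(),
    puppeteerMetrics: await page.metrics(),
  };
}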

From there, the metrics data can be carried on in multiple ways:

  • add as metadata to a completed job document
  • add to Kibana stats for uploading to a Monitoring cluster
    • "Kibana stats" means: /api/stats?extended
  • testing / CI
  • simple logging

Checklist

Use strikethroughs to remove checklist items you don't feel are applicable to this PR.

[1] There won't be documentation until we are able to make use of this data.
[2] screenshotsObservable today does not have any unit tests. Functional tests should be possible once the new metric data is used or shown via the stats API.

@tsullivan tsullivan added review (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead v8.0.0 release_note:skip Skip the PR/issue when compiling release notes Team:Stack Services v7.5.0 labels Sep 27, 2019
@elasticmachine (Contributor)

Pinging @elastic/kibana-stack-services

@@ -211,6 +211,7 @@ export class HeadlessChromiumDriverFactory {
mergeMap((err: Error) => Rx.throwError(err))
);

// FIXME: log it but don't throwError?
@tsullivan (Member, author)

I added this comment as a way to start a discussion: this throwError causes some dashboard screenshots to fail because of aborted request errors. Example:

server   error  [12:29:48.632] [error][reporting-tools-plugin] Error: Request to [https://spicy.local/kbn/elasticsearch/_msearch?rest_total_hits_as_int=true&ignore_throttled=true] failed! [net::ERR_ABORTED]
    at MergeMapSubscriber.Rx.fromEvent.pipe.req [as project] (/Users/tsullivan/elastic/kibana/x-pack/legacy/plugins/reporting/server/browsers/chromium/driver_factory/index.ts:220:13)
    at MergeMapSubscriber._tryNext (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/operators/mergeMap.ts:132:21)
    at MergeMapSubscriber._next (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/operators/mergeMap.ts:122:12)
    at MergeMapSubscriber.Subscriber.next (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:102:12)
    at Page.handler (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/observable/fromEvent.ts:197:20)
    at Page.emit (events.js:189:13)
    at NetworkManager.Page.networkManager.on.event (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Page.js:112:74)
    at NetworkManager.emit (events.js:189:13)
    at NetworkManager._onLoadingFailed (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/NetworkManager.js:315:10)
    at CDPSession.emit (events.js:189:13)
    at CDPSession._onMessage (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Connection.js:200:12)
    at Connection._onMessage (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Connection.js:112:17)
    at PipeTransport._dispatch (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/PipeTransport.js:61:22)
    at Socket.PipeTransport._eventListeners.helper.addEventListener.buffer (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/PipeTransport.js:30:64)
    at Socket.emit (events.js:189:13)
    at addChunk (_stream_readable.js:284:12)

This code essentially handles page.on('requestfailed', ...) by throwing an error, which cancels the observable for everything but error handlers. I like the detail we are adding in the logs, but I think logging is maybe the only thing this handler should do.
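
For illustration, something like this rough sketch (not the actual driver code; the logger interface is simplified, and the handler name is made up):

import * as Rx from 'rxjs';
import { map } from 'rxjs/operators';
import { Page } from 'puppeteer';

// Sketch: handle page.on('requestfailed', ...) by logging the failure,
// rather than cancelling the whole observable chain with Rx.throwError().
function observeRequestFailed$(page: Page, logger: { warning: (msg: string) => void }) {
  return Rx.fromEvent(page, 'requestfailed').pipe(
    map((req: any) => {
      const failure = req.failure();
      logger.warning(
        `Request to [${req.url()}] failed! [${failure ? failure.errorText : 'unknown'}]`
      );
      // Intentionally no throwError: a failed sub-request should not abort the screenshot.
    })
  );
}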

@elasticmachine (Contributor)

💚 Build Succeeded

@ycombinator ycombinator requested review from a team and removed request for ycombinator September 27, 2019 20:54
@@ -80,6 +81,15 @@ export class HeadlessChromiumDriver {
logger.info(`handled ${interceptedCount} page requests`);
}

public async getMetrics(): Promise<PerformanceMetrics> {
const puppeteerMetrics = await this.page.metrics();
Contributor

Wow, can't believe it's that simple of a call

@tsullivan (Member, author)

Yeah... that kind of tells me the peripheral changes here should be a separate PR. I'm actually a little hesitant about adding this call without a follow-up plan for what to do with the data.


@@ -37,14 +36,21 @@ export interface Screenshot {
description: any;
}

export interface ScreenShotOpts {
Contributor

I'm confused by this deletion: did these go anywhere, or are they no longer needed?

@tsullivan (Member, author)

I removed the Opts references from these lines: https://github.com/elastic/kibana/pull/46851/files#diff-1a70afc14f0154fde139ff3e5eeca902L116-L133 because they didn't provide any real safety and they created hard-to-read syntax.

They originally got into the code to address the problem noted in NOTE: Typescript does not throw an error if this interface has errors!, which is a new comment in this PR. But TypeScript simply does not check the observable's payload data against what the function claims to return, so that code wasn't saving us from mistakes.

As a replacement for all those Opts types, I kept a type annotation only at the point where each value is first referenced.

So, in:

(browser, metrics: PerformanceMetrics | null) => ({ browser, metrics })

metrics is referenced for the first time, and in:

({ browser, metrics }, timeRange: TimeRange | null) => ({ browser, metrics, timeRange })

timeRange is referenced for the first time.

TypeScript seems to understand that the output of one mergeMap should match the input of the next, but I haven't found a way to make it complain if the observable doesn't actually match what the function says it returns: Rx.Observable<ScreenshotResults>.
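
Roughly, the pattern looks like this. It's a simplified sketch with stand-in types and observables (not the real Reporting pipeline), using the resultSelector form of mergeMap shown above, which is deprecated in newer RxJS versions:

import * as Rx from 'rxjs';
import { mergeMap } from 'rxjs/operators';

// Stand-in types and observables for illustration only.
interface PerformanceMetrics { JSHeapUsedSize?: number }
interface TimeRange { from: string; to: string }

declare const browser$: Rx.Observable<{ id: string }>;
declare function getMetrics$(): Rx.Observable<PerformanceMetrics | null>;
declare function getTimeRange$(): Rx.Observable<TimeRange | null>;

// Each value gets a type annotation only the first time it enters the pipeline;
// later operators rely on inference from the previous operator's output.
const results$ = browser$.pipe(
  mergeMap(
    () => getMetrics$(),
    (browser, metrics: PerformanceMetrics | null) => ({ browser, metrics })
  ),
  mergeMap(
    () => getTimeRange$(),
    ({ browser, metrics }, timeRange: TimeRange | null) => ({ browser, metrics, timeRange })
  )
);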

@joelgriffith (Contributor) left a comment

LGTM, just wanted to make sure we don't lose any types that are needed elsewhere

@chrisronline (Contributor)

@tsullivan

add to Kibana stats for uploading to a Monitoring cluster
"Kibana stats" means: /api/stats?extended

I don't see any changes affecting this. Is this listed as work to do in the future?

@tsullivan (Member, author)

I don't see any changes affecting this. Is this listed as work to do in the future?

Yes, something to do in the future. We would work with the Monitoring team to find the best way of doing this, but that can be done after data starts to be available.

@elasticmachine (Contributor)

💚 Build Succeeded

@tsullivan (Member, author)

just wanted to make sure we don't lose any types that are needed elsewhere

The types that I removed weren't in the code for very long. They were a shortcut to define the types flowing through the observable pipeline as the data accumulates. Their use was limited to that, and they were removed because the shortcut wasn't adding real type safety or the readability that I wanted.

@tsullivan (Member, author)

I prefer to hold off merging and further work on this until #46924 is resolved. That would give us a service to send the collected metrics into.

@Bamieh (Member) commented Oct 4, 2019

@tsullivan we have the ui_metrics plugin. I have an upcoming feature to add performance metrics. I actually have the code written but pushed it back until I'm able to update the way we report those metrics (the ES mapping and how we'd consume them in the telemetry dashboard). I'd love to talk more about this before it gets merged. I'm interested in learning your use case for Reporting and how you're imagining consuming this data. We have ui_metrics, which this could be baked into.

@tsullivan (Member, author)

@tsullivan we have the ui_metrics plugin. I have an upcoming feature to add performance metrics. I actually have the code written but pushed it back until I'm able to update the way we report those metrics (the ES mapping and how we'd consume them in the telemetry dashboard).

We talked offline, and it sounds like what you're working on covers the need this PR would have provided, so this overlaps with your work. Glad we synced up on it!

I imagine that the metrics you are capturing will allow users to see what their "heaviest" dashboards and visualizations are, which will be useful information for Reporting.

In Reporting, we still need to capture the Reporting-specific metrics, such as how much memory and CPU Reporting consumes on a server, and how much disk space is consumed by the historical reports. There are issues to track those needs:

@tsullivan (Member, author)

I'm going to close this PR to support the centralized collection of these stats :)

@tsullivan tsullivan closed this Oct 7, 2019
@tsullivan tsullivan deleted the reporting/runnable-screenshots branch October 18, 2019 19:47