[Reporting] Instrument for PerformanceMetrics capture #46851

tsullivan · 2019-09-27T19:27:47Z

Summary

Part of efforts towards #29281

This change has come up from an opportunity that came up after Typescriptifying most of Reporting and spending a lot of time looking into ways of improving performance for Kibana Reporting.

The screenshotsObservable module in Reporting is actually a pretty good module, from the perspective of its consumers. Currently, those consumers are the generatePdf and generatePng modules, however as part of scheduled reports, we'll want to call screenshotsObservable from code that will be coming out soon.

If anyone had an interest in capturing Kibana page performance stats in automation, adding the tooling with the chain of observables in screenshots/index.ts would be a good way to do it. I recommend using this module, because using the same module that's in our products is a way to dogfood our own work.

Previously, the screenshotsObservable fired data to observers that contained 2 fields of data: screenshots and timeRange. These were needed for generating PNG and PDF files, with proper titles (time range is part of the title). This PR adds an additional field of data called metrics which includes the page URL, a timestamp, and the metrics data exposed by the Puppeteer API: https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagemetrics

From there, the metrics data can be carried on in multiple ways:

add as metadata to a completed job document
add to Kibana stats for uploading to a Monitoring cluster
- "Kibana stats" means: /api/stats?extended
testing / CI
simple logging

Checklist

Use ~~strikethroughs~~ to remove checklist items you don't feel are applicable to this PR.

~~[ ] This was checked for cross-browser compatibility, including a check against IE11~~
~~[ ] Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support~~
~~[ ] Documentation was added for features that require explanation or tutorials~~ [1]
~~[ ] Unit or functional tests were updated or added to match the most common scenarios~~ [2]
~~[ ] This was checked for keyboard-only and screenreader accessibility~~

[1] There won't be documentation until we are able to make use of this data.
[2] screenshotsObservable today does not have any unit tests. Functional tests should be possible once the new metric data is used or shown via the stats API.

elasticmachine · 2019-09-27T19:27:50Z

Pinging @elastic/kibana-stack-services

tsullivan · 2019-09-27T19:31:53Z

x-pack/legacy/plugins/reporting/server/browsers/chromium/driver_factory/index.ts

@@ -211,6 +211,7 @@ export class HeadlessChromiumDriverFactory {
      mergeMap((err: Error) => Rx.throwError(err))
    );

+    // FIXME: log it but don't throwError?


Added this comment as a way to have a discussion about it: This throwError causes some screenshots of dashboards to fail, because of aborted request errors. Example:

server error [12:29:48.632] [error][reporting-tools-plugin] Error: Request to [https://spicy.local/kbn/elasticsearch/_msearch?rest_total_hits_as_int=true&ignore_throttled=true] failed! [net::ERR_ABORTED] at MergeMapSubscriber.Rx.fromEvent.pipe.req [as project] (/Users/tsullivan/elastic/kibana/x-pack/legacy/plugins/reporting/server/browsers/chromium/driver_factory/index.ts:220:13) at MergeMapSubscriber._tryNext (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/operators/mergeMap.ts:132:21) at MergeMapSubscriber._next (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/operators/mergeMap.ts:122:12) at MergeMapSubscriber.Subscriber.next (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/Subscriber.ts:102:12) at Page.handler (/Users/tsullivan/elastic/kibana/node_modules/rxjs/src/internal/observable/fromEvent.ts:197:20) at Page.emit (events.js:189:13) at NetworkManager.Page.networkManager.on.event (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Page.js:112:74) at NetworkManager.emit (events.js:189:13) at NetworkManager._onLoadingFailed (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/NetworkManager.js:315:10) at CDPSession.emit (events.js:189:13) at CDPSession._onMessage (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Connection.js:200:12) at Connection._onMessage (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/Connection.js:112:17) at PipeTransport._dispatch (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/PipeTransport.js:61:22) at Socket.PipeTransport._eventListeners.helper.addEventListener.buffer (/Users/tsullivan/elastic/kibana/node_modules/puppeteer-core/lib/PipeTransport.js:30:64) at Socket.emit (events.js:189:13) at addChunk (_stream_readable.js:284:12)

This code essentially handles page.on('requestfailed', ...) by throwing an error and cancelling out the observable to everything but error handlers. I like the detail we are adding in the logs, but I think logging is maybe the only thing this handler should do.

elasticmachine · 2019-09-27T20:40:34Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

joelgriffith · 2019-09-30T17:24:29Z

x-pack/legacy/plugins/reporting/server/browsers/chromium/driver/chromium_driver.ts

@@ -80,6 +81,15 @@ export class HeadlessChromiumDriver {
    logger.info(`handled ${interceptedCount} page requests`);
  }

+  public async getMetrics(): Promise<PerformanceMetrics> {
+    const puppeteerMetrics = await this.page.metrics();


Wow, can't believe it's that simple of a call

Yeah... that kind of tells me the peripheral changes here should be a separate PR. I'm actually a little hesitant with just adding this call without a followup plan on what to do with the data.

joelgriffith · 2019-09-30T17:26:08Z

x-pack/legacy/plugins/reporting/export_types/common/lib/screenshots/types.ts

@@ -37,14 +36,21 @@ export interface Screenshot {
  description: any;
 }

-export interface ScreenShotOpts {


I'm confused on this deletion: did these go anywhere or are they no longer needed?

I removed the Opts references from these lines: https://github.com/elastic/kibana/pull/46851/files#diff-1a70afc14f0154fde139ff3e5eeca902L116-L133 because they didn't provide a value of safety, and they were creating hard-to-read syntax.

They got into the code originally to prevent NOTE: Typescript does not throw an error if this interface has errors! which is a new comment in this PR. But Typescript just does not match the observable's payload data with what the function should return, so that code isn't saving us from mistakes.

As a replacement for all those Opts types, I kept the type references there just for the first time it is referenced.

So in:

(browser, metrics: PerformanceMetrics | null) => ({ browser, metrics })

is the first time the metrics variable was referenced. In:

({ browser, metrics }, timeRange: TimeRange | null) => ({ browser, metrics, timeRange })

is the first time the timeRange variable is referenced

Typescript seems to be able to understand that the output of one mergeMap should match the input of the next mergeMap, but I haven't found a way to make it complain if the observable isn't what the function says it should return: Rx.Observable<ScreenshotResults>

joelgriffith

LGTM, just wanted to make sure we don't lose any types that are needed elsewhere

chrisronline · 2019-09-30T20:19:02Z

@tsullivan

add to Kibana stats for uploading to a Monitoring cluster
"Kibana stats" means: /api/stats?extended

I don't see any changes affecting this. Is this listed as work to do in the future?

tsullivan · 2019-09-30T21:07:52Z

I don't see any changes affecting this. Is this listed as work to do in the future?

Yes, something to do in the future. We would work with the Monitoring team to find the best way of doing this, but that can be done after data starts to be available.

elasticmachine · 2019-09-30T22:36:07Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

tsullivan · 2019-09-30T22:53:14Z

just wanted to make sure we don't lose any types that are needed elsewhere

The types that I removed weren't in the code for very long. They were a shortcut to define the types going through the observable pipeline as the data accumulates. It's limited to that, and it was removed because the shortcut wasn't adding actual type safety or the readability that I wanted.

tsullivan · 2019-10-03T19:32:42Z

I prefer to hold off merging and further work on this until #46924 is resolved. That would give us a service to send the collected metrics into.

Bamieh · 2019-10-04T15:55:47Z

@tsullivan we have ui_metrics plugin. I have an upcoming feature to add performance metrics. I acually have the code written but pushed it back until I am able to update the way we report those metrics (es mapping and how we'd consume them in the telemetry dashboard). I'd love to talk more about this before it gets merged. I am interested in learning your usecase for reporting and how you're imagining consuming this data. We have ui_metrics in which this could be baked into.

tsullivan · 2019-10-07T16:33:50Z

@tsullivan we have ui_metrics plugin. I have an upcoming feature to add performance metrics. I acually have the code written but pushed it back until I am able to update the way we report those metrics (es mapping and how we'd consume them in the telemetry dashboard).

We talked offline and it sounds like what your working on covers the need that this PR would provide, so I guess this overlaps with your work. Glad we synced up on it!

I imagine that the metrics you are capturing will allow users to see what their "heaviest" dashboards and visualizations are, which will be useful information for Reporting.

In Reporting, we still need to capture the Reporting-specific metrics, such as how much memory and CPU utilization Reporting consumes on a server, and how much disk space is consumed by the historical reports. There are issues to track those needs:

tsullivan · 2019-10-07T17:35:07Z

I'm going to close this PR to support the centralized collection of these stats :)

tsullivan added 3 commits September 27, 2019 12:01

[Reporting] Instrument for PerformanceMetrics capture

8df73f5

interface of screenshotsObservable payload

024b3b2

revert most of the dev hack changes

a599a78

tsullivan added review (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead v8.0.0 release_note:skip Skip the PR/issue when compiling release notes Team:Stack Services v7.5.0 labels Sep 27, 2019

tsullivan requested review from tylersmalley, ycombinator, joelgriffith, stacey-gammon and a team September 27, 2019 19:27

tsullivan commented Sep 27, 2019

View reviewed changes

ycombinator requested review from a team and removed request for ycombinator September 27, 2019 20:54

joelgriffith reviewed Sep 30, 2019

View reviewed changes

joelgriffith approved these changes Sep 30, 2019

View reviewed changes

tsullivan added 2 commits September 30, 2019 14:24

Merge branch 'master' into reporting/runnable-screenshots

a213310

Merge branch 'master' into reporting/runnable-screenshots

d48f29e

tsullivan mentioned this pull request Oct 1, 2019

[Reporting] Cleanup to ScreenshotResults interface #47081

Merged

2 tasks

tsullivan added the stalled label Oct 3, 2019

tsullivan mentioned this pull request Oct 7, 2019

[Reporting] Add counter for total size of exported reports #30495

Closed

tsullivan closed this Oct 7, 2019

tsullivan deleted the reporting/runnable-screenshots branch October 18, 2019 19:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Reporting] Instrument for PerformanceMetrics capture #46851

[Reporting] Instrument for PerformanceMetrics capture #46851

tsullivan commented Sep 27, 2019 •

edited

Loading

elasticmachine commented Sep 27, 2019

tsullivan Sep 27, 2019

elasticmachine commented Sep 27, 2019

joelgriffith Sep 30, 2019

tsullivan Oct 1, 2019

tsullivan Oct 1, 2019

joelgriffith Sep 30, 2019

tsullivan Sep 30, 2019

joelgriffith left a comment

chrisronline commented Sep 30, 2019

tsullivan commented Sep 30, 2019

elasticmachine commented Sep 30, 2019

tsullivan commented Sep 30, 2019

tsullivan commented Oct 3, 2019

Bamieh commented Oct 4, 2019 •

edited

Loading

tsullivan commented Oct 7, 2019

tsullivan commented Oct 7, 2019

[Reporting] Instrument for PerformanceMetrics capture #46851

[Reporting] Instrument for PerformanceMetrics capture #46851

Conversation

tsullivan commented Sep 27, 2019 • edited Loading

Summary

Checklist

elasticmachine commented Sep 27, 2019

tsullivan Sep 27, 2019

Choose a reason for hiding this comment

elasticmachine commented Sep 27, 2019

💚 Build Succeeded

joelgriffith Sep 30, 2019

Choose a reason for hiding this comment

tsullivan Oct 1, 2019

Choose a reason for hiding this comment

tsullivan Oct 1, 2019

Choose a reason for hiding this comment

joelgriffith Sep 30, 2019

Choose a reason for hiding this comment

tsullivan Sep 30, 2019

Choose a reason for hiding this comment

joelgriffith left a comment

Choose a reason for hiding this comment

chrisronline commented Sep 30, 2019

tsullivan commented Sep 30, 2019

elasticmachine commented Sep 30, 2019

💚 Build Succeeded

tsullivan commented Sep 30, 2019

tsullivan commented Oct 3, 2019

Bamieh commented Oct 4, 2019 • edited Loading

tsullivan commented Oct 7, 2019

tsullivan commented Oct 7, 2019

tsullivan commented Sep 27, 2019 •

edited

Loading

Bamieh commented Oct 4, 2019 •

edited

Loading