Enable UI Metric Collector #6203

Merged on Apr 30, 2024 (1 commit)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -108,6 +108,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- [Workspace] Add workspaces filter to saved objects page. ([#6458](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/6458))
- [Multiple Datasource] Support multi data source in Region map ([#6654](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/6654))
- Add `rightNavigationButton` component in chrome service for applications to register and add dev tool to top right navigation. ([#6553](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/6553))
- Enable UI Metric Collector to collect UI Metrics and Application Usage ([#6203](https://github.com/opensearch-project/OpenSearch-Dashboards/pull/6203))

### 🐛 Bug Fixes

4 changes: 4 additions & 0 deletions config/opensearch_dashboards.yml
@@ -313,3 +313,7 @@

# Set the value to true to enable workspace feature
# workspace.enabled: false

# Set the value to true to enable Ui Metric Collectors in Usage Collector
Member

Let's call out that this might cause a permission issue with OpenSearch; turn it on with caution.

Collaborator Author

Updated the comments

# This publishes the Application Usage and UI Metrics into the saved object, which can be accessed by /api/stats?extended=true&legacy=true&exclude_usage=false
# usageCollection.uiMetric.enabled: false
@Flyingliuhub (Member), Mar 27, 2024

Did you get a chance to try this feature flag? I saw this warning locally.

server    log   [22:53:51.746] [warning][collector-set][plugins][usageCollection] TypeError: Cannot destructure property 'index' of '(intermediate value).kibana' as it is undefined.
    at UsageCollector.fetch (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/vis_type_vega/server/usage_collector/get_usage_collector.ts:154:15)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at /home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/usage_collection/server/collector/collector_set.ts:150:21
    at async Promise.all (index 10)
    at CollectorSet.bulkFetch (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/usage_collection/server/collector/collector_set.ts:144:23)
    at CollectorSet.bulkFetchUsage (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/usage_collection/server/collector/collector_set.ts:177:12)
    at getUsage (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/usage_collection/server/routes/stats/stats.ts:80:19)
    at async Promise.all (index 0)
    at /home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/plugins/usage_collection/server/routes/stats/stats.ts:125:38
    at Router.handle (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/core/server/http/router/router.ts:286:44)
    at handler (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/src/core/server/http/router/router.ts:241:11)
    at exports.Manager.execute (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/node_modules/@hapi/hapi/lib/toolkit.js:60:28)
    at Object.internals.handler (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/node_modules/@hapi/hapi/lib/handler.js:46:20)
    at exports.execute (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/node_modules/@hapi/hapi/lib/handler.js:31:20)
    at Request._lifecycle (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/node_modules/@hapi/hapi/lib/request.js:371:32)
    at Request._execute (/home/ubuntu/opensearchwork/OpenSearch-Dashboards-0721/OpenSearch-Dashboards/node_modules/@hapi/hapi/lib/request.js:281:9)
server    log   [22:53:51.758] [warning][collector-set][plugins][usageCollection] Unable to fetch data from vis_type_vega collector

@LDrago27 (Collaborator, Author), Mar 28, 2024

Thanks @Flyingliuhub for spotting the error.

This issue is not related to my current change; it can also be found on the current main branch. I have found the root cause of the issue and have included a fix in this PR.

How to reproduce this issue?

  • Run OpenSearch Dashboards
  • Call /api/stats?extended=true&legacy=true&exclude_usage=false (it is important to include the exclude_usage=false flag)

Root Cause:
The Vega visualization does not use the ui_metric collector; it registers its own custom usage collector. When we call api/stats with the exclude_usage=false flag, every usage collector is asked to fetch its metrics, which are then consolidated and returned. Vega's fetch method tries to read its stats from an index, but it was unable to resolve the index name from the config.

It was looking for the index field under the kibana section of the config; however, this information lives under the opensearchDashboards section. Changing the path resolves the issue.

Config:
{ opensearchDashboards: { index: '.kibana' } }
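
For illustration, a simplified sketch of the lookup before and after the fix (the `resolveIndex` helper is hypothetical; the actual change is the one-line edit in `getUsageCollector`'s `fetch`, visible in the diff further down):

```ts
import { Observable } from 'rxjs';
import { first } from 'rxjs/operators';

// Matches the corrected ConfigObservable type in src/plugins/vis_type_vega/server/types.ts
type ConfigObservable = Observable<{ opensearchDashboards: { index: string } }>;

// Hypothetical helper showing why the old path failed.
async function resolveIndex(config: ConfigObservable): Promise<string> {
  const resolved = await config.pipe(first()).toPromise();
  // Before the fix the collector read `resolved.kibana.index`; `kibana` does not
  // exist in the OSD config, so destructuring `index` from it threw the
  // "Cannot destructure property 'index'" TypeError shown in the warning above.
  return resolved.opensearchDashboards.index;
}
```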

Why was it not noticed until now?
The workflow that fetches metrics from usage collectors is triggered either by telemetry or by calling the api/stats API with exclude_usage=false. Since telemetry was disabled and api/stats was not being called with exclude_usage=false, this issue went unnoticed.

Member

@LDrago27 can you create a new issue for this?

Collaborator Author

Sure. #6376

14 changes: 7 additions & 7 deletions packages/osd-analytics/src/reporter.ts
@@ -36,7 +36,7 @@ import { Report, ReportManager } from './report';
import { ApplicationUsage } from './metrics';

export interface ReporterConfig {
// http: ReportHTTP;
http: ReportHTTP;
storage?: Storage;
checkInterval?: number;
debug?: boolean;
@@ -49,7 +49,7 @@ export class Reporter {
checkInterval: number;
private interval?: NodeJS.Timer;
private lastAppId?: string;
// private http: ReportHTTP;
private http: ReportHTTP;
private reportManager: ReportManager;
private storageManager: ReportStorageManager;
private readonly applicationUsage: ApplicationUsage;
@@ -59,8 +59,8 @@
private started = false;

constructor(config: ReporterConfig) {
const { storage, debug, checkInterval = 90000, storageKey = 'analytics' } = config;
// this.http = http;
const { http, storage, debug, checkInterval = 90000, storageKey = 'analytics' } = config;
this.http = http;
this.checkInterval = checkInterval;
this.applicationUsage = new ApplicationUsage();
this.storageManager = new ReportStorageManager(storageKey, storage);
@@ -144,14 +144,14 @@
public reportApplicationUsage(appId?: string) {
this.log(`Reporting application changed to ${appId}`);
this.lastAppId = appId || this.lastAppId;
// const appChangedReport = this.applicationUsage.appChanged(appId);
// if (appChangedReport) this.saveToReport([appChangedReport]);
const appChangedReport = this.applicationUsage.appChanged(appId);
if (appChangedReport) this.saveToReport([appChangedReport]);
}

public sendReports = async () => {
if (!this.reportManager.isReportEmpty()) {
try {
// await this.http(this.reportManager.report);
await this.http(this.reportManager.report);
this.flushReport();
} catch (err) {
this.log(`Error Sending Metrics Report ${err}`);
20 changes: 10 additions & 10 deletions src/plugins/usage_collection/public/services/create_reporter.ts
@@ -38,20 +38,20 @@ interface AnalyicsReporterConfig {
}

export function createReporter(config: AnalyicsReporterConfig): Reporter {
const { localStorage, debug } = config;
const { localStorage, debug, fetch } = config;

return new Reporter({
debug,
storage: localStorage,
// async http(report) {
// const response = await fetch.post('/api/ui_metric/report', {
Member

@ashwin-pc Do you know the reason why it was commented out previously? Is it related to a performance issue?

Member

I tried to hunt down the history here, but it looks like anything that seemed to send telemetry data was commented out during the fork out of caution, and this was one of those cases. @kavilla can provide more context, but I doubt this had anything to do with performance, since it does not add any significant overhead to the node server or its heap. If you can point me to any resources about that performance issue, we can try reproducing it to make sure performance isn't impacted. Otherwise, I hope to merge this sooner so that we can use the bake time to weed out any performance issues early on (if any).

Member

Collecting telemetry won't cause a load issue by itself. However, when there are too many UI activities/requests updating the saved object, that can put significant load on OpenSearch.

How about conducting some performance tests to collect baseline data, so we can inform customers about the performance impact (not on OSD, but on the target OpenSearch)? We don't need many tests; as a minimal set, 1 concurrent user and 10 concurrent users, with each user generating roughly 1 UI event per second.

I understand that we might not have a performance test setup as part of CI, but could we at least run some one-time tests to establish a baseline?

Another approach is to make it experimental (reminding users that they enable it in a production environment at their own risk) if they want this benefit.

As long as we don't introduce a one-way-door decision, it stays controllable behind a feature flag, and it is up to the user to make the call.

@LDrago27 (Collaborator, Author), Apr 8, 2024

@seraphjiang @ashwin-pc Thanks for raising this point. Generated UI metrics aren't saved to the saved object immediately; they are batched and saved at intervals. If you stay on a single page, the metrics are batched and saved every 90 seconds; if you switch between pages, the batched metrics are saved at the moment the switch happens. So client-side batching is already in place.

However, we don't have any server-side batching when saving the UI metrics into the saved object. A subsequent PR, tracked in issue #6375, will resolve this. In the meantime, the changes above are guarded by the feature flag, which is disabled by default.
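
As an illustration of the client-side batching described in this comment, a minimal sketch using the osd-analytics Reporter touched by this PR (the start and reportUiStats methods are assumed from the package's API and are not part of this diff):

```ts
import { Reporter, METRIC_TYPE } from '@osd/analytics';

const reporter = new Reporter({
  // With this PR, batched reports are POSTed to /api/ui_metric/report.
  http: async (report) => {
    await fetch('/api/ui_metric/report', {
      method: 'POST',
      body: JSON.stringify({ report }),
    });
  },
  checkInterval: 90000, // default: flush the accumulated batch every 90 seconds
});

reporter.start(); // assumed: begins the periodic flush timer

// Individual UI events only update the in-memory/localStorage report;
// nothing is sent to the server until the next flush.
reporter.reportUiStats('dashboards', METRIC_TYPE.CLICK, 'save_button');
reporter.reportApplicationUsage('visualize'); // app switches are also recorded
```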

// body: JSON.stringify({ report }),
// });
async http(report) {
const response = await fetch.post('/api/ui_metric/report', {
body: JSON.stringify({ report }),
});

// if (response.status !== 'ok') {
// throw Error('Unable to store report.');
// }
// return response;
// },
if (response.status !== 'ok') {
throw Error('Unable to store report.');
}
return response;
},
});
}
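
For completeness, a hypothetical plugin-side sketch of how a UI metric is reported once the flag is enabled, assuming the usage_collection public setup contract exposes reportUiStats (which funnels into the reporter created above); the plugin name, event name, and import paths are illustrative:

```ts
import { CoreSetup, Plugin } from 'opensearch-dashboards/public';
import { METRIC_TYPE } from '@osd/analytics';
import { UsageCollectionSetup } from 'src/plugins/usage_collection/public';

interface MyPluginSetupDeps {
  usageCollection?: UsageCollectionSetup; // optional, in case the plugin is disabled
}

export class MyPlugin implements Plugin<void, void, MyPluginSetupDeps> {
  public setup(core: CoreSetup, { usageCollection }: MyPluginSetupDeps) {
    // Each call appends an entry to the batched report that is eventually
    // persisted via POST /api/ui_metric/report.
    usageCollection?.reportUiStats('myPlugin', METRIC_TYPE.CLICK, 'export_button');
  }
}
```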
2 changes: 1 addition & 1 deletion src/plugins/usage_collection/server/config.ts
@@ -34,7 +34,7 @@ import { DEFAULT_MAXIMUM_WAIT_TIME_FOR_ALL_COLLECTORS_IN_S } from '../common/con

export const configSchema = schema.object({
uiMetric: schema.object({
enabled: schema.boolean({ defaultValue: true }),
enabled: schema.boolean({ defaultValue: false }),
debug: schema.boolean({ defaultValue: schema.contextRef('dev') }),
}),
maximumWaitTimeForAllCollectorsInS: schema.number({
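
Since the default is now false, the collector has to be enabled explicitly (usageCollection.uiMetric.enabled: true in opensearch_dashboards.yml). A quick way to check the collected data afterwards is the stats endpoint mentioned in the config comment above; a minimal sketch, assuming a local instance on port 5601 with no base path or authentication:

```ts
// Node 18+ (global fetch); adjust host, port, and credentials to your setup.
async function dumpStats() {
  const res = await fetch(
    'http://localhost:5601/api/stats?extended=true&legacy=true&exclude_usage=false'
  );
  const stats = await res.json();
  // The UI metric and application usage collectors contribute to the usage section.
  console.log(JSON.stringify(stats, null, 2));
}

dumpStats().catch(console.error);
```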
@@ -28,15 +28,14 @@
* under the License.
*/

// import { schema } from '@osd/config-schema';
import { schema } from '@osd/config-schema';
import { IRouter, ISavedObjectsRepository } from 'opensearch-dashboards/server';
// import { storeReport, reportSchema } from '../report';
import { storeReport, reportSchema } from '../report';

export function registerUiMetricRoute(
router: IRouter,
getSavedObjects: () => ISavedObjectsRepository | undefined
) {
/*
router.post(
{
path: '/api/ui_metric/report',
@@ -59,5 +58,5 @@ export function registerUiMetricRoute(
return res.ok({ body: { status: 'fail' } });
}
}
);*/
);
}
2 changes: 1 addition & 1 deletion src/plugins/vis_type_vega/server/types.ts
@@ -33,7 +33,7 @@ import { DataSourcePluginSetup } from 'src/plugins/data_source/server';
import { HomeServerPluginSetup } from '../../home/server';
import { UsageCollectionSetup } from '../../usage_collection/server';

export type ConfigObservable = Observable<{ kibana: { index: string } }>;
export type ConfigObservable = Observable<{ opensearchDashboards: { index: string } }>;

export interface VegaSavedObjectAttributes {
title: string;
@@ -87,7 +87,7 @@ const getMockCallCluster = (hits?: unknown[]) =>
jest.fn().mockReturnValue(Promise.resolve({ hits: { hits } }) as unknown) as LegacyAPICaller;

describe('Vega visualization usage collector', () => {
const configMock = of({ kibana: { index: '' } });
const configMock = of({ opensearchDashboards: { index: '' } });
const usageCollector = getUsageCollector(configMock, {
home: ({
sampleData: {
@@ -151,7 +151,7 @@ export function getUsageCollector(
type: VEGA_USAGE_TYPE,
isReady: () => true,
fetch: async (callCluster: LegacyAPICaller) => {
const { index } = (await config.pipe(first()).toPromise()).kibana;
const { index } = (await config.pipe(first()).toPromise()).opensearchDashboards;
Member

nice catch


return await getStats(callCluster, index, dependencies);
},