feat: base measureFunction API #271

Merged · 24 commits · Oct 11, 2023
Commits
d0d175f
feat: base measureExecution API
mdjastrzebski Nov 24, 2022
d19738b
fix: build errors due to invalid imports
adhorodyski Sep 29, 2023
9e19197
chore: rename dropWorst to warmupRuns for consistency
adhorodyski Sep 29, 2023
c696349
chore: adjust naming convention to drop render prefix where possible
adhorodyski Sep 29, 2023
e5975f8
refactor: collect current time using performance api
adhorodyski Sep 29, 2023
5cad008
feat: add type property to each measurement
adhorodyski Oct 10, 2023
a5b94d5
feat: render test type within the output
adhorodyski Oct 10, 2023
c74fdcd
docs: align methodology categories with the new output
adhorodyski Oct 10, 2023
994cb0d
docs: add measureFunction api, fix formatting
adhorodyski Oct 10, 2023
0a60da0
fix: unit tests for compare module
adhorodyski Oct 10, 2023
e6335dd
fix: save compare test snapshot with a matching node version
adhorodyski Oct 10, 2023
f9009b4
fix: default to render when validating performance entries
adhorodyski Oct 10, 2023
9ed6790
docs: tweaks
mdjastrzebski Oct 11, 2023
4644476
docs: more tweaks
mdjastrzebski Oct 11, 2023
327d0e2
refactor: code review change
mdjastrzebski Oct 11, 2023
5f4cd7e
refactor: cleanup
mdjastrzebski Oct 11, 2023
9c97a8a
refactor: code review changes
mdjastrzebski Oct 11, 2023
64b89aa
refactor: code review changes
mdjastrzebski Oct 11, 2023
9365419
chore: add tests
mdjastrzebski Oct 11, 2023
d4bb666
docs: tweak
mdjastrzebski Oct 11, 2023
8e08125
chore: fix build
mdjastrzebski Oct 11, 2023
ccca977
refactor: cleanup
mdjastrzebski Oct 11, 2023
0febcd6
docs: tweaks
mdjastrzebski Oct 11, 2023
30c0767
refactor: tweaks
mdjastrzebski Oct 11, 2023
40 changes: 34 additions & 6 deletions README.md
@@ -51,6 +51,8 @@ You can think about it as a React performance testing library. In fact, Reassure

Reassure works by measuring render characteristics – duration and count – of the testing scenario you provide and comparing that to the stable version. It repeats the scenario multiple times to reduce the impact of random variations in render times caused by the runtime environment. Then, it applies statistical analysis to determine whether the code changes are statistically significant. As a result, it generates a human-readable report summarizing the results and displays it on the CI or as a comment to your pull request.

In addition to measuring component render times, it can also measure the execution time of regular JavaScript functions.

## Installation and setup

To install Reassure, run the following command in your app folder:
@@ -158,7 +160,7 @@ To measure your first test performance, you need to run the following command in
yarn reassure
```

This command will run your tests multiple times using Jest, gathering render statistics and will write them to `.reassure/current.perf` file. To check your setup, check if the output file exists after running the command for the first time.
This command will run your tests multiple times using Jest, gathering performance statistics, and will write them to the `.reassure/current.perf` file. To verify your setup, check that the output file exists after running the command for the first time.

> **Note:** You can add `.reassure/` folder to your `.gitignore` file to avoid accidentally committing your results.

@@ -340,9 +342,9 @@ You can refer to our example [GitHub workflow](https://github.com/callstack/reas

Looking at the example, you can notice that test scenarios can be assigned to certain categories:

- **Significant Changes To Render Duration** shows test scenarios where the change is statistically significant and **should** be looked into as it marks a potential performance loss/improvement
- **Meaningless Changes To Render Duration** shows test scenarios where the change is not statistically significant
- **Changes To Render Count** shows test scenarios where the render count did change
- **Significant Changes To Duration** shows test scenarios where the performance change is statistically significant and **should** be looked into as it marks a potential performance loss/improvement
- **Meaningless Changes To Duration** shows test scenarios where the performance change is not statistically significant
- **Changes To Count** shows test scenarios where the render or execution count did change
- **Added Scenarios** shows test scenarios which do not exist in the baseline measurements
- **Removed Scenarios** shows test scenarios which do not exist in the current measurements

@@ -357,7 +359,10 @@ measuring its performance and writing results to the output file. You can use th
of the testing

```ts
async function measurePerformance(ui: React.ReactElement, options?: MeasureOptions): Promise<MeasureRenderResult> {
async function measurePerformance(
ui: React.ReactElement,
options?: MeasureOptions,
): Promise<MeasureResults> {
```

#### `MeasureOptions` type
@@ -374,7 +379,30 @@ interface MeasureOptions {
- **`runs`**: number of runs per series for the particular test
- **`warmupRuns`**: number of additional warmup runs that will be done and discarded before the actual runs (default 1).
- **`wrapper`**: React component, such as a `Provider`, which the `ui` will be wrapped with. Note: the render duration of the `wrapper` itself is excluded from the results; only the wrapped component is measured.
- **`scenario`**: a custom async function, which defines user interaction within the UI by utilising RNTL functions
- **`scenario`**: a custom async function, which defines user interaction within the UI by utilising RNTL or RTL functions
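
For illustration, a complete `measurePerformance` test could look roughly like this (a sketch only; `ComponentUnderTest` and the queried texts are hypothetical placeholders):

```tsx
// ComponentUnderTest.perf-test.tsx
import React from 'react';
import { fireEvent, screen } from '@testing-library/react-native';
import { measurePerformance } from 'reassure';
import { ComponentUnderTest } from './ComponentUnderTest';

test('renders and handles a press', async () => {
  // Interactions replayed on every measured run.
  const scenario = async () => {
    fireEvent.press(screen.getByText('Go'));
    await screen.findByText('Done');
  };

  await measurePerformance(<ComponentUnderTest />, { scenario });
});
```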

#### `measureFunction` function

Allows you to wrap any synchronous function, measure its execution time, and write the results to the output file. You can use the optional `options` object to customize aspects of the testing. Note: the execution count will always be one.

```ts
async function measureFunction(
fn: () => void,
options?: MeasureFunctionOptions
): Promise<MeasureResults> {
```

#### `MeasureFunctionOptions` type

```ts
interface MeasureFunctionOptions {
runs?: number;
warmupRuns?: number;
}
```

- **`runs`**: number of runs per series for the particular test
- **`warmupRuns`**: number of additional warmup runs that will be done and discarded before the actual runs.
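
For example, mirroring the snippet in the API docs (the `fib` import is a local helper used purely for illustration):

```ts
// fib.perf-test.ts
import { measureFunction } from 'reassure';
import { fib } from './fib';

test('fib 30', async () => {
  await measureFunction(() => fib(30));
});
```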

### Configuration

48 changes: 43 additions & 5 deletions docusaurus/docs/api.md
@@ -14,7 +14,10 @@ measuring its performance and writing results to the output file. You can use op
of the testing

```ts
async function measurePerformance(ui: React.ReactElement, options?: MeasureOptions): Promise<MeasureRenderResult> {
async function measurePerformance(
ui: React.ReactElement,
options?: MeasureOptions,
): Promise<MeasureResults> {
```

#### Example
@@ -51,6 +54,41 @@ interface MeasureOptions {
- **`wrapper`**: React component, such as a `Provider`, which the `ui` will be wrapped with. Note: the render duration of the `wrapper` itself is excluded from the results; only the wrapped component is measured.
- **`scenario`**: a custom async function, which defines user interaction within the UI by utilising RNTL functions

### `measureFunction` function

Allows you to wrap any synchronous function, measure its performance, and write the results to the output file. You can use the optional `options` object to customize aspects of the testing.

```ts
async function measureFunction(
fn: () => void,
options?: MeasureFunctionOptions,
): Promise<MeasureResults> {
```

#### Example

```ts
// sample.perf-test.tsx
import { measureFunction } from 'reassure';
import { fib } from './fib';

test('fib 30', async () => {
await measureFunction(() => fib(30));
});
```

### `MeasureFunctionOptions` type

```ts
interface MeasureFunctionOptions {
runs?: number;
warmupRuns?: number;
}
```

- **`runs`**: number of runs per series for the particular test
- **`warmupRuns`**: number of additional warmup runs that will be done and discarded before the actual runs.
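
Both options can be combined with the example above, e.g. to increase the number of measured runs for a single test (the values here are illustrative):

```ts
test('fib 30 with more runs', async () => {
  await measureFunction(() => fib(30), { runs: 20, warmupRuns: 2 });
});
```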

## Configuration

### Default configuration
@@ -81,11 +119,11 @@ const defaultConfig: Config = {
};
```

**`runs`**: number of repeated runs in a series per test (allows for higher accuracy by aggregating more data). Should be handled with care.
- **`runs`**: number of repeated runs in a series per test (allows for higher accuracy by aggregating more data). Should be handled with care.
- **`warmupRuns`**: number of additional warmup runs that will be done and discarded before the actual runs.
**`outputFile`**: name of the file the records will be saved to
**`verbose`**: make Reassure log more, e.g. for debugging purposes
**`testingLibrary`**: where to look for `render` and `cleanup` functions, supported values `'react-native'`, `'react'` or object providing custom `render` and `cleanup` functions
- **`outputFile`**: name of the file the records will be saved to
- **`verbose`**: make Reassure log more, e.g. for debugging purposes
- **`testingLibrary`**: where to look for `render` and `cleanup` functions, supported values `'react-native'`, `'react'` or object providing custom `render` and `cleanup` functions
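
As a rough sketch, these defaults can be overridden with the `configure` function described below (the values here are purely illustrative):

```ts
// e.g. in a Jest setup file
import { configure } from 'reassure';

configure({
  runs: 20,
  verbose: true,
  testingLibrary: 'react-native',
});
```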

### `configure` function

8 changes: 4 additions & 4 deletions docusaurus/docs/methodology.md
@@ -34,8 +34,8 @@ You can refer to our example [GitHub workflow](https://github.com/callstack/reas

Looking at the example you can notice that test scenarios can be assigned to certain categories:

- **Significant Changes To Render Duration** shows test scenario where the change is statistically significant and **should** be looked into as it marks a potential performance loss/improvement
- **Meaningless Changes To Render Duration** shows test scenarios where the change is not stastatistically significant
- **Changes To Render Count** shows test scenarios where render count did change
- **Significant Changes To Duration** shows test scenarios where the performance change is statistically significant and **should** be looked into as it marks a potential performance loss/improvement
- **Meaningless Changes To Duration** shows test scenarios where the performance change is not statistically significant
- **Changes To Count** shows test scenarios where the render or execution count did change
- **Added Scenarios** shows test scenarios which do not exist in the baseline measurements
- **Removed Scenarios** shows test scenarios which do not exist in the current measurements
- **Removed Scenarios** shows test scenarios which do not exist in the current measurements
13 changes: 7 additions & 6 deletions packages/reassure-compare/src/compare.ts
@@ -21,16 +21,16 @@ import { parseHeader, parsePerformanceEntries } from './utils/validate';
const PROBABILITY_CONSIDERED_SIGNIFICANT = 0.02;

/**
* Render duration threshold (in ms) for treating given difference as significant.
* Duration threshold (in ms) for treating given difference as significant.
*
* This is additional filter, in addition to probability threshold above.
* Too small duration difference might be result of measurement grain of 1 ms.
*/
const DURATION_DIFF_THRESHOLD_SIGNIFICANT = 4;

/**
* Threshold for considering render count change as significant. This implies inclusion
* of scenario results in Render Count Changed output section.
* Threshold for considering render or execution count change as significant. This implies inclusion
* of scenario results in Count Changed output section.
*/
const COUNT_DIFF_THRESHOLD = 0.5;

@@ -145,9 +145,9 @@ function compareResults(current: PerformanceResults, baseline: PerformanceResult
if (currentEntry && baselineEntry) {
compared.push(buildCompareEntry(name, currentEntry, baselineEntry));
} else if (currentEntry) {
added.push({ name, current: currentEntry });
added.push({ name, type: currentEntry.type, current: currentEntry });
} else if (baselineEntry) {
removed.push({ name, baseline: baselineEntry });
removed.push({ name, type: baselineEntry.type, baseline: baselineEntry });
}
});

@@ -176,7 +176,7 @@
}

/**
* Establish statisticial significance of render duration difference build compare entry.
* Establish statistical significance of render/execution duration difference and build compare entry.
*/
function buildCompareEntry(name: string, current: PerformanceEntry, baseline: PerformanceEntry): CompareEntry {
const durationDiff = current.meanDuration - baseline.meanDuration;
@@ -192,6 +192,7 @@

return {
name,
type: current.type,
baseline,
current,
durationDiff,
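
Taken together, the thresholds above imply a check along these lines (an illustrative sketch, not the exact implementation, most of which is collapsed in this diff):

```ts
// Illustrative only: a change counts as significant when it is both statistically
// unlikely to be noise and larger than the 1 ms measurement grain can explain.
function isSignificant(probability: number, durationDiff: number): boolean {
  return (
    probability < PROBABILITY_CONSIDERED_SIGNIFICANT &&
    Math.abs(durationDiff) > DURATION_DIFF_THRESHOLD_SIGNIFICANT
  );
}
```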
24 changes: 11 additions & 13 deletions packages/reassure-compare/src/output/console.ts
@@ -1,12 +1,6 @@
import { logger } from '@callstack/reassure-logger';
import type { AddedEntry, CompareResult, CompareEntry, RemovedEntry } from '../types';
import {
formatCount,
formatDuration,
formatMetadata,
formatRenderCountChange,
formatRenderDurationChange,
} from '../utils/format';
import { formatCount, formatDuration, formatMetadata, formatCountChange, formatDurationChange } from '../utils/format';
import type { PerformanceMetadata } from '../types';

export function printToConsole(data: CompareResult) {
@@ -16,13 +10,13 @@ export function printToConsole(data: CompareResult) {
printMetadata('Current', data.metadata.current);
printMetadata('Baseline', data.metadata.baseline);

logger.log('\n➡️ Significant changes to render duration');
logger.log('\n➡️ Significant changes to duration');
data.significant.forEach(printRegularLine);

logger.log('\n➡️ Meaningless changes to render duration');
logger.log('\n➡️ Meaningless changes to duration');
data.meaningless.forEach(printRegularLine);

logger.log('\n➡️ Render count changes');
logger.log('\n➡️ Count changes');
data.countChanged.forEach(printRegularLine);

logger.log('\n➡️ Added scenarios');
@@ -39,15 +33,19 @@ function printMetadata(name: string, metadata?: PerformanceMetadata) {
}

function printRegularLine(entry: CompareEntry) {
logger.log(` - ${entry.name}: ${formatRenderDurationChange(entry)} | ${formatRenderCountChange(entry)}`);
logger.log(` - ${entry.name} [${entry.type}]: ${formatDurationChange(entry)} | ${formatCountChange(entry)}`);
}

function printAddedLine(entry: AddedEntry) {
const { current } = entry;
logger.log(` - ${entry.name}: ${formatDuration(current.meanDuration)} | ${formatCount(current.meanCount)}`);
logger.log(
` - ${entry.name} [${entry.type}]: ${formatDuration(current.meanDuration)} | ${formatCount(current.meanCount)}`
);
}

function printRemovedLine(entry: RemovedEntry) {
const { baseline } = entry;
logger.log(` - ${entry.name}: ${formatDuration(baseline.meanDuration)} | ${formatCount(baseline.meanCount)}`);
logger.log(
` - ${entry.name} [${entry.type}]: ${formatDuration(baseline.meanDuration)} | ${formatCount(baseline.meanCount)}`
);
}
31 changes: 20 additions & 11 deletions packages/reassure-compare/src/output/markdown.ts
@@ -10,8 +10,8 @@ import {
formatDuration,
formatMetadata,
formatPercent,
formatRenderCountChange,
formatRenderDurationChange,
formatCountChange,
formatDurationChange,
} from '../utils/format';
import type {
AddedEntry,
@@ -22,7 +22,7 @@ import type {
PerformanceMetadata,
} from '../types';

const tableHeader = ['Name', 'Render Duration', 'Render Count'] as const;
const tableHeader = ['Name', 'Type', 'Duration', 'Count'] as const;

export const writeToMarkdown = async (filePath: string, data: CompareResult) => {
try {
@@ -68,13 +68,13 @@ function buildMarkdown(data: CompareResult) {
});
}

result += `\n\n${headers.h3('Significant Changes To Render Duration')}`;
result += `\n\n${headers.h3('Significant Changes To Duration')}`;
result += `\n${buildSummaryTable(data.significant)}`;
result += `\n${buildDetailsTable(data.significant)}`;
result += `\n\n${headers.h3('Meaningless Changes To Render Duration')}`;
result += `\n\n${headers.h3('Meaningless Changes To Duration')}`;
result += `\n${buildSummaryTable(data.meaningless, true)}`;
result += `\n${buildDetailsTable(data.meaningless)}`;
result += `\n\n${headers.h3('Changes To Render Count')}`;
result += `\n\n${headers.h3('Changes To Count')}`;
result += `\n${buildSummaryTable(data.countChanged)}`;
result += `\n${buildDetailsTable(data.countChanged)}`;
result += `\n\n${headers.h3('Added Scenarios')}`;
@@ -95,7 +95,7 @@ function buildMetadataMarkdown(name: string, metadata: PerformanceMetadata | und
function buildSummaryTable(entries: Array<CompareEntry | AddedEntry | RemovedEntry>, collapse: boolean = false) {
if (!entries.length) return emphasis.i('There are no entries');

const rows = entries.map((entry) => [entry.name, formatEntryDuration(entry), formatEntryCount(entry)]);
const rows = entries.map((entry) => [entry.name, entry.type, formatEntryDuration(entry), formatEntryCount(entry)]);
const content = markdownTable([tableHeader, ...rows]);

return collapse ? collapsibleSection('Show entries', content) : content;
@@ -104,21 +104,26 @@ function buildSummaryTable(entries: Array<CompareEntry | AddedEntry | RemovedEnt
function buildDetailsTable(entries: Array<CompareEntry | AddedEntry | RemovedEntry>) {
if (!entries.length) return '';

const rows = entries.map((entry) => [entry.name, buildDurationDetailsEntry(entry), buildCountDetailsEntry(entry)]);
const rows = entries.map((entry) => [
entry.name,
entry.type,
buildDurationDetailsEntry(entry),
buildCountDetailsEntry(entry),
]);
const content = markdownTable([tableHeader, ...rows]);

return collapsibleSection('Show details', content);
}

function formatEntryDuration(entry: CompareEntry | AddedEntry | RemovedEntry) {
if ('baseline' in entry && 'current' in entry) return formatRenderDurationChange(entry);
if ('baseline' in entry && 'current' in entry) return formatDurationChange(entry);
if ('baseline' in entry) return formatDuration(entry.baseline.meanDuration);
if ('current' in entry) return formatDuration(entry.current.meanDuration);
return '';
}

function formatEntryCount(entry: CompareEntry | AddedEntry | RemovedEntry) {
if ('baseline' in entry && 'current' in entry) return formatRenderCountChange(entry);
if ('baseline' in entry && 'current' in entry) return formatCountChange(entry);
if ('baseline' in entry) return formatCount(entry.baseline.meanCount);
if ('current' in entry) return formatCount(entry.current.meanCount);
return '';
@@ -149,7 +154,7 @@ function buildDurationDetails(title: string, entry: PerformanceEntry) {
emphasis.b(title),
`Mean: ${formatDuration(entry.meanDuration)}`,
`Stdev: ${formatDuration(entry.stdevDuration)} (${formatPercent(relativeStdev)})`,
entry.durations ? `Runs: ${entry.durations.join(' ')}` : '',
entry.durations ? `Runs: ${formatRunDurations(entry.durations)}` : '',
]
.filter(Boolean)
.join(`<br/>`);
@@ -171,3 +176,7 @@ function buildCountDetails(title: string, entry: PerformanceEntry) {
export function collapsibleSection(title: string, content: string) {
return `<details>\n<summary>${title}</summary>\n\n${content}\n</details>\n\n`;
}

export function formatRunDurations(values: number[]) {
return values.map((v) => (Number.isInteger(v) ? `${v}` : `${v.toFixed(1)}`)).join(' ');
}
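
For example, the helper above renders a list of run durations as a compact space-separated string, keeping integers exact and rounding fractional values to one decimal place:

```ts
formatRunDurations([12, 12.345, 13]); // => '12 12.3 13'
```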