Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solution][Alerts] improves performance of new terms multi fields #145167

Merged
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
e918ed0
POC for performancd improvements
vitaliidm Nov 14, 2022
33bd49e
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 15, 2022
f3bf6ac
fix tests
vitaliidm Nov 15, 2022
4700752
Merge branch 'alerts/new_terms_perfromance_improvements' of https://g…
vitaliidm Nov 15, 2022
3545a6f
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 15, 2022
7921c50
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 15, 2022
5308301
[CI] Auto-commit changed files from 'node scripts/precommit_hook.js -…
kibanamachine Nov 15, 2022
5d53cfa
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 15, 2022
8d50f10
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
65f0e30
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
f28d144
add tests/update README
vitaliidm Nov 16, 2022
c74b919
Merge branch 'alerts/new_terms_perfromance_improvements' of https://g…
vitaliidm Nov 16, 2022
1b0ce67
null tests
vitaliidm Nov 16, 2022
2087dc6
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
29c026b
Revert "[CI] Auto-commit changed files from 'node scripts/precommit_h…
vitaliidm Nov 16, 2022
02e2948
[CI] Auto-commit changed files from 'node scripts/precommit_hook.js -…
kibanamachine Nov 16, 2022
5b95dd9
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
7516076
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
981896b
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 16, 2022
dbb71e3
Update x-pack/plugins/security_solution/server/lib/detection_engine/r…
vitaliidm Nov 17, 2022
3b26fa6
Update x-pack/plugins/security_solution/server/lib/detection_engine/r…
vitaliidm Nov 17, 2022
da52b8d
CR feedback
vitaliidm Nov 17, 2022
705984d
CR feedback
vitaliidm Nov 17, 2022
26cb0b8
tests
vitaliidm Nov 17, 2022
f2be791
Merge branch 'main' into alerts/new_terms_perfromance_improvements
vitaliidm Nov 17, 2022
8f10528
Update README.md
vitaliidm Nov 17, 2022
a153d77
Update README.md
vitaliidm Nov 17, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -27,4 +27,3 @@ The new terms rule type reuses the singleSearchAfter function which implements t
## Limitations and future enhancements

- Value list exceptions are not supported at the moment. Commit ead04ce removes an experimental method I tried for evaluating value list exceptions.
- Runtime field supports only 100 emitted values. So for large arrays or combination of values greater than 100, results may not be exhaustive. This applies only to new terms with multiple fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This limitation could still be hit, right? It's just less likely now

Copy link
Contributor Author

@vitaliidm vitaliidm Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated README with recent findings

Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ import {
getNewTermsRuntimeMappings,
getAggregationField,
decodeMatchedValues,
createFieldValuesMap,
} from './utils';
import {
addToSearchAfterReturn,
Expand Down Expand Up @@ -193,6 +194,7 @@ export const createNewTermsAlertType = (
}
const bucketsForField = searchResultWithAggs.aggregations.new_terms.buckets;
const includeValues = transformBucketsToValues(params.newTermsFields, bucketsForField);
const fieldsValuesMap = createFieldValuesMap(params.newTermsFields, bucketsForField);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could make this something like
const newTermsRuntimeMappings = getNewTermsRuntimeMappings(...);
and reuse the same runtime mappings variable for phase 2 and 3 searches

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// PHASE 2: Take the page of results from Phase 1 and determine if each term exists in the history window.
// The aggregation filters out buckets for terms that exist prior to `tuple.from`, so the buckets in the
// response correspond to each new term.
Expand All @@ -209,7 +211,7 @@ export const createNewTermsAlertType = (
}),
runtimeMappings: {
...runtimeMappings,
...getNewTermsRuntimeMappings(params.newTermsFields),
...getNewTermsRuntimeMappings(params.newTermsFields, fieldsValuesMap),
},
searchAfterSortIds: undefined,
index: inputIndex,
Expand Down Expand Up @@ -255,7 +257,7 @@ export const createNewTermsAlertType = (
}),
runtimeMappings: {
...runtimeMappings,
...getNewTermsRuntimeMappings(params.newTermsFields),
...getNewTermsRuntimeMappings(params.newTermsFields, fieldsValuesMap),
},
searchAfterSortIds: undefined,
index: inputIndex,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ import {
getAggregationField,
decodeMatchedValues,
getNewTermsRuntimeMappings,
createFieldValuesMap,
AGG_FIELD_NAME,
} from './utils';

Expand Down Expand Up @@ -209,3 +210,137 @@ describe('new terms utils', () => {
});
});
});

describe('createFieldValuesMap', () => {
it('should return undefined if new terms fields has only one field', () => {
expect(
createFieldValuesMap(
['host.name'],
[
{
key: {
'source.host': 'host-0',
},
doc_count: 1,
},
{
key: {
'source.host': 'host-1',
},
doc_count: 3,
},
]
)
).toBeUndefined();
});

it('should return values map if new terms fields has more than one field', () => {
expect(
createFieldValuesMap(
['source.host', 'source.ip'],
[
{
key: {
'source.host': 'host-0',
'source.ip': '127.0.0.1',
},
doc_count: 1,
},
{
key: {
'source.host': 'host-1',
'source.ip': '127.0.0.1',
},
doc_count: 1,
},
]
)
).toEqual({
'source.host': {
'host-0': true,
'host-1': true,
},
'source.ip': {
'127.0.0.1': true,
},
});
});

it('should not put value in map if it is null', () => {
expect(
createFieldValuesMap(
['source.host', 'source.ip'],
[
{
key: {
'source.host': 'host-1',
'source.ip': null,
},
doc_count: 1,
},
]
)
).toEqual({
'source.host': {
'host-1': true,
},
'source.ip': {},
});
});

it('should not put value in map if it is a number', () => {
vitaliidm marked this conversation as resolved.
Show resolved Hide resolved
expect(
createFieldValuesMap(
['source.host', 'source.id'],
[
{
key: {
'source.host': 'host-1',
'source.id': 100,
},
doc_count: 1,
},
]
)
).toEqual({
'source.host': {
'host-1': true,
},
'source.id': {
'100': true,
},
});
});

it('should not put value in map if it is a boolean', () => {
vitaliidm marked this conversation as resolved.
Show resolved Hide resolved
expect(
createFieldValuesMap(
['source.host', 'user.enabled'],
[
{
key: {
'source.host': 'host-1',
'user.enabled': true,
},
doc_count: 1,
},
{
key: {
'source.host': 'host-1',
'user.enabled': false,
},
doc_count: 1,
},
]
)
).toEqual({
'source.host': {
'host-1': true,
},
'user.enabled': {
true: true,
false: true,
},
});
});
});
Original file line number Diff line number Diff line change
Expand Up @@ -80,8 +80,43 @@ export const transformBucketsToValues = (
);
};

/**
* transforms arrays of new terms fields and its values in object
* [new_terms_field]: { [value1]: true, [value1]: true }
* It's needed to have constant time complexity of accessing whether value is present in new terms
* It will be passed to Painless script used in runtime field
*/
export const createFieldValuesMap = (
newTermsFields: string[],
buckets: estypes.AggregationsCompositeBucket[]
) => {
if (newTermsFields.length === 1) {
return undefined;
}

const valuesMap = newTermsFields.reduce<Record<string, Record<string, boolean>>>(
(acc, field) => ({ ...acc, [field]: {} }),
{}
);

buckets
.map((bucket) => bucket.key)
.forEach((bucket) => {
Object.entries(bucket).forEach(([key, value]) => {
if (value == null) {
return;
}
const strValue = typeof value !== 'string' ? value.toString() : value;
valuesMap[key][strValue] = true;
});
});

return valuesMap;
};

export const getNewTermsRuntimeMappings = (
newTermsFields: string[]
newTermsFields: string[],
values?: Record<string, Record<string, boolean>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

values seems like it should be required since the runtime field now depends on it to work correctly. Both createFieldValuesMap and getNewTermsRuntimeMappings also depend on the number of new terms fields - could it be easier to pass buckets in to getNewTermsRuntimeMappings and call createFieldValuesMap after the if (newTermsFields.length <= 1) check in this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, done

): undefined | { [AGG_FIELD_NAME]: estypes.MappingRuntimeField } => {
// if new terms include only one field we don't use runtime mappings and don't stich fields buckets together
if (newTermsFields.length <= 1) {
Expand All @@ -92,7 +127,7 @@ export const getNewTermsRuntimeMappings = (
[AGG_FIELD_NAME]: {
type: 'keyword',
script: {
params: { fields: newTermsFields },
params: { fields: newTermsFields, values },
source: `
def stack = new Stack();
// ES has limit in 100 values for runtime field, after this query will fail
Expand All @@ -110,9 +145,14 @@ export const getNewTermsRuntimeMappings = (
emit(line);
emitLimit = emitLimit - 1;
} else {
for (field in doc[params['fields'][index]]) {
def fieldName = params['fields'][index];
for (field in doc[fieldName]) {
def fieldStr = String.valueOf(field);
if (!params['values'][fieldName].containsKey(fieldStr)) {
continue;
}
def delimiter = index === 0 ? '' : '${DELIMITER}';
def nextLine = line + delimiter + String.valueOf(field).encodeBase64();
def nextLine = line + delimiter + fieldStr.encodeBase64();

stack.add([index + 1, nextLine])
}
Expand Down
Loading