[7.11][Telemetry] Diagnostic Alert Telemetry #84422
Just dropping an update on this work item: a couple of background data-plumbing pieces turned out to be needed before this could be tested end to end. I hope to get them all in within the next day or two.
Remove 2nd var to track telemetry opt in. Add ES client to start querying index. Use query to get docs from a dummy index. Change how index is queried. Get diagnostic alerts to send to staging cluster. Record last timestamp. PoC on telemetry opt in via 2 processes. Revert to original solution
bace39b to e091be3
@elasticmachine merge upstream
…stic/kibana into pjhampton/diagnostic-alert-telemetry
Pinging @elastic/kibana-security (Team:Security)
@elasticmachine merge upstream
x-pack/plugins/security_solution/server/lib/telemetry/sender.ts
sort: [
  {
    'event.ingested': {
      order: 'asc',
Do you want `asc` here so we get the most recent events?
I got it wrong. Updated here: 5638da9. From my testing, I believe `desc` will order by most recent first.
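To make the ordering question concrete, here is a minimal in-memory sketch (not the PR's code) of how `asc` and `desc` behave on a timestamp field. `IngestedDoc` and `sortByIngested` are hypothetical names, and the `ingested` property merely stands in for `event.ingested`.

```typescript
// Hypothetical document shape; only the field under discussion is modeled.
interface IngestedDoc {
  id: string;
  ingested: string; // ISO 8601 timestamp standing in for event.ingested
}

type SortOrder = 'asc' | 'desc';

// Returns a sorted copy, mimicking a sort on a date field.
function sortByIngested(docs: IngestedDoc[], order: SortOrder): IngestedDoc[] {
  const sign = order === 'asc' ? 1 : -1;
  return docs
    .slice()
    .sort((a, b) => sign * (Date.parse(a.ingested) - Date.parse(b.ingested)));
}

const sampleDocs: IngestedDoc[] = [
  { id: 'a', ingested: '2020-11-30T10:00:00Z' },
  { id: 'b', ingested: '2020-11-30T12:00:00Z' },
  { id: 'c', ingested: '2020-11-30T11:00:00Z' },
];

// 'desc' puts the most recently ingested document first.
const newestFirst = sortByIngested(sampleDocs, 'desc').map((d) => d.id);
// 'asc' puts the oldest document first.
const oldestFirst = sortByIngested(sampleDocs, 'asc').map((d) => d.id);
```

If the query also caps the number of hits, the sort order decides which end of the time range survives the cap, which is why the direction matters here.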
public async fetchDiagnosticAlerts() {
  const query = {
    expand_wildcards: 'open,hidden',
    index: 'logs-endpoint.diagnostic.collection-default*',
I think we need `logs-endpoint.diagnostic.collection-*` here, because I think @ferullo was saying that the diagnostic alerts will respect the namespace setting, so they might come with something other than `default`.
Thanks! Updated here: 2018132
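The difference between the two patterns can be illustrated with a simplified matcher. This is only a sketch: `matchesIndexPattern` is a hypothetical helper that treats `*` as the sole wildcard and is not how Elasticsearch resolves index expressions, and the `production` namespace is an invented example.

```typescript
// Simplified stand-in for index-pattern matching: only `*` acts as a wildcard.
function matchesIndexPattern(pattern: string, indexName: string): boolean {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + escaped + '$').test(indexName);
}

const narrowPattern = 'logs-endpoint.diagnostic.collection-default*';
const broadPattern = 'logs-endpoint.diagnostic.collection-*';

// Hypothetical names: one in the default namespace, one in a custom namespace.
const defaultNs = 'logs-endpoint.diagnostic.collection-default';
const customNs = 'logs-endpoint.diagnostic.collection-production';

const narrowHitsCustom = matchesIndexPattern(narrowPattern, customNs); // false
const broadHitsCustom = matchesIndexPattern(broadPattern, customNs); // true
const broadHitsDefault = matchesIndexPattern(broadPattern, defaultNs); // true
```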
@elasticmachine merge upstream
Doc in analytics-staging looks good!
💚 Build Succeeded
LGTM. I reviewed the new allowlist fields and the content is consistent with other data that we're already collecting, so good to go on that front.
One minor question about the task scheduler.
  return `${TelemetryDiagTaskConstants.TYPE}:${TelemetryDiagTaskConstants.VERSION}`;
};

public runTask = async (taskId: string, searchFrom: string, searchTo: string) => {
This isn't an objection to the code, just a question: if there are multiple Kibana instances, is it possible for this task to be running simultaneously in multiple instances? Or does the task manager ensure that only one execution can happen at any given time?
@stevewritescode Good question. The taskManager uses a distributed model which ensures that only a single Kibana instance will run the task. Each Kibana instance polls for new tasks on an interval and attempts to "claim" tasks that are ready to fire. If several Kibana instances try to claim the same task, only one will succeed and the others will get a 409 Conflict, as per optimistic concurrency control (OCC). If I'm remembering all this correctly. :)
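The claim behaviour described above can be sketched with a versioned in-memory store. This is an illustration of optimistic concurrency control in general, not task manager's actual implementation; `TaskStore`, `TaskDoc`, `ClaimResult`, the task ID, and the instance names are all hypothetical.

```typescript
// In-memory stand-in for a versioned task document; all names are hypothetical.
interface TaskDoc {
  version: number;
  owner: string | null;
}

type ClaimResult = { status: 200; owner: string } | { status: 409 };

class TaskStore {
  private docs: Record<string, TaskDoc> = {};

  add(taskId: string): void {
    this.docs[taskId] = { version: 1, owner: null };
  }

  read(taskId: string): TaskDoc {
    const doc = this.docs[taskId];
    if (!doc) throw new Error('unknown task ' + taskId);
    return { version: doc.version, owner: doc.owner };
  }

  // The write is accepted only if the caller saw the latest version.
  claim(taskId: string, seenVersion: number, instance: string): ClaimResult {
    const doc = this.docs[taskId];
    if (!doc || doc.version !== seenVersion) return { status: 409 };
    this.docs[taskId] = { version: doc.version + 1, owner: instance };
    return { status: 200, owner: instance };
  }
}

const store = new TaskStore();
store.add('diag-task');

// Both instances poll and read the same version before either one claims.
const seenByA = store.read('diag-task').version;
const seenByB = store.read('diag-task').version;

const resultA = store.claim('diag-task', seenByA, 'kibana-a'); // accepted
const resultB = store.claim('diag-task', seenByB, 'kibana-b'); // stale version, rejected
```

The second claim fails because the first one bumped the version, so the losing instance simply skips the task and no lock is needed.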
LGTM, nice work!
* Port @tsg's work on task manager. Remove 2nd var to track telemetry opt in. Add ES client to start querying index. Use query to get docs from a dummy index. Change how index is queried. Get diagnostic alerts to send to staging cluster. Record last timestamp. PoC on telemetry opt in via 2 processes. Revert to original solution
* Update on agreed method. Fixes race condition.
* Expand wildcards.
* stage.
* Add rule.ruleset collection.
* Update telemetry sender with correct query for loading diag alerts.
* Add similar task tests to endpoint artifact work.
* Fix broken import statement.
* Create sender mocks.
* Update test to check for func call.
* Update unused reference.
* record last run.
* Update index.
* fix import
* Fix test.
* test fix.
* Pass unit to time diff calc.
* Tests should pass now hopefully.
* Add additional process fields to allowlist.
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Summary
This PR extends the existing security telemetry collection by transmitting diagnostic alerts via a Kibana task manager.
Related PRs:
Implementation
* `event.ingested` field.
* `queueTelemetryEvents` function from the EventsTelemetry (See: [Security] Alert Telemetry for the Security app #77200)
* Query the index we decide to for the time since last execution to present.
* Record the last execution time.
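The "time since last execution to present" step can be sketched as follows. This is a minimal illustration under assumptions, not the PR's code: `nextSearchWindow` and `SearchWindow` are hypothetical names, and the five-minute fallback for the very first run is an invented default. The `searchFrom` and `searchTo` names mirror the `runTask` parameters shown in the review.

```typescript
interface SearchWindow {
  searchFrom: string; // ISO 8601, start of the window
  searchTo: string; // ISO 8601, end of the window
}

// Hypothetical helper: search from the last recorded run up to `now`.
// On the first run there is no record, so fall back to a fixed lookback.
function nextSearchWindow(
  lastRun: string | undefined,
  now: Date,
  fallbackMinutes: number = 5 // assumed default, not taken from the PR
): SearchWindow {
  const searchTo = now.toISOString();
  const searchFrom =
    lastRun !== undefined
      ? lastRun
      : new Date(now.getTime() - fallbackMinutes * 60 * 1000).toISOString();
  return { searchFrom, searchTo };
}

// First execution: nothing recorded yet.
let lastRun: string | undefined = undefined;
const firstWindow = nextSearchWindow(lastRun, new Date('2020-12-01T00:10:00.000Z'));

// Record the end of this window so the next run starts where this one stopped.
lastRun = firstWindow.searchTo;
const secondWindow = nextSearchWindow(lastRun, new Date('2020-12-01T00:15:00.000Z'));
```

Recording the window's end, instead of the wall-clock time when the task finishes, keeps consecutive windows contiguous so no ingested events fall between runs.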