[Winlogbeat] Prevent Winlogbeat from dropping events with invalid XML #11006
Golang's XML parser is pretty strict about the presence of control characters in the XML it is fed. This patch replaces those characters with a Unicode escape sequence: "\uNNNN".
This updates the wineventlog logic to ingest windows event logs whose rendering fails.
winlogbeat/sys/xmlreader.go
	return output(n)
}

func newXmlSafeReader(rawXML []byte) io.Reader {
As a side-effect, at batch_size=100 it's 47.2% faster! 🥇 (update: Something else must have been using CPU during the first run b/c I couldn't replicate the results from the first run. Both master and this branch performed about the same.)
@andrewkroh please have another look to see if you like the changes I made to RenderErr. Now kill me if I know where that speedup comes from. Buffering? In my tests, decoding the XML was marginally slower than before (from 105 to 125 us per op).
jenkins, test this
Can you add a test case that passes in some invalid XML?
Otherwise it LGTM.
…elastic#11006) Golang's xml parser is pretty strict about the presence of control characters in the XML it is fed. This patch replaces those characters with an unicode escape sequence: "\uNNNN". (cherry picked from commit a6102a8)
…elastic#11006) * [Winlogbeat] Escape control characters in XML Golang's xml parser is pretty strict about the presence of control characters in the XML it is fed. This patch replaces those characters with an unicode escape sequence: "\uNNNN". (cherry picked from commit a6102a8)
Previous fix (elastic#11006) made Winlogbeat escape CRLF control characters which are expected in Windows event logs. Fixes elastic#11328
Previous fix (elastic#11006) made Winlogbeat escape CRLF control characters which are expected in Windows event logs. Fixes elastic#11328 (cherry picked from commit 6865403)
…elastic#11006) (elastic#11039) Golang's xml parser is pretty strict about the presence of control characters in the XML it is fed. This patch replaces those characters with an unicode escape sequence: "\uNNNN". (cherry picked from commit 6a078f5)
…ces (elastic#11370) Previous fix (elastic#11006) made Winlogbeat escape CRLF control characters which are expected in Windows event logs. Fixes elastic#11328 (cherry picked from commit 5db0f15)
…elastic#11006) (elastic#11066) * [Winlogbeat] Escape control characters in XML Golang's xml parser is pretty strict about the presence of control characters in the XML it is fed. This patch replaces those characters with an unicode escape sequence: "\uNNNN". (cherry picked from commit a6102a8)
…ces (elastic#11372) Previous fix (elastic#11006) made Winlogbeat escape CRLF control characters which are expected in Windows event logs. Fixes elastic#11328 (cherry picked from commit 6865403)
Golang's XML parser refuses to process XML documents that contain an ASCII control character, resulting in those events being dropped. This patch pre-processes the XML document in order to replace invalid characters with the Unicode escape sequence \uNNNN.