-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial MySQL banner support #30
Conversation
- Moved editions to service.edition - Changed how Percona and MariaDB are labeled - Added/tweaked fingerprints - Added fingerprints for errors - Started the process of adding example matches
<param pos="0" name="os.product" value="Windows"/> | ||
</fingerprint> | ||
|
||
<fingerprint pattern="^(\d\.\d{1,2}\.\h{1,3})(?:-m\d)?(?:-rc)?(?:-alpha)?(?:-beta)?-[Cc]ommunity(?:-max)?(?:-debug)?(?:-log)?(?:-debug)?$"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you are searching for Community or community, but in the previous one you are just looking for community. IMO, make both of these (and perhaps all of them) case insensitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i'm not mistaken there was a limited number of instances of Community, something around 40 or in the data set. I will try to mine the data and also see if there is a performance impact for doing case insensitivity.
- Added example lines - Removed the os.device and os.product entries as requested Pending - Community vs commmunity metrics - Base generic regex for MySQL versions not matched so far - fingerprint optimization based on sample data - reviewing use of trailing terminator for regex
This is looking pretty good. To your pending commit comments:
|
@@ -0,0 +1,812 @@ | |||
<?xml version="1.0"?> | |||
<fingerprints> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add a matches
attribute here to indicate what it should be matching.
That brings up something I think we discussed in email but I don't see addressed here. This single file matches against two different types of data -- version banners and error messages. IMO these should go in separate files and use different match
attributes. Then, anyone/anything using these files can use the right file depending on the type of message received.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll address both as well as all of the outstanding issues. I hope to have the final PR for review by the end of this coming weekend.
- Added 'matches' attribute to the root 'fingerprints' element in both files - Reordered 2 entries in mysql_errors based on most commonly seen banners - Added a few additional fingerprints to mysq_banners.xml
Current match metrics against the 2015.02.01 dataset: MySQL records 5,258,834 Additional metrics can be seen in this gits: |
…tuning - Removed fingerprint for Percona due to too much overlap with standard version identifiers. This was causing a Travis failure due to a Percona example matching on the 'Oracle MySQL (common)' fingerprint. - Percona: 5.5.28-29.3 - !Percona: 5.5.30-1.1 - !Percona: 5.5.23-55 The 'MariaDB MariaDB' fingerprint still needs end of line regex anchor review.
That last commit should say that removed 'a' fingerprint for Percona, There are still multiple fingerprints in place that are reliable for the related banners. |
<?xml version="1.0"?> | ||
<fingerprints matches="mysql.banners"> | ||
|
||
<fingerprint pattern="^(\d\.\d{1,2}\.\h{1,3}(?:[.-]\d{1,2})?(?:[.-]\d{1})?)(?:-m\d{1,2})?(?:-rc)?(?:-alpha)?(?:-beta)?(?:-gamma)?(?:-?[Mm]ax)?(?:-rs)?(?:-modified)?(?:-debug)?(?:-log)?$"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and everywhere else where you are trying to match the numeric part of the version, would it make sense to make this more flexible such that version parts of arbitrary length are matched? For example, would 10.0.0 or 5.100.9999 ever be a valid version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about MySQL proper, but MariaDB uses 10.x.x versions as they diverge from Oracle's implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, but isn't in our data set yet. MariaDB does use 10.x.x and that is accounted for in the matches starting at line 568. The 'MariaDB MariaDB on Debian Lenny' fingerprint is missing this.
I can adjust the regex as you requested. I keep falling back to my 'unknown service' mindset and forgetting that the assumption is that we already know its MySQL and just want to fingerprint it.
- Reviewed/tweaked some EOL regex anchors - Added OS specific entries for MariaDB - Added Percona version number only match back - Added OS specific entries for Percona
@jhart-r7 This morning's updates contain the changes to the regex to better handle future strings such as 10.xxx.xxx. There are also quite additions that identify the target's OS. |
- additional FPs for backports, certain OS/product combinations
@jhart-r7 @gschneider-r7 The previous is likely my last commit tweaking fingerprints. Anything that follows will be bug fixes for issues that someone discovers. |
Thanks, @TomSellers! I should be able to land this today. My only last bit of feedback, which I'll address while landing, is to add some comments in these two files to indicate how these are to be used (IOW, what parts of the protocol are being fingerprinted). |
Add initial MySQL banner support