Add initial MySQL banner support #30

Merged
merged 13 commits into from
Mar 31, 2015

Conversation

TomSellers
Contributor

Add initial MySQL banner support

- Moved editions to service.edition
- Changed how Percona and MariaDB are labeled
- Added/tweaked fingerprints
- Added fingerprints for errors
- Started the process of adding example matches
@jhart-r7 jhart-r7 self-assigned this Feb 20, 2015
<param pos="0" name="os.product" value="Windows"/>
</fingerprint>

<fingerprint pattern="^(\d\.\d{1,2}\.\h{1,3})(?:-m\d)?(?:-rc)?(?:-alpha)?(?:-beta)?-[Cc]ommunity(?:-max)?(?:-debug)?(?:-log)?(?:-debug)?$">
Contributor

Here you are searching for Community or community, but in the previous one you are just looking for community. IMO, make both of these (and perhaps all of them) case insensitive
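To make the suggestion concrete, here is a minimal Python sketch (Recog itself is Ruby, so this is only an analogy) comparing a case-sensitive match with a case-insensitive one. The pattern is a simplified stand-in rather than one of the fingerprints in this PR, and the sample banners are made up for illustration.

```python
import re

# Simplified stand-in pattern, NOT a fingerprint from this PR.
PATTERN = r"^(\d\.\d{1,2}\.\d{1,3})-community$"

strict = re.compile(PATTERN)                  # matches lowercase only
relaxed = re.compile(PATTERN, re.IGNORECASE)  # matches any casing

# Made-up sample banners, for illustration only.
banners = ["5.1.49-community", "5.1.49-Community", "5.1.49-COMMUNITY"]

strict_hits = [b for b in banners if strict.match(b)]
relaxed_hits = [b for b in banners if relaxed.match(b)]

print("case-sensitive:  ", strict_hits)
print("case-insensitive:", relaxed_hits)
```

The case-insensitive variant also covers casings that have not been observed yet, which is the point of the suggestion.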

Contributor Author

If I'm not mistaken, there were only a limited number of instances of Community in the data set, something around 40 or so. I will try to mine the data and also see whether case insensitivity has a performance impact.

- Added example lines
- Removed the os.device and os.product entries as requested

Pending

- Community vs community metrics
- Base generic regex for MySQL versions not matched so far
- fingerprint optimization based on sample data
- reviewing use of trailing terminator for regex
@jhart-r7
Contributor

This is looking pretty good. To your pending commit comments:

  • Community vs community metrics -- would be nice to know, but my suggestion for this wasn't necessarily that I think there may be a banner like this today, but that there may be one some day.
  • Base generic regex for MySQL versions not matched so far -- the trouble with this is that it is then hard for you to tell if you have a new/unmatched/unknown banner because this generic regex would catch all mysql versions you hadn't seen before and might want to handle differently for whatever reason. I'm not totally opposed as I see some value in this. Perhaps post a sample of what you were thinking?
  • fingerprint optimization based on sample data -- can you elaborate? do you mean modifying the existing fingerprints to match more banners based on what you see in the sample data, or do you mean modifying the existing fingerprints to match more efficiently?
  • reviewing use of trailing terminator for regex -- this is probably worth a separate PR/ticket, but my thinking is that the regex anchors ^ and $, as well as the common pattern of .*$, are perfectly fine and are in fact preferable, provided there is sufficient data to prove that they are necessary. Put another way, something like ^foo version (\d+) .*$ is perfectly OK if you are sure that you do not care about anything that comes after the 1 or more digits. Note that this is distinctly different from not knowing what comes after the 1 or more digits.
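The trade-off in the last bullet can be sketched in Python (an analogy only, since Recog is Ruby). The loose pattern is the one quoted above; the stricter variant and both sample strings are hypothetical.

```python
import re

# Pattern quoted in the comment above: anything after the digits is ignored.
loose = re.compile(r"^foo version (\d+) .*$")
# Hypothetical stricter variant that only accepts one known suffix.
strict = re.compile(r"^foo version (\d+) \(stable\)$")

known = "foo version 12 (stable)"               # made-up sample
novel = "foo version 12 (custom-build x86_64)"  # made-up, previously unseen

loose_results = [bool(loose.match(s)) for s in (known, novel)]
strict_results = [bool(strict.match(s)) for s in (known, novel)]

print("loose: ", loose_results)
print("strict:", strict_results)
```

The loose pattern accepts the unseen string silently, while the strict one leaves it unmatched so it can be reviewed. The same reasoning applies to a generic catch-all version regex.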

@@ -0,0 +1,812 @@
<?xml version="1.0"?>
<fingerprints>
Contributor

You should add a matches attribute here to indicate what it should be matching.

That brings up something I think we discussed in email but I don't see addressed here. This single file matches against two different types of data -- version banners and error messages. IMO these should go in separate files and use different match attributes. Then, anyone/anything using these files can use the right file depending on the type of message received.
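A rough Python sketch of how a consumer might pick fingerprints by the `matches` attribute is shown below. It is not Recog's API: the `load` and `identify` helpers, the `mysql.error` key, and both inline XML documents are assumptions made for illustration; only `mysql.banners` appears in this PR.

```python
import re
import xml.etree.ElementTree as ET

# Two tiny made-up fingerprint documents; the patterns are illustrative only.
BANNERS_XML = r"""<fingerprints matches="mysql.banners">
  <fingerprint pattern="^(\d+\.\d+\.\d+)-log$"/>
</fingerprints>"""

# "mysql.error" is a hypothetical key name for the error-message file.
ERRORS_XML = r"""<fingerprints matches="mysql.error">
  <fingerprint pattern="^Host '.*' is not allowed to connect to this MySQL server$"/>
</fingerprints>"""

def load(docs):
    """Index compiled patterns by the root element's matches attribute."""
    index = {}
    for doc in docs:
        root = ET.fromstring(doc)
        index[root.get("matches")] = [
            re.compile(fp.get("pattern")) for fp in root.iter("fingerprint")
        ]
    return index

def identify(index, kind, data):
    """Return True if any fingerprint registered under kind matches data."""
    return any(p.match(data) for p in index.get(kind, []))

index = load([BANNERS_XML, ERRORS_XML])
print(identify(index, "mysql.banners", "5.5.30-log"))
print(identify(index, "mysql.error", "5.5.30-log"))
```

Keeping the two kinds of data under separate keys means a version banner is never tested against error-message patterns, and vice versa.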

Contributor Author

Thanks, I'll address both as well as all of the outstanding issues. I hope to have the final PR for review by the end of this coming weekend.

 - Added 'matches' attribute to the root 'fingerprints' element in both files
 - Reordered 2 entries in mysql_errors based on most commonly seen banners
 - Added a few additional fingerprints to mysql_banners.xml
@TomSellers
Contributor Author

@jhart-r7

  • I have left out the generic match. I had been leaning towards leaving it out, but wasn't sure of the best fit for your use case. Your comments cleared that up for me.
  • Fingerprint optimization was a reference to changing the order of the fingerprints, with the goal of having the most common ones earlier in the file so that matches occur faster. The downside is that similar fingerprints would no longer be grouped, which would complicate the effort of maintaining them. With 1 exception, testing indicated that the benefit across the 5.2 million banner test set was not significant enough to change the FP order. That change, in mysql_errors, has been made.
  • The review of the regex trailing terminator was at the request of HD. I prefer to use the $ anchor where possible when I am fairly certain that I have all of the string accounted for. This reduces the chance of inadvertently matching and losing an opportunity to gather more information about a target service.
    • The end of line anchor needs to be added to the 'MariaDB MariaDB' fingerprint around line 595 and retested. That is next on my list. It will allow us to add a few fingerprints for MariaDB on certain flavors of Ubuntu and Debian.
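The ordering experiment described above can be sketched as follows. This is a toy Python model under stated assumptions, not the test harness used for this PR: the first-match-wins loop, the two patterns, and the ten sample banners are all made up.

```python
import re
from collections import Counter

def first_match(fingerprints, banner):
    """Return (name, attempts) for the first fingerprint that matches."""
    for attempts, (name, pattern) in enumerate(fingerprints, start=1):
        if pattern.match(banner):
            return name, attempts
    return None, len(fingerprints)

def total_attempts(fingerprints, banners):
    """Count regex evaluations needed to classify every banner."""
    return sum(first_match(fingerprints, b)[1] for b in banners)

# Hypothetical fingerprints in file order: the rare one comes first.
fps = [
    ("rare", re.compile(r"^\d+\.\d+\.\d+-rare$")),
    ("common", re.compile(r"^\d+\.\d+\.\d+-log$")),
]
banners = ["5.5.30-log"] * 9 + ["5.5.30-rare"]

# Reorder so the most frequently hit fingerprint is tried first.
hit_counts = Counter(first_match(fps, b)[0] for b in banners)
reordered = sorted(fps, key=lambda fp: hit_counts[fp[0]], reverse=True)

before = total_attempts(fps, banners)
after = total_attempts(reordered, banners)
print(before, after)
```

Reordering is only safe when the moved patterns do not overlap with the ones they jump ahead of, which is one reason to keep similar fingerprints grouped unless the measured gain is large.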

Current match metrics against the 2015.02.01 dataset:

MySQL records 5,258,834
MySQL match count 5,255,535 99.937%
Unmatched: 3,229
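
As a quick arithmetic check, the match rate can be recomputed from the two counts above. The variable names are only illustrative.

```python
total = 5_258_834    # MySQL records
matched = 5_255_535  # MySQL match count

rate = round(100 * matched / total, 3)
difference = total - matched

print(f"match rate: {rate}%")
print(f"total - matched: {difference:,}")
```

The recomputed rate agrees with the 99.937% figure. The difference between the two counts is 3,299, whereas the unmatched figure is listed as 3,229; the thread does not explain the gap of 70.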

Additional metrics can be seen in this gist:
https://gist.github.com/TomSellers/d1749e7b993a1270a526

…tuning

- Removed fingerprint for Percona due to too much overlap with standard version identifiers.  This was causing a Travis failure due to a Percona example matching on the 'Oracle MySQL (common)' fingerprint.
  - Percona:  5.5.28-29.3
  - !Percona: 5.5.30-1.1
  - !Percona: 5.5.23-55

The 'MariaDB MariaDB' fingerprint still needs end of line regex anchor review.
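
The overlap can be reproduced with a small Python sketch. It uses the version-only pattern from the top of mysql_banners.xml as shown in this PR, with Ruby's `\h` (hex digit) rewritten as `[0-9a-fA-F]` because Python's `re` has no `\h`. Whether that pattern is the 'Oracle MySQL (common)' fingerprint is an assumption.

```python
import re

# Version-only pattern from this PR, translated from Ruby to Python:
# \h (hex digit in Ruby) becomes [0-9a-fA-F].
COMMON = re.compile(
    r"^(\d\.\d{1,2}\.[0-9a-fA-F]{1,3}(?:[.-]\d{1,2})?(?:[.-]\d{1})?)"
    r"(?:-m\d{1,2})?(?:-rc)?(?:-alpha)?(?:-beta)?(?:-gamma)?"
    r"(?:-?[Mm]ax)?(?:-rs)?(?:-modified)?(?:-debug)?(?:-log)?$"
)

# Banners from the commit message: one Percona, two that are not.
samples = {
    "5.5.28-29.3": "Percona",
    "5.5.30-1.1": "not Percona",
    "5.5.23-55": "not Percona",
}

results = {banner: bool(COMMON.match(banner)) for banner in samples}
for banner, label in samples.items():
    print(f"{banner:12} ({label}): matches common pattern = {results[banner]}")
```

All three banners have the same shape, so a version-number-only pattern cannot separate the Percona banner from the other two.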
@TomSellers
Contributor Author

That last commit message should say that it removed 'a' fingerprint for Percona. There are still multiple fingerprints in place that are reliable for the related banners.

<?xml version="1.0"?>
<fingerprints matches="mysql.banners">

<fingerprint pattern="^(\d\.\d{1,2}\.\h{1,3}(?:[.-]\d{1,2})?(?:[.-]\d{1})?)(?:-m\d{1,2})?(?:-rc)?(?:-alpha)?(?:-beta)?(?:-gamma)?(?:-?[Mm]ax)?(?:-rs)?(?:-modified)?(?:-debug)?(?:-log)?$">
Contributor

Here and everywhere else where you are trying to match the numeric part of the version, would it make sense to make this more flexible such that version parts of arbitrary length are matched? For example, would 10.0.0 or 5.100.9999 ever be a valid version?

Contributor

Not sure about MySQL proper, but MariaDB uses 10.x.x versions as they diverge from Oracle's implementation.

Contributor Author

Perhaps, but it isn't in our data set yet. MariaDB does use 10.x.x and that is accounted for in the matches starting at line 568. The 'MariaDB MariaDB on Debian Lenny' fingerprint is missing this.

I can adjust the regex as you requested. I keep falling back to my 'unknown service' mindset and forgetting that the assumption is that we already know it's MySQL and just want to fingerprint it.
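
For illustration, the difference between fixed-width and arbitrary-length version parts can be sketched in Python. Both patterns are simplified stand-ins and not the fingerprints that were eventually committed; the version strings are the examples raised in this thread plus one ordinary version.

```python
import re

# Simplified stand-ins, NOT the fingerprints from this PR.
fixed = re.compile(r"^(\d\.\d{1,2}\.\d{1,3})$")  # bounded widths
flexible = re.compile(r"^(\d+\.\d+\.\d+)$")      # arbitrary-length parts

versions = ["5.5.30", "10.0.0", "5.100.9999"]

fixed_hits = [v for v in versions if fixed.match(v)]
flexible_hits = [v for v in versions if flexible.match(v)]

print("fixed:   ", fixed_hits)
print("flexible:", flexible_hits)
```

Since the data is already known to be a MySQL banner, the looser `\d+` form costs little in false positives and keeps working when a major or minor version grows an extra digit.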

- Reviewed/tweaked some EOL regex anchors
- Added OS specific entries for MariaDB
- Added Percona version number only match back
- Added OS specific entries for Percona
@TomSellers
Contributor Author

@jhart-r7 This morning's updates contain the changes to the regex to better handle future strings such as 10.xxx.xxx. There are also quite a few additions that identify the target's OS.

- additional FPs for backports, certain OS/product combinations
@TomSellers
Contributor Author

@jhart-r7 @gschneider-r7 The previous is likely my last commit tweaking fingerprints. Anything that follows will be bug fixes for issues that someone discovers.

@jhart-r7
Contributor

Thanks, @TomSellers! I should be able to land this today. My only last bit of feedback, which I'll address while landing, is to add some comments in these two files to indicate how these are to be used (IOW, what parts of the protocol are being fingerprinted).

@jhart-r7 jhart-r7 merged commit 1f3f159 into rapid7:master Mar 31, 2015
jhart-r7 added a commit that referenced this pull request Mar 31, 2015
@TomSellers TomSellers deleted the pattern/mysql_banner branch April 23, 2024 14:32