Offset between selection and what is sent to stdin when using 'Process Selection' and multi-byte character encoding #26

Closed
DidierA opened this issue Oct 6, 2021 · 4 comments


DidierA commented Oct 6, 2021

Hi,
Using Piper with a 'Context Menu Item' defined as sh -c "cat >/tmp/Piper", which basically dumps whatever it receives to a file, I noticed that in some cases, when it is invoked from the context menu Extensions > Piper > Process Selection, the contents of the file differ from what I selected: the length is the same, but the text starts a few characters before the selection.
For instance, if the sentence "Hello World" is in the Response tab and I select "World", "Hello" is sent to the script.

By trial and error, I managed to understand that it occurs when the HTTP Response is encoded in utf-8 (Content-Type: text/html; charset=utf-8), and only in portions of the body that appear after characters that take more than one byte to encode (such as 'é' or 'è'). In fact I have noticed that there will be an offset of one position for each of those specific characters that appear anywhere before the selected text.

So if the body contains "Nous sommes désolés de vous voir partir":
If I select anything before "désolés", there is no offset. If I select the word "partir", the string "r part" is sent to the script, as if the selection had been made two characters earlier.
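
To make the offset concrete, here is a minimal Kotlin sketch of what seems to be happening, assuming the selection bounds are character offsets while the slice is taken from the raw UTF-8 bytes. The function names are made up for illustration and are not taken from Piper.

```kotlin
// Hypothetical helpers for illustration; not Piper's actual code.

// Treats the selection bounds as byte offsets into the UTF-8 encoded body.
fun sliceAsBytes(body: String, start: Int, end: Int): String =
    String(body.toByteArray(Charsets.UTF_8).copyOfRange(start, end), Charsets.UTF_8)

// Treats the selection bounds as character offsets into the decoded body.
fun sliceAsChars(body: String, start: Int, end: Int): String =
    body.substring(start, end)

fun main() {
    val body = "Nous sommes désolés de vous voir partir"
    val start = body.indexOf("partir")
    val end = start + "partir".length
    println(sliceAsBytes(body, start, end)) // r part
    println(sliceAsChars(body, start, end)) // partir
}
```

Each of the two 'é' characters takes two bytes in UTF-8, so the byte-based slice starts two positions too early.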

@DidierA DidierA changed the title offset between selection and what is sent to stdin when using 'Process Selection' Offset between selection and what is sent to stdin when using 'Process Selection' and multi-byte character encoding Oct 6, 2021

DidierA commented Oct 10, 2021

In function private fun messagesToMap(), if I replace the line

val body = bytes.copyOfRange(bounds[0], bounds[1])

with

val body = bytes.toString(Charsets.UTF_8).substring(bounds[0], bounds[1]).toByteArray(Charsets.UTF_8)

I get the expected result, but this assumes the content is UTF-8 encoded, which depends on what the HTTP server sent.
Pushed in DidierA@318ced7, but it needs testing to check that it does not break other use cases.
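
The UTF-8 caveat can be illustrated with a small sketch, assuming a body that was actually sent as ISO-8859-1: decoding it as UTF-8 replaces the invalid byte with U+FFFD, so the bytes that come back out are no longer the ones the server sent. The helper name sliceViaUtf8 is hypothetical.

```kotlin
// Hypothetical helper mirroring the replacement line above; bounds are character offsets.
fun sliceViaUtf8(bytes: ByteArray, bounds: IntArray): ByteArray =
    bytes.toString(Charsets.UTF_8).substring(bounds[0], bounds[1]).toByteArray(Charsets.UTF_8)

fun main() {
    // In ISO-8859-1, 'é' is the single byte 0xE9.
    val latin1 = "café".toByteArray(Charsets.ISO_8859_1)
    val roundTrip = sliceViaUtf8(latin1, intArrayOf(0, 4))
    // 0xE9 on its own is not valid UTF-8, so the decoder substitutes U+FFFD
    // and the re-encoded bytes differ from the input.
    println(latin1.contentEquals(roundTrip)) // false
}
```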


DidierA commented Oct 10, 2021

I did further tests, and it seems Burp uses some sort of auto-detection to check whether the content is UTF-8. One strategy would be to use the same technique: try to decode the whole message as UTF-8. If that fails, use bytes.copyOfRange() to retrieve the selection; otherwise use the above method with Charsets.UTF_8.
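
A rough sketch of that strategy, assuming strict decoding through java.nio's CharsetDecoder is an acceptable stand-in for whatever detection Burp performs (its actual logic is not known here). The function names are hypothetical, and this is not necessarily what the PR ended up doing.

```kotlin
import java.nio.ByteBuffer
import java.nio.charset.CharacterCodingException
import java.nio.charset.CodingErrorAction

// Hypothetical helper: returns the decoded text, or null if the bytes are not valid UTF-8.
fun decodeStrictUtf8(bytes: ByteArray): String? =
    try {
        Charsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(bytes))
            .toString()
    } catch (e: CharacterCodingException) {
        null
    }

// Hypothetical helper: bounds are treated as character offsets when the
// message is valid UTF-8, and as byte offsets otherwise.
fun selectedBytes(bytes: ByteArray, bounds: IntArray): ByteArray {
    val text = decodeStrictUtf8(bytes)
    return if (text != null) {
        text.substring(bounds[0], bounds[1]).toByteArray(Charsets.UTF_8)
    } else {
        bytes.copyOfRange(bounds[0], bounds[1])
    }
}

fun main() {
    val sentence = "Nous sommes désolés de vous voir partir"
    val bounds = intArrayOf(33, 39) // character offsets of "partir"
    val fromUtf8 = selectedBytes(sentence.toByteArray(Charsets.UTF_8), bounds)
    val fromLatin1 = selectedBytes(sentence.toByteArray(Charsets.ISO_8859_1), bounds)
    println(String(fromUtf8, Charsets.UTF_8))        // partir
    println(String(fromLatin1, Charsets.ISO_8859_1)) // partir
}
```

One limitation: a message in a single-byte encoding that happens to be valid UTF-8 (pure ASCII, for example) takes the UTF-8 branch, which is harmless because character and byte offsets coincide there.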

DidierA pushed a commit to DidierA/burp-piper that referenced this issue Oct 11, 2021

DidierA commented Oct 11, 2021

Submitted PR #27.

dnet pushed a commit that referenced this issue Oct 12, 2021

dnet commented Oct 12, 2021

Thanks for the detailed report and the PR; with that being merged, I'll close this as well.

@dnet dnet closed this as completed Oct 12, 2021