uaddr(), usym(), ustack to support PIE ASLR #75
Comments
This probably requires #59 to be fixed first. Here's why:
Ok, then trying that address:
doesn't work, because:
It's turned that 64-bit address (0x55a801fbd704) into a 32-bit number. @mmarchini I wonder if you hit this while you were debugging as well...
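To make the symptom concrete, here is a minimal sketch of what such a truncation does to the address quoted above. It assumes the truncation simply keeps the low 32 bits; the thread does not say where in the code this happens, and `fits_in_u32` is a hypothetical helper added only for illustration.

```python
# Illustration only: a 64-bit user-space address squeezed into 32 bits.
# Assumption: the truncation keeps the low 32 bits and drops the rest.
addr = 0x55a801fbd704            # address quoted in this thread

truncated = addr & 0xFFFFFFFF    # keep only the low 32 bits
print(hex(addr), "->", hex(truncated))


def fits_in_u32(value: int) -> bool:
    """Hypothetical helper: True if value survives a 32-bit round trip."""
    return 0 <= value <= 0xFFFFFFFF
```

Any address under a PIE mapping like the one above fails the `fits_in_u32` check, so the value has to be carried as a 64-bit integer end to end.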
That makes sense. Maybe fixing #59 will fix this as well.
I fixed #59, and can now read the string:
Note I'm not dereferencing (*) the address since gdb has given me the direct address rather than a pointer. Maybe the
As part of fixing uaddr() for PIE ASLR, since it probably involves switching to bcc_resolve_symname(), I'd also improve the error message when the symbol can't be found. It's currently:
I'd file this as a separate ticket, but I think the code will all change anyway to support PIE.
I believe I hit something related using ustack. The reproducing workload is:
And tracing it:
Note that the first symbol lookups work, but subsequent ones do not. It sounds like we've cached the symbol addresses for libc in a way that doesn't account for PIE ASLR randomization. It works the first time, but the second time those symbols are at different addresses.
I hit this issue recently and investigated. As of
This assumes that the start address of the first entry of the As of
`resolve_usym()` caches a `bcc_symbol` object using the executable name as a key, but on ASLR-enabled platforms symbol addresses change with each execution. Discard the cache in this case so that symbol names resolve properly.

Notes and known issues:
- The cache is discarded whenever `resolve_usym()` is called, even if the pid is the same as the previous one. This is because a pid may be reused.
- This does not check whether a binary is PIE ASLR or not. Note that even if a binary is not PIE ASLR, addresses of shared libraries are randomized if ASLR is enabled. (If a binary is not PIE ASLR and `resolve_usym()` resolves a symbol in the binary itself, we can utilize the cache.)
- If ASLR is disabled on the first execution but enabled on the second execution, `resolve_usym()` will use the previous cache for the second run.
- I'm not sure how much performance impact this has. If the impact is huge, maybe this should be an option.
- As discussed in bpftrace#246, symbolizing will fail after process termination (this is a separate issue). For example:

```
% bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ { @[ustack] = count(); }'
Attaching 7 probes...
^C
@[ 0x7ff1917cb990 ]: 3
@no such file or directory: /proc/3557/personality
[ 0x7fea4211c990 ]: 3
@no such file or directory: /proc/3554/personality
[ 0x7f32bc51a990 ]: 3
```

-----
Closes bpftrace#1031 and solves the second part of bpftrace#75.
… given `resolve_usym()` caches a `bcc_symbol` object using the executable name as a key, but on ASLR-enabled platforms symbol addresses change with each execution. Discard the cache in this case so that symbol names resolve properly. Introduce the `BPFTRACE_CACHE_USER_SYMBOLS` env variable to force caching. Caching is fine when tracing only a single program execution.

Notes and known issues:
- The cache is discarded whenever `resolve_usym()` is called, even if the pid is the same as the previous one. This is because a pid may be reused.
- This does not check whether a binary is PIE ASLR or not. Note that even if a binary is not PIE ASLR, addresses of shared libraries are randomized if ASLR is enabled. (If a binary is not PIE ASLR and `resolve_usym()` resolves a symbol in the binary itself, we can utilize the cache.)
- If ASLR is disabled on the first execution but enabled on the second execution, `resolve_usym()` will use the previous cache for the second run.
- As discussed in bpftrace#246, symbolizing will fail after process termination (this is a separate issue). For example:

```
% bpftrace -e 'u:/lib/x86_64-linux-gnu/libc.so.6:*nanosleep* /comm == "sleep"/ { @[ustack] = count(); }'
Attaching 7 probes...
^C
@[ 0x7ff1917cb990 ]: 3
@ [ 0x7fea4211c990 ]: 3
@ [ 0x7f32bc51a990 ]: 3
```

-----
Closes bpftrace#1031 and solves the second part of bpftrace#75.
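The failure mode this commit message describes can be illustrated with a small model. This is a sketch only, not bpftrace's implementation: `NameKeyedCache`, `NoCache`, `load_symbols`, the pids, and the addresses are all hypothetical stand-ins chosen to show why a cache keyed on the executable name goes stale when each run gets a different layout.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a per-process symbol table: address -> name.
SymbolTable = Dict[int, str]


class NameKeyedCache:
    """Caches one table per executable name (the problematic scheme)."""

    def __init__(self, load: Callable[[int], SymbolTable]) -> None:
        self._load = load
        self._tables: Dict[str, SymbolTable] = {}

    def resolve(self, exe: str, pid: int, addr: int) -> str:
        # The table of the first pid seen is reused for every later run.
        if exe not in self._tables:
            self._tables[exe] = self._load(pid)
        return self._tables[exe].get(addr, hex(addr))


class NoCache:
    """Reloads the table on every lookup, so stale layouts never leak."""

    def __init__(self, load: Callable[[int], SymbolTable]) -> None:
        self._load = load

    def resolve(self, exe: str, pid: int, addr: int) -> str:
        return self._load(pid).get(addr, hex(addr))


# Two runs of the same program; ASLR places the symbol differently each time.
# Pids and addresses are made up for the illustration.
FIRST_ADDR, SECOND_ADDR = 0x7F0000001990, 0x7F0000402990
RUNS: Dict[int, SymbolTable] = {
    100: {FIRST_ADDR: "nanosleep"},
    200: {SECOND_ADDR: "nanosleep"},
}


def load_symbols(pid: int) -> SymbolTable:
    return RUNS[pid]


stale = NameKeyedCache(load_symbols)
print(stale.resolve("sleep", 100, FIRST_ADDR))   # first run resolves
print(stale.resolve("sleep", 200, SECOND_ADDR))  # second run falls back to hex

fresh = NoCache(load_symbols)
print(fresh.resolve("sleep", 200, SECOND_ADDR))  # resolves again
```

The trade-off is the one raised in the notes: reloading on every lookup is always correct for live processes but costs more, which is why an opt-in switch to keep the cache makes sense when only one execution is traced.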
I believe that this is now fixed by #2386 (at least to the point where we can't do much more to resolve symbols). Closing.
Ubuntu 18.04 Bionic (and other OSes) have switched to randomizing the address space layout, which breaks simple approaches for symbol resolution. From https://wiki.ubuntu.com/BionicBeaver/ReleaseNotes#Security_Improvements:
The bpftrace uaddr() call needs to work on both normal executables, as well as PIE executables (gcc -pie -fpie). Here's how to tell the difference:
From the above output, uaddr-old is an "executable", whereas uaddr-pie is a "shared object".
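The "executable" versus "shared object" distinction comes from the `e_type` field of the ELF header, so the same check can be done programmatically. The following is a minimal sketch; `elf_type` and `looks_position_independent` are hypothetical helper names. Note that `ET_DYN` also matches ordinary shared libraries, so this alone only tells you the file is position independent.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object; PIE executables are marked this way too


def elf_type(header: bytes) -> int:
    """Return e_type given at least the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field immediately after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def looks_position_independent(path: str) -> bool:
    """True if the file at path is ET_DYN (PIE executable or shared library)."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

For example, `looks_position_independent("./uaddr-pie")` would be expected to return `True` and `looks_position_independent("./uaddr-old")` to return `False`, assuming those are the two binaries built above.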
You can also see this in the address space of a running process:
Which means techniques like `objdump` no longer work:
However:
uprobe already works for both!
uprobe uses bcc_resolve_symname() to get the offset. Maybe we can do the same here, since it seems to already deal with PIE.
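Once an offset into the binary is known, turning it into a runtime address for a PIE process means adding the load address of the mapping that covers it. The sketch below shows that arithmetic against the text of `/proc/PID/maps`. It is an illustration, not what bpftrace or bcc does internally: `runtime_address` is a hypothetical helper, the sample mapping is made up, and the offset is assumed to be a file offset obtained elsewhere.

```python
from typing import Optional


def runtime_address(maps_text: str, binary: str, file_offset: int) -> Optional[int]:
    """Translate a file offset in `binary` to an address in the process.

    maps_text is the content of /proc/PID/maps. Paths containing spaces
    are not handled in this simplified parser.
    """
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != binary:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        map_offset = int(fields[2], 16)
        # Does this mapping cover the requested file offset?
        if map_offset <= file_offset < map_offset + (end - start):
            return start + (file_offset - map_offset)
    return None


# Made-up mapping for illustration; real values differ on every run under ASLR.
SAMPLE_MAPS = """\
55a801fbd000-55a801fbe000 r-xp 00000000 08:01 1234 /tmp/uaddr-pie
7f32bc400000-7f32bc5e7000 r-xp 00000000 08:01 5678 /lib/x86_64-linux-gnu/libc.so.6
"""

addr = runtime_address(SAMPLE_MAPS, "/tmp/uaddr-pie", 0x704)
print(hex(addr) if addr is not None else "not mapped")
```

For a non-PIE executable the same calculation still holds; the mapping's start address is simply fixed rather than randomized on each run.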