-
Notifications
You must be signed in to change notification settings - Fork 710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect cpu features on ARM with getauxval() #408
Detect cpu features on ARM with getauxval() #408
Conversation
210c17e
to
f8a628e
Compare
mk/ring.mk
Outdated
@@ -32,6 +32,7 @@ RING_SRCS = $(addprefix $(RING_PREFIX), \ | |||
crypto/bn/random.c \ | |||
crypto/bn/shift.c \ | |||
crypto/cipher/e_aes.c \ | |||
crypto/cpu-arm-linux.c \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't ideal, and goes against the spirit of the filename...
Upon making this mini-PR I remembered why I put the C where I did: if the C bits are ARM-only with no provision for running tests, then so is all the Rust that uses it. So, to make the Rust (and Rust/C interaction) testable on average joe hardware (i.e. x86 Linux), I made cpu-arm-linux.c, uh, not ARM-specific. What's the right way to do this (where "this" is "make it testable on x86 linux, but in a production build only have this stuff present on ARM")?
f8a628e
to
e111e1d
Compare
crypto/cpu-arm-linux.c
Outdated
* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION | ||
* OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN | ||
* CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The license must stay the same. If you think the Google copyright statement is not appropriate then suggest a different copyright statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I don't really have an opinion; I just removed it because none of the code in this file was written by Google anymore.
crypto/cpu-arm-linux.c
Outdated
* - If getauxval() is available and the requested type is found, this returns | ||
* 1 and also writes to the result pointer param. | ||
*/ | ||
int32_t getauxval_wrapper(unsigned long type, unsigned long *result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our goal is to do as much in Rust as reasonable. That means doing as little in C as we can get away with. With that in mind, why not do this:
/* getauxval_wrapper returns 0 if |getauxval| is available and 1 otherwise.
* If |getauxval| is available then |*result| is set to the result of
* |getauxval(type)| before |getauxval_wrapper| returns. */
int getauxval_wrapper(unsigned long type, unsigned long *result) {
if (getauxval == NULL) {
return 0;
}
/* When |getauxval| returns 0, |errno == 0| indicates a zero value,
* whereas |errno != 0| after |getauxval| indicates an error.
* https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=6619179#ERR10-C.Seterrnotozerobeforecallingalibraryfunctionknowntoseterrno,andcheckerrnoonlyafterthefunctionreturnsavalueindicatingfailure-LibraryFunctionsthatSeterrnoandReturnanIn-BandErrorIndicator
*/
errno = 0;
*result = getauxval(type);
return 1;
}
The Rust code can then do all the rest of the logic, in particular using std::io::Error::last_os_error()
to get errno.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I did not know about that function. That looks good, but how would we clear errno after reading it in Rust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that my first comment in this review was submitted before I hit "Start a review," and so it isn't part of the "review" as defined by Git. Also note I rewrote that comment a bit and GitHub might not have sent you email for each update.
Also note that I have lots of feedback about simplifying the code. Please don't take this too harshly. Basically, I just want us to try hard to make this code as simple as possible, taking advantage of the fact that we only really need need to detect specifically NEON and crypto features. As far as testing goes, IMO the testing here is overkill but I don't object to it. In the end, we're going to need to test this on real devices before it gets merged to master. I'm actually in the process of acquiring the devices now.
As I mentioned below, I think we are better of just implementing getauxval()
ourselves in Rust, in a separate PR on top of this PR, instead of implementing the fallback to /proc/self/auxv
.
|
||
if let Ok(v) = getauxval_provider.getauxval(AT_HWCAP) { | ||
hwcap = v; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above can be replaced with let hwcap = getauxval_provider.getauxval(AT_HWCAP).unwrap_or(0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/init.rs
Outdated
#[cfg(all(any(target_arch = "arm", target_arch = "aarch64"), | ||
target_os = "linux"))] | ||
fn set_cpu_features() { | ||
cpu_feature::arm_linux::arm_linux_set_cpu_features(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should just inline arm_linux_set_cpu_features()
into this, instead of having it be a separate function, making armcap_from_features()
the public function in the submodule. That would simplify the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
#[cfg(all(any(target_arch = "arm", target_arch = "aarch64"), | ||
target_os="linux"))] | ||
pub fn arm_linux_set_cpu_features() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ring we try to organize files top-down. In particular, arm_linux_set_cpu_features()
should be at the top of the module if it continues to exist, and its implementation details should be below.
|
||
if let Ok(v) = getauxval_provider.getauxval(AT_HWCAP) { | ||
hwcap = v; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let hwcap = provider.getauxval(AT_HWCAP).unwrap_or(0);
This means we're always calling unwrap_or(0)
on the result of provider.getauxval()
, so provider.getauxval()
should just return 0 on error, instead of returning a Result
. Further, if we do that, then getauxval_wrapper()
doesn't need to distinguish between the cases where getauxval()
doesn't exist, where getauxval()
returns 0 without setting errno
, and where getauxval()
encountered an error: In all three cases, it can just set the output result to zero and we'll get the right semantics.
In the next PR, we may need to distinguish between these cases, but at this point, YAGNI applies, so let's simplify the code accordingly.
Plus, I have an idea to make all this code much simpler: Let's just implement getauxal()
ourselves in Rust using the information from http://stackoverflow.com/a/7247142 and http://stackoverflow.com/a/7246388, in particular by iterating past the end of environ
. I took a peak ah how getauxval()
is implemented in Android's Bionic and that's how it's done anyway.
Then we won't need to write any code to read from /proc/self/auxv, and we won't have to worry about errno semantics, we won't have to worry about the differences between glibc and Android's Bionic or between glibc versions, and we won't have to worry about weak symbol linkage, and we can get rid of all the C code. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good; I'll give it a try. It's been fun learning about this particular flavor of C/Rust interop, but I'd just as soon ditch all that.
let mut hwcap2 = 0; | ||
if let Ok(v) = getauxval_provider.getauxval(AT_HWCAP2) { | ||
hwcap2 = v; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let hwcap2 = provider.getauxval(AT_HWCAP2).unwrap_or(0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (do you want these Done comments? I miss BitBucket's tasks...)
const ARM_HWCAP2_AES: AuxvUnsignedLong = 1 << 0; | ||
const ARM_HWCAP2_PMULL: AuxvUnsignedLong = 1 << 1; | ||
const ARM_HWCAP2_SHA1: AuxvUnsignedLong = 1 << 2; | ||
const ARM_HWCAP2_SHA2: AuxvUnsignedLong = 1 << 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest you rename auxv::AuxvUnsignedLong
to auxv::Type
, since it is the type of the type
parameter. The fact that it is an unsigned long
isn't relevant outside of self::auxv
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the current name, but it's also the value, so it might be confusing to someone to see that the return value is of type Type
. Really it's too bad that the terminology of getauxval uses "type" for input instead of, say, "key". :( I can't think of anything better, so Type
it is!
src/cpu_feature/arm_linux/auxv.rs
Outdated
fn test_getauxv_hwcap_linux_finds_hwcap() { | ||
let native_getauxval = NativeGetauxvalProvider{}; | ||
let result = native_getauxval.getauxval(AT_HWCAP); | ||
// there should be SOMETHING in the value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is true, because by definition all these features are optional; if they weren't optional then AT_HWCAP wouldn't be required to report them. Or, do you have a counterexample?
I suggest we just drop this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped
src/cpu_feature/arm_linux/auxv.rs
Outdated
let native_getauxval = NativeGetauxvalProvider{}; | ||
|
||
// AT_NULL aka 0 is effectively the EOF for auxv, so it's never a valid type | ||
assert_eq!(GetauxvalError::NotFound, native_getauxval.getauxval(0).unwrap_err()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test may be more useful for your general-purpose getauxval crate, but I don't think it's useful enough to include in ring, so I suggest we just drop it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped
let getauxval = NativeGetauxvalProvider{}; | ||
|
||
// can't really say anything useful about its result | ||
let _ = armcap_from_features(getauxval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this test, because the tested code will be executed on every test on ARM platforms, so I suggest you just remove the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
#[test] | ||
fn armcap_for_hwcap2_zero_returns_zero() { | ||
assert_eq!(0, super::armcap_for_hwcap2(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this test either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
IMO, testing on x86 and x86-64 is a "nice to have" and IMO it's probably not worthwhile to do, once we've verified the code works on ARM and ARM64, because of all the complication it adds. I'd rather we find a way to make it easier to test this on ARM (e.g. make it easy to set up Android dev environments, perhaps via Docker) than to make the code even a little more complex or confusing to facilitate non-ARM testing. |
Also, regarding licensing:
|
…vel of feature detection into init. I agree to license my contributions to each file under the terms given at the top of each file I changed.
Do you want the contributor statement in every commit, or just one, or only the last or first one, or...? I'm the copyright holder; all this is done on my own time. Also, please don't feel bad about telling me you don't like my approach on something. I'm here to learn how to build a crypto library, not to convince you that my way of building for testability or whatever is gospel. Obviously I choose an approach that seems best to me, and I wipe away a tiny tear when I delete code that enables testability or portability because those things are dear to me, but I'm perfectly happy to admit that my background is not in building low level crypto stuff, so my heuristics for what is worth the lines of code and what is not may well be not well-tuned for this arena. Anyway, I appreciate your thoughtful reviews. You've clearly been doing this sort of code longer than I have, so I'm happy to incorporate your knowledge there into my personal stash. |
e111e1d
to
ee5b621
Compare
I agree to license my contributions to each file under the terms given at the top of each file I changed.
3 similar comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Sorry to request so many small changes.
crypto/cpu-arm-linux.c
Outdated
if (hwcap2 & HWCAP2_PMULL) { | ||
GFp_armcap_P |= ARMV8_PMULL; | ||
unsigned long auxval = getauxval(type); | ||
// map errors to a zero value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to http://man7.org/linux/man-pages/man3/getauxval.3.html#RETURN_VALUE, there's no reason to check errno here, since auxval
will be zero if an error occurs.
Also, all released versions of Android (AFAICT) fail to set errno
at all. The fix is checked into the trunk of Android, but it will be years before we can use it. Therefore, we have to just ignore errno
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently stopping writing Android code years ago wasn't enough to leave these problems behind me. ;)
crypto/cpu-arm-linux.c
Outdated
} entry; | ||
/* | ||
* If getauxval is not available, or an error occurs, return 0. | ||
* Otherwise, return the value found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/found./found (which may be zero)/.
#define HWCAP2_PMULL (1 << 1) | ||
#define HWCAP2_SHA1 (1 << 2) | ||
#define HWCAP2_SHA2 (1 << 3) | ||
#ifdef __linux__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#include <errno.h>
crypto/cpu-arm-linux.c
Outdated
/* Matching OpenSSL, only report other features if NEON is present. */ | ||
if (hwcap & HWCAP_NEON) { | ||
GFp_armcap_P |= ARMV7_NEON; | ||
#include <errno.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this up to where I indicated above (#include
s go first).
include/openssl/arm_arch.h
Outdated
@@ -104,6 +104,8 @@ | |||
#define __ARM_MAX_ARCH__ 8 | |||
#endif | |||
|
|||
/* Keep in sync with src/cpu_feature/arm_linux/arm_linux.rs */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: End sentences with periods.
src/cpu_feature/arm_linux/auxv.rs
Outdated
// from [linux]/include/uapi/linux/auxvec.h. First 32 bits of HWCAP | ||
// even on platforms where unsigned long is 64 bits. | ||
pub const AT_HWCAP: Type = 16; | ||
// currently only used by powerpc and arm64 AFAICT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AT_HWCAP2
is used for 32-bit ARM too, in particular for deteecting crypto instructions on ARMv8 CPUs. For example, consider a 32-bit Android app running on Aaarc64 Android, as is common.
#[test] | ||
fn armcap_for_hwcap2_all_hwcap2_returns_all_armcap() { | ||
assert_eq!(ARMV8_AES | ARMV8_PMULL | ARMV8_SHA1 | ARMV8_SHA256, | ||
super::armcap_for_hwcap2(ARM_HWCAP2_AES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: The |
should be the last character of the preceding line, instead of the first character of the next line.
} | ||
|
||
#[test] | ||
fn armcap_bits_arm_8_with_neon_only_getauxv_hwcap_and_all_getauxv_hwcap2_yields_all_features_armcap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's choose function names that are short enough that the fn <name>() {
fits in 80 characters.
ret |= ARMV8_SHA256; | ||
} | ||
|
||
return ret; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/return ret;/ret/
armcap |= armcap_for_hwcap2(hwcap2); | ||
} | ||
|
||
return armcap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/return armcap;/armcap/
src/c.rs
Outdated
@@ -41,17 +41,6 @@ macro_rules! define_metrics_tests { | |||
$expected_align_factor:expr ) => | |||
{ | |||
#[cfg(test)] | |||
extern { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this in didn't work when adding unsigned_long
because it uses the same size and alignment C globals, which then would get redefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not sure I understand. Did you try defining GFp_unsigned_long_align
and GFp_unsigned_long_size
in crypto/crypto.c and using them?
src/c.rs
Outdated
static GFp_int8_t_size: u16; | ||
static GFp_int8_t_align: u16; | ||
static GFp_uint8_t_size: u16; | ||
static GFp_uint8_t_align: u16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So instead, I define them all beforehand. Maybe there are cleverer ways to conditionally define stuff in a macro?
1 similar comment
1 similar comment
3 similar comments
I've ordered some actual hardware so I can test this, and also I'm digging into the Android code as well. |
Looks like the Android jobs time out and the aarch64 ones fail aes tests: https://travis-ci.org/marshallpierce/ring/builds/193520619. The last merged PR (my .gitignore lines) had one android job time out, but the aarch64 ones passed: https://travis-ci.org/briansmith/ring/builds/189857834 Is it possible that the arm hardware used is the broken hardware listed in #357? |
2 similar comments
1 similar comment
3 similar comments
Good news: I have some hardware that can be used to test this now. Hoping to get to the review soon. I'm not sure what's up with that AAarch64 build. I will look into it. I doubt it's this PRs fault. |
Sorry that so much time has passed on this, which seems to have gotten stuck on us not being able to debug some issue with failing tests a long, long time ago. Now I have some time to work on this again, but a lot has changed. In particular, API level 28 is now the minimum for the Google Play store, so there's no reason to worry about whether I think then, on ARM Linux, we can use @marshallpierce Would you be interested in updating the PR? |
Hypothetically, but I'm pretty maxed out right now with other projects. Give it another year and we'll see ;) |
:) I wish I could have predicted that time would make most of the complexity here irrelevant. I have a private branch that does this, which I will submit publicly soon, so I'm going to close this PR. Thanks for the effort you put into this! |
This is 1 and 2 of your suggested PR disassembly in #392. I elected to do them together because otherwise this would have failed to build on arm because there wouldn't be a replacement for
GFp_cpuid_setup
.