Clingo/gringo/clasp: incorrect behavior on Android 11+ #475
Comments
Just had a quick look. To me it seems that we should actually aim for solution 5. @BenKaufmann, can't we simply use bit 0 instead of 63 here? The bit should be guaranteed to be zero because of alignment. Or am I missing something with the string literals?
To be clear:
No, the way that
const char* foo = "12345";
auto ok = ConstString::fromLiteral(foo); // lsb not set
auto bad = ConstString::fromLiteral(foo + 1); // lsb set
The current implementation is indeed broken and does not follow good practice (see e.g. https://muxup.com/2023q4/storing-data-in-pointers).
Hence, maybe this could be turned into a non-issue by just dropping the (unused?) no-copy optimization.
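To make the alignment point concrete, here is a minimal sketch of low-bit tagging. The helper names (lowBitIsFree, tagLowBit, untagLowBit) are hypothetical and are not part of clasp or ConstString; the sketch only illustrates why bit 0 is usable for aligned allocations but not for an arbitrary pointer into a string literal.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not clasp's API): true if bit 0 of the address is zero
// and can therefore carry a one-bit tag.
inline bool lowBitIsFree(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 1u) == 0;
}

// Store a tag in bit 0. Only valid when lowBitIsFree(p) holds, which is the
// case for suitably aligned allocations but not for every char pointer.
inline std::uintptr_t tagLowBit(const void* p) {
    assert(lowBitIsFree(p));
    return reinterpret_cast<std::uintptr_t>(p) | 1u;
}

// Recover the original pointer by clearing bit 0 again.
inline const char* untagLowBit(std::uintptr_t v) {
    return reinterpret_cast<const char*>(v & ~static_cast<std::uintptr_t>(1));
}
```

With an even-aligned buffer buf, lowBitIsFree(buf) holds while lowBitIsFree(buf + 1) does not, which mirrors the foo versus foo + 1 case above.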
I had a quick look; clingo indeed uses the upper bits for symbols. The good news is that this is going to change in some future release. The bad news is that this is going to take a while.
Are you sure about point 3? The clingo library manages its memory completely by itself. It should be safe to use the shared object together with other libraries relying on different malloc implementations.
I am. Here is a minimal example:
Linked with
What seems to trigger the problem is the registration of the propagator, or some code that is executed because of it. If you remove the registration, the error disappears. (I created a NULL propagator to keep the program small, but the error occurs even if you provide pointers to valid callback functions.)
I am not sure what is happening here. Maybe further options are required when building the libclingo.so library. In any case, I can promise that these issues will go away with future releases. Tagging the upper bits of pointers was not a good idea in the first place. There are more and more extensions that increase the virtual address space. We have one machine that supports 57-bit virtual addresses (because of some Intel extension).
Great! I am happy to run tests on Android on beta versions, just keep me in the loop.
In the beginning of next year. The wip branch should be quite stable by now. There won't be any patches regarding MTE, however.
Closing this. Future clingo versions won't touch the high bits of pointers anymore. It will still be some time before a release, however.
Problem description
Clingo/gringo/clasp do not work properly on Android 11+. Grounding produces no aspif rules and solving produces empty answer sets even when given non-empty aspif files.
The issue is due to clingo's use of some memory pointers in a way that is incompatible with ARM Memory Tagging Extension (MTE) support. Details can be found at https://source.android.com/docs/security/test/tagged-pointers.
This affects, for instance, clasp's StrRef (file clasp/src/shared_context.cpp). While StrRef's modification to bit 63 of the allocated pointer is only temporary and does not flow back to the call to free(), StrRef assumes that bit 63 of the pointer returned by malloc() is always 0. That is not guaranteed under Android 11+ and in fact the bit is often set to 1. As a result, clasp gets confused as to what type of data structure is stored in that memory area and ends up displaying empty answer sets. Interestingly, the underlying computation is correct to the best of my knowledge and only the displaying of the answer sets is affected. I have tested this by writing a small test C++ program that calls solve() and handles on its own the task of displaying the answer sets.
Note that this issue is not limited to Android. In principle, it may affect any ARM-based device whose OS adopts MTE.
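The failure mode can be sketched without touching real pointers. The following is a hypothetical model of a "bit 63 means owned" scheme, not clasp's actual StrRef code; the address and the top-byte value 0xb4 are illustrative assumptions chosen so that bit 63 ends up set, as described above.

```cpp
#include <cstdint>

// Hypothetical stand-in for a scheme that uses bit 63 as a type flag.
constexpr std::uint64_t kFlagBit = std::uint64_t{1} << 63;

// Mark a value as "flagged" by setting bit 63.
constexpr std::uint64_t setFlag(std::uint64_t addr) { return addr | kFlagBit; }

// Check the flag. This is only meaningful if unflagged addresses never have bit 63 set.
constexpr bool hasFlag(std::uint64_t value) { return (value & kFlagBit) != 0; }

// Illustrative address with an all-zero top byte, as the scheme expects.
constexpr std::uint64_t kPlainAddr = 0x0000007fa4c01230ULL;

// The same address with an OS-provided tag in the top byte (illustrative value).
constexpr std::uint64_t kTaggedAddr = kPlainAddr | (std::uint64_t{0xb4} << 56);
```

Here hasFlag(kPlainAddr) is false, but hasFlag(kTaggedAddr) is true even though setFlag was never applied, so a freshly allocated pointer would be mistaken for the flagged kind.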
Symptoms
The tests below were run in a termux terminal on a Galaxy Z Fold5, a Galaxy S10e and a Galaxy Tab S7, with the same results on all three.
Clingo was compiled with clang 17.0.6, the stock compiler in termux. So far, I have been unable to build clingo with gcc on Android.
gringo issue
clingo/clasp issue
Possible solutions
A few possible solutions exist, but none is entirely pain-free.
Solution 1 (Tested)
Use a malloc() replacement such as dmalloc: add /path/to/dmalloc/libdmalloc.a to the linker flags when you configure clingo.
Solution 2 (Tested)
Compile clingo statically against the static version of libc found here: https://android.googlesource.com/toolchain/prebuilts/ndk/r17/+/refs/heads/main/sysroot/usr/lib/aarch64-linux-android
Needless to say, this solution does not produce shared objects.
The steps are similar to Solution 1. I turned off Python and Lua support, and I am not sure whether this solution works with them.
Solution 3 (Not tested)
Building a 32-bit version of clingo should eliminate the problem. See here for a discussion: termux/termux-packages#7332
Solution 4 (Not tested)
Another solution provided at termux/termux-packages#7332 relies on disabling tagged pointers via code. I hypothesize that clingo/gringo/clasp could incorporate the code given in this post: termux/termux-packages#7332 (comment).
Possibly a headache for the Potassco team since it introduces architecture/OS-dependent code.
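As a rough idea of what such OS-dependent code could look like, here is a sketch that asks bionic to stop tagging heap pointers via mallopt(). This is an assumption about one possible approach and not necessarily the code from the linked post; disableHeapPointerTagging is a hypothetical name, and the M_BIONIC_SET_HEAP_TAGGING_LEVEL option is only available on sufficiently recent Android versions, hence the guards.

```cpp
#if defined(__ANDROID__)
#include <malloc.h>  // mallopt() and bionic's M_BIONIC_* options
#endif

// Hypothetical helper: returns true if heap pointer tagging was switched off.
// Presumably this has to run before any allocation whose pointer gets tagged by the library.
inline bool disableHeapPointerTagging() {
#if defined(__ANDROID__) && defined(M_BIONIC_SET_HEAP_TAGGING_LEVEL)
    // mallopt() returns non-zero on success.
    return mallopt(M_BIONIC_SET_HEAP_TAGGING_LEVEL, M_HEAP_TAGGING_LEVEL_NONE) != 0;
#else
    return false;  // not Android, or the option is unknown to this libc
#endif
}
```

The preprocessor guards are exactly the kind of architecture/OS-dependent code mentioned above: on every other platform the function compiles to a no-op.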
Solution 5 (Not tested)
A more seamless solution for users -- but more work for the Potassco team! ;) -- is to modify clingo/gringo/clasp and move all tags to the second byte.
Discussion: swiftlang/swift#40779
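A minimal sketch of what "tags in the second byte" could mean, using plain 64-bit integers instead of real pointers. The layout and the names (withTag, tagOf, withoutTag) are hypothetical; the sketch assumes user-space addresses occupy at most 48 bits, so bits 48-55 are free for the application while bits 56-63 are left to the OS.

```cpp
#include <cstdint>

// Hypothetical layout: bits 0-47 address, bits 48-55 application tag,
// bits 56-63 left untouched for the OS (e.g. pointer tagging).
constexpr int kTagShift = 48;
constexpr std::uint64_t kTagMask = std::uint64_t{0xff} << kTagShift;

// Place an 8-bit tag into the second-highest byte.
constexpr std::uint64_t withTag(std::uint64_t ptrBits, std::uint8_t tag) {
    return (ptrBits & ~kTagMask) | (std::uint64_t{tag} << kTagShift);
}

// Read the application tag back.
constexpr std::uint8_t tagOf(std::uint64_t value) {
    return static_cast<std::uint8_t>((value & kTagMask) >> kTagShift);
}

// Clear the application tag; the top byte set by the OS is preserved.
constexpr std::uint64_t withoutTag(std::uint64_t value) {
    return value & ~kTagMask;
}
```

Note that the 48-bit assumption does not hold on machines with 57-bit virtual addresses, where bits 48-56 belong to the address itself, so this layout would only move the problem rather than remove it.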