This repository hosts ViaLin, an accurate, low-overhead, path-aware, dynamic taint analyzer for Android.
This repository also hosts ViaLin's evaluation package.
If you use this tool, please cite:
Khaled Ahmed, Yingying Wang, Mieszko Lis, and Julia Rubin. ViaLin: Path-Aware Dynamic Taint Analysis for Android. The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE), 2023 (26% acceptance rate).
- Install the Android SDK and build tools: https://developer.android.com/studio/intro/update
- Install python3
- Install and build Android AOSP version 8.0.0 by following the instructions in the Android manual (in our evaluation, we targeted android-8.0.0_r21, lunch 31).
- Change the directory to AndroidSource: cd AndroidSource
- In the apply_code.py script, set android_src_folder to the path of your Android AOSP checkout.
- Run apply_code.py: python3 apply_code.py
- Replace [path-to-jar] in java.mk with the path to the jar file built in step #3.
- Create a folder called framework_analysis_results and place its path in [framework_analysis_results] in java.mk.
- Create the folder class_info/ inside framework_analysis_results.
- Replace [path-to-sources] and [path-to-sink] in java.mk with the absolute path of GPBench/config/empty.txt from the evaluation package.
- Change the directory to the downloaded AOSP and follow the "Setting up the environment", "Choosing a target", and "Building the code" sections of the Building Android manual.
- Flash an Android device by following the instructions in the Android manual, "Flashing Devices".
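The folder-creation and placeholder steps above can be scripted. The following is a minimal sketch, not part of ViaLin: the helper names `prepare_results_dir` and `fill_placeholders` are hypothetical, and the sample makefile lines in the demo are invented for illustration (the real java.mk will look different). Only the bracketed placeholder names and the folder names come from the steps above.

```python
import tempfile
from pathlib import Path


def prepare_results_dir(root: Path) -> Path:
    """Create framework_analysis_results/ and its class_info/ subfolder under root."""
    results = root / "framework_analysis_results"
    (results / "class_info").mkdir(parents=True, exist_ok=True)
    return results


def fill_placeholders(makefile_text: str, values: dict) -> str:
    """Replace each bracketed placeholder (e.g. [path-to-jar]) with its value."""
    for placeholder, value in values.items():
        makefile_text = makefile_text.replace(placeholder, value)
    return makefile_text


if __name__ == "__main__":
    # Demo in a throwaway directory; every path below is a stand-in.
    with tempfile.TemporaryDirectory() as tmp:
        results = prepare_results_dir(Path(tmp))
        empty_list = "/abs/path/to/GPBench/config/empty.txt"  # assumed location
        values = {
            "[path-to-jar]": "/abs/path/to/tool.jar",  # hypothetical jar path
            "[framework_analysis_results]": str(results),
            "[path-to-sources]": empty_list,
            "[path-to-sink]": empty_list,
        }
        # Invented sample text; it only shows how the substitution behaves.
        sample = "JAR := [path-to-jar]\nOUT := [framework_analysis_results]\n"
        print(fill_placeholders(sample, values))
```

To apply this to the real file, read java.mk with `Path.read_text()`, pass the text through `fill_placeholders`, and write the result back after checking it by eye.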
An example of how to taint and install an app on the device is in the evaluation package at GPBench/scripts/run_gp.py. Before running it, modify the paths in the script to point to the correct folders for the AOSP, framework_analysis_results, the source/sink lists, the android-record-and-replay tool included in vialin, and the apk. Then run it from the vialin directory: python3 GPBench/scripts/run_gp.py
This section describes both the DroidICCBench and the GPBench benchmarks along with the configuration used to run the benchmarks. The evaluation package scripts and configuration files are under evaluation_package.
This package is organized in the following structure:
.
├── apktool/
├── android-record-replay/
├── DroidICCBench/
│   ├── scripts/
│   │   ├── run_droidbench.py
│   │   └── translate_droidbench.py
│   ├── config/
│   │   ├── app1.src.log
│   │   ├── app1.sink.log
│   │   └── ...
│   └── apps/
│       ├── Category1/
│       │   ├── app1
│       │   ├── app2
│       │   └── ...
│       ├── Category2/
│       └── ...
└── GPBench/
    ├── scripts/
    │   ├── run_gp.py
    │   ├── run_overhead.py
    │   └── translate_gp.py
    ├── config/
    │   ├── app1.src.log
    │   ├── app1.sink.log
    │   └── ...
    └── apps/
        ├── app1
        ├── app2
        └── ...
The evaluation package contains the android-touch-record-replay tool, which we used to record and replay the execution for each app.
Next, we describe the DroidICCBench and GPBench sections of the evaluation package.
The benchmark consists of 217 apps from DroidBench and ICCBench. We had to exclude eight apps for which we could not reliably trigger the flow in an automated way, e.g., because it is triggered only when the phone memory is low. The eight excluded apps are:
- Callbacks.AnonymousClass1
- Callbacks.LocationLeak1
- Callbacks.LocationLeak2
- Callbacks.LocationLeak3
- Callbacks.RegisterGlobal1
- Callbacks.RegisterGlobal2
- GeneralJava.FactoryMethods1
- Lifecycle.ActivityLifecycle3
The remaining 209 apps were developed for an older Android API (level 19), which predates the permission model of newer Android versions, so we modified the apps to request permissions using the approach of the newer Android versions. In total, we used 209 apps in our evaluation. The apps are grouped into categories under the DroidICCBench/apps folder.
The combined sources list is under DroidICCBench/config/source_full_list.txt, and the combined sinks list is under DroidICCBench/config/sinks_full_list.txt. The replay script for each app is at DroidICCBench/config/[app].replay.txt; apps without an execution script use the default script DroidICCBench/config/trigger_flow.replay.txt.
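The fallback rule for replay scripts can be sketched as follows. This is an illustration of the rule as described here, not the code used by the evaluation scripts; the function name `resolve_replay_script` is hypothetical.

```python
import tempfile
from pathlib import Path

# Default script named in the text above.
DEFAULT_REPLAY = "trigger_flow.replay.txt"


def resolve_replay_script(config_dir: Path, app: str) -> Path:
    """Return [app].replay.txt if it exists, otherwise the default replay script."""
    candidate = config_dir / f"{app}.replay.txt"
    return candidate if candidate.is_file() else config_dir / DEFAULT_REPLAY


if __name__ == "__main__":
    # Demo with a temporary stand-in for DroidICCBench/config.
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp)
        (config / "app1.replay.txt").write_text("")
        print(resolve_replay_script(config, "app1").name)  # app-specific script
        print(resolve_replay_script(config, "app2").name)  # falls back to default
```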
The script DroidICCBench/scripts/run_droidbench.py runs the benchmark app selected by its number; the number of each benchmark is its line number in DroidICCBench/config/droidbench_apks.log. The script scripts/extract_path.py extracts the paths from the Android logcat and translates them into a human-readable format.
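To find which app a given benchmark number refers to, the line-number convention can be sketched like this. The sketch assumes one entry per line and that numbering starts at 1; neither detail is confirmed by the text above, and `apk_for_benchmark` is a hypothetical helper, not part of run_droidbench.py.

```python
def apk_for_benchmark(list_text: str, number: int) -> str:
    """Return the entry on line `number` (assumed 1-based) of the benchmark list."""
    lines = list_text.splitlines()
    if not 1 <= number <= len(lines):
        raise IndexError(f"benchmark number {number} is outside 1..{len(lines)}")
    return lines[number - 1].strip()


if __name__ == "__main__":
    # Invented entries; the real droidbench_apks.log contents may differ.
    sample = "first.apk\nsecond.apk\nthird.apk\n"
    print(apk_for_benchmark(sample, 2))
```

With the real file, pass `open("DroidICCBench/config/droidbench_apks.log").read()` as `list_text`.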
We used the benchmark of Google Play applications from Zhang et al. [37]. We excluded three of the 19 apps from our study because their backend servers were non-functional at the time of writing, so we could not execute them dynamically. The remaining 16 apps are listed below:
The short list of sources for each app is under GPBench/config/[app-name].src.log, and the short list of sinks for each app is under GPBench/config/[app-name].sink.log. The long list of sources is at GPBench/config/source_long_list.txt and the long list of sinks is at GPBench/config/sinks_long_list.log. The replay script for each app is at GPBench/config/[app].replay.txt.
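The per-app naming scheme can be captured in a small helper. This is a sketch that only assembles the file names described above; `gpbench_config` is a hypothetical name and the evaluation scripts may build these paths differently.

```python
from pathlib import PurePosixPath


def gpbench_config(app: str, root: str = "GPBench/config") -> dict:
    """Build the per-app source, sink, and replay file paths for a GPBench app."""
    base = PurePosixPath(root)
    return {
        "sources": str(base / f"{app}.src.log"),
        "sinks": str(base / f"{app}.sink.log"),
        "replay": str(base / f"{app}.replay.txt"),
    }


if __name__ == "__main__":
    # "app1" is the placeholder app name used in the package layout.
    for kind, path in gpbench_config("app1").items():
        print(kind, path)
```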
The script GPBench/scripts/run_gp.py runs the specified benchmark app. The script GPBench/scripts/run_overhead.py runs the overhead experiment. The script GPBench/scripts/translate_gp.py extracts the paths from the Android logcat and translates them into a human-readable format.
We cannot distribute the malicious YoWhatsApp apk online; instead, we provide the following indicators:
package name: com.gbbwhatsapp
sha1 hash: a8dbfd8d48e4a4952e1a822ce1323a37348f0c1c
sha256 hash: 89c23dc02f4f67972a5c4cd9ccc61f7c08c95173d07a980c7340101ba597939e
md5: 531d0a00d3b7221b4ac712fbfe846029
blog describing the malware: link
The path provided to the analysts is available here
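If you obtain a candidate apk elsewhere, you can check it against the hashes listed above. The following is a minimal sketch using Python's standard hashlib module; the helper names `file_digests` and `matches_indicators` are our own for illustration, and the apk path is supplied by you on the command line.

```python
import hashlib
import sys

# Hashes copied from the indicators listed above.
EXPECTED = {
    "sha1": "a8dbfd8d48e4a4952e1a822ce1323a37348f0c1c",
    "sha256": "89c23dc02f4f67972a5c4cd9ccc61f7c08c95173d07a980c7340101ba597939e",
    "md5": "531d0a00d3b7221b4ac712fbfe846029",
}


def file_digests(path, algorithms=("sha1", "sha256", "md5"), chunk_size=1 << 20):
    """Compute several hex digests of a file in a single pass."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read in chunks so large apks do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_indicators(path, expected=EXPECTED) -> bool:
    """Return True only if every expected hash matches the file."""
    digests = file_digests(path, tuple(expected))
    return all(digests[name] == value.lower() for name, value in expected.items())


if __name__ == "__main__" and len(sys.argv) > 1:
    print("match" if matches_indicators(sys.argv[1]) else "no match")
```

Save the block under any file name and pass the apk path as the only argument.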
If you experience any issues, please submit an issue or contact us at khaledea@ece.ubc.ca