Merge pull request #6 from guethilu/splc23-explanations
Merge Dockerization into thesis branch
guethilu authored Dec 6, 2023
2 parents e1b423f + 3791a7b commit dab27e0
Showing 14 changed files with 134 additions and 93 deletions.
93 changes: 22 additions & 71 deletions INSTALL.md
@@ -1,16 +1,11 @@
# Installation
## Installation Instructions
In the following, we describe how to replicate the validation from our paper (Section 5) step-by-step.
In the following, we describe how to replicate the evaluation from our paper step-by-step.
The instructions explain how to build the Docker image and run the validation in a Docker container.

### 1. Install Docker (if required)
How to install Docker depends on your operating system:
### 1. Start Docker

- _Windows or Mac_: You can find download and installation instructions [here](https://www.docker.com/get-started).
- _Linux Distributions_: How to install Docker on your system, depends on your distribution. The chances are high that Docker is part of your distributions package database.
Docker's [documentation](https://docs.docker.com/engine/install/) contains instructions for common distributions.

Then, start the docker deamon.
Start the docker daemon.
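Whether the daemon is up can be checked quickly from a terminal before proceeding (a minimal sketch; `docker info` exits non-zero when the daemon is not running or the current user lacks permission to access it):

```shell
# Check whether the Docker daemon is reachable.
# `docker info` succeeds only if the daemon is running and accessible.
if docker info > /dev/null 2>&1; then
    echo "Docker daemon is running"
else
    echo "Docker daemon is not reachable (not started, or missing permissions)"
fi
```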

### 2. Open a Suitable Terminal
```
@@ -31,10 +26,18 @@ Clone this repository to a directory of your choice using git:
```shell
git clone https://github.com/VariantSync/DiffDetective.git
```
Then, navigate to the root of your local clone of this repository:
Then, navigate to the root of your local clone of this repository
```shell
cd DiffDetective
```
and check out the branch of the replication:
```shell
git checkout splc23-explanations
```
Finally, navigate to the replication directory:
```shell
cd replication/splc23-explanations
```

### 3. Build the Docker Container
To build the Docker container you can run the `build` script corresponding to your operating system:
@@ -45,9 +48,9 @@ To build the Docker container you can run the `build` script corresponding to your operating system:
./build.sh
```

## 4. Verification & Replication
## 4. Replication

### Running the Replication or Verification
### Running the Replication
To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument.

#### Windows:
@@ -57,10 +60,9 @@ To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument.

> WARNING!
> The replication will require at least an hour and might require up to a day, depending on your system.
> Therefore, we offer a short verification (5-10 minutes) which runs DiffDetective on only four of the datasets.
> You can run it by providing "verification" as argument instead of "replication" (i.e., `.\execute.bat verification`, `./execute.sh verification`).
> If you want to stop the execution, you can call the provided script for stopping the container in a separate terminal.
> When restarted, the execution will continue at the last unfinished repository.
> If you want to replicate parts of the evaluation on a subset of the datasets, run the replication on a custom dataset (see below for instructions).
> #### Windows:
> `.\stop-execution.bat`
> #### Linux/Mac (bash):
@@ -69,61 +71,10 @@ To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument.
You might see warnings or errors reported by SLF4J, such as `Failed to load class "org.slf4j.impl.StaticLoggerBinder"`; you can safely ignore them.
Further troubleshooting advice can be found at the bottom of this file.

The results of the verification will be stored in the [results](results) directory.
The results of the replication will be stored in the [results](../../results) directory.

### Expected Output of the Verification
The aggregated results of the verification/replication can be found in the following files.
The example file content shown below should match your results when running the _verification_.
(Note that the links below only have a target _after_ running the replication or verification.)

- The [speed statistics](results/validation/current/speedstatistics.txt) contain information about the total runtime, median runtime, mean runtime, and more:
```
#Commits: 24701
Total commit process time is: 14.065916666666668min
Fastest commit process time is: d86e352859e797f6792d6013054435ae0538ef6d___xfig___0ms
Slowest commit process time is: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___7996ms
Median commit process time is: f77ffeb9b26f49ef66f77929848f2ac9486f1081___tcl___13ms
Average commit process time is: 34.166835350795516ms
```
- The [classification results](results/validation/current/ultimateresult.metadata.txt) contain information about how often each pattern was matched, and more.
```
repository: <NONE>
total commits: 42323
filtered commits: 7425
failed commits: 0
empty commits: 10197
processed commits: 24701
tree diffs: 80751
fastestCommit: 518e205b06d0dc7a0cd35fbc2c6a4376f2959020___xorg-server___0ms
slowestCommit: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___7996ms
runtime in seconds: 853.9739999999999
runtime with multithreading in seconds: 144.549
treeformat: org.variantsync.diffdetective.diff.difftree.serialize.treeformat.CommitDiffDiffTreeLabelFormat
nodeformat: org.variantsync.diffdetective.mining.formats.ReleaseMiningDiffNodeFormat
edgeformat: org.variantsync.diffdetective.mining.formats.DirectedEdgeLabelFormat with org.variantsync.diffdetective.mining.formats.ReleaseMiningDiffNodeFormat
analysis: org.variantsync.diffdetective.validation.PatternValidationTask
#NON nodes: 0
#ADD nodes: 0
#REM nodes: 0
filtered because not (is not empty): 212
AddToPC: { total = 443451; commits = 22470 }
AddWithMapping: { total = 51036; commits = 2971 }
RemFromPC: { total = 406809; commits = 21384 }
RemWithMapping: { total = 36622; commits = 2373 }
Specialization: { total = 7949; commits = 1251 }
Generalization: { total = 11057; commits = 955 }
Reconfiguration: { total = 3186; commits = 381 }
Refactoring: { total = 4862; commits = 504 }
Untouched: { total = 0; commits = 0 }
#Error[conditional macro without expression]: 2
#Error[#else after #else]: 2
#Error[#else or #elif without #if]: 11
#Error[#endif without #if]: 12
#Error[not all annotations closed]: 8
```

Moreover, the results comprise the (LaTeX) tables that are part of our paper and appendix.
The processing times might deviate because performance depends on your hardware.


### (Optional) Running DiffDetective on Custom Datasets
You can also run DiffDetective on other datasets by providing the path to the dataset file as first argument to the execution script:
@@ -133,24 +84,24 @@ You can also run DiffDetective on other datasets by providing the path to the dataset file as first argument to the execution script:
#### Linux/Mac (bash):
`./execute.sh path/to/custom/dataset.md`

The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](docs/datasets.md) in the [docs](docs) folder.
The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](../../docs/datasets/all.md) in the [docs/datasets](../../docs/datasets) folder.
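As an illustration only, a custom dataset file might look like the following sketch. The column names, project name, and URL here are assumptions for the example; copy the exact table layout from the existing files in [docs/datasets](../../docs/datasets):

```markdown
| Project name | Repository URL                         |
|--------------|----------------------------------------|
| my-project   | https://github.com/user/my-project.git |
```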

## Troubleshooting

### 'Got permission denied while trying to connect to the Docker daemon socket'
`Problem:` This is a common problem under Linux, if the user trying to execute Docker commands does not have the permissions to do so.
`Problem:` This is a common problem under Linux if the user trying to execute Docker commands does not have the permissions to do so.

`Fix:` You can fix this problem by either following the [post-installation instructions](https://docs.docker.com/engine/install/linux-postinstall/), or by executing the scripts in the replication package with elevated permissions (i.e., `sudo`).

### 'Unable to find image 'replication-package:latest' locally'
`Problem:` The Docker container could not be found. This either means that the name of the container that was built does not fit the name of the container that is being executed (this only happens if you changed the provided scripts), or that the Docker container was not built yet.
`Problem:` The Docker container could not be found. This either means that the name of the container that was built does not match the name of the container that is being executed (this only happens if you changed the provided scripts), or that the Docker container was not built yet.

`Fix:` Follow the instructions described above in the section `Build the Docker Container`.

### No results after verification, or 'cannot create directory '../results/validation/current': Permission denied'
`Problem:` This problem can occur due to how permissions are managed inside the Docker container. More specifically, it will appear, if Docker is executed with elevated permissions (i.e., `sudo`) and if there is no [results](results) directory because it was deleted manually. In this case, Docker will create the directory with elevated permissions, and the Docker user has no permissions to access the directory.
`Problem:` This problem can occur due to how permissions are managed inside the Docker container. More specifically, it will appear, if Docker is executed with elevated permissions (i.e., `sudo`) and if there is no [results](../../results) directory because it was deleted manually. In this case, Docker will create the directory with elevated permissions, and the Docker user has no permissions to access the directory.

`Fix:` If there is a _results_ directory, delete it with elevated permission (e.g., `sudo rm -r results`).
`Fix:` If there is a _results_ directory, delete it with elevated permission (e.g., `sudo rm -r results`).
Then, create a new _results_ directory without elevated permissions, or execute `git restore .` to restore the deleted directory.

### Failed to load class "org.slf4j.impl.StaticLoggerBinder"
76 changes: 71 additions & 5 deletions README.md
@@ -1,8 +1,74 @@
# Paper Replication Package
# Explaining Edits to Variability Annotations in Evolving Software Product Lines

This is the replication package for the paper "Explaining Edits to Variability Annotations in Evolving Software Product Lines".
This is the replication package for our paper _Explaining Edits to Variability Annotations in Evolving Software Product Lines_.

It contains the adapted DiffDetective implementation used for the evaluation in the paper.
The main change is the edge-adding in the `PaperEvaluationTask`, where variation diffs get extended to edge-typed variation diffs with *Implication* and *Alternative* edges.
This replication package consists of two parts:

1. **DiffDetective**: For our validation, we modified [_DiffDetective_](https://github.com/VariantSync/DiffDetective), a Java library and command-line tool that classifies edits to variability in the git histories of preprocessor-based software product lines. Our modification extends the variation diffs constructed by _DiffDetective_ to the edge-typed variation diffs we introduce in the paper.
2. **Dataset Overview**: We provide an overview of the 44 inspected datasets with updated links to their repositories in the file [docs/datasets.md][dataset].

## 1. DiffDetective
DiffDetective is a Java library and command-line tool to parse and classify edits to variability in git histories of preprocessor-based software product lines by creating [variation diffs][difftree_class] and operating on them.

We offer a [Docker](https://www.docker.com/) setup to easily __replicate__ the validation performed in our paper.
In the following, we provide a quickstart guide for running the replication.
You can find detailed information on how to install Docker and build the container in the [INSTALL](INSTALL.md) file, including descriptions of each step and troubleshooting advice.

### 1.1 Build the Docker container
Start the Docker daemon.
Clone this repository.
Open a terminal and navigate to the root directory of this repository.
To build the Docker container you can run the `build` script corresponding to your operating system.
#### Windows:
`.\build.bat`
#### Linux/Mac (bash):
`./build.sh`

### 1.2 Start the replication
To execute the replication you can run the `execute` script corresponding to your operating system with `replication` as first argument.

#### Windows:
`.\execute.bat replication`
#### Linux/Mac (bash):
`./execute.sh replication`

> WARNING!
> The replication will require at least an hour and might require up to a day, depending on your system.
> If you want to stop the execution, you can call the provided script for stopping the container in a separate terminal.
> When restarted, the execution will continue at the last unfinished repository.
> If you want to replicate parts of the evaluation on a subset of the datasets, run the replication on a custom dataset (see below for instructions).
> #### Windows:
> `.\stop-execution.bat`
> #### Linux/Mac (bash):
> `./stop-execution.sh`
You might see warnings or errors reported by SLF4J, such as `Failed to load class "org.slf4j.impl.StaticLoggerBinder"`; you can safely ignore them.
Further troubleshooting advice can be found at the bottom of the [INSTALL](INSTALL.md) file.

### 1.3 View the results in the [results][resultsdir] directory
All raw results are stored in the [results][resultsdir] directory.
The aggregated results can be found in the following files.
(Note that the links to the results only have a target _after_ running the replication or verification.)
The results consist of general information about the analysed repositories as well as CSV files with entries for every patch analysed, i.e., every source code file changed in a commit.
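After a completed run, the raw result files can be inspected from the repository root, for example (a minimal sketch; the exact file names depend on the analysed datasets):

```shell
# Recursively list everything the run produced in the results directory,
# including per-repository metadata and the per-patch CSV files.
ls -R results
```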



## 2. Dataset Overview
We provide an overview of the 44 open-source preprocessor-based software product lines used in our evaluation in the [docs/datasets.md][dataset] file, taken from the original [_DiffDetective_](https://github.com/VariantSync/DiffDetective) implementation.


## 3. Running DiffDetective on Custom Datasets
You can also run DiffDetective on other datasets by providing the path to the dataset file as first argument to the execution script:

#### Windows:
`.\execute.bat path\to\custom\dataset.md`
#### Linux/Mac (bash):
`./execute.sh path/to/custom/dataset.md`

The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](docs/datasets.md) in the [docs](docs) folder.

[dataset]: docs/datasets.md


[resultsdir]: results

To replicate the evaluation, run the `main` method in `relationshipedges/Validation`
4 changes: 0 additions & 4 deletions REQUIREMENTS.md
@@ -11,7 +11,3 @@ To run DiffDetective, JDK16, and Maven are required.
Dependencies to other packages are documented in the maven build file ([pom.xml](pom.xml)) and are handled automatically by Maven.
Alternatively, the docker container can be used on any system supporting docker.
Docker will take care of all requirements and dependencies to replicate our validation.

The requirements to build our `proofs` Haskell library are documented in its respective [proofs/REQUIREMENTS.md](proofs/REQUIREMENTS.md) file.

[stack]: https://docs.haskellstack.org/en/stable/README/
2 changes: 0 additions & 2 deletions build.bat

This file was deleted.

2 changes: 0 additions & 2 deletions build.sh

This file was deleted.

8 changes: 4 additions & 4 deletions docker/execute.sh
@@ -15,10 +15,10 @@ cd /home/sherlock || exit

if [ "$1" == 'replication' ]; then
echo "Running full replication. Depending on your system, this will require several hours or even a few days."
java -cp DiffDetective.jar org.variantsync.diffdetective.validation.Validation
java -cp DiffDetective.jar org.variantsync.diffdetective.relationshipedges.Validation
elif [ "$1" == 'verification' ]; then
echo "Running a short verification."
java -cp DiffDetective.jar org.variantsync.diffdetective.validation.Validation docs/verification/datasets.md
java -cp DiffDetective.jar org.variantsync.diffdetective.relationshipedges.Validation docs/verification/datasets.md
else
echo ""
echo "Running detection on a custom dataset with the input file $1"
@@ -27,7 +27,7 @@ else
fi
echo "Collecting results."
cp -r results/* ../results/
java -cp DiffDetective.jar org.variantsync.diffdetective.validation.FindMedianCommitTime ../results/validation/current
java -cp DiffDetective.jar org.variantsync.diffdetective.tablegen.MiningResultAccumulator ../results/validation/current ../results/validation/current
# java -cp DiffDetective.jar org.variantsync.diffdetective.validation.FindMedianCommitTime ../results/validation/current
# java -cp DiffDetective.jar org.variantsync.diffdetective.tablegen.MiningResultAccumulator ../results/validation/current ../results/validation/current
echo "The results are located in the 'results' directory."

8 changes: 4 additions & 4 deletions Dockerfile → replication/splc23-explanations/Dockerfile
@@ -26,12 +26,12 @@ RUN apk add maven
# REQUIRED: Create and navigate to a working directory
WORKDIR /home/user

COPY local-maven-repo ./local-maven-repo
COPY ../../local-maven-repo ./local-maven-repo

# EXAMPLE: Copy the source code
COPY src ./src
COPY ../../src ./src
# EXAMPLE: Copy the pom.xml if Maven is used
COPY pom.xml .
COPY ../../pom.xml .
# EXAMPLE: Execute the maven package process
RUN mvn package || exit

@@ -53,7 +53,7 @@ COPY --from=0 /home/user/target/diffdetective-1.0.0-jar-with-dependencies.jar ./
WORKDIR /home/sherlock

# Copy the setup
COPY docs holmes/docs
COPY ../../docs holmes/docs

# Copy the docker resources
COPY docker/* ./
17 changes: 17 additions & 0 deletions replication/splc23-explanations/build.bat
@@ -0,0 +1,17 @@
@echo off
setlocal

set "targetSubPath=splc23-explanations"

rem Get the current directory
for %%A in ("%CD%") do set "currentDir=%%~nxA"

rem Check if the current directory ends with the target sub-path
if "%currentDir:~-19%"=="%targetSubPath%" (
cd ..\..
docker build -t diff-detective -f replication\splc23-explanations\Dockerfile .
@pause
) else (
echo error: the script must be run from inside the splc23-explanations directory, i.e., DiffDetective\replication\%targetSubPath%
)
endlocal
15 changes: 15 additions & 0 deletions replication/splc23-explanations/build.sh
@@ -0,0 +1,15 @@
#! /bin/bash
# Assure that the script is only called from the splc23-explanations folder
current_dir=$(pwd)
expected_path="/replication/splc23-explanations"
if [[ ! $current_dir =~ $expected_path ]]; then
echo "error: the script must be run from inside the splc23-explanations directory, i.e., DiffDetective$expected_path"
exit 1
fi

# We have to switch to the root directory of the project and build the Docker image from there,
# because Docker only allows access to the files in the current file system subtree (i.e., no access to ancestors).
# We have to do this to get access to 'src', 'docker', 'local-maven-repo', etc.
cd ../..

docker build -t diff-detective -f replication/splc23-explanations/Dockerfile .
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -29,7 +29,7 @@
import java.util.stream.Collectors;

/**
* This is the validation for the paper.
* This is the validation for the paper Explaining Edits to Variability Annotations in Evolving Software Product Lines.
* It provides all configuration settings and facilities to setup the validation by
* creating a {@link HistoryAnalysis} and run it.
*/
