energy-influence-model-structure-uncertainty

GP4P - Gaussian Process Regression for Performance Prediction

Performance optimization for configurable software systems is not only an im- portant aspect of software engineering, it restricts how the software is being used on the base of performance evaluations of its configuration options. The lack of domain knowledge in a black-box scenario induces uncertainty in the de- cision making process, which is why there is a risk that a certain configuration exceeds the user’s budget unexpectedly.

To estimate the model structure uncertainty in performance influence mod- els to identify synergistic effects among selected options, this thesis provides the methodology of Gaussian Process Regression, which is the foundation of the Bayesian Optimization Algorithm. The key contribution is the implemen- tation of a variety of different Gaussian process models with shift invariant as well as additive structured kernels that are estimated using exact infer- ence of the marginal log likelihood or the state of the art NUTS sampler for approximate posterior evaluations.

With the presented methods of the Bayesian Approximate Kernel Regres- sion and the distance measure groupRATE, the provided application can iden- tify interactions among configuration options to choose from in a variable se- lection procedure as a set of constraints to state a dynamic program for finding a best subset selection

Installation

prerequisites
- Python 3.9.6 or higher
- pip or
- conda In the root directory of the project, run the following command to install the required packages:

pip install -r requirements.txt

or

conda install --file requirements.txt

or using the setup.py

python setup.py install

Usage

This project follows the Clean Architecture design principles. The adapters directory contains third party module implementations. The domain directory contains the domain logic of the application. Most importantly, domain/env.py contains the environment variables that are used throughout the application. See the description there how to adjust them. The application directory contains the use cases of the application. To reproduce the data in the thesis, run the following command:

python application/thesis_pipeline.py

or use application/gpytorch_pipeline.py to train single models using exact inference. application/fully_bayesian_pipeline.py contains the implementation of the fully Bayesian model using the NUTS sampler.

For inference, the application/model_inference.pyprovide insights in the model, especially the BAKR methodology and the groupRATE distance measure. All yet trained models are stored under the modeldata directory. Just change the file_name in the model_inference.py for the desired model. A main contribution is the Jupyter notebook for an Exploratory Data Analysis on the modeldata. This can be found unter notebooks/EDA_walkthrough.ipynb. The application/eval_training.py provides the evaluation of the models using different score metrics.

License

The project is published under the GNU General Public License v3.0. If you need to use the project under a different license, please contact the author.

Project status

The project is still under development. The current version is a prototype that is used to evaluate the performance of the Gaussian Process models, while the methods for inference and evaluation focusses on MCMC models, but work partially also for exact inference models.

Name		Name	Last commit message	Last commit date
Latest commit History 472 Commits
.ipynb_checkpoints		.ipynb_checkpoints
adapters		adapters
application		application
domain		domain
evaluation		evaluation
modeldata		modeldata
rawdata		rawdata
uncertainty_modelling.egg-info		uncertainty_modelling.egg-info
.gitignore		.gitignore
EDA_walkthrough.ipynb		EDA_walkthrough.ipynb
LICENSE		LICENSE
Master_Thesis_FINAL.pdf		Master_Thesis_FINAL.pdf
README.md		README.md
connect_tpu.sh		connect_tpu.sh
requirements.txt		requirements.txt
requirements_conda.yml		requirements_conda.yml
requirements_conda_alt.txt		requirements_conda_alt.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

energy-influence-model-structure-uncertainty

GP4P - Gaussian Process Regression for Performance Prediction

Installation

Usage

License

Project status

About

Releases

Packages

Languages

License

iammteeh/GP4P

Folders and files

Latest commit

History

Repository files navigation

energy-influence-model-structure-uncertainty

GP4P - Gaussian Process Regression for Performance Prediction

Installation

Usage

License

Project status

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages