GraSH: Successive Halving for Knowledge Graphs

This is the code and configuration accompanying the paper "Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings", presented at ECML-PKDD 2022. The code extends Dist-KGE, a knowledge graph embedding library for distributed training. For documentation on Dist-KGE, refer to the Dist-KGE repository. The hyperparameter settings for the searches and for the finally selected trials are provided in /examples/experiments/.

UPDATE: GraSH was recently merged into our main library LibKGE. All configs from this repository, except the ones for Freebase that require distributed training, can be executed in LibKGE. Please use LibKGE for your own experiments with GraSH.

Quick start

Setup

# retrieve and install project in development mode
git clone https://github.com/uma-pi1/grash.git
cd grash
pip install -e .

# download and preprocess datasets
cd data
sh download_all.sh
cd ..

Training

# train an example model on a toy dataset (you can omit '--job.device cpu' when you have a gpu)
python -m kge start examples/toy-complex-train.yaml --job.device cpu

This example trains a model on a toy dataset in a sequential setup on CPU.

GraSH Hyperparameter Search

# perform a search with GraSH on a toy dataset (you can omit '--job.device cpu' when you have a gpu)
python -m kge start examples/toy-complex-search-grash.yaml --job.device cpu

This example performs a small GraSH search with 16 trials on a toy dataset in a sequential setup on CPU.

Configuration of GraSH Search

The most important configuration options for a hyperparameter search with GraSH are:

dataset:
  name: yago3-10
grash_search:
  eta: 4
  num_trials: 64
  search_budget: 3
  variant: combined
  parameters: # define your search space here
job:
  type: search
model: complex
train:
  max_epochs: 400

eta defines the reduction factor during the search. In each round, the number of remaining trials is reduced to 1/eta of the previous round.
search_budget is defined in "number of full training runs". The default choice search_budget=3, for example, corresponds to an overall search cost of three full training runs.
variant controls which reduction technique to use (epoch only, graph only, or combined).
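To make the interplay of num_trials, eta, and search_budget concrete, the following is a minimal sketch of a successive-halving schedule. It assumes that the search budget is split evenly across rounds and evenly across the trials within a round; the function name grash_schedule is hypothetical and not part of the library.

```python
def grash_schedule(num_trials, eta, search_budget):
    """Return a list of (trials_in_round, budget_per_trial) tuples.

    Budgets are expressed as fractions of one full training run.
    Assumption: the budget is divided evenly across rounds and trials.
    """
    if num_trials < 2 or eta < 2:
        raise ValueError("need num_trials >= 2 and eta >= 2")

    # Shrink the number of trials by a factor of eta until one remains.
    round_sizes = []
    remaining = num_trials
    while remaining > 1:
        round_sizes.append(remaining)
        remaining = max(1, remaining // eta)

    # Each round receives the same share of the overall budget.
    budget_per_round = search_budget / len(round_sizes)
    return [(n, budget_per_round / n) for n in round_sizes]


if __name__ == "__main__":
    for trials, budget in grash_schedule(num_trials=64, eta=4, search_budget=3):
        print(f"{trials:3d} trials, {budget:.6f} full runs per trial")
```

Under these assumptions, the default settings give three rounds with 64, 16, and 4 trials. If cost were reduced through epochs alone and were proportional to the number of epochs, a first-round trial with max_epochs: 400 would train for about 6 epochs.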

Run a GraSH hyperparameter search

Run the default search on yago3-10 with the following command:

python -m kge start examples/experiments/search_configs/yago3-10/search-complex-yago-combined.yaml

The k-core subgraphs will automatically be generated and saved to data/yago3-10/subsets/k-core/. By default, each experiment will create a new folder in local/experiments/<timestamp>-<config-name> where the results can be found.
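For readers unfamiliar with k-cores, the following sketch illustrates the general idea: entities with fewer than k neighbors are removed repeatedly until every remaining entity has at least k neighbors. It is an illustration only, not the code used in this repository. The function name k_core_triples is hypothetical, and counting degree as the number of distinct neighbors in the undirected graph is an assumption.

```python
from collections import defaultdict


def k_core_triples(triples, k):
    """Keep the triples whose subject and object both lie in the k-core.

    triples: list of (subject, predicate, object) tuples.
    Degree is the number of distinct neighbors, ignoring edge direction
    and self-loops (an assumption made for this sketch).
    """
    neighbors = defaultdict(set)
    for s, _, o in triples:
        if s != o:
            neighbors[s].add(o)
            neighbors[o].add(s)

    # Peel entities whose degree is below k; removals may cascade.
    queue = [e for e, nb in neighbors.items() if len(nb) < k]
    removed = set()
    while queue:
        entity = queue.pop()
        if entity in removed:
            continue
        removed.add(entity)
        for nb in neighbors[entity]:
            if nb not in removed:
                neighbors[nb].discard(entity)
                if len(neighbors[nb]) < k:
                    queue.append(nb)

    alive = set(neighbors) - removed
    return [(s, p, o) for s, p, o in triples if s in alive and o in alive]


if __name__ == "__main__":
    toy = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "a"), ("a", "r", "d")]
    print(k_core_triples(toy, k=2))
```

In the toy graph, entity d has only one neighbor, so the 2-core keeps just the triangle formed by a, b, and c.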

Results and Configurations

All results were obtained with the GraSH default settings (num_trials=64, eta=4, search_budget=3, variant=combined).

Yago3-10

Model	Variant	MRR	Hits@1	Hits@10	Hits@100	config
ComplEx	Epoch	0.536	0.460	0.672	0.601	config
ComplEx	Graph	0.463	0.375	0.634	0.800	config
ComplEx	Combined	0.528	0.455	0.660	0.772	config
RotatE	Epoch	0.432	0.337	0.619	0.768	config
RotatE	Graph	0.432	0.337	0.619	0.768	config
RotatE	Combined	0.434	0.342	0.607	0.742	config
TransE	Epoch	0.499	0.406	0.661	0.794	config
TransE	Graph	0.422	0.311	0.628	0.802	config
TransE	Combined	0.499	0.406	0.661	0.794	config

Wikidata5M

Model	Variant	MRR	Hits@1	Hits@10	Hits@100	config
ComplEx	Epoch	0.300	0.247	0.390	0.506	config
ComplEx	Graph	0.300	0.247	0.390	0.506	config
ComplEx	Combined	0.300	0.247	0.390	0.506	config
RotatE	Epoch	0.241	0.187	0.331	0.438	config
RotatE	Graph	0.232	0.169	0.326	0.432	config
RotatE	Combined	0.241	0.187	0.331	0.438	config
TransE	Epoch	0.263	0.210	0.358	0.483	config
TransE	Graph	0.263	0.210	0.358	0.483	config
TransE	Combined	0.268	0.213	0.363	0.480	config

Freebase

Model	Variant	MRR	Hits@1	Hits@10	Hits@100	config
ComplEx	Epoch	0.572	0.486	0.714	0.762	config
ComplEx	Graph	0.594	0.511	0.726	0.767	config
ComplEx	Combined	0.594	0.511	0.726	0.767	config
RotatE	Epoch	0.561	0.522	0.625	0.679	config
RotatE	Graph	0.613	0.578	0.669	0.719	config
RotatE	Combined	0.613	0.578	0.669	0.719	config
TransE	Epoch	0.261	0.078	0.518	0.636	config
TransE	Graph	0.553	0.520	0.614	0.682	config
TransE	Combined	0.553	0.520	0.614	0.682	config

How to cite

@inproceedings{kochsiek2022start,
  title={Start Small, Think Big: On Hyperparameter Optimization for Large-Scale Knowledge Graph Embeddings},
  author={Kochsiek, Adrian and Niesel, Fritz and Gemulla, Rainer},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  year={2022}
}
