Thoth Performance Datasets

Thoth Performance Datasets are created with Amun, one of Thoth's components. Amun acts as an execution engine for Thoth: applications are built and tested there using Thoth Performance Indicators (PI). Amun can be scheduled through another Thoth component called Dependency Monkey, which aims to automatically verify software stacks and aggregate relevant observations. Thoth Performance Datasets contain performance tests of software stacks for different types of applications (e.g., Machine Learning).

Thoth Performance Dataset v2.0

This dataset consists of ~3300 files in JSON format (~23 MB once extracted) and is described in the notebook called Thoth Performance Dataset.

This notebook shows the structure of inspections and what information can be found by analyzing several of them.
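As a starting point for exploring the extracted files outside the notebook, the following is a minimal sketch that loads every JSON file in a directory and counts which top-level keys occur. The helper names (load_inspections, top_level_key_counts) and the directory layout are assumptions made for illustration; the actual inspection schema is the one documented in the notebook.

```python
import json
from collections import Counter
from pathlib import Path


def load_inspections(dataset_dir):
    """Load every *.json file under dataset_dir, keyed by its relative path.

    Hypothetical helper: it assumes only that the dataset was extracted
    into a directory of JSON files.
    """
    root = Path(dataset_dir)
    documents = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            documents[str(path.relative_to(root))] = json.load(handle)
    return documents


def top_level_key_counts(documents):
    """Count how often each top-level key appears across the documents."""
    counts = Counter()
    for document in documents.values():
        # Skip files whose top-level value is not a JSON object.
        if isinstance(document, dict):
            counts.update(document.keys())
    return counts
```

Calling top_level_key_counts(load_inspections(path_to_extracted_dataset)) gives a quick view of which fields are present in all files and which appear only in some.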

Thoth TensorFlow==2.1.0 Stack combinations

This dataset consists of ~295 inspection reports in JSON format (~39 MB once extracted) and is described in the notebook called Performance of TensorFlow Software stack combinations.

This notebook shows how Thoth can use Dependency Monkey and Amun to create all possible combinations of a software stack for a given package (Dependency Monkey Zoo), and how it can identify errors and performance differences across stacks.
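To illustrate what comparing performance across stacks can look like, here is a small sketch that ranks stacks by median elapsed time. The input shape (a mapping from a stack identifier to a list of timings in seconds) and the function name rank_stacks are assumptions for illustration, not the format used by the dataset or the notebook.

```python
from statistics import median


def rank_stacks(timings):
    """Return (stack_id, median_seconds) pairs, fastest stack first.

    `timings` is a hypothetical mapping: stack id -> list of elapsed times.
    Stacks without any measurement are left out of the ranking.
    """
    ranked = [
        (stack_id, median(values))
        for stack_id, values in timings.items()
        if values
    ]
    return sorted(ranked, key=lambda item: item[1])
```

The median is used here rather than the mean so that a single slow run does not dominate the ranking; either choice is a judgment call.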

Some of the results you can find:

[Figure: Performance TensorFlow==2.1.0 Stack Combinations]

If you want to know more just run this notebook!

Thoth TensorFlow==2.1.0 Stack combinations errors discovery

This dataset consists of ~823 files in JSON format (~12.5 MB once extracted) and is described in the notebook called Performance of TensorFlow Software stack combinations.

The request for this analysis can be found here, while the inputs used to create the dataset for this analysis can be found here.

This notebook is a continuation of the work done for Performance of TensorFlow Software stack combinations. In this case, we could discover packages that do not allow TensorFlow 2.1.0 to run, so that new advice can be created for users who rely on Thoth.
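One simple way to look for such packages is to list the package versions that appear only in failing stacks. The sketch below assumes a hypothetical record shape, where each record has a "packages" mapping (name to version) and a "failed" flag; the real inspection files would need to be mapped into this shape first, and the notebook may use a different approach.

```python
from collections import defaultdict


def suspect_packages(records):
    """Return (name, version) pairs that occur only in failed stacks.

    Each record is a hypothetical dict of the form
    {"packages": {name: version, ...}, "failed": bool}.
    """
    seen_in = defaultdict(int)    # stacks containing the package version
    failed_in = defaultdict(int)  # failed stacks containing it
    for record in records:
        for name, version in record["packages"].items():
            key = (name, version)
            seen_in[key] += 1
            if record["failed"]:
                failed_in[key] += 1
    return sorted(
        key for key, total in seen_in.items() if failed_in[key] == total
    )
```

A package version flagged this way is only a candidate: it may simply co-occur with the real culprit, so the result is best read as a list of stacks worth inspecting more closely.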

Some of the results you can find:

[Figure: TensorFlow==2.1.0 Stack Combinations Errors]

If you want to know more just run this notebook!

Inspections Analysis 2021-02-09

This dataset consists of ~6081 files in JSON format (~102.2 MB once extracted) and is described in the notebook called AmunInspectionAnalysis2021-02-09.

You can find the reference configuration to recreate the dataset in the Dependency Monkey zoo using this link.

Some of the results you can find:

[Figure: TensorFlow==2.4.0 Stack Performances]

If you want to know more just run this notebook!