Root Storage of Deep Learning Models in TMVA

This project was a part of Google Summer of Code 2021 under the organization CERN-HSF
Link to Project Page

Project Details


Student's Name	Sanjiban Sengupta
Mentors	Lorenzo Moneta, Sitong An, Anirudh Dagar
Organization	Root-Project (CERN-HSF)
Organization Code Repository	https://github.com/root-project/root
Final Report	https://github.com/sanjibansg/GSoC21-RootStorage/wiki
Code Implementations	https://github.com/root-project/root/pulls?q=author:sanjibansg
Project Proposal	https://docs.google.com/document/d/1MVKpGP9lr0tUhrxB59nrNlZfAtnO_Dgkx8ddw1k26Yk/edit?usp=sharing
Documentation Blog	https://blog.sanjiban.ml/series/gsoc

About Project

The Toolkit for Multivariate Data Analysis (TMVA) is a sub-module of ROOT that provides a machine learning environment for training, testing, and evaluating multivariate methods, particularly those used in High Energy Physics. The TMVA team recently introduced SOFIE (System for Optimized Fast Inference code Emit), which defines its own intermediate representation of deep learning models following the ONNX standard. To make these models easier to use, store, and exchange, this project developed functionality for storing deep learning models in the `.root` format, which is widely used in the High Energy Physics community.

Project Contents

Functionality for serializing an RModel so that a trained deep learning model can be stored in `.root` format.
Functionality for parsing a Keras `.h5` file into an RModel object for generation of inference code.
Functionality for parsing a PyTorch `.pt` file into an RModel object for generation of inference code.
Tests, tutorials, and documentation for the various parsers of TMVA SOFIE's RModel object.
Functionality for an intermediate representation of BDT models and for parsing TMVA-trained BDT models.

Tech Stack

Languages: C/C++, Python
Deep Learning Libraries: Keras, PyTorch
API: Python/C API (CPython)
Build: CMake
Tests: GTest Framework
Documentation: Doxygen

Installation

Installation steps for building ROOT from source can be found here:

https://root.cern/install/build_from_source/

Alternatively, use the provided install.sh, which builds the repository directly and merges in the implemented code files:

git clone https://github.com/sanjibansg/GSoC21-RootStorage.git
cd GSoC21-RootStorage
./install.sh

Interface

Serialization of RModel

// Writing a ROOT file
using namespace TMVA::Experimental;
TFile file("model.root", "CREATE");
SOFIE::RModel model = SOFIE::PyKeras::Parse("trained_model_dense.h5");
model.Write("model");
file.Close();

// Reading a ROOT file
using namespace TMVA::Experimental;
TFile file("model.root", "READ");
SOFIE::RModel *model = nullptr;  // initialize so a failed lookup can be detected
file.GetObject("model", model);
file.Close();

Keras Converter for RModel

// Parser returns an RModel object
using namespace TMVA::Experimental::SOFIE;
RModel model = PyKeras::Parse("trained_model_dense.h5");

// Converter writes a ROOT file directly
PyKeras::ConvertToRoot("trained_model_dense.h5");
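
To make "generation of inference code" more concrete: for a dense model like the one parsed above, inference comes down to evaluating each layer as activation(input · kernel + bias). The following is a minimal hand-written sketch of one such layer in plain C++. It is not the code SOFIE emits; the function name `DenseRelu` and the choice of ReLU as activation are assumptions made for illustration. It assumes the kernel is stored row-major with shape (inputs, units), which is how Keras lays out Dense weights.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for emitted inference code: one Dense layer with ReLU.
// `kernel` is row-major with shape (input.size(), bias.size()).
std::vector<float> DenseRelu(const std::vector<float>& input,
                             const std::vector<float>& kernel,
                             const std::vector<float>& bias) {
    const std::size_t nIn = input.size();
    const std::size_t nOut = bias.size();
    std::vector<float> out(bias);  // start from the bias term
    for (std::size_t i = 0; i < nIn; ++i) {
        for (std::size_t j = 0; j < nOut; ++j) {
            out[j] += input[i] * kernel[i * nOut + j];
        }
    }
    // ReLU activation: clamp negative values to zero.
    for (float& v : out) {
        v = std::max(v, 0.0f);
    }
    return out;
}
```

Calling `DenseRelu` with a two-element input, a 2x2 kernel, and a two-element bias returns a two-element output vector; stacking such calls layer by layer gives the forward pass of a small dense network.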

PyTorch Converter for RModel

// Parser returns an RModel object
using namespace TMVA::Experimental::SOFIE;

// Build the vector of input shapes (one inner vector per model input)
std::vector<size_t> s1{120, 1};
std::vector<std::vector<size_t>> inputShape{s1};
RModel model = PyTorch::Parse("trained_model_dense.pt", inputShape);

// Converter writes a ROOT file directly, reusing the same input shapes
PyTorch::ConvertToRoot("trained_model_dense.pt", inputShape);
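
The PyTorch calls above take the input shapes as a nested vector. As a small sketch of how that structure is laid out, the plain C++ below builds the same single-input shape list and computes how many elements a tensor of a given shape holds. Both helper names (`MakeInputShapes`, `ElementCount`) are hypothetical and not part of SOFIE; the reading that each inner vector describes one model input is an assumption based on the snippet above.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of SOFIE): number of elements in one tensor shape.
std::size_t ElementCount(const std::vector<std::size_t>& shape) {
    return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                           std::multiplies<std::size_t>());
}

// Hypothetical helper: the shape list used above, with one inner vector
// per model input (here a single input of shape (120, 1)).
std::vector<std::vector<std::size_t>> MakeInputShapes() {
    std::vector<std::size_t> s1{120, 1};
    return {s1};
}
```

For a model with several inputs, the outer vector would presumably hold one shape per input, in the order the model expects them.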

Root Storage of BDT

// Parser loads the BDT model from a .xml weights file into a RootStorage::BDT object
TMVA::Experimental::RootStorage::BDT model;
bool usePurity = true;
model.Parse("TMVA_CNN_Classification_BDT.weights.xml",usePurity);
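
The `usePurity` flag is not explained further here. In TMVA's BDT method, a leaf can report either its signal purity S/(S+B) or a hard signal/background label, so the sketch below assumes the flag selects between those two. The `Node` layout and the functions `EvaluateTree` and `MakeExampleTree` are hypothetical, written only to illustrate how a single stored tree could be evaluated; they do not reflect the actual RootStorage::BDT layout.

```cpp
#include <vector>

// Hypothetical node layout for illustration only.
struct Node {
    int   feature;  // index of the input variable tested here, -1 for a leaf
    float cut;      // threshold applied to that variable
    int   left;     // child index taken when x[feature] < cut
    int   right;    // child index taken otherwise
    float purity;   // S/(S+B) of the node, meaningful for leaves
};

// Walk one tree from the root (index 0) down to a leaf and return its response.
float EvaluateTree(const std::vector<Node>& tree,
                   const std::vector<float>& x,
                   bool usePurity) {
    int i = 0;
    while (tree[i].feature >= 0) {
        i = (x[tree[i].feature] < tree[i].cut) ? tree[i].left : tree[i].right;
    }
    if (usePurity) {
        return tree[i].purity;                       // continuous response
    }
    return tree[i].purity > 0.5f ? 1.0f : -1.0f;     // hard signal/background label
}

// A depth-one example tree: split on variable 0 at 0.5.
std::vector<Node> MakeExampleTree() {
    return {
        {0, 0.5f, 1, 2, 0.0f},
        {-1, 0.0f, -1, -1, 0.25f},
        {-1, 0.0f, -1, -1, 0.75f},
    };
}
```

A full BDT response would combine many such trees, typically as a weighted sum of the per-tree outputs.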

Future Plan

Development of Root Storage of BDT
- Develop the mapping interface for inference code generation from the class RootStorage::BDT
- Research the conversion of scikit-learn based BDT models to the class RootStorage::BDT for subsequent inference
- Add tests and tutorials for BDT
Add support for converting convolution layers from Keras and PyTorch models.

Contributions

To report bugs or request new features, open an issue here.
