This repository accompanies the thesis "An approach for interpreting the BERT for sequence classification model with the use of Random Forest" by Angela Puc.
Here you can find the Google Colab notebooks used for the thesis, in the following order:
- preprocessing-for-bert-yelp-dataset
- categories-yelp-dataset (categories_Yelp_KMeans is an experiment for this step)
- BERT_training (10epochs-training_BERT and BERT_epochs_analysis are complementary notebooks for this step)
- BERT_yelp_evaluation
- RF_input_creation
- RF_mimic_BERT_grid
- BERT_features_analysis (analysis_subsamples and feature_contributions_analysis are complementary notebooks for this step)
The data obtained and used for the analysis can be found here: https://drive.google.com/drive/folders/1BRTsgsweYI646d3bunB3DbLyVt-q4tpb?usp=sharing