Methodology

We used the Bio.Entrez package (part of Biopython) in Python 3 to query, search, and fetch the metadata of randomized controlled trial (RCT) studies in PubMed (search period: 2010 to February 2020; the protocol of the systematic review has been published at https://www.sciencedirect.com/science/article/abs/pii/S1087079221000307). Three BERT models (DistilBERT, BioBERT, and SciBERT) were used to classify the title and abstract via PyTorch. We manually labelled the texts by reading the abstracts. After diagnosing the wrong predictions, we built a stacked model whose features are the probability predicted by DistilBERT and keywords from the search domain (complementary and alternative medicine). For studies labelled 1 (positive) based on the abstract, the full texts in PDF format were fetched from PubMed Central when available. A Haystack question-answering pipeline (https://github.com/deepset-ai/haystack/#tutorials) was then fine-tuned and applied to the preprocessed full text to extract key information for further article screening.
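The PubMed search step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the topic terms, the helper names `build_pubmed_term`, `search_pubmed`, and `fetch_records`, the exact start and end days of the date range, and the `retmax` value are all assumptions. The Biopython import is deferred so that the query builder can be used without a network connection.

```python
def build_pubmed_term(topic_terms):
    """Combine topic terms (hypothetical examples) with an RCT publication-type filter."""
    topic = " OR ".join(f'"{t}"[Title/Abstract]' for t in topic_terms)
    return f"({topic}) AND randomized controlled trial[Publication Type]"


def search_pubmed(term, email, retmax=100):
    """Return PubMed IDs matching `term`, published 2010 to February 2020."""
    from Bio import Entrez  # requires Biopython and network access

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.esearch(
        db="pubmed",
        term=term,
        datetype="pdat",       # filter on publication date
        mindate="2010/01/01",  # assumed first day of the search period
        maxdate="2020/02/29",  # assumed last day of the search period
        retmax=retmax,
    )
    record = Entrez.read(handle)
    handle.close()
    return list(record["IdList"])


def fetch_records(pmids, email):
    """Fetch the parsed XML records (title, abstract, etc.) for a list of PubMed IDs."""
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.efetch(db="pubmed", id=",".join(pmids), retmode="xml")
    records = Entrez.read(handle)
    handle.close()
    return records
```

For example, `build_pubmed_term(["acupuncture", "yoga"])` produces a term that can be passed to `search_pubmed` together with a contact email address.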

pipeline

flowchart

Stacked Model Design (by Salash)
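One way to picture the stacked model's input is a feature row that combines the DistilBERT probability with keyword indicators. The sketch below is an assumption-laden illustration: the keyword list `CAM_KEYWORDS` and the function names are hypothetical, and the repository does not state here which second-stage classifier consumes these features.

```python
import re

# Hypothetical keywords for the complementary and alternative medicine domain.
CAM_KEYWORDS = ("acupuncture", "yoga", "mindfulness")


def keyword_features(text, keywords=CAM_KEYWORDS):
    """Return one 0/1 indicator per keyword, matched as a whole word, case-insensitively."""
    lowered = text.lower()
    return [
        1 if re.search(rf"\b{re.escape(keyword)}\b", lowered) else 0
        for keyword in keywords
    ]


def stacked_features(bert_probability, text, keywords=CAM_KEYWORDS):
    """Build a feature row: the first-stage probability followed by keyword indicators."""
    return [float(bert_probability)] + keyword_features(text, keywords)
```

Rows built this way, one per abstract, could then be passed to any second-stage classifier (for instance a logistic regression) trained on the manual labels.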


Xiaowen-JI/Semi-automation-of-systematic-review-of-clinical-trials-in-medical-psychology-with-BERT-models

Folders and files

Latest commit

History

Repository files navigation

Methodology

pipeline

flowchart

Stacked Model Design (by Salash)

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages