We provide a range of models pre-trained on different datasets. The example below loads one by its config path and label map:
```python
import layoutparser as lp

model = lp.Detectron2LayoutModel(
    config_path="lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config",  # from the model table below
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # from the label map table below
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # optional
)
model.detect(image)  # `image` is a document page image loaded beforehand
```
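The `extra_config` argument in the example is a flat list holding a key followed by its value. A minimal sketch, assuming additional overrides follow the same alternating key/value layout and that you would rather write them as a dictionary; `to_extra_config` is a hypothetical helper, not part of layoutparser:

```python
from typing import Any, Dict, List


def to_extra_config(overrides: Dict[str, Any]) -> List[Any]:
    """Flatten {key: value} overrides into [key1, value1, key2, value2, ...].

    Hypothetical helper: assumes extra_config accepts alternating
    key/value entries, as in the single-pair example above.
    """
    flat: List[Any] = []
    for key, value in overrides.items():
        flat.extend([key, value])
    return flat


extra_config = to_extra_config({"MODEL.ROI_HEADS.SCORE_THRESH_TEST": 0.8})
print(extra_config)  # ['MODEL.ROI_HEADS.SCORE_THRESH_TEST', 0.8]
```

The resulting list can be passed as `extra_config=...` when constructing the model.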
Dataset | Model | Config Path | Eval Result (mAP) |
---|---|---|---|
HJDataset | faster_rcnn_R_50_FPN_3x | lp://HJDataset/faster_rcnn_R_50_FPN_3x/config | |
HJDataset | mask_rcnn_R_50_FPN_3x | lp://HJDataset/mask_rcnn_R_50_FPN_3x/config | |
HJDataset | retinanet_R_50_FPN_3x | lp://HJDataset/retinanet_R_50_FPN_3x/config | |
PubLayNet | faster_rcnn_R_50_FPN_3x | lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config | |
PubLayNet | mask_rcnn_R_50_FPN_3x | lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config | |
PubLayNet | mask_rcnn_X_101_32x8d_FPN_3x | lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config | 88.98 |
PrimaLayout | mask_rcnn_R_50_FPN_3x | lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config | 69.35 |
NewspaperNavigator | faster_rcnn_R_50_FPN_3x | lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config | |
TableBank | faster_rcnn_R_50_FPN_3x | lp://TableBank/faster_rcnn_R_50_FPN_3x/config | 89.78 |
TableBank | faster_rcnn_R_101_FPN_3x | lp://TableBank/faster_rcnn_R_101_FPN_3x/config | 91.26 |
Math Formula Detection (MFD) | faster_rcnn_R_50_FPN_3x | lp://MFD/faster_rcnn_R_50_FPN_3x/config | 79.68 |
- For PubLayNet models, we suggest using the `mask_rcnn_X_101_32x8d_FPN_3x` model: it is trained on the whole training set, while the others are trained only on the validation set (around 1/50 of the size). You can expect a 15% AP improvement with the `mask_rcnn_X_101_32x8d_FPN_3x` model.
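When several models exist for a dataset, the reported mAP values in the table can guide the choice. The following is a rough sketch that picks the model with the highest reported score and builds its config path; `REPORTED_MAP` and `best_config_path` are hypothetical names for illustration, and models without a reported result are simply left out:

```python
# Reported mAP values copied from the model table; models with an empty
# "Eval Result" cell are omitted. Hypothetical structure, not a layoutparser API.
REPORTED_MAP = {
    ("PubLayNet", "mask_rcnn_X_101_32x8d_FPN_3x"): 88.98,
    ("PrimaLayout", "mask_rcnn_R_50_FPN_3x"): 69.35,
    ("TableBank", "faster_rcnn_R_50_FPN_3x"): 89.78,
    ("TableBank", "faster_rcnn_R_101_FPN_3x"): 91.26,
    ("MFD", "faster_rcnn_R_50_FPN_3x"): 79.68,
}


def best_config_path(dataset: str) -> str:
    """Return the lp:// config path of the best-scoring model for a dataset."""
    candidates = {
        model: score
        for (name, model), score in REPORTED_MAP.items()
        if name == dataset
    }
    if not candidates:
        raise KeyError(f"no reported mAP for dataset {dataset!r}")
    best_model = max(candidates, key=candidates.get)
    return f"lp://{dataset}/{best_model}/config"


print(best_config_path("TableBank"))  # lp://TableBank/faster_rcnn_R_101_FPN_3x/config
```

Datasets such as HJDataset, which have no reported result in the table, raise a `KeyError` here rather than returning an arbitrary model.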
Dataset | Label Map |
---|---|
HJDataset | `{1: "Page Frame", 2: "Row", 3: "Title Region", 4: "Text Region", 5: "Title", 6: "Subtitle", 7: "Other"}` |
PubLayNet | `{0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}` |
PrimaLayout | `{1: "TextRegion", 2: "ImageRegion", 3: "TableRegion", 4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"}` |
NewspaperNavigator | `{0: "Photograph", 1: "Illustration", 2: "Map", 3: "Comics/Cartoon", 4: "Editorial Cartoon", 5: "Headline", 6: "Advertisement"}` |
TableBank | `{0: "Table"}` |
MFD | `{1: "Equation"}` |
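Each model needs the label map of the dataset it was trained on, and the label IDs start at 0 for some datasets and at 1 for others. As a minimal sketch, the two tables can be combined into one lookup that returns matching constructor arguments; `LABEL_MAPS` and `model_kwargs` are hypothetical names, not part of layoutparser:

```python
from typing import Any, Dict

# Label maps copied from the table above, keyed by the dataset name
# used in the lp:// config paths.
LABEL_MAPS: Dict[str, Dict[int, str]] = {
    "HJDataset": {1: "Page Frame", 2: "Row", 3: "Title Region", 4: "Text Region",
                  5: "Title", 6: "Subtitle", 7: "Other"},
    "PubLayNet": {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    "PrimaLayout": {1: "TextRegion", 2: "ImageRegion", 3: "TableRegion",
                    4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"},
    "NewspaperNavigator": {0: "Photograph", 1: "Illustration", 2: "Map",
                           3: "Comics/Cartoon", 4: "Editorial Cartoon",
                           5: "Headline", 6: "Advertisement"},
    "TableBank": {0: "Table"},
    "MFD": {1: "Equation"},
}


def model_kwargs(dataset: str, model_name: str) -> Dict[str, Any]:
    """Pair a config path with the label map of the same dataset."""
    return {
        "config_path": f"lp://{dataset}/{model_name}/config",
        "label_map": LABEL_MAPS[dataset],  # KeyError for an unknown dataset
    }


kwargs = model_kwargs("PubLayNet", "faster_rcnn_R_50_FPN_3x")
print(kwargs["config_path"])  # lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config
```

The returned dictionary can then be unpacked into the constructor, for example `lp.Detectron2LayoutModel(**kwargs)`, which keeps the config path and label map from drifting apart.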