Mark Vincent Ty and Rowel Atienza
Electrical and Electronics Engineering Institute
University of the Philippines, Diliman
We release STRExp, a framework that brings Explainable AI (XAI) to Scene Text Recognition (STR). It builds on the captum library and applies explainability to existing STR models by producing local (per-character) explanations of their predictions.
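For intuition only (this is not the exact STRExp pipeline), the sketch below shows how captum can attribute one predicted character's score back to input pixels. The toy recognizer, image size, and the character/step indices are all assumptions made for the sketch.

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# Tiny stand-in recognizer (an assumption for this sketch, not an STRExp model):
# maps a 1x32x100 grayscale crop to logits for 25 decoding steps over 37 symbols.
class ToySTR(nn.Module):
    def __init__(self, steps=25, charset=37):
        super().__init__()
        self.steps, self.charset = steps, charset
        self.conv = nn.Conv2d(1, 8, 3, padding=1)
        self.head = nn.Linear(8 * 32 * 100, steps * charset)

    def forward(self, x):
        h = torch.relu(self.conv(x)).flatten(1)
        return self.head(h).view(-1, self.steps, self.charset)

model = ToySTR().eval()

# Score of one predicted character at one decoding step (indices are assumptions).
def char_score(image):
    return model(image)[:, 0, 5]

ig = IntegratedGradients(char_score)
image = torch.rand(1, 1, 32, 100)
# Pixel-level attribution for that single character prediction.
attributions = ig.attribute(image, baselines=torch.zeros_like(image))
print(attributions.shape)  # torch.Size([1, 1, 32, 100])
```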
- Pretrained STR models: https://zenodo.org/record/7476285/files/pretrained.zip. After unzipping, place the "pretrained/" folder in the cloned strexp directory.
- STR LMDB real test datasets and their segmentations: https://zenodo.org/record/7478796/files/datasets.zip. After unzipping, place the "datasets/" folder in the cloned strexp directory.
- Before running anything, edit settings.py: set the STR model (vitstr, parseq, srn, abinet, trba, matrn), the segmentation directory, and the STR real test dataset name (IIIT5k_3000, SVT, IC03_860, IC03_867, IC13_857, IC13_1015, IC15_1811, IC15_2077, SVTP, CUTE80). A hedged sketch of these settings is shown below.
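As a rough guide only, a configured settings.py might look like the following. Only TARGET_DATASET is referenced elsewhere in this README (see the ABINET/MATRN notes below); the other variable names and paths here are assumptions, so check the file itself for the exact names.

```python
# Hypothetical settings.py values -- variable names other than TARGET_DATASET
# are assumptions.
MODEL = "vitstr"                              # vitstr, parseq, srn, abinet, trba, or matrn
SEGMENTATION_DIR = "datasets/segmentations/"  # assumed location of the segmentations
TARGET_DATASET = "IIIT5k_3000"                # one of the real test dataset names above
```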
- To run/evaluate vitstr, srn, abinet, trba, or matrn: pip install timm==0.4.5
- To run/evaluate parseq: pip install timm==0.6.7
- Run STRExp on VITSTR:
CUDA_VISIBLE_DEVICES=0 python captum_improve_vitstr.py --eval_data datasets/data_lmdb_release/evaluation \
--benchmark_all_eval --Transformation None --FeatureExtraction None --SequenceModeling None --Prediction None --Transformer --sensitive \
--data_filtering_off --imgH 224 --imgW 224 --TransformerModel=vitstr_base_patch16_224 \
--saved_model pretrained/vitstr_base_patch16_224_aug.pth --batch_size=1 --workers=0 --scorer mean --blackbg
- Run STRExp on PARSeq:
CUDA_VISIBLE_DEVICES=0 python captum_improve_parseq.py --eval_data datasets/data_lmdb_release/evaluation \
--benchmark_all_eval --Transformation None --FeatureExtraction None --SequenceModeling None --Prediction None --Transformer --sensitive \
--data_filtering_off --imgH 32 --imgW 128 --TransformerModel=vitstr_base_patch16_224 --batch_size=1 --workers=0 --scorer mean --blackbg --rgb
- Run STRExp on TRBA:
CUDA_VISIBLE_DEVICES=0 python captum_improve_trba.py --eval_data datasets/data_lmdb_release/evaluation --benchmark_all_eval \
--Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --batch_size 1 --workers=0 --data_filtering_off \
--saved_model pretrained/trba.pth --confidence_mode 0 --scorer mean --blackbg --imgH 32 --imgW 100
- Run STRExp on SRN:
CUDA_VISIBLE_DEVICES=0 python captum_improve_srn.py --eval_data datasets/data_lmdb_release/evaluation \
--saved_model pretrained/srn.pth --batch_size=1 --workers=0 --imgH 32 --imgW 100 --scorer mean
- Run STRExp on ABINET (also change the dataset.test.roots dataset name in configs/train_abinet.yaml to match TARGET_DATASET in settings.py; see the YAML sketch after the MATRN command):
CUDA_VISIBLE_DEVICES=0 python captum_improve_abinet.py --config=configs/train_abinet.yaml --phase test --image_only --scorer mean --blackbg \
--checkpoint pretrained/abinet.pth --imgH 32 --imgW 128 --rgb
- Run STRExp on MATRN (also change the dataset.test.roots dataset name in configs/train_matrn.yaml to match TARGET_DATASET in settings.py; see the YAML sketch below):
CUDA_VISIBLE_DEVICES=0 python captum_improve_matrn.py --imgH 32 --imgW 128 --checkpoint=pretrained/matrn.pth --scorer mean --rgb
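As a hedged illustration of the ABINET/MATRN edit above: only dataset.test.roots is named in this README, so the surrounding keys and the exact path below are assumptions; the dataset name must match TARGET_DATASET in settings.py.

```yaml
# Hypothetical excerpt of configs/train_abinet.yaml (same idea for
# configs/train_matrn.yaml) -- structure and path are assumptions.
dataset:
  test:
    roots: ['datasets/data_lmdb_release/evaluation/IIIT5k_3000']
```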
- After running any of the experiments above, an output pickle file is written to the current directory. Its filename is stored in the variable "outputSelectivityPkl", defined just below the "acquireSingleCharAttrAve()" function (in captum_improve_vitstr.py, for example, on line 194). A sketch of loading this file follows this list.
- To acquire the selectivity AUC, replace the pickle filename in captum_test.py with the name of your output pickle file. Then uncomment line 670 in captum_improve_vitstr.py (and comment out line 671) and rerun the command above to produce the metrics of STRExp evaluated on VITSTR.
- Repeat these steps for the other captum_improve_*.py files.
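As a minimal sketch of inspecting the output: the filename below is a placeholder (use the actual value of "outputSelectivityPkl" from the script you ran), and the layout of the pickled object is an assumption, so print it to explore.

```python
import pickle

# Hypothetical filename -- substitute the value of "outputSelectivityPkl"
# from the captum_improve_*.py script you ran.
with open("output_selectivity.pkl", "rb") as f:
    results = pickle.load(f)

# The structure of the pickled object is an assumption; print it to explore.
print(type(results))
if isinstance(results, dict):
    print(list(results.keys()))
```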
Quantitative results, from top to bottom: VITSTR (1st & 2nd figures), PARSeq (3rd & 4th), TRBA (5th & 6th), SRN (7th & 8th), ABINET (9th & 10th), and MATRN (11th & 12th).
Qualitative results, from top to bottom: PARSeq (1st & 2nd figures), SRN (3rd & 4th), and TRBA (5th & 6th).
Full thesis manuscript: https://drive.google.com/file/d/1KBFXfjZL6Gf4HYU5cw5nWylU93gCPLlv/view?usp=sharing
If you find this code useful, please cite our paper:
@inproceedings{ty2023scene,
title={Scene Text Recognition Models Explainability Using Local Features},
author={Ty, Mark Vincent and Atienza, Rowel},
booktitle={2023 IEEE International Conference on Image Processing (ICIP)},
pages={645--649},
year={2023},
organization={IEEE}
}