Commit: updates

felixdittrich92 committed Jul 11, 2023
1 parent 959bf34 commit 31ce669
Showing 1 changed file with 23 additions and 21 deletions.
44 changes: 23 additions & 21 deletions docs/source/using_doctr/using_models.rst
@@ -74,7 +74,7 @@ Explanations about the metrics being used are available in :ref:`metrics`.

*Disclaimer: both FUNSD subsets combined have 199 pages which might not be representative enough of the model capabilities*

-Seconds per iteration is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz`.
+Seconds per iteration (with a batch size of 1) is computed after a warmup phase of 100 tensors, by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz`.
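This protocol (an untimed warmup pass, then averaging wall-clock time over many batches) can be sketched as a plain timing loop. `seconds_per_iteration`, `dummy_model`, and `make_batch` below are illustrative names, not part of docTR:

```python
import time

def seconds_per_iteration(model, make_batch, warmup=100, samples=1000):
    """Average wall-clock seconds per processed batch.

    Runs `warmup` untimed batches first so caches, lazy initialisation,
    and JIT compilation do not skew the measurement, then times
    `samples` batches and returns the mean duration.
    """
    for _ in range(warmup):            # warmup phase: results discarded
        model(make_batch())
    start = time.perf_counter()
    for _ in range(samples):           # timed phase
        model(make_batch())
    return (time.perf_counter() - start) / samples

# Toy stand-ins for a real model and an input batch of size 1
dummy_model = lambda batch: [x * 2 for x in batch]
make_batch = lambda: [0.0] * 32

sec_it = seconds_per_iteration(dummy_model, make_batch, warmup=10, samples=100)
print(f"{sec_it:.2e} sec/it")
```

With a real model, the first calls are typically much slower than the steady state, which is why the warmup batches are discarded before timing begins.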


Detection predictors
@@ -133,11 +133,11 @@ For a comprehensive comparison, we have compiled a detailed benchmark on publicly available datasets:
+-----------------------------------------------------------------------------------+----------------------------+----------------------------+--------------------+
| | FUNSD | CORD | |
+================+=================================+=================+==============+============+===============+============+===============+====================+
-| **Backend** | **Architecture** | **Input shape** | **# params** | **Exact** | **Partial** | **Exact** | **Partial** | **sec/it (B: 64)** |
+| **Backend** | **Architecture** | **Input shape** | **# params** | **Exact** | **Partial** | **Exact** | **Partial** | **sec/it (B: 1)** |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
-| TensorFlow | crnn_vgg16_bn | (32, 128, 3) | 15.8 M | | | | | 0.9 |
+| TensorFlow | crnn_vgg16_bn | (32, 128, 3) | 15.8 M | 88.12 | 88.85 | 94.68 | 95.10 | 0.9 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
-| TensorFlow | crnn_mobilenet_v3_small | (32, 128, 3) | 2.1 M | | | | | 0.25 |
+| TensorFlow | crnn_mobilenet_v3_small | (32, 128, 3) | 2.1 M | 86.88 | 87.61 | 92.28 | 92.73 | 0.25 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
| TensorFlow | crnn_mobilenet_v3_large | (32, 128, 3) | 4.5 M | | | | | 0.34 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
@@ -151,11 +151,11 @@ For a comprehensive comparison, we have compiled a detailed benchmark on publicly available datasets:
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
| TensorFlow | parseq | (32, 128, 3) | 23.8 M | | | | | 3.6 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
-| PyTorch | crnn_vgg16_bn | (32, 128, 3) | 15.8 M | | | | | 0.6 |
+| PyTorch | crnn_vgg16_bn | (32, 128, 3) | 15.8 M | 86.54 | 87.41 | 94.29 | 94.69 | 0.6 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
-| PyTorch | crnn_mobilenet_v3_small | (32, 128, 3) | 2.1 M | | | | | 0.05 |
+| PyTorch | crnn_mobilenet_v3_small | (32, 128, 3) | 2.1 M | 87.25 | 87.99 | 93.91 | 94.34 | 0.05 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
-| PyTorch | crnn_mobilenet_v3_large | (32, 128, 3) | 4.5 M | | | | | 0.08 |
+| PyTorch | crnn_mobilenet_v3_large | (32, 128, 3) | 4.5 M | 87.38 | 88.09 | 94.46 | 94.92 | 0.08 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
| PyTorch | master | (32, 128, 3) | 58.7 M | | | | | 17.6 |
+----------------+---------------------------------+-----------------+--------------+------------+---------------+------------+---------------+--------------------+
@@ -183,7 +183,7 @@ While most of our recognition models were trained on our french vocab (cf. :ref:
*Disclaimer: both FUNSD subsets combined have 30595 word-level crops which might not be representative enough of the model capabilities*

-Seconds per iteration (with a batch size of 64) is computed after a warmup phase of 100 tensors, by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz`.
+Seconds per iteration (with a batch size of 1) is computed after a warmup phase of 100 tensors, by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz`.


Recognition predictors
@@ -211,19 +211,21 @@ You can use any combination of detection and recognition models supported by docTR

For a comprehensive comparison, we have compiled a detailed benchmark on publicly available datasets:

-+--------------------------------------------------+----------------------------+----------------------------+
-| | FUNSD | CORD |
-+================+=================================+============================+============+===============+
-| **Backend** | **Architecture** | **Recall** | **Precision** | **Recall** | **Precision** |
-+----------------+---------------------------------+------------+---------------+------------+---------------+
-| TensorFlow | db_resnet50 | 81.22 | 86.66 | 92.46 | 89.62 |
-+----------------+---------------------------------+------------+---------------+------------+---------------+
-| None | Gvision text detection | 59.50 | 62.50 | 75.30 | 59.03 |
-+----------------+---------------------------------+------------+---------------+------------+---------------+
-| None | Gvision doc. text detection | 64.00 | 53.30 | 68.90 | 61.10 |
-+----------------+---------------------------------+------------+---------------+------------+---------------+
-| None | AWS textract | 78.10 | 83.00 | 87.50 | 66.00 |
-+----------------+---------------------------------+------------+---------------+------------+---------------+
++---------------------------------------------------------------------------+----------------------------+----------------------------+
+| | FUNSD | CORD |
++================+==========================================================+============================+============+===============+
+| **Backend** | **Architecture** | **Recall** | **Precision** | **Recall** | **Precision** |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+
+| TensorFlow | db_resnet50 + crnn_vgg16_bn | 70.82 | 75.56 | 83.97 | 81.40 |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+
+| PyTorch | db_resnet50 + crnn_vgg16_bn | 67.82 | 73.35 | 84.84 | 83.27 |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+
+| None | Gvision text detection | 59.50 | 62.50 | 75.30 | 59.03 |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+
+| None | Gvision doc. text detection | 64.00 | 53.30 | 68.90 | 61.10 |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+
+| None | AWS textract | 78.10 | 83.00 | 87.50 | 66.00 |
++----------------+----------------------------------------------------------+------------+---------------+------------+---------------+


All OCR models above have been evaluated using both the training and evaluation sets of FUNSD and CORD (cf. :ref:`datasets`).
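Any detection/recognition pairing from the tables above can be assembled through docTR's `ocr_predictor`. The sketch below assumes docTR is installed (`pip install python-doctr`); the `predictor_config` helper and `sample_page.jpg` are illustrative, not part of the library:

```python
# Combinations benchmarked above; docTR accepts any det/reco pairing.
DET_ARCHS = ("db_resnet50",)
RECO_ARCHS = ("crnn_vgg16_bn", "crnn_mobilenet_v3_small", "crnn_mobilenet_v3_large")

def predictor_config(det_arch, reco_arch, pretrained=True):
    """Illustrative helper: validate a pairing and build the keyword
    arguments passed to `doctr.models.ocr_predictor`."""
    if det_arch not in DET_ARCHS or reco_arch not in RECO_ARCHS:
        raise ValueError(f"unknown combination: {det_arch} + {reco_arch}")
    return {"det_arch": det_arch, "reco_arch": reco_arch, "pretrained": pretrained}

if __name__ == "__main__":
    # Downloads pretrained weights on first use.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    predictor = ocr_predictor(**predictor_config("db_resnet50", "crnn_vgg16_bn"))
    pages = DocumentFile.from_images("sample_page.jpg")  # any page image path
    result = predictor(pages)
    print(result.render())  # plain-text rendering of the OCR output
```

Swapping `crnn_vgg16_bn` for one of the mobilenet recognizers trades some accuracy for a substantially lower sec/it, as the recognition table above shows.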
