diff --git a/docs/en/ocr_pipeline_components.md b/docs/en/ocr_pipeline_components.md
index 9277022fde..873bf9dcd9 100644
--- a/docs/en/ocr_pipeline_components.md
+++ b/docs/en/ocr_pipeline_components.md
@@ -3211,6 +3211,8 @@ others. One could almost say they feed on and grow on ideas.
`ImageToTextV2` can work on CPU, but GPU is preferred in order to achieve acceptable performance.
+`ImageToTextV2` can be used with caching enabled.
+
`ImageToTextV2` can receive regions representing single line texts, or regions coming from a text detection model.
@@ -3221,6 +3223,7 @@ others. One could almost say they feed on and grow on ideas.
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
| inputCols | Array[string] | [image] | Can use as input image struct ([Image schema](ocr_structures#image-schema)) and regions. |
+| regionsColumn | string | regions | Input column containing regions to be processed. |
@@ -3232,6 +3235,13 @@ others. One could almost say they feed on and grow on ideas.
| lineTolerance | integer | 15 | Line tolerance in pixels. It's used for grouping text regions by lines. |
| borderWidth | integer | 5 | A value of more than 0 enables to border text regions with width equal to the value of the parameter. |
| spaceWidth | integer | 10 | A value of more than 0 enables to add white spaces between words on the image. |
+| maxImageRatio | float | 11.25 | Maximum width-to-height ratio of the images fed to the model. Larger values reduce inference time, but may cause the model to diverge. |
+| groupImages | boolean | True | Whether to group images to maximize detection quality. |
+| batchSize | integer | 3 | Number of text patches to feed the model at the same time. |
+| taskParallelism | integer | 8 | How many threads to use when processing a single region. |
+| useGPU | boolean | False | Whether to use the GPU. |
+| useCaching | boolean | True | Whether to use caching. |
+| keepInput | boolean | True | Whether to preserve the input column. |
@@ -3240,7 +3250,9 @@ others. One could almost say they feed on and grow on ideas.
{:.table-model-big}
| Param name | Type | Default | Column Data Description |
| --- | --- | --- | --- |
-| outputCol | string | text | Recognized text |
+| outputCol | string | text | Recognized text. |
+| positionsCol | string | positions | Positions of the recognized text. |
+| outputFormat | Enum | OcrOutputFormat.TEXT | Format of the returned output. |
**Example:**
@@ -3251,6 +3263,7 @@ others. One could almost say they feed on and grow on ideas.
```python
from pyspark.ml import PipelineModel
from sparkocr.transformers import *
+from sparkocr.enums import *
imagePath = "path to image"
@@ -3271,7 +3284,11 @@ text_detector = ImageTextDetectorV2 \
.setSizeThreshold(20)
ocr = ImageToTextV2.pretrained("ocr_base_printed", "en", "clinical/ocr") \
- .setInputCols(["image", "text_regions"]) \
+ .setInputCols(["image"]) \
+ .setRegionsColumn("text_regions") \
+ .setUseGPU(True) \
+ .setUseCaching(True) \
+ .setOutputFormat(OcrOutputFormat.TEXT) \
.setOutputCol("text")
# Define pipeline
@@ -4391,4 +4408,4 @@ Output:
```
-
\ No newline at end of file
+