Using Templating API
This section explains how to set up the BlinkOCR recognizer for scanning templated documents. See the demo app for examples.
A templated document is any document that is defined by its template. The template describes how the document should be detected, i.e. found in the camera scene, and which part of the document contains which piece of useful information.
Before performing OCR of the document, BlinkID first needs to find its location in the camera scene. To perform detection, you need to define PPDetectorSettings, which are used to instantiate the detector that performs document detection. You can set the detector with the detectorSettings property. Check our guide for initializing PPDetectorSettings.
If you do not set detector settings, the BlinkOCR recognizer will work in normal mode, recognizing characters on input images.
After the document has been detected, it is recognized. This is done in the following way:
1. The detector produces a PPDetectorResult which contains one or more detection locations.
2. Based on the array of PPDecodingInfo objects that were defined as part of the concrete PPDetectorSettings, the following is performed for each element of the array:
   - The location defined in PPDecodingInfo is dewarped to an image of the height defined within PPDecodingInfo. For example, on an image of the document that is 1024 pixels wide and 645 pixels high, the location of the PPDecodingInfo containing the surname would be
     CGRectMake(292.0/1024.0, 145.0/645.0, 355.0/1024.0, 65.0/645.0); // roughly CGRectMake(0.29, 0.22, 0.35, 0.10);
   - A parser group that has the same name/uniqueId as the current PPDecodingInfo is searched for. If it is found, optimal OCR settings for all parsers from that parser group are calculated.
   - Using the optimal OCR settings, OCR is performed on the dewarped image.
   - Finally, the OCR result is parsed with each parser from that parser group.
   - If a parser group with the same name as the current PPDecodingInfo cannot be found, no OCR is performed. However, the image is still reported via didOutputMetadata: if receiving of DEWARPED images has been enabled.
3. If the documentClassifier property hasn't been set, recognition is done. If a PPDocumentClassifier exists, its method classifyDocumentFromResult: is called to determine which type of document has been detected.
4. If the classifier returned a string that is the same as one used previously to set up parser decoding infos, the corresponding array of PPDecodingInfo objects is obtained and step 2 is performed again with that array.
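The relative coordinates in the CGRectMake example above are pixel measurements divided by the image dimensions. The following is a minimal sketch of that conversion. It uses only Core Graphics; the helper names RelativeRectFromPixelRect and SurnameLocation are hypothetical and are not part of the SDK.

```objective-c
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper (not part of the SDK): converts a rectangle measured in
// pixels on a reference image of the document into coordinates relative to
// the image size, so that x and width are fractions of the image width and
// y and height are fractions of the image height.
static CGRect RelativeRectFromPixelRect(CGRect pixelRect, CGSize imageSize) {
    return CGRectMake(pixelRect.origin.x / imageSize.width,
                      pixelRect.origin.y / imageSize.height,
                      pixelRect.size.width / imageSize.width,
                      pixelRect.size.height / imageSize.height);
}

// Usage with the surname field from the example above: the field was measured
// at (292, 145) with size 355x65 on a 1024x645 image.
static CGRect SurnameLocation(void) {
    return RelativeRectFromPixelRect(CGRectMake(292.0, 145.0, 355.0, 65.0),
                                     CGSizeMake(1024.0, 645.0));
}
```

Keeping the pixel measurements in code, rather than pre-rounded decimals, makes it easier to re-measure a field against the same reference image when fine-tuning a location.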
If you plan to scan several different documents of the same size, for example different ID cards, which are all 85x54 mm (credit card size), then you need to use PPDocumentClassifier to classify the type of document so that the correct PPDecodingInfo array can be used for obtaining the relevant information. An example would be the case where you need to scan the front sides of both Croatian and German ID cards: the locations of the first and last names are not the same on both documents. Therefore, you first need to classify the document based on some discriminative features.
If you plan to support only a single document type (e.g. the national ID of a single country), then you do not need to use PPDocumentClassifier.
PPDocumentClassifier is a protocol that should be implemented to support classification of documents that cannot be differentiated by the detector. The classification result is used to determine which set of decoding infos will be used to extract classification-specific data. The following method has to be implemented:
- (NSString *)classifyDocumentFromResult:(PPTemplatingRecognizerResult *)result;
This method classifies the document based on the PPTemplatingRecognizerResult (superclass of PPBlinkOcrRecognizerResult), which contains the data extracted from the decoding infos inherent to the detector. For each document type that you want to support, the returned string has to be equal to the name/uniqueId of the corresponding set of PPDecodingInfo objects defined for that document type. Named decoding info sets should be defined using the following method of the PPTemplatingRecognizerSettings superclass:
- (void)setDecodingInfoSet:(NSArray<PPDecodingInfo*> *)decodingInfos forClassifierResult:(NSString *)classifierResult;
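As an illustration of how the classifier and the named decoding info sets fit together, here is a hedged sketch for the Croatian/German ID example. Only classifyDocumentFromResult:, setDecodingInfoSet:forClassifierResult: and the documentClassifier property come from this guide. The rest is assumed: the umbrella header name, the accessor parsedResultForName:parserGroup: on the result object, the placement of documentClassifier on the settings object, and all parser, group and set names, which are hypothetical. Check these against the SDK headers before use.

```objective-c
#import <Foundation/Foundation.h>
#import <MicroBlink/MicroBlink.h> // ASSUMPTION: umbrella header of the SDK framework

// Hypothetical classifier result names. Each must equal the name under which
// the matching decoding info set is registered below.
static NSString *const kCroatianIDFront = @"CroatianIDFront";
static NSString *const kGermanIDFront = @"GermanIDFront";

@interface IDCardClassifier : NSObject <PPDocumentClassifier>
@end

@implementation IDCardClassifier

- (NSString *)classifyDocumentFromResult:(PPTemplatingRecognizerResult *)result {
    // ASSUMPTION: parsedResultForName:parserGroup: is the accessor for parsed
    // values. "Marker", "CroatianMarker" and "GermanMarker" are hypothetical
    // parser and parser group names; the groups must match decoding infos
    // defined in the detector settings, each placed over a region that yields
    // text on only one of the two documents.
    NSString *croatianMarker = [result parsedResultForName:@"Marker"
                                               parserGroup:@"CroatianMarker"];
    if (croatianMarker.length > 0) {
        return kCroatianIDFront;
    }
    NSString *germanMarker = [result parsedResultForName:@"Marker"
                                             parserGroup:@"GermanMarker"];
    if (germanMarker.length > 0) {
        return kGermanIDFront;
    }
    // A string that matches no registered set, so that no second pass over
    // classification-specific decoding infos is triggered.
    return @"Unknown";
}

@end

// Registers one decoding info set per document type and attaches the
// classifier. The decoding info arrays are passed in already built.
static void ConfigureClassification(PPTemplatingRecognizerSettings *settings,
                                    NSArray<PPDecodingInfo *> *croatianInfos,
                                    NSArray<PPDecodingInfo *> *germanInfos,
                                    id<PPDocumentClassifier> classifier) {
    [settings setDecodingInfoSet:croatianInfos forClassifierResult:kCroatianIDFront];
    [settings setDecodingInfoSet:germanInfos forClassifierResult:kGermanIDFront];
    // ASSUMPTION: documentClassifier is a property of the settings object.
    settings.documentClassifier = classifier;
}
```

Since it is not stated here whether the documentClassifier property retains its value, it is safest to keep your own strong reference to the classifier for as long as scanning is active.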
Fine-tuning the exact location of each PPDecodingInfo can be hard, but outputting each image that is sent to our OCR engine can help. To enable this feature, set the debugOcrInputFrame property of the debug metadata settings to YES before initializing your PPCoordinator. The sample code below shows how this can be done:
/** 1. Initialize the Scanning settings */
// Initialize the scanner settings object. This initializes the settings with all default values.
PPSettings *settings = [[PPSettings alloc] init];
settings.metadataSettings.debugMetadata.debugOcrInputFrame = YES;
/** 2. Setup the license key */
// Add your license key here, like in our sample applications
/** 3. Set up what is being scanned. See detailed guides for specific use cases. */
// Add your recognizers here, like in our sample applications
/** 4. Initialize the Scanning Coordinator object */
PPCameraCoordinator *coordinator = [[PPCameraCoordinator alloc] initWithSettings:settings];
The images will then be output to the scanningViewController:didOutputMetadata: callback in your PPScanningDelegate.
Below is sample code demonstrating how to fetch these images:
- (void)scanningViewController:(UIViewController<PPScanningViewController> *)scanningViewController
              didOutputMetadata:(PPMetadata *)metadata {
    if ([metadata isKindOfClass:[PPImageMetadata class]]) {
        PPImageMetadata *imageMetadata = (PPImageMetadata *)metadata;
        // Fetch the image
        UIImage *ocrInputImage = imageMetadata.image;
    }
}
The same principles apply here as when using the PPBlinkOcrRecognizer recognizer in segment scan mode (a guide is available). Use the same approach as discussed in Obtaining results from BlinkOCR recognizer. Just remember to use parser group names that are equal to the decoding info names. A Templating-sample app is available on GitHub as a detailed example.