Process email metadata (from an S3 bucket). Extract blocks and key-value pairs using AWS Textract.

lux-group/fn-stp-textract

Straight Through Processing (STP) Textract Processor

Lambda function that reacts to new objects in an S3 mail data structure (see https://github.com/brandsExclusive/fn-stp-process-inbox/). Email data is processed with AWS Textract (PDF) if the matching rules on the originating organisation (i.e. the email sender) and the attachment file type are met.

Configuration

The function is fired by an S3 action whenever a new *_meta.json file signals a new email has been pre-processed by the fn-stp-process-inbox lambda function.
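As a rough illustration of the trigger described above, the sketch below picks the `*_meta.json` objects out of an S3 event notification. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields are the standard S3 event shape; the `S3EventLike` type and the `metaObjects` helper are hypothetical names used only for this example, not code from this repository.

```typescript
// Minimal subset of the S3 event notification shape needed for this sketch.
interface S3EventLike {
  Records: { s3: { bucket: { name: string }; object: { key: string } } }[];
}

interface MetaObject {
  bucket: string;
  key: string;
}

// Hypothetical helper: returns only the objects whose key ends in "_meta.json".
function metaObjects(event: S3EventLike): MetaObject[] {
  return event.Records
    .map((record) => ({
      bucket: record.s3.bucket.name,
      // S3 URL-encodes object keys in notifications and uses "+" for spaces.
      key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
    }))
    .filter((object) => object.key.endsWith("_meta.json"));
}
```

In practice the S3 trigger itself can usually be configured with a key suffix filter, so a check like this would act only as a second line of defence inside the handler.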

See the config files in the ./deploy folder for the lambda name and the S3 inbox and S3 output bucket names.

The ORIGINATORS environment variable lists the sender domains (comma separated) from which emails will be accepted for processing. After each domain, a regular expression in curly braces specifies which attached files to extract.

For example, the following will Textract any PDF file sent from gmail.com, but only PDF files with the word INVOICE in their name from luxuryescapes.com:

gmail.com{.*(PDF|pdf)},luxuryescapes.com{.*INVOICE.*(PDF|pdf)}
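To make the `domain{regex}` format concrete, here is a minimal sketch of how an ORIGINATORS value could be parsed and applied. It is not taken from this repository: `parseOriginators`, `shouldExtract` and `OriginatorRule` are hypothetical names, and the sketch assumes that a pattern must match the whole file name and that patterns never contain a `}` character.

```typescript
// Hypothetical rule type: one entry per domain{regex} pair.
interface OriginatorRule {
  domain: string;
  filePattern: RegExp;
}

// Parses a value such as "gmail.com{.*(PDF|pdf)},example.org{.*INVOICE.*}".
function parseOriginators(spec: string): OriginatorRule[] {
  const rules: OriginatorRule[] = [];
  // One entry is domain{regex}, followed by a comma or the end of the string.
  // Assumption: the regex part contains no "}" character.
  const entry = /\s*([^,{}\s]+)\{([^}]*)\}\s*(?:,|$)/g;
  let match: RegExpExecArray | null;
  while ((match = entry.exec(spec)) !== null) {
    rules.push({
      domain: match[1].toLowerCase(),
      // Assumption: the pattern is anchored to the full file name.
      filePattern: new RegExp(`^(?:${match[2]})$`),
    });
  }
  return rules;
}

// True when the sender's domain has a rule whose pattern matches the file name.
function shouldExtract(rules: OriginatorRule[], sender: string, fileName: string): boolean {
  const domain = sender.slice(sender.lastIndexOf("@") + 1).toLowerCase();
  return rules.some((rule) => rule.domain === domain && rule.filePattern.test(fileName));
}
```

With the example value above, `shouldExtract(rules, "someone@luxuryescapes.com", "report.pdf")` would be rejected under these assumptions, because the file name does not contain INVOICE.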

Deployment

To deploy, run the following jobs on Jenkins:

TODO: configure Jenkins

To deploy locally, install the AWS CLI and run the following:

TEST

$ yarn deploy-test

PRODUCTION

$ FN_ORIGINATORS='gmail.com{.*(PDF|pdf)},luxuryescapes.com{.*INVOICE.*(PDF|pdf)}' yarn deploy-production

Logs

To tail logs locally, install the AWS CLI and run the following:

TEST

$ yarn logs-test

PRODUCTION

$ yarn logs-production

Maintainers

Collaborators

  • TBA
