Learn how to use Google Cloud Platform to process and enrich invoices so that we can enable fraud detection.
- Google Cloud Document AI
- Procurement DocAI
- Google Cloud Storage
- Google Cloud Pub/Sub
- Google Cloud Functions
- Cloud Build
- Geocoding API
- BigQuery
-
Create a Google Cloud Platform Project
-
Enable the APIs in the project you created in step #1 above
- Cloud Document AI API
- Cloud Functions API
- Geocoding API
- Cloud Build API
# Replace with Your Project ID
gcloud config set project YOUR_PROJECT_ID
gcloud services enable documentai.googleapis.com
gcloud services enable cloudfunctions.googleapis.com
gcloud services enable geocoding-backend.googleapis.com
gcloud services enable cloudbuild.googleapis.com
-
Initialize the repository
-
Activate Cloud Shell and clone this GitHub repository using the command:
git clone https://github.com/GoogleCloudPlatform/documentai-fraud-detection-demo.git
-
Change directory to the repository folder
cd documentai-fraud-detection-demo
-
Manage API Key
-
Paste the API key in the geocode-addresses/.env.yaml file.
-
Add API restrictions. To set API restrictions:
- Select Restrict key in the API restrictions section.
- Select Geocoding API from the dropdown.
- Select the Save button.
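As a rough illustration of where the restricted key ends up being used, the sketch below builds a Geocoding API request URL and extracts a few enrichment fields from a response. The endpoint and response fields (`status`, `results`, `place_id`, `formatted_address`, `geometry.location`) are those of the public Geocoding API. The function names are hypothetical, and this is not necessarily how the repository's geocode-addresses function is written.

```python
from typing import Optional
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address: str, api_key: str) -> str:
    """Return the Geocoding API request URL for a single address."""
    query = urlencode({"address": address, "key": api_key})
    return f"{GEOCODE_ENDPOINT}?{query}"


def summarize_geocode_response(payload: dict) -> Optional[dict]:
    """Return fields from the first result, or None when nothing matched."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    first = payload["results"][0]
    location = first["geometry"]["location"]
    return {
        "place_id": first.get("place_id"),
        "formatted_address": first.get("formatted_address"),
        "lat": location["lat"],
        "lng": location["lng"],
    }
```

A response whose status is not `OK` (for example `ZERO_RESULTS`) yields `None`, which is the signal the fraud rules later rely on.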
-
Create your Doc AI processor
- Go to Console > Doc AI > Create Processor > Invoice Parser (under Specialized)
- Name the processor fraud-detection-invoice-parser (or something else you'll remember)
- Note the region and ID of the processor; you will need to plug these values into your Cloud Function's environment variables
- Paste the processor location and ID in the process-invoices/.env.yaml file
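If you have the processor's full resource name rather than the separate region and ID, a small helper can split it. This is a minimal sketch that assumes the standard Document AI resource name format `projects/PROJECT/locations/LOCATION/processors/PROCESSOR_ID`. The helper and type names are hypothetical, and the key names expected in process-invoices/.env.yaml are defined by the repository and not assumed here.

```python
import re
from typing import NamedTuple


class ProcessorRef(NamedTuple):
    project: str
    location: str
    processor_id: str


# Matches projects/<project>/locations/<location>/processors/<id>
_PROCESSOR_NAME = re.compile(
    r"^projects/([^/]+)/locations/([^/]+)/processors/([^/]+)$"
)


def parse_processor_name(name: str) -> ProcessorRef:
    """Split a full processor resource name into project, location, and ID."""
    match = _PROCESSOR_NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a processor resource name: {name!r}")
    return ProcessorRef(*match.groups())
```

The `location` and `processor_id` fields are the two values to copy into the environment file.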
-
Execute the Bash scripts in your Cloud Shell terminal to create the cloud resources (i.e., Google Cloud Storage buckets, Pub/Sub topics, Cloud Functions, BigQuery tables)
-
Update the value of PROJECT_ID in .env.local to match your current project ID
-
Execute your .sh files to create cloud resources
bash create-archive-bucket.sh
bash create-input-bucket.sh
bash create-output-bucket.sh
bash create-pub-sub-topic.sh
bash create-bq-tables.sh
bash deploy-cloud-function-process-invoices.sh
bash deploy-cloud-function-geocode-addresses.sh
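If you prefer to run the scripts as one step and stop at the first failure, something like the following could work. This is a sketch only: the script names come from the list above, while `run_setup` and its `runner` parameter are hypothetical and not part of the repository. It assumes it is run from the repository folder.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

SETUP_SCRIPTS = (
    "create-archive-bucket.sh",
    "create-input-bucket.sh",
    "create-output-bucket.sh",
    "create-pub-sub-topic.sh",
    "create-bq-tables.sh",
    "deploy-cloud-function-process-invoices.sh",
    "deploy-cloud-function-geocode-addresses.sh",
)


def _run_with_bash(command: Sequence[str]) -> None:
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(command, check=True)


def run_setup(
    scripts: Sequence[str] = SETUP_SCRIPTS,
    runner: Optional[Callable[[Sequence[str]], object]] = None,
) -> List[str]:
    """Run each script in order; an exception from the runner stops the loop."""
    run = runner or _run_with_bash
    completed = []
    for script in scripts:
        run(["bash", script])
        completed.append(script)
    return completed
```

Calling `run_setup()` with no arguments runs all seven scripts with `bash`. Passing a different `runner` makes it possible to dry-run the sequence without touching any cloud resources.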
-
Testing/Validating the demo
- Upload a sample invoice in the input bucket
- At the end of processing, your BigQuery tables should be populated with the extracted entities as well as the enriched data (i.e., placesID, lat, long, formatted address, name, url, description)
- With these results, we can build custom business intelligence rules on the enriched fields to enable fraud detection. For example, if the Geocoding API cannot find the address, that indicates either an incorrect value or a fraudulent invoice
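The address rule described above could be sketched as follows, operating on rows already read from BigQuery as dictionaries. The column names `invoice_id`, `place_id`, and `formatted_address` are assumptions made for illustration; the actual table schema is defined by the repository's scripts.

```python
from typing import Iterable, List, Mapping


def flag_unresolved_addresses(rows: Iterable[Mapping[str, object]]) -> List[str]:
    """Return the IDs of invoices whose address the geocoder could not resolve.

    A row counts as unresolved when its place ID or formatted address is
    missing or empty (hypothetical column names, see the note above).
    """
    flagged = []
    for row in rows:
        if not row.get("place_id") or not row.get("formatted_address"):
            flagged.append(str(row.get("invoice_id")))
    return flagged
```

A flagged invoice is only a candidate for review: as noted above, a failed lookup can also mean the address was extracted or entered incorrectly.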