Document Intake Accelerator

A pre-packaged, customizable solution that accelerates the development of end-to-end document processing workflows. It combines Document AI parsers with other GCP products (Firestore, BigQuery, GKE, etc.) and provides many ready-to-use components.

Key features

Getting Started: Deploying the DocAI Workflow

Prerequisites

export PROJECT_ID=<GCP Project ID>
export REGION=us-central1
export ADMIN_EMAIL=<Your Email>
export BASE_DIR=$(pwd)

# A custom domain like your-domain.com, or leave it blank to use the Ingress IP address instead.
export API_DOMAIN=<Your Domain>

gcloud auth application-default login
gcloud auth application-default set-quota-project $PROJECT_ID
gcloud config set project $PROJECT_ID
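
Because every later step reads these variables, it can help to confirm they are set before continuing. The sketch below is a hypothetical helper, not part of the repository; the function name `check_required_vars` is an assumption made for illustration.

```shell
# Hypothetical helper (not part of the repo): report every required
# variable that is unset or empty, and return non-zero if any is missing.
check_required_vars() {
  missing=0
  for name in "$@"; do
    # Indirectly read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (API_DOMAIN is left out because it may stay blank):
# check_required_vars PROJECT_ID REGION ADMIN_EMAIL BASE_DIR
```

The helper only checks that a value is present; it does not verify that the project ID or email is valid.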

Make sure the gcloud CLI is up to date:

# Tested with gcloud v400.0.0
gcloud components update

GCP Organization policy

Run the following commands to update Organization policies:

export ORGANIZATION_ID=<your organization ID>
gcloud resource-manager org-policies disable-enforce constraints/compute.requireOsLogin --organization=$ORGANIZATION_ID
gcloud resource-manager org-policies delete constraints/compute.vmExternalIpAccess --organization=$ORGANIZATION_ID

Or, change the same two Organization policy constraints (constraints/compute.requireOsLogin and constraints/compute.vmExternalIpAccess) in the GCP Console.

GCP foundation - Terraform

Set up Terraform environment variables and GCS bucket for state file:

export TF_VAR_api_domain=$API_DOMAIN
export TF_VAR_admin_email=$ADMIN_EMAIL
export TF_VAR_project_id=$PROJECT_ID
export TF_BUCKET_NAME="${PROJECT_ID}-tfstate"
export TF_BUCKET_LOCATION="us"

# Create the GCS bucket that holds the Terraform state file.
bash setup/setup_terraform.sh

Run Terraform

cd terraform/environments/dev
terraform init

# Enable GCP services and create service accounts first.
terraform apply -target=module.project_services -target=module.service_accounts -auto-approve

# Run the rest of Terraform
terraform apply

# ...
# Enter yes at the prompt to apply Terraform changes.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

IMPORTANT: Run the script to update the config JSON based on the Terraform output.

# Run from the terraform/environments/dev folder
bash ../../../setup/update_config.sh

Get the API endpoint IP address. It will be used in Firebase Auth later.

kubectl describe ingress | grep Address
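
To reuse the address in later steps without copying it by hand, the output of the command above can be parsed. This is a minimal sketch; the function name `get_ingress_ip` is hypothetical, and it assumes the `Address:` field appears at the start of a line in the `kubectl describe ingress` output.

```shell
# Hypothetical helper: print the address of the first Ingress that
# `kubectl describe ingress` reports in the current namespace.
get_ingress_ip() {
  # Take the second field of the first line starting with "Address:".
  kubectl describe ingress | awk '/^Address:/ && !seen { print $2; seen = 1 }'
}

# Example call:
# INGRESS_IP=$(get_ingress_ip)
# echo "Ingress IP: $INGRESS_IP"
```

If the Ingress has not been assigned an address yet, the result is empty, so check the value before using it.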

NOTE: If you don’t have a custom domain, and want to use the Ingress IP address as the API endpoint:

Enable Firebase Auth

Deploying Kubernetes Microservices

Connect to the GKE cluster (main-cluster):

gcloud container clusters get-credentials main-cluster --region $REGION --project $PROJECT_ID

Build all microservices (including web app) and deploy to the cluster:

cd $BASE_DIR
skaffold run -p prod --default-repo=gcr.io/$PROJECT_ID

Update DNS with custom Domain (Optional)

Get the Ingress external IP:

kubectl describe ingress | grep Address

Add an A record in your DNS settings that points to the Ingress IP address, e.g. in https://domains.google.com/.
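
Once the A record is in place, a quick check can confirm that the domain resolves to the Ingress IP. The sketch below assumes the `dig` tool is installed; the function name `dns_points_to` is hypothetical.

```shell
# Hypothetical helper: succeed if one of the A records of the domain
# equals the expected IP address.
dns_points_to() {
  domain=$1
  expected_ip=$2
  # dig +short prints one resolved address per line;
  # grep -c reads them all and exits non-zero when none matches exactly.
  dig +short "$domain" A | grep -cxF -- "$expected_ip" >/dev/null
}

# Example call, with the Ingress IP obtained in the previous step:
# dns_points_to "$API_DOMAIN" "<Ingress IP>" && echo "DNS is set up"
```

DNS changes can take a while to propagate, so a failing check shortly after the update is not necessarily an error.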

Deployment Troubleshooting

Terraform Troubleshooting

App Engine already exists

│ Error: Error creating App Engine application: googleapi: Error 409: This application already exists and cannot be re-created., alreadyExists
│
│   with module.firebase.google_app_engine_application.firebase_init,
│   on ../../modules/firebase/main.tf line 3, in resource "google_app_engine_application" "firebase_init":
│    3: resource "google_app_engine_application" "firebase_init" {

Solution: Import the existing App Engine application into the Terraform state:

terraform import module.firebase.google_app_engine_application.firebase_init $PROJECT_ID
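
To avoid importing a resource that Terraform already tracks, the state can be checked first. This is a sketch under the assumption that it runs from the terraform/environments/dev folder after `terraform init`; the function name `in_terraform_state` is hypothetical.

```shell
# Hypothetical helper: succeed if the given resource address is
# already present in the Terraform state.
in_terraform_state() {
  # `terraform state list` prints one resource address per line.
  terraform state list | grep -cxF -- "$1" >/dev/null
}

# Example: import the App Engine application only when it is not tracked yet.
# addr=module.firebase.google_app_engine_application.firebase_init
# in_terraform_state "$addr" || terraform import "$addr" "$PROJECT_ID"
```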

Cloud Run Troubleshooting

The Cloud Run service “queue” acts as the task dispatcher: it listens to the Pub/Sub topic “queue-topic”.

Frontend Web App

Your connection is not private

Development

Prerequisites

Install required packages:

Initial setup for local development

After cloning the repo, please set up for local development.

Build and run all microservices in the default GKE cluster

NOTE: By default, skaffold builds with Cloud Build and runs in the GKE cluster, using the namespace set above.

To build and run in cluster:

skaffold run --port-forward

Build and run all microservices in Development mode with live reload

To build and run in cluster with hot reload:

skaffold dev --port-forward

Build and run with a specific microservice

skaffold run --port-forward -m <Microservice>

You can also run several specific microservices together, e.g.:

skaffold run --port-forward -m sample-service,other-service

Build and run microservices with a custom image registry path

skaffold dev --default-repo=<Image registry path> --port-forward

E.g. you can point to a Container Registry path in a different GCP project:

skaffold dev --default-repo=gcr.io/another-project-path --port-forward

Run with local minikube cluster

Install Minikube:

# For MacOS:
brew install minikube

# For Windows:
choco install -y minikube

Make sure the Docker daemon is running locally. To start minikube:

minikube start

To build and run locally:

skaffold run --port-forward

# Or, to build and run locally with hot reload:
skaffold dev --port-forward

Optionally, you may want to set GOOGLE_APPLICATION_CREDENTIALS manually to a local JSON key file.

export GOOGLE_APPLICATION_CREDENTIALS=<Path to Service Account key JSON file>
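
A mistyped key path usually only surfaces later as an authentication error, so it can be worth validating the path when setting the variable. The helper below is a hypothetical sketch; the name `use_service_account_key` is not part of the repository.

```shell
# Hypothetical helper: export GOOGLE_APPLICATION_CREDENTIALS only if
# the given key file exists and is readable.
use_service_account_key() {
  key_file=$1
  if [ ! -r "$key_file" ]; then
    echo "Key file not readable: $key_file" >&2
    return 1
  fi
  # Exporting makes the variable visible to child processes such as skaffold.
  export GOOGLE_APPLICATION_CREDENTIALS="$key_file"
}

# Example call:
# use_service_account_key "<Path to Service Account key JSON file>"
```

The check covers readability only; it does not validate the JSON content or the permissions of the service account.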

Deploy to a specific GKE cluster

IMPORTANT: Please change gcloud project and kubectl context before running skaffold.

Replace the <Custom GCP Project ID> with a specific project ID and run the following:

export PROJECT_ID=<Custom GCP Project ID>

# Switch to a specific project.
gcloud config set project $PROJECT_ID

# Assuming the default cluster name is "default_cluster".
gcloud container clusters get-credentials default_cluster --zone us-central1-a --project $PROJECT_ID
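
Since deploying to the wrong cluster is easy to do by accident, the active kubectl context can be checked before running skaffold. `gcloud container clusters get-credentials` names the context `gke_<project>_<location>_<cluster>`; the function name `assert_gke_context` below is hypothetical.

```shell
# Hypothetical helper: fail unless the current kubectl context is the
# GKE context for the given project, location, and cluster name.
assert_gke_context() {
  expected="gke_${1}_${2}_${3}"
  actual=$(kubectl config current-context)
  if [ "$actual" != "$expected" ]; then
    echo "kubectl context is '$actual', expected '$expected'" >&2
    return 1
  fi
}

# Example call, matching the get-credentials command above:
# assert_gke_context "$PROJECT_ID" us-central1-a default_cluster
```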

Run with skaffold:

skaffold run -p custom --default-repo=gcr.io/$PROJECT_ID

# Or run with hot reload and live logs:
skaffold dev -p custom --default-repo=gcr.io/$PROJECT_ID

Build and run microservices with a different Skaffold profile

# Using custom profile
skaffold dev -p custom --port-forward

# Using prod profile
skaffold dev -p prod --port-forward

Skaffold profiles

By default, the Skaffold YAML contains the following pre-defined profiles ready to use.

Useful Kubectl commands

To check if pods are deployed and running:

kubectl get po

# Or, watch the live update in a separate terminal:
watch kubectl get po

To create a namespace:

kubectl create ns <New namespace>

To set a specific namespace for further kubectl operations:

kubectl config set-context --current --namespace=<Your namespace>
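
To confirm which namespace subsequent kubectl operations will use, the current context can be queried. This is a minimal sketch; the function name `current_namespace` is hypothetical.

```shell
# Hypothetical helper: print the namespace of the current kubectl context.
current_namespace() {
  ns=$(kubectl config view --minify --output 'jsonpath={..namespace}')
  # An empty result means no namespace is set, so kubectl uses "default".
  echo "${ns:-default}"
}

# Example call:
# echo "Active namespace: $(current_namespace)"
```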

Code Submission Process

For the first-time setup:

When making code changes

(For Repo Admins) Reviewing a Pull Request

For code reviewers, go to the Pull Requests page of the origin repo on GitHub.

Test for PR changes

(For Developers) Microservices Assumptions