A Terraform module that deploys an Airbyte server on a GCP Compute Engine instance.
Deployed Airbyte version: 0.52.0
- Terraform (tested with v1.5.3)
- A GCP project and an authenticated gcloud CLI
- Broad roles that work but are not recommended, whether for service accounts or for users:
  - roles/owner
- Recommended roles that respect the principle of least privilege:
  - roles/compute.admin
  - roles/iam.serviceAccountAdmin
  - roles/resourcemanager.projectIamAdmin
- Granular permissions required to build a custom role specific to this deployment:
  - compute.addresses.create
  - compute.addresses.delete
  - compute.disks.create
  - compute.firewalls.create
  - compute.firewalls.delete
  - compute.instances.create
  - compute.instances.delete
  - compute.instances.setMetadata
  - compute.instances.setServiceAccount
  - compute.networks.create
  - compute.networks.delete
  - compute.networks.updatePolicy
  - compute.routers.create
  - compute.routers.delete
  - compute.routers.update
  - compute.routes.create
  - compute.routes.delete
  - compute.subnetworks.create
  - compute.subnetworks.delete
  - compute.subnetworks.use
  - iam.serviceAccounts.create
  - iam.serviceAccounts.delete
  - resourcemanager.projects.setIamPolicy
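
The granular permissions above could be bundled into a custom role with the Google provider's `google_project_iam_custom_role` resource. The following is a minimal sketch, not part of this module: the resource name `airbyte_deployer`, the `role_id`, the `title`, and the `deployer_project_id` variable are all illustrative assumptions. GCP may reject individual permissions in custom roles depending on where the role is defined, so check the result of `terraform plan` and `terraform apply` in your own project.

```hcl
# Hypothetical variable: the project in which the custom role is created.
variable "deployer_project_id" {
  type        = string
  description = "GCP project that will host the custom deployment role"
}

# Custom role holding only the permissions listed in this README.
# role_id and title are illustrative names, not defined by the module.
resource "google_project_iam_custom_role" "airbyte_deployer" {
  project     = var.deployer_project_id
  role_id     = "airbyteDeployer"
  title       = "Airbyte Deployer"
  description = "Least-privilege role for deploying the Airbyte module"

  permissions = [
    "compute.addresses.create",
    "compute.addresses.delete",
    "compute.disks.create",
    "compute.firewalls.create",
    "compute.firewalls.delete",
    "compute.instances.create",
    "compute.instances.delete",
    "compute.instances.setMetadata",
    "compute.instances.setServiceAccount",
    "compute.networks.create",
    "compute.networks.delete",
    "compute.networks.updatePolicy",
    "compute.routers.create",
    "compute.routers.delete",
    "compute.routers.update",
    "compute.routes.create",
    "compute.routes.delete",
    "compute.subnetworks.create",
    "compute.subnetworks.delete",
    "compute.subnetworks.use",
    "iam.serviceAccounts.create",
    "iam.serviceAccounts.delete",
    "resourcemanager.projects.setIamPolicy",
  ]
}
```

The role would then be granted to whichever identity runs Terraform, in place of the three recommended predefined roles.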
See the examples directory for deployment code samples.
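
For orientation, a module call might look like the sketch below. It relies only on the inputs and outputs documented in this README (`project_id`, `region`, `zone`, the optional `config`, and the `airbyte_vm_name` output). The `source` path and every value are placeholders and assumptions, not taken from the module's own examples.

```hcl
# Minimal sketch of a module call; all values are placeholders.
module "airbyte" {
  # Hypothetical path: replace with the actual location of this module.
  source = "../.."

  project_id = "my-gcp-project" # placeholder project id
  region     = "europe-west1"   # placeholder region
  zone       = "europe-west1-b" # placeholder zone, must belong to the region

  # config is optional and defaults to {}, so it is omitted here.
}

# Expose the VM name reported by the module.
output "airbyte_vm_name" {
  value = module.airbyte.airbyte_vm_name
}
```

Leaving `config` at its default keeps the sketch independent of the object's fields, which are not shown in full in this README.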
Once the deployment has finished successfully, open an SSH tunnel to your Airbyte instance:
$ gcloud compute ssh airbyte -- -L 8000:localhost:8000 -N -f
You can then access Airbyte in your browser at http://localhost:8000, the local port forwarded by the tunnel.
You can also manage the ELT flows themselves with Terraform, using the Artefactory terraform-google-airbyte-flows module or the Airbyte community provider.
This module will provision the following resources.
As a security precaution, the Airbyte VM is not reachable from the internet. The only way to access it is through gcloud-authenticated SSH. Egress traffic is allowed so that Airbyte can pull data from remote sources.
The Airbyte service account has a fairly high level of privilege on GCS and BigQuery (roles/storage.objectAdmin and roles/bigquery.dataEditor), allowing it to read and write any table or bucket. These permissions are required so that Airbyte can create temporary datasets and tables. If that is an issue for you, you can isolate this deployment in a dedicated project.
No requirements.
Name | Version |
---|---|
google | n/a |
No modules.
Name | Type |
---|---|
google_compute_address.airbyte_external_ip | resource |
google_compute_firewall.allow_internal_traffic | resource |
google_compute_firewall.allow_ssh_from_iap | resource |
google_compute_instance.airbyte_vm | resource |
google_compute_network.airbyte_vpc | resource |
google_compute_route.internet_route | resource |
google_compute_router.router | resource |
google_compute_router_nat.airbyte_nat | resource |
google_compute_subnetwork.airbyte_subnet | resource |
google_project_iam_member.airbyte_iam | resource |
google_service_account.airbyte | resource |
Name | Description | Type | Default | Required |
---|---|---|---|---|
config | Configuration for the Airbyte VM | object({ | {} | no |
project_id | GCP project id | string | n/a | yes |
region | GCP region | string | n/a | yes |
zone | GCP zone | string | n/a | yes |
Name | Description |
---|---|
airbute_nat | n/a |
airbyte_address | n/a |
airbyte_router | n/a |
airbyte_service_account | n/a |
airbyte_subnet | n/a |
airbyte_vm_name | n/a |
airbyte_vpc | n/a |