Deployment Strategies

This page describes deployment strategies for the Gaia toolchain and the institutional constraints and use cases each one suits.


Deployment Options

  1. gaiaDocker (Docker Compose) - Official deployment method (recommended)
  2. Podman - Daemonless alternative to Docker
  3. Ansible - Infrastructure-as-code for bare metal/VMs
  4. Manual - Traditional installation
  5. Kubernetes - Production-scale (in development)
  6. Cloud - AWS, Azure, GCP (planned)


Podman

Daemonless alternative to Docker, preferred by some IT/security departments.

# Install
sudo dnf install podman podman-compose  # RHEL/CentOS/Fedora
brew install podman podman-compose  # macOS

# Deploy
podman machine init && podman machine start  # macOS/Windows only; Linux runs containers natively
git clone https://github.com/OHDSI/gaiaDocker.git && cd gaiaDocker
podman-compose --profile gaia up -d
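
On hosts where either Podman or Docker may be installed, a wrapper script can pick the tool at run time. The following is a minimal sketch and not part of gaiaDocker; the helper name first_available is hypothetical.

```shell
# Hypothetical helper (not part of gaiaDocker): print the first command
# from the argument list that is found on PATH; fail if none is found.
first_available() {
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}
```

For example, compose=$(first_available podman-compose docker-compose) would prefer podman-compose and fall back to a standalone docker-compose binary if one is installed; the result could then be run as "$compose" --profile gaia up -d from the gaiaDocker checkout.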


Ansible

Infrastructure-as-code for bare metal, VMs, and air-gapped environments.

Prerequisites: Ansible 2.9+, Ubuntu 20.04+/RHEL 8+, SSH access
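
The version floor in the prerequisites can be checked from a script before a playbook runs. This is a sketch under assumptions: the helper name version_ge is hypothetical, and it relies on sort -V (version sort), which GNU coreutils and recent BSD/macOS sort provide.

```shell
# Hypothetical helper: succeed if dotted version $1 is at least $2.
# Relies on `sort -V`; note that it orders "2.9" before "2.9.0".
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With the installed version held in a variable (taken from the first line of ansible --version), version_ge "$installed" 2.9 || echo "Ansible too old" would flag an outdated control node.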

# Deploy
ansible-playbook -i inventory/production playbooks/deploy-full-stack.yml

# Use vault for secrets
ansible-vault create group_vars/production/vault.yml
ansible-playbook -i inventory/production playbooks/deploy-full-stack.yml --ask-vault-pass

See gaiaDb repository for example playbooks and roles.


Manual Installation

Use a manual installation when you need maximum control, must work within institutional constraints, or want to learn how the components fit together.

Requirements: Ubuntu 20.04+/RHEL 8+, 8GB+ RAM, 50GB+ disk, PostgreSQL 12+, PostGIS 3.0+
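
The RAM and disk minimums above can be checked before installing anything. The helpers below are a hypothetical pre-flight sketch; the names kib_to_gib and check_min are illustrative only.

```shell
# Hypothetical pre-flight helpers for the 8GB RAM / 50GB disk minimums.

# Convert a KiB count (as reported in /proc/meminfo or by `df -Pk`)
# to GiB, rounded to the nearest whole number.
kib_to_gib() {
  echo $(( ($1 + 524288) / 1048576 ))
}

# Print a pass/fail line and return non-zero when actual < required.
check_min() {
  label=$1
  actual=$2
  required=$3
  if [ "$actual" -ge "$required" ]; then
    echo "ok:   $label ${actual}GB (need ${required}GB)"
  else
    echo "FAIL: $label ${actual}GB (need ${required}GB)"
    return 1
  fi
}
```

On Linux, the inputs could come from awk '/^MemTotal:/ {print $2}' /proc/meminfo for RAM and from the "Available" column of df -Pk on the intended data directory for disk. Rounding to the nearest GiB is a deliberate choice: an 8GB machine typically reports slightly less than 8 GiB of usable memory.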

# 1. Install PostgreSQL/PostGIS (package names are for Ubuntu 22.04; other releases may differ)
sudo apt-get update
sudo apt-get install postgresql-14 postgresql-14-postgis-3

# 2. Configure database (the statements below are entered at the psql prompt)
sudo -u postgres psql
CREATE DATABASE gaiadb;
CREATE USER gaia WITH PASSWORD 'your_password';  -- replace with a strong password
GRANT ALL PRIVILEGES ON DATABASE gaiadb TO gaia;
\c gaiadb
CREATE EXTENSION postgis;
\q

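# 3. Clone gaiaDb and load the schema
git clone https://github.com/OHDSI/gaiaDb.git && cd gaiaDb
# -h localhost connects over TCP with password authentication; on a default Ubuntu
# install, a local-socket connection as gaia may be rejected by peer authentication
psql -h localhost -U gaia -d gaiadb -f schema/create_schema.sql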

See gaiaDb and gaiaCore repositories for detailed setup instructions.


Kubernetes & Cloud

Status: In development

  • Kubernetes: Helm charts, horizontal autoscaling, multi-site federation (planned Q2 2026)
  • AWS: Terraform modules, RDS PostgreSQL, ECS/Fargate (in development)
  • Azure: ARM templates, Azure Database for PostgreSQL, AKS (planned)
  • GCP: Cloud SQL, Cloud Run/GKE (planned)

Community contributions welcome.


Deployment Decision Matrix

Scenario                      Recommended Approach      Reason
----------------------------  ------------------------  ----------------------------
Local development             gaiaDocker                Easy setup, quick iteration
IT requires no Docker daemon  Podman + gaiaDocker       Daemonless, rootless
Air-gapped environment        Ansible + Manual          No internet, full control
Production, small scale       gaiaDocker + monitoring   Simple, proven, versioned
Production, large scale       Kubernetes                Scalability, resilience
Multi-site research network   Kubernetes + federation   Distributed architecture
Cloud-native                  Cloud platform            Managed services, scalability
Learning/testing              Manual                    Understand components
Institutional restrictions    Manual or Ansible         Maximum compatibility


Security

All deployments: Change default passwords, enable SSL/TLS, restrict network access, apply updates regularly, enable audit logging, take backups

Containers: Official images, vulnerability scanning, non-root user, secrets management, limited capabilities

Cloud: IAM roles, encryption at rest/transit, private subnets, VPC peering
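
Replacing default passwords is easier when strong values are generated rather than invented. The helper below is one possible sketch; the name gen_secret is hypothetical, and it assumes a system with /dev/urandom.

```shell
# Hypothetical helper: print a random 32-character hexadecimal secret
# (16 bytes read from /dev/urandom, hex-encoded with od).
gen_secret() {
  head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

The output could stand in for a placeholder such as 'your_password' in a CREATE USER statement, or be stored in an Ansible vault file instead of a plain-text variable.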


Monitoring

Tools: Prometheus/Grafana (metrics), ELK/Loki (logs), Jaeger (tracing)

Key metrics: Connection pool utilization, query performance, API response times, resource usage, disk space


Backup & Recovery

# Database backup
pg_dump -U gaia gaiadb > gaiadb_backup_$(date +%Y%m%d).sql

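# Docker volume backup (stop the database container first so the files are consistent;
# the volume name may carry a Compose project prefix, check with: docker volume ls)
docker run --rm -v gaiadb_data:/data:ro -v /backup:/backup ubuntu tar czf /backup/gaiadb_data.tar.gz -C /data .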

Test backups regularly, document recovery procedures, store off-site, define RTO/RPO targets.
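
As a first step toward testing backups, a script can reject dumps that were cut off mid-write. The sketch below assumes plain-text SQL dumps as produced by the pg_dump command above, which end with a "PostgreSQL database dump complete" comment; the helper name backup_looks_complete is hypothetical.

```shell
# Hypothetical helper: cheap sanity check for a plain-text pg_dump file.
# Succeeds only if the file is non-empty and its last lines contain the
# "dump complete" trailer that pg_dump writes at the end of SQL output.
backup_looks_complete() {
  [ -s "$1" ] || return 1
  tail -n 10 "$1" | grep -q 'PostgreSQL database dump complete'
}
```

A truncated dump fails this check, but passing it does not prove the dump restores cleanly; a periodic restore into a scratch database remains the real test.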


Performance Tuning

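PostgreSQL: Set shared_buffers to about 25% of RAM, raise work_mem (for example 256MB; it applies per sort or hash operation, so size it against the expected number of concurrent queries), and confirm parallel queries are enabled (max_parallel_workers_per_gather above 0)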
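
The 25% guideline for shared_buffers can be turned into a concrete value from the host's total memory. This is an illustrative sketch; the helper names are hypothetical, and the input is assumed to be total RAM in KiB (the MemTotal figure in /proc/meminfo on Linux).

```shell
# Hypothetical helper: given total RAM in KiB, print 25% of it in whole MiB.
shared_buffers_mb() {
  echo $(( $1 / 4 / 1024 ))
}

# Print, without executing, the matching ALTER SYSTEM statement.
shared_buffers_sql() {
  echo "ALTER SYSTEM SET shared_buffers = '$(shared_buffers_mb "$1")MB';"
}
```

For a host with 8 GiB of RAM (8388608 KiB) this yields 2048MB. The printed statement would be run as a superuser in psql, and a change to shared_buffers only takes effect after the PostgreSQL server is restarted.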

PostGIS: Create GIST spatial indexes, ANALYZE after bulk loads, VACUUM regularly

-- my_table is a placeholder; substitute the real table name
CREATE INDEX idx_my_table_geom ON my_table USING GIST (geom);
VACUUM ANALYZE my_table;


Troubleshooting

Connection refused: Confirm the database is running (docker compose ps or systemctl status postgresql), then check that the port is reachable with nc -zv localhost 5432
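
A "connection refused" error right after startup often just means the database is not accepting connections yet, so scripts may want to retry the port check instead of failing on the first attempt. A minimal sketch, with a hypothetical helper name retry:

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; succeed as soon as the command succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}
```

For example, retry 10 2 nc -z localhost 5432 would probe the PostgreSQL port up to ten times over roughly 20 seconds before giving up.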

Out of memory: Increase Docker memory limit or reduce batch size

Slow queries: Check spatial indexes exist (pg_indexes), create if missing (CREATE INDEX ... USING GIST)


Upgrades & Help

Upgrades: Backup → test in staging → review changelog → deploy → verify → monitor

Help: GIS Teams channel, GitHub issues, Friday meetings (10 AM ET)