OHDSI GIS WG
This page describes various deployment strategies for the Gaia toolchain, addressing different institutional constraints and use cases.
gaiaDocker: The official deployment method, with coordinated image builds and versioned releases.
Prerequisites: Docker Engine 20.10+, Docker Compose 2.0+, 8GB+ RAM, 50GB+ disk
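The version prerequisites above can be checked before cloning anything. The following is a minimal sketch, not part of gaiaDocker: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS).

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on `sort -V` (version sort) so that 20.9 is ordered before 20.10.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (requires Docker to be installed, so it is left commented out):
#   version_ge "$(docker version --format '{{.Server.Version}}')" 20.10 \
#     || echo "Docker Engine 20.10+ required"
#   version_ge "$(docker compose version --short)" 2.0 \
#     || echo "Docker Compose 2.0+ required"
```

Comparing with a version sort rather than a plain string or numeric comparison avoids treating 20.9 as newer than 20.10.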
# Clone gaiaDocker repository
git clone https://github.com/OHDSI/gaiaDocker.git
cd gaiaDocker
# Start the full Gaia stack
docker compose --profile gaia up -d
# Optional: Include degauss geocoding service
docker compose --profile gaia --profile degauss up -d
# Verify services are running
docker compose ps
# View logs
docker compose logs -f
# Check specific service
docker compose logs -f gaia-db
Customize the .env file to set database credentials, service ports, and resource limits. See the gaiaDocker repository for a full example.
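As a rough illustration of what such a file might contain, the sketch below prints a small .env body with a randomly generated password. The variable names are assumptions chosen for illustration (the user, database, and port values mirror those used elsewhere on this page); the names the stack actually reads are defined in the gaiaDocker repository.

```shell
# gen_password: print a random password built from 24 bytes of /dev/urandom,
# base64-encoded, with '/', '+', '=' and newlines stripped so the value
# needs no quoting in a .env file.
gen_password() {
  head -c 24 /dev/urandom | base64 | tr -d '/+=\n'
}

# render_env: print an illustrative .env body to stdout.
# NOTE: all variable names below are hypothetical, not taken from gaiaDocker.
render_env() {
  cat <<EOF
POSTGRES_USER=gaia
POSTGRES_DB=gaiadb
POSTGRES_PASSWORD=$(gen_password)
POSTGREST_PORT=3000
EOF
}

# Example usage: review the output first, then redirect it to a file.
#   render_env
```

Printing to stdout instead of writing .env directly avoids overwriting an existing configuration by accident.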
curl http://localhost:3000/ # Test PostgREST API
docker compose exec gaia-db psql -U gaia -d gaiadb # Connect to database
docker compose ps # Check service health
docker compose down # Stop (preserves data)
docker compose down -v # Stop and remove volumes (deletes data!)
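Services started with `up -d` may not accept connections immediately, so a one-off `curl` can fail even when the deployment is healthy. A small retry wrapper is one way to handle this in scripts; `retry_until` below is a hypothetical helper, not something shipped with gaiaDocker.

```shell
# retry_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example usage (assumes the PostgREST API is published on port 3000, as above):
#   retry_until 30 2 curl -fsS -o /dev/null http://localhost:3000/ \
#     && echo "API is up" || echo "API did not respond in time"
```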
Podman: A daemonless alternative to Docker, preferred by some IT/security departments.
# Install
sudo dnf install podman podman-compose # RHEL/CentOS/Fedora
brew install podman # macOS
# Deploy
podman machine init && podman machine start # macOS/Windows
git clone https://github.com/OHDSI/gaiaDocker.git && cd gaiaDocker
podman-compose --profile gaia up -d
Ansible: Infrastructure-as-code for bare metal, VMs, and air-gapped environments.
Prerequisites: Ansible 2.9+, Ubuntu 20.04+/RHEL 8+, SSH access
# Deploy
ansible-playbook -i inventory/production playbooks/deploy-full-stack.yml
# Use vault for secrets
ansible-vault create group_vars/production/vault.yml
ansible-playbook playbook.yml --ask-vault-pass
See gaiaDb repository for example playbooks and roles.
Manual installation: For sites that need maximum control, face institutional constraints, or want to learn how the components fit together.
Requirements: Ubuntu 20.04+/RHEL 8+, 8GB+ RAM, 50GB+ disk, PostgreSQL 12+, PostGIS 3.0+
# 1. Install PostgreSQL/PostGIS
sudo apt-get install postgresql-14 postgresql-14-postgis-3
# 2. Configure database
sudo -u postgres psql
CREATE DATABASE gaiadb;
CREATE USER gaia WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE gaiadb TO gaia;
\c gaiadb
CREATE EXTENSION postgis;
# 3. Clone gaiaDb
git clone https://github.com/OHDSI/gaiaDb.git && cd gaiaDb
psql -U gaia -d gaiadb -f schema/create_schema.sql
See gaiaDb and gaiaCore repositories for detailed setup instructions.
Status: In development
Community contributions welcome.
| Scenario | Recommended Approach | Reason |
|---|---|---|
| Local development | gaiaDocker | Easy setup, quick iteration |
| IT requires no Docker daemon | Podman + gaiaDocker | Daemonless, rootless |
| Air-gapped environment | Ansible + Manual | No internet, full control |
| Production, small scale | gaiaDocker + monitoring | Simple, proven, versioned |
| Production, large scale | Kubernetes | Scalability, resilience |
| Multi-site research network | Kubernetes + federation | Distributed architecture |
| Cloud-native | Cloud platform | Managed services, scalability |
| Learning/testing | Manual | Understand components |
| Institutional restrictions | Manual or Ansible | Maximum compatibility |
All deployments: Change default passwords, enable SSL/TLS, restrict network access, regular updates, audit logging, backups
Containers: Official images, vulnerability scanning, non-root user, secrets management, limited capabilities
Cloud: IAM roles, encryption at rest/transit, private subnets, VPC peering
Tools: Prometheus/Grafana (metrics), ELK/Loki (logs), Jaeger (tracing)
Key metrics: Connection pool utilization, query performance, API response times, resource usage, disk space
# Database backup
pg_dump -U gaia gaiadb > gaiadb_backup_$(date +%Y%m%d).sql
# Docker volume backup
docker run --rm -v gaiadb_data:/data -v /backup:/backup ubuntu tar czf /backup/gaiadb_data.tar.gz /data
Test backups regularly, document recovery procedures, store off-site, define RTO/RPO targets.
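Date-stamped dumps accumulate quickly, so a retention step usually accompanies the backup command. The sketch below keeps only the newest N dumps in a directory. `prune_backups` is a hypothetical helper, and it assumes files follow the `gaiadb_backup_YYYYMMDD.sql` naming used above, so that lexical order matches chronological order.

```shell
# prune_backups DIR KEEP
# Deletes the oldest gaiadb_backup_*.sql files in DIR until at most KEEP remain.
prune_backups() {
  dir=$1
  keep=$2
  # Glob expansion is sorted, so $1 is the oldest dump by name.
  set -- "$dir"/gaiadb_backup_*.sql
  # If nothing matched, the pattern is left unexpanded and there is nothing to do.
  [ -e "$1" ] || return 0
  total=$#
  while [ "$total" -gt "$keep" ]; do
    rm -- "$1"
    shift
    total=$((total - 1))
  done
}

# Example usage: keep the 14 most recent dumps in /backup.
#   prune_backups /backup 14
```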
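PostgreSQL: Increase shared_buffers (about 25% of RAM) and work_mem (e.g., 256MB; this limit applies per sort or hash operation, so size it against expected concurrency), and enable parallel queries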
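The 25% guideline for shared_buffers is simple arithmetic on total memory. As a sketch, the hypothetical helper below takes total RAM in kB (defaulting to `MemTotal` from `/proc/meminfo`, which exists only on Linux) and prints a value in MB. For a nominal 8 GB host (8388608 kB) it yields 2048.

```shell
# suggest_shared_buffers_mb [TOTAL_KB]
# Prints 25% of total RAM in whole megabytes.
# With no argument, reads MemTotal (in kB) from /proc/meminfo (Linux only).
suggest_shared_buffers_mb() {
  total_kb=${1:-$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)}
  echo $(( total_kb / 4 / 1024 ))
}

# Example usage: print a line suitable for postgresql.conf.
#   echo "shared_buffers = $(suggest_shared_buffers_mb)MB"
```

Treat the result as a starting point. The right value depends on what else runs on the host, and containers with memory limits see less than the host total.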
PostGIS: Create GIST spatial indexes, ANALYZE after bulk loads, VACUUM regularly
-- my_table is a placeholder; TABLE itself is a reserved word and cannot be used unquoted
CREATE INDEX idx_geom ON my_table USING GIST (geom);
VACUUM ANALYZE my_table;
Connection refused: Check docker compose ps or systemctl status postgresql, then verify the network with nc -zv localhost 5432
Out of memory: Increase Docker memory limit or reduce batch size
Slow queries: Check that spatial indexes exist (pg_indexes); create them if missing (CREATE INDEX ... USING GIST)
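One way to run the index check described above is to compare PostGIS's `geometry_columns` view against `pg_indexes`. The sketch below wraps such a query in a hypothetical helper, `missing_gist_sql`, which only prints SQL. The match on `indexdef` is a substring heuristic, so treat the result as a list of candidates to inspect, not a definitive report.

```shell
# missing_gist_sql: print a query listing geometry columns that appear to
# have no GiST index. Matching is heuristic: it looks for "USING gist" and
# the column name inside each index definition.
missing_gist_sql() {
  cat <<'SQL'
SELECT g.f_table_schema, g.f_table_name, g.f_geometry_column
FROM geometry_columns AS g
WHERE NOT EXISTS (
  SELECT 1
  FROM pg_indexes AS i
  WHERE i.schemaname = g.f_table_schema
    AND i.tablename  = g.f_table_name
    AND i.indexdef ILIKE '%USING gist%'
    AND i.indexdef ILIKE '%' || g.f_geometry_column || '%'
);
SQL
}

# Example usage (assumes the gaia user and gaiadb database from this page):
#   missing_gist_sql | psql -U gaia -d gaiadb
```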
Upgrades: Backup → test in staging → review changelog → deploy → verify → monitor
Help: GIS Teams channel, GitHub issues, Friday meetings (10 AM ET), houghtaling@ohdsi.org