Deployment

Docker Compose (Development)

docker compose up --build

Starts all four services: API, Admin, PostgreSQL, MinIO.

Services

| Service | Build | Image | Ports |
|---------|-------|-------|-------|
| api | ./Dockerfile | python:3.14-slim + uv | 8000 |
| admin | ./admin/Dockerfile | node:24-slim → node:24-alpine | 4321 |
| db | | postgres:18-alpine | 5432 |
| minio | | minio/minio:latest | 9000, 9001 |

Build Args

  • FACE_API_MODEL: Model to download at build time (default: scrfd_10g)
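To bake a different model into the image, the arg can be overridden in docker-compose.yml. A sketch, assuming the usual `build.args` layout (the actual build section may differ):

```yaml
services:
  api:
    build:
      context: .
      args:
        FACE_API_MODEL: scrfd_10g  # replace with another supported model name
```

`docker compose build api` then downloads the chosen model at build time.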

Volumes

  • pgdata: PostgreSQL data directory
  • minio-data: MinIO object storage

Production Configuration

API Service

environment:
  FACE_API_MODEL: scrfd_10g
  FACE_API_STORAGE_ENABLED: "true"
  FACE_API_DATABASE_URL: "postgresql+asyncpg://user:pass@db-host:5432/faceapi"
  FACE_API_S3_ENDPOINT: "https://s3.amazonaws.com"
  FACE_API_S3_ACCESS_KEY: "${AWS_ACCESS_KEY_ID}"
  FACE_API_S3_SECRET_KEY: "${AWS_SECRET_ACCESS_KEY}"
  FACE_API_S3_BUCKET: "xylolabs-face-api-prod"
  FACE_API_S3_REGION: "us-east-1"
  FACE_API_CORS_ORIGINS: "https://admin.face-api.xylolabs.com"
  FACE_API_ORT_THREADS: "2"

Concurrency Tuning

| Variable | Default | Description |
|----------|---------|-------------|
| FACE_API_MAX_CONCURRENT_INFERENCE | 2 | Max parallel ONNX inference calls per worker |
| FACE_API_MAX_CONCURRENT_REQUESTS | 10 | Max in-flight requests before the API returns 503 |
| FACE_API_ORT_THREADS | 0 (auto) | ONNX Runtime threads per inference call |
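The inference cap behaves like a semaphore wrapped around model calls: callers beyond the limit queue instead of oversubscribing the CPU. A minimal illustrative sketch (not the service's actual code; all names here are made up):

```python
import asyncio

MAX_CONCURRENT_INFERENCE = 2  # mirrors FACE_API_MAX_CONCURRENT_INFERENCE

async def handle_batch(n: int) -> int:
    """Run n fake inference calls; return the peak number running at once."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_INFERENCE)
    state = {"active": 0, "peak": 0}

    async def run_inference() -> None:
        async with sem:  # callers beyond the cap wait here
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
            await asyncio.sleep(0.01)  # stand-in for the ONNX session call
            state["active"] -= 1

    await asyncio.gather(*(run_inference() for _ in range(n)))
    return state["peak"]

print(asyncio.run(handle_batch(8)))  # peak never exceeds 2
```

The request cap works the same way one level up, except that a full queue returns 503 instead of waiting.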

Scaling

  • Workers: --workers N in the uvicorn CMD (default: 1). Each worker loads the model separately (~200MB).
  • ORT Threads: Set to core count (e.g., FACE_API_ORT_THREADS=2 for 2 cores).
  • Inference Concurrency: Keep MAX_CONCURRENT_INFERENCE <= core count to avoid CPU thrashing.
  • DB Pool: SQLAlchemy pool_size=10, max_overflow=20 (configured in database.py).
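The worker count from the first bullet is baked into the image's CMD. A sketch, assuming uvicorn serves an app at `app.main:app` (the real module path may differ):

```dockerfile
# Dockerfile (sketch); the module path app.main:app is an assumption
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]
```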

Health Check

curl http://localhost:8000/health

The Docker HEALTHCHECK runs every 30s, with a 10s startup grace period.
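That schedule corresponds to a Dockerfile directive along these lines (timeout and retries here are illustrative defaults, not taken from the actual Dockerfile):

```dockerfile
HEALTHCHECK --interval=30s --start-period=10s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:8000/health || exit 1
```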

Domain Routing

| Domain | Target |
|--------|--------|
| face-api.xylolabs.com | API service :8000 |
| admin.face-api.xylolabs.com | Admin UI :4321 |
| docs.face-api.xylolabs.com | MkDocs static site |
| static.face-api.xylolabs.com | MinIO :9000 (public assets) |

Use nginx as a reverse proxy. All domains have Let's Encrypt SSL certificates with auto-renewal.
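Each domain gets its own server block. A minimal sketch for the API domain (the proxy headers and local port are typical choices, not taken from the actual config; certificate paths are left to certbot):

```nginx
server {
    listen 443 ssl;
    server_name face-api.xylolabs.com;

    # ssl_certificate / ssl_certificate_key lines are managed by certbot

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```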

Production (Oracle Cloud)

The API is deployed at 130.162.132.159 (Oracle Cloud, Ampere Altra ARM, 2 cores / 4GB RAM).

Infrastructure

| Component | Details |
|-----------|---------|
| VM | OCI Ampere A1 (aarch64), Ubuntu 24.04 |
| Docker | Docker Engine 29.x + Compose v5.x |
| Reverse Proxy | nginx 1.24 with HTTP→HTTPS redirect |
| SSL | Let's Encrypt via certbot (auto-renews) |
| Domains | face-api.xylolabs.com, admin.face-api.xylolabs.com, docs.face-api.xylolabs.com, static.face-api.xylolabs.com |

Tuning for 2-core/4GB

# docker-compose.yml overrides
FACE_API_MAX_CONCURRENT_INFERENCE: "1"
FACE_API_MAX_CONCURRENT_REQUESTS: "8"
FACE_API_ORT_THREADS: "2"
# Dockerfile: --workers 1 (avoid double model loading)
# Memory limit: 768MB for API container
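The 768MB cap maps to a compose-level limit. A sketch using `mem_limit` (`deploy.resources.limits.memory` is the Compose-spec equivalent):

```yaml
services:
  api:
    mem_limit: 768m  # hard memory cap for the API container
```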

SSH Access

ssh -i ~/.ssh/xylolabs-prod.pem ubuntu@130.162.132.159

Redeploy

# From local machine
rsync -avz --exclude='.venv' --exclude='.git' --exclude='models/*.onnx' \
  --exclude='node_modules' --exclude='__pycache__' \
  -e "ssh -i ~/.ssh/xylolabs-prod.pem" \
  ./ ubuntu@130.162.132.159:~/face-api/

# On remote
cd ~/face-api && docker compose up -d --build

Nginx Config

Located at /etc/nginx/sites-available/face-api.conf. Certbot manages SSL directives automatically.

Firewall

Ports 22, 80, 443 open in iptables (persisted via netfilter-persistent). OCI Security List must also allow ingress on TCP 80/443.
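The iptables setup can be reproduced roughly as follows (a sketch; existing chains and rule order on the VM may differ):

```shell
# Allow SSH, HTTP, and HTTPS, then persist the rules across reboots
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT
sudo netfilter-persistent save
```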

Environment Variables

Full list in README.md.

Migration Path

The current setup uses Base.metadata.create_all() on startup for schema creation. For production migrations:

  1. Add Alembic: uv add alembic
  2. Initialize: alembic init migrations
  3. Configure migrations/env.py to use async engine
  4. Generate migrations: alembic revision --autogenerate -m "description"
  5. Apply: alembic upgrade head
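Step 3 usually amounts to pointing env.py at the async engine and the models' metadata. A sketch, assuming Base is importable from something like app.database (the path is a guess) and omitting Alembic's offline mode:

```python
# migrations/env.py (sketch; module paths are assumptions)
import asyncio

from alembic import context
from sqlalchemy.ext.asyncio import create_async_engine

from app.database import Base  # assumption: where Base.metadata lives

target_metadata = Base.metadata

def do_run_migrations(connection):
    # Runs synchronously inside the async connection via run_sync()
    context.configure(connection=connection, target_metadata=target_metadata)
    with context.begin_transaction():
        context.run_migrations()

async def run_migrations_online():
    engine = create_async_engine(context.config.get_main_option("sqlalchemy.url"))
    async with engine.connect() as conn:
        await conn.run_sync(do_run_migrations)
    await engine.dispose()

asyncio.run(run_migrations_online())
```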

The PostgreSQL service in docker-compose can later be moved to a separate database server; just update FACE_API_DATABASE_URL to point at it.