Deployment¶
Docker Compose (Development)¶
Starts all four services: API, Admin, PostgreSQL, MinIO.
Services¶
| Service | Build | Image | Ports |
|---|---|---|---|
| api | ./Dockerfile | python:3.14-slim + uv | 8000 |
| admin | ./admin/Dockerfile | node:24-slim → node:24-alpine | 4321 |
| db | — | postgres:18-alpine | 5432 |
| minio | — | minio/minio:latest | 9000, 9001 |
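Assembled from the table above, a development compose file might look roughly like the sketch below. Service names, images, and ports come from the table; the volume mount paths, `depends_on` wiring, and MinIO `command` are common conventions, not confirmed details of this repo:

```yaml
services:
  api:
    build: .                      # ./Dockerfile, python:3.14-slim + uv
    ports: ["8000:8000"]
    depends_on: [db, minio]
  admin:
    build: ./admin                # node:24-slim build stage, node:24-alpine runtime
    ports: ["4321:4321"]
  db:
    image: postgres:18-alpine
    ports: ["5432:5432"]
    volumes: [pgdata:/var/lib/postgresql/data]
  minio:
    image: minio/minio:latest
    command: server /data --console-address ":9001"
    ports: ["9000:9000", "9001:9001"]
    volumes: [minio-data:/data]

volumes:
  pgdata:
  minio-data:
```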
Build Args¶
- `FACE_API_MODEL`: Model to download at build time (default: `scrfd_10g`)
Volumes¶
- `pgdata`: PostgreSQL data directory
- `minio-data`: MinIO object storage
Production Configuration¶
API Service¶
```yaml
environment:
  FACE_API_MODEL: scrfd_10g
  FACE_API_STORAGE_ENABLED: "true"
  FACE_API_DATABASE_URL: "postgresql+asyncpg://user:pass@db-host:5432/faceapi"
  FACE_API_S3_ENDPOINT: "https://s3.amazonaws.com"
  FACE_API_S3_ACCESS_KEY: "${AWS_ACCESS_KEY_ID}"
  FACE_API_S3_SECRET_KEY: "${AWS_SECRET_ACCESS_KEY}"
  FACE_API_S3_BUCKET: "xylolabs-face-api-prod"
  FACE_API_S3_REGION: "us-east-1"
  FACE_API_CORS_ORIGINS: "https://admin.face-api.xylolabs.com"
  FACE_API_ORT_THREADS: "2"
```
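The service presumably maps these `FACE_API_`-prefixed variables onto typed settings. A minimal stdlib sketch of that convention (the real app may well use a settings library instead; `load_settings` is illustrative, not the app's actual code):

```python
import os

def load_settings(environ=os.environ, prefix="FACE_API_"):
    """Collect prefixed env vars into a plain dict with lower-cased keys."""
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }

# Example with an explicit environment instead of the real os.environ
settings = load_settings({"FACE_API_ORT_THREADS": "2", "PATH": "/usr/bin"})
# settings == {"ort_threads": "2"}
```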
Concurrency Tuning¶
| Variable | Default | Description |
|---|---|---|
| FACE_API_MAX_CONCURRENT_INFERENCE | 2 | Max parallel ONNX inference calls per worker |
| FACE_API_MAX_CONCURRENT_REQUESTS | 10 | Max in-flight requests before 503 |
| FACE_API_ORT_THREADS | 0 (auto) | ONNX Runtime threads per inference call |
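One plausible way the two caps interact, sketched with asyncio: request slots shed load with 503 when exhausted, while inference calls beyond the smaller cap queue. `handle_request` and the constants are illustrative; the app's actual middleware may differ:

```python
import asyncio

MAX_CONCURRENT_INFERENCE = 2   # FACE_API_MAX_CONCURRENT_INFERENCE
MAX_CONCURRENT_REQUESTS = 10   # FACE_API_MAX_CONCURRENT_REQUESTS

inference_sem = asyncio.Semaphore(MAX_CONCURRENT_INFERENCE)
request_slots = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)

async def handle_request(run_inference):
    # All request slots taken: reject immediately with 503
    if request_slots.locked():
        return 503
    async with request_slots:
        # Inference beyond the cap waits here instead of thrashing the CPU
        async with inference_sem:
            return await run_inference()

async def demo():
    async def fake_inference():
        await asyncio.sleep(0)
        return 200
    return await handle_request(fake_inference)
```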
Scaling¶
- **Workers:** `--workers N` in the uvicorn CMD (default: 1). Each worker loads the model separately (~200MB).
- **ORT Threads:** Set to core count (e.g., `FACE_API_ORT_THREADS=2` for 2 cores).
- **Inference Concurrency:** Keep `MAX_CONCURRENT_INFERENCE` <= core count to avoid CPU thrashing.
- **DB Pool:** SQLAlchemy `pool_size=10`, `max_overflow=20` (configured in `database.py`).
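The pool numbers above imply a hard ceiling on Postgres connections, since each uvicorn worker gets its own pool. A quick sanity check using the defaults from the bullets:

```python
workers = 1          # uvicorn default (Workers bullet)
pool_size = 10       # SQLAlchemy pool_size
max_overflow = 20    # SQLAlchemy max_overflow

# Per-worker ceiling times worker count
max_db_connections = workers * (pool_size + max_overflow)
print(max_db_connections)  # 30, well under PostgreSQL's default max_connections of 100
```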
Health Check¶
Docker HEALTHCHECK runs every 30s with 10s startup grace period.
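A Dockerfile directive matching that cadence might look like the fragment below. The `/health` path, the 5s timeout, the retry count, and curl being present in the image are all assumptions; only the 30s interval and 10s start period come from the text above:

```dockerfile
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -fsS http://localhost:8000/health || exit 1
```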
Domain Routing¶
| Domain | Target |
|---|---|
| face-api.xylolabs.com | API service :8000 |
| admin.face-api.xylolabs.com | Admin UI :4321 |
| docs.face-api.xylolabs.com | MkDocs static site |
| static.face-api.xylolabs.com | MinIO :9000 (public assets) |
Use nginx as reverse proxy. All domains have Let's Encrypt SSL with auto-renewal.
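For one of the domains above, the nginx server block would follow the usual reverse-proxy shape; this is a sketch (the actual `face-api.conf` is not reproduced here), and certbot rewrites it to add the SSL directives and the HTTPS redirect:

```nginx
server {
    listen 80;
    server_name face-api.xylolabs.com;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```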
Production (Oracle Cloud)¶
The API is deployed at 130.162.132.159 (Oracle Cloud, Ampere Altra ARM, 2 cores / 4GB RAM).
Infrastructure¶
| Component | Details |
|---|---|
| VM | OCI Ampere A1 (aarch64), Ubuntu 24.04 |
| Docker | Docker Engine 29.x + Compose v5.x |
| Reverse Proxy | nginx 1.24 with HTTP→HTTPS redirect |
| SSL | Let's Encrypt via certbot (auto-renews) |
| Domains | face-api.xylolabs.com, admin.face-api.xylolabs.com, docs.face-api.xylolabs.com, static.face-api.xylolabs.com |
Tuning for 2-core/4GB¶
```yaml
# docker-compose.yml overrides
FACE_API_MAX_CONCURRENT_INFERENCE: "1"
FACE_API_MAX_CONCURRENT_REQUESTS: "8"
FACE_API_ORT_THREADS: "2"
# Dockerfile: --workers 1 (avoid double model loading)
# Memory limit: 768MB for API container
```
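A rough memory budget for the 768MB container limit under these settings; the model size comes from the Scaling notes, while the runtime overhead figure is a guess, not a measurement:

```python
workers = 1               # from the override above
model_mb = 200            # ~200MB model per worker (Scaling notes)
runtime_overhead_mb = 300 # assumed: Python + ONNX Runtime + request buffers

estimated_mb = workers * model_mb + runtime_overhead_mb
container_limit_mb = 768
headroom_mb = container_limit_mb - estimated_mb  # 268
```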
SSH Access¶
Redeploy¶
```bash
# From local machine
rsync -avz --exclude='.venv' --exclude='.git' --exclude='models/*.onnx' \
  --exclude='node_modules' --exclude='__pycache__' \
  -e "ssh -i ~/.ssh/xylolabs-prod.pem" \
  ./ ubuntu@130.162.132.159:~/face-api/

# On remote
cd ~/face-api && docker compose up -d --build
```
Nginx Config¶
Located at /etc/nginx/sites-available/face-api.conf. Certbot manages SSL directives automatically.
Firewall¶
Ports 22, 80, 443 open in iptables (persisted via netfilter-persistent). OCI Security List must also allow ingress on TCP 80/443.
Environment Variables¶
Full list in README.md.
Migration Path¶
The current setup uses `Base.metadata.create_all()` on startup for schema creation. For production migrations:
1. Add Alembic: `uv add alembic`
2. Initialize: `alembic init migrations`
3. Configure `migrations/env.py` to use the async engine
4. Generate migrations: `alembic revision --autogenerate -m "description"`
5. Apply: `alembic upgrade head`
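For step 3, the async wiring in `migrations/env.py` typically follows Alembic's asyncio template; a trimmed sketch under that assumption (`target_metadata` would point at this project's `Base.metadata`, and the URL would come from `FACE_API_DATABASE_URL`):

```python
# migrations/env.py (sketch, after Alembic's asyncio template)
import asyncio
from alembic import context
from sqlalchemy.ext.asyncio import create_async_engine

target_metadata = None  # replace with the app's Base.metadata

def do_run_migrations(connection):
    context.configure(connection=connection, target_metadata=target_metadata)
    with context.begin_transaction():
        context.run_migrations()

async def run_async_migrations():
    engine = create_async_engine(context.config.get_main_option("sqlalchemy.url"))
    async with engine.connect() as connection:
        # Bridge the sync migration runner onto the async connection
        await connection.run_sync(do_run_migrations)
    await engine.dispose()

asyncio.run(run_async_migrations())
```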
The PostgreSQL service in docker-compose is designed to be migrated to a separate database server; to switch, point `FACE_API_DATABASE_URL` at the new host.