Deploy Omni with Docker Compose for development or single-server production
Docker Compose is the simplest deployment option: all services run on a single node. For high availability, auto-scaling, or multi-region deployments, see AWS Deployment with Terraform.
Copy the example environment file:

```bash
cp .env.example .env
```

Edit `.env` and update the following variables:
```bash
# Database (generate a secure password, e.g. openssl rand -base64 32)
DATABASE_PASSWORD=<your_db_password>

# Security (generate secure keys, e.g. openssl rand -hex 16)
ENCRYPTION_KEY=<your_encryption_key>
ENCRYPTION_SALT=<your_encryption_salt>

# Application
OMNI_DOMAIN=<your_domain_name>
APP_URL=https://<your_domain_name>

# Enabled connectors: a comma-separated string of connector names.
# Pick and choose the connectors you wish to run: google, slack, atlassian, etc.
# (See the .env file for valid values.)
ENABLED_CONNECTORS=google,web
```
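The comments above suggest `openssl rand` for secret generation; a minimal sketch that produces all three secrets at once so you can paste them into `.env` (the variable names match the keys above, but the key lengths are just the examples from the comments, not a hard requirement):

```shell
# Generate the three secrets referenced in .env
DATABASE_PASSWORD=$(openssl rand -base64 32)   # 32 random bytes, base64-encoded
ENCRYPTION_KEY=$(openssl rand -hex 16)         # 16 random bytes as 32 hex chars
ENCRYPTION_SALT=$(openssl rand -hex 16)

printf 'DATABASE_PASSWORD=%s\nENCRYPTION_KEY=%s\nENCRYPTION_SALT=%s\n' \
  "$DATABASE_PASSWORD" "$ENCRYPTION_KEY" "$ENCRYPTION_SALT"
```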
To avoid repeating the Compose flags, define an alias:

```bash
alias omni-compose="docker compose -f docker/docker-compose.yml --env-file .env"
```
Start Omni:

```bash
omni-compose up -d
```
Monitor startup:

```bash
omni-compose logs -f
omni-compose ps  # all services should show "healthy"
```
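The "healthy" check can be automated in a startup script. A sketch that polls until no service reports a non-healthy status; the status-column format of `docker compose ps` (a `(healthy)` suffix in the table output) is an assumption, so adjust the grep if your version differs:

```shell
# Poll the Compose stack until every service line shows "(healthy)".
# $1 (optional, hypothetical hook for testing): command that prints the
# `docker compose ps` table; defaults to the full invocation behind the
# omni-compose alias above.
wait_for_healthy() {
  cmd=${1:-"docker compose -f docker/docker-compose.yml --env-file .env ps"}
  # tail -n +2 skips the table header; any remaining line without
  # "(healthy)" means at least one service is still starting or unhealthy.
  while $cmd | tail -n +2 | grep -v '(healthy)' | grep -q .; do
    echo "waiting for services to become healthy..."
    sleep 5
  done
  echo "all services healthy"
}
```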
Access Omni at https://<your_domain_name>. The first user to sign up becomes the admin. Once all services are healthy, follow the Initial Setup guide to configure LLM providers, embeddings, and connectors.
If you wish to use local inference (for either language or embedding models), you will almost certainly need GPU acceleration. To enable it, download the GPU override file and start the stack with it:

```bash
omni-compose -f docker/docker-compose.gpu.yml --profile local-embeddings --profile vllm up -d
```
Then, in the Web UI, configure vLLM as your LLM provider and "local" as your embedding provider. The compose stack runs separate containers for local LLM and embedding inference, so you can choose each independently of the other; for example, you could use a cloud LLM with a local embedding model.