This page provides a comprehensive reference for all environment variables and configuration options available in Omni.
All environment variables can be set in your .env file for Docker Compose deployments or in terraform.tfvars for AWS deployments.
Database Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_HOST | Yes | postgres | Database hostname or IP address |
| DATABASE_PORT | Yes | 5432 | Database port |
| DATABASE_USERNAME | Yes | omni | Database username |
| DATABASE_PASSWORD | Yes | - | Database password (use a strong password) |
| DATABASE_NAME | Yes | omni | Database name |
| DATABASE_SSL | No | false | Enable SSL for the database connection |
| DB_MAX_CONNECTIONS | No | 10 | Connection pool size per service |
| DB_ACQUIRE_TIMEOUT_SECONDS | No | 3 | Connection acquisition timeout, in seconds |
Redis Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| REDIS_URL | Yes | redis://redis:6379 | Redis connection URL (format: redis://host:port) |
For Redis with password:
REDIS_URL=redis://:password@redis:6379
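
Passwords that contain characters such as @, : or / would be ambiguous inside a URL unless they are percent-encoded. The sketch below builds and parses a REDIS_URL with Python's standard library. The helper name build_redis_url is hypothetical, and it assumes the Redis client decodes percent-encoded passwords, which is common but worth verifying for your deployment.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit


def build_redis_url(host: str, port: int = 6379, password: Optional[str] = None) -> str:
    """Return a redis:// URL, percent-encoding the password if one is given."""
    if password is None:
        return f"redis://{host}:{port}"
    # safe="" also encodes "/", "@" and ":" so they cannot be read as URL delimiters.
    return f"redis://:{quote(password, safe='')}@{host}:{port}"


url = build_redis_url("redis", password="p@ss:word")
parts = urlsplit(url)

print(url)                      # redis://:p%40ss%3Aword@redis:6379
print(parts.hostname, parts.port)
print(unquote(parts.password))  # p@ss:word
```
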
Application Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| APP_URL | Yes | http://localhost:3000 | Public-facing application URL (include protocol) |
| OMNI_DOMAIN | No | localhost | Domain name for the application |
| OMNI_VERSION | No | latest | Docker image version tag for all Omni services (e.g., 0.1.4) |
| SESSION_SECRET | Yes | - | Secret key for session encryption (32+ characters) |
| SESSION_COOKIE_NAME | No | auth-session | Name of the session cookie |
| SESSION_DURATION_DAYS | No | 7 | Session expiry in days |
| ACME_EMAIL | No | - | Email for Let’s Encrypt notifications (for automatic HTTPS) |
Never use the same SESSION_SECRET across different environments. Generate unique secrets for dev, staging, and production.
Security & Encryption
| Variable | Required | Default | Description |
|---|---|---|---|
| ENCRYPTION_KEY | Yes | - | Encryption key for sensitive credentials (32+ characters) |
| ENCRYPTION_SALT | Yes | - | Salt for key derivation (16+ characters) |
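
One way to produce values that satisfy the length requirements for SESSION_SECRET, ENCRYPTION_KEY, and ENCRYPTION_SALT is Python's secrets module. This is a minimal sketch, not an official Omni tool; the function name generate_omni_secrets is hypothetical, and the byte counts are chosen simply to exceed the documented minimums.

```python
import secrets


def generate_omni_secrets() -> dict:
    """Return fresh random values for the three required secrets."""
    return {
        # token_urlsafe(48) yields 64 URL-safe characters (minimum is 32).
        "SESSION_SECRET": secrets.token_urlsafe(48),
        "ENCRYPTION_KEY": secrets.token_urlsafe(48),
        # token_urlsafe(24) yields 32 URL-safe characters (minimum is 16).
        "ENCRYPTION_SALT": secrets.token_urlsafe(24),
    }


# Print lines that can be pasted into a .env file.
for name, value in generate_omni_secrets().items():
    print(f"{name}={value}")
```

Run it once per environment so that dev, staging, and production never share a value.
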
Service Ports
Core Services
| Variable | Required | Default | Description |
|---|---|---|---|
| WEB_PORT | No | 3000 | SvelteKit web application port |
| SEARCHER_PORT | No | 3001 | Search service port |
| INDEXER_PORT | No | 3002 | Indexer service port |
| AI_SERVICE_PORT | No | 3003 | AI service port |
| CONNECTOR_MANAGER_PORT | No | 3004 | Connector manager port |
Connector Services
| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_CONNECTOR_PORT | No | 4001 | Google connector port |
| SLACK_CONNECTOR_PORT | No | 4002 | Slack connector port |
| ATLASSIAN_CONNECTOR_PORT | No | 4003 | Atlassian connector port |
| WEB_CONNECTOR_PORT | No | 4004 | Web connector port |
| GITHUB_CONNECTOR_PORT | No | 4005 | GitHub connector port |
| HUBSPOT_CONNECTOR_PORT | No | 4006 | HubSpot connector port |
| MICROSOFT_CONNECTOR_PORT | No | 4007 | Microsoft 365 connector port |
| NOTION_CONNECTOR_PORT | No | 4008 | Notion connector port |
| FIREFLIES_CONNECTOR_PORT | No | 4009 | Fireflies connector port |
| IMAP_CONNECTOR_PORT | No | 4010 | IMAP email connector port |
| CLICKUP_CONNECTOR_PORT | No | 4011 | ClickUp connector port |
| LINEAR_CONNECTOR_PORT | No | 4012 | Linear connector port |
| FILESYSTEM_CONNECTOR_PORT | No | 4013 | Filesystem connector port |
| NEXTCLOUD_CONNECTOR_PORT | No | 4014 | Nextcloud connector port |
| PAPERLESS_CONNECTOR_PORT | No | 4015 | Paperless-ngx connector port |
Optional Services
| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_INFERENCE_MODEL_PORT | No | 8000 | Local LLM inference port (used by the llama.cpp container in docker-compose.local-inference.yml) |
| LOCAL_EMBEDDINGS_PORT | No | 8001 | Local embedding model port (used by the HuggingFace TEI container) |
| DOCLING_PORT | No | 8003 | Docling document conversion service port |
| SANDBOX_PORT | No | 8090 | Code execution sandbox port (used by the AI service for agent tools) |
In Docker Compose, services communicate via service names (e.g., http://searcher:3001). Ports only need to be exposed to the host for debugging.
Docker Compose Resource Limits
Docker Compose deployments expose CPU and memory controls in .env.example. Defaults are conservative for a small single-node deployment and can be raised on larger hosts.
| Variable group | Description |
|---|---|
| OMNI_*_CPUS, DOCLING_CPUS | Per-service CPU limits, expressed in cores |
| OMNI_*_MEMORY, DOCLING_MEMORY | Per-service container memory limits |
| OMNI_POSTGRES_SHM_SIZE | Shared memory allocated to Postgres |
| OMNI_CPU_SHARES_* | Relative CPU weights used when containers contend for CPU |
Inter-Service URLs
These URLs are used for internal communication between services. In Docker Compose, they use the service name and port variable interpolation.
Core Service URLs
| Variable | Required | Default | Description |
|---|---|---|---|
| SEARCHER_URL | Yes | http://searcher:${SEARCHER_PORT} | Search service URL |
| INDEXER_URL | Yes | http://indexer:${INDEXER_PORT} | Indexer service URL |
| AI_SERVICE_URL | Yes | http://ai:${AI_SERVICE_PORT} | AI service URL |
| CONNECTOR_MANAGER_URL | Yes | http://connector-manager:${CONNECTOR_MANAGER_PORT} | Connector manager URL |
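
To see what the ${...} placeholders in these defaults expand to, the sketch below mimics the substitution with Python's string.Template. Docker Compose performs this interpolation itself; the code is only an illustration, and the names DEFAULT_PORTS, URL_TEMPLATES, and resolve_urls are hypothetical.

```python
from string import Template

# Default port values from the Core Services table.
DEFAULT_PORTS = {
    "SEARCHER_PORT": "3001",
    "INDEXER_PORT": "3002",
    "AI_SERVICE_PORT": "3003",
    "CONNECTOR_MANAGER_PORT": "3004",
}

# Default URL templates from the Core Service URLs table.
URL_TEMPLATES = {
    "SEARCHER_URL": "http://searcher:${SEARCHER_PORT}",
    "INDEXER_URL": "http://indexer:${INDEXER_PORT}",
    "AI_SERVICE_URL": "http://ai:${AI_SERVICE_PORT}",
    "CONNECTOR_MANAGER_URL": "http://connector-manager:${CONNECTOR_MANAGER_PORT}",
}


def resolve_urls(overrides: dict) -> dict:
    """Expand every URL template using the defaults plus any port overrides."""
    ports = {**DEFAULT_PORTS, **overrides}
    return {name: Template(tpl).substitute(ports) for name, tpl in URL_TEMPLATES.items()}


print(resolve_urls({})["SEARCHER_URL"])                             # http://searcher:3001
print(resolve_urls({"AI_SERVICE_PORT": "9003"})["AI_SERVICE_URL"])  # http://ai:9003
```

Changing a port variable therefore changes the matching URL automatically, as long as the URL variable itself is left at its default.
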
Optional Service URLs
| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_EMBEDDINGS_URL | Conditional | http://embeddings:${LOCAL_EMBEDDINGS_PORT}/v1 | Local embeddings service URL (required if using the local embedding provider) |
| DOCLING_URL | No | http://docling:${DOCLING_PORT} | Docling document conversion service URL |
| SANDBOX_URL | No | http://sandbox:${SANDBOX_PORT} | Sandbox service URL (used by the AI service for agent code execution) |
LLM Provider Configuration
LLM providers and models are configured through the Admin Panel (Settings → LLM Providers). API keys and other secrets entered there are encrypted at rest in the database using ENCRYPTION_KEY and ENCRYPTION_SALT; they are not read from environment variables. The cloud credential fallbacks described in the AWS Bedrock and Vertex AI sections below are the exception. Multiple providers can be active simultaneously, and users can select which model to use on a per-chat basis.
Supported Providers
Omni supports seven LLM provider types:
| Provider | Required Config | Description |
|---|---|---|
| Anthropic | API Key | Direct access to Claude models via Anthropic’s API |
| OpenAI | API Key | Access to GPT models via OpenAI’s API |
| Google Gemini | API Key | Direct access to Gemini models via the Google AI Studio API |
| AWS Bedrock | AWS Region (+ optional credentials) | Claude and other models via AWS Bedrock |
| Vertex AI | GCP Region + Project ID | Claude and Gemini models via Google Cloud Vertex AI |
| Azure AI Foundry | Endpoint URL | Claude and GPT models via Azure AI Foundry |
| OpenAI-compatible | API URL (+ optional API key) | Any OpenAI-compatible endpoint — use this for self-hosted models via llama.cpp, vLLM, Ollama, LM Studio, etc. |
Predefined Models
When you add a provider, the following models are automatically available:
- Anthropic: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5
- OpenAI: GPT-5.2, GPT-5 Mini, GPT-4.1
- Google Gemini: Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite
- AWS Bedrock: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5, Amazon Nova Pro
- Vertex AI: Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash
- Azure AI Foundry: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5, GPT-5.2, GPT-5 Mini, GPT-4.1
- OpenAI-compatible: No predefined models; specify the model ID exposed by your endpoint.
Unlike the other provider types, OpenAI-compatible can be registered multiple times in a single instance. Each configured endpoint (for example, a local llama.cpp server, a separate vLLM instance, and an OpenRouter account) appears as its own provider card in the admin panel, and its models are independently selectable per chat.
AWS Bedrock Environment Variables
For Bedrock providers, the following environment variables are used as fallbacks when not configured in the admin panel:
| Variable | Required | Default | Description |
|---|---|---|---|
| AWS_REGION | No | - | AWS region for Bedrock (e.g., us-east-1). Used as fallback if not set in provider config |
| AWS_ACCESS_KEY_ID | Conditional | - | AWS access key (if not using IAM role) |
| AWS_SECRET_ACCESS_KEY | Conditional | - | AWS secret key (if not using IAM role) |
When running on EC2 or ECS with an appropriate IAM role, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not needed — the SDK uses instance credentials automatically.
Vertex AI Environment Variables
Vertex AI uses Google Cloud Application Default Credentials (ADC). Configure region and project_id in the admin panel.
| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_APPLICATION_CREDENTIALS | Conditional | - | Path to service account key JSON file (not needed when running on GCP with a service account attached) |
When running on GKE, Cloud Run, or Compute Engine with an attached service account, ADC is automatic — no environment variables are needed.
Azure AI Foundry Environment Variables
Azure AI Foundry authenticates through DefaultAzureCredential, which tries several credential sources in turn, including Managed Identity. Configure the endpoint_url in the admin panel.
When running on Azure (AKS, Container Apps, VMs) with Managed Identity configured, authentication is automatic. No additional environment variables are needed.
Self-hosted Local Inference
Omni ships a docker-compose.local-inference.yml overlay that starts a local llama.cpp container for LLM inference and, optionally, a HuggingFace TEI container for embeddings. Once the containers are running, register them in the admin UI: add the LLM as an OpenAI-compatible provider pointed at http://llama-cpp:${LOCAL_INFERENCE_MODEL_PORT}, and select Local as the embedding provider.
| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_INFERENCE_MODEL_PORT | No | 8000 | Port exposed by the llama.cpp container |
| LOCAL_EMBEDDINGS_PORT | No | 8001 | Port exposed by the HuggingFace TEI container |
| LOCAL_EMBEDDINGS_MODEL | No | - | HuggingFace model ID loaded by TEI |
| EMBEDDING_MAX_MODEL_LEN | No | 8192 | Maximum context length for the embedding model |
The local LLM and embedding containers can be enabled independently — for example, cloud LLM with a local embedding model, or vice versa.
Feature Flags
| Variable | Required | Default | Description |
|---|---|---|---|
| AI_ANSWER_ENABLED | No | false in .env.example | Enable or disable AI-generated answers in search results |
| AGENTS_ENABLED | No | false | Enable background AI agents (scheduled tasks) |
| MEMORY_ENABLED | No | false | Enable the Memory settings pages and AI-service memory provider |
| MEMORY_PROVIDER | No | mem0 | Memory backend provider. mem0 is the currently supported provider |
Memory Configuration
Memory lets Omni recall selected context across future chats and agent runs. Set MEMORY_ENABLED=true, configure an embedding provider in the admin UI, then use Settings → Memory to choose the organization-wide default mode and memory LLM.
| Mode | Behavior |
|---|---|
| Off | Memory is disabled for users and agents |
| Chat memory | Completed chat turns can be summarized into memories and recalled in later chats |
| Full memory | Chat memory plus agent run context for background agents |
The organization default is a ceiling: users can lower their personal memory level from Settings → Memory, but cannot choose a mode above the admin default.
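
The ceiling rule can be pictured as taking the lower of two ordered levels. The following is a minimal sketch of that logic, not Omni's actual implementation; the MemoryMode enum and effective_mode function are hypothetical names introduced for illustration.

```python
from enum import IntEnum
from typing import Optional


class MemoryMode(IntEnum):
    """Memory modes ordered from least to most permissive."""
    OFF = 0
    CHAT = 1
    FULL = 2


def effective_mode(org_default: MemoryMode, user_choice: Optional[MemoryMode] = None) -> MemoryMode:
    """Apply the organization default as a ceiling on the user's choice."""
    if user_choice is None:
        # The user has not set a preference, so the organization default applies.
        return org_default
    return min(org_default, user_choice)


print(effective_mode(MemoryMode.CHAT, MemoryMode.FULL).name)  # CHAT (capped by the org default)
print(effective_mode(MemoryMode.FULL, MemoryMode.OFF).name)   # OFF (user lowered their level)
```
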
Background Agents Configuration
Controls the background agent scheduler and execution limits. Requires AGENTS_ENABLED=true.
| Variable | Required | Default | Description |
|---|---|---|---|
| AGENT_SCHEDULER_POLL_INTERVAL | No | 30 | Seconds between scheduler poll checks |
| AGENT_MAX_CONCURRENT_RUNS | No | 3 | Maximum concurrent agent executions |
| AGENT_MAX_ITERATIONS | No | 15 | Maximum tool calls per agent run |
AI Service Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| AI_WORKERS | No | 2 | Number of uvicorn worker processes |
| MODEL_PATH | No | /models | Directory for model storage |
| APPROVAL_TIMEOUT_SECONDS | No | 600 | How long interactive-chat tool approval prompts wait for a user decision before expiring (background agents don’t use approval prompts) |
Token usage for every LLM call (chat, agent run, compaction, title generation) is recorded in the model_usage table in Postgres, broken down by user, provider, and model. No additional configuration is required.
Conversation Compaction
Controls automatic compaction of long conversations to stay within model context limits.
| Variable | Required | Default | Description |
|---|---|---|---|
| ENABLE_CONVERSATION_COMPACTION | No | true | Enable or disable conversation compaction |
| MAX_CONVERSATION_INPUT_TOKENS | No | 150000 | Maximum input tokens before compaction triggers |
| COMPACTION_RECENT_MESSAGES_COUNT | No | 20 | Number of recent messages to preserve during compaction |
| COMPACTION_SUMMARY_MAX_TOKENS | No | 2000 | Maximum tokens for the compaction summary |
| COMPACTION_CACHE_TTL_SECONDS | No | 86400 | Cache TTL for compaction results (default: 24 hours) |
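
How the token limit and the preserved-message count might interact is sketched below. This is an assumption about the general shape of the logic based only on the descriptions in the table; Omni's actual compaction algorithm is not documented here, and plan_compaction is a hypothetical name.

```python
def plan_compaction(token_counts, max_input_tokens=150_000, recent_count=20):
    """Split message indices into (to_summarize, to_keep).

    token_counts holds one token count per message, oldest first.
    """
    total = sum(token_counts)
    # Nothing to do while under the limit, or when every message counts as recent.
    if total <= max_input_tokens or len(token_counts) <= recent_count:
        return [], list(range(len(token_counts)))
    cut = len(token_counts) - recent_count
    # Older messages are summarized; the most recent ones are kept verbatim.
    return list(range(cut)), list(range(cut, len(token_counts)))


# 50 messages of 5,000 tokens each total 250,000 tokens, which exceeds the default limit.
to_summarize, to_keep = plan_compaction([5_000] * 50)
print(len(to_summarize), len(to_keep))  # 30 20
```
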
Searcher Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| RAG_CONTEXT_WINDOW | No | 2 | Number of surrounding chunks to fetch in RAG search |
| SEMANTIC_SEARCH_TIMEOUT_MS | No | 1000 | Timeout for semantic (vector) search in milliseconds |
| RECENCY_BOOST_WEIGHT | No | 0.2 | Weight for recency in search ranking (0.0–1.0) |
| RECENCY_HALF_LIFE_DAYS | No | 30.0 | Days for document relevance to decay to 50% |
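
A half-life setting usually implies exponential decay. Assuming that is what RECENCY_HALF_LIFE_DAYS means, the decay factor for a document of a given age can be sketched as follows. The function name recency_decay is hypothetical, and how Omni combines this factor with RECENCY_BOOST_WEIGHT is not specified on this page.

```python
def recency_decay(age_days: float, half_life_days: float = 30.0) -> float:
    """Return a factor in (0, 1] that halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


print(recency_decay(0))   # 1.0  (brand-new document)
print(recency_decay(30))  # 0.5  (one half-life old)
print(recency_decay(60))  # 0.25 (two half-lives old)
```

Under this assumption, lowering the half-life makes the boost favor fresh documents more sharply, while raising it flattens the curve.
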
Connector Manager
The connector-manager service orchestrates all connector operations, including sync scheduling, health checks, and connector lifecycle management.
| Variable | Required | Default | Description |
|---|---|---|---|
| MAX_CONCURRENT_SYNCS | No | 10 | Maximum concurrent syncs across all sources |
| MAX_CONCURRENT_SYNCS_PER_TYPE | No | 3 | Maximum concurrent syncs per connector type |
| SCHEDULER_POLL_INTERVAL_SECONDS | No | 60 | How often the scheduler checks for due syncs, in seconds |
| STALE_SYNC_TIMEOUT_MINUTES | No | 60 | Minutes after which a sync is marked stale/failed |
| EXTRACTION_CONCURRENCY | No | 2 | Maximum concurrent document extraction requests handled by connector-manager |
| EXTRACTION_RETRY_AFTER_SECONDS | No | 30 | Retry delay, in seconds, advertised when extraction capacity is saturated |
| CONNECTOR_MANAGER_MAX_EXTRACT_INPUT_BYTES | No | 52428800 | Maximum binary payload size accepted for extraction requests (default: 50 MiB) |
| CONNECTOR_MANAGER_MAX_EXTRACTED_TEXT_BYTES | No | 5242880 | Maximum extracted text returned from connector-manager extraction requests (default: 5 MiB) |
| CONNECTOR_MANAGER_SPREADSHEET_MAX_INDEXED_ROWS | No | 1000 | Maximum spreadsheet rows indexed through connector-manager extraction |
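
MAX_CONCURRENT_SYNCS and MAX_CONCURRENT_SYNCS_PER_TYPE act as two separate caps, and a new sync would need to fit under both. The sketch below illustrates that reading of the table; it is not the scheduler's real code, and can_start_sync is a hypothetical name.

```python
from collections import Counter


def can_start_sync(running_types, connector_type, max_total=10, max_per_type=3):
    """Return True if one more sync of connector_type fits under both limits.

    running_types lists the connector type of every sync currently running.
    """
    if len(running_types) >= max_total:
        return False  # global cap reached
    return Counter(running_types)[connector_type] < max_per_type


running = ["google", "google", "google", "slack"]
print(can_start_sync(running, "google"))  # False: three Google syncs already running
print(can_start_sync(running, "slack"))   # True: under both limits
```
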
Document Conversion (Docling)
Docling is an optional service for extracting structured text from PDFs, Word documents, Excel files, PowerPoint, and common image formats. It can be toggled per-instance from Settings → Document Conversion in the admin UI. When disabled, Omni falls back to lightweight built-in extractors.
The same admin page also exposes a quality preset that controls how aggressively Docling parses each file:
| Preset | Behavior |
|---|---|
| Fast | OCR off, fast table-former mode, no code/formula enrichment. Use for text-heavy docs where basic tables are fine. |
| Balanced (default) | OCR off, accurate table-former mode. Best tradeoff for most deployments. |
| Quality | OCR on, accurate table-former mode, 1.5× image scale, code and formula enrichment enabled. Slowest, but highest fidelity. |
The preset is stored in Redis and picked up by indexers and connector-manager on every extraction — no restart needed.
| Variable | Required | Default | Description |
|---|---|---|---|
| DOCLING_ENABLED | No | false | Expose the Document Conversion admin page. Also set the Compose docling profile to deploy the service |
| DOCLING_URL | No | http://docling:${DOCLING_PORT} | Docling service URL used by indexer and connector-manager extraction paths |
| DOCLING_DEVICE | No | - | Leave empty for the CPU-only image; set to cuda to pull the CUDA-enabled image |
| DOCLING_MEMORY | No | 2g | Memory limit for the Docling container |
| DOCLING_MAX_CONCURRENT_CONVERSIONS | No | 1 | Maximum concurrent conversions inside the Docling service |
Storage Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| STORAGE_BACKEND | Yes | postgres | Storage backend: postgres or s3 |
| S3_BUCKET | Conditional | - | S3 bucket name (required if STORAGE_BACKEND=s3) |
| S3_REGION | Conditional | - | S3 region (required if STORAGE_BACKEND=s3) |
PostgreSQL storage (default):
STORAGE_BACKEND=postgres
# Content stored directly in the database — simplest setup
S3 storage:
STORAGE_BACKEND=s3
S3_BUCKET=omni-content-prod
S3_REGION=us-east-1
# Uses IAM role in AWS, or set AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
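
The conditional requirements in the storage table can be checked before deployment. The following is a minimal pre-flight sketch, assuming the variables have been loaded into a dict; validate_storage_config is a hypothetical helper and not part of Omni.

```python
def validate_storage_config(env: dict) -> list:
    """Return a list of problems found in the storage settings (empty if none)."""
    backend = env.get("STORAGE_BACKEND", "postgres")
    if backend not in ("postgres", "s3"):
        return [f"STORAGE_BACKEND must be 'postgres' or 's3', got {backend!r}"]
    if backend == "s3":
        # S3_BUCKET and S3_REGION are only required for the s3 backend.
        return [
            f"{name} is required when STORAGE_BACKEND=s3"
            for name in ("S3_BUCKET", "S3_REGION")
            if not env.get(name)
        ]
    return []


print(validate_storage_config({"STORAGE_BACKEND": "postgres"}))
# []
print(validate_storage_config({"STORAGE_BACKEND": "s3", "S3_BUCKET": "omni-content-prod"}))
# ['S3_REGION is required when STORAGE_BACKEND=s3']
```
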
Connector-Specific Configuration
Google Workspace Connector
| Variable | Required | Default | Description |
|---|---|---|---|
| WEBHOOK_RENEWAL_CHECK_INTERVAL_SECONDS | No | 3600 | How often to check and renew Google Drive webhooks (default: 1 hour) |
| GOOGLE_MAX_AGE_DAYS | No | 712 | Maximum age, in days, of documents to index |
| GOOGLE_DRIVE_MAX_DOWNLOAD_BYTES | No | 52428800 | Maximum bytes downloaded for a single Drive file (default: 50 MiB) |
| GOOGLE_DRIVE_PARALLEL_USERS | No | 3 | Number of Drive users processed concurrently during sync |
| GOOGLE_WEBHOOK_DEBOUNCE_SECONDS | No | 14400 | Debounce window for repeated Google webhook notifications (default: 4 hours) |
Google Drive webhook URLs are derived from OMNI_DOMAIN as https://<domain>/google-webhook. Webhooks are disabled automatically when OMNI_DOMAIN=localhost.
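
The derivation rule above is simple enough to express directly. This sketch only restates the rule from this page; the function name google_webhook_url is hypothetical.

```python
from typing import Optional


def google_webhook_url(omni_domain: str) -> Optional[str]:
    """Return the Google Drive webhook URL, or None when webhooks are disabled."""
    if omni_domain == "localhost":
        # Google cannot reach a localhost address, so webhooks are disabled.
        return None
    return f"https://{omni_domain}/google-webhook"


print(google_webhook_url("omni.example.com"))  # https://omni.example.com/google-webhook
print(google_webhook_url("localhost"))         # None
```
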
Logging & Monitoring
| Variable | Required | Default | Description |
|---|---|---|---|
| RUST_LOG | No | info | Rust services log level: trace, debug, info, warn, error |
| RUST_BACKTRACE | No | - | Enable Rust backtraces: set to 1 or full for debugging |
Log level recommendations:
- Development: RUST_LOG=debug
- Production: RUST_LOG=info
- Troubleshooting: RUST_LOG=trace
Telemetry (OpenTelemetry)
| Variable | Required | Default | Description |
|---|---|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | No | - | OTLP collector endpoint (empty = telemetry disabled) |
| OTEL_DEPLOYMENT_ID | No | - | Deployment identifier for tracing |
| OTEL_DEPLOYMENT_ENVIRONMENT | No | production | Environment: development, staging, production |
| SERVICE_VERSION | No | 0.1.0 | Service version for tracing |
Example with Honeycomb:
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=your-api-key
OTEL_DEPLOYMENT_ID=omni-prod-us-east-1
OTEL_DEPLOYMENT_ENVIRONMENT=production
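
OTEL_EXPORTER_OTLP_HEADERS is the standard OpenTelemetry variable for exporter headers and takes comma-separated key=value pairs. The sketch below shows how such a value breaks down into individual headers. It is an illustration of the format only, parse_otlp_headers is a hypothetical name, and the OpenTelemetry SDK does this parsing itself.

```python
def parse_otlp_headers(raw: str) -> dict:
    """Parse 'key1=value1,key2=value2' into a dict of header names to values."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        headers[key.strip()] = value.strip()
    return headers


print(parse_otlp_headers("x-honeycomb-team=your-api-key"))
# {'x-honeycomb-team': 'your-api-key'}
```
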