This page provides a comprehensive reference for all environment variables and configuration options available in Omni.
All environment variables can be set in your .env file for Docker Compose deployments or in terraform.tfvars for AWS deployments.
## Database Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| DATABASE_HOST | Yes | postgres | Database hostname or IP address |
| DATABASE_PORT | Yes | 5432 | Database port |
| DATABASE_USERNAME | Yes | omni | Database username |
| DATABASE_PASSWORD | Yes | - | Database password (use a strong password) |
| DATABASE_NAME | Yes | omni | Database name |
| DATABASE_SSL | No | false | Enable SSL for the database connection |
| DB_MAX_CONNECTIONS | No | 10 | Connection pool size per service |
| DB_ACQUIRE_TIMEOUT_SECONDS | No | 3 | Connection acquisition timeout in seconds |
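For example, a database block in a .env file might look like the following sketch; the password and pool-size values are placeholders:

```bash
# Database settings (placeholder values)
DATABASE_HOST=postgres
DATABASE_PORT=5432
DATABASE_USERNAME=omni
DATABASE_PASSWORD=replace-with-a-strong-password
DATABASE_NAME=omni
DATABASE_SSL=true          # enable TLS for the database connection
DB_MAX_CONNECTIONS=20      # illustrative: raise the pool size for busier deployments
```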
## Redis Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| REDIS_URL | Yes | redis://redis:6379 | Redis connection URL (format: redis://host:port) |
For Redis with a password:

```bash
REDIS_URL=redis://:password@redis:6379
```
## Application Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| APP_URL | Yes | http://localhost:3000 | Public-facing application URL (include protocol) |
| OMNI_DOMAIN | No | localhost | Domain name for the application |
| OMNI_VERSION | No | latest | Docker image version tag for all Omni services (e.g., 0.1.4) |
| SESSION_SECRET | Yes | - | Secret key for session encryption (32+ characters) |
| SESSION_COOKIE_NAME | No | auth-session | Name of the session cookie |
| SESSION_DURATION_DAYS | No | 7 | Session expiry in days |
| ACME_EMAIL | No | - | Email for Let’s Encrypt notifications (for automatic HTTPS) |
Never use the same SESSION_SECRET across different environments. Generate unique secrets for dev, staging, and production.
## Security & Encryption

| Variable | Required | Default | Description |
|---|---|---|---|
| ENCRYPTION_KEY | Yes | - | Encryption key for sensitive credentials (32+ characters) |
| ENCRYPTION_SALT | Yes | - | Salt for key derivation (16+ characters) |
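One way to generate suitable secrets is with openssl (a sketch, assuming a Unix shell is available):

```bash
# 64 hex characters each, satisfies the 32+ character minimum
openssl rand -hex 32   # SESSION_SECRET
openssl rand -hex 32   # ENCRYPTION_KEY
# 32 hex characters, satisfies the 16+ character minimum
openssl rand -hex 16   # ENCRYPTION_SALT
```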
## Service Ports

### Core Services

| Variable | Required | Default | Description |
|---|---|---|---|
| WEB_PORT | No | 3000 | SvelteKit web application port |
| SEARCHER_PORT | No | 3001 | Search service port |
| INDEXER_PORT | No | 3002 | Indexer service port |
| AI_SERVICE_PORT | No | 3003 | AI service port |
| CONNECTOR_MANAGER_PORT | No | 3004 | Connector manager port |
### Connector Services

| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_CONNECTOR_PORT | No | 4001 | Google connector port |
| SLACK_CONNECTOR_PORT | No | 4002 | Slack connector port |
| ATLASSIAN_CONNECTOR_PORT | No | 4003 | Atlassian connector port |
| WEB_CONNECTOR_PORT | No | 4004 | Web connector port |
| GITHUB_CONNECTOR_PORT | No | 4005 | GitHub connector port |
| HUBSPOT_CONNECTOR_PORT | No | 4006 | HubSpot connector port |
| MICROSOFT_CONNECTOR_PORT | No | 4007 | Microsoft 365 connector port |
| NOTION_CONNECTOR_PORT | No | 4008 | Notion connector port |
| FIREFLIES_CONNECTOR_PORT | No | 4009 | Fireflies connector port |
| IMAP_CONNECTOR_PORT | No | 4010 | IMAP email connector port |
| CLICKUP_CONNECTOR_PORT | No | 4011 | ClickUp connector port |
| LINEAR_CONNECTOR_PORT | No | 4012 | Linear connector port |
| FILESYSTEM_CONNECTOR_PORT | No | 4013 | Filesystem connector port |
| NEXTCLOUD_CONNECTOR_PORT | No | 4014 | Nextcloud connector port |
| PAPERLESS_CONNECTOR_PORT | No | 4015 | Paperless-ngx connector port |
### Optional Services

| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_INFERENCE_MODEL_PORT | No | 8000 | Local LLM inference port (used by the llama.cpp container in docker-compose.local-inference.yml) |
| LOCAL_EMBEDDINGS_PORT | No | 8001 | Local embedding model port (used by the HuggingFace TEI container) |
| DOCLING_PORT | No | 8003 | Docling document conversion service port |
| SANDBOX_PORT | No | 8090 | Code execution sandbox port (used by the AI service for agent tools) |
In Docker Compose, services communicate via service names (e.g., http://searcher:3001). Ports only need to be exposed to the host for debugging.
## Inter-Service URLs

These URLs are used for internal communication between services. In Docker Compose, each URL is built from the service name and the corresponding port variable via interpolation.

### Core Service URLs

| Variable | Required | Default | Description |
|---|---|---|---|
| SEARCHER_URL | Yes | http://searcher:${SEARCHER_PORT} | Search service URL |
| INDEXER_URL | Yes | http://indexer:${INDEXER_PORT} | Indexer service URL |
| AI_SERVICE_URL | Yes | http://ai:${AI_SERVICE_PORT} | AI service URL |
| CONNECTOR_MANAGER_URL | Yes | http://connector-manager:${CONNECTOR_MANAGER_PORT} | Connector manager URL |
### Optional Service URLs

| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_EMBEDDINGS_URL | Conditional | http://embeddings:${LOCAL_EMBEDDINGS_PORT}/v1 | Local embeddings service URL (required if using the local embedding provider) |
| DOCLING_URL | No | http://docling:${DOCLING_PORT} | Docling document conversion service URL |
| SANDBOX_URL | No | http://sandbox:${SANDBOX_PORT} | Sandbox service URL (used by the AI service for agent code execution) |
## LLM Provider Configuration
LLM providers and models are configured through the Admin Panel (Settings > LLM Providers). API keys and other secrets are encrypted at rest in the database using ENCRYPTION_KEY and ENCRYPTION_SALT — they are never read from environment variables. Multiple providers can be active simultaneously, and users can select which model to use on a per-chat basis.
### Supported Providers
Omni supports seven LLM provider types:
| Provider | Required Config | Description |
|---|---|---|
| Anthropic | API Key | Direct access to Claude models via Anthropic’s API |
| OpenAI | API Key | Access to GPT models via OpenAI’s API |
| Google Gemini | API Key | Direct access to Gemini models via the Google AI Studio API |
| AWS Bedrock | AWS Region (+ optional credentials) | Claude and other models via AWS Bedrock |
| Vertex AI | GCP Region + Project ID | Claude and Gemini models via Google Cloud Vertex AI |
| Azure AI Foundry | Endpoint URL | Claude and GPT models via Azure AI Foundry |
| OpenAI-compatible | API URL (+ optional API key) | Any OpenAI-compatible endpoint — use this for self-hosted models via llama.cpp, vLLM, Ollama, LM Studio, etc. |
### Predefined Models

When you add a provider, the following models are automatically available:

- Anthropic: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5
- OpenAI: GPT-5.2, GPT-5 Mini, GPT-4.1
- Google Gemini: Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.5 Flash Lite
- AWS Bedrock: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5, Amazon Nova Pro
- Vertex AI: Claude Sonnet 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash
- Azure AI Foundry: Claude Opus 4.6, Claude Sonnet 4.5, Claude Haiku 4.5, GPT-5.2, GPT-5 Mini, GPT-4.1
- OpenAI-compatible: No predefined models — specify the model ID exposed by your endpoint.
Unlike the other provider types, OpenAI-compatible can be registered multiple times in a single instance — each configured endpoint (e.g. a local llama.cpp + a separate vLLM instance + an OpenRouter account) shows up as its own provider card in the admin panel, and its models become independently selectable per chat.
### AWS Bedrock Environment Variables

For Bedrock providers, the following environment variables are used as fallbacks when not configured in the admin panel:

| Variable | Required | Default | Description |
|---|---|---|---|
| AWS_REGION | No | - | AWS region for Bedrock (e.g., us-east-1). Used as a fallback if not set in the provider config |
| AWS_ACCESS_KEY_ID | Conditional | - | AWS access key (if not using an IAM role) |
| AWS_SECRET_ACCESS_KEY | Conditional | - | AWS secret key (if not using an IAM role) |
When running on EC2 or ECS with an appropriate IAM role, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not needed — the SDK uses instance credentials automatically.
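When running outside AWS (for example, with local Docker Compose), the fallback might look like this sketch; the credential values are placeholders:

```bash
AWS_REGION=us-east-1
# Only needed when no IAM role is available:
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
```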
### Vertex AI Environment Variables

Vertex AI uses Google Cloud Application Default Credentials (ADC). Configure region and project_id in the admin panel.

| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_APPLICATION_CREDENTIALS | Conditional | - | Path to a service account key JSON file (not needed when running on GCP with a service account attached) |
When running on GKE, Cloud Run, or Compute Engine with an attached service account, ADC is automatic — no environment variables are needed.
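Outside GCP, one option is to mount a service account key into the containers and point ADC at it; a sketch (the path is a placeholder and must exist inside the container):

```bash
GOOGLE_APPLICATION_CREDENTIALS=/secrets/vertex-sa.json
```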
### Azure AI Foundry Environment Variables
Azure AI Foundry uses DefaultAzureCredential (Managed Identity). Configure the endpoint_url in the admin panel.
When running on Azure (AKS, Container Apps, VMs) with Managed Identity configured, authentication is automatic. No additional environment variables are needed.
### Self-hosted Local Inference

Omni ships a docker-compose.local-inference.yml overlay that starts a local llama.cpp container for LLM inference and, optionally, a HuggingFace TEI container for embeddings. Once they are running, register them as providers in the admin UI: an OpenAI-compatible provider for the LLM, pointed at http://llama-cpp:${LOCAL_INFERENCE_MODEL_PORT}, and the Local provider type for embeddings.

| Variable | Required | Default | Description |
|---|---|---|---|
| LOCAL_INFERENCE_MODEL_PORT | No | 8000 | Port exposed by the llama.cpp container |
| LOCAL_EMBEDDINGS_PORT | No | 8001 | Port exposed by the HuggingFace TEI container |
| LOCAL_EMBEDDINGS_MODEL | No | - | HuggingFace model ID loaded by TEI |
| EMBEDDING_MAX_MODEL_LEN | No | 8192 | Maximum context length for the embedding model |
The local LLM and embedding containers can be enabled independently — for example, cloud LLM with a local embedding model, or vice versa.
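A possible .env fragment for the overlay (the model ID is a placeholder; use any embedding model supported by TEI):

```bash
LOCAL_INFERENCE_MODEL_PORT=8000
LOCAL_EMBEDDINGS_PORT=8001
LOCAL_EMBEDDINGS_MODEL=your-org/your-embedding-model   # placeholder HuggingFace model ID
EMBEDDING_MAX_MODEL_LEN=8192
```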
## [Work In Progress] Batch Embedding Inference (AWS Bedrock)

For large-scale embedding generation using AWS Bedrock batch inference:

| Variable | Required | Default | Description |
|---|---|---|---|
| ENABLE_EMBEDDING_BATCH_INFERENCE | No | false | Enable batch processing for embeddings |
| EMBEDDING_BATCH_S3_BUCKET | Conditional | - | S3 bucket for batch files (required if batch enabled) |
| EMBEDDING_BATCH_BEDROCK_ROLE_ARN | Conditional | - | IAM role ARN for Bedrock (required if batch enabled) |
| EMBEDDING_BATCH_MIN_DOCUMENTS | No | 100 | Minimum documents to trigger a batch job |
| EMBEDDING_BATCH_MAX_DOCUMENTS | No | 50000 | Maximum documents per batch |
| EMBEDDING_BATCH_ACCUMULATION_TIMEOUT_SECONDS | No | 300 | Wait time before starting a batch (5 min) |
| EMBEDDING_BATCH_ACCUMULATION_POLL_INTERVAL | No | 10 | Interval to check the queue (10 sec) |
| EMBEDDING_BATCH_MONITOR_POLL_INTERVAL | No | 300 | Interval to check batch status (5 min) |
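Enabling batch inference requires the S3 bucket and the IAM role together; a sketch with placeholder values:

```bash
ENABLE_EMBEDDING_BATCH_INFERENCE=true
EMBEDDING_BATCH_S3_BUCKET=your-batch-bucket                                              # placeholder
EMBEDDING_BATCH_BEDROCK_ROLE_ARN=arn:aws:iam::123456789012:role/your-bedrock-batch-role # placeholder
```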
## Feature Flags

| Variable | Required | Default | Description |
|---|---|---|---|
| AI_ANSWER_ENABLED | No | true | Enable or disable AI-generated answers in search results |
| AGENTS_ENABLED | No | false | Enable background AI agents (scheduled tasks) |
## Background Agents Configuration

Controls the background agent scheduler and execution limits. Requires AGENTS_ENABLED=true.

| Variable | Required | Default | Description |
|---|---|---|---|
| AGENT_SCHEDULER_POLL_INTERVAL | No | 30 | Seconds between scheduler poll checks |
| AGENT_MAX_CONCURRENT_RUNS | No | 3 | Maximum concurrent agent executions |
| AGENT_MAX_ITERATIONS | No | 15 | Maximum tool calls per agent run |
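For example, to turn on background agents and allow more parallel runs (a sketch; the tuning values are illustrative):

```bash
AGENTS_ENABLED=true
AGENT_MAX_CONCURRENT_RUNS=5   # illustrative: the default is 3
```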
## AI Service Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| AI_WORKERS | No | 2 | Number of uvicorn worker processes |
| MODEL_PATH | No | /models | Directory for model storage |
| APPROVAL_TIMEOUT_SECONDS | No | 600 | How long interactive-chat tool approval prompts wait for a user decision before expiring (background agents don’t use approval prompts) |
Token usage for every LLM call (chat, agent run, compaction, title generation) is recorded in the model_usage table in Postgres, broken down by user, provider, and model. No additional configuration is required.
## Conversation Compaction

Controls automatic compaction of long conversations to stay within model context limits.

| Variable | Required | Default | Description |
|---|---|---|---|
| ENABLE_CONVERSATION_COMPACTION | No | true | Enable or disable conversation compaction |
| MAX_CONVERSATION_INPUT_TOKENS | No | 150000 | Maximum input tokens before compaction triggers |
| COMPACTION_RECENT_MESSAGES_COUNT | No | 20 | Number of recent messages to preserve during compaction |
| COMPACTION_SUMMARY_MAX_TOKENS | No | 2000 | Maximum tokens for the compaction summary |
| COMPACTION_CACHE_TTL_SECONDS | No | 86400 | Cache TTL for compaction results (default: 24 hours) |
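If you pair Omni with a smaller-context model (for example, a local one), you may want compaction to trigger earlier; an illustrative sketch:

```bash
ENABLE_CONVERSATION_COMPACTION=true
MAX_CONVERSATION_INPUT_TOKENS=24000   # illustrative: compact well before a ~32k context fills
COMPACTION_RECENT_MESSAGES_COUNT=10   # illustrative
```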
## Searcher Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| RAG_CONTEXT_WINDOW | No | 2 | Number of surrounding chunks to fetch in RAG search |
| SEMANTIC_SEARCH_TIMEOUT_MS | No | 1000 | Timeout for semantic (vector) search in milliseconds |
| RECENCY_BOOST_WEIGHT | No | 0.2 | Weight for recency in search ranking (0.0–1.0) |
| RECENCY_HALF_LIFE_DAYS | No | 30.0 | Days for document relevance to decay to 50% |
## Connector Manager

The connector-manager service orchestrates all connector operations, including scheduling syncs, health checks, and connector lifecycle management.

| Variable | Required | Default | Description |
|---|---|---|---|
| MAX_CONCURRENT_SYNCS | No | 10 | Maximum concurrent syncs across all sources |
| MAX_CONCURRENT_SYNCS_PER_TYPE | No | 3 | Maximum concurrent syncs per connector type |
| SCHEDULER_POLL_INTERVAL_SECONDS | No | 60 | How often the scheduler checks for due syncs |
| STALE_SYNC_TIMEOUT_MINUTES | No | 60 | Timeout to mark a sync as stale/failed |
## Document Conversion (Docling)
Docling is an optional service for extracting structured text from PDFs, Word documents, Excel files, PowerPoint, and common image formats. It can be toggled per-instance from Settings → Document Conversion in the admin UI. When disabled, Omni falls back to lightweight built-in extractors.
The same admin page also exposes a quality preset that controls how aggressively Docling parses each file:
| Preset | Behavior |
|---|---|
| Fast | OCR off, fast table-former mode, no code/formula enrichment. Use for text-heavy docs where basic tables are fine. |
| Balanced (default) | OCR off, accurate table-former mode. Best tradeoff for most deployments. |
| Quality | OCR on, accurate table-former mode, 1.5× image scale, code and formula enrichment enabled. Slowest, but highest fidelity. |
The preset is stored in Redis and picked up by indexers and connector-manager on every extraction — no restart needed.
| Variable | Required | Default | Description |
|---|---|---|---|
| DOCLING_URL | No | http://docling:${DOCLING_PORT} | Docling service URL used by the indexer |
| DOCLING_DEVICE | No | - | Leave empty for the CPU-only image; set to cuda to pull the CUDA-enabled image |
| DOCLING_CPUSET | No | 2,3 | CPU cores pinned to the Docling container |
| DOCLING_MEMORY | No | - | Memory limit for the Docling container (e.g. 2g) |
| DOCLING_MAX_CONCURRENT_CONVERSIONS | No | 1 | Maximum concurrent Docling conversions per indexer |
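On a GPU host, a sketch that switches to the CUDA image and loosens the resource limits (the values are illustrative):

```bash
DOCLING_DEVICE=cuda
DOCLING_MEMORY=4g                      # illustrative
DOCLING_MAX_CONCURRENT_CONVERSIONS=2   # illustrative
```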
## Storage Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| STORAGE_BACKEND | Yes | postgres | Storage backend: postgres or s3 |
| S3_BUCKET | Conditional | - | S3 bucket name (required if STORAGE_BACKEND=s3) |
| S3_REGION | Conditional | - | S3 region (required if STORAGE_BACKEND=s3) |
PostgreSQL storage (default):

```bash
STORAGE_BACKEND=postgres
# Content stored directly in the database — simplest setup
```

S3 storage:

```bash
STORAGE_BACKEND=s3
S3_BUCKET=omni-content-prod
S3_REGION=us-east-1
# Uses IAM role in AWS, or set AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
```
## Connector-Specific Configuration

### Google Workspace Connector

| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_SYNC_INTERVAL_SECONDS | No | 86400 | Interval between Google sync runs (default: 24 hours) |
| GOOGLE_WEBHOOK_URL | No | - | Public URL for Google Drive change notifications (e.g., https://yourdomain.com/google-webhook) |
| WEBHOOK_RENEWAL_CHECK_INTERVAL_SECONDS | No | 3600 | How often to check and renew Google Drive webhooks (default: 1 hour) |
| GOOGLE_MAX_AGE_DAYS | No | 712 | Maximum age of documents to index (documents older than this are skipped) |
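For near-real-time Drive updates, Omni needs a publicly reachable webhook URL; a sketch (the domain is a placeholder):

```bash
GOOGLE_WEBHOOK_URL=https://omni.example.com/google-webhook
WEBHOOK_RENEWAL_CHECK_INTERVAL_SECONDS=3600
```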
### Web Connector

| Variable | Required | Default | Description |
|---|---|---|---|
| WEB_SYNC_INTERVAL_SECONDS | No | 86400 | Interval between web recrawl runs (default: 24 hours) |
## Logging & Monitoring

| Variable | Required | Default | Description |
|---|---|---|---|
| RUST_LOG | No | info | Rust services log level: trace, debug, info, warn, error |
| RUST_BACKTRACE | No | - | Enable Rust backtraces: set to 1 or full for debugging |
Log level recommendations:

- Development: RUST_LOG=debug
- Production: RUST_LOG=info
- Troubleshooting: RUST_LOG=trace
### Telemetry (OpenTelemetry)

| Variable | Required | Default | Description |
|---|---|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | No | - | OTLP collector endpoint (empty = telemetry disabled) |
| OTEL_DEPLOYMENT_ID | No | - | Deployment identifier for tracing |
| OTEL_DEPLOYMENT_ENVIRONMENT | No | production | Environment: development, staging, production |
| SERVICE_VERSION | No | 0.1.0 | Service version for tracing |
Example with Honeycomb:

```bash
OTEL_EXPORTER_OTLP_ENDPOINT=https://api.honeycomb.io
OTEL_EXPORTER_OTLP_HEADERS=x-honeycomb-team=your-api-key
OTEL_DEPLOYMENT_ID=omni-prod-us-east-1
OTEL_DEPLOYMENT_ENVIRONMENT=production
```