The Filesystem connector enables Omni to index and search content from local directories, making your organization’s file repositories searchable alongside other data sources.

Overview

What Gets Indexed

| Content Type | Description |
| --- | --- |
| Text files | Full content extraction for text-based files |
| Code files | Source code files (.py, .js, .rs, .go, etc.) |
| Documents | Markdown, JSON, XML, and other structured text |
| Metadata | File name, path, size, timestamps for all files |

Supported File Formats

Full content indexing (text extracted and searchable):
  • text/* (all text MIME types)
  • application/json
  • application/xml
  • application/javascript
  • application/x-sh (shell scripts)
  • application/x-python
  • application/x-ruby
Metadata-only indexing (searchable by name, path, date):
  • Binary files (images, PDFs, etc.)
  • Files larger than max_file_size_bytes (10 MB by default)
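The split between the two lists can be read as a small decision function. The sketch below is illustrative only and is not the connector's implementation: the name `indexing_mode`, the treatment of an unknown MIME type as metadata-only, and the strict "larger than" comparison are assumptions drawn from the lists above.

```python
# Hypothetical helper: decide how a file would be indexed from its MIME type and size.
FULL_CONTENT_TYPES = {
    "application/json",
    "application/xml",
    "application/javascript",
    "application/x-sh",
    "application/x-python",
    "application/x-ruby",
}
DEFAULT_MAX_BYTES = 10 * 1024 * 1024  # 10 MB, the documented default limit


def indexing_mode(mime_type, size_bytes, max_bytes=DEFAULT_MAX_BYTES):
    """Return "full" for content extraction or "metadata" for metadata-only."""
    if size_bytes > max_bytes:
        return "metadata"  # too large for content extraction
    if mime_type is None:
        return "metadata"  # assumption: unknown types are treated like binaries
    if mime_type.startswith("text/") or mime_type in FULL_CONTENT_TYPES:
        return "full"
    return "metadata"
```

For example, `indexing_mode("text/markdown", 2048)` yields "full", while the same call with "image/png" yields "metadata".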

How It Works

  1. The connector scans configured directories recursively
  2. Text files are read and content is extracted for full-text search
  3. File metadata (name, path, size, timestamps) is indexed for all files
  4. A file watcher detects changes in real-time between full scans
The connector uses read-only access. Omni cannot modify or delete any files in your filesystem.
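As a rough illustration of steps 1 and 3, a recursive scan that collects metadata without writing to the filesystem could look like the following. This is a minimal sketch, not the connector's code: the function name `scan_tree` and the record field names are hypothetical.

```python
import os
from datetime import datetime, timezone


def scan_tree(base_path):
    """Walk base_path recursively and yield one metadata record per file."""
    for dirpath, _dirnames, filenames in os.walk(base_path):
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            try:
                info = os.stat(full_path)  # read-only call; nothing is modified
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            yield {
                "name": name,
                "path": os.path.relpath(full_path, base_path),
                "size": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            }
```

Only `os.walk` and `os.stat` are used, so the sketch works unchanged on a read-only mount.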

Prerequisites

Before setting up the Filesystem connector, ensure you have:
  • Omni deployment with the Filesystem connector service running
  • Directory access to the files you want to index
  • Docker volume mounting configured (for containerized deployments)

Setup

Step 1: Configure Docker Volume Mounts

For Docker Compose deployments, add volume mounts to the Filesystem connector service in your docker-compose.override.yml:
services:
  filesystem-connector:
    volumes:
      - /path/to/your/documents:/data/documents:ro
      - /path/to/another/folder:/data/shared:ro
Always use read-only mounts (:ro) for security. The connector only needs read access to index files.

Step 2: Add Environment Variables

Add the following to your .env file:
FILESYSTEM_CONNECTOR_PORT=4006

Step 3: Connect in Omni

  1. Navigate to Settings > Integrations in Omni
  2. Find Filesystem and click Connect
  3. Configure the source settings:
| Setting | Required | Default | Description |
| --- | --- | --- | --- |
| base_path | Yes | - | Root directory to scan (inside container, e.g., /data/documents) |
| scan_interval_seconds | No | 300 | Full scan interval (5 minutes) |
| file_extensions | No | - | Whitelist of extensions (e.g., ["txt", "md", "json"]) |
| exclude_patterns | No | - | Patterns to exclude (e.g., [".git", "node_modules"]) |
| max_file_size_bytes | No | 10485760 | Max file size for content extraction (10MB) |
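To make the defaults concrete, the sketch below merges a partial source configuration over the documented default values. It is an assumption-laden illustration: `resolve_source_config` is a hypothetical name, and how Omni itself validates or merges settings is not described here.

```python
# Defaults from the settings listed above; None stands for "no default".
DEFAULTS = {
    "scan_interval_seconds": 300,              # 5 minutes
    "file_extensions": None,
    "exclude_patterns": None,
    "max_file_size_bytes": 10 * 1024 * 1024,   # 10485760 bytes
}


def resolve_source_config(user_config):
    """Merge user settings over the defaults and check the one required field."""
    if not user_config.get("base_path"):
        raise ValueError("base_path is required")
    return {**DEFAULTS, **user_config}
```

Calling `resolve_source_config({"base_path": "/data/documents"})` returns a configuration with a 300-second scan interval and a 10485760-byte size limit.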

Step 4: Start Initial Sync

  1. Click Save Configuration
  2. Click Sync Now to start the initial scan
  3. Monitor progress in the admin panel
Your Filesystem connector is now configured. The connector will scan and index files from the configured directory.

Example Configurations

Documentation Repository

Index a documentation folder with Markdown and text files:
{
  "base_path": "/data/documents",
  "scan_interval_seconds": 600,
  "file_extensions": ["md", "txt", "rst"],
  "exclude_patterns": [".git", "_build", "node_modules"]
}

Code Repository

Index source code with common development exclusions:
{
  "base_path": "/data/code",
  "scan_interval_seconds": 300,
  "file_extensions": ["py", "js", "ts", "rs", "go", "java"],
  "exclude_patterns": [
    ".git",
    "node_modules",
    "__pycache__",
    "target",
    "dist",
    "build",
    ".venv",
    "venv"
  ]
}

Shared File Server

Index a shared network drive with all text content:
{
  "base_path": "/data/shared",
  "scan_interval_seconds": 1800,
  "exclude_patterns": ["Thumbs.db", ".DS_Store", "~$*"],
  "max_file_size_bytes": 52428800
}
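The exact matching rules for exclude_patterns and file_extensions are not spelled out here. One plausible reading, sketched below, is that each pattern is a glob tested against every path component and that extensions are compared without the leading dot. Both helper names are hypothetical, and the connector may match differently.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_excluded(relative_path, exclude_patterns):
    """True if any path component matches any pattern (glob-style, case-sensitive)."""
    parts = PurePosixPath(relative_path).parts
    return any(
        fnmatchcase(part, pattern)
        for part in parts
        for pattern in exclude_patterns
    )


def has_allowed_extension(relative_path, file_extensions):
    """True if file_extensions is unset or the file's extension is in the list."""
    if not file_extensions:
        return True
    suffix = PurePosixPath(relative_path).suffix.lstrip(".")
    return suffix in file_extensions
```

Under this reading, "project/.git/config" is excluded by ".git", and "reports/~$budget.xlsx" is excluded by "~$*".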

Managing the Integration

Viewing Sync Status

Navigate to Settings > Integrations > Filesystem to view:
  • Last sync time
  • Number of indexed files
  • Any scan errors

Sync Behavior

The Filesystem connector uses two synchronization mechanisms:
| Mechanism | Frequency | Description |
| --- | --- | --- |
| Full Scan | Every 5 min (default) | Walks entire directory tree |
| File Watcher | Real-time | Detects file changes between scans |
The file watcher polls for changes every 2 seconds and batches events with a 30-second idle timeout.
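The batching behavior can be pictured as a debounce: changed paths accumulate and are released only after no new event has arrived for the idle timeout. The class below is a sketch under that interpretation. `EventBatcher` and its methods are hypothetical, and the caller supplies the clock so the logic stays deterministic.

```python
class EventBatcher:
    """Collect changed paths and release them once no new event has
    arrived for idle_timeout seconds."""

    def __init__(self, idle_timeout=30.0):
        self.idle_timeout = idle_timeout
        self._pending = set()
        self._last_event_at = None

    def add(self, path, now):
        """Record a changed path and restart the idle timer."""
        self._pending.add(path)
        self._last_event_at = now

    def poll(self, now):
        """Return the sorted batch if the idle timeout has elapsed, else []."""
        if not self._pending:
            return []
        if now - self._last_event_at < self.idle_timeout:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        return batch
```

A watcher loop would call `poll` on each 2-second tick, passing a monotonic clock reading such as `time.monotonic()`, and index whatever batch comes back.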

Adding More Directories

To index additional directories:
  1. Add another volume mount in Docker Compose
  2. Create a new Filesystem source in Omni with the new base_path

Removing the Integration

  1. Navigate to Settings > Integrations > Filesystem
  2. Click Remove
  3. Indexed documents will be deleted from search

Troubleshooting

The container doesn’t have read access to the files. Solution:
  • Ensure files are readable by the container user
  • Check that SELinux/AppArmor isn’t blocking access
  • Try mounting with :ro,z (SELinux label shared between containers) or :ro,Z (SELinux label private to this container)
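To narrow down which files are affected, a quick readability check can be run as the same user the connector runs as. The sketch below is a generic diagnostic and not part of Omni: `find_unreadable` is a hypothetical helper, and whether a Python interpreter is available inside the connector container is an assumption.

```python
import os


def find_unreadable(base_path):
    """Return paths under base_path that the current user cannot read."""
    unreadable = []

    def on_walk_error(error):
        # os.walk reports directories it could not list through this callback.
        unreadable.append(error.filename)

    for dirpath, _dirnames, filenames in os.walk(base_path, onerror=on_walk_error):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.access(full_path, os.R_OK):
                unreadable.append(full_path)
    return sorted(unreadable)
```

An empty list for the mounted base_path (for example /data/documents) suggests that file permissions are not the cause.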
Files larger than 10MB (default) are indexed for metadata only. Solution: Increase max_file_size_bytes in the source configuration. Be aware this increases memory usage.
Large directories with many files will take longer to scan. Factors affecting scan time:
  • Number of files in the directory tree
  • Average file size
  • Disk I/O speed
Optimization tips:
  • Use file_extensions to limit which files are processed
  • Use exclude_patterns to skip unnecessary directories
  • Increase scan_interval_seconds for stable directories
The file watcher may not detect all changes. Solution: The full scan (every 5 minutes by default) will catch any missed changes. You can also trigger a manual sync.

Security Considerations

  • Read-only mounts: Always use the :ro flag for volume mounts
  • File permissions: The connector respects filesystem permissions
  • No network access: The connector only reads local files
  • Container isolation: Files are accessed through Docker volumes
Ensure sensitive files (credentials, private keys, etc.) are excluded using exclude_patterns or not mounted at all.
