
System Requirements

OpenRAG requires significant resources to run a local LLM (llama3.1:8b) and the complete infrastructure.

Hardware Configuration

MINIMUM Configuration (CPU-only Mode)

This configuration will allow the system to run but with limited performance. The LLM will take 50-75 seconds for the first response, then 5-15 seconds for subsequent ones.

CPU

Minimum: 8 cores (x86_64)
The LLM uses 80-100% of all cores during generation.

RAM

Minimum: 16 GB
  • LLM (llama3.1:8b): ~5.5 GB
  • Services (PostgreSQL, Redis, Qdrant, MinIO): ~2 GB
  • Streamlit frontends: ~500 MB
  • OS + buffers: ~8 GB

Storage

Minimum: 50 GB SSD
  • Docker images: ~8 GB
  • Ollama model (llama3.1:8b): 4.9 GB
  • Embeddings: ~400 MB
  • Data + documents: 10+ GB

Network

Required: Stable internet connection
Needed to download the LLM model (4.9 GB) and Docker images.

RECOMMENDED Configuration (GPU Mode)

With an NVIDIA GPU, LLM performance is 10-50x faster. Responses take 1-3 seconds instead of 5-15 seconds.

CPU

Recommended: 12+ cores

RAM

Recommended: 32 GB
More RAM allows loading larger models and handling more simultaneous users.

GPU

Recommended: NVIDIA GPU with 12+ GB VRAM
  • RTX 3060 (12GB): Good for llama3.1:8b
  • RTX 4090 (24GB): Excellent for larger models
  • A100 (40/80GB): Production
Important: Requires CUDA Toolkit and nvidia-docker

Storage

Recommended: 100+ GB NVMe SSD
For better I/O performance on PostgreSQL and Qdrant.

RAM Usage Breakdown (Production System)

Ollama (loaded LLM):        5.5 GB
PostgreSQL 16:              500 MB
Qdrant (928 vectors):       300 MB
Redis 7:                    100 MB
MinIO:                      200 MB
API Gateway:                200 MB
Orchestrator:               300 MB
Embedding Service:          200 MB
Frontend User:              250 MB
Frontend Admin:             250 MB
OS (Debian/Ubuntu):       2-4 GB
System Buffers:           4-6 GB
-----------------------------------
TOTAL:                   14-18 GB
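As a sanity check, the per-service figures in the breakdown above can be summed with a few lines of shell (values in MB, copied from the table):

```shell
# Per-service RSS estimates from the breakdown above, in MB.
services_mb=(5500 500 300 100 200 200 300 200 250 250)

total=0
for mb in "${services_mb[@]}"; do
  total=$((total + mb))
done

echo "Services subtotal: ${total} MB"   # 7800 MB, i.e. ~7.8 GB
# Adding OS (2-4 GB) and system buffers (4-6 GB) gives ~13.8-17.8 GB,
# consistent with the 14-18 GB total above.
```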

Required Software

Docker & Docker Compose (REQUIRED)

# Update packages
sudo apt-get update

# Install Docker
sudo apt-get install -y docker.io docker-compose-plugin

# Add your user to docker group (avoids sudo every time)
sudo usermod -aG docker $USER
newgrp docker

# Verify versions
docker --version        # Required: 26.0+
docker compose version  # Required: 2.26+
IMPORTANT: After adding to docker group, you must log out and log back in for permissions to take effect.
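If you want to script the version check, GNU `sort -V` can compare dotted version strings. A minimal sketch (the `version_ge` helper name is invented here, not part of OpenRAG):

```shell
# Succeeds (exit 0) when version $1 >= version $2, using GNU sort -V.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Pull the version number out of `docker --version` output and compare.
docker_ver=$(docker --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)
if version_ge "${docker_ver}" "26.0"; then
  echo "Docker ${docker_ver}: OK"
else
  echo "Docker ${docker_ver}: upgrade required (need 26.0+)"
fi
```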

Git (REQUIRED)

# Linux
sudo apt-get install git

# macOS  
brew install git

# Verify
git --version
curl & jq (OPTIONAL)

These tools make testing and debugging easier but are not required:
# Linux (Debian/Ubuntu)
sudo apt-get install -y curl jq

# macOS
brew install curl jq

# Verify
curl --version
jq --version
Usefulness:
  • curl: Test REST API (HTTP requests)
  • jq: Parse and format JSON responses
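For example, once the stack is running you can pipe an API response through jq. The sketch below uses a hard-coded sample payload; the `/health` path shown in the comment is illustrative, not a documented OpenRAG route:

```shell
# Extract key/value pairs from a JSON response with jq.
# In practice, replace the echo with something like:
#   curl -s http://localhost:8000/health
echo '{"status":"ok","services":{"ollama":"up","qdrant":"up"}}' \
  | jq -r '.services | to_entries[] | "\(.key): \(.value)"'
# prints:
# ollama: up
# qdrant: up
```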

Network Ports Used

OpenRAG uses 10 services with the following ports:

Public ports (accessible from browser)

| Service | Port | URL | Description |
|---|---|---|---|
| Chat Interface | 8501 | http://localhost:8501 | Streamlit user interface |
| Admin Panel | 8502 | http://localhost:8502 | Administration dashboard |
| REST API | 8000 | http://localhost:8000 | API entry point |
| MinIO Console | 9001 | http://localhost:9001 | Storage management (admin/admin123456) |
| Qdrant Dashboard | 6333 | http://localhost:6333/dashboard | Vector DB dashboard |

Internal ports (between Docker containers)

| Service | Port | Usage |
|---|---|---|
| PostgreSQL | 5432 | Database |
| Redis | 6379 | Cache and queues |
| MinIO API | 9000 | S3 storage |
| Qdrant gRPC | 6334 | Vector DB gRPC |
| Ollama | 11434 | LLM server |
| Orchestrator | 8001 | Orchestration service |
| Embedding | 8002 | Embeddings service |

Check if a Port is Available

# Linux/macOS
sudo lsof -i :8501

# If port is in use, find the process
sudo lsof -i :8501 | grep LISTEN

# Kill the process if necessary
sudo kill -9 <PID>
If a port is already in use, you’ll need to either stop the application using it, or modify the docker-compose.yml file to change port mappings.
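If `lsof` is not installed, bash's built-in `/dev/tcp` pseudo-device gives a rough availability check (bash-specific, not POSIX sh):

```shell
# Try to open a TCP connection to each public port on localhost;
# a successful connect means something is already listening there.
for port in 8000 8501 8502 9001 6333; do
  if (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null; then
    echo "port ${port}: IN USE"
  else
    echo "port ${port}: free"
  fi
done
```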

GPU Support (Optional - 10-50x Performance)

1. Install NVIDIA Container Toolkit

# Add NVIDIA repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Configure Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
2. Test GPU in Docker

docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
You should see your NVIDIA GPU information.
3. Modify docker-compose.yml for Ollama

# In docker-compose.yml, ollama section:
ollama:
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
With GPU: LLM responds in 1-3 seconds
Without GPU: LLM responds in 5-15 seconds (after first load 50-75s)

Apple Silicon (M1/M2/M3)

Ollama supports Metal acceleration on Apple Silicon, but only when run natively on the host: Docker Desktop runs containers inside a Linux VM without GPU passthrough, so a containerized Ollama falls back to CPU. Metal performance is better than CPU-only but typically not as fast as NVIDIA GPUs. Configuration:
  • For acceleration, run Ollama natively on macOS and point the stack at it
  • Performance with Metal: ~2-5 seconds per query

Quick Requirements Verification

Before installing OpenRAG, run these commands to verify your system:
# Docker versions (minimum required in comments)
docker --version        # Required: 26.0+
docker compose version  # Required: 2.26+

# Available RAM
free -h | grep Mem     # Required: 16GB minimum

# Disk space
df -h | grep -E '/$|/home'  # Required: 50GB minimum free

# GPU (optional)
nvidia-smi  # If you have an NVIDIA GPU
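The checks above can be combined into a small preflight script. A Linux-only sketch, using the thresholds documented on this page:

```shell
#!/usr/bin/env bash
# Minimal preflight: total RAM and free disk against the documented minimums.

ram_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
if [ "${ram_kb}" -ge $((16 * 1024 * 1024)) ]; then
  echo "RAM: OK ($((ram_kb / 1024 / 1024)) GB)"
else
  echo "RAM: below the 16 GB minimum"
fi

free_gb=$(df -BG --output=avail / | tail -n 1 | tr -dc '0-9')
if [ "${free_gb}" -ge 50 ]; then
  echo "Disk: OK (${free_gb} GB free on /)"
else
  echo "Disk: below the 50 GB minimum free"
fi
```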

Pre-installation Checklist

  • Server with 16 GB+ RAM
  • 50 GB+ SSD disk space
  • Docker 26.0+ installed
  • Docker Compose 2.26+ installed
  • User in docker group
  • Ports 8000, 8501, 8502 available
  • Stable internet connection (5 GB model download)

Configuration per Use Case

Hardware:
  • CPU: 8 cores
  • RAM: 16 GB
  • SSD: 50 GB
Expected Performance:
  • Vector search: 100-200 ms
  • LLM first response: 50-75 s (model loading)
  • LLM subsequent: 5-15 s
Ideal for: Testing, development, personal use

Next Steps

Once requirements are met, consult the Quick Start Guide to install OpenRAG in 5 minutes.
