Quick Start
Install and launch OpenRAG with all its web interfaces in 5 minutes flat.
Prerequisites
- Docker Compose 2.26+ (included with modern Docker)
- 16 GB RAM minimum (32 GB recommended with GPU)
- 50 GB storage (Docker images + LLM model, 4.9 GB)
Important: The system requires 16 GB RAM minimum to run the llama3.1:8b LLM. See Detailed Requirements for more information.
Installation in 4 Steps
1. Clone the Repository
git clone https://github.com/3ntrop1a/openrag.git
cd openrag
2. Launch All Services
# Start all microservices
sudo docker-compose up -d
What does the stack look like? (docker-compose.yml overview)
services:
  # Infrastructure
  postgres:          # PostgreSQL 16 — document metadata & query history
  redis:             # Redis 7 — cache & task queue
  minio:             # MinIO — S3-compatible file storage (ports 9000/9001)
  qdrant:            # Qdrant — vector database (port 6333)
  ollama:            # Ollama — local LLM server (port 11434)

  # Application
  embedding:         # Sentence-transformer embedding service (port 8002)
  orchestrator:      # RAG pipeline orchestration (port 8001)
  api:               # FastAPI REST gateway (port 8000)
  frontend-nextjs:   # Next.js chat + admin panel (port 3000)

  # Monitoring (optional — started with --profile monitoring)
  prometheus:        # Metrics scraping (port 9090)
  grafana:           # Pre-configured dashboards (port 3002)
All services are connected to the openrag-network Docker bridge. Only the ports above are exposed to your host — everything else is internal.
First startup: Downloading Docker images and LLM model (4.9 GB). Allow 10-15 minutes depending on your connection.
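While the images download, you can poll the gateway instead of guessing when the stack is ready. The sketch below is a hypothetical helper (not part of OpenRAG), assuming the `/health` endpoint on port 8000 shown later in this guide:

```python
import json
import time
import urllib.error
import urllib.request

def is_ready(payload):
    """True once the gateway and every sub-service report 'healthy'."""
    services = payload.get("services", {})
    return payload.get("status") == "healthy" and all(
        state == "healthy" for state in services.values()
    )

def wait_for_api(url="http://localhost:8000/health", timeout=900, interval=10):
    """Poll the /health endpoint until is_ready() passes or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                payload = json.load(resp)
            if is_ready(payload):
                return payload
        except (urllib.error.URLError, OSError):
            pass  # gateway not up yet; keep polling
        time.sleep(interval)
    raise TimeoutError(f"{url} not healthy after {timeout}s")
```

Run `wait_for_api()` after `docker-compose up -d`; it returns the health payload once everything is up, so you know it is safe to continue.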
3. Verify Everything is Started
# View the status of the services
sudo docker-compose ps
You should see 8 services with Up status:
NAME STATUS
openrag-api Up
openrag-orchestrator Up
openrag-embedding Up
openrag-postgres Up
openrag-redis Up
openrag-minio Up
openrag-qdrant Up
openrag-ollama Up
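If you want to script this check rather than eyeball it, you can diff the `docker-compose ps` text against the expected container names. A minimal sketch (the names are the eight listed above; `missing_or_down` is a hypothetical helper):

```python
EXPECTED = {
    "openrag-api", "openrag-orchestrator", "openrag-embedding",
    "openrag-postgres", "openrag-redis", "openrag-minio",
    "openrag-qdrant", "openrag-ollama",
}

def missing_or_down(ps_output):
    """Return expected containers that are absent, or present but not 'Up',
    in the text output of `docker-compose ps`."""
    up = set()
    for line in ps_output.splitlines():
        parts = line.split()
        # STATUS usually reads "Up" or "Up 5 minutes"; any "Up" token counts.
        if parts and parts[0] in EXPECTED and "Up" in parts[1:]:
            up.add(parts[0])
    return EXPECTED - up
```

Feed it the output of `sudo docker-compose ps` (e.g. via `subprocess.run`); an empty result means everything is started.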
4. Download the LLM Model
If you’re using Ollama (default configuration):
docker exec -it openrag-ollama ollama pull llama3.1:8b
Lightweight alternatives: llama3.2:3b (2 GB), gemma:2b (1.5 GB), phi3:mini (2.3 GB)
The llama3.1:8b model is a 4.9 GB download. Allow 5-10 minutes depending on your connection.
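You can confirm the pull succeeded without entering the container by asking Ollama's REST API, which lists pulled models at `/api/tags` on port 11434. A small sketch (`installed_models` and `has_model` are hypothetical helpers):

```python
import json
import urllib.request

def installed_models(base_url="http://localhost:11434"):
    """Return the model tags known to the Ollama server via GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]

def has_model(names, wanted="llama3.1:8b"):
    """True if the wanted tag appears in the list from installed_models()."""
    return wanted in names
```

`has_model(installed_models())` returning `False` means the pull failed or is still running; re-run the `ollama pull` command above.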
Access the Web Interfaces
Open your browser and test the interfaces (ports as exposed by the compose stack above):
- Chat + admin panel (Next.js): http://localhost:3000
- REST API (FastAPI): http://localhost:8000
- MinIO console: http://localhost:9001
- Grafana dashboards (with the monitoring profile): http://localhost:3002
First Test
Option 1: Via Chat Interface (Recommended)
Ask a test question
In the chat, type: What is OpenRAG and how does it work?
Click “Send” or press Enter.
Observe the response
The system will:
Search in documents (100-200 ms)
Generate a response with the LLM (5-15 s after first load)
Display sources below with relevance scores
Important: The first query takes 70-90 seconds or more (loading LLM model into RAM — CPU mode is always slow).
Option 2: Via REST API (curl)
Check API health
curl http://localhost:8000/health | jq
Expected response:
{
  "status": "healthy",
  "timestamp": "2026-02-18T...",
  "version": "1.1.0",
  "services": {
    "database": "healthy",
    "redis": "healthy",
    "vector_store": "healthy",
    "orchestrator": "healthy"
  }
}
Do a simple search (without LLM)
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{
"query": "configuration settings",
"collection_id": "default",
"max_results": 3,
"use_llm": false
}' | jq
Returns similar documents with relevance scores.
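If you post-process the search response in code, a helper that ranks hits by score is handy. This is a sketch only: the `results`, `filename`, and `score` field names are assumptions about the response shape, so adjust them to what your `/query` endpoint actually returns.

```python
def top_sources(response, threshold=0.0):
    """Extract (filename, score) pairs from a search response, best first.
    Field names ('results', 'filename', 'score') are assumed, not confirmed."""
    hits = [
        (r.get("filename", "?"), r.get("score", 0.0))
        for r in response.get("results", [])
    ]
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])
```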
Make a query with LLM
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{
"query": "What are the main features described in the documentation?",
"collection_id": "default",
"max_results": 5,
"use_llm": true
}' | jq -r '.answer'
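The same calls translate directly to Python if you prefer scripting over curl. A minimal stdlib sketch (the `build_payload` and `query` helpers are hypothetical, not part of OpenRAG), using the request fields shown above:

```python
import json
import urllib.request

def build_payload(text, use_llm=True, collection_id="default", max_results=5):
    """Assemble the JSON body expected by POST /query."""
    return {
        "query": text,
        "collection_id": collection_id,
        "max_results": max_results,
        "use_llm": use_llm,
    }

def query(text, url="http://localhost:8000/query", **kwargs):
    """POST a question to the gateway and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text, **kwargs)).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Generous timeout: LLM answers can take 70-90 s in CPU-only mode.
    with urllib.request.urlopen(req, timeout=180) as resp:
        return json.load(resp)
```

For example, `query("What are the main features?", max_results=5)["answer"]` mirrors the curl call above.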
Each LLM query takes 70-90 seconds or more (CPU-only mode, llama3.1:8b).
Upload Your Own Documents
Via Admin Interface (Recommended)
Go to Upload
Click “Upload” in the sidebar
Select a PDF file
Click “Browse files”
Choose a PDF
Fill in metadata (optional)
Click “Upload”
Verify processing
Go to “Documents” section
Check status (processing → processed)
Allow 10-30 seconds per document depending on size
Via API
curl -X POST http://localhost:8000/documents/upload \
-F "file=@my_document.pdf" \
-F "collection_id=default" \
  -F 'metadata={"category": "guide", "source": "documentation"}'
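The shell quoting around the `metadata` field is easy to get wrong; building the JSON with `json.dumps` sidesteps escaping mistakes entirely. Below is a stdlib-only sketch of the same multipart upload (`multipart_form` is a hypothetical helper; the `requests` library would do this for you if installed):

```python
import json
import uuid

def multipart_form(fields, file_field, filename, file_bytes,
                   content_type="application/pdf"):
    """Hand-build a multipart/form-data body for the upload endpoint.
    Returns (body_bytes, content_type_header)."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [f"--{boundary}",
                  f'Content-Disposition: form-data; name="{name}"', "", value]
    lines += [f"--{boundary}",
              f'Content-Disposition: form-data; name="{file_field}"; '
              f'filename="{filename}"',
              f"Content-Type: {content_type}", ""]
    body = "\r\n".join(lines).encode() + b"\r\n" + file_bytes
    body += f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

metadata = json.dumps({"category": "guide", "source": "documentation"})
body, ctype = multipart_form(
    {"collection_id": "default", "metadata": metadata},
    "file", "my_document.pdf", b"%PDF-1.4 ...",
)
```

POST `body` to `http://localhost:8000/documents/upload` with the `Content-Type` header set to `ctype` (e.g. via `urllib.request.Request`).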
MinIO Access (File Storage)
URL: http://localhost:9001
Credentials: admin / admin123456
Important: Change this password before any production deployment!
Useful Commands
View Logs in Real-Time
# All services
sudo docker-compose logs -f
# A specific service
sudo docker-compose logs -f orchestrator
sudo docker-compose logs -f ollama
Restart a Service
sudo docker-compose restart orchestrator
Stop Everything
sudo docker-compose down
Clean Completely (Including Data)
sudo docker-compose down -v # Also removes volumes
The -v option removes all volumes, including your documents and indexed data!
Next Steps
System Architecture Understand OpenRAG’s internal workings
Detailed Requirements GPU configuration, optimizations, production
Tests & Validation Load tests, performance, quality
API Reference Complete REST API documentation
Quick Troubleshooting
Services won’t start
# Check logs
sudo docker-compose logs -f
# Check disk space (minimum 50 GB)
df -h
# Check RAM (minimum 16 GB)
free -h
Ollama not responding
# Check if model is downloaded
docker exec -it openrag-ollama ollama list
# If absent, download it
docker exec -it openrag-ollama ollama pull llama3.1:8b
Queries very slow (>75s)
This is expected in CPU-only mode: each LLM response takes 70-90 seconds, and the first query also loads the model into RAM. For faster responses, pull a lighter model (e.g. gemma:2b) or enable GPU acceleration (see Detailed Requirements).
No results for queries
# Check if documents are processed
curl http://localhost:8000/documents | jq '.documents[] | {filename, status}'
# Status "processed" = ready
# Status "processing" = in progress (wait 10-30s)
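The jq filter above translates to a one-liner in Python if you want to gate a script on document readiness. A small sketch (`pending_documents` is a hypothetical helper), assuming the `documents` / `filename` / `status` fields shown by `GET /documents`:

```python
def pending_documents(listing):
    """Given the JSON from GET /documents, return filenames not yet processed."""
    return [d["filename"] for d in listing.get("documents", [])
            if d.get("status") != "processed"]
```

An empty return value means every document has status `processed` and is ready to query.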