
Welcome to OpenRAG 🚀

OpenRAG is a complete, modular, and production-ready RAG (Retrieval-Augmented Generation) solution. It allows you to query your documents using advanced language models with precise and relevant context.

What is a RAG system?

A RAG system combines information retrieval with text generation to provide accurate answers based on your own documents:
  1. Retrieval: Finds relevant passages in your document base
  2. Augmentation: Enriches the query with the retrieved context
  3. Generation: Generates a coherent response with an LLM
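
In practice, these three steps are usually exposed as a single request to the REST API: the client sends a question, the orchestrator retrieves matching passages from Qdrant, builds an augmented prompt, and returns the generated answer. A minimal sketch, assuming a hypothetical /query route on the api service (the exact route and JSON fields are not documented on this page):

# Hypothetical request; adjust the route and fields to the actual API
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What is our refund policy?"}'
# The response would typically contain the generated answer plus the source passages used as context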

[Diagram: RAG Workflow]

Main Features

  • Document Upload: PDF, DOCX, TXT, Markdown (automatic processing)
  • Semantic Search: Advanced vector search with Qdrant
  • Answer Generation: Ollama, OpenAI, Anthropic Claude
  • Modular Architecture: Decoupled microservices with Docker

Main Components

OpenRAG consists of 10 Docker services:
Service        | Port      | Role
frontend-user  | 8501      | User chat interface (Streamlit)
frontend-admin | 8502      | Admin panel, upload, stats (Streamlit)
api            | 8000      | REST API (FastAPI)
orchestrator   | 8001      | RAG workflow coordination
embedding      | 8002      | Embeddings generation (sentence-transformers)
ollama         | 11434     | Local LLM server
qdrant         | 6333      | Vector database
postgres       | 5432      | Metadata and history
redis          | 6379      | Cache and queues
minio          | 9000/9001 | File storage (S3-compatible)
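
Once the stack is running, several of these services can be probed directly on the ports above. The Ollama, Qdrant, and MinIO routes below are those services' standard REST endpoints; the interactive API docs URL assumes FastAPI's default /docs path, which this page does not confirm.

# Ollama: list locally available models
curl http://localhost:11434/api/tags
# Qdrant: list vector collections
curl http://localhost:6333/collections
# MinIO: liveness check
curl http://localhost:9000/minio/health/live
# OpenRAG REST API: interactive docs (FastAPI default path, if unchanged)
# -> open http://localhost:8000/docs in a browser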

Use Cases

  • Create an AI assistant that knows all your internal documents, procedures, and company policies.
  • Automatically answer questions based on your product documentation and FAQ.
  • Explore and synthesize large collections of scientific research papers.

Quick Start

1. Clone and Launch

git clone https://github.com/3ntrop1a/openrag.git
cd openrag
sudo docker-compose up -d
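
To confirm that all ten containers came up before moving on, list their states with the same docker-compose binary used above:

sudo docker-compose ps
# every service (api, orchestrator, qdrant, ollama, ...) should be listed as running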
2. Download LLM Model

docker exec -it openrag-ollama ollama pull llama3.1:8b
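
To check that the pull completed, list the models now available inside the Ollama container:

docker exec -it openrag-ollama ollama list
# llama3.1:8b should appear in the output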
3. Open Chat Interface

Navigate to http://localhost:8501 and ask your first question!
First time? Initial startup takes 10-15 minutes (downloading Docker images + 4.9 GB LLM model)

Next Steps

Technical Characteristics

Performance
  • With GPU: 1-3s per query
  • Without GPU: 5-15s per query (after warm-up)
  • Vector search: 100-200ms
  • Indexing: 10-30s per PDF document

Scalability
  • Horizontally scalable microservices architecture
  • Support for millions of documents
  • Redis for distributed cache
  • PostgreSQL for metadata
  • Qdrant for high-performance vector search

Privacy
  • Data stored locally (no third-party cloud)
  • Docker service isolation
  • Support for local LLMs (Ollama) for total privacy
  • Compatible with cloud LLMs (OpenAI, Claude) if desired

Flexibility
  • Easy integration of new document formats
  • Multi-LLM support (Ollama, OpenAI, Anthropic)
  • REST API for integration into your applications (see the sketch after this list)
  • Multiple collections for document segmentation
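
As a sketch of that REST integration, the request below uploads a document for indexing. The /documents route and the form field name are assumptions made for illustration; check the actual API schema (for example via the FastAPI docs page) before relying on them.

# Hypothetical upload route; adjust to the real OpenRAG API
curl -X POST http://localhost:8000/documents \
  -F "file=@employee_handbook.pdf"
# Once indexing finishes (typically 10-30s per PDF), the document is searchable from the chat interface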

Support and Contributions