Qdrant in OpenRAG

Qdrant is the high-performance vector database used for semantic search in OpenRAG.

Configuration

Image: qdrant/qdrant:latest
Ports:
  • 6333: HTTP API
  • 6334: gRPC API
Volume: qdrant_data:/qdrant/storage
Dashboard: http://localhost:6333/dashboard

Vector Configuration

Dimension: 384 (matches sentence-transformers all-MiniLM-L6-v2)
Distance Metric: Cosine similarity
Index Type: HNSW (Hierarchical Navigable Small World)
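To make the distance metric concrete, here is a minimal pure-Python sketch of cosine similarity, the score Qdrant reports for this configuration. The function name cosine_similarity is illustrative and not part of OpenRAG or qdrant-client, and the toy vectors stand in for real 384-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors; OpenRAG's real vectors have 384 components.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Because only the angle matters, scaling a vector does not change its score, which is why the first pair scores as identical.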

Collections

OpenRAG uses Qdrant collections to organize vectors.
Default Collection: default
Collection Parameters:
{
    "vectors": {
        "size": 384,
        "distance": "Cosine"
    },
    "optimizers_config": {
        "indexing_threshold": 20000
    },
    "hnsw_config": {
        "m": 16,
        "ef_construct": 100
    }
}
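The parameters above can also be assembled in code before being sent as a create-collection request body. The sketch below is one possible helper; build_collection_config is a hypothetical name, and its defaults simply mirror the values shown above.

```python
import json


def build_collection_config(size=384, distance="Cosine", m=16,
                            ef_construct=100, indexing_threshold=20000):
    """Assemble a create-collection request body as a plain dict."""
    if size <= 0:
        raise ValueError("vector size must be positive")
    return {
        "vectors": {"size": size, "distance": distance},
        "optimizers_config": {"indexing_threshold": indexing_threshold},
        "hnsw_config": {"m": m, "ef_construct": ef_construct},
    }


config = build_collection_config()
body = json.dumps(config, indent=4)  # JSON text usable as a request body
```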

REST API Usage

List Collections:
curl http://localhost:6333/collections | jq
Response:
{
  "result": {
    "collections": [
      {
        "name": "default"
      }
    ]
  },
  "status": "ok",
  "time": 0.000123
}
Get Collection Info:
curl http://localhost:6333/collections/default | jq
Response:
{
  "result": {
    "status": "green",
    "points_count": 928,
    "indexed_vectors_count": 928,
    "vectors_count": 928,
    "config": {
      "params": {
        "vectors": {
          "size": 384,
          "distance": "Cosine"
        }
      }
    }
  }
}
Search Vectors (the vector is abbreviated here; a real request must supply all 384 values):
curl -X POST http://localhost:6333/collections/default/points/search \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.1, 0.2, ...],
    "limit": 5,
    "with_payload": true
  }'
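Since the curl example abbreviates the vector, a full request is easier to build in code. The following sketch uses only the Python standard library to construct the same search request without sending it. build_search_request and QDRANT_URL are illustrative names, and the constant query vector is a placeholder for a real embedding.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # same host and port as the curl examples


def build_search_request(vector, collection="default", limit=5, expected_dim=384):
    """Build (but do not send) a POST request for the points/search endpoint."""
    if len(vector) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vector)}")
    body = {"vector": list(vector), "limit": limit, "with_payload": True}
    return urllib.request.Request(
        url=f"{QDRANT_URL}/collections/{collection}/points/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder query vector; a real one comes from the embedding model.
request = build_search_request([0.1] * 384)
# To execute against a running Qdrant instance:
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["result"]
```

Checking the dimension before sending gives a clearer error than a rejected request from the server.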
Insert Point:
curl -X PUT http://localhost:6333/collections/default/points \
  -H "Content-Type: application/json" \
  -d '{
    "points": [
      {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "vector": [0.1, 0.2, ...],
        "payload": {
          "document_id": "abc123",
          "chunk_index": 0,
          "text": "Sample text..."
        }
      }
    ]
  }'
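A request body with several points follows the same shape. As a rough sketch, the helper below pairs text chunks with their vectors and produces the "points" structure used above. build_upsert_body is a hypothetical name, the payload keys are copied from the curl example, and the vectors are dummies.

```python
import uuid


def build_upsert_body(document_id, chunks, vectors):
    """Pair each text chunk with its vector and wrap them as Qdrant points."""
    if len(chunks) != len(vectors):
        raise ValueError("each chunk needs exactly one vector")
    points = []
    for index, (text, vector) in enumerate(zip(chunks, vectors)):
        points.append({
            "id": str(uuid.uuid4()),  # random UUID, as in the example above
            "vector": list(vector),
            "payload": {
                "document_id": document_id,
                "chunk_index": index,
                "text": text,
            },
        })
    return {"points": points}


# Placeholder data: two chunks with dummy 384-dimensional vectors.
body = build_upsert_body("abc123", ["first chunk", "second chunk"],
                         [[0.1] * 384, [0.2] * 384])
```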

Python Client Usage

OpenRAG uses the qdrant-client Python library:
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(host="qdrant", port=6333)

# Create collection
client.create_collection(
    collection_name="documents_embeddings",
    vectors_config=VectorParams(
        size=384,
        distance=Distance.COSINE
    )
)

# Insert vectors
client.upsert(
    collection_name="documents_embeddings",
    points=[
        PointStruct(
            id=str(uuid.uuid4()),
            vector=embedding,
            payload={
                "document_id": doc_id,
                "chunk_index": idx,
                "content": chunk_text
            }
        )
    ]
)

# Search
results = client.search(
    collection_name="documents_embeddings",
    query_vector=query_embedding,
    limit=5,
    score_threshold=0.3
)

Current Data in Qdrant

Test Results (after uploading 31 WTE documents):
curl http://localhost:6333/collections/default | jq '.result | {points: .points_count, status: .status}'
Output:
{
  "points": 928,
  "status": "green"
}
Breakdown:
  • Documents processed: 28
  • Total chunks: 928
  • Average chunks per document: 33
  • Vector dimension: 384
  • Storage status: green (healthy)
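The figures in the breakdown can be checked with a few lines of arithmetic. The sketch below also estimates the raw size of the stored vectors, assuming each component is a 32-bit float; that storage format is an assumption, and the estimate ignores index and payload overhead.

```python
points_count = 928        # from the collection info above
documents_processed = 28  # from the breakdown above
dimension = 384
bytes_per_float32 = 4     # assumption: vectors stored as 32-bit floats

average_chunks = points_count / documents_processed
raw_vector_bytes = points_count * dimension * bytes_per_float32

print(f"average chunks per document: {average_chunks:.1f}")
print(f"raw vector data: {raw_vector_bytes / 1024 / 1024:.2f} MiB")
```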

Search Performance

Search Latency: 50-150ms (typical)
Throughput: Hundreds of searches per second
Scores: Cosine similarity ranges from -1.0 (opposite direction) through 0.0 (orthogonal) to 1.0 (identical direction)
Typical Score Thresholds:
  • 0.7+: Highly relevant
  • 0.5-0.7: Moderately relevant
  • 0.3-0.5: Potentially relevant
  • <0.3: Not relevant (filtered out in OpenRAG)
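The threshold bands above translate directly into a small post-processing step. This is a minimal sketch, not OpenRAG's actual code: relevance_label and filter_hits are hypothetical helpers, scores that fall exactly on a boundary are assigned to the higher band (an assumption, since the listed ranges overlap at 0.5 and 0.7), and the sample scores are made up.

```python
def relevance_label(score):
    """Map a similarity score to the relevance bands listed above."""
    if score >= 0.7:
        return "highly relevant"
    if score >= 0.5:
        return "moderately relevant"
    if score >= 0.3:
        return "potentially relevant"
    return "not relevant"


def filter_hits(hits, threshold=0.3):
    """Keep (id, score) pairs at or above the threshold, best first."""
    kept = [hit for hit in hits if hit[1] >= threshold]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


# Made-up scores for illustration only.
sample_hits = [("a", 0.82), ("b", 0.28), ("c", 0.55), ("d", 0.31)]
labelled = [(pid, relevance_label(s)) for pid, s in filter_hits(sample_hits)]
```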

Dashboard Features

Access at http://localhost:6333/dashboard
Features:
  • Collection browser
  • Point inspector
  • Search testing
  • Performance metrics
  • Configuration viewer

Collection Management

Create New Collection:
curl -X PUT http://localhost:6333/collections/my_collection \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 384,
      "distance": "Cosine"
    }
  }'
Delete Collection:
curl -X DELETE http://localhost:6333/collections/my_collection
Collection Aliases:
curl -X POST http://localhost:6333/collections/aliases \
  -H "Content-Type: application/json" \
  -d '{
    "actions": [
      {
        "create_alias": {
          "collection_name": "default",
          "alias_name": "production"
        }
      }
    ]
  }'
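Aliases are mainly useful for repointing a stable name at a new collection. As a sketch, the helper below builds a request body that deletes and recreates an alias in one call to the aliases endpoint, using Qdrant's delete_alias and create_alias actions. build_alias_switch and the collection name default_v2 are hypothetical.

```python
import json


def build_alias_switch(alias_name, new_collection):
    """Build an aliases request body that repoints an alias in one call."""
    return {
        "actions": [
            {"delete_alias": {"alias_name": alias_name}},
            {"create_alias": {"collection_name": new_collection,
                              "alias_name": alias_name}},
        ]
    }


# "default_v2" is a hypothetical collection name used only for illustration.
body = json.dumps(build_alias_switch("production", "default_v2"))
```

Sending both actions in one request avoids a window in which the alias does not exist.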

Filtering

Qdrant supports payload filtering:
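from qdrant_client.models import FieldCondition, Filter, MatchValue

# Only return points whose payload has category == "cisco_phones"
results = client.search(
    collection_name="default",
    query_vector=embedding,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="cisco_phones"),
            )
        ]
    ),
    limit=10
)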
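In REST JSON form, a filter is a "must" list of key/match clauses, all of which have to hold. The sketch below builds that structure from keyword arguments and evaluates it locally to illustrate the AND semantics. Both helpers are hypothetical, and matches is only a local approximation of equality matching, not Qdrant's implementation.

```python
def build_must_filter(**conditions):
    """Build a filter requiring every given payload key to equal its value."""
    return {
        "must": [
            {"key": key, "match": {"value": value}}
            for key, value in conditions.items()
        ]
    }


def matches(payload, query_filter):
    """Evaluate the 'must' clauses locally against one payload dict."""
    return all(
        payload.get(clause["key"]) == clause["match"]["value"]
        for clause in query_filter["must"]
    )


# Combining "category" with "document_id" is illustrative only.
query_filter = build_must_filter(category="cisco_phones", document_id="abc123")
```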

Optimization

HNSW Parameters:
  • m: Number of connections per layer (16 recommended)
  • ef_construct: Construction time/accuracy tradeoff (100-200 recommended)
  • ef: Search time/accuracy tradeoff (dynamic, typically 128)
Indexing Threshold:
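  • Segments get an HNSW index only once they exceed the threshold; smaller segments are searched by full scan
  • Default: 20000 (Qdrant measures this value in kilobytes of vector data, not in points)
  • Lower it to build the index sooner, raise it for faster bulk insertions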

Backup and Restore

Create Snapshot:
curl -X POST http://localhost:6333/collections/default/snapshots
List Snapshots:
curl http://localhost:6333/collections/default/snapshots | jq
Download Snapshot:
curl -o snapshot.tar http://localhost:6333/collections/default/snapshots/snapshot_name
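Restore (via volume mount):
# Stop Qdrant
sudo docker-compose stop qdrant

# Copy snapshot to volume
sudo docker cp snapshot.tar openrag-qdrant:/qdrant/storage/

# Restart
sudo docker-compose start qdrant

# Recover the collection from the copied file; copying alone does not load it
curl -X PUT http://localhost:6333/collections/default/snapshots/recover \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/storage/snapshot.tar"}'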

Monitoring

Collection Stats:
curl http://localhost:6333/collections/default | jq '.result | {
  points: .points_count,
  indexed: .indexed_vectors_count,
  segments: .segments_count,
  status: .status
}'
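The same stats can be reduced to a health summary in code. This sketch takes an already-parsed collection-info response and derives the share of indexed vectors. summarize_collection is a hypothetical helper, and the sample dict is trimmed from the "Get Collection Info" response shown earlier.

```python
def summarize_collection(info):
    """Reduce a collection-info response to a few health indicators."""
    result = info["result"]
    points = result.get("points_count") or 0
    indexed = result.get("indexed_vectors_count") or 0
    return {
        "status": result.get("status"),
        "points": points,
        "indexed": indexed,
        "indexed_ratio": indexed / points if points else 0.0,
        "healthy": result.get("status") == "green",
    }


# Sample response trimmed from the "Get Collection Info" example.
sample = {"result": {"status": "green", "points_count": 928,
                     "indexed_vectors_count": 928}}
summary = summarize_collection(sample)
```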
Cluster Info (if using cluster mode):
curl http://localhost:6333/cluster | jq

Troubleshooting

No Results Returned:
  • Check score_threshold (try lowering to 0.2)
  • Verify vector dimensions match (384)
  • Ensure collection has points
Slow Search:
  • Lower the ef parameter (higher values improve recall but slow each search)
  • Optimize HNSW configuration
  • Check system resources
Storage Issues:
  • Monitor disk space
  • Create snapshots and clean up old data
  • Consider collection partitioning
View Logs:
sudo docker logs openrag-qdrant --tail=100
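For the "No Results Returned" case, the dimension check can be automated before a query is sent. The sketch below is one way to do it with the standard library. check_query_vector is a hypothetical helper, and the extra checks for non-finite and all-zero vectors are additions beyond the list above.

```python
import math

EXPECTED_DIM = 384  # dimension of the collection's vectors


def check_query_vector(vector, expected_dim=EXPECTED_DIM):
    """Return a list of problems found in a query vector (empty if none)."""
    problems = []
    if len(vector) != expected_dim:
        problems.append(f"dimension is {len(vector)}, expected {expected_dim}")
    if any(not math.isfinite(x) for x in vector):
        problems.append("vector contains NaN or infinite values")
    if vector and all(x == 0 for x in vector):
        problems.append("vector is all zeros, so cosine similarity is undefined")
    return problems


ok = check_query_vector([0.1] * 384)
bad = check_query_vector([0.1] * 768)
```

A 768-dimensional vector, for example, would indicate that a different embedding model produced the query than the one used at indexing time.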

API Reference

Full Qdrant API documentation: https://qdrant.tech/documentation/
Common endpoints used in OpenRAG:
  • GET /collections: List all collections
  • GET /collections/{name}: Collection info
  • POST /collections/{name}/points/search: Vector search
  • PUT /collections/{name}/points: Insert/update points
  • POST /collections/{name}/points/delete: Delete points