Qdrant - Documentation

Qdrant in OpenRAG

Qdrant is the high-performance vector database used for semantic search in OpenRAG.

Configuration

Image: qdrant/qdrant:latest

Ports:

6333: HTTP API
6334: gRPC API

Volume: qdrant_data:/qdrant/storage

Dashboard: http://localhost:6333/dashboard

Vector Configuration

Dimension: 384 (matches sentence-transformers all-MiniLM-L6-v2)
Distance Metric: Cosine similarity
Index Type: HNSW (Hierarchical Navigable Small World)
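As a rough illustration of what the Cosine distance metric computes, the following pure-Python sketch scores two vectors. The function name cosine_similarity and the toy vectors are illustrative only; Qdrant performs this computation internally, and real embeddings would come from the embedding model.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 384-dimensional vectors standing in for real embeddings.
v1 = [1.0] * 384
v2 = [1.0] * 192 + [0.0] * 192

print(round(cosine_similarity(v1, v1), 3))
print(round(cosine_similarity(v1, v2), 3))
```

A dimension mismatch raises an error here, which mirrors why queries must use the same 384-dimensional model as the stored vectors.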

Collections

OpenRAG uses Qdrant collections to organize vectors.

Default Collection: default

Collection Parameters:

{
    "vectors": {
        "size": 384,
        "distance": "Cosine"
    },
    "optimizers_config": {
        "indexing_threshold": 20000
    },
    "hnsw_config": {
        "m": 16,
        "ef_construct": 100
    }
}
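The parameters above could be assembled and sent from Python roughly as follows. This is a minimal sketch using only the standard library; the helper names build_collection_config and create_collection_request are hypothetical, and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_collection_config(size=384, distance="Cosine",
                            indexing_threshold=20000, m=16, ef_construct=100):
    """Return the collection parameters from this section as a dict."""
    return {
        "vectors": {"size": size, "distance": distance},
        "optimizers_config": {"indexing_threshold": indexing_threshold},
        "hnsw_config": {"m": m, "ef_construct": ef_construct},
    }


def create_collection_request(name, config, base_url="http://localhost:6333"):
    """Build (but do not send) the PUT request that creates a collection."""
    return urllib.request.Request(
        url=f"{base_url}/collections/{name}",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = create_collection_request("default", build_collection_config())
print(request.get_method(), request.full_url)

# To send it against a running Qdrant instance:
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```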

REST API Usage

List Collections:

curl http://localhost:6333/collections | jq

Response:

{
  "result": {
    "collections": [
      {
        "name": "default"
      }
    ]
  },
  "status": "ok",
  "time": 0.000123
}

Get Collection Info:

curl http://localhost:6333/collections/default | jq

Response:

{
  "result": {
    "status": "green",
    "points_count": 928,
    "indexed_vectors_count": 928,
    "vectors_count": 928,
    "config": {
      "params": {
        "vectors": {
          "size": 384,
          "distance": "Cosine"
        }
      }
    }
  }
}

Search Vectors:

# The "..." is elided: supply all 384 vector components before running
curl -X POST http://localhost:6333/collections/default/points/search \
  -H "Content-Type: application/json" \
  -d '{
    "vector": [0.1, 0.2, ...],
    "limit": 5,
    "with_payload": true
  }'
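Because the search body must carry a complete 384-component vector, it can help to build it programmatically. The sketch below assumes a hypothetical helper, build_search_body, and uses a constant placeholder vector in place of a real query embedding.

```python
import json

EXPECTED_DIM = 384  # from the Vector Configuration section


def build_search_body(vector, limit=5, score_threshold=None):
    """Build the JSON body for POST /collections/{name}/points/search."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
    body = {"vector": list(vector), "limit": limit, "with_payload": True}
    if score_threshold is not None:
        body["score_threshold"] = score_threshold
    return json.dumps(body)


placeholder = [0.1] * EXPECTED_DIM  # stand-in for a real query embedding
payload = build_search_body(placeholder, limit=5, score_threshold=0.3)
print(len(json.loads(payload)["vector"]))
```

The resulting string can be passed to curl with -d or sent with any HTTP client.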

Insert Point:

curl -X PUT http://localhost:6333/collections/default/points \
  -H "Content-Type: application/json" \
  -d '{
    "points": [
      {
        "id": "550e8400-e29b-41d4-a716-446655440000",
        "vector": [0.1, 0.2, ...],
        "payload": {
          "document_id": "abc123",
          "chunk_index": 0,
          "text": "Sample text..."
        }
      }
    ]
  }'
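Qdrant point IDs must be unsigned integers or UUIDs, and a PUT with an existing ID overwrites that point. One possible way to make re-ingestion idempotent is to derive the ID from the document and chunk index. The sketch below is an assumption, not OpenRAG's documented behaviour; the namespace string and helper names are hypothetical.

```python
import uuid

# Hypothetical namespace: any fixed UUID works as long as it never changes.
OPENRAG_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "openrag/chunks")


def chunk_point_id(document_id, chunk_index):
    """Derive a stable UUID for a chunk so re-uploads overwrite, not duplicate."""
    return str(uuid.uuid5(OPENRAG_NAMESPACE, f"{document_id}:{chunk_index}"))


def make_point(document_id, chunk_index, text, vector):
    """Build one entry for the "points" array of the insert request."""
    return {
        "id": chunk_point_id(document_id, chunk_index),
        "vector": list(vector),
        "payload": {
            "document_id": document_id,
            "chunk_index": chunk_index,
            "text": text,
        },
    }


point = make_point("abc123", 0, "Sample text...", [0.1] * 384)
print(point["id"])
```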

Python Client Usage

OpenRAG uses the qdrant-client Python library:

import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(host="qdrant", port=6333)

# Create collection
client.create_collection(
    collection_name="documents_embeddings",
    vectors_config=VectorParams(
        size=384,
        distance=Distance.COSINE
    )
)

# Insert vectors (embedding, doc_id, idx, and chunk_text are assumed to come from the ingestion pipeline)
client.upsert(
    collection_name="documents_embeddings",
    points=[
        PointStruct(
            id=str(uuid.uuid4()),
            vector=embedding,
            payload={
                "document_id": doc_id,
                "chunk_index": idx,
                "content": chunk_text
            }
        )
    ]
)

# Search (query_embedding is the embedded user query)
results = client.search(
    collection_name="documents_embeddings",
    query_vector=query_embedding,
    limit=5,
    score_threshold=0.3
)

Current Data in Qdrant

Test Results (after uploading 31 WTE documents):

curl http://localhost:6333/collections/default | jq '.result | {points: .points_count, status: .status}'

Output:

{
  "points": 928,
  "status": "green"
}

Breakdown:

Documents processed: 28
Total chunks: 928
Average chunks per document: 33
Vector dimension: 384
Storage status: green (healthy)
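The arithmetic behind the breakdown can be checked in a few lines. The size estimate assumes vectors are stored as 4-byte float32 values and ignores payloads and index overhead, so treat it as a lower bound rather than a measured figure.

```python
points = 928
documents_processed = 28
dimension = 384
BYTES_PER_FLOAT32 = 4  # assumption: float32 vector storage

average_chunks = points / documents_processed
raw_vector_bytes = points * dimension * BYTES_PER_FLOAT32

print(f"average chunks per document: {average_chunks:.1f}")  # 33.1
print(f"raw vector data: {raw_vector_bytes / 1024:.0f} KB")   # 1392 KB
```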

Search Performance

Search Latency: 50-150ms (typical)

Throughput: Hundreds of searches per second

Accuracy: Cosine similarity scores range from -1.0 (opposite) through 0.0 (orthogonal) to 1.0 (same direction); scores for text embeddings usually fall between 0.0 and 1.0

Typical Score Thresholds:

0.7+: Highly relevant
0.5-0.7: Moderately relevant
0.3-0.5: Potentially relevant
<0.3: Not relevant (filtered out in OpenRAG)
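The bands above can be applied to search hits with a small helper. This sketch treats each lower bound as inclusive, which the list does not specify, and the function names and sample scores are made up for illustration.

```python
def relevance_label(score):
    """Map a cosine similarity score to the relevance bands listed above."""
    if score >= 0.7:
        return "highly relevant"
    if score >= 0.5:
        return "moderately relevant"
    if score >= 0.3:
        return "potentially relevant"
    return "not relevant"


def keep_relevant(hits, threshold=0.3):
    """Drop hits scoring below the threshold."""
    return [hit for hit in hits if hit["score"] >= threshold]


# Made-up scores for illustration only.
hits = [{"id": 1, "score": 0.82}, {"id": 2, "score": 0.41}, {"id": 3, "score": 0.12}]
for hit in keep_relevant(hits):
    print(hit["id"], relevance_label(hit["score"]))
```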

Dashboard Features

Access at http://localhost:6333/dashboard

Features:

Collection browser
Point inspector
Search testing
Performance metrics
Configuration viewer

Collection Management

Create New Collection:

curl -X PUT http://localhost:6333/collections/my_collection \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 384,
      "distance": "Cosine"
    }
  }'

Delete Collection:

curl -X DELETE http://localhost:6333/collections/my_collection

Collection Aliases:

curl -X POST http://localhost:6333/collections/aliases \
  -H "Content-Type: application/json" \
  -d '{
    "actions": [
      {
        "create_alias": {
          "collection_name": "default",
          "alias_name": "production"
        }
      }
    ]
  }'
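Aliases are mostly useful for switching traffic to a rebuilt collection. All actions in a single request are applied together, so an alias can be moved by deleting and recreating it in one call. The sketch below only builds the request body; the collection name default_v2 and the helper name are hypothetical.

```python
import json


def swap_alias_actions(alias_name, new_collection):
    """Build a body for POST /collections/aliases that repoints an alias."""
    return {
        "actions": [
            {"delete_alias": {"alias_name": alias_name}},
            {
                "create_alias": {
                    "collection_name": new_collection,
                    "alias_name": alias_name,
                }
            },
        ]
    }


body = swap_alias_actions("production", "default_v2")
print(json.dumps(body, indent=2))
```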

Filtering

Qdrant supports payload filtering:

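from qdrant_client.models import FieldCondition, Filter, MatchValue

# Restrict the search to points whose "category" payload field matches
results = client.search(
    collection_name="default",
    query_vector=embedding,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="category",
                match=MatchValue(value="cisco_phones"),
            )
        ]
    ),
    limit=10
)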
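When calling the REST API directly, the same kind of filter is sent as plain JSON under the "filter" key of the search body. A minimal sketch of a builder for exact-match conditions follows; build_match_filter is a hypothetical helper, and only "must" clauses with "match" conditions are covered.

```python
def build_match_filter(**conditions):
    """Build a filter requiring every given payload field to match exactly."""
    return {
        "must": [
            {"key": key, "match": {"value": value}}
            for key, value in conditions.items()
        ]
    }


category_filter = build_match_filter(category="cisco_phones")
print(category_filter)

# Several conditions are combined with AND semantics under "must".
combined = build_match_filter(category="cisco_phones", document_id="abc123")
print(len(combined["must"]))
```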

Optimization

HNSW Parameters:

m: Number of edges per node in the HNSW graph; higher values improve accuracy at the cost of memory (16 recommended)
ef_construct: Construction time/accuracy tradeoff (100-200 recommended)
ef: Search time/accuracy tradeoff, set per request as hnsw_ef (typically 128)

Indexing Threshold:

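The HNSW index is built for a segment once its vector data exceeds the threshold
Default: 20000, measured in kilobytes of vector data in current Qdrant versions (not a point count)
Lower to index sooner and speed up search on small collections, higher for faster insertions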
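The search-time ef value is supplied per request in the "params" object of the search body as hnsw_ef, and setting exact to true bypasses the index for a full scan, which is handy as an accuracy baseline. The helper below is a hypothetical sketch that only builds that object.

```python
def search_params(hnsw_ef=None, exact=False):
    """Build the optional "params" object for a search request body."""
    params = {}
    if hnsw_ef is not None:
        if hnsw_ef < 1:
            raise ValueError("hnsw_ef must be a positive integer")
        params["hnsw_ef"] = hnsw_ef
    if exact:
        params["exact"] = True  # full scan: slow but exact
    return params


# Merge into a search body alongside "vector" and "limit".
body = {"vector": [0.1] * 384, "limit": 5, "params": search_params(hnsw_ef=128)}
print(body["params"])
```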

Backup and Restore

Create Snapshot:

curl -X POST http://localhost:6333/collections/default/snapshots

List Snapshots:

curl http://localhost:6333/collections/default/snapshots | jq

Download Snapshot:

curl -o snapshot.tar http://localhost:6333/collections/default/snapshots/snapshot_name
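The snapshot_name in the download URL comes from the list response, whose "result" array holds one object per snapshot with a name and creation time. A minimal sketch for picking the newest one is shown below; the helper name and the sample listing are made up for illustration.

```python
from datetime import datetime


def latest_snapshot_url(listing, collection="default",
                        base_url="http://localhost:6333"):
    """Return the download URL of the newest snapshot, or None if there are none."""
    snapshots = listing.get("result", [])
    if not snapshots:
        return None
    newest = max(
        snapshots,
        key=lambda snap: datetime.fromisoformat(snap["creation_time"]),
    )
    return f"{base_url}/collections/{collection}/snapshots/{newest['name']}"


# Made-up listing in the shape returned by GET /collections/default/snapshots.
sample_listing = {
    "result": [
        {"name": "default-a.snapshot", "creation_time": "2024-01-01T10:00:00", "size": 1},
        {"name": "default-b.snapshot", "creation_time": "2024-02-01T10:00:00", "size": 1},
    ],
    "status": "ok",
}
print(latest_snapshot_url(sample_listing))
```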

Restore (via volume mount):

# Stop Qdrant
sudo docker-compose stop qdrant

# Copy snapshot to volume
sudo docker cp snapshot.tar openrag-qdrant:/qdrant/storage/

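# Restart
sudo docker-compose start qdrant

# Recover the collection from the copied file (copying alone does not load it)
curl -X PUT http://localhost:6333/collections/default/snapshots/recover \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/storage/snapshot.tar"}'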

Monitoring

Collection Stats:

curl http://localhost:6333/collections/default | jq '.result | {
  points: .points_count,
  indexed: .indexed_vectors_count,
  segments: .segments_count,
  status: .status
}'

Cluster Info (if using cluster mode):

curl http://localhost:6333/cluster | jq
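The collection stats can also be reduced to a simple health summary in a script. The sketch below works on the parsed JSON of the collection info response; summarize_collection is a hypothetical helper, and a low indexed ratio is not necessarily a problem, since small collections may stay below the indexing threshold.

```python
def summarize_collection(info):
    """Condense a GET /collections/{name} response into a few health fields."""
    result = info["result"]
    points = result.get("points_count") or 0
    indexed = result.get("indexed_vectors_count") or 0
    return {
        "status": result.get("status"),
        "points": points,
        "indexed_ratio": indexed / points if points else 0.0,
        "healthy": result.get("status") == "green",
    }


# Values copied from the sample response in the REST API Usage section.
sample_info = {
    "result": {"status": "green", "points_count": 928, "indexed_vectors_count": 928}
}
print(summarize_collection(sample_info))
```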

Troubleshooting

No Results Returned:

Check score_threshold (try lowering to 0.2)
Verify vector dimensions match (384)
Ensure collection has points
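The three checks above could be automated along these lines. This is only a sketch: the function name is hypothetical, and the 0.2 cutoff simply reuses the suggestion from the list rather than a tuned value.

```python
def diagnose_empty_results(query_vector, points_count, score_threshold,
                           expected_dim=384):
    """Return a list of likely reasons a search came back empty."""
    causes = []
    if len(query_vector) != expected_dim:
        causes.append(
            f"query vector has {len(query_vector)} dimensions, expected {expected_dim}"
        )
    if points_count == 0:
        causes.append("collection has no points")
    if score_threshold is not None and score_threshold > 0.2:
        causes.append(f"score_threshold {score_threshold} may be too strict; try 0.2")
    return causes


for cause in diagnose_empty_results([0.1] * 768, points_count=928, score_threshold=0.3):
    print(cause)
```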

Slow Search:

Lower the ef (hnsw_ef) search parameter; higher values improve accuracy but slow down queries
Optimize HNSW configuration
Check system resources

Storage Issues:

Monitor disk space
Create snapshots and clean up old data
Consider collection partitioning

View Logs:

sudo docker logs openrag-qdrant --tail=100

API Reference

Full Qdrant API documentation: https://qdrant.tech/documentation/

Common endpoints used in OpenRAG:

GET /collections: List all collections
GET /collections/{name}: Collection info
POST /collections/{name}/points/search: Vector search
PUT /collections/{name}/points: Insert/update points
POST /collections/{name}/points/delete: Delete points
