Process Query

Main endpoint for querying the RAG system. Performs a semantic search over your documents and generates a response grounded in the retrieved context.

Endpoint

POST /query

Request Body

  • query (string, required): The user's question or query
  • collection_id (string, optional): ID of the collection to query (default: all collections)
  • max_results (integer, default: 5): Maximum number of source documents to retrieve (1-20)
  • use_llm (boolean, default: true): Use the LLM to generate a response. If false, returns only the relevant sources.
  • metadata_filter (object, optional): Metadata filters to refine the search, e.g.:
{
  "document_type": "pdf",
  "category": "finance"
}
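
The same parameters combined in a Python request, as a sketch; the endpoint host matches the examples below, but the collection ID and filter values are placeholders, not real identifiers:

import requests

# Placeholder values for illustration; substitute your own collection and filters
payload = {
    "query": "What is the refund policy?",
    "collection_id": "col-finance",   # omit to search all collections
    "max_results": 5,
    "use_llm": True,
    "metadata_filter": {"document_type": "pdf", "category": "finance"},
}

response = requests.post("http://localhost:8000/query", json=payload, timeout=60)
response.raise_for_status()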

Response

  • query_id (string): Unique query identifier
  • answer (string): Response generated by the LLM (null if use_llm=false)
  • sources (array): List of the source documents used
  • execution_time_ms (integer): Execution time in milliseconds
  • timestamp (string): Query timestamp in ISO 8601 format

Examples

curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the refund policy?",
    "max_results": 3,
    "use_llm": true
  }'
{
  "query_id": "123e4567-e89b-12d3-a456-426614174000",
  "answer": "According to our documents, the refund policy allows returns within 30 days for all unused products. Refunds are processed within 7 business days to the original payment method. Return shipping costs are the customer's responsibility except for defective products.",
  "sources": [
    {
      "document_id": "doc-123",
      "filename": "refund_policy.pdf",
      "chunk_index": 2,
      "relevance_score": 0.94
    },
    {
      "document_id": "doc-456",
      "filename": "terms_conditions.pdf",
      "chunk_index": 8,
      "relevance_score": 0.87
    },
    {
      "document_id": "doc-123",
      "filename": "refund_policy.pdf",
      "chunk_index": 3,
      "relevance_score": 0.82
    }
  ],
  "execution_time_ms": 1234,
  "timestamp": "2024-02-17T10:30:45.123Z"
}
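
A minimal sketch of consuming this response in Python, using only the fields documented above:

import requests

result = requests.post(
    "http://localhost:8000/query",
    json={"query": "What is the refund policy?", "max_results": 3, "use_llm": True},
    timeout=60,
).json()

# answer is null when use_llm=false, so guard before printing
if result["answer"]:
    print(result["answer"])

# Each source identifies the document chunk that supported the answer
for src in result["sources"]:
    print(f'{src["filename"]} (chunk {src["chunk_index"]}): {src["relevance_score"]:.2f}')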

Error Codes

  • 400 Bad Request: Invalid request (missing or incorrect parameters)
  • 500 Internal Server Error: Server error (LLM unavailable, processing error)
  • 504 Gateway Timeout: Request timed out (>60 seconds)

Best Practices

  • Adjust max_results to your needs (fewer results = faster responses)
  • Use use_llm=false for pure search without generation (see the search-only sketch below)
  • Add metadata filters to narrow the search
  • Formulate clear, precise questions
  • Use terms specific to your domain
  • Increase max_results (5-10) when the answer needs more context

For example, a query filtered by metadata:

{
  "query": "What is the procedure?",
  "metadata_filter": {
    "department": "HR",
    "year": "2024"
  }
}
Handle timeouts and HTTP errors on the client side:

import requests

url = "http://localhost:8000/query"
data = {"query": "What is the refund policy?", "max_results": 5}

try:
    # Match the server's 60-second limit so the client gives up at the same point
    response = requests.post(url, json=data, timeout=60)
    response.raise_for_status()
    result = response.json()
except requests.Timeout:
    print("The request took too long")
except requests.HTTPError as e:
    print(f"HTTP Error: {e}")

Limitations

  • Timeout: 60 seconds maximum per query
  • Query length: 1000 characters maximum
  • LLM context: Limited by model’s context window (~2048-4096 tokens)
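
A sketch of enforcing these limits client-side before sending a request; the 1000-character cap and 60-second ceiling are the documented values, the helper name is illustrative:

import requests

MAX_QUERY_CHARS = 1000  # documented query-length limit

def safe_query(query: str) -> dict:
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"Query exceeds {MAX_QUERY_CHARS} characters")
    # Use the server's 60-second maximum as the client timeout as well
    resp = requests.post("http://localhost:8000/query", json={"query": query}, timeout=60)
    resp.raise_for_status()
    return resp.json()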

Technical Notes

Search Process

  1. Query embedding: Conversion to vector (384 dimensions)
  2. Vector search: K-nearest neighbors search in Qdrant (cosine similarity)
  3. Filtering: Application of metadata filters if provided
  4. Relevance threshold: Only results with score > 0.7 are kept
  5. Content retrieval: Getting full text of chunks
  6. LLM generation: Prompt construction and response generation
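
For orientation, a hedged sketch of steps 1-4 using sentence-transformers and qdrant-client; the embedding model and collection name are assumptions, since the document only specifies 384 dimensions, cosine similarity in Qdrant, and the 0.7 threshold:

from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue
from sentence_transformers import SentenceTransformer

# Assumption: all-MiniLM-L6-v2 is one model that produces the documented
# 384-dimension vectors; the service may use a different one.
model = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient("localhost", port=6333)

# Step 1: embed the query into a 384-dimension vector
query_vector = model.encode("What is the refund policy?").tolist()

# Steps 2-4: cosine k-NN search, optional metadata filter, 0.7 score threshold
hits = client.search(
    collection_name="documents",          # assumed collection name
    query_vector=query_vector,
    query_filter=Filter(must=[
        FieldCondition(key="category", match=MatchValue(value="finance")),
    ]),
    limit=5,                              # max_results
    score_threshold=0.7,                  # documented relevance threshold
)
for hit in hits:
    print(hit.id, hit.score)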

LLM Prompt Format

Provided context:
Document 1:
[Most relevant chunk content]

Document 2:
[2nd chunk content]

...

Question: [Your question]

Answer the question based ONLY on the context provided above.
If the context does not contain enough information to answer, say so clearly.
Cite the document numbers you use in your answer.
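
A sketch of assembling that prompt in Python; the chunk texts and question are placeholders, and chunks are assumed pre-sorted by relevance, most relevant first:

def build_prompt(chunks: list[str], question: str) -> str:
    # Number the chunks so the LLM can cite "Document N" in its answer
    context = "\n\n".join(
        f"Document {i}:\n{text}" for i, text in enumerate(chunks, start=1)
    )
    return (
        "Provided context:\n"
        f"{context}\n\n"
        f"Question: {question}\n\n"
        "Answer the question based ONLY on the context provided above.\n"
        "If the context does not contain enough information to answer, say so clearly.\n"
        "Cite the document numbers you use in your answer."
    )

print(build_prompt(["[Most relevant chunk]", "[2nd chunk]"], "What is the refund policy?"))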
