Process Query
Process a user query and return an LLM-generated response
POST
Main endpoint for querying the RAG system. It runs a semantic search over your documents and generates a response from the retrieved context.
Endpoint
Request Body
The user’s question or query
ID of the collection to query (default: all collections)
max_results: Maximum number of source documents to retrieve (1-20)
use_llm: Whether to use the LLM to generate a response. If false, only the relevant sources are returned.
Metadata filters to refine the search
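A request body built from the fields above might look like the following minimal sketch. Only max_results and use_llm are named elsewhere in this reference; the keys query, collection_id, and filters, and the helper build_query_payload itself, are assumptions made for illustration. Optional fields are left out when unset so that the server's own defaults apply.

```python
import json
from typing import Any, Optional


def build_query_payload(
    query: str,
    max_results: Optional[int] = None,
    use_llm: Optional[bool] = None,
    collection_id: Optional[str] = None,
    filters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a request body, omitting optional fields that were not set."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # The documented range for max_results is 1-20.
    if max_results is not None and not 1 <= max_results <= 20:
        raise ValueError("max_results must be between 1 and 20")

    payload: dict[str, Any] = {"query": query}  # key name is an assumption
    if max_results is not None:
        payload["max_results"] = max_results
    if use_llm is not None:
        payload["use_llm"] = use_llm
    if collection_id is not None:
        payload["collection_id"] = collection_id  # hypothetical key name
    if filters:
        payload["filters"] = filters  # hypothetical key name
    return payload


if __name__ == "__main__":
    body = build_query_payload("How do I rotate API keys?", max_results=5, use_llm=False)
    print(json.dumps(body, indent=2))
```

Validating max_results on the client avoids a round trip that would otherwise end in an invalid-request error.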
Response
Unique query identifier
Response generated by the LLM (null if use_llm=false)
List of source documents used
Execution time in milliseconds
ISO 8601 query timestamp
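On the client side, the response fields listed above could be read into a typed object as in this sketch. The JSON key names (query_id, answer, sources, execution_time_ms, timestamp) are hypothetical, since this reference describes the fields without naming them; adjust them to the actual response.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class QueryResult:
    query_id: str
    answer: Optional[str]  # None when the request set use_llm=false
    sources: list[dict[str, Any]]
    execution_time_ms: float
    timestamp: datetime


def parse_query_response(data: dict[str, Any]) -> QueryResult:
    """Convert a decoded JSON response into a QueryResult (key names assumed)."""
    raw_ts = data["timestamp"]
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    if raw_ts.endswith("Z"):
        raw_ts = raw_ts[:-1] + "+00:00"
    return QueryResult(
        query_id=data["query_id"],
        answer=data.get("answer"),
        sources=list(data.get("sources", [])),
        execution_time_ms=float(data["execution_time_ms"]),
        timestamp=datetime.fromisoformat(raw_ts),
    )
```

Callers that send use_llm=false should expect answer to be None and work from sources alone.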
Examples
Error Codes
Invalid request (missing or incorrect parameters)
Server error (LLM unavailable, processing error)
Request timeout (>60 seconds)
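One way a client might react to these errors is sketched below. The numeric status codes are not listed here, so the sketch classifies by HTTP status class only: 5xx responses are treated as possibly transient and retried with exponential backoff, while 4xx responses are returned immediately because resending an invalid request will not help. The names is_retryable and call_with_retries, the send callable, and the retry settings are all assumptions.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def is_retryable(status: int) -> bool:
    """Server-side failures (5xx) may be transient; other statuses are not retried."""
    return 500 <= status <= 599


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if not is_retryable(status) or attempt == max_attempts:
            return status, body
        # Exponential backoff: base_delay, then 2x, then 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Here send is expected to wrap whatever HTTP client is in use and return the status and body. Because a request can run for up to 60 seconds before timing out, it may be worth keeping max_attempts small.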
Best Practices
Optimize Performance
- Adjust max_results to your needs (fewer results = faster)
- Use use_llm=false for pure search without generation
- Add metadata filters to narrow the search
Improve Answer Quality
- Formulate clear and precise questions
- Use terms specific to your domain
- Increase max_results for more context (5-10)
Use Metadata Filters
Handle Errors
Limitations
Technical Notes
Search Process
- Query embedding: Conversion to vector (384 dimensions)
- Vector search: K-nearest neighbors search in Qdrant (cosine similarity)
- Filtering: Application of metadata filters if provided
- Relevance threshold: Only results with score > 0.7 are kept
- Content retrieval: Getting full text of chunks
- LLM generation: Prompt construction and response generation
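The retrieval part of the steps above can be illustrated with a small in-memory sketch. It is not the system's implementation: it uses plain Python in place of Qdrant, toy 3-dimensional vectors in place of 384-dimensional embeddings, exact-match metadata filters, and it applies the filter while scanning, which is only one possible reading of the step order. The chunk layout (vector, metadata, text) and the function names are assumptions.

```python
import math
from typing import Any, Optional

SCORE_THRESHOLD = 0.7  # results must score strictly above this value


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vector: list[float],
    chunks: list[dict[str, Any]],
    max_results: int,
    filters: Optional[dict[str, Any]] = None,
) -> list[tuple[float, str]]:
    """Return up to max_results (score, text) pairs, best match first."""
    scored = []
    for chunk in chunks:
        # Filtering: skip chunks whose metadata does not match every filter.
        if filters and any(chunk["metadata"].get(k) != v for k, v in filters.items()):
            continue
        scored.append((cosine_similarity(query_vector, chunk["vector"]), chunk["text"]))
    # Vector search: keep the k nearest neighbors by cosine similarity.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:max_results]
    # Relevance threshold: drop weak matches.
    return [(score, text) for score, text in top if score > SCORE_THRESHOLD]
```

Because the threshold is applied after the top-k cut, a query can return fewer than max_results sources, or none at all when nothing scores above 0.7.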
LLM Prompt Format
See Also
Upload Documents
Add documents to the system
Collections
Organize your documents

