# Quick Start

## Install

```bash
# Recommended — Meilisearch + Cohere reranker + interactive CLI
pip install "retrievalagent[recommended]"

# Base only — in-memory backend, no external services
pip install retrievalagent
```
### Individual backends & rerankers

```bash
pip install "retrievalagent[meilisearch]"
pip install "retrievalagent[azure]"
pip install "retrievalagent[chromadb]"
pip install "retrievalagent[lancedb]"
pip install "retrievalagent[pgvector]"
pip install "retrievalagent[qdrant]"
pip install "retrievalagent[duckdb]"
pip install "retrievalagent[cohere]"
pip install "retrievalagent[huggingface]"
pip install "retrievalagent[jina]"
pip install "retrievalagent[rerankers]"
pip install "retrievalagent[embed-anything]"

# Extras can be combined
pip install "retrievalagent[qdrant,cohere,cli]"
```

Extras are quoted so that shells like zsh don't interpret the square brackets as glob patterns.
## One-liner with `init_agent`

The fastest path — string aliases for everything, no imports:

```python
from retrievalagent import init_agent

# Minimal — in-memory backend, LLM from env vars
rag = init_agent("docs")

# OpenAI + Qdrant + Cohere reranker
rag = init_agent(
    "my-collection",
    model="openai:gpt-5.4",
    backend="qdrant",
    backend_url="http://localhost:6333",
    reranker="cohere",
)

# Fully local — Ollama + ChromaDB + HuggingFace
rag = init_agent(
    "docs",
    model="ollama:llama3",
    backend="chroma",
    reranker="huggingface",
)

# Anthropic + Azure AI Search (server-side vectorisation)
rag = init_agent(
    "my-index",
    model="anthropic:claude-sonnet-4-6",
    gen_model="anthropic:claude-opus-4-7",
    backend="azure",
    backend_url="https://my-search.search.windows.net",
)
```
Model strings follow the `provider:model-name` format — `openai`, `anthropic`, `azure_openai`, `google_vertexai`, `ollama`, `groq`, `mistralai`, and any other LangChain provider.
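A model string is just a provider prefix and a model name separated by the first colon. A minimal sketch of how such a string can be split — purely illustrative, not the library's actual parsing code:

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split a "provider:model-name" string into its two parts.

    Illustrative only — mirrors the documented format, not
    retrievalagent's internal implementation.
    """
    # partition on the FIRST colon, so model names containing
    # colons are left intact
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model-name', got {model!r}")
    return provider, name
```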
## Your first query

```python
from retrievalagent import init_agent, ConversationTurn

rag = init_agent(
    "docs",
    model="openai:gpt-5.4",
    backend="qdrant",
    backend_url="http://localhost:6333",
)

# Full agentic answer
state = rag.invoke("What is hybrid search?")
print(state.answer)
print(f"Sources: {len(state.documents)}")

# Multi-turn chat
history: list[ConversationTurn] = []
state = rag.chat("What is hybrid search?", history)
history.append(ConversationTurn(question="What is hybrid search?", answer=state.answer))
state = rag.chat("How does it compare to pure vector search?", history)
```
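The append-a-turn-after-each-answer pattern above is all the history bookkeeping the chat API needs. The sketch below shows one way such a history can be flattened into a plain-text transcript; the `Turn` dataclass stands in for `ConversationTurn`, and `render_history` is an illustration of the idea, not part of retrievalagent's API:

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """Stand-in for retrievalagent's ConversationTurn: a question/answer pair."""
    question: str
    answer: str


def render_history(history: list[Turn]) -> str:
    """Flatten prior turns into a plain-text transcript.

    Illustrative only — shows how multi-turn context can be carried
    into the next query; the library does this for you internally.
    """
    lines: list[str] = []
    for turn in history:
        lines.append(f"User: {turn.question}")
        lines.append(f"Assistant: {turn.answer}")
    return "\n".join(lines)
```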
## Manual setup

For full control over the backend instance:

```python
from retrievalagent import Agent, InMemoryBackend

# my_embed_fn: any callable that maps a string to an embedding vector
backend = InMemoryBackend(embed_fn=my_embed_fn)
backend.add_documents([
    {"content": "RAG combines retrieval with generation", "source": "wiki"},
    {"content": "Vector search finds similar embeddings", "source": "docs"},
])

rag = Agent(index="demo", backend=backend)
state = rag.invoke("What is retrieval-augmented generation?")
print(state.answer)
```
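For quick experiments without an embedding service, `embed_fn` can be any deterministic toy function. A hashed bag-of-words sketch — good enough to exercise the in-memory backend, far too crude for real retrieval quality; use a proper embedding model in practice:

```python
import hashlib


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic hashed bag-of-words embedding (illustrative only).

    Each token is hashed into one of `dim` buckets and the resulting
    count vector is L2-normalised, so cosine similarity behaves sanely.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec
```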
## Environment variables

| Variable | Purpose |
|---|---|
| `OPENAI_API_KEY` | OpenAI API key |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint (auto-detected) |
| `AZURE_OPENAI_API_KEY` | Azure OpenAI key |
| `AZURE_OPENAI_DEPLOYMENT` | Deployment name |
| `COHERE_API_KEY` | Cohere reranker key |
| `JINA_API_KEY` | Jina reranker key |
Set these in a `.env` file — retrievalagent loads it automatically via `python-dotenv` if installed.
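A `.env` for the OpenAI + Qdrant + Cohere setup above might look like this (placeholder values — substitute your own keys):

```bash
# LLM and reranker credentials (placeholders)
OPENAI_API_KEY=sk-your-key-here
COHERE_API_KEY=your-cohere-key-here
```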
## Next steps

- Backends — configure each backend
- Reranking — choose and tune rerankers
- Filtering — narrow results with metadata filters
- Multi-Collection — route queries across multiple indexes
- Examples — 10 ready-to-run patterns