Quick Start

Install

# Recommended — Meilisearch + Cohere reranker + interactive CLI
pip install retrievalagent[recommended]

# Base only — in-memory backend, no external services
pip install retrievalagent
Individual backends & rerankers

pip install retrievalagent[meilisearch]
pip install retrievalagent[azure]
pip install retrievalagent[chromadb]
pip install retrievalagent[lancedb]
pip install retrievalagent[pgvector]
pip install retrievalagent[qdrant]
pip install retrievalagent[duckdb]
pip install retrievalagent[cohere]
pip install retrievalagent[huggingface]
pip install retrievalagent[jina]
pip install retrievalagent[rerankers]
pip install retrievalagent[embed-anything]
Mix freely: pip install retrievalagent[qdrant,cohere,cli]


One-liner with init_agent

The fastest path — string aliases for everything, no imports:

from retrievalagent import init_agent

# Minimal — in-memory backend, LLM from env vars
rag = init_agent("docs")

# OpenAI + Qdrant + Cohere reranker
rag = init_agent(
    "my-collection",
    model="openai:gpt-5.4",
    backend="qdrant",
    backend_url="http://localhost:6333",
    reranker="cohere",
)

# Fully local — Ollama + ChromaDB + HuggingFace
rag = init_agent(
    "docs",
    model="ollama:llama3",
    backend="chroma",
    reranker="huggingface",
)

# Anthropic + Azure AI Search (server-side vectorisation)
rag = init_agent(
    "my-index",
    model="anthropic:claude-sonnet-4-6",
    gen_model="anthropic:claude-opus-4-7",
    backend="azure",
    backend_url="https://my-search.search.windows.net",
)

Model strings follow "provider:model-name". Supported providers include openai, anthropic, azure_openai, google_vertexai, ollama, groq, mistralai, and any other LangChain provider.
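
As a sketch, the same pattern applies to any of the listed providers. The model names below are placeholders, not recommendations; substitute models you actually have access to:

# Illustrative provider strings only; swap in your own model names
rag = init_agent("docs", model="groq:llama-3.1-8b-instant")
rag = init_agent("docs", model="mistralai:mistral-large-latest")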


Your first query

from retrievalagent import init_agent

rag = init_agent("docs", model="openai:gpt-5.4", backend="qdrant",
                 backend_url="http://localhost:6333")

# Full agentic answer
state = rag.invoke("What is hybrid search?")
print(state.answer)
print(f"Sources: {len(state.documents)}")

# Multi-turn chat
from retrievalagent import ConversationTurn
history: list[ConversationTurn] = []

state = rag.chat("What is hybrid search?", history)
history.append(ConversationTurn(question="What is hybrid search?", answer=state.answer))
state = rag.chat("How does it compare to pure vector search?", history)

Manual setup

For full control over the backend instance:

from retrievalagent import Agent, InMemoryBackend

backend = InMemoryBackend(embed_fn=my_embed_fn)  # my_embed_fn: your embedding function (see sketch below)
backend.add_documents([
    {"content": "RAG combines retrieval with generation", "source": "wiki"},
    {"content": "Vector search finds similar embeddings", "source": "docs"},
])

rag = Agent(index="demo", backend=backend)

state = rag.invoke("What is retrieval-augmented generation?")
print(state.answer)
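
InMemoryBackend needs an embedding function. Its exact signature is not documented here, so the sketch below assumes embed_fn maps a list of strings to a list of float vectors, using the OpenAI embeddings client as one possible implementation:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed signature: list of texts in, list of float vectors out
def my_embed_fn(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [item.embedding for item in response.data]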

Environment variables

Variable                  Purpose
OPENAI_API_KEY            OpenAI API key
AZURE_OPENAI_ENDPOINT     Azure OpenAI endpoint (auto-detected)
AZURE_OPENAI_API_KEY      Azure OpenAI key
AZURE_OPENAI_DEPLOYMENT   Deployment name
COHERE_API_KEY            Cohere reranker key
JINA_API_KEY              Jina reranker key

Set in a .env file — retrievalagent loads it automatically via python-dotenv if installed.
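
For example, a minimal .env for the OpenAI + Qdrant + Cohere setup above (values are placeholders):

OPENAI_API_KEY=sk-...
COHERE_API_KEY=...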


Next steps