Multi-Collection Routing
When your data lives in multiple indexes, retrievalagent can route each query to the right subset automatically. Before retrieval, an LLM routing step selects which collections are relevant — only those are searched.
Setup
from retrievalagent import init_agent

rag = init_agent(
    collections={
        "products": "Product catalog: SKUs, prices, specs, availability",
        "faq": "Customer-facing FAQ, troubleshooting, return policy",
        "policies": "Internal HR and compliance policy documents",
    },
    backend="qdrant",
    backend_url="http://localhost:6333",
    model="openai:gpt-5.4",
)

# Alternative setup: pass an explicit backend per collection,
# with the routing descriptions supplied separately.
from retrievalagent import Agent
from retrievalagent.backend import MeilisearchBackend

backends = {
    "products": MeilisearchBackend("products"),
    "manuals": MeilisearchBackend("manuals"),
}

rag = Agent(
    index="catalog",
    collections=backends,
    collection_descriptions={
        "products": "Product listings with prices and specs",
        "manuals": "Installation and user guides",
    },
)
How it works
- The query arrives at invoke/chat.
- An LLM call selects the relevant collection names (using names + optional descriptions).
- Only the selected backends are searched; the context variable _ACTIVE_COLLECTIONS scopes retrieval.
- Each retrieved document gets a metadata["_collection"] tag with its source collection.
If the LLM returns an empty selection (uncertain), retrievalagent falls back to searching all collections.
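The steps above, including the fallback, can be sketched in plain Python. This is an illustrative sketch only, not retrievalagent's actual implementation: the names Doc, ACTIVE_COLLECTIONS, route, and retrieve are hypothetical, the select callable stands in for the LLM routing call, and each backend is modeled as a simple function from query to documents. Dropping unknown collection names returned by the selector is also an assumption.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for the library's internal context variable.
ACTIVE_COLLECTIONS = ContextVar("active_collections", default=())


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def route(query: str, descriptions: Dict[str, str],
          select: Callable[[str, Dict[str, str]], List[str]]) -> List[str]:
    """Pick collections for a query; fall back to all if none are selected."""
    # Assumption: names the selector invents are ignored.
    chosen = [name for name in select(query, descriptions) if name in descriptions]
    return chosen or list(descriptions)


def retrieve(query: str,
             backends: Dict[str, Callable[[str], List[Doc]]],
             descriptions: Dict[str, str],
             select: Callable[[str, Dict[str, str]], List[str]]) -> List[Doc]:
    """Search only the routed collections and tag each document with its source."""
    token = ACTIVE_COLLECTIONS.set(tuple(route(query, descriptions, select)))
    try:
        docs: List[Doc] = []
        for name in ACTIVE_COLLECTIONS.get():
            for doc in backends[name](query):
                doc.metadata["_collection"] = name
                docs.append(doc)
        return docs
    finally:
        # Restore the previous scope so later calls are unaffected.
        ACTIVE_COLLECTIONS.reset(token)
```

With a selector that returns ["faq"], only the faq backend is called. With a selector that returns an empty list, every backend is searched, which mirrors the fallback described above.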
Collection metadata
state = rag.invoke("What's our return policy?")

# Print the source collection and the first 80 characters of each document.
for doc in state.documents:
    print(doc.metadata["_collection"], doc.page_content[:80])
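The _collection tag also makes it easy to group results by source, for example to show which collections contributed to an answer. The sketch below is a minimal example under assumptions: Doc is a hypothetical stand-in with the same page_content and metadata attributes used above, and group_by_collection is an illustrative helper, not part of retrievalagent.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Doc:
    """Stand-in for a retrieved document; not the library's own class."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def group_by_collection(documents: Iterable[Doc]) -> Dict[str, List[Doc]]:
    """Bucket documents by their metadata["_collection"] tag."""
    groups: Dict[str, List[Doc]] = defaultdict(list)
    for doc in documents:
        # Documents without a tag are grouped under "unknown" (an assumption).
        groups[doc.metadata.get("_collection", "unknown")].append(doc)
    return dict(groups)
```

In practice the same helper could be applied to state.documents, assuming those objects expose a metadata dict as in the snippet above.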