Changelog
0.14.2
- enable_precision_filter config flag — opt out of the result-aware second-pass filter detection without disabling the prepare-time filter intent. Useful for predictable benchmarks and A/B comparisons. Env var: RAG_ENABLE_PRECISION_FILTER. Default on.
- Iterative re-fire after rewrite — _arewrite now clears state.filter_intent on every exit path (pinpoint-rerank, pinpoint bypass, LLM rewrite, swarm fallback) so precision_filter re-evaluates against the new query's results. Previously a stale filter from the first pass survived rewrites.
- Value-validation gate — when the filter-intent LLM extracts a value, we now require it to appear (case-insensitive substring) in at least one sampled value of the chosen path. Rejects spurious mappings (e.g. user says "blue widget" and LLM hallucinates color = blue against an unrelated catalog). Traces path="value-not-in-samples" when the gate trips. Falls through unchanged when the path wasn't sampled, to avoid blocking legitimate sparse-schema cases.
- 19 new precision-filter unit tests covering re-fire, validation gate, skip conditions, hallucination rejection, and disabled-flag behavior.
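The value-validation gate from 0.14.2 can be pictured with a minimal sketch. The function name `value_passes_gate` and the shape of the `samples` mapping are assumptions made for illustration; only the case-insensitive substring rule, the fall-through for unsampled paths, and the `value-not-in-samples` trace label come from the entry above.

```python
from typing import Mapping, Optional, Sequence


def value_passes_gate(
    path: str,
    value: str,
    samples: Mapping[str, Sequence[str]],
) -> tuple[bool, Optional[str]]:
    """Return (keep_filter, trace_label) for an LLM-extracted filter value.

    Hypothetical helper: `samples` maps a field path to values sampled from it.
    """
    sampled = samples.get(path)
    if sampled is None:
        # The path was never sampled: fall through unchanged (sparse schema).
        return True, None
    needle = value.casefold()
    if any(needle in str(s).casefold() for s in sampled):
        # The value appears as a substring of at least one sampled value.
        return True, None
    # The path was sampled but the value never appears: reject and trace why.
    return False, "value-not-in-samples"
```

With samples `{"color": ["Red", "Green"]}`, a hallucinated `color = blue` is rejected, while a filter on a path that was not sampled at all is kept.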
0.14.1
- SQLite FTS dialect — uses LIKE + COLLATE NOCASE (no ILIKE on SQLite) and json1 (json_each + json_extract) for nested-array paths, e.g. stock_data.location_name CONTAINS "X" → EXISTS (SELECT 1 FROM json_each(stock_data) WHERE json_extract(value, '$.location_name') LIKE '%X%' COLLATE NOCASE).
- LanceDB dialect — flat fields keep the SQL syntax; dotted (nested) paths return empty so precision_filter falls back to its path-aware in-memory match. List[struct] doesn't have a clean predicate-substring form on DataFusion.
- 5 new tests; total 19 across all dialects.
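To see the SQLite predicate from 0.14.1 in action, the sketch below runs it against an in-memory database using Python's standard sqlite3 module. The `docs` table, its rows, and the bound `?` parameter are illustrative assumptions; the changelog shows the pattern inlined as a literal. It also assumes the bundled SQLite was built with the JSON functions, which is the common case.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: stock_data holds a JSON array of objects as text.
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, stock_data TEXT)")
rows = [
    (1, json.dumps([{"location_name": "Hunzenschwil", "qty": 4}])),
    (2, json.dumps([{"location_name": "Bern", "qty": 1}])),
    (3, json.dumps([{"location_name": "Basel"}, {"location_name": "HUNZENSCHWIL"}])),
]
conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)

# json_each expands the array into rows; json_extract reads one key per element.
SQL = """
SELECT id FROM docs
WHERE EXISTS (
    SELECT 1 FROM json_each(docs.stock_data)
    WHERE json_extract(value, '$.location_name') LIKE ? COLLATE NOCASE
)
ORDER BY id
"""
matching_ids = [row[0] for row in conn.execute(SQL, ("%hunzenschwil%",))]
```

Binding the pattern as a parameter rather than interpolating it keeps the sketch safe against quoting problems in user-supplied values.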
0.14.0
- Qdrant native nested filter support — QdrantBackend.build_filter_expr now returns a Qdrant Filter dict (not a string). Dotted paths auto-convert to Qdrant's array suffix syntax: stock_data.location_name → stock_data[].location_name. CONTAINS uses MatchText, equality uses MatchValue, NOT_CONTAINS / != go to must_not.
- SearchRequest.filter_expr widened to accept structured dicts so Qdrant Filter shapes flow end-to-end without string serialization.
- _make_search_request handles dict filters — bypasses the string-merge path when the backend's build_filter_expr produced a structured filter (e.g. Qdrant). String filters still merge with self.filter as before.
- 4 new Qdrant unit tests; total 14 nested-dialect tests across Meili / SQL / OData / Qdrant.
Backends still using the in-memory match fallback for nested paths: LanceDB list[struct], SQLite FTS, Chroma. Generic in-memory match in precision_filter covers most queries on these.
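The Qdrant mapping described in 0.14.0 might be sketched as a plain dict builder. The names `to_qdrant_key` and `build_qdrant_filter` and the operator spellings `=` and `!=` are assumptions for illustration, as is the choice to add the array suffix to the first path segment only; the dict layout follows Qdrant's JSON filter form, where a text match is `{"match": {"text": ...}}` and an exact match is `{"match": {"value": ...}}`.

```python
from typing import Any


def to_qdrant_key(path: str) -> str:
    """Convert a dotted path to Qdrant's array suffix form (first level only)."""
    head, sep, tail = path.partition(".")
    return f"{head}[].{tail}" if sep else path


def build_qdrant_filter(field: str, op: str, value: Any) -> dict:
    """Hypothetical sketch: build a Qdrant-style filter dict for one condition."""
    key = to_qdrant_key(field)
    if op in ("CONTAINS", "NOT_CONTAINS"):
        condition = {"key": key, "match": {"text": value}}  # MatchText
    elif op in ("=", "!="):
        condition = {"key": key, "match": {"value": value}}  # MatchValue
    else:
        raise ValueError(f"unsupported operator: {op!r}")
    # Negated operators are expressed by placing the condition under must_not.
    clause = "must_not" if op in ("NOT_CONTAINS", "!=") else "must"
    return {clause: [condition]}
```

Returning a dict instead of a string is what lets the filter travel through the request unchanged, which matches the widened filter_expr described above.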
0.13.2
- SQL nested-path filter compilation — dotted FilterIntent.field (e.g. stock_data.location_name) now compiles to an EXISTS … jsonb_array_elements … ->> clause; works on Postgres, pgvector, Postgres-FTS, DuckDB. Multi-level paths fall back to in-memory match.
- OData (Azure AI Search) nested-path support — dotted paths convert to slash-separated complex-type paths (stock_data/location_name) automatically; CONTAINS now emits search.ismatch instead of an invalid CONTAINS operator (pre-existing bug).
- 10 unit tests covering nested + flat compilation across Meili / SQL / OData dialects.
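As a rough illustration of the SQL nested-path compilation in 0.13.2, the sketch below turns a one-level dotted path into a Postgres-flavored EXISTS predicate. The function name `compile_contains`, the `elem` alias, the use of ILIKE, and the `%s` placeholder style are assumptions; the changelog elides the exact clause, so this is one plausible shape, not the library's output.

```python
import re
from typing import Optional

# Only plain identifiers are accepted, since they are interpolated into SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def compile_contains(field: str, value: str) -> Optional[tuple[str, list[str]]]:
    """Hypothetical sketch: compile `field CONTAINS value` to (sql, params)."""
    parts = field.split(".")
    if not all(_IDENT.fullmatch(p) for p in parts):
        raise ValueError(f"unsafe field path: {field!r}")
    params = [f"%{value}%"]
    if len(parts) == 1:
        # Flat field: ordinary substring match.
        return f"{parts[0]} ILIKE %s", params
    if len(parts) == 2:
        # One nesting level: unnest the jsonb array and test each element.
        column, key = parts
        sql = (
            f"EXISTS (SELECT 1 FROM jsonb_array_elements({column}) AS elem "
            f"WHERE elem->>'{key}' ILIKE %s)"
        )
        return sql, params
    # Multi-level paths: signal the caller to fall back to in-memory matching.
    return None
```

Returning `None` for deeper paths mirrors the stated fallback to the in-memory match instead of guessing at a deeper unnesting.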
0.13.1
- Nested-aware filter intent — schema sampling now flattens dicts and list[dict] one level, surfacing dotted paths like stock_data.location_name to the filter-intent LLM. Meili dialect passes the path through unchanged, so "in Hunzenschwil am Lager" (German for "in stock in Hunzenschwil") → stock_data.location_name CONTAINS "Hunzenschwil" automatically
- precision_filter graph node — runs after merge_rerank between retrieval and generation. Walks the actual top-k docs' metadata, builds a path→values map specific to THIS query, and asks the filter-intent LLM if a precise filter is warranted. Filters in-memory first (path-aware match for dotted paths), falls back to fresh keyword search using the canonical query if needed. Generic across all 9 backends; native nested-filter support varies (Meili full; others get safe in-memory-only path)
- Verified live on a 50k+ article catalog: "ich suche trockenbeton welches in hunzenschwil am lager ist" (German for "I am looking for dry concrete that is in stock in Hunzenschwil") → 7/7 docs with stock at Hunzenschwil
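The one-level flattening behind the path→values map in 0.13.1 could look roughly like the following. The names `sample_paths` and `_add` are hypothetical, and the choice to record only scalar leaves as strings is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Any, Iterable


def _add(paths: dict[str, set[str]], path: str, leaf: Any) -> None:
    # Record scalar leaves only; anything nested deeper is skipped,
    # because flattening stops after one level.
    if isinstance(leaf, (str, int, float, bool)):
        paths[path].add(str(leaf))


def sample_paths(docs: Iterable[dict[str, Any]]) -> dict[str, set[str]]:
    """Hypothetical sketch: map each flat or dotted path to its seen values."""
    paths: dict[str, set[str]] = defaultdict(set)
    for doc in docs:
        for key, val in doc.items():
            if isinstance(val, dict):
                for sub, leaf in val.items():
                    _add(paths, f"{key}.{sub}", leaf)
            elif isinstance(val, list) and val and all(isinstance(v, dict) for v in val):
                for item in val:
                    for sub, leaf in item.items():
                        _add(paths, f"{key}.{sub}", leaf)
            else:
                _add(paths, key, val)
    return dict(paths)
```

For a document whose stock_data is a list of dicts with a location_name key, this yields the path stock_data.location_name together with every location seen in the sampled documents.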
0.13.0
- Deterministic regex negation extraction — extract_negation_terms() runs in _aprepare_node before the synonym fanout, populating state.excluded_terms from DE/EN/FR/IT cues (aber nicht von X, ohne X, but not from X, sans X, ma non X). Synonym LLM now MERGES instead of overwriting, so LLM misses no longer leak excluded brands into results
- Elbow-method score cutoff — opt-in enable_elbow_cutoff in RAGConfig (also RAG_ENABLE_ELBOW_CUTOFF=1); cuts after the largest consecutive score drop, dropping noise tail like the bieröffner→Wickeltisch case
- Cutoff quality eval — tests/eval_v2/cutoff_eval.py implements Doug Turnbull's variable-precision / F1 / time-well-spent metrics against labeled JSONL queries
- TUI redesign — brand-aligned palette (ink #0B0F14, surface #11161D, accent #E8613C), consistent borders, orange Submit, dark Quit; elbow cutoff checkbox in setup screen, defaults on
- Docs polish — hero deduped (mark-only SVG instead of full logo with text), nav logo dark-mode swap via [data-md-color-scheme="slate"] CSS
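The elbow-method cutoff from 0.13.0 is simple enough to sketch directly: find the largest drop between consecutive scores and keep everything before it. The function name `elbow_cutoff` and the `min_keep` parameter are assumptions for illustration, and the sketch expects scores already sorted in descending order.

```python
from typing import Sequence


def elbow_cutoff(scores: Sequence[float], min_keep: int = 1) -> int:
    """Return how many leading results to keep (scores sorted descending)."""
    start = max(min_keep, 1)
    if len(scores) <= start:
        return len(scores)
    best_idx, best_drop = len(scores), 0.0
    for i in range(start, len(scores)):
        drop = scores[i - 1] - scores[i]
        if drop > best_drop:
            # Cutting at i keeps results 0..i-1, i.e. everything above the drop.
            best_idx, best_drop = i, drop
    return best_idx


scores = [0.92, 0.90, 0.88, 0.41, 0.40]
kept = scores[: elbow_cutoff(scores)]  # keeps the three scores above the big drop
```

When all scores are equal there is no drop to cut at, so the sketch keeps the full list.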
0.12.0
- Retry strategies — grader classifies failures as widen (broaden), narrow (add filter), or pinpoint (exact product-code lookup); rewrite loop follows the strategy
- Follow-up detection — single structured LLM call in _acontextualize classifies + rewrites; is_followup=true injects previous turn's documents, skipping re-retrieval
- LLM no-result suggestions — zero-result termination calls the LLM for a multilingual "nothing found" message + 2–3 alternative queries
- Score gate fix — canonical query arm is always included in RRF fusion; 0.3 gate previously dropped natural-language queries where stopwords dilute BM25 scores
- CLI modes — --retrieve (top-k docs, no LLM), --query (single Q → stdout), --plain (readline REPL)
- Typed backend parameter — Literal[...] union for IDE autocomplete and type checking
- Removed Optuna tuner — tuner.py, tuner_v2.py, dataset_builder.py deleted; use RAGConfig + custom_instructions instead
- Generic codebase — removed all customer-specific hardcoding; custom_instructions / instructions= are the documented extension points
- Default query_languages changed from ["de", "fr", "it", "en"] → ["en"]
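The score gate fix in 0.12.0 can be illustrated with a small reciprocal rank fusion (RRF) sketch. Everything about the gate's mechanics here is an assumption: the sketch drops an arm when its top score is below the gate, exempts the canonical arm, and uses the conventional RRF constant k=60. The changelog only states that the canonical arm is always fused and that the gate value is 0.3.

```python
from collections import defaultdict


def rrf_fuse(
    arms: dict[str, list[tuple[str, float]]],
    canonical: str,
    gate: float = 0.3,
    k: int = 60,
) -> list[str]:
    """Hypothetical sketch: fuse ranked arms, never gating out the canonical arm.

    Each arm is a list of (doc_id, score) pairs sorted by score, descending.
    """
    fused: dict[str, float] = defaultdict(float)
    for name, hits in arms.items():
        if not hits:
            continue
        top_score = hits[0][1]
        if name != canonical and top_score < gate:
            # Weak non-canonical arms are skipped entirely.
            continue
        for rank, (doc_id, _score) in enumerate(hits, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))
```

Because RRF works on ranks rather than raw scores, a canonical arm whose BM25 scores were diluted by stopwords still contributes its ordering once it is exempt from the gate.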
0.7.0
- Lean parallel pipeline — prepare → [keyword_search ∥ synonym_search] → evaluate → quality_gate → [semantic_backup →] merge_rerank → generate; replaces the old sequential multi-query swarm with a two-branch fan-out that is always on
- Synonym node — single LLM call in _asynonym_search_node produces spell-correction, synonym/alias expansion, and negation extraction; original state.query is never overwritten
- Spell-correction as parallel BM25 term — corrected form searched alongside original ("troceknbeton" → BM25 on "troceknbeton" + "trockenbeton"), not as a query overwrite
- Negative filter extraction — "cola aber nicht zero" → RAGState.excluded_terms = ["zero"]; post-filtered in _amerge_rerank_node; works for any negated concept
- MMR diversity pass — _mmr_diverse(lam=0.7) runs before reranking in _amerge_rerank_node to remove near-duplicate docs; bag-of-words Jaccard, no embeddings needed (source: retrievalagent/_internal/fusion.py)
- Retry reasoning — rewrite node passes the top-3 doc snippets to the LLM so it can reason about why the previous results were wrong
- Model routing clarification — llm (cheap/utility: synonyms, rewrite, quality-gate, default gpt-5.4-mini), gen_llm (generation, default gpt-5.5), grader_llm (configurable separately via config.grader_model)
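The MMR diversity pass from 0.7.0 can be sketched with bag-of-words Jaccard similarity, as the entry describes. This is not the code in retrievalagent/_internal/fusion.py: the names `mmr_diverse` and `jaccard`, the `(text, relevance)` input shape, the `limit` parameter, and the assumption that relevance is scaled to roughly 0..1 are all illustrative.

```python
from typing import Optional


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_diverse(
    docs: list[tuple[str, float]],
    lam: float = 0.7,
    limit: Optional[int] = None,
) -> list[tuple[str, float]]:
    """Greedy MMR over (text, relevance) pairs using bag-of-words Jaccard."""
    limit = len(docs) if limit is None else limit
    bags = [set(text.lower().split()) for text, _ in docs]
    remaining = list(range(len(docs)))
    selected: list[int] = []

    def mmr_score(i: int) -> float:
        # Penalize similarity to the closest document already selected.
        max_sim = max((jaccard(bags[i], bags[j]) for j in selected), default=0.0)
        return lam * docs[i][1] - (1 - lam) * max_sim

    while remaining and len(selected) < limit:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [docs[i] for i in selected]
```

With `lam=0.7` an exact duplicate of an already selected document loses 0.3 from its score, which is usually enough for a less relevant but distinct document to be picked first.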
0.6.5
- mem0_memory= parameter for LLM-based fact extraction with deduplication and conflict resolution
- Supports both Memory (sync via thread-pool) and AsyncMemory (native async)
- Strips retrieved documents from checkpointed state to keep context lean
0.6.4
- auto_strategy=True is now the default — retrievalagent is agentic out of the box
- Added multi-collection routing (collections= parameter) with LLM-based selection
- Added _MultiBackend with _ACTIVE_COLLECTIONS context-variable scoping
- Extended filter coverage: NOT CONTAINS (ILIKE) for LanceDB, DuckDB, pgvector; Qdrant server MatchText; Chroma AND filters
- Added % to _SAFE_FILTER_RE to allow ILIKE patterns
- Cleaned repo of all internal product references
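Context-variable scoping, as mentioned for _ACTIVE_COLLECTIONS in 0.6.4, generally relies on Python's contextvars module so that concurrent requests do not see each other's selection. The sketch below reuses the variable name from the entry, but its definition, the `active_collections` context manager, and the `current_collections` accessor are assumptions made for illustration.

```python
import asyncio
import contextvars
from contextlib import contextmanager
from typing import Iterable, Iterator

# Assumed definition: an empty tuple means no collection is selected.
_ACTIVE_COLLECTIONS: contextvars.ContextVar[tuple[str, ...]] = contextvars.ContextVar(
    "active_collections", default=()
)


@contextmanager
def active_collections(names: Iterable[str]) -> Iterator[None]:
    """Scope the active collections to the enclosed block (hypothetical helper)."""
    token = _ACTIVE_COLLECTIONS.set(tuple(names))
    try:
        yield
    finally:
        # Restore whatever was active before entering the block.
        _ACTIVE_COLLECTIONS.reset(token)


def current_collections() -> tuple[str, ...]:
    return _ACTIVE_COLLECTIONS.get()


async def search(names: list[str]) -> tuple[str, ...]:
    with active_collections(names):
        await asyncio.sleep(0)  # yield so the two tasks interleave
        return current_collections()


async def main() -> list[tuple[str, ...]]:
    return await asyncio.gather(search(["products"]), search(["manuals", "faq"]))


results = asyncio.run(main())
```

Each asyncio task runs in its own copy of the context, so the two searches keep separate selections even though they interleave.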
0.6.3
- Backend-aware filter translator (Meili, SQL ILIKE, OData, Chroma dict, Qdrant native)
- build_filter_expr per-backend helper
0.6.1 — 0.6.2
- LanceDB filter coverage and test suite
- Embedding API call cache (disk-based)
0.6.0
- Multi-query swarm retrieval
- LangGraph state machine refactor
- Async-native pipeline with _run_sync for sync wrappers
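A sync wrapper over an async pipeline, like the _run_sync helper named in 0.6.0, is commonly written along the following lines. This is a generic sketch, not the library's implementation: the name `run_sync` and the worker-thread fallback for callers that already sit inside a running event loop are assumptions.

```python
import asyncio
import concurrent.futures
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread: start one directly.
        return asyncio.run(coro)
    # A loop is already running (e.g. in a notebook): asyncio.run would raise,
    # so run the coroutine on a fresh loop in a worker thread and wait for it.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def answer() -> int:
    await asyncio.sleep(0)
    return 42


value = run_sync(answer())
```

The worker-thread branch blocks the calling thread until the coroutine finishes, so it suits thin sync wrappers but not code that must keep the outer loop responsive.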
Earlier
See GitHub releases for full history.