Changelog

0.14.2

  • enable_precision_filter config flag — opt out of the result-aware second-pass filter detection without disabling the prepare-time filter intent. Useful for predictable benchmarks and A/B comparisons. Env var: RAG_ENABLE_PRECISION_FILTER. Default on.
  • Iterative re-fire after rewrite: _arewrite now clears state.filter_intent on every exit path (pinpoint-rerank, pinpoint bypass, llm rewrite, swarm fallback) so precision_filter re-evaluates against the new query's results. Previously a stale filter from the first pass survived rewrites.
  • Value-validation gate — when the filter-intent LLM extracts a value, we now require it to appear (case-insensitive substring) in at least one sampled value of the chosen path. Rejects spurious mappings (e.g. user says "blue widget" and LLM hallucinates color = blue against an unrelated catalog). Traces path="value-not-in-samples" when the gate trips. Falls through unchanged when the path wasn't sampled, to avoid blocking legitimate sparse-schema cases.
  • 19 new precision-filter unit tests covering re-fire, validation gate, skip conditions, hallucination rejection, and disabled-flag behavior.
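
The value-validation gate described above can be sketched roughly as follows. This is an illustration only, not the library's code: the name value_passes_gate and the shape of the samples mapping (path to a list of sampled values) are assumptions, and an empty sample list is treated as "not sampled".

```python
from typing import Any, Mapping, Sequence


def value_passes_gate(path: str, value: str, samples: Mapping[str, Sequence[Any]]) -> bool:
    """Hypothetical gate: accept `value` only if it occurs in a sampled value of `path`."""
    sampled = samples.get(path)
    if not sampled:
        # Path was not sampled: fall through unchanged so that
        # sparse-schema cases are not blocked.
        return True
    needle = value.casefold()
    # Case-insensitive substring check against every sampled value.
    return any(needle in str(candidate).casefold() for candidate in sampled)
```

With samples such as {"color": ["Red", "Green"]}, an extracted color = blue would be rejected, while a value for a path that was never sampled passes through untouched.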

0.14.1

  • SQLite FTS dialect — uses LIKE + COLLATE NOCASE (no ILIKE on SQLite) and json1 (json_each + json_extract) for nested-array paths, e.g. stock_data.location_name CONTAINS "X" → EXISTS (SELECT 1 FROM json_each(stock_data) WHERE json_extract(value, '$.location_name') LIKE '%X%' COLLATE NOCASE).
  • LanceDB dialect — flat fields keep the SQL syntax; dotted (nested) paths return empty so precision_filter falls back to its path-aware in-memory match. List[struct] doesn't have a clean predicate-substring form on DataFusion.
  • 5 new tests; total 19 across all dialects.
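
To see the json1 form above in action, here is a small self-contained sketch using Python's built-in sqlite3 module. The docs table, its columns, and the find_by_location helper are hypothetical, and the sketch assumes the SQLite build bundled with Python includes the JSON functions.

```python
import json
import sqlite3
from typing import List


def find_by_location(conn: sqlite3.Connection, needle: str) -> List[int]:
    """Return ids of docs where any stock_data element's location_name contains needle."""
    sql = """
        SELECT id FROM docs
        WHERE EXISTS (
            SELECT 1 FROM json_each(docs.stock_data)
            WHERE json_extract(value, '$.location_name') LIKE ? COLLATE NOCASE
        )
        ORDER BY id
    """
    # The needle is bound as a parameter; % and _ inside it are NOT escaped here.
    return [row[0] for row in conn.execute(sql, (f"%{needle}%",))]


# Hypothetical demo data: stock_data holds a JSON array of objects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, stock_data TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        (1, json.dumps([{"location_name": "Hunzenschwil", "qty": 4}])),
        (2, json.dumps([{"location_name": "Bern", "qty": 1}])),
        (3, json.dumps([])),
    ],
)
matches = find_by_location(conn, "hunzen")
```

The EXISTS subquery iterates over the array elements of each row, so a document matches as soon as one element satisfies the predicate.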

0.14.0

  • Qdrant native nested filter support: QdrantBackend.build_filter_expr now returns a Qdrant Filter dict (not a string). Dotted paths auto-convert to Qdrant's array suffix syntax: stock_data.location_name → stock_data[].location_name. CONTAINS uses MatchText, equality uses MatchValue, NOT_CONTAINS / != go to must_not.
  • SearchRequest.filter_expr widened to accept structured dicts so Qdrant Filter shapes flow end-to-end without string serialization.
  • _make_search_request handles dict filters — bypasses the string-merge path when the backend's build_filter_expr produced a structured filter (e.g. Qdrant). String filters still merge with self.filter as before.
  • 4 new Qdrant unit tests; total 14 nested-dialect tests across Meili / SQL / OData / Qdrant.

Backends still using the in-memory match fallback for nested paths: LanceDB list[struct], SQLite FTS, Chroma. Generic in-memory match in precision_filter covers most queries on these.
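
As a rough picture of the Qdrant mapping in this release, the sketch below builds a plain dict in the shape of Qdrant's JSON filter format. The function to_qdrant_filter and the operator spellings are assumptions made for illustration, and only a single nesting level is handled; the actual build_filter_expr output may differ.

```python
from typing import Any, Dict

# Assumed operator spellings; the match kinds follow Qdrant's JSON filter format.
_POSITIVE = {"CONTAINS": "text", "=": "value"}
_NEGATIVE = {"NOT_CONTAINS": "text", "!=": "value"}


def to_qdrant_filter(field: str, op: str, value: Any) -> Dict[str, Any]:
    """Hypothetical translation of one (field, op, value) triple into a filter dict."""
    head, dot, tail = field.partition(".")
    # One-level dotted path becomes Qdrant's array-of-objects key syntax.
    key = f"{head}[].{tail}" if dot else field
    if op in _POSITIVE:
        clause, match_kind = "must", _POSITIVE[op]
    elif op in _NEGATIVE:
        clause, match_kind = "must_not", _NEGATIVE[op]
    else:
        raise ValueError(f"unsupported operator: {op}")
    return {clause: [{"key": key, "match": {match_kind: value}}]}
```

Because the result is a dict rather than a string, it can be passed along as-is instead of being merged into a string filter expression.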

0.13.2

  • SQL nested-path filter compilation — dotted FilterIntent.field (e.g. stock_data.location_name) now compiles to an EXISTS … jsonb_array_elements … ->> clause; works on Postgres, pgvector, Postgres-FTS, DuckDB. Multi-level paths fall back to in-memory match.
  • OData (Azure AI Search) nested-path support — dotted paths convert to slash-separated complex-type paths (stock_data/location_name) automatically; CONTAINS now emits search.ismatch instead of an invalid CONTAINS operator (pre-existing bug).
  • 10 unit tests covering nested + flat compilation across Meili / SQL / OData dialects.
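
The SQL nested-path compilation above might look roughly like this. It is a sketch under assumptions: compile_contains and the elem alias are hypothetical names, the %s placeholder follows the psycopg parameter style, and returning None stands in for "fall back to in-memory match" on multi-level paths.

```python
import re
from typing import List, Optional, Tuple

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def compile_contains(field: str, value: str) -> Optional[Tuple[str, List[str]]]:
    """Hypothetical compiler for `field CONTAINS value` targeting Postgres-style SQL."""
    parts = field.split(".")
    # Identifiers are interpolated into the SQL text, so validate them strictly.
    if not all(_IDENT.match(part) for part in parts):
        raise ValueError(f"unsafe field name: {field!r}")
    pattern = f"%{value}%"  # the value itself is passed as a bound parameter
    if len(parts) == 1:
        return f"{parts[0]} ILIKE %s", [pattern]
    if len(parts) == 2:
        column, key = parts
        sql = (
            f"EXISTS (SELECT 1 FROM jsonb_array_elements({column}) AS elem "
            f"WHERE elem->>'{key}' ILIKE %s)"
        )
        return sql, [pattern]
    return None  # multi-level path: caller falls back to in-memory matching
```

Keeping the value out of the SQL text avoids quoting problems; only the validated identifiers are interpolated.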

0.13.1

  • Nested-aware filter intent — schema sampling now flattens dicts and list[dict] one level, surfacing dotted paths like stock_data.location_name to the filter-intent LLM. Meili dialect passes the path through unchanged, so "in Hunzenschwil am Lager" → stock_data.location_name CONTAINS "Hunzenschwil" automatically
  • precision_filter graph node — runs after merge_rerank between retrieval and generation. Walks the actual top-k docs' metadata, builds a path→values map specific to THIS query, and asks the filter-intent LLM if a precise filter is warranted. Filters in-memory first (path-aware match for dotted paths), falls back to fresh keyword search using the canonical query if needed. Generic across all 9 backends; native nested-filter support varies (Meili full; others get safe in-memory-only path)
  • Verified live on a 50k+ article catalog: "ich suche trockenbeton welches in hunzenschwil am lager ist" → 7/7 docs with stock at Hunzenschwil
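
The one-level flattening used for schema sampling can be illustrated with a short sketch. The function name sample_paths and the returned path-to-values mapping are assumptions for this example, not the library's internal API.

```python
from typing import Any, Dict, Iterable, List, Mapping


def sample_paths(docs: Iterable[Mapping[str, Any]]) -> Dict[str, List[Any]]:
    """Collect values per path, flattening dicts and lists of dicts one level deep."""
    paths: Dict[str, List[Any]] = {}
    for doc in docs:
        for key, val in doc.items():
            if isinstance(val, dict):
                # Nested dict: expose each inner key as "outer.inner".
                for inner_key, inner_val in val.items():
                    paths.setdefault(f"{key}.{inner_key}", []).append(inner_val)
            elif isinstance(val, list) and val and all(isinstance(x, dict) for x in val):
                # List of dicts: same dotted path, one value per list element.
                for item in val:
                    for inner_key, inner_val in item.items():
                        paths.setdefault(f"{key}.{inner_key}", []).append(inner_val)
            else:
                # Flat field (or anything deeper than one level): keep as-is.
                paths.setdefault(key, []).append(val)
    return paths
```

For a document with stock_data = [{"location_name": "Hunzenschwil"}, {"location_name": "Bern"}], this yields a stock_data.location_name path with both location names as sampled values.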

0.13.0

  • Deterministic regex negation extraction: extract_negation_terms() runs in _aprepare_node before the synonym fanout, populating state.excluded_terms from DE/EN/FR/IT cues (aber nicht von X, ohne X, but not from X, sans X, ma non X). The synonym LLM now merges its terms into that list instead of overwriting it, so terms the LLM misses no longer leak excluded brands into results
  • Elbow-method score cutoff — opt-in enable_elbow_cutoff in RAGConfig (also RAG_ENABLE_ELBOW_CUTOFF=1); cuts after the largest consecutive score drop, which removes the low-score noise tail (e.g. the bieröffner→Wickeltisch case)
  • Cutoff quality eval: tests/eval_v2/cutoff_eval.py implements Doug Turnbull's variable-precision / F1 / time-well-spent metrics against labeled JSONL queries
  • TUI redesign — brand-aligned palette (ink #0B0F14, surface #11161D, accent #E8613C), consistent borders, orange Submit, dark Quit; elbow cutoff checkbox in setup screen, defaults on
  • Docs polish — hero deduped (mark-only SVG instead of full logo with text), nav logo dark-mode swap via [data-md-color-scheme="slate"] CSS
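
The elbow-method cutoff from this release can be sketched in a few lines. This is an illustrative version, assuming scores are already sorted in descending order; the function name elbow_cutoff and the handling of flat score lists are assumptions.

```python
from typing import Sequence


def elbow_cutoff(scores: Sequence[float]) -> int:
    """Return how many leading results to keep: cut after the largest consecutive drop."""
    if len(scores) < 2:
        return len(scores)
    # Drop between each neighbouring pair of scores.
    drops = [scores[i] - scores[i + 1] for i in range(len(scores) - 1)]
    largest = max(range(len(drops)), key=lambda i: drops[i])
    if drops[largest] <= 0:
        # No real drop anywhere (flat scores): keep everything.
        return len(scores)
    return largest + 1
```

For scores like 0.92, 0.90, 0.88, 0.41, 0.40 the largest drop sits between the third and fourth result, so the first three are kept.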

0.12.0

  • Retry strategies — grader classifies failures as widen (broaden), narrow (add filter), or pinpoint (exact product-code lookup); rewrite loop follows the strategy
  • Follow-up detection — single structured LLM call in _acontextualize classifies + rewrites; is_followup=true injects previous turn's documents, skipping re-retrieval
  • LLM no-result suggestions — zero-result termination calls the LLM for a multilingual "nothing found" message + 2–3 alternative queries
  • Score gate fix — canonical query arm is always included in RRF fusion; 0.3 gate previously dropped natural-language queries where stopwords dilute BM25 scores
  • CLI modes: --retrieve (top-k docs, no LLM), --query (single Q → stdout), --plain (readline REPL)
  • Typed backend parameter: Literal[...] union for IDE autocomplete and type checking
  • Removed Optuna tuner: tuner.py, tuner_v2.py, dataset_builder.py deleted; use RAGConfig + custom_instructions instead
  • Generic codebase — removed all customer-specific hardcoding; custom_instructions / instructions= are the documented extension points
  • Default query_languages changed from ["de", "fr", "it", "en"] to ["en"]
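
For readers unfamiliar with RRF, the score gate fix above can be pictured with the sketch below. It uses the standard reciprocal rank fusion formula with k = 60; how the library actually applies its 0.3 gate is not documented here, so the fuse_with_gate helper, the per-arm top score, and the data shapes are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Standard reciprocal rank fusion: each list contributes 1 / (k + rank) per doc."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def fuse_with_gate(
    canonical: Sequence[str],
    arms: Sequence[Tuple[float, Sequence[str]]],
    gate: float = 0.3,
) -> List[str]:
    """Hypothetical gate: the canonical arm is always fused, other arms only if
    their top score reaches the gate."""
    kept = [ids for top_score, ids in arms if top_score >= gate]
    return rrf_fuse([list(canonical)] + [list(ids) for ids in kept])
```

The point of the fix is visible in fuse_with_gate: the canonical ranking never passes through the gate, so a natural-language query with diluted BM25 scores still contributes its results.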

0.7.0

  • Lean parallel pipeline: prepare → [keyword_search ∥ synonym_search] → evaluate → quality_gate → [semantic_backup →] merge_rerank → generate; replaces the old sequential multi-query swarm with a two-branch fan-out that is always on
  • Synonym node — single LLM call in _asynonym_search_node produces spell-correction, synonym/alias expansion, and negation extraction; original state.query is never overwritten
  • Spell-correction as parallel BM25 term — corrected form searched alongside original ("troceknbeton" → BM25 on "troceknbeton" + "trockenbeton"), not as a query overwrite
  • Negative filter extraction: "cola aber nicht zero" → RAGState.excluded_terms = ["zero"]; post-filtered in _amerge_rerank_node; works for any negated concept
  • MMR diversity pass: _mmr_diverse(lam=0.7) runs before reranking in _amerge_rerank_node to remove near-duplicate docs; bag-of-words Jaccard, no embeddings needed (source: retrievalagent/_internal/fusion.py)
  • Retry reasoning — rewrite node passes the top-3 doc snippets to the LLM so it can reason about why the previous results were wrong
  • Model routing clarification: llm (cheap/utility: synonyms, rewrite, quality-gate, default gpt-5.4-mini), gen_llm (generation, default gpt-5.5), grader_llm (configurable separately via config.grader_model)
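
The MMR diversity pass with bag-of-words Jaccard similarity can be illustrated as follows. This is a generic greedy MMR sketch, not the code in retrievalagent/_internal/fusion.py: the function names, the (text, relevance) input shape, and the assumption that relevance lies in the 0 to 1 range are all made up for the example.

```python
from typing import List, Optional, Sequence, Tuple


def jaccard(a: str, b: str) -> float:
    """Bag-of-words Jaccard similarity on lowercased whitespace tokens."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def mmr_select(
    docs: Sequence[Tuple[str, float]], lam: float = 0.7, top_k: Optional[int] = None
) -> List[str]:
    """Greedy MMR: trade relevance (weight lam) against similarity to already-picked docs."""
    remaining = list(range(len(docs)))
    selected: List[int] = []
    limit = len(docs) if top_k is None else top_k

    def mmr_score(i: int) -> float:
        redundancy = max((jaccard(docs[i][0], docs[j][0]) for j in selected), default=0.0)
        return lam * docs[i][1] - (1.0 - lam) * redundancy

    while remaining and len(selected) < limit:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [docs[i][0] for i in selected]
```

With lam=0.7, a near-duplicate of an already selected document is penalized enough that a less relevant but different document can be picked ahead of it.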

0.6.5

  • mem0_memory= parameter for LLM-based fact extraction with deduplication and conflict resolution
  • Supports both Memory (sync via thread-pool) and AsyncMemory (native async)
  • Strips retrieved documents from checkpointed state to keep context lean

0.6.4

  • auto_strategy=True is now the default — retrievalagent is agentic out of the box
  • Added multi-collection routing (collections= parameter) with LLM-based selection
  • Added _MultiBackend with _ACTIVE_COLLECTIONS context-variable scoping
  • Extended filter coverage: NOT CONTAINS (ILIKE) for LanceDB, DuckDB, pgvector; Qdrant server MatchText; Chroma AND filters
  • Added % to _SAFE_FILTER_RE to allow ILIKE patterns
  • Cleaned repo of all internal product references
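
Context-variable scoping of the kind mentioned for _ACTIVE_COLLECTIONS generally follows the pattern below. This is a generic contextvars sketch; the names _ACTIVE, use_collections, and active_collections are hypothetical and do not reflect how _MultiBackend is actually implemented.

```python
import contextvars
from contextlib import contextmanager
from typing import Iterable, Iterator, List, Optional, Sequence, Tuple

# Hypothetical stand-in for a module-level context variable.
_ACTIVE: contextvars.ContextVar[Optional[Tuple[str, ...]]] = contextvars.ContextVar(
    "active_collections", default=None
)


@contextmanager
def use_collections(names: Iterable[str]) -> Iterator[None]:
    """Restrict searches to `names` for the duration of the with-block."""
    token = _ACTIVE.set(tuple(names))
    try:
        yield
    finally:
        # Restore the previous value, even if the block raised.
        _ACTIVE.reset(token)


def active_collections(all_names: Sequence[str]) -> List[str]:
    """Return the collections to query: all of them unless a scope is active."""
    chosen = _ACTIVE.get()
    if chosen is None:
        return list(all_names)
    return [name for name in all_names if name in chosen]
```

Because each asyncio task runs in its own copy of the context, a selection made for one request does not leak into concurrent requests.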

0.6.3

  • Backend-aware filter translator (Meili, SQL ILIKE, OData, Chroma dict, Qdrant native)
  • build_filter_expr per-backend helper

0.6.1 — 0.6.2

  • LanceDB filter coverage and test suite
  • Embedding API call cache (disk-based)

0.6.0

  • Multi-query swarm retrieval
  • LangGraph state machine refactor
  • Async-native pipeline with _run_sync for sync wrappers

Earlier

See GitHub releases for full history.