All posts tagged: retrieval

Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk

Published by skeptic

Over the past two decades, technical debt meant outdated architecture, messy code, and poorly maintained documentation. That definition is no longer sufficient in the AI era, where failure modes are more subtle and often non-linear. AI systems are introducing new layers of technical debt that live across prompts, models, and data dependencies — making these layers less visible, harder to measure, and often more dangerous than traditional debt. A crisis hiding in plain sight The complexities of AI systems and their associated failures have been well documented. A 2025 MIT study found that 95% of AI projects fail to reach production or deliver value. A similar study by S&P Global Market Intelligence found that 42% of businesses scrapped multiple AI initiatives in 2025 — a sharp increase from 17% the previous year. Various reasons are cited for these failures, but most of them point to poorly designed and implemented systems that are complex to manage and have multiple hard-to-monitor failure points, leading to a rapid accumulation of AI debt. Traditional technical debt was localized to …

Context architecture is replacing RAG as agentic AI pushes enterprise retrieval to its limits

Published by skeptic

Redis built its name as the caching layer that kept web applications from collapsing under load. The problem it is targeting now has the same structure but is harder to solve: production AI agents failing not because the models are wrong, but because the data underneath them is scattered, stale and structured for humans rather than machines. Retrieval pipelines built for single queries cannot absorb the volume agents generate. The gap Redis is targeting is structural: agents make orders of magnitude more data requests than human users, but most retrieval layers were built for the human-scale problem. Redis Iris, launched Monday, is the company’s answer: a context and memory platform that sits between an agent and the data it needs to act. The platform combines real-time data ingestion, a semantic interface that auto-generates MCP tools from business data models, and an agent memory server built on Redis Flex, a rewritten storage engine that runs 99% of data on flash at a tenth of the cost of in-memory storage alone. The announcement lands as enterprise RAG …

The retrieval rebuild: Why hybrid retrieval intent tripled as enterprise RAG programs hit the scale wall

Published by skeptic

Something shifted in enterprise RAG in Q1 2026. VB Pulse data spanning January through March tells a consistent story: the market stopped adding retrieval layers and started fixing the ones it already has. Call it the retrieval rebuild. The survey covered three consecutive monthly waves from organizations with 100 or more employees, with between 45 and 58 qualified respondents per month across platform adoption, buyer intent, architecture outlook and evaluation criteria. The data should be treated as directional. Enterprise intent to adopt hybrid retrieval tripled from 10.3% to 33.3% in a single quarter — even as 22% of qualified enterprise respondents reported having no production RAG systems at all. For data engineers and enterprise architects building agentic AI infrastructure, the data reveals a market in active transition: the RAG architecture most enterprises built to scale is not the one they expect to run by year-end. Credit: VentureBeat Pulse survey Hybrid retrieval has become the consensus enterprise strategy. Unlike single-method RAG pipelines that rely on vector similarity alone, hybrid retrieval combines dense embeddings with sparse keyword …

RAG precision tuning can quietly cut retrieval accuracy by 40%, putting agentic pipelines at risk

Published by skeptic

Enterprise teams that fine-tune their RAG embedding models for better precision may be unintentionally degrading the retrieval quality those pipelines depend on, according to new research from Redis. The paper, “Training for Compositional Sensitivity Reduces Dense Retrieval Generalization,” tested what happens when teams train embedding models for compositional sensitivity. That is the ability to catch sentences that look nearly identical but mean something different — “the dog bit the man” versus “the man bit the dog,” or a negation flip that reverses a statement’s meaning entirely. That training consistently broke dense retrieval generalization, how well a model retrieves correctly across broad topics and domains it wasn’t specifically trained on. Performance dropped by 8 to 9 percent on smaller models and by 40 percent on a current mid-size embedding model teams are actively using in production. The findings have direct implications for enterprise teams building agentic AI pipelines, where retrieval quality determines what context flows into an agent’s reasoning chain. A retrieval error in a single-stage pipeline returns a wrong answer. The same error in an …

Why MongoDB thinks better retrieval — not bigger models — is the key to trustworthy enterprise AI

Published by skeptic

Agentic systems and enterprise search depend on strong data retrieval that works efficiently and accurately. Database provider MongoDB thinks its newest embeddings models help solve falling retrieval quality as more AI systems go into production. As agentic and RAG systems move into production, retrieval quality is emerging as a quiet failure point — one that can undermine accuracy, cost, and user trust even when models themselves perform well. The company launched four new versions of its embeddings and reranking models. Voyage 4 will be available in four modes: voyage-4 embedding, voyage-4-large, voyage-4-lite, and voyage-4-nano. MongoDB said the voyage-4 embedding serves as its general-purpose model; MongoDB considers Voyage-4-large its flagship model. Voyage-4-lite focuses on tasks requiring little latency and lower costs, and voyage-4-nano is intended for more local development and testing environments or for on-device data retrieval. Voyage-4-nano is also MongoDB’s first open-weight model. All models are available via an API and on MongoDB’s Atlas platform. The company said the models outperform similar models from Google and Cohere on the RTEB benchmark. Hugging Face’s RTEB benchmark …

Databricks’ Instructed Retriever beats traditional RAG data retrieval by 70% — enterprise metadata was the missing link

Published by skeptic

A core element of any data retrieval operation is the use of a component known as a retriever. Its job is to retrieve the relevant content for a given query. In the AI era, retrievers have been used as part of RAG pipelines. The approach is straightforward: retrieve relevant documents, feed them to an LLM, and let the model generate an answer based on that context. While retrieval might have seemed like a solved problem, it actually wasn’t solved for modern agentic AI workflows. In research published this week, Databricks introduced Instructed Retriever, a new architecture that the company claims delivers up to 70% improvement over traditional RAG on complex, instruction-heavy enterprise question-answering tasks. The difference comes down to how the system understands and uses metadata. “A lot of the systems that were built for retrieval before the age of large language models were really built for humans to use, not for agents to use,” Michael Bendersky, a research director at Databricks, told VentureBeat. “What we found is that in a lot of cases, the …

Skeptic Society Magazine

for honest conversations

Years

Authors

Filter by Month

Filter by Categories

Filter by Tags

All posts tagged: retrieval

Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk

Context architecture is replacing RAG as agentic AI pushes enterprise retrieval to its limits

The retrieval rebuild: Why hybrid retrieval intent tripled as enterprise RAG programs hit the scale wall

RAG precision tuning can quietly cut retrieval accuracy by 40%, putting agentic pipelines at risk

Why MongoDB thinks better retrieval — not bigger models — is the key to trustworthy enterprise AI

Databricks’ Instructed Retriever beats traditional RAG data retrieval by 70% — enterprise metadata was the missing link