All posts tagged: Databricks

Databricks tested a stronger model against its multi-step agent on hybrid queries. The stronger model still lost by 21%.

Published by skeptic

Data teams building AI agents keep running into the same failure mode. Questions that require joining structured data with unstructured content (sales figures alongside customer reviews, or citation counts alongside academic papers) break single-turn RAG systems. New research from Databricks puts a number on that gap. The company’s AI research team tested a multi-step agentic approach against state-of-the-art single-turn RAG baselines across nine enterprise knowledge tasks and reported gains of 20% or more on Stanford’s STaRK benchmark suite, along with consistent improvement across Databricks’ own KARLBench evaluation framework, according to the research. Databricks argues the performance gap between single-turn RAG and multi-step agents on hybrid data tasks is an architectural problem, not a model quality problem. The work builds on Databricks’ earlier instructed retriever research, which showed retrieval improvements on unstructured data using metadata-aware queries. This latest research brings structured data sources (relational tables and SQL warehouses) into the same reasoning loop, addressing the class of questions enterprises most commonly fail to answer with current agent architectures. “RAG works, but it doesn’t scale,” Michael …

Databricks built a RAG agent it says can handle every kind of enterprise search

Published by skeptic

Most enterprise RAG pipelines are optimized for one search behavior. They fail silently on the others. A model trained to synthesize cross-document reports handles constraint-driven entity search poorly. A model tuned for simple lookup tasks falls apart on multi-step reasoning over internal notes. Most teams find out when something breaks. Databricks set out to fix that with KARL, short for Knowledge Agents via Reinforcement Learning. The company trained an agent across six distinct enterprise search behaviors simultaneously using a new reinforcement learning algorithm. The result, the company claims, is a model that matches Claude Opus 4.6 on a purpose-built benchmark at 33% lower cost per query and 47% lower latency, trained entirely on synthetic data the agent generated itself with no human labeling required. That comparison is based on KARLBench, which Databricks built to evaluate enterprise search behaviors. “A lot of the big reinforcement learning wins that we’ve seen in the community in the past year have been on verifiable tasks where there is a right and a wrong answer,” Jonathan Frankle, Chief AI Scientist …

Databricks’ serverless database slashes app development from months to days as companies prep for agentic AI

Published by skeptic

Five years ago, Databricks coined the term ‘data lakehouse’ to describe a new type of data architecture that combines a data lake with a data warehouse. That term and data architecture are now commonplace across the data industry for analytics workloads. Databricks is once again looking to create a new category with its Lakebase service, which is generally available today. While the data lakehouse construct deals with OLAP (online analytical processing) databases, Lakebase is all about OLTP (online transaction processing) and operational databases. The Lakebase service has been in development since June 2025 and is based on technology Databricks gained via its acquisition of PostgreSQL database provider Neon. It was further enhanced in October 2025 with the acquisition of Mooncake, which brought capabilities to help bridge PostgreSQL with lakehouse data formats. Lakebase is a serverless operational database that represents a fundamental rethinking of how databases work in the age of autonomous AI agents. Early adopters, including easyJet, Hafnia and Warner Music Group, are cutting application delivery times by 75 to 95%, but the deeper …

Snowflake, Databricks challenger ClickHouse hits $15B valuation

Published by skeptic

Database provider ClickHouse secured $400 million at a $15 billion valuation, Bloomberg reported, representing about a 2.5x increase from its $6.35 billion valuation last May. The round was led by Dragoneer Investment Group, the startup said, with participation from investors including Bessemer Venture Partners, GIC, Index Ventures, Khosla Ventures, and Lightspeed Venture Partners. ClickHouse, which spun out from Russian search giant Yandex in 2021, develops database software designed to process the massive datasets required by AI agents. The company competes with Snowflake and Databricks. The company also announced the acquisition of Langfuse, a startup that helps developers track and evaluate the performance of their AI agents. Langfuse competes directly with LangSmith, LangChain’s observability platform. The ClickHouse database is open source, and the company makes money by selling managed cloud services, which saw annual recurring revenue (ARR) grow by more than 250% year-over-year, it said. The company’s customers include Meta, Tesla, Capital One, Lovable, Decagon, and Polymarket.

Databricks’ Instructed Retriever beats traditional RAG data retrieval by 70% — enterprise metadata was the missing link

Published by skeptic

A core element of any data retrieval operation is the use of a component known as a retriever. Its job is to retrieve the relevant content for a given query. In the AI era, retrievers have been used as part of RAG pipelines. The approach is straightforward: retrieve relevant documents, feed them to an LLM, and let the model generate an answer based on that context. While retrieval might have seemed like a solved problem, it actually wasn’t solved for modern agentic AI workflows. In research published this week, Databricks introduced Instructed Retriever, a new architecture that the company claims delivers up to 70% improvement over traditional RAG on complex, instruction-heavy enterprise question-answering tasks. The difference comes down to how the system understands and uses metadata. “A lot of the systems that were built for retrieval before the age of large language models were really built for humans to use, not for agents to use,” Michael Bendersky, a research director at Databricks, told VentureBeat. “What we found is that in a lot of cases, the …

Skeptic Society Magazine

for honest conversations
