All posts tagged: production

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

Context windows are becoming a computational bottleneck. The longer an agent runs, the more tokens accumulate from retrieved documents, reasoning traces and conversation history, and the more memory and compute that growing context demands. Most existing solutions either degrade model accuracy, require the full context to load before compression begins, or produce memory savings that don’t translate into real speedups in standard serving infrastructure. A research team from NYU, Columbia, Princeton, University of Maryland, Harvard and Lawrence Livermore National Laboratory published a paper this week that proposes a novel fix. The researchers introduce the concept of Latent Context Language Models, or LCLMs, a family of encoder-decoder compression models that compress input context before it reaches the decoder. The models are open-sourced on HuggingFace. Unlike KV cache compression methods (the dominant approach in the field, which still materialize the full KV cache before evicting entries), LCLMs compress the input token sequence before decoder prefill, so higher compression ratios directly reduce decoder-side compute and memory. The paper reports LCLMs at 16x compression produced output 8.8 …

Why AI that works in the lab often fails in production — and what actually fixes it

Presented by Capital One

Enterprises aren’t struggling to experiment with AI; they’re struggling to make it work in the real world. Moving from promising prototypes to reliable, production-scale systems is where most efforts stall. In my role within Capital One’s AI Foundations organization, I’ve seen firsthand that successful AI implementation isn’t just about adopting the latest models or tools. It requires a disciplined R&D approach that connects foundational research to real-world systems, and holds ideas accountable as they move from concept to production. That’s harder than it sounds. AI capabilities are evolving quickly, but enterprise environments can be complex, fragmented, and risk-minded. The question isn’t just what’s possible, but what actually works (for a specific workflow, user, or decision) with today’s technology and constraints. What follows reflects how organizations can turn AI ambition into production reality through a more deliberate approach to research, evaluation, and deployment.

Bridging foundational and applied research

Delivering impactful AI requires closing the gap between cutting-edge research and practical, real-world use cases. When research exists in an academic vacuum, …

EU approves €23bn to boost Italian renewable energy production

The European Commission has approved a €23bn Italian state aid programme designed to accelerate renewable energy production across the country. The scheme, which aligns with the European Union’s Clean Industrial Deal, will support the development of new renewable electricity projects and help Italy move closer to its 2030 climate and energy targets. The funding will be used to support electricity generation from onshore wind, solar power, hydropower and sewage gas facilities. According to the Commission, the programme is expected to deliver 37.15 gigawatts (GW) of additional renewable electricity capacity, representing nearly half of Italy’s existing renewable energy infrastructure. European officials believe the investment will strengthen Italy’s energy security, lower electricity costs over time and reduce dependence on imported fossil fuels. The approval also marks one of the largest renewable energy support measures authorised under the EU’s new Clean Industrial Deal State Aid Framework (CISAF). Commenting on the major funding, Teresa Ribera, Executive Vice-President for Clean, Just and Competitive Transition, said: “With this €23bn scheme, Italy will support the production of renewable electricity from various technologies, …

Panasonic to start US data centre battery production by fiscal 2028

TOKYO, June 8: Panasonic Holdings said on Monday it plans to start mass production of battery cells for data centre applications at a plant in the U.S. state of Kansas in the financial year 2028, which ends March 2029. Here are some details:

• The company said it would allocate about 350 billion yen ($2.18 billion) of its previously announced 500 billion yen investment in AI infrastructure over fiscal 2026-2028 to its Energy unit, which supplies Tesla, and 150 billion yen to its Industry segment.
• Panasonic Energy also plans to build a third plant in Mexico with mass production scheduled for the fiscal year 2028.
• Panasonic Energy CEO Kazuo Tadanobu said the unit’s 950 billion yen sales target for data centre-related energy storage systems in the 2028 financial year was a “minimum commitment”, adding that the business would aim to lift sales to more than 1 trillion yen.

($1 = 160.1900 yen)

When Claude changed, everything changed: Managing AI blast radius in production

Our system did one thing, and it did it well: It turned natural-language questions into API calls. The users were analysts, account managers, and operations leads. They knew what data they needed, but assembling it manually meant pulling from four dashboards, two BI tools, and a Salesforce report builder. With our system, they typed the request in plain English. A request like “Compile a report on sales volume for January through March 2026 for the Northeast region, broken down by city” was translated into an API call that the system could act on:

    {
      "description": "User requested sales volume for the given date range, here is the API call to get the response",
      "api_call": "/api/sales_volume",
      "post_body": {
        "start_date": "2026-01-01",
        "end_date": "2026-03-31",
        "region": "northeast"
      }
    }

The rest of the pipeline was conventional engineering. The system dispatched the call to the right backend (we had integrations with internal reporting portals, Salesforce, and several homegrown services), applied a large language model (LLM)-generated JSON query to filter and shape the response, and delivered it via …
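As a rough illustration of the hand-off this excerpt describes, the sketch below checks a model-generated API call before it would be dispatched. It is not the article's implementation: the function name `parse_api_call` and the `ALLOWED_ENDPOINTS` allow-list are hypothetical, and only the field names and the `/api/sales_volume` endpoint come from the example payload.

```python
import json
from datetime import date

# Hypothetical allow-list; only /api/sales_volume appears in the excerpt.
ALLOWED_ENDPOINTS = {"/api/sales_volume"}


def parse_api_call(raw: str) -> dict:
    """Parse and sanity-check a model-generated API call before dispatch."""
    call = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    for key in ("description", "api_call", "post_body"):
        if key not in call:
            raise ValueError(f"missing field: {key}")
    if call["api_call"] not in ALLOWED_ENDPOINTS:
        raise ValueError(f"unknown endpoint: {call['api_call']}")
    body = call["post_body"]
    # Assumes ISO dates, as in the excerpt's example payload.
    start = date.fromisoformat(body["start_date"])
    end = date.fromisoformat(body["end_date"])
    if start > end:
        raise ValueError("start_date is after end_date")
    return call


raw = """
{
  "description": "User requested sales volume for the given date range",
  "api_call": "/api/sales_volume",
  "post_body": {
    "start_date": "2026-01-01",
    "end_date": "2026-03-31",
    "region": "northeast"
  }
}
"""
call = parse_api_call(raw)
print(call["api_call"], call["post_body"]["region"])
```

Rejecting unknown endpoints and malformed date ranges at this boundary is one way to limit how far a change in model output can propagate into the rest of the pipeline.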

Top Democrats urge Treasury Department to halt production of gold coins with Trump image

Two Senate Democrats on Thursday called on the Trump administration to stop the production of a 24-karat gold coin bearing President Trump’s image to commemorate America’s 250th birthday, with the lawmakers expressing concerns that some of the gold used by the U.S. Mint could be traced to foreign cartels. Sens. Elizabeth Warren (Mass.) and Ron Wyden…

Anthropic says 80% of its new production code is now authored by Claude — how your enterprise can keep up

Anthropic co-founder and CEO Dario Amodei said it was coming, but it still feels like a milestone: More than 80% of the code merged into Anthropic’s production codebase in May wasn’t authored by humans, but by its own AI model, Claude, according to a new report shared by the record-breaking AI startup today. This transformation has triggered an 8x increase in the volume of code shipped per engineer per quarter compared to the company’s 2021–2025 baseline, which the company notes means even more code someone or something must review. For enterprise technical leaders, this is no longer a localized research curiosity; it’s a new, aggressive competitive baseline. If a frontier AI laboratory can successfully offload the vast majority of its engineering output to autonomous agents — showing signs of the long-sought AI Holy Grail of “recursive self-improvement,” models that can independently research and upgrade themselves — what’s preventing enterprises across other sectors from automating more of their internal software development with AI agents, too? Obviously, it’s easier said than done. Anthropic is one of the …

Netflix pauses production on Denzel Washington-starring Hannibal film due to budget concerns

Netflix has paused production on Denzel Washington’s forthcoming Hannibal film. The untitled historical epic war movie, which was planned to shoot in Italy this summer, has been halted as the producers and studio deal with budget concerns. Movie execs on both ends are reportedly working to get production back on track in the hope it can still move forward at Netflix, according to Deadline. Netflix did not return The Independent’s request for comment. The Equalizer director Antoine Fuqua is leading the film, which will star Washington as the Carthaginian general Hannibal, who is regarded as one of the most famous military commanders in history. The script has been written by Oscar-winner John Logan, known for work on Michael, The Aviator and Gladiator. The story …

AI agents keep giving confident wrong answers. The context layer is enterprise AI’s next production problem.

Enterprise AI agents have a new production failure mode, and it is not the model. As enterprises move from single-layer RAG to hybrid retrieval architectures, the same underlying data produces different answers depending on which agent, tool or system asks the question. Revenue means one thing in a business intelligence (BI) dashboard, something slightly different in a SQL table and something else again in an agent instruction. The retrieval infrastructure build-out of the past two years produced faster and cheaper vector search. It did not produce a shared definition of what the data means. At Snowflake Summit 26 in San Francisco, the data cloud vendor is taking a broad swing at that problem, with announcements spanning a Kafka-compatible managed streaming service called Data Stream, adaptive compute improvements, expanded Apache Iceberg interoperability and updates to its Cowork and CoCo agent and coding products. Running underneath all of it is a context layer: Horizon Context and Cortex Sense, a two-layer system designed to give agents a governed, shared definition of business logic across retrieval stacks. The context …

PNNL launches first US battery production line for prismatic cells

A major new battery manufacturing capability has been unveiled at the United States Department of Energy’s Pacific Northwest National Laboratory (PNNL), marking a significant step forward in the development of next-generation energy storage technologies. Researchers have activated a dedicated production line designed specifically for prismatic cells, a battery format increasingly viewed as critical for future grid-scale energy storage systems. The new facility, located within PNNL’s Grid Storage Launchpad, enables scientists to produce and evaluate prismatic cells at a scale much closer to real-world commercial applications. The capability is expected to help bridge the gap between laboratory research and industrial deployment by allowing researchers and private-sector partners to validate new battery designs before they reach the market. With testing completed and operational procedures nearing finalisation, the production line will initially focus on manufacturing sodium-ion and lithium-iron-phosphate battery chemistries. The resulting data will establish performance and safety benchmarks while demonstrating the viability of scaling advanced battery technologies from small laboratory prototypes to larger commercial formats.

New facility designed for commercial-scale battery research

The newly commissioned production line …