All posts tagged: OpenWeight

NVIDIA Nemotron 3 Ultra: The Top 550B Open-Weight AI of 2026

NVIDIA’s Nemotron 3 Ultra is a 550-billion-parameter language model designed to balance computational efficiency and task precision. Using a mixture-of-experts architecture, it activates only 55 billion parameters per task, significantly reducing resource demands while maintaining robust performance. According to Sam Witteveen, one of its defining features is a million-token context window, which allows it to process complex, multi-step workflows effectively. This capability makes it particularly suited for tasks such as reasoning, coding and long-term decision-making.

Dive into how the Nemotron 3 Ultra performs in practical scenarios, including its faster token generation and its results on benchmarks like Pinchbench. Learn about the training strategies that enhance its adaptability, such as multi-tier policy distillation and fine-tuning with agent-specific datasets. This explainer also examines its broader applications, from automation to research and customer service, offering a detailed look at its role in advancing AI-driven solutions.

What Distinguishes the Nemotron 3 Ultra?

TL;DR Key Takeaways:

Advanced AI Model: NVIDIA’s Nemotron 3 Ultra is a 550-billion-parameter language model built on a mixture-of-experts architecture, optimized for reasoning, tool usage and …

Why open-weight models without guardrails are an AI safety risk : NPR

Participants hold their laptops in front of an illuminated wall at the annual Chaos Computer Club (CCC) computer hackers’ congress, called 29C3, on December 28, 2012 in Hamburg, Germany. In 2026, open-weight AI models possess advanced capabilities not far behind their proprietary counterparts. Getting rid of open-weight models’ guardrails used to take time and deep expertise. But in recent months, that process has become dramatically more accessible and popular. Patrick Lux/Getty Images Europe

How do you make explosives using household items? How do you make meth? How do you plan a school shooting? If you ask the popular AI chatbots most people are familiar with, chances are they will say that it’s illegal, harmful or that answering would be a policy violation. But another type of AI model will never refuse to provide what the user asks for. In recent months, these models have become more accessible and popular. “Everybody can download and operate their own state-of-the-art model and use it for great things and terrible things,” …

Running Open-Weight AI Models on AMD Hardware in 2026

Running artificial intelligence (AI) models locally is gaining traction as a practical alternative to cloud-based solutions, especially for those prioritizing privacy and cost efficiency. In his explainer, Sam Witteveen highlights how AMD’s hardware, including the Ryzen Threadripper 9980X CPU and Radeon AI PRO R9700 GPU, lets users handle demanding AI workloads directly on their machines. With 32GB of VRAM and strong multi-threaded performance, these components can run AI applications ranging from text generation to media creation, all while keeping sensitive data on the local machine and eliminating recurring cloud fees.

Explore how local AI setups can streamline your workflow with AMD’s ROCm platform, which supports popular frameworks like PyTorch and the Transformers library. Gain insight into optimizing your hardware with Linux for enhanced performance and discover practical applications such as fine-tuning models for specialized tasks or generating high-quality media. This guide provides a clear path for using AMD’s ecosystem to run local AI in a variety of creative and technical domains.

Why Local AI is Becoming Essential

TL;DR …

Cohere’s open-weight ASR model hits 5.4% word error rate — low enough to replace speech APIs in production pipelines

Enterprises building voice-enabled workflows have had limited options for production-grade transcription: closed APIs with data residency risks, or open models that trade accuracy for deployability. Cohere’s new open-weight ASR model, Transcribe, is built to compete on all four key differentiators: contextual accuracy, latency, control and cost. Cohere says that Transcribe outperforms current leaders on accuracy and, unlike closed APIs, can run on an organization’s own infrastructure.

Transcribe, which can be accessed via an API or in Cohere’s Model Vault as cohere-transcribe-03-2026, has 2 billion parameters and is licensed under Apache-2.0. The company said Transcribe has an average word error rate (WER) of just 5.42%, which it says is lower than that of similar models. It’s trained on 14 languages: English, French, German, Italian, Spanish, Greek, Dutch, Polish, Portuguese, Chinese, Japanese, Korean, Vietnamese and Arabic. The company did not specify which Chinese dialect the model was trained on.

Cohere said it trained the model “with a deliberate focus on minimizing WER, while keeping production readiness top-of-mind.” According to Cohere, the result is a model that …
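
For readers unfamiliar with the metric, word error rate is conventionally the number of word-level substitutions, deletions and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not Cohere's evaluation code, the function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization (real evaluations usually normalize case and punctuation first).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: assumes whitespace tokenization and no
    normalization of case or punctuation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("turn the lights off", "turn the light off"))  # 0.25
```

Under this definition, a WER of 5.42% corresponds to roughly five or six word errors per 100 reference words, averaged over whatever test sets the figure was computed on.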

Ai2 releases MolmoWeb, an open-weight visual web agent with 30K human task trajectories and a full training stack

Engineers building browser agents today face a choice between closed APIs they cannot inspect and open-weight frameworks with no trained model underneath them. Ai2 is now offering a third option. The Seattle-based nonprofit behind the open-source OLMo language models and the Molmo vision-language family is releasing MolmoWeb today, an open-weight visual web agent available in 4 billion and 8 billion parameter sizes.

Until now, no open-weight visual web agent shipped with the training data and pipeline needed to audit or reproduce it. MolmoWeb does. MolmoWebMix, the accompanying dataset, includes 30,000 human task trajectories across more than 1,100 websites, 590,000 individual subtask demonstrations and 2.2 million screenshot question-answer pairs, which Ai2 describes as the largest publicly released collection of human web-task execution ever assembled.

“Can you go from just passively understanding images, describing them and captioning them, to actually making them take action in some environment?” Tanmay Gupta, senior research scientist at Ai2, told VentureBeat. “That is exactly what MolmoWeb is.”

How it works: It sees what you see

MolmoWeb operates entirely from browser screenshots. It …

Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show

Nvidia will spend $26 billion over the next five years to build open source artificial intelligence models, according to a 2025 financial filing. Executives confirmed the news, which has not been previously reported, in interviews with WIRED. The sizable investment could see Nvidia evolve from a chipmaker with an impressive software stack into a bona fide frontier lab capable of competing with OpenAI and DeepSeek. It’s a strategic move that could further entrench Nvidia’s place as the AI world’s leading chip manufacturer, since the models are tuned to the company’s hardware. Open source models are ones where the weights or the parameters that determine a model’s behavior are released publicly—sometimes with the details of its architecture and training. This allows anyone to download and run it on their own machine or the cloud. In Nvidia’s case, the company also reveals the technical innovations involved in building and training its models, making it easier for startups and researchers to modify and build upon the company’s innovations. On Wednesday, Nvidia also released Nemotron 3 Super, its most …