All posts tagged: Tokens

Inside AMEX’s agentic commerce stack: How intent contracts and single-use tokens enforce AI transactions

American Express (Amex) is building a system that lets AI agents shop and pay on behalf of users, but right now it works only within its own payment network and still involves a black box that could hinder trust and auditability. Amex already participates in agentic commerce protocol projects, especially Google’s Agent Payments Protocol (AP2), which focuses on interoperability. Amex’s Agentic Commerce Experiences (ACE) developer kit, on the other hand, touches on something most protocols currently lack: Full transaction control in the payment layer. But it still isn’t completely transparent in how it handles validation. ACE uses a closed-loop system, in which Amex serves as both the card issuer and the payment network, to validate agent-led transactions. Luke Gebb, Amex’s EVP and global head of innovation, told VentureBeat that the company believes this model is the missing piece in agentic commerce. “Some of what is missing so far is the perspective of a company like ours: We feel that trust and security are critical to advancing this space,” Gebb said. “This is really the first time …

Cheaper tokens, bigger bills: The new math of AI infrastructure

Presented by Nutanix

As enterprises move from AI experimentation into production deployment, the primary cost driver has shifted away from foundation model training and toward the infrastructure required to run thousands of concurrent inference workloads at scale, with agentic AI as the accelerant. Where early enterprise AI projects involved a handful of large, scheduled training jobs, production agentic environments require continuous support for short-lived, unpredictable requests that consume GPU, networking, and storage resources in ways traditional infrastructure was never designed to handle. For enterprise technology leaders, that shift is turning infrastructure efficiency into a make-or-break factor in AI economics. “Every employee with an AI assistant, every automated workflow, every agent pipeline needs models for inferencing and generates a lot of tokens,” says Anindo Sengupta, VP of products at Nutanix. “Those inferencing requests land on a GPU infrastructure, traverse specialized networks, and pull data from storage systems purpose built to support these AI workloads.”

Why cost per token is becoming a core infrastructure metric

Inference costs per token have dropped by roughly an order of magnitude …

President Trump Faces Renewed Backlash As Trump-Linked Tokens Crash

Authored by Vince Quill via CoinTelegraph.com

United States President Donald Trump is facing renewed scrutiny as crypto tokens and projects touted by the US president crash to all-time lows or sit near record low levels. The Official Trump token (TRUMP), a memecoin pushed by Trump, hit an all-time low of about $2.73 in March 2026 and is currently trading at about $2.86, according to data from CoinGecko.

The TRUMP memecoin has plummeted in price since launching in January 2025. Source: CoinGecko

The governance token issued by World Liberty Financial (WLFI), a decentralized finance (DeFi) platform co-founded by Trump’s sons, sank to an all-time low of just $0.07 on Saturday. WLFI is down by nearly 75% from its all-time high of about $0.31 reached in September 2025, while the TRUMP memecoin is down by about 90% since its all-time high of over $73 reached in January 2025.

The WLFI token has crashed by nearly 75% since the all-time high reached in September 2025. Source: CoinMarketCap

“We thought Sam Bankman-Fried or Gary Gensler were the worst things to happen to the crypto industry, …

Are AI tokens the new signing bonus or just a cost of doing business?

This week, a topic that has been boomeranging around Silicon Valley bounced into the spotlight: AI tokens as compensation. The idea is straightforward enough: rather than giving engineers only salary, equity, and bonuses, companies would also hand them a budget of AI tokens, the computational units that power tools like Claude, ChatGPT, and Gemini. Spend them to run agents, automate tasks, crank through code. The pitch is that access to more compute makes engineers more productive, and that more productive engineers are worth more. The idea is that the tokens are an investment in the person holding them. Jensen Huang, the leather-jacket-wearing CEO of Nvidia, seemed to capture everyone’s imagination when he floated the notion at the company’s annual GTC event earlier this week that engineers should receive roughly half their base salary again, in tokens. His top people, by his math, might burn through $250,000 a year in AI compute. He called it a recruiting tool and predicted it would become standard across Silicon Valley. It isn’t entirely clear where the idea was first, well, …

OpenAI GPT-5.4 Long-Context Upgrade: 1 Million Tokens and Large Projects

The recent leak of GPT-5.4 has sparked significant interest, offering a detailed glimpse into the model’s upgraded capabilities. Universe of AI highlights one of the standout features: an expanded context window capable of processing up to 1 million tokens in a single session. This improvement allows users to handle large-scale data more efficiently, whether analyzing entire books, managing complex coding projects, or processing extensive datasets without losing context. Such advancements aim to address existing limitations in workflows that demand comprehensive analysis and continuity. In this breakdown, you’ll explore how ChatGPT 5.4’s Extreme Reasoning Mode enhances its ability to tackle intricate multi-step problems, making it a valuable resource for professionals in fields like engineering, law and science. Additionally, its improved memory retention across sessions ensures smoother collaboration on long-term projects. The guide also covers its lower error rates and expanded automation capabilities, offering insights into how these features can streamline operations and improve accuracy across industries.

GPT-5.4 Features Overview

TL;DR Key Takeaways:

ChatGPT 5.4 features an expanded context window capable of processing up to 1 …

Stripe wants to turn your AI costs into a profit center

Stripe on Monday released a preview of a new feature that could help AI startups (and other companies) solve the problem of passing through the underlying costs of AI model usage to their customers. Stripe’s feature, however, goes even further than just passing through the costs of the tokens. It allows startups to charge a markup percentage on token usage. So a company can, for instance, charge an automatic 30% above the cost of the tokens that the startup will pay the model maker. As Stripe described it, “Say you’re building an AI app: you want a consistent 30% margin over raw LLM token costs across providers. Billing automates the process.” The billing feature lets the startup pick the AI models it uses. It tracks the API prices of those models. It then records the customers’ token usage and applies the profit-margin markup automatically. As we’ve previously reported, there are a variety of ways that AI startups are charging for their wares. Many of them charge tiered monthly subscriptions that have usage-rate caps; once those …
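The markup arithmetic described in the excerpt can be sketched in a few lines. This is only an illustration of the calculation, not Stripe's API: the function name `price_with_markup`, its parameters, and the example token price are all hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def price_with_markup(tokens_used: int,
                      cost_per_million: Decimal,
                      markup_pct: Decimal) -> Decimal:
    """Return the amount billed to the customer, in dollars.

    Hypothetical helper: raw token cost plus a percentage markup,
    rounded to the nearest cent.
    """
    # Raw cost the startup would pay the model provider.
    raw_cost = Decimal(tokens_used) / Decimal(1_000_000) * cost_per_million
    # Apply the margin on top of the raw cost (30 means 30%).
    billed = raw_cost * (Decimal(1) + markup_pct / Decimal(100))
    return billed.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Assumed example: 2 million tokens at a made-up $3.00 per million, 30% markup.
print(price_with_markup(2_000_000, Decimal("3.00"), Decimal("30")))
```

With these assumed numbers the raw cost is $6.00, so a 30% markup adds $1.80 to the customer's bill.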

8 billion tokens a day forced AT&T to rethink AI orchestration — and cut costs by 90%

When your token usage averages 8 billion a day, you have a massive scale problem. This was the case at AT&T, and chief data officer Andy Markus and his team recognized that it simply wasn’t feasible (or economical) to push everything through large reasoning models. So, when building out an internal Ask AT&T personal assistant, they reconstructed the orchestration layer. The result: A multi-agent stack built on LangChain where large language model “super agents” direct smaller, underlying “worker” agents performing more concise, purpose-driven work. This flexible orchestration layer has dramatically improved latency, speed and response times, Markus told VentureBeat. Most notably, his team has seen up to 90% cost savings. “I believe the future of agentic AI is many, many, many small language models (SLMs),” he said. “We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area.” Most recently, Markus and his team used this re-architected stack along with Microsoft Azure to build and deploy Ask AT&T Workflows, …

OpenCode Tutorial: Run Parallel AI Tasks & Track Tokens Easily

What if you could master an innovative platform that transforms your AI development workflow in less time than it takes to watch an episode of your favorite show? Below, Keith explores how OpenCode, a powerful task orchestration platform, can transform the way developers manage complex projects by allowing seamless integration with AI models, automating repetitive tasks, and offering unparalleled customization. Whether you’re juggling multiple projects or trying to streamline token usage, OpenCode promises to simplify even the most intricate workflows. With its ability to execute tasks in parallel and adapt to your unique preferences, this platform is quickly becoming essential for developers aiming to maximize efficiency. In this breakdown, you’ll uncover the core features that make OpenCode a fantastic option, from its intuitive installation options to its advanced capabilities like skill automation and token tracking. Imagine being able to orchestrate multiple AI-driven tasks simultaneously while staying in full control of your workflow. This guide will walk you through the essentials, making sure you can confidently harness the platform’s potential in under 30 minutes. Whether …

MIT’s new ‘recursive’ framework lets LLMs process 10 million tokens without context rot

Recursive language models (RLMs) are an inference technique developed by researchers at MIT CSAIL that treat long prompts as an external environment to the model. Instead of forcing the entire prompt into the model’s context window, the framework allows the LLM to programmatically examine, decompose, and recursively call itself over snippets of the text. Rather than expanding context windows or summarizing old information, the MIT team reframes long-context reasoning as a systems problem. By letting models treat prompts as something they can inspect with code, recursive language models allow LLMs to reason over millions of tokens without retraining. This offers enterprises a practical path to long-horizon tasks like codebase analysis, legal review, and multi-step reasoning that routinely break today’s models. Because the framework is designed as a wrapper around existing models, it can serve as a drop-in replacement for applications that make direct calls to LLMs.

The LLM context problem

While frontier models are becoming increasingly sophisticated at reasoning, their ability to process massive amounts of information is not scaling at the same rate. This …
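To make the decompose-and-recurse idea concrete, here is a toy sketch. It is not the MIT implementation: in the framework described above the model itself writes code to inspect the prompt, whereas this sketch hard-codes a simple halving strategy. The names `recursive_query` and `toy_model`, and the line-count "window", are assumptions for illustration, and `toy_model` is a keyword filter standing in for a real LLM call.

```python
from typing import Callable, List

Model = Callable[[str, List[str]], List[str]]


def toy_model(query: str, lines: List[str]) -> List[str]:
    """Stand-in for an LLM call: keep only lines that mention the query."""
    return [line for line in lines if query in line]


def recursive_query(query: str, lines: List[str],
                    model: Model, window: int) -> List[str]:
    """Answer `query` over `lines` without giving `model` more than `window` lines."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # Base case: the snippet fits in the model's window, so call it directly.
    if len(lines) <= window:
        return model(query, lines)
    # Otherwise split the text and recurse over each half.
    mid = len(lines) // 2
    partial = (recursive_query(query, lines[:mid], model, window)
               + recursive_query(query, lines[mid:], model, window))
    # Re-run over the merged partial results only if they actually shrank;
    # this guard keeps the recursion from looping forever.
    if len(partial) < len(lines):
        return recursive_query(query, partial, model, window)
    return partial


# A 1,000-line "prompt" queried through a window of only 8 lines.
document = [f"filler line {i}" for i in range(1000)]
document[123] = "needle: first fact"
document[877] = "needle: second fact"
print(recursive_query("needle", document, toy_model, window=8))
```

The point of the sketch is that the full text lives outside the model as ordinary data, and each individual call sees only a small snippet.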

Claude Code MCP Upgrade 2026: Cut Tokens by 95% with Smart Loading

What if you could make your workflows not just faster, but ten times faster? Better Stack outlines how Claude Code’s latest update has transformed Model Context Protocol (MCP) functionality, delivering a staggering boost in speed and efficiency. By tackling long-standing challenges like token inefficiency and operational errors, this breakthrough introduces a smarter, leaner way to handle large language models. Imagine cutting token usage by up to 95% while maintaining precision and control. This isn’t just incremental progress; it’s a paradigm shift in how we think about performance and scalability in AI-driven systems. In this deep dive, we’ll explore the two innovative optimization strategies that make this leap possible: search-based selection and programmatic orchestration. Whether you’re intrigued by the simplicity of dynamically loading only the most relevant features or the advanced customization offered by programmatic control, there’s something here to transform how you approach complex workflows. Along the way, you’ll uncover how these updates address critical issues like naming collisions and command injections, paving the way for more secure and efficient applications. The implications are profound, how …