All posts tagged: harnesses

Alibaba’s proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic’s Claude Code

Alibaba’s proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic’s Claude Code

The AI industry has fully entered the “agent era,” a paradigm where AI models do far more than generate text — they now actively plan, execute, and course-correct complex tasks over days rather than seconds. Thus, it’s perhaps unsurprising to see Chinese e-commerce giant Alibaba’s famed Qwen Team of AI researchers release a model capable of performing autonomous agentic AI work over multiple days: that model has arrived in the form of Qwen3.7-Max which the company reports in a blog post achieved “~35 hours of continuous autonomous execution” — albeit, in a proprietary, not open source format, as prior Qwen Team releases were. This is also to be expected — it’s what many analysts and industry experts feared in the wake of the departure of several key Qwen Team leaders earlier this year. But it makes sense for Alibaba financially, at least in the short term: training AI models, especially ones as powerful as Qwen3.7-Max, is expensive, and giving them away essentially for free, as open source models are, does not immediately help recoup any …

Mystery solved: Anthropic reveals changes to Claude’s harnesses and operating instructions likely caused degradation

Mystery solved: Anthropic reveals changes to Claude’s harnesses and operating instructions likely caused degradation

For several weeks, a growing chorus of developers and AI power users claimed that Anthropic’s flagship models were losing their edge. Users across GitHub, X, and Reddit reported a phenomenon they described as “AI shrinkflation”—a perceived degradation where Claude seemed less capable of sustained reasoning, more prone to hallucinations, and increasingly wasteful with tokens. Critics pointed to a measurable shift in behavior, alleging that the model had moved from a “research-first” approach to a lazier, “edit-first” style that could no longer be trusted for complex engineering. While the company initially pushed back against claims of “nerfing” the model to manage demand, the mounting evidence from high-profile users and third-party benchmarks created a significant trust gap. Today, Anthropic addressed these concerns directly, publishing a technical post-mortem that identified three separate product-layer changes responsible for the reported quality issues. “We take reports about degradation very seriously,” reads Anthropic’s blog post on the matter. “We never intentionally degrade our models, and we were able to immediately confirm that our API and inference layer were unaffected.” Anthropic claims it …

Anthropic cracks down on unauthorized Claude usage by third-party harnesses and rivals

Anthropic cracks down on unauthorized Claude usage by third-party harnesses and rivals

Anthropic has confirmed the implementation of strict new technical safeguards preventing third-party applications from spoofing its official coding client, Claude Code, in order to access the underlying Claude AI models for more favorably pricing and limits — a move that has disrupted workflows for users of popular open source coding agent OpenCode. Simultaneously but separately, it has restricted usage of its AI models by rival labs including xAI (through the integrated developer environment Cursor) to train competing systems to Claude Code. The former action was clarified on Friday by Thariq Shihipar, a Member of Technical Staff at Anthropic working on Claude Code. Writing on the social network X (formerly Twitter), Shihipar stated that the company had “tightened our safeguards against spoofing the Claude Code harness.” He acknowledged that the rollout had unintended collateral damage, noting that some user accounts were automatically banned for triggering abuse filters—an error the company is currently reversing. However, the blocking of the third-party integrations themselves appears to be intentional. The move targets harnesses—software wrappers that pilot a user’s web-based Claude …