In a new guide for GPT-5.5, OpenAI recommends that users start fresh with minimal, result-oriented instructions instead of reusing old prompts. Overly detailed process specifications carried ...
Update from April 25, 2026: GPT-5.5 also stumbles on the BullshitBench. The benchmark throws 100 questions at a model across five fields—software, finance, law, physics, and medicine—that sound ...
In a week-long experiment called "Project Deal," Anthropic had AI agents from the Claude model family independently negotiate and trade real goods on behalf of employees. The more capable Claude Opus ...
The United Arab Emirates is rolling out what may be the world's most ambitious government AI overhaul. Within two years, the country wants 50 percent of all government sectors, services, and processes ...
Canadian AI company Cohere is acquiring Heidelberg-based startup Aleph Alpha, with the combined company valued at around $20 billion. The Schwarz Group is investing $600 million and providing the ...
Jerry Tworek, a former OpenAI researcher, has unveiled his new AI lab, "Core Automation," with the goal of building "the most automated AI lab in the world," starting by automating its own research.
Chinese AI lab DeepSeek has released V4-Pro and V4-Flash as open-weight models with up to 1.6 trillion parameters and a one-million-token context window. A new architecture dramatically cuts the ...
Following user complaints about declining quality in Claude Code, Anthropic identified and fixed three separate bugs affecting reasoning depth, caching, and text length restrictions. To prevent ...
Google is doubling down on AI coding, using more AI internally and aiming for models that can eventually improve themselves. Google DeepMind has put together a specialized team of researchers and ...
Sony's table tennis robot "Ace" is the first robot to reach expert-level performance in a sport, according to the company. In matches played under the official rules of the International Table Tennis ...
A small group of unauthorized users gained access to Anthropic's new AI model Claude Mythos, Bloomberg reports. Anthropic considers Mythos powerful enough to enable dangerous cyberattacks, which is ...
Moonshot AI has released Kimi K2.6 as an open-weight model. It's built to match GPT-5.4 and Claude Opus 4.6 on coding benchmarks, and it can run up to 300 agents in parallel. Moonshot AI says K2.6 ...