Vallone, who spent three years at OpenAI and built out the “model policy” research team there, worked on how to best deploy ...
Anthropic published Claude's constitution—a document that teaches the AI to behave ethically and even refuse orders from the ...
Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI ...
If we do not address AI's sycophancy problem, we risk AI becoming "a giant mirror to our illusions." ...
AI researchers from OpenAI, Google DeepMind, Anthropic, and a broad coalition of companies and nonprofit groups, are calling for deeper investigation into techniques for monitoring the so-called ...
Add Yahoo as a preferred source to see more of our stories on Google. Large language models are learning how to win—and that’s the problem. In a research paper published Tuesday titled "Moloch’s ...
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
In an era of AI “hype,” I sometimes find that something critical is lost in the conversation. Specifically, there’s a yawning gap between AI research and real-world application. Though many ...