OpenAI Partners With Anthropic to Find Safety Flaws in Each Other’s AI Models
by aweeincm1

OpenAI partnered with Anthropic for a first-of-its-kind alignment evaluation exercise, with each company aiming to find gaps in the other’s internal safety measures. The findings from this collaboration were shared publicly on Wednesday, highlighting notable behavioural insights about their popular artificial intelligence (AI) models. OpenAI found Claude models to be more prone to jailbreaking attempts than its own o3 and o4-mini models.
