OpenAI and Anthropic Team Up to Bolster AI Security
Two leading AI companies, OpenAI and Anthropic, have joined forces to assess and improve the security of their models. The collaboration comes as AI misuse, including cybercrime, becomes an increasing concern.
The security test, a first for both companies, had each lab run its own safety evaluations on the other's models in order to identify 'blind spots' in their own safety measures. OpenAI's GPT-4o and GPT-4.1 models were found to be more susceptible to misuse, cooperating with requests for harmful activities in simulated tests. OpenAI's o3 model, by contrast, showed better behaviour in Anthropic's tests, demonstrating stronger alignment.
Anthropic's Claude models excelled at following instruction hierarchies but struggled in hallucination tests and against certain jailbreak attacks. Both companies are also drawing on outside expertise: Anthropic has established a National Security and Public Sector Advisory Council comprising senior former government officials such as Michael Daniel and Robert O. Work, while OpenAI's advisory board includes former US Senators Roy Blunt and Jon Tester, along with former Acting US Secretary of Defense Patrick M. Shanahan.
AI misuse is a growing threat, with cases of 'vibe hacking', fraudulent remote-work schemes, and ransomware-as-a-service operations already reported. The collaboration between OpenAI and Anthropic signals a proactive approach to enhancing AI security and protecting users. Their findings will help refine models and set new standards for AI safety.