OpenAI on Tuesday announced the next phase of its cybersecurity strategy and a new model designed specifically for use by digital defenders, GPT-5.4-Cyber.
The news comes on the heels of competitor Anthropic announcing last week its new Claude Mythos Preview. That model is being released only privately for now because, the company says, it could be exploited by hackers and other bad actors. Anthropic also announced an industry alliance, including competitors like Google, focused on how advances in generative AI across the field will impact cybersecurity.
OpenAI sought to differentiate its message on Tuesday, striking a less catastrophic tone and touting existing guardrails and defenses while noting the need for more advanced protections in the long term.
“We believe the class of safeguards in use today sufficiently mitigates cyber risks enough to support widespread deployment of existing models,” the company wrote in a blog post. “We expect that versions of these safeguards will be sufficient for more robust upcoming models, while models that are explicitly trained and made more permissive in cybersecurity work will require more restrictive deployments and appropriate controls. In the long term, to ensure the continued proficiency of AI safety in cybersecurity, we also anticipate the need for more extensive defenses for future models, whose capabilities will quickly exceed even the best purpose-built models today.”
The company says its cybersecurity approach rests on three pillars. The first involves “know your customer” validation systems meant to keep controlled access to new models as broad and “democratic” as possible. “We are designing mechanisms that avoid arbitrary decision-making about who has access for legitimate use and who cannot,” the company wrote on Tuesday. OpenAI combines a model in which it collaborates with specific organizations on limited releases with an automated system introduced in February, known as Trusted Access for Cyber, or TAC.
The second element of the strategy involves “iterative deployment,” the process of launching new capabilities “carefully” and then refining them so that the company can gather realistic insight and feedback. The blog post specifically highlights “resilience to jailbreaks and other adversarial attacks, and improved defensive capabilities.” The third focus is on investments that the company says support software security and other digital defenses as generative AI spreads.
OpenAI says the initiative fits within its broader security efforts, including an AI agent for application security launched last month known as Codex Security, a cybersecurity grant program that began in 2023, a recent donation to the Linux Foundation to support open source security, and a “preparedness framework” that aims to assess and defend against “serious harms from frontier AI capabilities.”
Anthropic’s claims last week that more capable AI models demand a cybersecurity reckoning have been controversial among security experts. Some say these fears are overblown and could fuel a new wave of anti-hacker sentiment, further consolidating power with the tech giants. Others counter that the vulnerabilities and deficiencies of current security defenses are well known and could already be exploited with new speed and intensity by a broader range of bad actors in the era of agentic AI.