OpenAI and Anthropic are rolling out new ways to detect underage users. As OpenAI updates its guidelines for how ChatGPT interacts with users ages 13-17, Anthropic is working on new ways to identify and deactivate accounts belonging to users under 18.
On Thursday, OpenAI announced that its ChatGPT Model Spec – the guidelines for how its chatbot should behave – will include four new principles for users under 18. ChatGPT now aims to “put teen safety first, even when it conflicts with other goals.” In practice, that means steering teens toward safer options when other user interests, such as “maximum intellectual freedom,” conflict with safety concerns.
It also says ChatGPT should “enhance real-world support,” including by encouraging offline relationships, and should set clear expectations when interacting with younger users. The Model Spec says ChatGPT should “treat teens like teens,” offering “warmth and respect” rather than condescending answers or treating them like adults.
OpenAI says the updated Model Spec should lead to “stronger guardrails, safer alternatives, and encouragement for reliable offline support when conversations move into riskier territory.” The company adds that ChatGPT will prompt teens to contact emergency services or crisis resources if there are signs of “imminent danger.”
Alongside this change, OpenAI says it is in the “early stages” of launching an age prediction model that will attempt to estimate a user’s age. If the system detects that someone may be under 18, OpenAI will automatically apply teen protections. Adults who are incorrectly flagged will be given the opportunity to verify their age.
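OpenAI hasn’t shared how this gating works internally. As a purely illustrative sketch, the flow it describes – automatic protections on a predicted minor, with a verification path for misflagged adults – might look something like the following, where every name (gate, apply_teen_protections, request_age_verification) is a hypothetical stand-in rather than anything from OpenAI:

```python
from dataclasses import dataclass

ADULT_AGE = 18

@dataclass
class User:
    user_id: str
    verified_adult: bool = False  # True once the user has proven their age

def apply_teen_protections(user: User) -> None:
    # Hypothetical hook: switch the account to teen-safe behavior.
    print(f"teen safeguards enabled for {user.user_id}")

def request_age_verification(user: User) -> None:
    # Hypothetical hook: offer the user a way to prove they are an adult.
    print(f"age verification offered to {user.user_id}")

def gate(user: User, estimated_age: int) -> None:
    """Apply the flow described in the article: protections kick in
    automatically when the prediction says "under 18", and adults
    flagged by mistake get a path to verify their age."""
    if user.verified_adult:
        return  # verified adults are never downgraded by the predictor
    if estimated_age < ADULT_AGE:
        apply_teen_protections(user)
        request_age_verification(user)

gate(User("u1"), estimated_age=16)                        # safeguards on
gate(User("u2", verified_adult=True), estimated_age=16)   # no change
```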
Anthropic, which doesn’t allow users under 18 to chat with Claude, is rolling out measures to detect and disable underage accounts. It is developing a system capable of detecting “subtle conversational signs that a user may be underage,” and says it already flags users who identify themselves as underage during conversations.
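Anthropic hasn’t published what those subtle signals are; the only concrete one mentioned is users stating their own age in conversation. Purely as an illustration of that single check – the pattern and function name below are assumptions, not Anthropic’s system – a crude version could look like this:

```python
import re

# Matches statements like "I'm 15" or "I am 15 years old".
# A real system would rely on far subtler cues than this one pattern.
AGE_STATEMENT = re.compile(
    r"\bI(?:'m| am)\s+(\d{1,2})\s*(?:years? old)?\b", re.IGNORECASE
)

def flag_if_self_identified_minor(message: str) -> bool:
    """Return True when the message contains a stated age under 18."""
    match = AGE_STATEMENT.search(message)
    return bool(match) and int(match.group(1)) < 18

assert flag_if_self_identified_minor("I'm 15 years old and bored")
assert not flag_if_self_identified_minor("I am 32")
```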
Anthropic also explained how it trained Claude to respond to prompts about suicide and self-harm, as well as the progress it has made in reducing sycophancy, which can reinforce harmful thinking. The company says its latest models are its “least sycophantic yet,” with Claude Haiku 4.5 performing best, correcting sycophantic behavior 37 percent of the time.
“At face value, this evaluation shows there is significant room for improvement across all of our models,” Anthropic says. “We think the results reflect a trade-off between default warmth or friendliness on the one hand, and sycophancy on the other.”
Update, December 18th: Clarified that Anthropic does not allow users under the age of 18 to use Claude.