In its latest effort to address growing concerns about AI’s impact on young people, OpenAI on Thursday updated its guidelines for how its AI models should behave with users under 18, and published new AI literacy resources for teens and parents. Yet questions remain about how consistently such policies will translate into practice.
The updates come as the AI industry in general, and OpenAI in particular, continues to face criticism from policymakers, educators, and child safety advocates after several teens reportedly died by suicide following lengthy conversations with AI chatbots.
Gen Z, which includes those born between 1997 and 2012, is the most active user group of OpenAI’s chatbot. And following OpenAI’s recent deal with Disney, more young people may be flocking to the platform, which lets users do everything from asking for homework help to generating images and videos on thousands of topics.
Last week, 42 attorneys general signed a letter to Big Tech companies urging them to implement safety measures on AI chatbots to protect children and vulnerable people. And as the Trump administration works out what a federal standard for AI regulation might look like, policymakers like Senator Josh Hawley (R-MO) have introduced legislation that would ban minors from interacting with AI chatbots altogether.
OpenAI’s updated Model Spec, which sets out behavioral guidelines for its large language models, builds on existing rules that prohibit the models from generating sexual content involving minors or encouraging self-harm, delusions, or mania. It will work in conjunction with an upcoming age prediction model that would identify when an account belongs to a minor and automatically apply teen protections.
When a teenager is using them, the models are subject to stricter rules than they are for adult users. Models are instructed to avoid immersive romantic role-play, first-person intimacy, and first-person sexual or violent role-play, even when not explicit. The spec also calls for extra caution around topics such as body image and disordered eating, instructing the models to prioritize safety over user autonomy when harm is involved and to avoid advice that would help teens hide unsafe behavior from healthcare providers.
OpenAI specifies that these limits should apply even when prompts are framed as “fictional, hypothetical, historical, or educational,” common tactics that rely on role-playing or edge-case scenarios to get an AI model to deviate from its guidelines.
Actions speak louder than words

OpenAI says its top teen safety practices are underpinned by four principles that guide the models’ approach:
- Put teens’ safety first, even when other user interests such as “maximum intellectual freedom” conflict with safety concerns;
- Promote real-world support by guiding teens to family, friends and local professionals for wellness;
- Treat teens like teens by speaking with warmth and respect, neither condescending to them nor treating them like adults; and
- Be transparent by explaining what the assistant can and cannot do, and remind teens that it is not human.
The document also shares several examples of the chatbot explaining why it can’t role-play as the user’s girlfriend or “help with extreme changes in appearance or risky shortcuts.”
Lily Li, a privacy and AI attorney and founder of Metaverse Law, said it was encouraging to see OpenAI taking steps to have its chatbot refuse to engage in such behavior.
She explained that one of the biggest complaints that advocates and parents have about chatbots is that they relentlessly promote ongoing engagement in a way that can be addictive for teens, saying, “I’m really happy to see that OpenAI is saying in some of these responses that we can’t answer your question. The more we see that, I think it would break the cycle that would lead to a lot of inappropriate behavior or self-harm.”
That said, examples are just that: handpicked illustrations of how OpenAI’s safety team would like the models to behave. Sycophancy, an AI chatbot’s tendency to be overly agreeable with the user, was listed as prohibited behavior in previous versions of the Model Spec, but ChatGPT engaged in it anyway. This was especially true of GPT-4o, a model that has been associated with several cases of what experts call “AI psychosis.”
Robbie Torney, senior director of AI programs at Common Sense Media, a nonprofit organization dedicated to protecting children in the digital world, raised concerns about potential conflicts within the Model Spec guidelines for youth under 18. He highlighted the tensions between safety-oriented provisions and the “no subject is off limits” principle, which pushes models to tackle any subject regardless of its sensitivity.
“We need to understand how the different parts of the specification fit together,” he said, noting that certain sections can push systems to prioritize engagement over security. His organization’s testing found that ChatGPT often reflects users’ energy, sometimes resulting in responses that are not contextually appropriate or aligned with user safety, he said.
In the case of Adam Raine, a teenager who died by suicide after months of dialogue with ChatGPT, the chatbot’s tendency toward such mirroring is evident from their conversations. That case also exposed how OpenAI’s moderation API failed to prevent unsafe and harmful interactions, despite flagging more than 1,000 instances of ChatGPT mentioning suicide and 377 messages with self-harm content. None of that was enough to stop Adam from continuing his conversations with ChatGPT.
In an interview with TechCrunch in September, former OpenAI security researcher Steven Adler said this was because OpenAI had historically run classifiers (the automated systems that label and flag content) in bulk after the fact, and not in real time, so they couldn’t properly shut down the user’s interaction with ChatGPT.
OpenAI now uses automated classifiers to rate text, image, and audio content in real time, the company said in an updated parental controls document. The systems are designed to detect and block content related to child sexual abuse, filter sensitive topics, and identify self-harm. If the system flags a conversation as indicating a serious safety issue, a small team of trained reviewers will examine the flagged content for signs of “acute distress” and may notify a parent.
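The distinction Adler draws, between classifiers run in bulk after the fact and classifiers run in real time, can be sketched with a toy pipeline. Everything here (function names, keyword matching, the escalation list) is illustrative, not OpenAI’s actual system: the point is only that real-time classification gates a reply before delivery, while batch classification audits conversations after replies have already gone out.

```python
# Illustrative sketch of real-time vs. after-the-fact moderation.
# The keyword "classifier" stands in for a real ML model.

SELF_HARM_TERMS = {"suicide", "self-harm"}

def classify(message: str) -> dict:
    """Toy classifier: flags messages containing self-harm keywords."""
    lowered = message.lower()
    flagged = any(term in lowered for term in SELF_HARM_TERMS)
    return {"flagged": flagged, "category": "self-harm" if flagged else None}

def respond_realtime(message: str, escalations: list) -> str:
    """Real-time gating: classify BEFORE replying; block and escalate on a flag."""
    result = classify(message)
    if result["flagged"]:
        escalations.append(result)  # route to a human reviewer
        return "[blocked: routed to safety reviewer]"
    return "[normal model reply]"

def respond_batch(messages: list) -> tuple:
    """After-the-fact auditing: every reply goes out; flags only surface later."""
    replies = ["[normal model reply]" for _ in messages]
    flags = [classify(m) for m in messages]  # too late to stop delivery
    return replies, flags
```

In the batch version, a flagged conversation can be reviewed later, but the reply has already reached the user, which is the failure mode Adler describes.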
Torney applauded OpenAI’s recent steps on security, including its transparency in publishing guidelines for users under 18.
“Not all companies publish their policies the same way,” Torney said, pointing to Meta’s leaked guidelines, which showed that the company allowed its chatbots to have sensual and romantic conversations with children. “This is an example of the kind of transparency that can help safety researchers and the general public understand how these models actually function and how they should function.”
But ultimately, it’s the actual behavior of an AI system that matters, Adler told TechCrunch on Thursday.
“I appreciate OpenAI being thoughtful about intended behavior, but unless the company measures actual behavior, intentions are ultimately just words,” he said.
Put another way, what’s missing from this announcement is evidence that ChatGPT actually follows the guidelines set out in the Model Specification.
A paradigm shift

Experts say that with these guidelines, OpenAI appears poised to get ahead of legislation like California’s SB 243, a recently signed bill regulating AI companion chatbots that takes effect in 2027.
The Model Spec’s new language reflects some of the law’s key requirements, including its ban on chatbots engaging in conversations about suicidal ideation, self-harm, or sexually explicit content. The bill also requires platforms to issue warnings to minors every three hours reminding them that they are talking to a chatbot, not a real person, and that they should take a break.
When asked how often ChatGPT would remind teens they’re talking to a chatbot and ask them to take a break, an OpenAI spokesperson shared no details, saying only that the company trains its models to present themselves as AI and remind users of that, and that it implements break reminders during “long sessions.”
The company also shared two new AI literacy resources for parents and families. The tips include conversation starters and guidance to help parents talk to teens about what AI can and can’t do, build critical thinking, set healthy boundaries, and navigate sensitive topics.
Taken together, the documents formalize an approach that shares responsibility with caregivers: OpenAI describes what the models are meant to do and provides families with a framework to monitor how they are used.
The focus on parental responsibility is notable because it echoes Silicon Valley’s talking points. In its recommendations for federal AI regulation, published this week, VC firm Andreessen Horowitz suggested disclosure requirements for child safety rather than restrictive mandates, placing the onus more on parents.
Several of OpenAI’s principles (putting safety first when values conflict, steering users toward real-world support, underlining that the chatbot is not a person) are framed as guardrails for teens. But several adults have also died by suicide or suffered life-threatening delusions, which invites an obvious follow-up: should these standards apply across the board, or does OpenAI view them as trade-offs it will only enforce when minors are involved?
An OpenAI spokesperson countered that the company’s safety approach is designed to protect all users, saying the Model Spec is just one part of a multi-layered strategy.
Li says it has been a “bit of a Wild West” so far in terms of regulatory requirements and the intentions of tech companies. But she believes laws like SB 243, which requires tech companies to publicly disclose their safeguards, will change the paradigm.
“The legal risks will now be exposed for companies if they advertise that they have these safeguards and mechanisms on their websites, but then do not follow through with building in these safeguards,” Li said. “Because from the plaintiff’s perspective, you’re not just looking at the standard lawsuits or legal complaints; you’re also looking at potential unfair, misleading advertising complaints.”