
8:50 AM PDT · August 16, 2025
Anthropic has announced new capabilities that will let some of its newest, largest models end conversations in what the company describes as "rare, extreme cases of persistently harmful or abusive user interactions." Strikingly, Anthropic says it's doing this not to protect the human user, but rather the AI model itself.
To be clear, the company isn't claiming that its Claude AI models are sentient or can be harmed by their conversations with users. In its own words, Anthropic remains "highly uncertain about the potential moral status of Claude and other LLMs, now or in the future."
However, its announcement points to a recent program created to study what it calls "model welfare" and says Anthropic is essentially taking a just-in-case approach, "working to identify and implement low-cost interventions to mitigate risks to model welfare, in case such welfare is possible."
This latest change is currently limited to Claude Opus 4 and 4.1. And again, it's only supposed to happen in "extreme edge cases," such as "requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale violence or acts of terror."
While those types of requests could potentially create legal or publicity problems for Anthropic itself (witness recent reporting about how ChatGPT can potentially reinforce or contribute to its users' delusional thinking), the company says that in pre-deployment testing, Claude Opus 4 showed a "strong preference against" responding to these requests and a "pattern of apparent distress" when it did so.
As for these new conversation-ending capabilities, the company says, "In all cases, Claude is only to use its conversation-ending ability as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted, or when a user explicitly asks Claude to end a chat."
Anthropic also says Claude has been "directed not to use this ability in cases where users might be at imminent risk of harming themselves or others."
When Claude does end a conversation, Anthropic says users will still be able to start new conversations from the same account, and to create new branches of the troublesome conversation by editing their responses.
"We're treating this feature as an ongoing experiment and will continue refining our approach," the company says.
Anthony Ha is TechCrunch's weekend editor. Previously, he worked as a tech reporter at Adweek, a senior editor at VentureBeat, a local government reporter at the Hollister Free Lance, and vice president of content at a VC firm. He lives in New York City.