
Anthropic Just Dropped Its Safety Pledge. Here's What It Means for Your Business.
The Company We Build On Just Changed the Rules
Full transparency: we use Claude every day. It powers our content generation, our analysis workflows, our development tools. When we recommend AI to clients, Claude is usually the first name out of our mouths.
So when TIME reported that Anthropic—the company behind Claude—just scrapped the central promise of its flagship safety policy, we paid attention. You should too.
Here's what happened, what it actually means, and what you should do about it.
What Anthropic Promised (and Just Took Back)
In 2023, Anthropic introduced something called the Responsible Scaling Policy (RSP). The core pledge was straightforward: Anthropic would not train a more powerful AI system unless it could guarantee in advance that adequate safety measures were in place.
That was the whole point. A hard line. A tripwire. If safety couldn't be demonstrated before training a more powerful model, training wouldn't happen.
As of this week, that promise is gone.
The new RSP v3.0, signed off by CEO Dario Amodei and approved unanimously by the board, replaces that hard commitment with something softer. Anthropic will now "delay" development only if leadership considers the company to be leading the AI race and believes catastrophic risks are significant.
That's a lot of conditions on what used to be unconditional.
Why They Did It
Anthropic's chief science officer Jared Kaplan put it bluntly to TIME: "We felt that it wouldn't actually help anyone for us to stop training AI models." He added that making unilateral safety commitments didn't make sense "if competitors are blazing ahead."
Three forces drove the change:
The race intensified. OpenAI, Google DeepMind, and others are pushing capabilities forward fast. Anthropic's original hope was that the RSP would inspire rivals to adopt similar measures. Some did—but nobody matched Anthropic's hard commitment to halt development. Pausing while others sprint isn't a safety strategy. It's a way to lose relevance.
Regulation never came. When the RSP launched in 2023, there was real momentum toward federal AI regulation and even international treaties. That momentum died. The Trump administration has endorsed what critics call a "let it rip" approach to AI development, even attempting to nullify state-level regulations. No federal AI law is on the horizon.
The science got harder. AI safety evaluations turned out to be more complicated than anyone expected. In 2025, Anthropic announced it couldn't rule out the possibility of its models facilitating bioterrorism, but it also couldn't prove that they could. What they imagined would be a bright red line turned out to be, in Kaplan's words, "a fuzzy gradient."
And then there's the Pentagon. Defense Secretary Pete Hegseth reportedly gave Anthropic until Friday to grant the military "unfettered access" to Claude or face penalties—including invoking the Defense Production Act and canceling a $200 million contract. Claude is reportedly the Pentagon's primary AI system for sensitive defense work. When your biggest customer has that kind of leverage, the pressure is real.
What the New Policy Actually Says
Let's be fair: Anthropic didn't abandon safety entirely. The new RSP v3.0 includes real commitments:
- Risk Reports published every 3-6 months, detailing how capabilities, threats, and mitigations fit together
- Frontier Safety Roadmaps with public goals across security, alignment, safeguards, and policy
- Competitive parity — a commitment to match or exceed the safety efforts of any competitor
- Conditional delays if leadership believes Anthropic is leading the race and catastrophic risks are significant
That's more transparency than most AI companies offer. But it's also fundamentally different from a hard commitment to stop.
The old policy said: We won't build it until we can prove it's safe.
The new policy says: We'll try really hard to make it safe, and we'll tell you about it.
Those aren't the same thing.
The "Frog-Boiling" Problem
Chris Painter, director of policy at METR (a nonprofit focused on evaluating AI models for risky behavior), reviewed the new policy. His reaction was measured but sobering.
He warned that removing binary thresholds—the hard tripwires that would halt development—could enable a "frog-boiling" effect, where danger ramps up gradually without any single moment that sets off alarms.
His assessment of what the change signals: "This is more evidence that society is not prepared for the potential catastrophic risks posed by AI."
Painter also noted that the shift shows Anthropic "believes it needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities."
That last part is the most important. The company building some of the most capable AI in the world is telling us, publicly, that safety research can't keep up with capability research.
What This Means If You're Using AI in Your Business
Let's bring this down to earth. If you're a business owner using Claude (or any AI tool), here's our honest take:
Your tools aren't suddenly less safe today. Claude didn't change overnight. The models you're using right now are the same ones you were using yesterday. This is a policy change about future development, not a product recall.
But the safety net just got thinner. The guarantee that Anthropic would hit the brakes if safety couldn't be demonstrated? That's gone. What remains is a promise to try hard and be transparent about it. That's better than nothing—but it's not the same as a hard stop.
This isn't just about Anthropic. Within months of Anthropic's original RSP in 2023, both OpenAI and Google DeepMind adopted broadly similar frameworks. A rollback by the policy's originator may reshape what "responsible scaling" means across the entire industry. If the company that set the standard loosens it, everyone else will feel permission to do the same.
You need to be your own safety layer. This has always been true, but it matters more now. Don't outsource your judgment to any AI company's safety promises. Build review processes into your workflows. Keep humans in the loop for anything that touches customers, finances, or critical decisions.
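A review process like this doesn't have to be heavyweight. As a minimal sketch (the trigger keywords and the `ReviewQueue` class are our own illustrative assumptions, not tied to any particular AI vendor or API), here is the shape of a hard approval gate that lets routine AI output through but holds anything risky for a human:

```python
from dataclasses import dataclass, field

# Keywords that route a draft to a human reviewer. These categories
# (money, commitments, legal exposure) are illustrative, not exhaustive.
REVIEW_TRIGGERS = {"refund", "price", "contract", "guarantee", "legal"}

@dataclass
class ReviewQueue:
    """Holds AI-generated drafts until a human approves them."""
    pending: list = field(default_factory=list)
    approved: list = field(default_factory=list)

    def submit(self, draft: str) -> str:
        """Gate a draft: auto-approve low-risk text, queue the rest."""
        words = {w.strip(".,!?").lower() for w in draft.split()}
        if words & REVIEW_TRIGGERS:
            self.pending.append(draft)
            return "pending_human_review"
        self.approved.append(draft)
        return "auto_approved"

    def approve(self, draft: str) -> None:
        """A human signs off; only then does the draft go out."""
        self.pending.remove(draft)
        self.approved.append(draft)
```

The point isn't the keyword list, which you'd tune to your business. The point is the structure: nothing that matches a risk trigger reaches a customer without a person clicking approve.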
We wrote about this in Why 95% of AI Projects Fail—the businesses that succeed with AI are the ones that build for failure. They assume AI will make mistakes and design systems that catch them before they cause damage.
That principle just got more important.
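Building for failure can be as simple as refusing to act on AI output that fails validation. The sketch below is a hypothetical example of the wrapper pattern for an AI-extracted invoice total (the parsing rules and the ceiling value are assumptions for illustration): validate the value, and escalate anything suspect to a human instead of acting on it.

```python
def safe_invoice_total(ai_response: str, ceiling: float = 10_000.0):
    """Validate an AI-extracted invoice total before it touches finances.

    Returns the parsed total, or None to signal 'route to a human'.
    The ceiling and parsing rules here are illustrative assumptions.
    """
    try:
        total = float(ai_response.strip().lstrip("$").replace(",", ""))
    except ValueError:
        return None  # unparseable output: never guess, escalate
    if total < 0 or total > ceiling:
        return None  # implausible amount: escalate instead of paying it
    return total
```

Notice that the fallback is always a person, never a guess. That's the design choice that catches AI mistakes before they become wire transfers.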
Our Position
We're not abandoning Claude. We're not telling you to either. It remains, in our experience, the most capable and thoughtful AI platform available.
But we're going into this next chapter with eyes wide open. The era of AI companies making hard safety commitments appears to be ending—replaced by softer pledges, competitive pressure, and government leverage.
That means the responsibility shifts. To us. To you. To every business building on these tools.
Build your own guardrails. Keep humans where judgment matters. Don't trust any company's promises more than your own processes.
The AI tools are getting more powerful. The question is whether we're getting more thoughtful about how we use them.
Want to make sure your AI workflows have proper human oversight built in? Book a free 30-minute call and let's talk about building AI processes that work—safely.
Sources:
- TIME: Anthropic Drops Flagship Safety Pledge (February 25, 2026)
- Anthropic: Responsible Scaling Policy v3.0 (February 24, 2026)
- Bloomberg: Anthropic Drops Hallmark Safety Pledge in Race With AI Peers (February 25, 2026)
- CNN: Anthropic ditches its core safety promise (February 25, 2026)
- Engadget: Anthropic weakens safety pledge in wake of Pentagon pressure (February 25, 2026)