The AI Safety Company Just Dropped Its Safety Promise. Then the Pentagon Called.
Anthropic ditched its core safety pledge and faces a Pentagon ultimatum — all in the same week. The company built on caution is learning what happens when safety meets power.
Anthropic spent three years telling the world it would rather stop building AI than build it dangerously. This week, it broke that promise. Then the Pentagon told it to break another one — or else.
The two events aren't officially connected. But together, they tell you everything about where AI safety stands right now: squeezed between market pressure from one side and government power from the other.
The Promise That Disappeared
In 2023, Anthropic made a pledge that no other major AI lab matched. If its models got too powerful for its safety measures, it would stop training. Full stop. Pause development. Wait until the guardrails caught up.
It was the centrepiece of the company's Responsible Scaling Policy — the document Anthropic pointed to whenever anyone asked why they were different from OpenAI or Google.
On Tuesday, they scrapped it.
"We felt that it wouldn't actually help anyone for us to stop training AI models," chief science officer Jared Kaplan told TIME. "We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead."
Translation: we can't be safe if being safe means losing.
The new policy replaces hard limits with "public goals" that Anthropic will "openly grade" its own progress toward. The company says it's actually more transparent now — promising detailed reports on model capabilities and risks. But the teeth are gone. The line that said "we will pause" now says "we might delay, if we think we're ahead."
The Pentagon's Friday Deadline
The same week Anthropic loosened its internal safety rules, the Defense Department tightened the external screws.
Defense Secretary Pete Hegseth met with CEO Dario Amodei on Tuesday and gave him until 5:01 p.m. Friday to agree to let the military use Claude — Anthropic's AI model — for "all lawful purposes" without restriction.
The threat: lose a $200 million Pentagon contract, get labelled a "supply chain risk" (a designation normally reserved for adversary nations), or face the Defense Production Act — a wartime power that lets the government compel private companies to cooperate.
Anthropic has two red lines it won't cross. It doesn't want Claude used for fully autonomous weapons — drones that kill without a human pressing a button. And it doesn't want Claude used for mass surveillance of American citizens.
"In a narrow set of cases, we believe AI can undermine, rather than defend, democratic values," Amodei wrote Thursday.
The Pentagon says it has no plans for either use. But it won't put that in writing as a contractual restriction. Its position: legality is our call, not yours.
Undersecretary of Defense Emil Michael called Amodei a "liar" with a "God-complex" on X. A Pentagon spokesperson called the request "simple" and "common-sense."
The Squeeze
Here's what makes this week so revealing.
Anthropic didn't drop its safety pledge because the science changed. It dropped it because the world around the science changed. No federal AI law materialised. The Trump administration adopted what the industry calls a "let-it-rip" approach. International governance talks collapsed. And competitors — OpenAI, Google, xAI — kept building without equivalent restrictions.
Anthropic's original bet was that its safety policy would create a "race to the top." Other labs would copy its approach. Governments would codify it into law. Neither happened.
So now Anthropic faces simultaneous pressure from two directions:
From the market: If you pause, you die. OpenAI, Google, and xAI don't have pause commitments. DeepSeek, the Chinese lab, won't even show American chipmakers its new model. Anthropic just raised $30 billion at a $380 billion valuation. Investors don't fund companies that stop building.

From the government: If you restrict, we'll force you. The Pentagon doesn't want an AI company deciding which military applications are acceptable. Other labs — OpenAI, Google, xAI — already work with the Defense Department without these restrictions.

Either way, the message is the same: safety is a competitive disadvantage.
What Anthropic Is Actually Holding
Strip away the politics and there's a genuinely hard question here.
Anthropic offered the Pentagon missile defence and cyber defence use cases in December. It's already integrated Claude into classified military networks. This isn't a pacifist company refusing to work with the military.
The two things it won't budge on — autonomous weapons and mass surveillance — aren't abstract philosophical concerns. They're engineering arguments. AI models hallucinate. They make confident mistakes. Anthropic's position is that today's technology isn't reliable enough to make kill decisions or monitor millions of citizens without human oversight.
That's not ideology. That's a technical assessment. And it's one that most AI researchers would agree with.
But the Pentagon isn't asking for Anthropic's technical opinion. It's asking for compliance.
The $380 Billion Question
Anthropic is valued at more than Ford, General Motors, and Boeing combined. It got there by being the "responsible" AI company — the one with the soul, as its own team described it.
That brand is now being tested from both ends.
If Anthropic caves to the Pentagon, it proves that safety commitments are marketing copy — nice until they cost you something. Employees who joined specifically because of those commitments start updating their resumes.
If Anthropic refuses, it loses government revenue, gets labelled a supply chain risk, and watches competitors absorb every defence dollar it leaves on the table.
And if the safety pledge it already dropped was its strongest card, what's left to play?
What This Means for Everyone Else
The Anthropic story matters beyond one company because it answers a question the entire AI industry has been dodging: can market forces and government pressure produce safe AI, or do they produce the opposite?
Anthropic tried self-regulation harder than anyone. It wrote the policy, hired the researchers, published the frameworks, built the brand. Three years later, its own chief science officer says unilateral safety commitments don't work when competitors ignore them.
That's not a failure of Anthropic's intentions. It's a failure of the theory that companies can police themselves in a race where the prize is measured in hundreds of billions of dollars.
The Friday deadline will pass. Anthropic will either bend or break. But the real lesson landed earlier this week, in a quiet policy update that got less attention than the Pentagon drama:
The company that promised it would stop rather than build dangerous AI decided that stopping was no longer an option.
That tells you everything you need to know about where we are.