Responsibility Ledger
Append-only · Dated · Signed

Entry 014 · May 8, 2026 · 6 min read

Cloudflare announced 1,100 layoffs May 7, calling the cuts an "acceleration" of its agentic AI-first operating model. The Commerce Department's CAISI expanded pre-deployment testing to Google, Microsoft, and xAI with no published pass/fail thresholds. And Anthropic committed $200 billion to Google Cloud through 2031.

Signed — Roger Grubb, Editor


A web infrastructure company cut 20 percent of its workforce yesterday and framed the move as an acceleration toward an "agentic AI-first operating model." A federal agency announced expanded testing agreements with three frontier labs Wednesday but published no criteria defining what a model must demonstrate to pass, fail, or deploy anyway. And an AI company committed $200 billion to a cloud provider over five years, a figure roughly equal to the GDP of New Zealand, in a deal reported Tuesday as necessary infrastructure to keep pace with "unprecedented growth."

All three events surfaced in the public record within the last 72 hours. All three involve operators making claims about capability, necessity, or business justification that can be measured against what happens next. And all three landed with enough specificity that six months from now, a reader with access to earnings calls, government disclosures, and contract filings will be able to grade whether the operators delivered what they claimed.

That is the job of this ledger.

3 Claims

Claim 1 — Cloudflare: 1,100 layoffs (20% workforce) framed as shift to "agentic AI-first operating model"

On May 7, 2026, Cloudflare unveiled a restructuring plan to accelerate its shift to an agentic AI-first operating model, including cutting roughly 20% of its workforce, or about 1,100 roles. The company expects to book an estimated $140 million to $150 million in related charges, mostly in the second and third quarters of fiscal 2026.

Cloudflare reported first-quarter 2026 revenue of $639.8 million, up 34% year over year, alongside full-year 2026 targets of $2.81 billion in revenue and roughly $418 million to $421 million in non-GAAP operating income. Management said AI is becoming a major driver of demand and product evolution.

The claim is gradeable on whether 1,100 employees are laid off by September 30, 2026; whether the company frames the cuts as necessary to fund AI transformation in subsequent earnings calls and SEC filings; and whether Cloudflare meets or exceeds its stated full-year 2026 revenue and operating income targets. The invalidator would be credible reporting showing Cloudflare laid off materially fewer employees, disclosed in regulatory filings that the cuts were primarily cost reduction unrelated to AI, or rebuilt the eliminated roles by Q4 2026 without acknowledging the reversal.

Grade by: 2026-11-08 (6 months)

Claim 2 — Commerce Department: pre-deployment testing of Google, Microsoft, xAI models with no published pass/fail criteria

The Center for AI Standards and Innovation on May 6, 2026 announced agreements with Google DeepMind, Microsoft, and Elon Musk's xAI that will allow the U.S. government to evaluate AI models before they are publicly available, to "conduct pre-deployment evaluations and targeted research to better assess frontier AI capabilities and advance the state of AI security." The center has already completed more than 40 AI model evaluations.

The agreements allow for government evaluations of models before public release, as well as post-deployment assessments and related research. But the Commerce Department has not published criteria defining what constitutes a passing evaluation, what triggers a failure, or whether labs can deploy models that fail review. Developers sometimes hand over versions of their models with safety guardrails reduced specifically so the Center can probe for national security risks.

The claim is gradeable on whether CAISI publishes—by November 8, 2026—clear pass/fail criteria or enforcement thresholds for pre-deployment evaluations; whether any lab publicly delays or cancels a model release citing CAISI findings; and whether the agreements include binding commitments not to deploy models that fail specific tests. The invalidator would be a published CAISI framework with explicit deployment gates, credible reporting of a delayed model launch due to government testing results, or disclosure that the testing regime includes enforceable consequences beyond voluntary compliance.

Grade by: 2026-11-08 (6 months)

Claim 3 — Anthropic: $200 billion commitment to Google Cloud over five years

Anthropic has committed to spend $200 billion with Google Cloud over five years as part of a recent agreement, The Information reported on May 5, 2026. The deal follows Google's investment of up to $40 billion in Anthropic announced in April 2026, with $10 billion now and the remaining $30 billion contingent on certain performance milestones, at Anthropic's latest valuation of $380 billion.

Anthropic stated the commitment represents "our most significant compute commitment to date to keep pace with our unprecedented growth," noting that run-rate revenue has now surpassed $30 billion, up from approximately $9 billion at the end of 2025. Amazon remains Anthropic's primary cloud provider and training partner.

The claim is gradeable on whether Anthropic spends at least $160 billion with Google Cloud through May 2031 (80% of stated commitment); whether the company discloses the financial terms of the Google Cloud relationship in subsequent investor materials, partnership announcements, or regulatory filings; and whether Amazon remains the "primary" provider by compute spend through 2027. The invalidator would be credible reporting—via Anthropic investor relations, court filings, or investigative journalism—showing actual cloud spending materially below the $200 billion commitment, disclosure that the contract includes exit clauses or conditional terms not initially reported, or confirmation that Google Cloud has overtaken Amazon as the primary training partner earlier than the company's public statements suggest.

Grade by: 2027-05-08 (1 year)

2 Reckonings

Reckoning 1 — Dario Amodei's "90% of code" prediction: Grade C

On March 10, 2025, Anthropic CEO Dario Amodei said "we'd be there in three to six months, where AI is writing 90% of the code" and that within a year AI might be writing "essentially all of the code." The horizon was September 2025. We are now eight months past that deadline.

Anthropic claims that the majority of its code is now written by Claude Code, and 84% of developers use or plan to use AI tools (51% of professionals daily), but GitHub's own data pegs Copilot at roughly 46% of code in files where it is enabled. The prediction landed in the right direction but missed the timeline and magnitude. AI is writing some majority of code in some organizations, but not 90% across the board.

Grade: C. The invalidator would have been credible industry-wide data—published by GitHub, Stack Overflow, or independent research firms—showing that AI accounted for 85% or more of code written at companies using AI-assisted development tools by October 2025. That data does not exist. Amodei's own company may have hit the threshold internally, but the broader claim did not materialize on the stated timeline.

Reckoning 2 — Sam Altman's "novel insights" prediction for 2026: Grade Incomplete

Sam Altman wrote in early 2025 that "2026 will likely see the arrival of systems that can figure out novel insights." We are now in the fifth month of 2026.

OpenAI disclosed that GPT-5 created novel wet-lab protocol improvements in April 2026, optimizing the efficiency of a molecular cloning protocol by 79x. The question is whether "novel insights" means independently discovering something no human knew, or whether it means generating a useful optimization that a human could have proposed but hadn't yet. OpenAI framed the cloning result as novel. Independent scientists have not yet replicated or validated the claim in peer-reviewed literature.

Grade: Incomplete. The invalidator would be published, peer-reviewed research demonstrating that an AI system independently discovered a scientific principle, mechanism, or solution that no human researcher had previously identified, with validation from domain experts confirming the novelty and correctness. As of May 8, 2026, the claim is plausible but not yet proven. We'll revisit this in six months.

1 Refusal

I refused to accept the Commerce Department's expanded testing agreements as evidence of AI oversight until CAISI publishes what "passing" means.

Three labs agreed to share unreleased models with a government center for evaluation. The government said the evaluations would assess "national security implications." But the government did not say what happens if a model fails the assessment. It did not say whether failure requires the lab to delay release, revise the model, or disclose the failure to the public. And it did not define the threshold at which a capability becomes a national security risk.

Without those answers, the testing regime is voluntary theater. A lab submits. The government tests. The lab launches anyway. The deal looks like accountability, but accountability requires consequence. I will not frame cooperation as oversight when the cooperation has no binding mechanism, no published standard, and no penalty for noncompliance.

I refused to call it oversight when there's no definition of what failing looks like.

— Roger Grubb, Editor


The next entry lands at 5:30 AM Pacific.

3 Claims. 2 Reckonings. 1 Refusal. Every weekday. Dated, signed, append-only.