Leader daily

Edition 2026-05-31 · read as Leader

Anthropic Mythos Breaks AISI Ranges, EDR Falls in Days

Sources: 36 · Words: 1,854 · Read: 9 min

Topics: Agentic AI · AI Capital · AI Regulation

◆ The signal

Anthropic's Mythos became the first AI model to fully take over both UK AISI attack ranges autonomously, and a parallel study showed AI reverse-engineering all five major EDR products in days rather than weeks. Patch SLAs and endpoint detection assumptions were calibrated for human-speed adversaries. The honest question is not whether defenders have twelve to eighteen months before this proliferates. It is whether the rebuild started last quarter or has not started.

◆ INTELLIGENCE MAP

  1.

    AI Cyber Offense Crosses Full Autonomy Threshold

    act now

    Mythos cleared both AISI simulated attack ranges (first model ever). TrustedSec proved all five major EDRs share identical architectures now transparent to AI. PraisonAI was exploited within 4 hours of disclosure. NSA — not CISA — is getting access. The defensive model just broke.

    4 hrs disclosure-to-exploit · 8 sources
    Metrics: AISI ranges cleared · EDRs transparent to AI · Exploit window · LLMjacking attempts/mo
    Timeline:
    1. 2024 Q4: AI finds single vulns
    2. 2025 Q1: 81% autonomous hack rate
    3. 2025 Q2: Full network takeover achieved
    4. 2025 Q3 (proj): Capability proliferates to open-weight
  2.

    AI Infrastructure Enters Public-Market Validation Phase

    monitor

    Cerebras IPO at $56B (+70% day one) on a $20B OpenAI anchor contract. Fervo Energy surged 33% at $10B+ driven by Google's 3GW option. Microsoft's total OpenAI commitment disclosed at $100B+. xAI leasing 45% of Colossus to Anthropic signals compute is financializing. The window to secure favorable multi-year capacity is closing.

    $100B+ Microsoft-OpenAI total · 7 sources
    Metrics: Cerebras IPO valuation · First-day pop · OpenAI compute deal · Fervo Energy valuation · GPU demand ratio
    Scale ($B):
    1. Microsoft→OpenAI: 100
    2. Cerebras IPO: 56
    3. OpenAI→Cerebras: 20
    4. Fervo Energy IPO: 10
    5. Anthropic ARR: 30
  3.

    Agent Platform War: Who Controls the Execution Layer

    monitor

    SAP (€100M fund, Knowledge Graph) and ServiceNow (headless Action Fabric via MCP) collide on who owns agent writes to enterprise systems of record. Apple is building agent gatekeeping into the App Store. Google's Gemini Intelligence ships on Android this summer. Vercel confirms 59% of AI traffic is now agentic. The execution layer is the new platform control point.

    59% agentic token share · 7 sources
    Metrics: SAP agent fund · Agentic traffic share · Android market share · GTM value migration
    Share of AI traffic (%):
    1. Agentic workloads: 59
    2. Chat/conversational: 27
    3. Batch/other: 14
  4.

    AI Liability Architecture Under Active Construction

    background

    a16z published the industry's definitive liability framework advocating user-liability defaults and damages caps. Courts are simultaneously deciding precedent-setting cases. ODNI and Commerce are fighting over pre-release model evaluation authority. The outcome determines whether open-source AI remains viable and whether incumbents or challengers survive.

    $115M a16z political spend · 4 sources
    Metrics: a16z midterm spend · Clarity Act odds · Window to influence · Jurisdictions drafting
    1. Platform liability (Section 230-like): 35
    2. Product liability (strict): 65

◆ DEEP DIVES

  1.

    Your Security Model Just Broke on Two Axes — EDR Transparency + Autonomous Offense

    The Capability Discontinuity

    The week produced a convergence that calls for an architectural response rather than a budget one. Anthropic's Mythos became the first model to clear both UK AISI simulated attack ranges, achieving full autonomous network takeover rather than persistence or lateral movement alone. OpenAI's GPT-5.5-cyber cleared one of the two. Across multiple independent assessments, researcher consensus is now that frontier models can find, chain, and exploit vulnerabilities in something close to real time.

    In the same week, TrustedSec ran LLMs against five commercial EDR products and found all five share identical architectural patterns: YARA-style rules, behavioral logic, allowlists, prefilters, scripted engines readable as Lua after a single decryption pass, and local ML classifiers. Work that took a skilled reverse engineer weeks now finishes in days. The EDR category's defensive moat was security-through-obscurity. The obscurity has left the building.

    The security model was built on the premise that the cost of understanding the agent exceeded the value of bypassing it for most adversaries. That premise is no longer true for a growing share of the threat population.

    The Evidence Stack

    • Mythos: first model to clear both AISI end-to-end cyber ranges
    • PraisonAI: exploited within 4 hours of disclosure, well below most patch SLAs
    • LiteLLM, Ollama, OpenClaw: added to CISA KEV. AI infrastructure is being exploited in the wild now
    • LLMjacking honeypot: 113,000+ attacks per month, tooling maturing mid-experiment
    • NGINX: 18-year undetected RCE in the rewrite module, foundational infrastructure that "many eyes" missed
    • Foxconn: 8TB of Apple, Google, Intel, and Nvidia designs exfiltrated via Nitrogen ransomware

    Where Sources Disagree, And Why It Matters

    A tension runs through today's intelligence. The research community sees a 12-18 month window before these capabilities proliferate to open-weight models and the long tail of threat actors. The policy signals suggest that window will not be spent on civilian defense. Congress is routing Mythos access through NSA rather than CISA, a choice that prioritizes offensive and intelligence operations over civilian defense. The private sector is on its own for the duration.

    A reasonable skeptic would note that AI-powered defensive scanning is equally transformative. Mozilla found 271 bugs in Firefox; Microsoft's MDASH surfaced 16 exploitable flaws in a single patch cycle. The skeptic is correct, with one caveat: the key variable is harness design, not model selection. Mozilla's 271 bugs came from custom agentic harnesses, while generic scanning of curl produced one low-severity CVE. The moat is in orchestration.

    The Architectural Consequence

    Security architecture built around quarterly patching, annual pentests, and endpoint-as-load-bearing-control was calibrated for human-speed adversaries. That calibration is now wrong. The compensating controls that matter in the next eighteen months sit above the endpoint: identity, network telemetry, behavioral analytics. Patch SLAs written for a 30-day window need rewriting for a 7-day window on the exposures that actually matter.
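The SLA arithmetic above can be sketched in a few lines. This is a minimal illustration, not a policy engine: the 30-day and 7-day windows and the 4-hour disclosure-to-exploit figure come from the text, while the policy table, the function names, and the sample disclosure date are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed policy table keyed by (severity, internet_facing).
# The 7-day and 30-day values come from the text; the mapping is hypothetical.
SLA_DAYS = {
    ("critical", True): 7,
    ("critical", False): 30,
}


def patch_deadline(disclosed_at: datetime, severity: str, internet_facing: bool) -> datetime:
    """Latest acceptable patch time under the assumed policy (default 30 days)."""
    days = SLA_DAYS.get((severity, internet_facing), 30)
    return disclosed_at + timedelta(days=days)


def exposure_hours(sla_days: int, exploit_after_hours: float) -> float:
    """Hours an asset is exploitable if it is patched exactly at the SLA limit."""
    return max(sla_days * 24 - exploit_after_hours, 0.0)


disclosed = datetime(2026, 5, 25, 9, 0)  # hypothetical disclosure time
deadline = patch_deadline(disclosed, "critical", True)
old_gap = exposure_hours(30, 4)  # 30-day SLA, exploit available after 4 hours
new_gap = exposure_hours(7, 4)   # 7-day SLA, same exploit timing

print(f"patch by {deadline:%Y-%m-%d %H:%M}")
print(f"exposure at 30-day SLA: {old_gap:.0f} h, at 7-day SLA: {new_gap:.0f} h")
```

Even the 7-day window leaves most of a week exposed when exploitation arrives within hours, which is why the compensating controls above the endpoint carry the load in the interim.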

    Action items

    • Commission a red team exercise this month specifically targeting your EDR with AI-assisted reverse engineering to quantify your actual detection gap
    • Rewrite critical-vulnerability patch SLAs from 30-day to 7-day for internet-facing assets by end of Q3
    • Inventory all AI infrastructure tooling (LiteLLM, Ollama, model registries) deployed by engineering teams and subject them to production-grade security review within 60 days
    • Evaluate deploying frontier models for defensive vulnerability scanning of your own codebase with custom harnesses — budget for this in Q4
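A first pass at the inventory item can be as simple as scanning dependency manifests for the tools named above. The sketch below assumes Python-style requirements files and a hypothetical watchlist; the package names and the sample manifest are illustrative, and a real inventory would also need to cover containers, binaries, and other ecosystems.

```python
import re

# Hypothetical watchlist based on tools named in this briefing.
WATCHLIST = {"litellm", "ollama", "praisonai"}


def flagged_packages(requirements_text: str) -> list[str]:
    """Return watchlisted package names found in a requirements-style file."""
    found = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9._-]+", line)  # leading package name
        if match and match.group(0).lower() in WATCHLIST:
            found.add(match.group(0).lower())
    return sorted(found)


sample = """\
requests==2.32.0
LiteLLM>=1.40  # proxy used by the platform team
ollama
"""
flags = flagged_packages(sample)
print(flags)
```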

    Sources: Clint Gibler · The Information AM · AINews · CyberScoop · The Hacker News · SANS AtRisk

  2.

    Compute Is Being Pre-Sold in $10B+ Blocks — Your Procurement Horizon Is Wrong

    The New Market Structure

    The recent IPO cluster and the Anthropic court disclosure landing inside the same window point to one conclusion: AI infrastructure is being allocated through relationship-based bilateral commitments, not open market competition. The data:

    Event | Scale | Signal
    Cerebras IPO | $56B valuation, +70% day one | Custom silicon is platform-grade valuable
    OpenAI→Cerebras contract | $20B commitment | Capacity pre-sold for the decade
    Microsoft→OpenAI (disclosed) | $100B+ total | The price of staying at the frontier
    Fervo Energy IPO | $10B+, +33% day one | Power is a platform business, not a utility
    Nebius growth | 684% YoY, 4:1 demand ratio | Neocloud capacity can't keep pace

    The xAI Signal: Compute as a Financial Instrument

    Elon Musk, who publicly called Anthropic "misanthropic and evil," agreed to lease the company 220,000 GPUs (45% of Colossus 1). When financial logic overwhelms competitive logic this visibly, the signal is hard to miss: GPU supply has become a financial instrument first and a strategic moat second. Grok never achieved meaningful traction. Lease revenue from Colossus exceeds what Grok could generate from those same GPUs.

    A plan that assumed three viable compute suppliers in eighteen months may now have one and a half. The optionality most infrastructure plans were quietly relying on is the line item being deleted.

    Anthropic's 80x Capacity Miss

    Anthropic disclosed it grew 80x against a planned 10x, operating at roughly 12% of required capacity for extended stretches. Paying customers inside that window received degraded service without disclosure. The 80x figure, combined with the leap from $9B to $30B in ARR over roughly four months, explains why Anthropic is now raising at $900B+, above OpenAI's $854B March mark.
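The figures in this paragraph are consistent with simple ratios, which the sketch below checks. It assumes capacity was provisioned exactly to the 10x plan and that ARR compounded at a constant monthly rate over four months; both are simplifying assumptions, and the function names are illustrative.

```python
def capacity_coverage(planned_multiple: float, actual_multiple: float) -> float:
    """Share of required capacity available if supply was sized to the plan."""
    return planned_multiple / actual_multiple


def monthly_growth(start: float, end: float, months: float) -> float:
    """Constant monthly growth rate that takes `start` to `end` in `months`."""
    return (end / start) ** (1 / months) - 1


coverage = capacity_coverage(10, 80)  # planned 10x against actual 80x
growth = monthly_growth(9, 30, 4)     # $9B to $30B ARR over about four months

print(f"capacity coverage: {coverage:.1%}")
print(f"implied monthly ARR growth: {growth:.1%}")
```

Under those assumptions coverage works out to 12.5%, matching the "roughly 12%" in the disclosure, and the ARR jump implies compounding of about 35% per month.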

    The planning implication: demand forecasting inside the labs is not a solved problem. Capacity-driven degradation will recur across vendors for at least four more quarters. Google's option for 3 gigawatts from Fervo, roughly sixty-plus data centers from a single supplier, says the hyperscalers see the constraint lasting through decade-end.

    What This Means for Non-Hyperscalers

    Fewer than five companies worldwide can sustain $100B+ rolling infrastructure investment. Every other AI strategy has to be built on partnership terms. Those terms are materially better today than they will be in eighteen months, because the counterparties have not yet finished sorting their own priorities. Firms shipping AI product on schedule right now locked capacity twelve to eighteen months ago. Firms still planning to scale are competing for what remains after the $20B blocks have been claimed.

    Action items

    • Audit all AI compute capacity contracts this month — model the cost differential between 12-month committed capacity and spot pricing exposure for your projected workloads
    • Evaluate multi-vendor AI infrastructure strategy including Cerebras, alternative neoclouds, and the emerging GPU lease market created by xAI-style arrangements
    • Secure energy/power supply commitments for any planned data center or AI infrastructure expansion before Q1 2026
    • Brief the board on a two-scenario model: capacity eases (write-down risk on commitments) vs. constraint persists (allocation risk without them)
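The two-scenario comparison in the action items can be sketched as follows. Every number here is a hypothetical placeholder (GPU-hour prices, fill rates, and workload size are not from any vendor), so treat it as a template for the board model rather than an estimate.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    spot_price: float      # $ per GPU-hour on the spot market (hypothetical)
    spot_fill_rate: float  # fraction of requested spot hours actually obtainable


def committed_cost(gpu_hours: float, committed_price: float) -> float:
    """Committed capacity is paid for whether or not it is used."""
    return gpu_hours * committed_price


def spot_outcome(gpu_hours_needed: float, s: Scenario) -> tuple[float, float]:
    """Return (spot cost, unserved GPU-hours) when relying on spot only."""
    filled = gpu_hours_needed * s.spot_fill_rate
    return filled * s.spot_price, gpu_hours_needed - filled


NEEDED = 1_000_000      # projected GPU-hours per year (hypothetical)
COMMITTED_PRICE = 2.50  # $ per GPU-hour under a 12-month commitment (hypothetical)

scenarios = [
    Scenario("capacity eases", spot_price=1.50, spot_fill_rate=1.0),
    Scenario("constraint persists", spot_price=4.00, spot_fill_rate=0.6),
]

results = {}
for s in scenarios:
    spot_cost, shortfall = spot_outcome(NEEDED, s)
    results[s.name] = {
        "committed_cost": committed_cost(NEEDED, COMMITTED_PRICE),
        "spot_cost": spot_cost,
        "unserved_hours": shortfall,
    }
    print(s.name, results[s.name])
```

The point of the structure is that the two risks are not symmetric: in the easing scenario the commitment costs more than spot (write-down risk), while in the constrained scenario spot may look cheaper on paper only because part of the workload never runs (allocation risk).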

    Sources: The Information AM · Martin Peers · StrictlyVC · Katie Roof · Bloomberg Technology · The Pragmatic Engineer

  3.

    The Execution Layer War: SAP, ServiceNow, Apple, and Google All Claimed the Same Surface This Week

    The Collision

    Four platform moves landed in the same window, and all four target the layer where AI agents commit writes to enterprise systems. The timing is not coincidence. It is the opening of a multi-year contest over write authority into systems of record:

    • SAP: €100M fund plus Knowledge Graph. Vertically integrated, making its own agents contextually superior inside SAP's data universe
    • ServiceNow: Headless Action Fabric exposed via MCP (Model Context Protocol). Open interoperability, any agent can call
    • Apple: App Store agent framework gating AI agent distribution on iOS, inserting approval gates and a 30% tax on agent sub-spawning
    • Google: Gemini Intelligence shipping on Android this summer, converting 97%+ market share into an agent platform moat

    Agents acting across finance, HR, IT, and procurement need one authoritative place to reconcile state. Enterprises cannot run two reconciliation authorities for the same writes.

    Two Competing Theories

    SAP and ServiceNow are placing incompatible architectural bets on where the execution layer lives. SAP bets that a vertically integrated data moat makes its agents superior within its ecosystem. ServiceNow bets that open MCP interoperability makes it the universal fabric any agent can call. Both can win, in different segments, at the same time. The enterprise running both still faces a forced choice about which vendor owns the execution layer for processes that cannot stop.

    a16z sharpens the frame with data: more than $150B of GTM value is migrating from CRM systems of record toward the AI orchestration layer. Lemkin's signal from a single customer is the concrete version of that claim: 80% fewer human seats, 83% higher total spend, and 20+ agents running. The implication is that consumption-based AI pricing is accretive against seat-based models rather than cannibalizing them.
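The single-customer figures imply a large change in spend per remaining human seat, which a short calculation makes explicit. This sketch assumes both percentages are measured against the same baseline and that "seats" means human seats only; the function name is illustrative.

```python
def spend_per_seat_multiple(seat_change: float, spend_change: float) -> float:
    """Ratio of new spend-per-seat to old, given fractional changes in each."""
    return (1 + spend_change) / (1 + seat_change)


# 80% fewer human seats, 83% higher total spend (figures from the text)
multiple = spend_per_seat_multiple(seat_change=-0.80, spend_change=0.83)
print(f"spend per remaining seat: {multiple:.2f}x the old level")
```

Under those assumptions the vendor collects a bit over nine times as much per remaining human seat, which is the arithmetic behind calling consumption pricing accretive.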

    The Platform Gatekeeping Dimension

    Apple's move is different in kind. The company is specifically addressing agents that "spin up smaller apps on the spot after Apple has already approved the parent app." This is both a safety argument and a revenue argument, and the revenue half is the one to take seriously: it prevents agents from routing around the 30% tax. For any company whose AI roadmap includes consumer-facing agents on iOS, this is a constraint layer that has to be priced into product economics before WWDC.
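Pricing the constraint into product economics comes down to one subtraction and one division. The sketch below uses the 30% rate from the text; the subscription price and per-user inference cost are hypothetical placeholders.

```python
def net_margin_per_user(price: float, platform_fee_rate: float, inference_cost: float) -> float:
    """Monthly margin per user after the platform fee and inference cost."""
    return price * (1 - platform_fee_rate) - inference_cost


def break_even_price(platform_fee_rate: float, inference_cost: float) -> float:
    """Lowest price at which the margin per user is zero."""
    return inference_cost / (1 - platform_fee_rate)


PRICE = 20.0          # $ per user per month (hypothetical)
INFERENCE_COST = 9.0  # $ per user per month in model usage (hypothetical)
FEE = 0.30            # platform tax from the text

with_fee = net_margin_per_user(PRICE, FEE, INFERENCE_COST)
without_fee = net_margin_per_user(PRICE, 0.0, INFERENCE_COST)
floor = break_even_price(FEE, INFERENCE_COST)

print(f"margin with fee: ${with_fee:.2f}, without fee: ${without_fee:.2f}")
print(f"break-even price with fee: ${floor:.2f}")
```

Because inference cost is paid on the full usage while the fee is taken off the top of revenue, a 30% fee removes far more than 30% of the margin for an agent product with meaningful compute cost.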

    Google's Gemini Intelligence converts default OS position into agent intermediation. The grocery-list demo looks like a party trick. What it actually does is turn the app into infrastructure and the agent into the interface. The summer rollout on Galaxy S26 and Pixel 10 puts it on billions of devices within months.

    The Decision Framework

    The question is no longer which model to use. It is which execution surface the business sits on: builder of the orchestration layer that others pass through, or commodity infrastructure underneath someone else's agent collecting thinner margins each cycle. SAP's claim is strongest where the process is the transaction. ServiceNow's claim is strongest where the process is the workflow across systems. Most enterprises have both shapes of work, and the agents that commit writes are what forces the choice.

    Action items

    • Conduct an agent-readiness audit of your platform architecture by end of Q3 — can third-party AI agents discover, invoke, and orchestrate your workflows without a human UI?
    • Determine this quarter whether SAP or ServiceNow owns the execution layer for your mission-critical processes — the vendor relegated to 'integration' will spend three years explaining why the line was drawn wrong
    • Audit your iOS AI agent roadmap for Apple's likely fee/approval structure and model unit economics against a 30% platform tax plus opaque approval gates
    • Evaluate MCP (Model Context Protocol) as a strategic investment for your own platform — build or integrate MCP server capabilities before the protocol standard calcifies
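For the agent-readiness audit, "discover and invoke without a human UI" maps onto two MCP requests: `tools/list` for discovery and `tools/call` for invocation, both carried as JSON-RPC 2.0 messages. The sketch below only builds the request envelopes; the tool name `create_purchase_order` and its arguments are hypothetical, and transport, initialization, and authentication are out of scope.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # simple incrementing JSON-RPC request ids


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Discovery: ask the server which tools it exposes.
discover = jsonrpc_request("tools/list")

# Invocation: call one tool by name with structured arguments.
# The tool name and arguments are hypothetical placeholders.
invoke = jsonrpc_request(
    "tools/call",
    {
        "name": "create_purchase_order",
        "arguments": {"vendor_id": "V-1001", "amount": 2500.0, "currency": "EUR"},
    },
)

print(json.dumps(discover))
print(json.dumps(invoke, indent=2))
```

An audit along these lines would ask, for each mission-critical workflow, whether an equivalent tool definition exists and whether its write path is governed as tightly as the human UI it bypasses.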

    Sources: TLDR IT · a16z · Techpresso · TLDR · Simplifying AI · TLDR Design

  4.

    AI Liability Is Being Written in Three Rooms at Once — Your Absence Is Expensive

    The Three Fronts

    The AI liability regime is being decided in three venues at once, and the firms most exposed are largely absent from all of them:

    1. Courts: Active cases could impose substantial penalties on general-purpose AI developers for downstream user misuse, well before any legislative framework exists to mediate the question.
    2. Congress: a16z has published the industry's most comprehensive lobbying blueprint, built around user-liability defaults, damages caps, and federal preemption. The firm is deploying $115.5M into the 2026 midterms to back it.
    3. White House: ODNI and Commerce are fighting over who evaluates AI models pre-release. An IC-led regime looks like release gating with classified compliance. A Commerce-led regime looks like voluntary disclosure with no teeth.

    Why the Regime Determines Market Structure

    This is not an abstract policy exercise. The liability framework that wins decides two concrete things: who is on the hook when a model is misused, and what kinds of models get released at all.

    Regime | Who Wins | Who Loses
    Strict/product liability | Deep-pocketed incumbents | Startups, open-source developers
    Platform/Section 230-like | Challengers, open-source | Regulators, plaintiffs' bar

    If developer liability for downstream use becomes the standard, the economic logic of releasing an open-source model stops working. No rational actor open-sources a model that generates unbounded liability for every downstream application.

    The Open-Source Dependency You Haven't Priced

    A reasonable skeptic would point out that product strategies have weathered regulatory ambiguity before, and most of the time the worst case does not arrive. The reasonable skeptic is correct on the base rate. What the skeptic does not address is that most current product strategies quietly assume continued access to open-weight models, which means they carry unpriced dependency on regulatory outcomes the P&L does not show. A strict-liability outcome restructures the supply chain toward proprietary foundation models and a handful of providers. Every build-vs-buy decision currently relying on open-source model access has this variable underneath it, whether the decision-maker has named it or not.

    The Timing Problem

    Courts are deciding cases now. The likely sequence is precedent-setting rulings arriving before any comprehensive federal framework, producing a patchwork of judicial standards that subsequent legislation has to work around rather than design from scratch. The guardrail bypass ecosystem has industrialized into middleware, proxy relays, and automated pipelines, which hands national security hawks exactly the evidence that shortens legislative timelines. Evidence of that sort does not get unseen once it is in the briefing.

    Action items

    • Commission a legal exposure audit against three competing liability frameworks (absolute liability, safe harbor with best practices, rebuttable user-liability presumption) by end of Q3
    • Begin building audit-ready AI governance infrastructure (model cards, safety testing documentation, incident reporting) that would satisfy proposed safe harbor requirements
    • Map all open-source AI dependencies and develop contingency plans for a world where open-source model availability contracts due to developer liability concerns
    • Engage federal legislative process through industry coalitions — file comments, join working groups, or allocate government affairs resources before the rules harden

    Sources: a16z AI Policy Brief · Risky.Biz · Morning Brew · The Download from MIT Technology Review

◆ QUICK HITS

  • ServiceNow blew its full-year Anthropic budget by May — no SLAs, no usage telemetry, no comment from Anthropic. The CDIO is building 'AI Control Tower' as a workaround and selling it to other enterprises.

    Laura Bratton

  • Duolingo CEO walked back blanket AI mandate after quantifying a ~20% 'slop tax' on AI-generated content at scale — first credible public admission that forced adoption produces performative compliance, not productivity

    TLDR Marketing

  • Update: Anthropic ARR leapt from $9B to $30B+ in ~4 months (120x in 24 months total) and is raising at $900B+ — above OpenAI's $854B. xAI is leasing them 220K GPUs because Grok can't generate comparable revenue from the silicon.

    StrictlyVC

  • Foxconn lost 8TB to Nitrogen ransomware — confirmed exfiltration includes confidential designs from Apple, Google, Intel, and Nvidia concentrated at a single contract manufacturer

    TLDR InfoSec

  • Lovable dissolved its growth management layer entirely, replaced with autonomous 'High-Impact IC' contributors — former VPs are voluntarily taking IC roles because one person with AI ships what a cross-functional squad used to ship in weeks

    Lenny's Newsletter

  • Only 15% of organizations have data foundations adequate for agentic AI while 85% are spending millions anyway — survey of 334 practitioners shows 95.2% say the gap is organizational (ownership, training, time), not tooling

    TLDR Data

  • NGINX carried an undetected RCE in its rewrite module for 18 years — blast radius is every internet-facing service using URL rewriting, which statistically means nearly every modern web application

    The Hacker News

  • Anthropic's June 15 pricing change caps third-party tool usage at plan-value credits then bills at full API rates — the 70-90% discount Cursor/Zed users enjoyed is ending as Anthropic prepares for an October IPO

    ben's bites

◆ Bottom line

The take.

AI cyber offense achieved full autonomous network takeover this week while a parallel study proved every major endpoint security product is now transparent to AI — and the infrastructure to defend against it is being pre-sold in $20 billion blocks to buyers who aren't you. The two decisions that compress into this quarter: rebuild your security architecture for machine-speed adversaries before the capability proliferates in 12-18 months, and source compute capacity through multi-year commitments before the remaining supply gets locked into bilateral deals between hyperscalers. Everything else — the platform wars, the liability regime, the org model — matters on a two-year clock. These two matter now.

Promit, reading as Leader

Frequently asked

What does Mythos clearing both UK AISI attack ranges actually mean for defenders?
It means a frontier model can now find, chain, and exploit vulnerabilities to achieve full autonomous network takeover, not just persistence or lateral movement. Combined with TrustedSec's finding that all five major EDRs share readable architectural patterns, the security model premise, that understanding defenses costs more than bypassing them, no longer holds. Patch SLAs and endpoint-as-load-bearing-control were calibrated for human-speed adversaries and need to be rewritten.

How should patch SLAs change in response to AI-accelerated exploitation?
Critical-vulnerability SLAs for internet-facing assets should move from 30-day to 7-day windows. PraisonAI was exploited within four hours of disclosure, which means a 30-day cadence is no longer a patching window; it is the exposure window itself. The compensating controls that matter over the next 18 months sit above the endpoint: identity, network telemetry, and behavioral analytics.

Why does the xAI–Anthropic GPU lease matter strategically?
Musk leasing 220,000 GPUs (45% of Colossus 1) to a company he publicly disparaged signals that GPU supply has become a financial instrument first and a strategic moat second. For non-hyperscalers, it means the optionality most infrastructure plans assumed (multiple viable compute suppliers in 18 months) is being deleted as capacity gets pre-sold in $10B+ bilateral blocks. Terms available today are materially better than they will be in 18 months.

How should enterprises decide between SAP and ServiceNow as the agent execution layer?
SAP's vertically integrated bet is strongest where the process is the transaction inside its data universe; ServiceNow's open MCP-based fabric is strongest where the process is a workflow spanning multiple systems. Most enterprises have both shapes of work, but cannot run two reconciliation authorities for the same writes, so the choice has to be made per mission-critical process. The vendor relegated to 'integration' will set licensing leverage against you for the next three years of renewals.

What is the unpriced risk in current AI product strategies from the liability fight?
Most strategies quietly assume continued access to open-weight models, but a strict developer-liability outcome would make open-sourcing economically irrational because it generates unbounded liability for every downstream use. That restructures the supply chain toward a handful of proprietary foundation model providers and changes the unit economics of every build-vs-buy decision. Courts are likely to set precedent before Congress legislates, producing a patchwork standard that later law must work around.
