Thursday, May 14, 2026 ~5 min

The frontier-pricing assumption broke this week, and three other assumptions broke with it

Chinese labs are profitable at one-twentieth the price, agents now buy from competitor sites, and the npm worm wipes hosts the moment you rotate. Four assumptions your stack inherits — repriced in a single week.

A shopper opened Amazon this week, asked the assistant for a specific brand of running shoe, and the assistant went to a competitor's website and bought it. That's the week in one transaction. Google made Gemini the default interface on Acer, ASUS, Dell, HP, and Lenovo laptops shipping fall 2026. Salesforce shipped headless. OpenAI deprecated finetuning. A 14-lab tour of Chinese AI shops came back with margin disclosures, not capability hype. And the npm worm everyone has been tracking grew teeth — it now wipes the host when you revoke the stolen token.

Four assumptions died. None of them in isolation. All of them at once.

The pricing assumption

DeepSeek V4 Pro clears at $0.43 per million input tokens against Claude Opus 4.6 at $4.73. Z.ai reports 50% gross margin at a dollar per million. MiniMax claims 70% on enterprise tiers. These are not subsidized loss-leaders dressed up as a strategy — they are profitable operations running on roughly 8x less compute, achieving 4-7x more capability per FLOP, six to eight months behind frontier on capability and 10-28x ahead on price.

The tell is Cursor. Composer 2 is built on Moonshot's Kimi K2.5. A flagship US developer-tools company, the kind sitting in dozens of venture portfolios, already routes its flagship workload through Chinese open-source. Not a price-shopping experiment. Their actual product.

The natural objection is that capability still wins where it matters, and that's true in the tails. Frontier reasoning, safety-critical agents, the hardest coding work — that perimeter holds, probably for another year. Outside it, the Chinese price floor becomes the global price floor inside twelve months. Any vendor contract signed this quarter without a cost-collapse scenario in the model is one that gets renegotiated under duress.

The margin assumption

Six sources converged on the same arithmetic this week and most board decks haven't read it yet: AI-native software appears to cap near 17% gross margin, not 70%. The mechanism is mechanical. Personalized inference kills the caching that made SaaS SaaS. Reasoning models burn 10-100x more tokens per task, which offsets every per-token price cut everyone keeps citing as salvation. And enterprise deployment keeps requiring Forward-Deployed Engineers — Google, OpenAI, and Anthropic all confessed to this in the same week by shipping consulting arms in three different costumes.

When three frontier labs independently bolt on a services motion in seven days, that's not strategy. That's confession. Enterprise AI does not self-serve. The narrative premium in the marks still assumes clean software economics. The delivery model is Palantir-shaped.

There are four escape routes and only two are venture-scale: luxury seats at $20K+, or vertical integration into atoms. Cost-cloning is a race to the floor. Niche is fine if your board agrees it's a lifestyle business. A team that hasn't picked one explicitly is drifting into commodity economics by default.

The interface assumption

A study across 16,000 AI shopping rounds found that seven of eight traditional conversion mechanisms either failed or produced negative lift when the buyer was an agent. Scarcity badges, anchored discounts, countdown timers, bundle psychology — the entire CRO playbook of the last decade. GPT-5 actively penalized aggressive promotional cues. Schema markup produced 2.4% lift across 1,885 pages, which is statistical noise.

The machinery built to make products legible to search engines is not the machinery that makes them preferable to agents. Only product ratings produced reliable positive signal, because ratings are the thing in the funnel that's actually hard to fake.

The useful 2x2 for Monday is whether your product is discoverable by an agent and whether an agent can complete the task inside it. Score no on both and you have two to three quarters before the platform assistant intermediates the relationship for you. The interface-heavy businesses have two quarters before this shows up in conversion. System-of-record businesses have four. The defensibility migration is downward into data, permissions, workflow logic — and upward into networks and real-world execution. Pure-software companies in the middle get squeezed from both sides.

The IR assumption

Shai-Hulud is no longer a credential-theft worm. It is a hostage situation. The new variant ships a gh-token-monitor process that watches for revocation events, and when a defender rotates a compromised token the host is wiped. The standard detect-revoke-rotate sequence is now part of the kill chain.

The persistence sits in .claude/settings.json, .vscode/tasks.json, and .cursor/* — paths your SCA tooling does not check. npm audit returns clean. Dependabot sees nothing. Every IDE launch re-executes attacker-controlled config with access to LLM API keys, GitHub PATs, kubeconfigs, and cloud CLI tokens. Confirmed blast radius this week: 400+ npm packages including Mistral, TanStack, and UiPath, plus 150+ RubyGems used as an exfiltration channel.

The new IR primitive: isolate, snapshot, enumerate persistence, remove persistence, then rotate. If your SOC ran the old playbook this week against an infected runner, you may have already triggered destruction.

What to ship this week

Four assumptions died simultaneously, which means there are four concrete moves and each of them is small enough to start before Friday.

Rewrite the Shai-Hulud runbook today and tabletop it with on-call. Deploy file-integrity monitoring on agent config paths — it's an afternoon of work and your scanner won't ever ship it.

Run your top three highest-volume inference calls against DeepSeek V4 Pro and Kimi K2.6 with your real eval set. Not a benchmark. The eval that gates your production deploys. If half the parity claim survives contact with your traffic, the cost math demands a routing layer.

Map your top five user flows to agent-callable endpoints and ship MCP support on the highest-value one. Two quarters before the platform assistants intermediate the relationship is also two quarters to be the endpoint they prefer.

And rebuild the AI P&L at a 17-25% gross margin floor. Either pick the escape route or admit which one you're already on. The companies that survive the next four quarters are the ones who notice the assumptions broke before the renewal call forces them to.

◆ Behind the synthesis

Six specialist takes that fed this piece.

The piece above is one stream in my voice. Below are the six lenses my pipeline produced upstream — each tuned for a different reader. Use them when you want the angle that matters most to your role.

The frontier-pricing assumption broke this week, and three other assumptions broke with it

The pricing assumption

The margin assumption

The interface assumption

The IR assumption

What to ship this week

Six specialist takes that fed this piece.

Shai-Hulud now wipes infected systems the instant you revoke a stolen token — your IR playbook's 'rotate credentials first' step triggers evidence destruction.

Shai-Hulud has weaponized your incident response playbook.

The finetuning API deprecation OpenAI announced this week runs on a shorter window than most migration plans budgeted for, which leaves reward-model loops built on those endpoints on a clock that already started.

A shopper asked Amazon's new agent to buy something this week, and the agent went to another website to do it.

Chinese labs are pricing inference 10-28x below the US frontier and still running 50-70% gross margins, which is what 4-7x compute efficiency looks like when export controls force it.

Chinese labs are shipping frontier-adjacent models at roughly ten to twenty-eight times lower cost while holding fifty to seventy percent gross margins, and the US hyperscalers are committing north of a hundred billion dollars to compute on the assumption that pricing power holds.