Edition 2026-05-30 · read as Data Science

Anthropic's June 15 Cap Ends the Hidden Agent Subsidy

Sources: 36
Words: 1,833
Read: 9min

Topics: Agentic AI · LLM Inference · AI Regulation

◆ The signal

Anthropic's June 15 credit metering removes what was effectively a 70-90% subsidy on Claude-backed agents and eval harnesses. Vercel's production index puts 59% of tokens in the agentic bucket, so the cost model is off on both price-per-token and tokens-per-task. What the headline number leaves out is how multi-turn traces compound under the new cap. Without reconciled attribution, the pricing decision gets made by default and first shows up on the invoice.

Key facts

Anthropic's June 15 credit metering ends a 70-90% subsidy on Claude usage through third-party tools like Zed, OpenCode, and the Agent SDK, with overflow billed at API rates.
Vercel's AI Gateway production index across 200,000 teams shows agentic workloads now account for 59% of all token volume, up from under 20% six months ago.
Anthropic captures 61% of token spend via Opus on reasoning nodes while Google captures 38% of token volume via Flash on utility calls, indicating tiered routing in production.
Apache Iceberg CVE-2026-42812 (CVSS 9.9) lets attackers redirect table metadata to attacker-controlled S3 prefixes, enabling silent training-data poisoning invisible to row-level logging.
Nous Token Superposition Training claims 2-3x wall-clock speedup at matched FLOPs, validated from 270M to 10B-A1B MoE, with no inference-side architecture change.

◆ INTELLIGENCE MAP

01
Anthropic June 15 Credit Reset + 80x Capacity Crisis
act now
Anthropic planned for 10x growth, got 80x, and is leasing xAI's entire 220K-GPU Colossus 1 cluster to recover. June 15 kills the implicit subsidy on third-party Claude tools (Zed, Conductor, OpenCode). ServiceNow already burned its full-year Claude budget by May. Multi-provider routing is no longer optional: it is the only hedge against an 8x capacity forecast error.
80x growth vs 10x plan · 11 sources
Chart: planned growth 10x vs actual growth 80x
02
59% Agentic Tokens: Eval + Cost Models Simultaneously Broken
act now
Vercel's AI Gateway production index puts agentic workloads at 59% of token volume across 200K teams. Anthropic captures 61% of spend via Opus, Google captures 38% of volume via Flash. Single-turn eval harnesses now measure the minority of production traffic. Cost models built on 3:1 input-output ratios are off by ~5x when agentic traces run 15:1.
59% of tokens now agentic · 6 sources
Chart: agentic tokens 59% vs single-turn tokens 41%
03
Training Efficiency Trifecta: TST 2-3x, Datology 17x, Star Elastic 360x
monitor
Three research drops in one week change pretraining and post-training economics. Nous TST delivers 2-3x wall-clock speedup at matched FLOPs with zero inference architecture change (validated to 10B). Datology beat InternVL3.5-2B by ~10 points at 17x less compute via data curation alone. NVIDIA Star Elastic claims one post-training run produces a model-size family at 360x lower cost than pretraining each.
17x compute savings (curation) · 3 sources
Chart: Nous TST 3x · Datology curation 17x · Star Elastic 360x
04
Lakehouse Trust Boundary Shrank: Iceberg/Polaris CVSS 9.9
monitor
Apache Iceberg CVE-2026-42812 lets attackers redirect table metadata writes to attacker-controlled S3, poisoning downstream queries and training data silently. Apache Polaris has three CVSS 9.9 credential-broadening bugs enabling cross-tenant access. Argo CD 3.2-3.3 exposes plaintext K8s Secrets to read-only users. Combined path: compromised notebook → poisoned training data → cross-tenant credential theft.
CVSS 9.9 (Iceberg/Polaris) · 3 sources
Chart (CVSS): Iceberg metadata redirect 9.9 · Polaris cred broadening 9.9 · Argo CD secret read 9.6 · n8n SQLi 9.8
05
Autonomous Cyber Capability Crosses Threshold
background
Anthropic's Mythos is the first model to clear both AISI simulated attack ranges (full network takeover). GPT-5.5-cyber cleared one of two. Google's threat intel team observed actual AI-built cybercrime tooling in the wild. AISI is building harder tests because current ones are saturating. Patch SLAs calibrated to human-speed exploitation are now structurally behind model-release cadence.
2 of 2 AISI ranges cleared · 5 sources
Chart: Prior Mythos: advanced persistence only · New Mythos: full network takeover (2/2) · GPT-5.5-cyber: full takeover (1/2) · AISI response: building harder tests

◆ DEEP DIVES

Anthropic's 80x Capacity Miss Has a June 15 Deadline Attached

The Capacity Admission That Explains Everything

At Code with Claude on May 6, Dario Amodei said Anthropic planned for 10x growth and got 80x in revenue and usage. An 8x forecast miss is sufficient to explain the Claude Code degradation, the quality complaints, and the infrastructure scramble. The patch is leasing xAI's entire Colossus 1 cluster—220,000+ NVIDIA GPUs spanning H100, H200, and GB200, from the CEO who three months ago called Anthropic 'misanthropic and evil'.

The June 15 Credit Change Is a Hard Deadline

Starting June 15, Claude usage through third-party tools (Zed, Conductor, OpenCode, T3 Code) moves to a separate credit bucket capped at plan value. No subsidized tokens, no rollover, and overflow bills at API rates. What was effectively a 70-90% discount on programmatic usage through Max plans is gone. Any cost model assuming flat-rate Claude consumption through non-native IDEs stops working on that date.

If the budget assumed flat subscription cost on Agent SDK, GitHub Actions, or claude -p pipelines, expect a silent overrun.
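The reconciliation arithmetic can be sketched in a few lines. This assumes the credit bucket equals the plan price, nothing rolls over, and overflow is charged dollar-for-dollar at API rates. The function names and the $200 and $1,000 figures are illustrative placeholders, not Anthropic's billing logic.

```python
def implied_subsidy(plan_price: float, usage_at_api_rates: float) -> float:
    """Fraction of API-rate usage that a flat plan price was absorbing."""
    if usage_at_api_rates <= plan_price:
        return 0.0
    return 1.0 - plan_price / usage_at_api_rates


def metered_bill(plan_price: float, usage_at_api_rates: float) -> float:
    """Monthly bill once third-party usage draws from a capped credit bucket.

    Assumption: credits equal the plan price, nothing rolls over, and usage
    beyond the credits is charged at API rates.
    """
    overflow = max(0.0, usage_at_api_rates - plan_price)
    return plan_price + overflow


# Hypothetical workload: $1,000 of API-rate usage on a $200 flat plan.
print(implied_subsidy(200.0, 1000.0))  # share previously absorbed by the plan
print(metered_bill(200.0, 1000.0))     # bill once overflow is metered
```

Running this per workload, with usage valued at current API rates, gives a first estimate of which pipelines move from flat cost to overflow billing.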

The Enterprise Share Crossover

Ramp's AI Index puts Anthropic at 34.4% of US businesses paying for AI, versus 32.3% for OpenAI. It is the first documented crossover. The caveat is what the index measures: corporate card spend, not token volume and not production criticality. OpenAI correctly notes that large enterprises pay by invoice. The gap is 2 points in one monthly snapshot. Read it as a bottom-up adoption signal, not a share claim.

What's changing in the next rate-limit window

Surface	Before	After (May 7–14)
Claude Code limits	5-hour cap	Doubled
Peak-hours throttle	Reduced for Pro/Max	Removed
Opus API rate limits	Squeezed during crunch	'Substantially raised'
Fleet composition	Anthropic-only	Heterogeneous incl. GB200 via Colossus

Any benchmark you ran between mid-April and early May is stale. Serving conditions changed, and they will change again as Colossus integrates. Re-baseline after the new caps land.

The Telemetry Gap Compounds the Problem

Anthropic provides no native per-user or per-tool usage telemetry. ServiceNow's CDIO burned through the full-year Claude budget by May. National Life Group's CIO calls Claude 'great for consumer usage but not great for companies' that need per-user monitoring. Token consumption in agentic workflows is non-linear: a reflection loop can 10x spend per task without proportional quality gain, and the signal arrives with the invoice.
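Until a gateway is in place, the attribution it would provide can be approximated in process. The sketch below is a minimal in-memory ledger that tags each call by user and feature and checks a daily token budget. The class and method names are hypothetical and it does not use any LiteLLM or Portkey API.

```python
from collections import defaultdict
from datetime import date


class UsageLedger:
    """In-memory token ledger keyed by (day, user, feature)."""

    def __init__(self, daily_token_budget: int) -> None:
        self.daily_token_budget = daily_token_budget
        self._tokens = defaultdict(int)

    def record(self, day: date, user: str, feature: str,
               input_tokens: int, output_tokens: int) -> None:
        """Add one call's tokens to the bucket for this user and feature."""
        self._tokens[(day, user, feature)] += input_tokens + output_tokens

    def total(self, day: date) -> int:
        """All tokens recorded on the given day."""
        return sum(n for (d, _, _), n in self._tokens.items() if d == day)

    def by_user(self, day: date) -> dict:
        """Per-user token totals for the given day."""
        totals = defaultdict(int)
        for (d, user, _), n in self._tokens.items():
            if d == day:
                totals[user] += n
        return dict(totals)

    def over_budget(self, day: date) -> bool:
        """True once the day's total exceeds the configured budget."""
        return self.total(day) > self.daily_token_budget
```

In practice the `record` call would sit wherever responses come back with token counts, and `over_budget` would feed whatever alerting channel the team already uses.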

Action items

Reconcile every Claude-backed workload (Agent SDK, claude -p, GitHub Actions, batch evals) against the new credit cap by June 1
Deploy an LLM gateway (LiteLLM/Portkey) with per-user, per-feature tagging and daily token budget alerts within 2 weeks
Add a second frontier provider behind a router with automatic failover on 429/5xx this sprint
Re-run Claude Code and Opus API benchmarks (throughput, p95 latency, rate-limit headroom) after Colossus integration stabilizes in late May
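The failover item above can be sketched without committing to any vendor SDK. In this minimal version each provider is a plain callable that raises a hypothetical `ProviderError` carrying an HTTP status, and the router moves to the next provider only on 429 or 5xx. All names here are illustrative assumptions.

```python
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(f"{status} {message}".strip())
        self.status = status


def is_retryable(status: int) -> bool:
    """Fail over on rate limits (429) and server errors (5xx) only."""
    return status == 429 or 500 <= status <= 599


def complete_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, completion)."""
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as err:
            if not is_retryable(err.status):
                raise  # client errors such as 400 will not improve elsewhere
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

Non-retryable errors are re-raised on purpose: a malformed request would fail on the second provider too and would only double the spend.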

Sources: Claude just metered your agent SDK calls... · Claude Code latency on long-context requests drifted upward... · Anthropic shipped without the telemetry hooks... · Anthropic passes OpenAI in B2B... · Vercel published a number worth sitting with...

59% Agentic Tokens: Your Eval Harness and Cost Model Are Both Wrong

The Production Shape Has Changed

Vercel's AI Gateway production index is the only multi-tenant usage snapshot worth citing this quarter. It puts agentic workloads at 59% of all token volume across 200,000 teams. Six months ago that figure was under 20%. An eval harness built on single-turn benchmarks is now scoring the minority of production traffic.

The spend-versus-volume split tells the routing story. Anthropic captures 61% of dollars via Opus on planning and reasoning nodes. Google captures 38% of tokens via Flash on high-throughput utility calls. The data shows no vendor loyalty. That is a textbook tiered-routing signature, already in production at scale.

A serving layer that hardcodes one provider SDK is out of step with what 200K production teams are already running.
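A tiered router of the kind described here can start as a lookup from node kind to model tier. The sketch below uses made-up node kinds and tier names; which concrete models sit behind each tier is left to the deployment.

```python
# Hypothetical node kinds and tier names; map tiers to the models you run.
TIER_BY_NODE_KIND = {
    "plan": "reasoning-tier",
    "reason": "reasoning-tier",
    "extract": "utility-tier",
    "classify": "utility-tier",
    "summarize": "utility-tier",
}


def pick_tier(node_kind: str, default: str = "reasoning-tier") -> str:
    """Return the model tier for a node, falling back to the stronger tier."""
    return TIER_BY_NODE_KIND.get(node_kind, default)


print(pick_tier("plan"))
print(pick_tier("classify"))
```

Defaulting unknown node kinds to the reasoning tier trades cost for safety. A cost-first deployment might choose the opposite default.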

Why Cost Models Are Off by 5x

Most cost models were fit when input-output ratios sat around 3:1. Agentic traces run closer to 15:1 on input, with heavy cache reuse on some providers and none on others. A forecast built on last year's ratio is off by roughly a factor of five on spend, and the error is not symmetric across vendors. The median request stops being a useful planning unit. The p95 does the work.
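The size of the error depends on how input and output tokens are priced, which a short calculation makes concrete. This is a sketch under simple assumptions: output tokens per task stay fixed, there are no cache discounts, and the prices passed in are placeholders rather than any vendor's list prices.

```python
def spend_multiplier(old_ratio: float, new_ratio: float,
                     input_price: float, output_price: float) -> float:
    """Spend per task at the new input:output ratio relative to the old one.

    Ratios are input tokens per output token. Prices are per token in any
    consistent unit; only their relative size matters.
    """
    old_cost = old_ratio * input_price + output_price
    new_cost = new_ratio * input_price + output_price
    return new_cost / old_cost


# Input volume alone: 15:1 vs 3:1, ignoring output cost.
print(spend_multiplier(3, 15, input_price=1.0, output_price=0.0))
# Equal input and output prices.
print(spend_multiplier(3, 15, input_price=1.0, output_price=1.0))
```

Under these assumptions the multiplier is 5x when only input volume is counted and 4x at equal prices, and it drops further as the output-price premium grows. Cache reuse pulls it in the other direction per vendor, which is one way the error ends up asymmetric.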

The Eval Gap Is Wider Than the Cost Gap

Multi-agent decomposition (scan → adversarial debate → PoC construction) outperformed monolithic models on CyberGym. Microsoft's MDASH used 100+ agents. What the benchmark does not measure is the inference bill for running 100+ agents. The pattern works. The economics are the open question.

What harness measures	What production breaks on
Single-turn accuracy	Cost path through 40K-token planning loops
Pass@1 on curated prompts	Tool-call reliability at the tail
Mean latency	P95 under concurrent load
Aggregate task success	Reward-hacking paths in 'passing' rollouts

Sayash Kapoor's argument deserves to be the new default: outcome-only metrics systematically underestimate failure modes in capable agents. Stronger agents surface benchmark bugs and reward-hacking paths that weaker agents physically cannot reach. The pass@1 curve flattens exactly when real reliability is diverging.

MCP Is Consolidating Faster Than Expected

TikTok shipped a Model Context Protocol endpoint. SAP put €100M behind MCP-exposed Knowledge Graphs. ServiceNow shipped Action Fabric exposing workflows headlessly. MCP is no longer Anthropic-specific. Glean's benchmark claims off-the-shelf MCP uses 30% more tokens and loses 2.5x head-to-head against an enterprise knowledge graph. That is a vendor-published result with no methodology disclosed. Run the comparison on your own traffic before citing it.

The bot-detection bypass rate of 81% is the quieter number. If ranking models, recsys, or experiment populations are ingesting agent traffic as human, the optimizer is converging on agent-preferred artifacts. Flag agent traffic in the experimentation platform before the next model refresh.

Action items

Add trajectory-level metrics (tool-call precision/recall, steps-to-completion, cost-per-successful-task) to eval harness this sprint
Run a 1-hour spike: measure token overhead of current MCP/tool-calling setup vs. a retrieval-first baseline on 100 production agent traces
Segment token spend by workload type (agentic vs single-shot) and benchmark Flash/Haiku substitution on non-reasoning nodes
Add agent-traffic flagging to your experimentation platform and retrain bot-detection models with agent-generated traffic in the training set
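The trajectory-level metrics in the first action item can be computed from logged traces with very little code. The sketch below assumes each trace records which tools were expected, which were actually called, whether the task succeeded, and what it cost. The `Trajectory` shape is a hypothetical schema, not a standard one.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    expected_tools: set   # tools a correct run should use
    called_tools: list    # tools actually called, in order, with repeats
    succeeded: bool
    cost_usd: float


def tool_call_precision(t: Trajectory) -> float:
    """Share of tool calls that hit an expected tool (repeats counted)."""
    if not t.called_tools:
        return 0.0
    hits = sum(1 for name in t.called_tools if name in t.expected_tools)
    return hits / len(t.called_tools)


def tool_call_recall(t: Trajectory) -> float:
    """Share of expected tools that were called at least once."""
    if not t.expected_tools:
        return 1.0
    return len(t.expected_tools & set(t.called_tools)) / len(t.expected_tools)


def steps_to_completion(t: Trajectory) -> int:
    """Number of tool calls the run took."""
    return len(t.called_tools)


def cost_per_successful_task(runs: list) -> float:
    """Total spend across all runs divided by the number of successes."""
    successes = sum(1 for t in runs if t.succeeded)
    if successes == 0:
        return float("inf")
    return sum(t.cost_usd for t in runs) / successes
```

Dividing total spend by successes, rather than averaging the cost of successful runs, charges failed rollouts to the tasks that did finish. That is usually the number the budget owner cares about.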

Sources: Agentic traffic crossed fifty-nine percent... · Vercel published a number worth sitting with... · The CyberGym result is the kind of finding... · MCP plus knowledge graphs is the combination... · AI Gateway data puts agentic workloads at fifty-nine percent...

Three Training Efficiency Claims That Change This Quarter's Build-vs-Buy Math

The Claims, Ranked by Actionability

Three research drops landed the same week. Each one moves unit economics for teams running their own training or distillation, in directions worth measuring.

Work	Claim	Scale Validated	Inference Impact	Spike Priority
Nous TST	2-3x wall-clock at matched FLOPs	270M → 10B-A1B MoE	None—no architecture change	Highest
Datology VLM curation	+11.7 pts on 20 benchmarks at 17x less compute	2B and 4B params	Lower response FLOPs—real serving win	High
NVIDIA Star Elastic	360x cheaper model-family derivation	Not specified	Produces family of sizes from one run	Medium (verify first)

TST Is the One to Spike First

Token Superposition Training is a pretraining recipe change with no inference-side consequences. If it replicates, that is a 2-3x wall-clock speedup with nothing to change at serving time. The validation range covers 270M to 10B-A1B MoE, which is wide enough to take seriously. The risk is medium because it is single-source, but the claim is clean and falsifiable. A 1B continued-pretraining run against a matched-FLOPs baseline answers the question in a week.
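Whether the spike is worth running comes down to simple arithmetic: at a wall-clock speedup of s, a run that would have taken H GPU-hours takes H/s. The sketch below uses placeholder GPU-hour figures, and the function names are illustrative.

```python
def gpu_hours_saved(baseline_gpu_hours: float, speedup: float) -> float:
    """GPU-hours saved on a run when wall-clock time shrinks by `speedup`."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_gpu_hours * (1.0 - 1.0 / speedup)


def spike_pays_off(spike_gpu_hours: float, next_run_gpu_hours: float,
                   speedup: float) -> bool:
    """True if savings on the next full run exceed the cost of the spike."""
    return gpu_hours_saved(next_run_gpu_hours, speedup) > spike_gpu_hours


# Placeholder numbers: a 100 GPU-hour spike against a 1,600 GPU-hour run.
print(spike_pays_off(100.0, 1600.0, speedup=1.6))
```

Because the saving scales with 1 - 1/s, a partial replication well below the claimed 2-3x still recovers a large share of the next run's budget.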

Datology: The Marginal Dollar Moved from Compute to Curation

Datology's result is the clearest evidence this year that data curation dominates compute scaling for VLMs. A 2B model beat InternVL3.5-2B by about 10 points at 17x less training compute. A 4B near-frontier model hit 3.3x lower response FLOPs than Qwen3-VL-4B. Benchmark-selection risk is real, and the numbers say nothing about how the curation pipeline transfers to a different data distribution. Even if only half the claim holds, it still justifies redirecting spend from GPU hours to curation tooling.

Star Elastic: Verify Before Extrapolating

NVIDIA claims one post-training run produces a family of reasoning model sizes at 360x lower cost than pretraining each, and 7x better than SOTA compression. A figure like 360x tends to shrink under independent evaluation. Even if only 30x held up, it would still restructure how size tiers get produced. No external validation exists yet.

The Compute Backdrop: 4:1 Demand Crunch

Nebius reported 4+ customers competing for every GPU brought online, with Q1 revenue at $399M (+684% YoY) and full-year guidance of $3–3.4B. Cisco corroborates: AI product orders from hyperscalers are set to rise from $5B to $9B (+80%) next fiscal year, with memory hardware shortages explicitly called out. H2 training runs need reserved capacity locked in now across 2+ providers.

This is what makes the efficiency results matter beyond academic interest. At current contention levels, a 2-3x speedup at matched FLOPs is the difference between shipping in H2 and waiting for Q1 2027.

DuckDB Quack + Kafka Share Groups: The Single-Node Stack Grows Up

DuckDB's HTTP client-server mode makes embedded DuckDB viable as a shared service. Combined with the published ECS Fargate + Terraform pattern, that is a credible path to deleting Glue/EMR footprint for sub-100GB jobs. Kafka Share Groups decouple consumer parallelism from partition count, with roughly linear 8x scaling at 32 instances on I/O-bound workloads. Both invalidate assumptions the current stack was probably built on.

Action items

Spike Token Superposition Training on a 1B continued-pretraining run against a matched-FLOPs baseline this quarter
Lock H2 2026 GPU reservations across 2+ providers before quarterly sellouts tighten further
Audit Glue/EMR job catalog for single-node candidates and spike one onto ECS Fargate + DuckDB + Terraform pattern
Benchmark Kafka Share Groups against your most partition-bound consumer group (embedding/enrichment workloads first)

Sources: Claude just metered your agent SDK calls... · DuckDB shipped a client-server mode this week... · The 4:1 ratio is the headline number...

Lakehouse Data Poisoning Path: Iceberg/Polaris CVSS 9.9 + Argo CD Secret Disclosure

A New Attack Surface on Your Training Data

This advisory set is distinct from the LiteLLM/Ollama vulnerabilities covered earlier this week. The new vectors target the data and deployment layers specifically.

Apache Iceberg (CVE-2026-42812, CVSS 9.9) lets an attacker with table-write permission redirect metadata writes to an attacker-controlled S3 prefix. The next query reads poisoned Parquet. The next training run ingests silently corrupted features. Most lakehouse observability does not monitor metadata pointer mutations: default logging covers row changes, not pointer changes.

Apache Polaris (CVE-2026-42809/10/11, CVSS 9.9) ships three credential-broadening bugs enabling cross-tenant access to S3/GCS credentials. Combined with the Iceberg redirect, there is a plausible path from compromised analyst notebook to poisoned training data to cross-tenant credential theft.

Argo CD 3.2.x/3.3.x (CVE-2026-42880, CVSS 9.6) lets read-only users extract plaintext Kubernetes Secrets. For teams promoting models to prod via Argo, every K8s Secret in reachable namespaces should be treated as disclosed until patched and rotated.

The Orchestrator Layer Is Soft

Component	CVSS	ML Stack Impact	Patch Action
Apache Iceberg	9.9	Poisoned tables, corrupted training data	Enforce explicit storage credential scoping + write-path allowlisting
Apache Polaris	9.9	S3/GCS creds, cross-tenant access	Patch + rotate all catalog credentials
Argo CD 3.2/3.3	9.6	Plaintext K8s Secrets (HF tokens, SA keys)	Patch to ≥3.2.12/≥3.3.10 + rotate every Secret
n8n	9.8	Workflow DB, OAuth sessions	Patch + scope service accounts
Kestra ≤1.3.3	9.8	Pipeline metadata, schedules	Patch + audit reach

Why This Is Different from Tuesday's Advisory

The LiteLLM and Ollama vulnerabilities covered Tuesday targeted the inference layer: API keys, prompts, model memory. The Iceberg/Polaris bugs target the data layer: table metadata, storage credentials, training inputs. The failure mode is not a data breach that surfaces in access logs. It is silent data poisoning that compounds through every downstream model and decision trained on the corrupted tables. Larger blast radius, harder detection.

The PraisonAI 4-hour exploitation window (CVE-2026-44338) confirms that agent frameworks are now first-class targets with sub-day weaponization. If agent orchestration holds API keys or tool-call permissions, assume a working exploit exists within a day of any disclosure. I would commit to that assumption and revise only if the next two cycles disagree.

The Compound Threat

Draw the reference architecture for a modern data team: Iceberg for storage, Polaris for catalog, Argo for deployment, n8n/Kestra for orchestration. Every component has a CVSS at or above 9.0 this cycle. The combination is what matters. Credential broadening (Polaris) feeds metadata redirect (Iceberg) feeds model poisoning (training pipeline) feeds secret disclosure (Argo) for persistence. This is not a single-patch problem.

Action items

Audit Iceberg/Polaris catalog configurations today: enforce explicit storage credential scoping and add write-path allowlisting for table metadata locations
Patch Argo CD to ≥3.2.12/≥3.3.10 and rotate every Kubernetes Secret in namespaces it can read by end of week
Run a dependency scan for n8n, Kestra, Spring Cloud Config, and Redis in your ML orchestration stack this sprint
Add metadata-pointer-mutation monitoring to lakehouse observability—alert on storage-location changes separate from data-row changes
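The write-path allowlisting and pointer monitoring items can share one check. Iceberg writes a new metadata file on every commit, so the pointer changing is normal. The useful signal is a pointer that leaves the expected storage prefix. The sketch below assumes a periodic export of table name to current metadata location from the catalog. That export, the function names, and the bucket paths in the example are hypothetical.

```python
def normalize_prefix(prefix: str) -> str:
    """Force a trailing slash so 's3://lake/warehouse' cannot match
    a sibling such as 's3://lake/warehouse-evil/'."""
    return prefix if prefix.endswith("/") else prefix + "/"


def out_of_allowlist(metadata_locations: dict, allowed_prefixes: list) -> dict:
    """Return tables whose metadata pointer sits outside every allowed prefix."""
    prefixes = tuple(normalize_prefix(p) for p in allowed_prefixes)
    return {
        table: location
        for table, location in metadata_locations.items()
        if not location.startswith(prefixes)
    }


# Hypothetical catalog export: table name -> current metadata file location.
current = {
    "db.events": "s3://lake/warehouse/db/events/metadata/v2.metadata.json",
    "db.users": "s3://lake/warehouse-evil/db/users/metadata/v9.metadata.json",
}
for table, location in out_of_allowlist(current, ["s3://lake/warehouse"]).items():
    print(f"ALERT: {table} metadata pointer outside allowlist: {location}")
```

An empty allowlist flags every table, which fails closed. Run the check on a schedule and before each training job that reads from the catalog.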

Sources: LiteLLM landed in the KEV catalog this week... · An Ollama endpoint exposed to the public internet... · PraisonAI, an open-source multi-agent framework...

◆ QUICK HITS

Update: LiteLLM (KEV) — rotate all upstream provider API keys stored in its DB; versions 1.81.16–1.83.7 confirmed actively exploited
LiteLLM landed in the KEV catalog this week...
Abridge runs 80M+ clinical conversations through model routing — cheap triage model in front, expensive reasoning behind — 5-10x cost reduction at scale when routing is confidence-gated
Abridge runs model routing across 100M conversations...
TML-Interaction-Small reports 0.40s turn-taking latency vs 0.57s Gemini-3.1-flash-live and 1.18s GPT-Realtime-2.0 — full-duplex voice is becoming a distinct architecture class
TML is reporting 0.40 seconds of full-duplex latency...
Duolingo pegs AI-generated content slop at ~20% requiring human QC — a rare production quality number from a real deployment; benchmark your own acceptance rate against it
Duolingo's twenty percent AI slop rate...
Only 15% of organizations have the data foundation for agentic AI (Fivetran); data quality/lineage cited as #1 blocker by ~50% — half of funded agent projects are actually data-platform projects with an agent on top
DuckDB shipped a client-server mode this week...
LLM-as-a-Verifier beats LLM-as-a-Judge on tie-rate and decision accuracy by decomposing criteria into repeated binary verifications at token granularity — swap one pairwise judge this sprint
An Ollama endpoint exposed to the public internet...
SWE-ZERO-12M-trajectories released: 112B tokens, 12M trajectories, 122K PRs, 3K repos, 16 languages — positioned as largest open agentic trace corpus for SFT/RM training
Claude just metered your agent SDK calls...
Cerebras IPO closed +70% at $311; OpenAI's $20B commitment signed Dec 2025 is the first dollar-weighted proof wafer-scale handles production LLM inference
Cerebras IPO validates non-Nvidia silicon...
Gemini reproducibly outputs real phone numbers from training data — add PII extraction eval (canary insertion + divergence attacks + membership inference) to LLM CI before next release cut
Gemini is the latest model to surface PII...
New COSO/PCAOB guidance requires deterministic execution and tamper-evident audit trails for ML in regulated finance — LLM stochastic decoding is structurally non-compliant by design
The transformer underwriting models are outperforming...
Persona drift in LLM agents measurable within 8 conversational turns (Li et al. COLM 2024) — add a verbal-tic regex canary to agent logs as a zero-cost drift detector
AI personas drift within eight turns...
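One possible reading of the verbal-tic canary in the last item: give the persona a distinctive phrase it should use every turn, then scan agent logs for the first turn where the phrase is missing. The sketch below follows that reading. The phrase, the function name, and the approach itself are illustrative assumptions and are not taken from the cited paper.

```python
import re

# Hypothetical signature phrase the persona is instructed to use each turn.
TIC = re.compile(r"\bahoy\b", re.IGNORECASE)


def first_turn_without_tic(agent_turns, pattern=TIC):
    """Return the 1-based index of the first agent turn missing the tic,
    or None if every turn contains it."""
    for number, text in enumerate(agent_turns, start=1):
        if not pattern.search(text):
            return number
    return None


turns = ["Ahoy, how can I help?", "Ahoy! Sure thing.", "Sure, here you go."]
print(first_turn_without_tic(turns))
```

Tracking the distribution of this index across sessions gives a cheap drift signal that can sit next to the rest of the agent logging.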

◆ Bottom line

The take.

Anthropic's 80x capacity miss has a June 15 deadline attached: from that date every Claude-backed agent burns metered tokens at list price, while 59% of production tokens are now agentic and your eval harness still scores single-turn completions. Simultaneously, Apache Iceberg's CVSS 9.9 lets attackers silently poison your training data through metadata redirects that default logging won't catch. The three things your stack needs this week: a Claude credit reconciliation before June 15, trajectory-level eval metrics for the 59% of traffic you're not measuring, and an Iceberg metadata-pointer audit before the next training run ingests something it shouldn't.

Frequently asked

What exactly changes for Claude usage on June 15?

Claude usage through third-party tools like Zed, Conductor, OpenCode, and T3 Code moves to a separate credit bucket capped at plan value, with no rollover and overflow billed at API rates. The effective 70-90% subsidy on programmatic usage through Max plans disappears, so any cost model assuming flat-rate Claude consumption through non-native IDEs needs to be reconciled before then.

Why are single-turn evals no longer sufficient for production?

Vercel's production index shows 59% of token volume is now agentic across 200,000 teams, but most eval harnesses still measure single-turn pass@1 on curated prompts. That misses tool-call precision, steps-to-completion, p95 latency under concurrent load, and reward-hacking paths in passing rollouts. These failure modes only surface in multi-turn traces, and they dominate real reliability.

How should teams instrument Claude spend given the lack of native telemetry?

Deploy an LLM gateway like LiteLLM or Portkey with per-user and per-feature tagging plus daily token budget alerts. Anthropic provides no native per-user or per-tool usage telemetry, so without a gateway the overage shows up on the invoice after credits are gone, as ServiceNow's CDIO discovered after burning through a full-year budget by May.

Which training efficiency claim is worth spiking first and why?

Nous Token Superposition Training, because it is a pretraining recipe change with no inference-side consequences and claims 2-3x wall-clock speedup at matched FLOPs, validated from 270M up to 10B-A1B MoE. A 1B continued-pretraining run against a matched-FLOPs baseline answers the question in roughly a week, and even a 1.6x replication pays for itself on the next full run.

What makes the Iceberg and Polaris CVEs different from inference-layer vulnerabilities?

They enable silent data poisoning rather than a detectable breach. CVE-2026-42812 lets an attacker redirect Iceberg metadata writes to an attacker-controlled S3 prefix so subsequent queries read poisoned Parquet, and Polaris credential-broadening bugs allow cross-tenant access to storage credentials. Default lakehouse logging tracks row changes, not metadata pointer mutations, so the corruption compounds through every downstream model trained on the affected tables.
