Product daily

Edition 2026-05-07 · read as Product

iOS 27 Model Picker Turns AI Providers Into a Default Race

Sources: 36 · Words: 1,287 · Read: 6 min

Topics: LLM Inference · Agentic AI · Data Infrastructure

◆ The signal

A user opens Settings once this fall, picks a model provider for iOS 27, and doesn't touch that screen for months. That's one choice, across a billion devices, sitting next to Bluetooth. If a product's AI story is "we use the good model," that story now lives in someone else's menu. The work this quarter is showing the product holds up when the model underneath is swappable. Whoever isn't the default on day one gets no query volume, and "no" here means zero.

◆ INTELLIGENCE MAP

  1. 01

    iOS 27 Model Marketplace Ships Fall 2026 — The Biggest Distribution Shift Since the App Store

    act now

    Apple is opening iOS 27, iPadOS 27, and macOS 27 to multiple third-party AI providers. Users pick once; that choice routes all system-wide AI queries. Apple paid $250M in a Siri settlement because it couldn't build competitive AI, so it's becoming the selector layer instead. Model wrappers die; workflow owners survive.

    $250M Siri settlement cost · 6 sources
    • Active iOS devices
    • Siri settlement
    • SDK drop expected
    • User behavior
    1. WWDC SDK docs: June 2026
    2. iOS 27 beta: Summer 2026
    3. Public launch: Fall 2026
    4. OpenAI phone ships: H1 2027
  2. 02

    AI Platform Layer Splits Into Three Incompatible Business Models

    monitor

    OpenAI is an ad network ($100M ARR in 6 weeks, targeting $100B by 2030). Anthropic is a vertical services company ($1.5B finance JV, 10 agents for Wall Street). Google is the infrastructure layer ($200B cloud commitment from Anthropic alone). These aren't competitors — they're three different ecosystems with different incentive structures for your product.

    $100M OpenAI ads ARR in 6 weeks · 8 sources
    • OpenAI ads 2026 target
    • OpenAI ads 2030 target
    • Anthropic finance JV
    • Anthropic cloud commit
    • OpenAI weekly users
    1. OpenAI Ads (6 wks): $100M
    2. OpenAI Ads 2026: $2,500M
    3. Anthropic JV: $1,500M
    4. OpenAI Deploy Co: $4,000M
  3. 03

    Architecture Economics Reset: 12M-Token Context at 1/1000th the Compute, Vision Agents at 45x Premium

    monitor

    SubQ launched a 12M-token context window at $8 per benchmark run vs ~$2,600 for Opus — potentially eliminating 40% of RAG pipeline work. Simultaneously, vision agents cost 45x structured API paths and better models don't close the gap. MTP drafters deliver 3x throughput at 78M parameters. Your cost assumptions from Q1 are stale in both directions.

    325x context cost reduction · 5 sources
    • SubQ context window
    • SubQ cost vs Opus
    • Vision agent premium
    • MTP throughput gain
    • MTP drafter size
    1. Structured API: 1x
    2. Vision Agent: 45x
    3. SubQ vs Opus: 0.003x
  4. 04

    Dynamic Pricing Bans: 33+ States, Hard Oct 1 Deadline

    act now

    Maryland's Protection From Predatory Pricing Act takes effect Oct 1, 2026 — first state-level ban on AI-powered dynamic pricing. ~33 states have similar legislation in draft. Governor explicitly named 'predictive AI that determines when we'll pay more.' Any pricing logic using individual user signals is the target. Four quarters of engineering work, starting now.

    33+ states drafting bans · 2 sources
    • Maryland effective date
    • States with drafts
    • Engineering runway
    1. Maryland signed: Q2 2026
    2. Compliance audit due: Q3 2026
    3. Maryland enforced: Oct 1, 2026
    4. Additional states: 2027
  5. 05

    The Reliability Gap: 18 Months of Capability Gains, Minimal Trust Improvement

    background

    Research across 14 frontier models over 18 months shows capability surged while reliability barely moved. Best-in-class still hallucinates ~30% in multi-turn. Power users use AI 'thinking' 7x more than median. 90% of firms report zero AI impact over 3 years. The upgrade from one model to the next doesn't fix the edge cases — orchestration engineering does.

    30% multi-turn hallucination · 4 sources
    • Models studied
    • Hallucination floor
    • Power vs median gap
    • Firms with zero impact
    1. Capability growth (18 mo): 80
    2. Reliability growth (18 mo): 15

◆ DEEP DIVES

  1. 01

    iOS 27 Is the New Default — Your Model-Wrapper Product Has Two Quarters to Evolve or Die

    The Distribution Event

    A user upgrades to iOS 27, iPadOS 27, or macOS 27 and is asked to pick an AI provider. She picks once. From that moment, every system-wide AI request (summarize this, draft a reply, generate this image) routes to the third-party model she selected for text generation, image generation, and editing. The named beneficiaries are Google and Anthropic, which tells you the bar is foundation-model scale.

    There is no browsing. Users pick once. Whoever is in the default slot on day one gets the query volume. Whoever is not, does not.

    Why This Is Different From Previous Platform Shifts

    The $250M Siri settlement is the backstory. Apple promoted AI features "that did not exist at the time, do not exist now, and will not exist for two or more years." It could not ship competitive AI fast enough, so it became the selector layer instead. This is the App Store playbook applied to AI: control distribution, extract rents, let others compete on capability.

    OpenAI is running the same play from the opposite direction — a dedicated AI phone with dual-NPU architecture targeting 30M units in 2027-2028. Google's Remy agent is in internal dogfood, potentially debuting at I/O May 19-20. Meta has an agentic assistant targeting pre-Q4 2026. Four companies are building the next interaction layer simultaneously.

    The Three-Cell Diagnostic

    If the product is a model: integration work starts now. Latency budgets on-device, Apple's review process, being on the partner list before it goes public. That is a real distribution win at billion-device scale.

    If the product is a model wrapper: the default-assistant slot is where distribution goes to die. The user picks the underlying model. The wrapper layer disappears.

    If the product is an application built on a model: the pricing conversation with the model vendor changes in two quarters, because their marginal query cost is about to drop. The core product is safe only if it owns data or workflow the assistant cannot replicate.

    The Forcing Function

    Pull the last 30 days of session data. Separate sessions where user intent formed elsewhere (the user arrives with a question) from sessions where intent forms inside the product (the user discovers what to do next). If the first bucket is larger, the iOS 27 change hits the roadmap this quarter. If the second is larger, the product has a workflow moat the selector layer cannot intermediate.
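
One way to run that split, as a minimal sketch. The `Session` record and its `entry_had_query` flag are hypothetical stand-ins for whatever your analytics export provides, and treating "session opens with a fully formed query" as "intent formed elsewhere" is an assumption, not a rule from the sources.

```python
from dataclasses import dataclass


@dataclass
class Session:
    session_id: str
    # Hypothetical flag: True when the first event in the session is a
    # fully formed query (the user arrived already knowing what to ask).
    entry_had_query: bool


def intent_split(sessions):
    """Return (share_formed_elsewhere, share_formed_in_product)."""
    if not sessions:
        return 0.0, 0.0
    elsewhere = sum(1 for s in sessions if s.entry_had_query)
    share_elsewhere = elsewhere / len(sessions)
    return share_elsewhere, 1.0 - share_elsewhere


def exposure(sessions):
    """Label the product by which bucket is larger."""
    elsewhere, in_product = intent_split(sessions)
    return "exposed" if elsewhere > in_product else "workflow moat"


# Illustrative sessions only; replace with the last 30 days of real data.
sample = [
    Session("a", True),
    Session("b", True),
    Session("c", False),
]
print(exposure(sample))
```

The hard part is the labeling rule, not the arithmetic. Spot-check a few dozen sessions by hand before trusting the split.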

    Action items

    • Map every AI-powered feature in your product against the question: 'Can a user accomplish this by asking their default iOS assistant instead?'
    • Evaluate whether your product could register as a selectable AI provider on Apple's platform — draft partnership proposal if domain-specific AI is your core value
    • Audit all public AI feature marketing for capability claims exceeding delivery; Apple's $250M Siri settlement sets a benchmark at $25-$95 per eligible device

    Sources: A user on iOS 27 long-presses the side button... · A product manager at a mid-size SaaS company opened the iOS 27 developer beta... · A product manager at a mid-sized SaaS company opened Apple's iOS 27 developer preview... · A product manager at a mid-sized SaaS company spent Tuesday morning rewriting...

  2. 02

    The Architecture Economics Just Flipped: 12M Tokens for $8, Vision Agents at 45x, and What to Rebuild

    Three Cost Curves Moving Simultaneously

    Three infrastructure developments landed this week that collectively invalidate Q1 cost assumptions for AI features:

    1. SubQ's 12M-token context window at 1/1000th the compute of frontier models — $8 per benchmark run versus ~$2,600 on Opus. RULER 128K score of 95%, competitive with Opus 4.6 and DeepSeek V4 Pro.
    2. Vision agents cost 45x structured API paths — and better models don't close the gap because screenshot volume (not accuracy) drives token consumption.
    3. Multi-Token Prediction drafters at 78M parameters deliver 3x throughput with no quality degradation, with day-0 support across vLLM, SGLang, Ollama, and llama.cpp.

    A feature that cost $0.03 per interaction six months ago now costs $0.012 on optimized routing. At volume, that's the line between a product and a subsidy.
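
The per-interaction arithmetic can be checked in a few lines. The text does not say how the $0.012 figure was derived; applying the 61% routing reduction reported later in this piece to the $0.03 baseline lands close to it, and that pairing is an assumption. The monthly volume is a hypothetical number chosen only to show the effect at scale.

```python
def routed_cost(baseline_cost, reduction):
    """Cost per interaction after a fractional cost reduction."""
    return baseline_cost * (1.0 - reduction)


def monthly_spend(cost_per_interaction, interactions_per_month):
    """Total monthly inference spend in dollars."""
    return cost_per_interaction * interactions_per_month


baseline = 0.03        # $ per interaction six months ago (from the text)
router_saving = 0.61   # reported routing reduction (pairing is an assumption)
volume = 1_000_000     # hypothetical monthly interactions

after = routed_cost(baseline, router_saving)
print(f"per interaction: ${after:.4f}")
print(
    f"monthly: ${monthly_spend(baseline, volume):,.0f}"
    f" -> ${monthly_spend(after, volume):,.0f}"
)

# The SubQ comparison from the list above: ~$2,600 vs $8 per benchmark run.
print(f"SubQ vs Opus cost ratio: {2600 / 8:.0f}x")
```

Swap in your own baseline and volume. The same function applies to any shelved feature whose unit cost was the blocker.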

    What the Context Cost Cliff Means for RAG

    SubQ's numbers are self-reported and need independent validation. But the useful question isn't whether these specific numbers hold; it's what happens to chunking strategies, multi-hop retrieval, summarization chains, and progressive document disclosure when context gets hundreds of times cheaper. All of those patterns exist because context was scarce. About 40% of the work on a typical RAG roadmap becomes unnecessary if the pricing holds. That is not the same as RAG being dead, since retrieval still solves freshness and access control, but it's a different product with a different cost profile.

    The Vision Agent Trap

    Teams built vision-first because prototyping was easier. The cost curve is flat against model improvements — a model that is 50% more accurate still sends thousands of input tokens per screenshot. The forcing function is a simple tag on your automation backlog:

    Cell           Schema     Latency    Path
    Safe           Stable     Tight      Structured-first always
    Defensible     Variable   Flexible   Vision earns its 45x
    Judgment call  Mixed      Mixed      Measure both paths
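
The table above can be applied to a backlog mechanically. This is a sketch under two assumptions: the tag values are lowercase strings, and any item with a stable schema goes structured-first regardless of latency. The second assumption is a reading of the action item below, not something the table states. The backlog entries are hypothetical.

```python
def recommend_path(schema, latency):
    """Map a backlog item's tags to an automation path.

    schema:  "stable", "variable", or "mixed"
    latency: "tight", "flexible", or "mixed"
    """
    if schema == "stable":
        # A stable schema or API exists, so the 45x vision premium buys nothing.
        return "structured"
    if schema == "variable" and latency == "flexible":
        # No stable interface and no tight latency budget: vision is defensible.
        return "vision"
    # Everything else is a judgment call: run both paths and compare.
    return "measure both"


# Hypothetical backlog items for illustration.
backlog = [
    {"item": "invoice extraction", "schema": "stable", "latency": "tight"},
    {"item": "legacy portal scrape", "schema": "variable", "latency": "flexible"},
    {"item": "vendor onboarding", "schema": "mixed", "latency": "mixed"},
]

for entry in backlog:
    entry["path"] = recommend_path(entry["schema"], entry["latency"])
    print(f'{entry["item"]}: {entry["path"]}')
```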

    The Inference Router Opportunity

    DigitalOcean's Inference Router delivered 61% cost reduction through intelligent model routing (selecting optimal models by cost, latency, and quality per request). Combined with MTP drafters and the SubQ context breakthrough, the features shelved because 'inference is too expensive at scale' deserve a re-run through current economics. Some are shippable now.
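
To make "routing by cost, latency, and quality" concrete, here is a generic sketch of the idea. It is not DigitalOcean's implementation; the `ModelOption` fields, model names, prices, and quality scores are all hypothetical, and the policy shown (cheapest model that clears a quality floor and a latency ceiling) is only one of several reasonable choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # dollars, hypothetical
    p95_latency_ms: int        # hypothetical
    quality: float             # 0..1, from your own evals


def route(options, min_quality, max_latency_ms):
    """Pick the cheapest model that meets the quality and latency constraints."""
    eligible = [
        o for o in options
        if o.quality >= min_quality and o.p95_latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no model satisfies the request constraints")
    return min(eligible, key=lambda o: o.cost_per_1k_tokens)


# Hypothetical catalog; none of these numbers come from the sources.
CATALOG = [
    ModelOption("small", 0.10, 200, 0.70),
    ModelOption("medium", 0.50, 400, 0.85),
    ModelOption("large", 3.00, 900, 0.95),
]

choice = route(CATALOG, min_quality=0.8, max_latency_ms=500)
print(choice.name)
```

The savings come from requests that never needed the largest model, so the quality floor per feature is the number worth measuring carefully.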

    Action items

    • Pull the list of features shelved for cost reasons in Q4 2025 or Q1 2026 and rebuild unit economics with current routing/MTP pricing — flag newly viable candidates for sprint planning
    • Tag every item in the automation backlog with 'stable schema/API exists' vs 'variable/no API' — move anything in the first category off vision-agent paths immediately
    • Sign up for SubQ private beta and run your top 3 RAG use cases against 12M-token context — determine which parts of your chunking pipeline are solving a real problem vs a constraint that just disappeared

    Sources: A product manager on an agent team ran the same invoice-extraction task... · A RAG engineer opened her vector database dashboard... · A product manager on an agent team ran the numbers... · A staff engineer shipped four features last sprint...

  3. 03

    33 States, One Deadline: The Dynamic Pricing Regulatory Wave That Takes Four Quarters to Solve

    The Specific Threat

    A pricing PM at a retail company opened her model's feature importance dashboard this week and saw that device type and browsing history were two of the top five signals. That is the problem. Maryland's Protection From Predatory Pricing Act takes effect October 1, 2026, the first US state-level ban on AI-powered dynamic pricing. Governor Wes Moore named the target directly: "predictive AI that determines what we need, when we need it, when we'll pay for it and when we'll pay more for it." That language describes every modern pricing optimization system that uses individual user signals.

    Per the New York Times, roughly 33 additional states have similar bills in motion. Some target surveillance pricing. Some target surge pricing on essentials. A few cover any algorithm that varies price by user attribute. The drafts do not agree with each other, which makes a patchwork harder to comply with than a single ban.

    The strictest state's rules become your de facto national standard because building 50 state-specific pricing engines is impractical. This is GDPR-for-pricing.

    What's Banned vs. What Survives

    The pitch is usually "personalization." What the pricing engine is actually doing is price discrimination using individual user signals: purchase history, device type, behavioral patterns that move the price person-to-person for the same product. The law does not touch contextual and temporal signals: store location, time of day, inventory levels, demand tiers applied uniformly.

    Revenue optimization does not have to die. It has to become transparent and non-discriminatory:

    • Time-based pricing (happy hour, early bird) likely survives
    • Volume-based pricing and membership tiers likely survive
    • Per-user price discrimination from behavioral signals does not survive

    The Audit Framework

    The audit grid a pricing PM can run on Monday: on one axis, label every input as individual, contextual, or temporal. On the other, measure the revenue lift attributable to each group. Pull the individual signals out and rerun the model. If the lift collapses when those signals come out, the product was extracting, not personalizing. That is worth knowing before a regulator tells you.
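
The ablation step can be sketched as follows. The signal names come from the examples in this piece; the labels assigned to them, the `measure_lift` callback (standing in for rerunning your model and measuring revenue lift), and the toy contribution numbers are all hypothetical.

```python
# Label every pricing input by type (labels here are illustrative).
SIGNAL_TYPES = {
    "purchase_history": "individual",
    "device_type": "individual",
    "browsing_history": "individual",
    "store_location": "contextual",
    "inventory_level": "contextual",
    "time_of_day": "temporal",
}


def ablation_audit(signal_types, measure_lift):
    """Compare lift with all signals against lift with individual ones removed."""
    all_signals = set(signal_types)
    kept = {s for s, kind in signal_types.items() if kind != "individual"}
    full = measure_lift(all_signals)
    without_individual = measure_lift(kept)
    retained = without_individual / full if full else 0.0
    return {
        "full_lift": full,
        "lift_without_individual": without_individual,
        "share_retained": retained,
    }


# Toy stand-in for a real model rerun: made-up additive lift per signal.
TOY_CONTRIBUTION = {
    "purchase_history": 0.04,
    "device_type": 0.02,
    "browsing_history": 0.02,
    "store_location": 0.01,
    "inventory_level": 0.005,
    "time_of_day": 0.005,
}


def toy_measure_lift(signals):
    return sum(TOY_CONTRIBUTION[s] for s in signals)


result = ablation_audit(SIGNAL_TYPES, toy_measure_lift)
print(f"share of lift retained: {result['share_retained']:.0%}")
```

A low retained share is the "lift collapses" case described above. Real signals interact, so an actual rerun will not be additive like the toy function.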

    Why This Is a PM Problem, Not a Legal Problem

    Compliance work slips because nothing breaks on the demo. But re-architecting a pricing engine to be scoped by jurisdiction, category, and customer segment, with decision logs a regulator will accept, is roughly four quarters of engineering. Starting the work after October 1, 2026 means operating illegally during the build.

    Action items

    • Audit every dynamic pricing feature for individual user signals (purchase history, device type, browsing behavior) vs. contextual signals (time, location, inventory) — document all instances and the signals used
    • Draft a 'compliant pricing alternatives' spec that replaces per-user discrimination with uniform mechanisms (time-based tiers, volume discounts, membership pricing) that achieve similar optimization
    • Build jurisdictional scoping into the pricing engine architecture: can it be turned off per state without a code change? Can each price decision be explained in one sentence to a non-technical auditor?
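
The last action item can be sketched as a policy table held in data rather than code. Everything here is hypothetical: the table shape, the `Adjustment` record, and the default for states without an entry. The Maryland row reflects this piece's reading that contextual and temporal signals are untouched; it is not legal guidance.

```python
from dataclasses import dataclass

# Hypothetical policy table: which signal classes may move price per state.
# In practice this would be loaded from config so it changes without a deploy.
POLICY = {
    "MD": {"contextual", "temporal"},
}
# Assumption: states with no entry allow every signal class.
DEFAULT_ALLOWED = {"individual", "contextual", "temporal"}


@dataclass
class Adjustment:
    signal: str
    kind: str     # "individual", "contextual", or "temporal"
    delta: float  # dollars added to (or subtracted from) the base price


def price_for(state, base_price, adjustments, policy=POLICY):
    """Return (price, one-sentence explanation) for a single price decision."""
    allowed = policy.get(state, DEFAULT_ALLOWED)
    applied = [a for a in adjustments if a.kind in allowed]
    skipped = [a for a in adjustments if a.kind not in allowed]
    price = round(base_price + sum(a.delta for a in applied), 2)
    applied_names = ", ".join(a.signal for a in applied) or "none"
    skipped_names = ", ".join(a.signal for a in skipped) or "none"
    explanation = (
        f"Base price {base_price:.2f}; applied: {applied_names}; "
        f"not used in {state}: {skipped_names}."
    )
    return price, explanation


adjustments = [
    Adjustment("device_type", "individual", 1.50),
    Adjustment("time_of_day", "temporal", -2.00),
]
md_price, md_note = price_for("MD", 20.00, adjustments)
print(md_price, "|", md_note)
```

Logging the explanation string next to each price gives the auditor-readable decision record the action item asks for.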

    Sources: A pricing manager at a mid-market retailer... · Dynamic pricing bans hitting 33+ states...

◆ QUICK HITS

  • ElevenLabs hit $500M ARR (up from $350M in ~5 months, +43%) driven by enterprise voice agents — Microsoft simultaneously killed Gaming Copilot because generic AI chatbots don't produce retention without a specific job-to-be-done

    A product lead at a mid-market SaaS company opened the same internal doc three times...

  • North Korean APTs are slopsquatting AI coding agents — registering packages that match hallucinated dependency names, then harvesting installs when developers rubber-stamp AI suggestions. Implement a dependency allowlist blocking packages newer than the model's knowledge cutoff

    A developer on your team asked a coding agent to scaffold a new service...

  • WP Engine's default-on AI bot blocking causes 0% citation in Claude and Meta AI — site owners can't see or disable it. If your product site is on WP Engine, you have a discoverability emergency; escalate to their product engineering team or evaluate migration

    Your product's AI discoverability may be zero...

  • OpenAI's self-serve ChatGPT Ads Manager is live in US beta with CPC bidding — early CPCs are low before sophisticated buyers arrive. Allocate $2-5K test budget this sprint against your highest-converting keywords

    Your web product needs an AI agent UX layer...

  • Update: Gartner published inaugural Market Guide for Guardian Agents — AI agent governance is now a named procurement category. Enterprise deals will ask about it on security questionnaires within two quarters

    A buyer opened a vendor evaluation spreadsheet this week...

  • Linear's new customer acquisitions rose 67% YoY while Jira's dropped 32% and Monday.com's cratered 41%. Linear dominates the 11-200 employee segment where the tool is chosen by the people who use it, not a program office

    A platform PM spent Tuesday afternoon in a meeting about AI governance...

  • Anthropic's MCP STDIO implementation has a systemic RCE vulnerability affecting 150M+ downloads across all downstream AI frameworks, IDEs, and registries — 30+ disclosures, 10+ CVEs, one root cause. Conduct risk review of any MCP-dependent features before shipping to production

    A developer opened the Stripe dashboard on a Tuesday...

  • Open-source deep research (Onyx) ranked #1 on DeepResearch Bench — 100 PhD-level tasks across 22 fields — ahead of OpenAI, Gemini 2.5 Pro, and Perplexity. Self-hosted option now viable for regulated-industry features previously blocked on data egress

    A research lead at a mid-size company ran the same prompt through OpenAI's deep research agent...

  • Google published explicit guidance on building websites for AI agent consumption — agents use screenshots + accessibility trees + semantic HTML. Treat 'agent navigability' as a first-class product requirement for self-serve flows

    Your web product needs an AI agent UX layer...
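
The dependency allowlist suggested in the slopsquatting item above could look like this minimal sketch. The package names, publish dates, and cutoff date are hypothetical, and the caller is assumed to supply each package's first-publish date from its own registry lookup; no real registry API is called here.

```python
from datetime import date


def vet_dependency(name, first_published, allowlist, model_cutoff):
    """Classify a dependency suggested by a coding agent.

    first_published: date the package first appeared in the registry,
    or None if the name could not be found at all.
    """
    if name in allowlist:
        return "allow"
    if first_published is None or first_published > model_cutoff:
        # The model could not have learned this name from training data, so it
        # is either hallucinated or registered after the fact. Block it.
        return "block"
    # Older package that nobody has approved yet: send to a human.
    return "review"


# Hypothetical values for illustration.
ALLOWLIST = {"internal-utils"}
MODEL_CUTOFF = date(2025, 6, 1)

suggestions = {
    "internal-utils": date(2022, 3, 1),
    "fast-json-helperz": date(2026, 1, 15),
    "old-but-unknown": date(2019, 9, 9),
    "never-published": None,
}
for pkg, published in suggestions.items():
    print(pkg, vet_dependency(pkg, published, ALLOWLIST, MODEL_CUTOFF))
```

Run a check like this in CI so that a rubber-stamped suggestion still fails before install.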

◆ Bottom line

The take.

The AI platform layer split into three incompatible business models this week: OpenAI is building a $100B ad network, Anthropic is building vertical services companies for Wall Street, and Google is supplying the infrastructure. At the same time, Apple is letting a billion users pick their AI provider once and forget about it, and dynamic pricing bans in 33+ states create the first hard regulatory deadline (Oct 1, 2026) any pricing PM needs on their wall. The model you picked is no longer a moat; the workflow you own, the data you hold, and the compliance architecture you ship this quarter are the only things left that survive all three shifts simultaneously.

— Promit, reading as Product ·

Frequently asked

What does the iOS 27 default model picker actually change for AI products?
It collapses AI distribution to a single Settings choice that users make once and rarely revisit. Whoever isn't selected as the default provider gets effectively zero query volume from system-wide AI requests like summarize, draft, or generate, because there's no in-the-moment browsing — the choice sits next to Bluetooth and stays there for months.
How do I tell if my product is exposed to the default-assistant threat?
Run a three-cell diagnostic. If you're a foundation model, fight for the partner list now. If you're a model wrapper, the default slot will eat your distribution. If you're an application built on a model, you're safe only if you own data or workflow the assistant can't replicate — separate sessions where intent forms inside your product from those where users arrive with a question already formed.
Which Q1 2026 cost assumptions should I rebuild first?
Three curves moved at once: SubQ's 12M-token context at roughly 1/1000th frontier compute, Multi-Token Prediction drafters delivering 3x throughput with no quality loss, and inference routers cutting costs ~61%. Pull features shelved for unit economics in Q4 2025 or Q1 2026 and re-run the math — many are shippable now, and ~40% of typical RAG chunking work may be solving a constraint that just disappeared.
Why are vision agents flagged as a trap rather than a capability win?
Their cost runs about 45x structured API paths, and better models don't close the gap because screenshot token volume — not accuracy — drives spend. Tag every automation backlog item by whether a stable schema or API exists; if it does, route it through structured calls. Reserve vision for genuinely variable interfaces where it earns the multiplier.
What's the minimum viable response to Maryland's October 1, 2026 pricing law?
Audit every dynamic pricing input and separate individual signals (purchase history, device type, behavior) from contextual and temporal ones (location, time, inventory, uniform demand tiers). Individual-signal price discrimination is the target; time-based, volume-based, and membership pricing likely survive. Architect jurisdictional scoping now — re-engineering a pricing system with auditable decision logs is roughly four quarters of work, and ~33 states have overlapping bills in motion.
