Apple Foundation Models Opens to Claude at WWDC 2026

Apple's new LanguageModel protocol lets iOS apps swap between the on-device model and Claude with one argument change

Apple's Foundation Models framework now accepts Claude as a drop-in model provider. Announced at WWDC 2026 on June 9 and backed by Anthropic's official ClaudeForFoundationModels Swift package, the change gives iOS and macOS developers a single LanguageModelSession API that works identically whether the model running underneath is Apple's on-device system model, Claude Sonnet, Claude Opus, or a local MLX model pulled from Hugging Face.

The practical shift is in how providers are swapped. Previously, shipping an AI feature in an Apple app meant choosing a provider and writing to its API directly, with no standard interface and no easy path to switch. Now the swap is a one-argument change.

What exactly changed in the Foundation Models framework at WWDC 2026

Before this release, Foundation Models was Apple's on-device AI layer and nothing else. The LanguageModelSession API worked only with Apple's own system model and Private Cloud Compute backend. Swapping to Claude meant scrapping the Foundation Models integration entirely and writing against the Anthropic Messages API separately.

WWDC 2026 introduces two new protocol types: LanguageModel and LanguageModelExecutor. Any conforming implementation, Apple's own or a third party's, can back a LanguageModelSession. Anthropic shipped ClaudeForFoundationModels, a Swift package that makes Claude a conforming provider, on June 8, one day before WWDC. The package is versioned at 0.1.0, targets iOS 27, macOS 27, visionOS 27, and watchOS 27 (all in beta), and requires Xcode 27.

The session API stays identical across providers:

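import Foundation
import FoundationModels
import ClaudeForFoundationModels

// Apple's on-device system model
let onDevice = SystemLanguageModel()
// Claude via Anthropic's package. .apiKey is for development only;
// production apps should use .proxied(headers:) so no key ships in the binary.
let cloud = ClaudeLanguageModel(
  name: .sonnet4_6,
  auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)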

// same call, different model underneath
let session = LanguageModelSession(model: cloud)
let response = try await session.respond(to: "Summarize these meeting notes.")
print(response.content)

Switching from onDevice to cloud is one argument change. Streaming, structured output via @Generable, and client-side tool use all work through the same session.streamResponse(to:) and session.respond(to:generating:) calls regardless of which model backs the session.

Key Takeaway

Requests from the ClaudeForFoundationModels package go directly from the app to the Anthropic API. Apple is not in the request path and does not see prompts or responses. Usage is billed to your Anthropic account at standard API pricing. For production apps, route requests through a proxy using .proxied(headers:) authentication so no API key ships in the binary.

Google shipped an equivalent Swift package for Gemini through the same protocol. Both providers participate in the new ecosystem Apple described in the WWDC 2026 session "Bring an LLM Provider to the Foundation Models Framework", where the session lead framed the goal as "a thriving ecosystem where developers choose the right model for their app while using a unified, familiar API."

How does the on-device versus Claude routing decision work in practice

Apple's on-device system model is fast, private, and works without a network connection. It has a smaller context window and weaker reasoning than Claude but costs nothing to call and produces no outbound traffic. Claude's current models offer a 200K context window, stronger code generation, and server-side tools that can browse the web or execute code within a single round trip.

The routing pattern the Anthropic docs recommend is: start with SystemLanguageModel for lightweight tasks (classification, short-form Q&A, quick lookups), catch LanguageModelError.contextSizeExceeded or other capacity errors, and escalate to ClaudeLanguageModel for that turn. Because the session API is uniform, the fallback logic is a single do-catch, not a full API rewrite.

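import Foundation
import FoundationModels
import ClaudeForFoundationModels

let smallModel = SystemLanguageModel()
// Development-only auth; use .proxied(headers:) in production.
let bigModel = ClaudeLanguageModel(
  name: .sonnet4_6,
  auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)

func complete(_ prompt: String) async throws -> String {
  do {
    // Try the free on-device model first.
    let session = LanguageModelSession(model: smallModel)
    return try await session.respond(to: prompt).content
  } catch LanguageModelError.contextSizeExceeded {
    // The prompt did not fit on device, so escalate this turn to Claude.
    let session = LanguageModelSession(model: bigModel)
    return try await session.respond(to: prompt).content
  }
}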

Server-side tools, which include web search, web fetch, and code execution running on Anthropic infrastructure, are configured per ClaudeLanguageModel instance via serverTools: rather than on the session, because the session type is Apple's and carries no knowledge of Claude-specific capabilities. You construct separate model instances for different tool configurations.

[Figure: flow chart, "How to route between on-device and Claude". A prompt that is short and offline-safe goes to the on-device model (fast, free, private). Any other prompt goes to Claude (larger context, server tools) only if it exceeds the on-device context; if it does not, it stays on device. Both paths end at the response.]
The escalation pattern: start with Apple's free on-device model and escalate to Claude when context size or reasoning quality demands it. Both models return responses through the same Foundation Models session API.

What is the Dynamic Profiles system Apple announced alongside this

Dynamic Profiles is a new Foundation Models capability that lets an app swap the model, tool set, and system instructions mid-session without restarting the conversation. Apple is positioning it as the foundation for multi-agent workflows where different steps in a task use different models or capabilities.

In practice this means a single app session can start with the on-device model to parse a user request cheaply, switch to Claude with web search enabled for research, and return to the on-device model for local formatting, all within one conversation transcript that flows through the session without interruption. The context built up across model switches is preserved in the transcript structure, which carries instructions, prompts, tool calls, tool outputs, and prior responses as typed entries.

The combination of the unified LanguageModel protocol and Dynamic Profiles is what makes a sensible cost-tiering architecture possible at the app level. Previous approaches required the developer to manage multiple separate API clients and stitch the context manually.
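
To make the transcript idea concrete, here is a toy model in plain Swift. It is not Apple's Dynamic Profiles API, whose actual types are not shown in this article; ToyEntry, ToyProfile, and ToySession are hypothetical names invented for illustration. The sketch only shows the property described above: switching profile changes the model, tools, and instructions while the typed transcript keeps accumulating.

```swift
// Toy model only: every type here is hypothetical and is NOT Apple's Dynamic Profiles API.

/// One typed entry in the shared conversation transcript.
enum ToyEntry: Equatable {
  case instructions(String)
  case prompt(String)
  case toolCall(name: String, arguments: String)
  case toolOutput(String)
  case response(model: String, text: String)
}

/// A profile bundles the three things that can change mid-session.
struct ToyProfile {
  let modelName: String
  let instructions: String
  let toolNames: [String]
}

/// A session that keeps one transcript while the active profile changes.
struct ToySession {
  private(set) var transcript: [ToyEntry] = []
  private(set) var profile: ToyProfile

  init(profile: ToyProfile) {
    self.profile = profile
    transcript.append(.instructions(profile.instructions))
  }

  /// Switching profiles appends the new instructions; earlier entries are kept.
  mutating func switchProfile(to newProfile: ToyProfile) {
    profile = newProfile
    transcript.append(.instructions(newProfile.instructions))
  }

  /// Records a prompt and a stand-in response from whichever model is active.
  mutating func respond(to prompt: String) {
    transcript.append(.prompt(prompt))
    transcript.append(.response(model: profile.modelName, text: "(reply to: \(prompt))"))
  }
}

var session = ToySession(
  profile: ToyProfile(modelName: "on-device", instructions: "Parse the request.", toolNames: [])
)
session.respond(to: "Find recent reviews of this product")
session.switchProfile(
  to: ToyProfile(modelName: "claude", instructions: "Research with web search.", toolNames: ["web_search"])
)
session.respond(to: "Summarize what you found")
print(session.transcript.count)  // 6 entries, all from one continuous conversation
```

The point of the sketch is that nothing is copied or stitched by hand at the switch: the second model sees the same entries the first one produced.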

What about the free compute tier and open-source plans

Apple announced two cost-relevant items alongside the protocol changes. First, developers with fewer than two million cumulative first-time App Store downloads can use Apple's Private Cloud Compute model for free. Private Cloud Compute runs a cloud-hosted version of Apple's own model with privacy guarantees (Apple commits that it does not see prompts or logs). This is separate from the Claude or Gemini providers, which bill to your Anthropic or Google account. For small developers building their first AI feature, Private Cloud Compute provides a capable cloud model with no external API cost while the app is still gaining early traction.

Second, Apple confirmed the Foundation Models framework will go open source later this summer, along with two companion implementations: CoreAILanguageModel and MLXLanguageModel. CoreAILanguageModel runs third-party local models on the Apple Neural Engine. MLXLanguageModel plugs into the MLX ecosystem, which covers thousands of community models on Hugging Face that run on the Mac GPU. Once these land, the full spectrum from local open-weight models to Claude's frontier models will be available through the same session API, with no changes beyond the model argument.

Apple also shipped a Python SDK for Foundation Models and a new fm CLI tool in macOS 27. fm chat opens a terminal session with the on-device model directly, giving developers a quick way to test prompts without opening an app.

[Figure: spectrum chart, "The full Foundation Models spectrum". Five tiers from left to right: on-device system model (free, private, offline); Private Cloud Compute (free under 2M downloads); CoreAI local models (Apple Neural Engine); MLX models from Hugging Face (thousands of models); Claude and Gemini (billed to your API account). All five sit on the same LanguageModelSession API.]
All five model tiers use the identical LanguageModelSession API. Cost and capability scale from left to right. Small developers get free cloud compute from Apple before they need to pay for Claude or Gemini.

What should vibecoders building iOS apps do right now

Three practical steps apply immediately, though the stack is in beta.

Audit which tasks need Claude and which do not. The on-device system model handles classification, short summarization, keyword extraction, and simple formatting at zero cost. Claude earns its API fee on tasks requiring large context (codebase review, document ingestion), multi-step reasoning, or server-side tools (web search, code execution). Mapping your app's AI calls to one of these two buckets now will make the routing implementation straightforward when you adopt the new framework.

Do not ship an API key in the binary. The official docs are explicit: .apiKey auth is for development only. A key bundled into a shipped binary is extractable. Production apps should route through a proxy using .proxied(headers:), which sends a caller-controlled header to your backend; your backend adds the x-api-key and forwards to the Anthropic API. Vibecoders who have built backend-for-frontend patterns for web apps already have this shape of server.

Target macOS 27 or iOS 27 beta. The Foundation Models server-side language model API (the part that enables third-party providers like Claude) requires OS 27. It is in beta as of WWDC, with general availability expected when OS 27 ships later in 2026. You can build and test today with Xcode 27 beta. If your app needs to support older OS versions, you will need a conditional import path or a separate API client for the earlier versions.
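
The audit in the first step can be written down as a small lookup before any framework code exists. The sketch below is plain Swift with no dependency on FoundationModels; TaskKind, ModelTier, and tier(for:) are hypothetical names, and the assignments simply restate the two buckets described above.

```swift
// Hypothetical names throughout: TaskKind, ModelTier, and tier(for:) are illustrations,
// not part of FoundationModels or ClaudeForFoundationModels.

/// The kinds of AI call named in the audit step.
enum TaskKind: CaseIterable {
  case classification, shortSummarization, keywordExtraction, simpleFormatting
  case largeContext, multiStepReasoning, serverSideTools
}

/// The two buckets: the free on-device model, or Claude billed per call.
enum ModelTier {
  case onDevice
  case claude
}

/// Maps each task kind to the bucket the audit step assigns it.
func tier(for task: TaskKind) -> ModelTier {
  switch task {
  case .classification, .shortSummarization, .keywordExtraction, .simpleFormatting:
    return .onDevice
  case .largeContext, .multiStepReasoning, .serverSideTools:
    return .claude
  }
}

// Print the audit table for every task kind.
for task in TaskKind.allCases {
  print("\(task) -> \(tier(for: task))")
}
```

Because the switch is exhaustive, adding a new task kind later forces a decision about which bucket it belongs to at compile time.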

Common Mistake

Configuring server-side tools (web search, code execution) on the LanguageModelSession instead of on ClaudeLanguageModel. Because the session type is Apple's, it has no knowledge of Claude-specific server tools. They must be set on the model instance via serverTools: at construction time. If you need different tool sets for different conversations, construct separate ClaudeLanguageModel instances rather than trying to reconfigure a session mid-use.

How does authentication work without exposing keys

The ClaudeForFoundationModels package offers two authentication modes. .apiKey takes a string directly and is appropriate for development and sandboxed testing. .proxied(headers:) takes a dictionary of HTTP headers that are sent with every request to a URL you provide via baseURL:. Your server at that URL receives a standard Anthropic Messages API request, adds the x-api-key header server-side, and forwards it to https://api.anthropic.com. The app ships only a session token or opaque header, not the Anthropic key itself.

This proxy shape is the same pattern vibecoders already use to protect other third-party API keys in mobile apps, and it fits naturally into a Cloudflare Worker or a lightweight Next.js API route. If you already have a backend-for-frontend for your web product, adding a Claude proxy endpoint is a small addition.
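
The header handling at the heart of such a proxy is small. The sketch below shows it in plain Swift to match the rest of this post, though the same logic carries over to a Worker or API route. The x-session-token header name, isValidSession, and forwardedHeaders are hypothetical; x-api-key and anthropic-version are the Anthropic API's real header names. It assumes the proxy rebuilds the upstream headers from scratch, and whether the package already sends anthropic-version itself is not something this article confirms.

```swift
// Sketch of the header handling a proxy performs before forwarding to Anthropic.
// Hypothetical: the "x-session-token" header name, isValidSession, and forwardedHeaders.
// Real Anthropic API header names: x-api-key and anthropic-version.

/// Stand-in session check; a real backend would look the token up in its own store.
func isValidSession(_ token: String) -> Bool {
  return !token.isEmpty
}

/// Builds the headers for the upstream request, or returns nil to reject the call.
func forwardedHeaders(incoming: [String: String], anthropicKey: String) -> [String: String]? {
  // Header names are case-insensitive, so normalize before looking anything up.
  var lowered: [String: String] = [:]
  for (name, value) in incoming {
    lowered[name.lowercased()] = value
  }

  guard let token = lowered["x-session-token"], isValidSession(token) else {
    return nil
  }

  // Start from a clean set so no client-supplied credential is passed upstream.
  return [
    "content-type": "application/json",
    "anthropic-version": "2023-06-01",
    "x-api-key": anthropicKey,
  ]
}

let accepted = forwardedHeaders(
  incoming: ["X-Session-Token": "abc123", "x-api-key": "client-supplied"],
  anthropicKey: "server-side-secret"
)
print(accepted?["x-api-key"] ?? "rejected")  // prints "server-side-secret"

let rejected = forwardedHeaders(incoming: [:], anthropicKey: "server-side-secret")
print(rejected == nil)  // prints "true"
```

Rebuilding the header set, rather than copying the client's headers and adding one, means a forged x-api-key from the app can never reach Anthropic.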

The package documentation also mentions App Attest for device verification, which Apple recommends as an additional layer for production cloud-model calls. App Attest lets your proxy server confirm that a request came from a legitimate, unmodified install of your app before it adds the Anthropic credential.

The full technical spec for the LanguageModel and LanguageModelExecutor protocols, including how to write your own provider conformance, is covered in WWDC 2026 session 339. The runnable ClaudeExample command-line target in the GitHub repo is a good starting point for local testing before you integrate into an existing app.

Pranay Joshi

20+ years building products at scale. VP of Product & Engineering, startup founder, and AI coach. Helping dreamers turn ideas into reality with vibe coding.
