Apple's Foundation Models framework now accepts Claude as a drop-in model provider. Announced at WWDC 2026 on June 9 and backed by Anthropic's official ClaudeForFoundationModels Swift package, the change gives iOS and macOS developers a single LanguageModelSession API that works identically whether the model running underneath is Apple's on-device system model, Claude Sonnet, Claude Opus, or a local MLX model pulled from Hugging Face.
The practical shift is in how much code a provider change costs. Previously, shipping an AI feature in an Apple app meant choosing a provider and writing to its API directly, with no standard interface and no easy path to swap. Now the swap is one argument.
What exactly changed in the Foundation Models framework at WWDC 2026
Before this release, Foundation Models was Apple's on-device AI layer and nothing else. The LanguageModelSession API worked only with Apple's own system model and Private Cloud Compute backend. Swapping to Claude meant scrapping the Foundation Models integration entirely and writing against the Anthropic Messages API separately.
WWDC 2026 introduces two new protocol types: LanguageModel and LanguageModelExecutor. Any conforming implementation, Apple's own or a third party's, can back a LanguageModelSession. Anthropic shipped ClaudeForFoundationModels, a Swift package that conforms Claude to these protocols, on June 8, one day before WWDC. The package is versioned at 0.1.0, targets iOS 27, macOS 27, visionOS 27, and watchOS 27 (all in beta), and requires Xcode 27.
The session API stays identical across providers:
import Foundation
import FoundationModels
import ClaudeForFoundationModels

// Apple's on-device system model
let onDevice = SystemLanguageModel()

// Claude via the Anthropic API (.apiKey auth is for development only)
let cloud = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)

// same call, different model underneath
let session = LanguageModelSession(model: cloud)
let response = try await session.respond(to: "Summarize these meeting notes.")
print(response.content)
Switching from onDevice to cloud is one argument change. Streaming, structured output via @Generable, and client-side tool use all work through the same session.streamResponse(to:) and session.respond(to:generating:) calls regardless of which model backs the session.
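To see why the swap costs only one argument, here is a minimal, self-contained sketch of the underlying idea: a session type that depends on a protocol rather than on a concrete provider. Every name in it (SketchLanguageModel, SketchSession, and so on) is a hypothetical stand-in invented for illustration, not the FoundationModels API, and it is synchronous to keep it short, whereas the real session calls are async.

```swift
// Stand-in types for illustration only; they are NOT the FoundationModels API.
protocol SketchLanguageModel {
    var label: String { get }
    func generate(prompt: String) throws -> String
}

struct SketchOnDeviceModel: SketchLanguageModel {
    let label = "on-device"
    func generate(prompt: String) throws -> String {
        "[\(label)] \(prompt)"
    }
}

struct SketchCloudModel: SketchLanguageModel {
    let label = "cloud"
    func generate(prompt: String) throws -> String {
        "[\(label)] \(prompt)"
    }
}

// The session depends only on the protocol, so it never names a provider.
struct SketchSession {
    let model: any SketchLanguageModel

    func respond(to prompt: String) throws -> String {
        try model.generate(prompt: prompt)
    }
}

// Same call site; only the argument passed as `model` differs.
let local = try SketchSession(model: SketchOnDeviceModel()).respond(to: "hello")
let remote = try SketchSession(model: SketchCloudModel()).respond(to: "hello")
print(local)
print(remote)
```

Because the call site only ever touches the session, adding a third provider in this sketch means writing one more conforming type and nothing else.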
Requests from the ClaudeForFoundationModels package go directly from the app to the Anthropic API. Apple is not in the request path and does not see prompts or responses. Usage is billed to your Anthropic account at standard API pricing. For production apps, route requests through a proxy using .proxied(headers:) authentication so no API key ships in the binary.
Google shipped an equivalent Swift package for Gemini through the same protocol. Both providers participate in the new ecosystem Apple described in the WWDC 2026 session "Bring an LLM Provider to the Foundation Models Framework", where the session lead framed the goal as "a thriving ecosystem where developers choose the right model for their app while using a unified, familiar API."
How does the on-device versus Claude routing decision work in practice
Apple's on-device system model is fast, private, and works without a network connection. It has a smaller context window and weaker reasoning than Claude but costs nothing to call and produces no outbound traffic. Claude's current models offer a 200K context window, stronger code generation, and server-side tools that can browse the web or execute code within a single round trip.
The routing pattern the Anthropic docs recommend is: start with SystemLanguageModel for lightweight tasks (classification, short-form Q&A, quick lookups), catch LanguageModelError.contextSizeExceeded or other capacity errors, and escalate to ClaudeLanguageModel for that turn. Because the session API is uniform, the fallback logic is a try-catch, not a full API rewrite.
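import Foundation
import FoundationModels
import ClaudeForFoundationModels

let smallModel = SystemLanguageModel()
// .apiKey is for development only; use .proxied(headers:) in production.
let bigModel = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)

func complete(_ prompt: String) async throws -> String {
    do {
        // Try the free on-device model first.
        let session = LanguageModelSession(model: smallModel)
        return try await session.respond(to: prompt).content
    } catch LanguageModelError.contextSizeExceeded {
        // The prompt did not fit on-device, so escalate this turn to Claude.
        let session = LanguageModelSession(model: bigModel)
        return try await session.respond(to: prompt).content
    }
}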
Server-side tools, which include web search, web fetch, and code execution running on Anthropic infrastructure, are configured per ClaudeLanguageModel instance via serverTools: rather than on the session, because the session type is Apple's and carries no knowledge of Claude-specific capabilities. You construct separate model instances for different tool configurations.
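One way to organize "separate model instances for different tool configurations" is a small factory keyed by conversation type. The sketch below uses hypothetical stand-in types (SketchServerTool, SketchClaudeModel, ConversationKind); the real ClaudeLanguageModel initializer and the values its serverTools: parameter accepts may look different, so treat this as the shape of the idea only.

```swift
// Stand-in types for illustration; not the ClaudeForFoundationModels API.
enum SketchServerTool: Hashable {
    case webSearch, webFetch, codeExecution
}

struct SketchClaudeModel {
    let name: String
    let serverTools: Set<SketchServerTool>  // fixed at construction time
}

// Hypothetical app-level categories of conversation.
enum ConversationKind {
    case research, dataAnalysis, plainChat
}

// Each kind of conversation gets its own model instance with its own tools,
// because the tool set cannot be changed on the session afterwards.
func model(for kind: ConversationKind) -> SketchClaudeModel {
    switch kind {
    case .research:
        return SketchClaudeModel(name: "sonnet", serverTools: [.webSearch, .webFetch])
    case .dataAnalysis:
        return SketchClaudeModel(name: "sonnet", serverTools: [.codeExecution])
    case .plainChat:
        return SketchClaudeModel(name: "sonnet", serverTools: [])
    }
}

let researchModel = model(for: .research)
print(researchModel.serverTools.count)
```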

What is the Dynamic Profiles system Apple announced alongside this
Dynamic Profiles is a new Foundation Models capability that lets an app swap the model, tool set, and system instructions mid-session without restarting the conversation. Apple is positioning it as the foundation for multi-agent workflows where different steps in a task use different models or capabilities.
In practice this means a single app session can start with the on-device model to parse a user request cheaply, switch to Claude with web search enabled for research, and return to the on-device model for local formatting, all within one conversation transcript that flows through the session without interruption. The context built up across model switches is preserved in the transcript structure, which carries instructions, prompts, tool calls, tool outputs, and prior responses as typed entries.
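The idea of a transcript that survives model switches can be sketched with a plain enum of typed entries. This is an illustrative model only: SketchEntry and SketchConversation are hypothetical names, the canned reply text stands in for a real model call, and the actual transcript type in Foundation Models may be structured differently.

```swift
// Stand-in types for illustration; not the Foundation Models transcript API.
enum SketchEntry: Equatable {
    case instructions(String)
    case prompt(String)
    case toolCall(name: String, arguments: String)
    case toolOutput(name: String, output: String)
    case response(model: String, text: String)
}

struct SketchConversation {
    private(set) var transcript: [SketchEntry] = []
    private(set) var activeModel: String

    init(instructions: String, model: String) {
        activeModel = model
        transcript.append(.instructions(instructions))
    }

    // Switching changes only who answers next; the history is untouched.
    mutating func switchModel(to model: String) {
        activeModel = model
    }

    // Records the prompt and a canned reply attributed to the active model.
    mutating func ask(_ prompt: String) {
        transcript.append(.prompt(prompt))
        transcript.append(.response(model: activeModel, text: "(reply from \(activeModel))"))
    }
}

var chat = SketchConversation(instructions: "Be brief.", model: "on-device")
chat.ask("Parse this request")
chat.switchModel(to: "claude")
chat.ask("Research it")
chat.switchModel(to: "on-device")
chat.ask("Format the result")

// Collect which model produced each response, in order.
let answeredBy = chat.transcript.compactMap { entry -> String? in
    if case .response(let model, _) = entry { return model }
    return nil
}
print(answeredBy)
```

The point of the sketch is that the array of entries is the single source of context: whichever model is active next receives the same history, so nothing has to be stitched together by hand.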
The combination of the unified LanguageModel protocol and Dynamic Profiles is what makes a sensible cost-tiering architecture possible at the app level. Previous approaches required the developer to manage multiple separate API clients and stitch the context manually.
The Vibe Coder Blog covers AI coding tool changes with a focus on what builders can act on today.
What about the free compute tier and open-source plans
Apple announced two cost-relevant items alongside the protocol changes. First, developers with fewer than two million cumulative first-time App Store downloads can use Apple's Private Cloud Compute model for free. Private Cloud Compute runs a cloud-hosted version of Apple's own model with privacy guarantees (Apple commits that it does not see prompts or logs). This is separate from the Claude or Gemini providers, which bill to your Anthropic or Google account. For small developers building their first AI feature, Private Cloud Compute provides a capable cloud model with no external API cost while the app is still gaining early traction.
Second, Apple confirmed the Foundation Models framework will go open source later this summer, along with two companion implementations: CoreAILanguageModel and MLXLanguageModel. CoreAILanguageModel runs third-party local models on the Apple Neural Engine. MLXLanguageModel plugs into the MLX ecosystem, which covers thousands of community models on Hugging Face, runnable on the Mac GPU. Once these land, the full spectrum from local open-weight models to Claude's frontier models will be available through the same session API with no code changes.
Apple also shipped a Python SDK for Foundation Models and a new fm CLI tool in macOS 27. fm chat opens a terminal session with the on-device model directly, giving developers a quick way to test prompts without opening an app.

What should vibecoders building iOS apps do right now
Three practical steps apply immediately, though the stack is in beta.
Audit which tasks need Claude and which do not. The on-device system model handles classification, short summarization, keyword extraction, and simple formatting at zero cost. Claude earns its API fee on tasks requiring large context (codebase review, document ingestion), multi-step reasoning, or server-side tools (web search, code execution). Mapping your app's AI calls to one of these two buckets now will make the routing implementation straightforward when you adopt the new framework.
Do not ship an API key in the binary. The official docs are explicit: .apiKey auth is for development only. A key bundled into a shipped binary is extractable. Production apps should route through a proxy using .proxied(headers:), which sends a caller-controlled header to your backend; your backend adds the x-api-key and forwards to the Anthropic API. Vibecoders who have built backend-for-frontend patterns for web apps already have this shape of server.
Target macOS 27 or iOS 27 beta. The Foundation Models server-side language model API (the part that enables third-party providers like Claude) requires OS 27. It is in beta as of WWDC, with general availability expected when OS 27 ships later in 2026. You can build and test today with Xcode 27 beta. If your app needs to support older OS versions, you will need a conditional import path or a separate API client for the earlier versions.
One common mistake to avoid: configuring server-side tools (web search, code execution) on the LanguageModelSession instead of on ClaudeLanguageModel. Because the session type is Apple's, it has no knowledge of Claude-specific server tools. They must be set on the model instance via serverTools: at construction time. If you need different tool sets for different conversations, construct separate ClaudeLanguageModel instances rather than trying to reconfigure a session mid-use.
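Going back to the audit step, the two-bucket mapping can be written down as a small lookup before any framework code exists. The sketch below is an assumption-laden illustration: AITask and ModelTier are hypothetical names, and the case list simply mirrors the task types named in the audit step above, so adjust it to your app's actual calls.

```swift
// Hypothetical task categories, taken from the examples in the audit step.
enum AITask {
    case classification, shortSummary, keywordExtraction, simpleFormatting
    case codebaseReview, documentIngestion, multiStepReasoning
    case webSearch, codeExecution
}

enum ModelTier {
    case onDevice  // zero cost, no outbound traffic
    case claude    // billed per call, larger context, server-side tools
}

func tier(for task: AITask) -> ModelTier {
    switch task {
    case .classification, .shortSummary, .keywordExtraction, .simpleFormatting:
        return .onDevice
    case .codebaseReview, .documentIngestion, .multiStepReasoning:
        // Large context or multi-step reasoning
        return .claude
    case .webSearch, .codeExecution:
        // Server-side tools exist only on the Claude side
        return .claude
    }
}

print(tier(for: .keywordExtraction))
print(tier(for: .codebaseReview))
```

Because the switch is exhaustive, adding a new task category later forces a compile-time decision about which bucket it belongs to.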
How does authentication work without exposing keys
The ClaudeForFoundationModels package offers two authentication modes. .apiKey takes a string directly and is appropriate for development and sandboxed testing. .proxied(headers:) takes a dictionary of HTTP headers that are sent with every request to a URL you provide via baseURL:. Your server at that URL receives a standard Anthropic Messages API request, adds the x-api-key header server-side, and forwards it to https://api.anthropic.com. The app ships only a session token or opaque header, not the Anthropic key itself.
This proxy shape is the same pattern vibecoders already use to protect other third-party API keys in mobile apps, and it fits naturally into a Cloudflare Worker or a lightweight Next.js API route. If you already have a backend-for-frontend for your web product, adding a Claude proxy endpoint is a small addition.
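The core of such a proxy is the header-rewriting step, sketched here in Swift to match the rest of the article even though a real deployment would more likely be a Worker or API route. The function name, the ForwardedRequest type, and the x-session-token header are hypothetical choices for illustration; the x-api-key and anthropic-version header names and the /v1/messages endpoint are the ones the Anthropic Messages API uses.

```swift
// Hypothetical types; only the Anthropic header names and endpoint are real.
struct ForwardedRequest: Equatable {
    let url: String
    let headers: [String: String]
}

enum ProxyError: Error, Equatable {
    case missingSessionToken
    case unknownSessionToken
}

func forward(
    incomingHeaders: [String: String],
    validSessionTokens: Set<String>,
    anthropicKey: String
) throws -> ForwardedRequest {
    // HTTP header names are case-insensitive, so normalize before lookup.
    let normalized = Dictionary(
        incomingHeaders.map { ($0.key.lowercased(), $0.value) },
        uniquingKeysWith: { first, _ in first }
    )

    // "x-session-token" is an assumed name for the app's opaque header.
    guard let token = normalized["x-session-token"] else {
        throw ProxyError.missingSessionToken
    }
    guard validSessionTokens.contains(token) else {
        throw ProxyError.unknownSessionToken
    }

    // Build outbound headers from scratch so nothing client-supplied,
    // including a forged x-api-key, is passed through.
    return ForwardedRequest(
        url: "https://api.anthropic.com/v1/messages",
        headers: [
            "x-api-key": anthropicKey,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        ]
    )
}

let outbound = try forward(
    incomingHeaders: ["X-Session-Token": "abc", "x-api-key": "forged"],
    validSessionTokens: ["abc"],
    anthropicKey: "sk-test"
)
print(outbound.headers.keys.sorted())
```

Rebuilding the header set rather than copying and patching the incoming one is a deliberate choice in this sketch: it keeps the proxy from ever relaying a credential the client made up.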
The package documentation also mentions App Attest for device verification, which Apple recommends as an additional layer for production cloud-model calls. App Attest lets your proxy server confirm the request came from a legitimate, unmodified install of your app before adding the Anthropic credential.
The full technical spec for the LanguageModel and LanguageModelExecutor protocols, including how to write your own provider conformance, is covered in WWDC 2026 session 339. The runnable ClaudeExample command-line target in the GitHub repo is a good starting point for local testing before you integrate into an existing app.