Apple's Third-Generation Foundation Models

Five models, distributed between iPhones, Apple servers, and third-party cloud hardware: At WWDC 2026, Apple unveiled the third generation of its Foundation Models. For the first time, one of the models does not run on Apple Silicon, but on Nvidia chips in Google's data centers – a break with a previously ironclad principle.

With the third generation of Apple Foundation Models, internally abbreviated as AFM, Apple is reorganizing the foundation of its entire AI strategy. Instead of a single model, the system now consists of five specialized models that, depending on the task, run on the device, on Apple's own servers, or on third-party infrastructure. The step is the provisional culmination of a realignment Apple has been preparing over the past few months: rebuilding its AI future around Siri and Google's Gemini technology. Only now is it becoming clear how this architecture is structured in detail, and at what point Apple is abandoning its strict "everything on its own hardware" principle.

From the 2024 launch to the Google partnership

When Apple first introduced its Foundation Models in 2024, the lineup consisted of two components: a language model with around three billion parameters that ran directly on the device, and a larger, server-based language model. The latter was tied to Private Cloud Compute and ran on servers with Apple Silicon chips.

Private Cloud Compute, or PCC for short, was an ambitious project from the outset. It was designed to provide cloud-based AI capabilities while maintaining the same data privacy guarantees that users are accustomed to from processing directly on their devices. To achieve this, control over all the hardware was essential: PCC ran in Apple's own data centers on Apple Silicon servers, and the data privacy guarantees could be reviewed by independent security researchers.

However, Apple wasn't progressing quickly enough with its own AI ambitions. As a result, the company partnered with Google and is using Google's Gemini technology as the backbone of its new AI efforts. Apple emphasizes that this isn't simply Gemini on the iPhone, but rather models built on the Gemini platform and modified specifically for Apple. Apple presented the results of this collaboration at WWDC 2026.

Five models, two of them on the device

The third generation of AFM comprises five models, distributed across three levels. Two run directly on the device, two on Apple's servers, and one on third-party hardware:

Model	Where it runs	Task
AFM 3 Core	On the device	Basic model with three billion parameters, significantly improved quality
AFM 3 Core Advanced	On the device	Most powerful on-device model, natively multimodal, 20 billion parameters
AFM 3 Cloud	Apple Silicon servers	Server workhorse, optimized for speed, efficiency, and performance
ADM 3 Cloud (Image)	Apple Silicon servers	Image generation and editing; powers Image Playground, among other things
AFM 3 Cloud Pro	Nvidia GPUs on Google Cloud	The most demanding tasks, such as agent-based tool use and complex reasoning

The "D" in the name ADM 3 Cloud (Image) stands for diffusion, the technology behind image generation. All models except one were designed to run on Apple Silicon, either on the device itself or on Apple's servers. The exception is AFM 3 Cloud Pro, which runs on Nvidia GPUs hosted in Google Cloud. This was made possible because Apple extended its Private Cloud Compute architecture to third-party infrastructure for the first time, reportedly without compromising security and privacy protections. The two most interesting models are AFM 3 Core Advanced and AFM 3 Cloud Pro.

AFM 3 Core Advanced: 20 billion parameters on the device

AFM 3 Core Advanced packs 20 billion parameters into a model that runs directly on the device. That is a remarkable feat, as most on-device models intended for the general public remain in the low single-digit billions. The model is also natively multimodal, enabling features like more expressive voices and more precise dictation that you'll notice immediately in everyday use. It is unlocked on, and optimized for, Apple's most powerful Apple Silicon systems.

To make a 20-billion-parameter model function effectively on a device, Apple employs a so-called sparse architecture. Instead of keeping all 20 billion parameters active for every query, as in a dense architecture, the model activates only up to four billion parameters simultaneously, depending on the input. Conceptually, this is similar to the mixture-of-experts approach, but it is based on a technique developed by Apple itself, which the company described a year ago in the study "Instruction-Following Pruning for Large Language Models."
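The idea of input-dependent sparse activation can be illustrated with a small toy sketch. It is not Apple's method and not the technique from the cited study. Only the 20-billion total and the four-billion budget come from the article. The split into ten equal blocks, the linear "mask predictor," and all function names are assumptions made purely for illustration.

```python
# Toy sketch of input-dependent sparse activation (NOT Apple's implementation).
TOTAL_PARAMS_B = 20    # from the article: 20 billion parameters in total
ACTIVE_BUDGET_B = 4    # from the article: up to four billion active at once
NUM_BLOCKS = 10        # assumption: parameters split into equal-sized blocks
BLOCK_SIZE_B = TOTAL_PARAMS_B / NUM_BLOCKS

# Hypothetical mask predictor: one row of weights per parameter block.
PREDICTOR = [[float(i), float(NUM_BLOCKS - 1 - i)] for i in range(NUM_BLOCKS)]


def score_blocks(features, predictor=PREDICTOR):
    """Score every block for this input with a simple linear predictor."""
    return [sum(w * x for w, x in zip(row, features)) for row in predictor]


def select_active_blocks(scores, budget_b=ACTIVE_BUDGET_B):
    """Keep the highest-scoring blocks that fit into the activation budget."""
    max_blocks = int(budget_b // BLOCK_SIZE_B)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:max_blocks])


def active_fraction(active_blocks):
    """Share of all parameters that is active for the current input."""
    return len(active_blocks) * BLOCK_SIZE_B / TOTAL_PARAMS_B


if __name__ == "__main__":
    for features in ([1.0, 0.0], [0.0, 1.0]):
        active = select_active_blocks(score_blocks(features))
        print(features, "->", active, f"{active_fraction(active):.0%} active")
```

Different inputs select different blocks, but never more than the budget allows: at most four of 20 billion parameters, or 20 percent, are active at once. A dense model, by contrast, would use all blocks for every query.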

AFM 3 Cloud Pro and the opening of Private Cloud Compute

AFM 3 Cloud Pro is the model that runs on external infrastructure, representing a true break with Apple's previous approach. To deliver Gemini-based peak performance without compromising its data privacy commitment, Apple has, for the first time, extended its PCC architecture to hardware outside its own data centers. The extent of this move is evident in the security measures Apple has implemented in collaboration with Google, which underpin the expansion of Private Cloud Compute to Google Cloud.

Apple explicitly does not rely solely on confidential computing techniques to defend against attacks via privileged access or side channels. Instead, the company includes every component, from firmware and host and guest operating systems to application code, in its trusted computing base, which is subject to guarantees of verifiable transparency and the absence of privileged access. To combat supply chain attacks, Apple maintains a cryptographically verifiable, extensible directory of all Google Cloud hardware within the PCC network. For particularly sensitive components, software attestation relies on at least two separate trust anchors from independent providers. The processing stack itself follows the same patterns as on Apple Silicon: incoming data is initially processed in its own isolated process, shared software is recycled after a short time, and cryptographic keys reside in a separate, isolated environment.
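The rule of "at least two trust anchors from independent providers" can be sketched as a simple quorum check. This is a hedged illustration, not Apple's or Google's attestation protocol. The function names, anchor IDs, and provider labels are hypothetical, and HMAC-SHA256 stands in for the asymmetric signatures a real attestation scheme would use.

```python
import hashlib
import hmac


def sign(anchor_key: bytes, measurement: bytes) -> str:
    """Stand-in for a trust anchor's signature (HMAC-SHA256, illustration only)."""
    return hmac.new(anchor_key, measurement, hashlib.sha256).hexdigest()


def attested(measurement, endorsements, anchor_keys, provider_of, required=2):
    """Accept a software measurement only if valid endorsements come from
    at least `required` distinct providers."""
    providers = set()
    for anchor_id, tag in endorsements.items():
        key = anchor_keys.get(anchor_id)
        if key is None:
            continue  # unknown anchor: ignore its endorsement
        if hmac.compare_digest(tag, sign(key, measurement)):
            providers.add(provider_of[anchor_id])
    return len(providers) >= required


if __name__ == "__main__":
    # Hypothetical anchors: a1 and a3 belong to the same provider.
    keys = {"a1": b"key-one", "a2": b"key-two", "a3": b"key-three"}
    provider_of = {"a1": "provider-A", "a2": "provider-B", "a3": "provider-A"}
    measurement = hashlib.sha256(b"guest OS image").digest()

    independent = {a: sign(keys[a], measurement) for a in ("a1", "a2")}
    same_provider = {a: sign(keys[a], measurement) for a in ("a1", "a3")}
    print(attested(measurement, independent, keys, provider_of))
    print(attested(measurement, same_provider, keys, provider_of))
```

Counting distinct providers rather than distinct anchors is the point of the sketch: two endorsements from the same provider do not satisfy the rule, so compromising a single provider is not enough to pass attestation.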

How Apple trained the models

According to Apple's research blog, all five models initially shared a common foundation before being specialized for their respective architectures and use cases. This specialization added capabilities such as understanding audio and images, processing long contexts, and generating high-quality visuals.

For training, Apple used a mix of publicly available information, third-party licensed or purchased data, open-source data, data collected specifically for studies, and synthetic data. The company emphasizes that neither user data nor interactions were included in the training. Furthermore, web publishers can opt out of the Foundation Models training.

What the tests show

To evaluate the third generation, Apple relied on extensive human assessments. Internal testers evaluated the models' responses in categories such as following instructions, accuracy, presentation, and image understanding. Where possible, the new models were pitted against their predecessors.

The comparisons included global English as well as other language groups to demonstrate consistent performance across international variants. For the dictation function, Apple directly compared AFM 3 Core Advanced to the existing dictation system and observed a consistent improvement across seven quality dimensions. For a more in-depth analysis, you can find the complete comparison data on Apple's Machine Learning Research Blog.

Apple's AI architecture between device and third-party cloud

The third generation of AFM makes two things clear: with AFM 3 Core Advanced, Apple is moving a surprising amount of processing directly onto the device, while pragmatically relying on Google's Gemini technology and third-party hardware for the most demanding tasks. The real challenge lies less in the models themselves than in the attempt to uphold the privacy promise of Private Cloud Compute even when the computation takes place in a third-party data center. Whether this balancing act will live up to Apple's promises in practice will become clear once the new Siri and the other features are widely rolled out.


Frequently Asked Questions: Apple's Third Generation Foundation Models

What are the Apple Foundation Models?

Apple Foundation Models (AFM) are Apple's proprietary AI models that power the Apple Intelligence features. The third generation was introduced at WWDC 2026 and consists of five specialized models.

How many models does the third generation include?

Five: AFM 3 Core and AFM 3 Core Advanced run directly on the device, AFM 3 Cloud and ADM 3 Cloud (Image) run on Apple's servers, and AFM 3 Cloud Pro runs on third-party hardware.

Which model does not run on Apple Silicon?

AFM 3 Cloud Pro is the only model of this generation that does not run on Apple Silicon; it runs on Nvidia GPUs in Google Cloud instead. All other models run on Apple Silicon.

What makes AFM 3 Core Advanced special?

It brings 20 billion parameters directly to the device, an unusually high number for an on-device model. A sparse architecture activates only up to four billion parameters simultaneously, depending on the request.

Is data privacy still guaranteed despite using Google Cloud?

Apple has extended its Private Cloud Compute architecture to third-party infrastructure for the first time and, according to its own statements, carried over the same protection mechanisms, including a verifiable hardware directory and several independent trust anchors.

Was user data used for the training?

No. Apple emphasizes that neither user data nor interactions were used in the training. A mix of public, licensed, open-source, custom-collected, and synthetic data was used; web publishers can opt out.

What does the "D" in ADM 3 Cloud (Image) stand for?

It stands for diffusion, the technology behind image generation. This model powers, among other things, image editing tools and Image Playground.
