Apple has published a new study demonstrating how Large Language Models can use audio and motion data to identify a person's current activity. The research combines traditional sensor technology with AI and shows that even brief or incomplete data can yield reliable results. Apple thus brings to the fore a topic that could enable relevant applications in fitness, health, and everyday life.
Many devices today collect audio and motion data, but this raw data alone is often insufficient to clearly identify activities. Apple's new study therefore investigates an approach that uses LLMs to draw precise inferences from text descriptions. Instead of directly analyzing audio or motion data, the models are given short texts previously generated by smaller audio models and an IMU model. This allows them to recognize what is happening without needing a specially trained multimodal model.
How Apple uses LLMs
The paper, titled "Using LLMs for Late Multimodal Sensor Fusion for Activity Recognition," describes how Apple combines various information sources. The LLMs receive text about sounds, movements, and class predictions and use this information to infer the activity. This approach is less invasive because the model never accesses the actual audio recordings, only descriptive text labels.
The researchers argue that this approach offers significant advantages. Even if sensors provide only limited data, the LLM can combine the information to create a much clearer picture. This saves memory and computing power because there is no need to train or deploy specially adapted multimodal models.
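To make this concrete, the following minimal Python sketch shows what such a late-fusion prompt could look like: the language model only ever sees short text artifacts, here an audio caption, audio tags, and an IMU class prediction, combined into a single query. The function name, prompt wording, and example values are assumptions for illustration, not Apple's actual pipeline.

```python
# Minimal sketch of the late-fusion idea: the LLM never sees raw audio or motion
# signals, only short text artifacts produced by smaller upstream models.
# Function name, prompt wording, and example values are illustrative assumptions.

ACTIVITIES = [
    "vacuuming", "cooking", "doing laundry", "eating", "playing basketball",
    "playing soccer", "playing with pets", "reading a book",
    "sitting at the computer", "washing dishes", "watching television",
    "exercising or lifting weights",
]

def build_fusion_prompt(audio_caption: str, audio_labels: list[str],
                        imu_prediction: str) -> str:
    """Combine per-clip text artifacts into a single classification prompt."""
    return (
        "The following descriptions were derived from a 20-second clip.\n"
        f"Audio caption: {audio_caption}\n"
        f"Audio tags: {', '.join(audio_labels)}\n"
        f"Motion (IMU) model prediction: {imu_prediction}\n"
        "Which of the following activities is most likely? Answer with exactly one: "
        + ", ".join(ACTIVITIES)
    )

print(build_fusion_prompt(
    audio_caption="running water and clinking plates in a kitchen",
    audio_labels=["water tap", "dishes"],
    imu_prediction="standing with repetitive arm motion",
))
```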
The dataset: Ego4D
For the experiments, Apple used the Ego4D dataset. It contains many hours of video and audio material from a first-person perspective and covers everyday situations. From this material, Apple compiled a set of 20-second examples. Twelve activities were selected: vacuuming, cooking, doing laundry, eating, playing basketball, playing soccer, playing with pets, reading a book, sitting at the computer, washing dishes, watching television, and exercising or lifting weights.
This selection covers typical household, leisure, and sports activities that occur frequently in the dataset. For each example, audio descriptions, audio labels, and predictions from the IMU model were generated.
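After this preprocessing, each 20-second example could be represented roughly as in the following sketch. Only the clip length, the twelve activities, and the kinds of text artifacts come from the study; the field names and example values are assumptions for illustration.

```python
# Hedged sketch of a per-clip record after preprocessing an Ego4D segment.
# Field names and values are illustrative, not taken from Apple's supplementary material.

from dataclasses import dataclass, field

@dataclass
class ClipExample:
    """One 20-second example after preprocessing (illustrative field names)."""
    segment_id: str                      # Ego4D segment identifier
    start_s: float                       # clip start within the source video (seconds)
    end_s: float                         # clip end (start_s + 20)
    audio_caption: str                   # free-text description from an audio captioning model
    audio_labels: list[str] = field(default_factory=list)  # tags from an audio classifier
    imu_prediction: str = ""             # activity predicted by the IMU model
    gold_activity: str = ""              # one of the twelve target activities

example = ClipExample(
    segment_id="ego4d-segment-placeholder",   # placeholder, not a real segment ID
    start_s=132.0,
    end_s=152.0,
    audio_caption="a ball bouncing and sneakers squeaking on an indoor court",
    audio_labels=["basketball bounce", "shouting"],
    imu_prediction="running with abrupt direction changes",
    gold_activity="playing basketball",
)
```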
How the LLMs were tested
The approach was tested on two LLMs: Gemini 2.5 Pro and Qwen 32B. The researchers investigated two scenarios. In the first, the models were given a list of the twelve possible activities; in the second, there was no predefined selection.
Even without specific training, the models achieved F1 scores well above chance level. In zero-shot mode, they were already able to make meaningful classifications, and with a single example per activity (one-shot), accuracy increased further. The study thus demonstrates that LLMs are very good at identifying the correct activity from text-based descriptions.
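These two settings map onto two prompt variants, sketched below as a continuation of the earlier examples: a closed-set prompt that lists the twelve candidates and an open-set prompt that does not, each optionally preceded by a one-shot demonstration. The query_llm placeholder and the macro-averaged F1 evaluation are assumptions, not details confirmed by the paper.

```python
# Continues the sketches above (reuses ACTIVITIES and the ClipExample fields).
# `query_llm` is a placeholder for a call to Gemini 2.5 Pro or Qwen 32B, and the
# macro averaging of F1 is an assumption, not a detail confirmed by the paper.

from sklearn.metrics import f1_score

def query_llm(prompt: str) -> str:
    """Placeholder: plug in the actual model client (Gemini 2.5 Pro, Qwen 32B, ...)."""
    raise NotImplementedError

def classify_clip(clip, closed_set: bool = True, demo: str | None = None) -> str:
    """Ask the LLM for an activity label based only on the clip's text artifacts."""
    parts = []
    if demo:  # one-shot: prepend a worked example
        parts.append(f"Example:\n{demo}\n")
    parts.append(
        f"Audio caption: {clip.audio_caption}\n"
        f"Audio tags: {', '.join(clip.audio_labels)}\n"
        f"IMU prediction: {clip.imu_prediction}"
    )
    if closed_set:
        parts.append("Pick exactly one activity from: " + ", ".join(ACTIVITIES))
    else:
        parts.append("What activity is the person most likely doing? Answer briefly.")
    return query_llm("\n".join(parts)).strip().lower()

def evaluate(clips, closed_set: bool = True) -> float:
    """Macro-averaged F1 over the twelve activities for a list of ClipExample objects."""
    gold = [clip.gold_activity for clip in clips]
    pred = [classify_clip(clip, closed_set=closed_set) for clip in clips]
    return f1_score(gold, pred, labels=ACTIVITIES, average="macro")
```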
Why the results are relevant
Apple emphasizes that this type of late fusion is particularly helpful when raw sensor data alone doesn't provide a clear picture. LLMs can bridge the gap between individual information sources, creating a comprehensive understanding that traditional models can't achieve without additional training data. This allows health features, fitness analytics, and assistive systems to be improved without requiring large amounts of coordinated training data.
Apple also provides supplementary material, including segment IDs, timestamps, prompts, and one-shot examples used in the experiments. This openness makes it easier for researchers to replicate the results and build their own studies upon them.
How Apple meaningfully combines sensors and AI
The new study shows how Apple combines the strengths of sensors and AI. Large Language Models (LLMs) receive short text descriptions derived from audio and motion data, enabling them to reliably recognize various activities. The approach is efficient, flexible, and requires no extensive specialized training. Apple is thus providing important impetus for future applications related to health, exercise, and everyday life, while opening up a field of research that can continue to grow thanks to the materials provided. (Image: Shutterstock / issaro prakalung)