ChatGPT Health launched with the aim of providing understandable analysis of personal health data and deriving meaningful insights from it. For people who have worn an Apple Watch for years, this sounds like a logical next step. Millions of collected data points are supposed to finally reveal the true state of one's health. However, a recent user report shows that this analysis can quickly become problematic.
Earlier this month, ChatGPT Health was officially launched with integration for Apple Health and other data providers. Shortly thereafter, a test by Washington Post technology columnist Geoffrey A. Fowler highlighted just how immature the system still is. His analysis of Apple Watch data led to an assessment that was neither medically accurate nor consistent.
Transfer of ten years of health data
Like many people who wear an Apple Watch daily, Fowler wondered what a decade's worth of health data actually revealed about him. After a short waiting period, he granted ChatGPT Health access to his Apple Health data, which included roughly 29 million recorded steps and about six million heart rate measurements. He then asked the system to assess his heart health.
The result was a numerical rating of 6, an assessment that caused him considerable concern. He reacted impulsively: he went for a jog and forwarded the report generated by ChatGPT Health to his family doctor.
Comparison with medical reality
The doctor's reaction was unequivocal. The assessment was incorrect. In fact, the risk of a heart attack was so low that even the health insurance company would likely not cover an additional cardiac fitness test just to refute the artificial intelligence's evaluation. This confirmed that ChatGPT Health had misinterpreted the available data.
Misinterpretation of key measured values
According to Fowler, ChatGPT Health's negative assessment was largely based on the VO₂max value. However, Apple itself points out that these values are merely estimates. They are suitable for observing long-term trends but do not provide precise medical information. Separate, specialized devices are necessary for truly accurate measurements.
Despite these known limitations, ChatGPT Health apparently treated the VO₂max value as a reliable basis for diagnosis. No assessment or qualification of the measurement accuracy took place.
Hardware changes went undetected
Another problem concerned resting heart rate. The data showed shifts precisely when Fowler switched to a new Apple Watch. These fluctuations were not due to actual physical changes but to improved sensors and updated measurement methods. ChatGPT Health failed to consider this technical context and apparently interpreted the data as a sign of declining health.
Highly fluctuating ratings
The inconsistency of the answers was particularly problematic. When Fowler asked the same question about his heart health again, the rating suddenly changed: the original 6 became a "C". With further inquiries, the result fluctuated between F and B. ChatGPT Health offered no plausible explanation for these discrepancies.
Additionally, the system repeatedly omitted basic information during the conversations. This included gender, age, and certain current vital signs. Although ChatGPT Health had access to recent blood test results, this data was not always included in the analysis.
Why this is particularly problematic
Such errors are nothing new for AI chatbots. Inconsistencies, forgotten contextual information, and fluctuating responses are known weaknesses. However, for a product intended as a source of health information, these problems are significantly more serious. Health assessments generate trust or fear—and neither should be based on flawed analyses.
ChatGPT Health: Technological ambitions meet medical reality
The report on ChatGPT Health clearly demonstrates the current gap between technological vision and medical reality. At the same time, rumors are circulating that Apple is working on its own AI-powered service called "Health+", which is expected to launch later this year.
The current test makes two things clear: Firstly, it will likely be extremely difficult to achieve the level of quality that Apple is presumably aiming for. Secondly, a well-implemented service in this area could quickly assume a leading role in the AI healthcare market.
Until then, ChatGPT Health demonstrates one thing above all: AI-driven analysis of health data is not yet reliable enough to replace medical assessments, or even to supplement them dependably. (Image: Shutterstock / ImageFlow)