The vision presented by Apple with the HealthKit framework is one of interoperability and seamless interactions between mobile health applications and platforms. This is something that we, at Open mHealth, care a lot about. It is at the core of our mission.
We’ve found that HealthKit does a lot to advance this vision. It reduces the burden on developers to create their own solutions for storing health data. And once that data is stored in HealthKit, other applications can gain access to it, with the user’s permission. Now, suddenly, you have applications that can use shared data to enable novel insights and interactions that they could not achieve alone.
Another valuable aspect of HealthKit, one that is often overlooked, is its model for how health data sharing permissions are managed. Those permissions are managed in a consistent way and in one centralized place across all applications, so users know where to go.
The model for granting permission at the individual measure level helps people consider each aspect of their health information separately and forces them to be more intentional about what they share and with whom they share it. Too often permissions are all or nothing when authorizing an application to access personal health information. The model actualized by HealthKit prompts a reconsideration of how users can make these decisions. It then forces developers to start to think about how they design their applications in cases where they might not have access to all of a user’s data.
There is a lot to appreciate in HealthKit and it’s a good start to catalyzing the open sharing of health information across the mHealth ecosystem. However, the platform has some major limitations that prevent it from realizing the vision proposed by Apple.
Challenges with the HealthKit Framework
In working with HealthKit while building our iOS app, Hipbone, and a HealthKit serialization library, Granola, we found that it exposes a powerful framework for building applications. However, we had difficulty with the way HealthKit represents and models health concepts and the mobile health domain. At the heart of the problem is HealthKit’s lack of a well-defined data model, meaning that different health concepts are not explicitly modeled in the framework. Steps, blood glucose, and inhaler usage, for example, are all modeled using the same objects with the same properties. Because these measures can have very different information associated with them, such as the context in which an individual used their inhaler, developers are limited in how expressively they can represent each measure. In many cases it’s like trying to fit a square peg in a round hole.
Furthermore, the problem that HealthKit proposes to solve – interoperability and exchange between applications – is addressed in an incomplete way. We know that the power of mHealth lies in its ability to bring data together from disparate sources to paint a richer picture of individuals and populations as a whole. This demands the dismantling of silos and a freeing of information. Unfortunately, Apple provides very little support in exposing HealthKit data to the broader mHealth ecosystem, limiting its ability to interact outside iOS. Instead of breaking down silos, HealthKit constructs a new one. It’s just bigger.
In this post, we provide a few examples of specific challenges we encountered in working with HealthKit to demonstrate how its data model, or lack thereof, and siloed approach create challenges for people wanting to use the framework. Our goal is to make people aware of these challenges when working with HealthKit and to help people consider the role HealthKit should play in their application or platform.
1. The missing data model
As briefly mentioned above, HealthKit provides only a few types of objects that are broadly intended to represent the different types of health information that users might store – quantity samples, category samples, correlation samples (combinations of category and quantity samples), and workouts. Quantity and category samples, which are used to represent most information in HealthKit, have properties to represent start and end datetimes, the value of the measure, and its type. The type property is what differentiates between various measures; it is what distinguishes a step count quantity sample from a heart rate quantity sample, for example.
However, a step count quantity sample and a heart rate quantity sample are still both quantity samples; they just have a different value for their type. There is no inheritance hierarchy or set of distinct objects for capturing how these different measures can be represented.
The problem is that this model limits what can be expressed about different health measures. In essence, HealthKit avoids modeling the domain of health and instead opts for a small set of objects into which different types of health information can be pigeonholed. For example, both blood glucose and flights of stairs climbed are represented using quantity samples in HealthKit; however, blood glucose measurements often capture different information from stairs climbed. The temporal relationship between blood glucose measurements and meals is one example of this difference.
Further complicating this issue is the lack of an external data representation for information stored in HealthKit. Platforms that deal with health data typically have an underlying data representation expressed in a standard format, such as JSON or XML, which helps people understand the structure of the data. This representation also allows data to escape the confines of an individual platform. Many platforms have an SDK in addition to that, which plays a role more like HealthKit. But it is the existence of a data representation in other health APIs, which serves as an external data model, that enables much of the interaction between platforms and applications. Unfortunately, HealthKit objects are not inherently transportable outside of iOS.
2. “Metadata” as the model for extending samples
To address the shortcomings of its limited data model and lack of extensibility, HealthKit includes a dictionary property in samples, called metadata. This dictionary contains key-value pairs representing additional information related to the sample. Apple has created a limited set of predefined keys and, in addition, developers can use any string they like as a key in the dictionary. Values in the metadata dictionary, on the other hand, can be a string, date, or number, but nothing beyond that. This greatly limits the expressiveness of information captured related to health measures and limits the extent to which HealthKit samples can be “extended”.
More broadly, the information captured in the metadata property is not modeled explicitly, so nothing signals to a developer that it might be useful to capture that information. There is no structure guiding how that information should be stored, except for the small set of predefined keys. In the case of blood glucose, for example, and the measure’s relationship to a meal or sleep, there is no predefined metadata key to capture this information. So different applications might store it differently or not at all. The information is no longer inherently interoperable, and it becomes much more difficult to share across applications.
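To make the divergence concrete, here is a minimal sketch in plain Python (not HealthKit code). The sample dictionaries, the key names "MealRelation" and "com.example.glucose.timing", and the helper meal_relation are all hypothetical, invented only to illustrate what can happen when each application picks its own metadata key.

```python
# Hypothetical blood glucose samples written by three different apps.
# None of these metadata keys is predefined by HealthKit; they are
# invented here to illustrate free-form keys.
sample_from_app_a = {
    "type": "blood_glucose",
    "value": 95,
    "unit": "mg/dL",
    "metadata": {"MealRelation": "before_breakfast"},
}
sample_from_app_b = {
    "type": "blood_glucose",
    "value": 140,
    "unit": "mg/dL",
    "metadata": {"com.example.glucose.timing": "2h after lunch"},
}
sample_from_app_c = {
    "type": "blood_glucose",
    "value": 110,
    "unit": "mg/dL",
    "metadata": {},  # this app did not record the relationship at all
}


def meal_relation(sample, key="MealRelation"):
    """Look up the meal relationship under the key that app A happens to use."""
    return sample["metadata"].get(key)


readings = [sample_from_app_a, sample_from_app_b, sample_from_app_c]
found = [meal_relation(sample) for sample in readings]
print(found)
```

A consumer written against app A's convention recovers the meal relationship for only one of the three readings, even though app B recorded it too.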
In contrast, when blood glucose is represented as its own type and explicitly contains properties to capture information about temporal relationships to other activities, even if those properties are optional, then there is only one way to do so. This increases the interoperability of that data. It also prompts developers to capture that data when possible because it’s an explicitly defined property in the object.
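As a rough illustration of what an explicit type could look like, the sketch below models blood glucose as its own Python dataclass with an optional, named property for the relationship to a meal. The class names, field names, and enumeration values are assumptions made for this example; they are not HealthKit's API and are not meant to reproduce any particular schema exactly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class MealRelationship(Enum):
    """Hypothetical set of allowed values; a real schema may define others."""
    FASTING = "fasting"
    BEFORE_MEAL = "before meal"
    AFTER_MEAL = "after meal"


@dataclass(frozen=True)
class BloodGlucose:
    """A blood glucose measurement modeled as its own explicit type."""
    value: float
    unit: str
    effective_time: datetime
    # Optional, but there is exactly one place and one vocabulary for it.
    temporal_relationship_to_meal: Optional[MealRelationship] = None


reading = BloodGlucose(
    value=95.0,
    unit="mg/dL",
    effective_time=datetime(2015, 6, 1, 7, 30, tzinfo=timezone.utc),
    temporal_relationship_to_meal=MealRelationship.FASTING,
)
unlabelled = BloodGlucose(
    value=110.0,
    unit="mg/dL",
    effective_time=datetime(2015, 6, 1, 12, 0, tzinfo=timezone.utc),
)
```

Because the property is part of the type, every producer and consumer looks for it in the same place, and a developer reading the definition is reminded that the information exists.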
It’s important to acknowledge that, for building applications, there are benefits to the approach in HealthKit. There is a simple, shallow object hierarchy for developers to understand and use. The interfaces with which developers interact are clean and simple. There are no issues of downcasting or working with child objects that expose different methods than their parents. The approach just deals with a single map that can contain any values and any keys, which gives developers the power to specify any data that their hearts desire. Just enough freedom to get ourselves into some serious trouble!
3. Capturing time zone information
It makes sense that HealthKit, as an iOS framework, would use existing objects to represent date-time information, namely NSDate. Unfortunately, this means that samples retrieved from HealthKit do not have time zone information associated with them, unless the creating application captures that information in the metadata property using the predefined HKMetadataKeyTimeZone key. Without time zone information it is impossible to resolve the local time at which an activity occurred originally.
But why does that matter? Well, imagine your application is trying to provide users with information about the times of day that they are the most and least active – morning, afternoon, evening, or night – using step count data stored in HealthKit. And let’s consider the situation where a user spent a week traveling for business in a location that was 5 hours ahead (+05:00) of their home time zone. When the user returns home and opens up their application to see the time of day they had been most active over the past month, the application is put into a precarious situation.
The most likely approach would be to query the HealthKit store for samples that occurred during the morning, afternoon, evening, and night time frames over the past month. However, in specifying the NSDate objects that constrain the start and end times of the search, the developer needs to decide which time zone to use. The most obvious approach would be to query for data based on the user’s local time – their current time zone, which is their home time zone. For the morning period, for example, the application might query for steps that occurred between 5am and noon using the user’s local time zone to create the NSDate objects that constrain the search time frame.
This will be fine for the activities that occurred while the user was in their home time zone, but what happens with the data that was generated while they were traveling for a week? The same time frame (5am to noon), if rendered in the business trip time zone (+05:00 from their home time zone), will actually be 10am to 5pm. Asking HealthKit for data that occurred between 5am and noon in the user’s home time zone will return data that occurred between 10am and 5pm while they were traveling, which are steps that occurred mostly in the afternoon. If the application were to aggregate those steps and make some inference about the most active time of day, the user may be misled, since about a quarter of the data (one week out of the four in a month) would be incorrect. If they wanted to make a change in their daily routine based on that data, we’d be leading them astray.
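The arithmetic can be checked with a short, language-neutral sketch. It uses plain Python rather than HealthKit, and the specific offsets (home at UTC-05:00, the trip at UTC+00:00, so five hours ahead of home) and the date are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets: the trip location is five hours ahead of home.
HOME = timezone(timedelta(hours=-5))
TRIP = timezone(timedelta(hours=0))

# The "morning" query window, built in the user's home time zone.
window_start = datetime(2015, 6, 10, 5, 0, tzinfo=HOME)
window_end = datetime(2015, 6, 10, 12, 0, tzinfo=HOME)

# The same two instants as they read on a clock at the trip location.
trip_start = window_start.astimezone(TRIP)
trip_end = window_end.astimezone(TRIP)
print(trip_start.strftime("%H:%M"), trip_end.strftime("%H:%M"))

# A walk taken at 3pm local time during the trip...
afternoon_walk = datetime(2015, 6, 10, 15, 0, tzinfo=TRIP)
# ...and one taken at 7am local time during the trip.
morning_walk = datetime(2015, 6, 10, 7, 0, tzinfo=TRIP)

afternoon_walk_matches = window_start <= afternoon_walk < window_end
morning_walk_matches = window_start <= morning_walk < window_end
print(afternoon_walk_matches, morning_walk_matches)
```

The afternoon walk falls inside the "morning" window while the genuine morning walk falls outside it, which is exactly the misattribution described above.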
This issue with missing time zone information is probably the most common limitation of consumer health data APIs right now, so Apple is not alone in this challenge. It’s unfortunate because mobile phones, by their nature, open the door to capturing this rich data at the intersection of place and time, which was previously inaccessible. However, many APIs are failing to take advantage of this. Developers need to consider this limitation in how they process and analyze data coming from HealthKit, especially when it’s data coming from another source in HealthKit. For the time being, developers should be storing the current time zone in their samples’ metadata using the HKMetadataKeyTimeZone key. We have not seen many developers taking advantage of this key or the metadata property in general. Even Apple fails to add the time zone metadata to samples generated through their own Health app.
Additionally, developers could keep track of when users change time zones and timestamp those changes so that it’s possible to identify the time zone in which every sample occurred. From there, it’s possible to calculate the correct local time of each sample. That way developers need not rely on the quality of data from other applications. Of course this is not a very good solution, nor is it trivial to implement. But it provides a route to overcome limitations in HealthKit, in the hope that Apple may make improvements so that time zones can be captured more consistently with samples.
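One way such a change log might work is sketched below in plain Python. The log contents, the fixed hour offsets, and the helper local_time are hypothetical; a real application would also need to decide how to record the changes on the device and how to handle daylight saving time, which this sketch ignores.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

# Hypothetical log kept by the app: (UTC instant the change was observed,
# UTC offset in hours that applies from that instant on), sorted by time.
zone_changes = [
    (datetime(2015, 6, 1, 0, 0, tzinfo=timezone.utc), -5),   # at home
    (datetime(2015, 6, 8, 9, 0, tzinfo=timezone.utc), 0),    # trip begins
    (datetime(2015, 6, 15, 20, 0, tzinfo=timezone.utc), -5), # back home
]
change_times = [changed_at for changed_at, _ in zone_changes]


def local_time(sample_time):
    """Render a sample's instant in the offset that was active when it occurred."""
    index = bisect_right(change_times, sample_time) - 1
    if index < 0:
        raise ValueError("sample predates the first recorded time zone")
    offset_hours = zone_changes[index][1]
    return sample_time.astimezone(timezone(timedelta(hours=offset_hours)))


# Samples come back as absolute instants with no zone attached; here they
# are expressed in UTC.
at_home = local_time(datetime(2015, 6, 3, 13, 0, tzinfo=timezone.utc))
on_trip = local_time(datetime(2015, 6, 10, 15, 0, tzinfo=timezone.utc))
print(at_home.strftime("%H:%M"), on_trip.strftime("%H:%M"))
```

With the local time recovered per sample, the application can bucket steps into morning, afternoon, evening, or night according to where the user actually was, rather than where they are now.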
Moving forward, we believe that Apple must consider how HealthKit can interact with the broader mHealth ecosystem. It’s especially important for Apple to allow developers to easily liberate data that their applications capture in HealthKit into other domains and enable interactions across platforms. Having the ability to export data from HealthKit in interoperable formats and exposing a data model consistent with other standards is essential for the continued innovation of mHealth applications.
That’s why we built Granola. Granola serializes information stored in HealthKit into JSON data that adheres to an open data standard – Open mHealth schemas. Using Granola, developers can liberate data from HealthKit and integrate it into other spaces.
HealthKit has a lot of potential and we believe that the mHealth community, in collaboration with Apple, can make it a more powerful platform that engages the broader mHealth ecosystem.