Notes on Privacy and Ethics

We initially created Ethica at the University of Saskatchewan as a research tool to help us collect more accurate data on people's health behaviour. Every study we conducted using Ethica received REB approval. Our communications with REB committees helped us address many of their privacy and security concerns at the core of Ethica. Furthermore, they showed us how to explain the capabilities Ethica offers clearly and concisely to the committee, to ensure they are comfortable with our research design.

In this section, we share materials that may help you with the REB application for your upcoming study. They are based on the documents we have provided to different REB committees for our research projects. Keep in mind that most REB applications we submitted for Ethica focused on a specific study design, so sharing those completed applications would very likely not be helpful for your study. The following materials do not focus on any specific study design. You should adjust them as needed for your study design and the requirements of the REB committee of your institution.

Also, many of the topics below are explained in depth throughout this website. You can refer to the relevant article for more details.

Introduction

It is well understood that self-reporting of many types of health behaviours and exposures is subject to significant inaccuracies, including those involving contact, mobility (including exposures to particular environments), physical activity, nutritional patterns, and adherence to medication regimens. The resulting gaps in knowledge of human health micro-behaviours can be attractively addressed with smartphone telemetry studies that record high-fidelity measurements of human behaviour. These studies can involve physical measures of location (often sensed through GPS), contact patterns (sensed through different radio beacons), moderate to vigorous physical activity as well as sedentary behaviour (both sensed using accelerometers and gyroscopes), and nutritional and medication intake (as documented via camera images). These sensor readings can increase the quality of insight into health behaviours and exposures. Importantly, such data can help resolve behaviours and exposures across multiple causal pathways as well as certain types of health outcomes (e.g., falls, certain symptomology, quality of sleep). Such information can be of considerable value in observational studies, and of exceptional importance in assessing the effect of health interventions.

This document provides an overview of the sorts of protocols commonly used with smartphone-based studies conducted through the Ethica system. While we seek to provide a general orientation to the sort of research conducted with Ethica, many of the considerations mentioned here for smartphone-based research in general may not apply in the context of the current study. Ethica has been used to conduct more than 100 diverse studies to measure human behaviour, ranging from the impact on insulin level of the exercise habits of pregnant mothers with gestational diabetes to understanding how undergraduate students use common university spaces. The platform has been used by a dozen or more REB-approved projects at the University of Saskatchewan, and also by dozens of other clients (including, but not limited to, Harvard School of Public Health, Alberta Health Services, Drexel University, University of Michigan, University of North Carolina, San Francisco State University, University of Western Sydney, Baylor College of Medicine, University of Loughborough, University of Regina, Wilfrid Laurier University, Columbia University, University of New Mexico, University of Calgary, Memorial University, Boston University, Université de Montréal, University of the West Indies, and Università della Svizzera italiana). While the precise subset of data collected and the associated survey instruments employed change from study to study, the core components of understanding human behaviour by collecting rich sensor data remain the same. This document outlines a standard framework for conducting these types of studies. When reading this document, it bears emphasis that the study being considered for the current REB application likely collects just a subset (and often a small subset) of the types of data described here.

The overall format of an Ethica study is generally consistent. A sub-population is selected and a set of research questions is derived. Appropriate sensor measurement protocols and survey instruments are designed. Participants, after providing informed consent, either install the Ethica software on their own phone or are issued a smartphone containing the measurement and questionnaire software. Participants fill out intake questionnaires (typically providing demographic and history information) and leave with the phone and software working. Participants are instructed to keep their phone charged and to answer questionnaires issued by the phone. As data flows in, the research team monitors participant involvement (“adherence” or “compliance”) and watches for technical malfunctions, contacting participants if necessary. At the conclusion of the study, participants fill out post-study questionnaires on the phone. In some cases, participants may further meet with the research team. Keys linking email addresses and participant IDs are deleted, as they are no longer necessary for contacting participants. Data analysis proceeds; in some cases, some of the data originally collected (such as audio recordings) may itself be deleted following initial phases of analysis (e.g., transcription).

In this document we first provide an overview of the process and ethical concerns underlying this kind of data collection. The remainder of the document is organized chronologically by phase, starting with study design and ending with final data storage. This document serves as a companion to the rest of the content available from Ethica Learn, Ethica's Privacy Policy, and Ethica's Terms of Use, which describe the technical capabilities of the system we employ. For brevity, we refer to the above documents as the Ethica Learn for the remainder of this document. Individual studies using the Ethica system must still submit standard ethics applications (or amendments if applicable). This document is meant to inform members of the Research Ethics Board or Institutional Review Board of the general study flow, and to highlight underlying ethical concerns so as to decrease repetition across individual ethics applications employing these techniques.

Ethical Concerns

These types of studies are not inherently dangerous for healthy adults, but there are two significant ethical risks associated with them: observation of illegal activity and deanonymization of the data.

Observation of Illegal Activity

While the kinds of records collected cannot immediately distinguish between legal and illegal behaviour (in contrast to a drug use survey, for example), they could be used to implicate participants as material to an investigation if they were recorded to be in a particular location when and where a crime occurred. While routine analysis would not typically uncover this type of correlation, it is possible. While Ethica would not directly inform law enforcement agencies as to the precise nature of the information, there may be a legal obligation to disclose the existence of such data to appropriate agencies (e.g., for cases of elder or child abuse or neglect). Participants should be urged, in both the consent form and training, to avoid taking photographs or audio recordings that include others, and they should be made aware on the consent form that any such recordings could lead to reporting. There is also a risk of judicial orders such as subpoenas for the release of data to appropriate law enforcement agencies. While some jurisdictions have case law supporting the rights of researchers to maintain confidentiality of subjects even in the face of subpoenas, in other jurisdictions there may be a legal obligation or institutional policy to comply with such subpoenas. In cases where the researcher would be obliged or otherwise likely to yield participant data in response to a subpoena, this is made clear to participants in the informed consent forms, during briefing, and when participants are introduced to the functionality of the system that supports suspension of sensor-based data collection.

Participant Suspension of Sensor Data Collection

A key technical component for addressing the risk of observing illegal activity by participants is the "pause" functionality, which allows the participant to disable sensor-based monitoring for a set duration, typically 1 hour. Tapping the Pause button once will disable monitoring for 60 minutes, after which the participant can renew the privacy request. This functionality allows the participant to temporarily opt out of the study's sensor data collection for whatever reason. Sensor data is not recorded by the phone during this period. Surveys and ecological momentary assessments are still issued to the participant but can, as always, be ignored. The pause functionality also allows us to distinguish data missing due to a participant’s wish for privacy from data missing due to phone malfunction or battery depletion. Between the provision of pause functionality and the strong informed consent, we believe we have dealt with the issue of observation of illegal activity with sufficient rigor.
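The pause behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Ethica's implementation: the names `PauseLog` and `classify_gap` are hypothetical, and the 60-minute duration is taken from the "typically 1 hour" figure in the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Assumed default, based on the "typically 1 hour" figure in the text.
PAUSE_DURATION = timedelta(minutes=60)


@dataclass
class PauseLog:
    """Hypothetical record of the moments a participant tapped Pause."""
    starts: list = field(default_factory=list)

    def pause(self, now: datetime) -> None:
        # Each tap suspends sensor monitoring for PAUSE_DURATION from `now`.
        self.starts.append(now)

    def is_paused(self, t: datetime) -> bool:
        return any(s <= t < s + PAUSE_DURATION for s in self.starts)


def classify_gap(log: PauseLog, t: datetime) -> str:
    """Label a moment without sensor data as intentional or technical."""
    if log.is_paused(t):
        return "paused_by_participant"
    return "possible_malfunction"
```

For example, after `log.pause(datetime(2024, 1, 1, 9, 0))`, a gap at 09:30 is labelled `paused_by_participant`, while a gap at 10:00 (after the pause has expired) is labelled `possible_malfunction`.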

Deanonymization

All data collected by our system is anonymized: each participant's records are labelled only with a participant ID. This ID is maintained across acquisition methodologies, leading to a classic anonymized data collection system. However, because of the richness of the data (particularly location), it is possible for someone skilled in the art to infer home address and place of employment. With these pieces of information, deanonymization is possible in many cases using internet queries.

We combat this risk in three ways. First, all researchers accessing the data must undertake not to intentionally seek to re-identify the data. Generally speaking, a dataset from an appropriately designed study does not by itself have sufficient information to re-identify a participant; re-identification requires the extra step of searching for supplemental external information, making this commitment on the researcher’s part reasonably straightforward to keep. Second, we inform participants as part of the consent process that while we will make every effort to ensure their anonymity, it is possible for researchers to learn their identity, although we commit not to do so. Finally, we commit to participants not to publish any data which might be individually identifying (such as a map of their daily routine) without their express written consent.

Study Design Phase

In the study design phase, researchers make key configuration decisions about the type and frequency of data collection and the frequency and circumstances of data upload, and create appropriate questionnaires for each phase of the study.

Parameters

Referring to the content in the Ethica Learn, it should be apparent that a wide variety of sensors can be accessed to probe different aspects of human behaviour. However, measurement does not come for free. Placing the phone in a measurement state depletes power, quite rapidly in the case of GPS and Bluetooth sensing, so compromises must be made between the completeness of measurement and the expected battery lifetime of the phone. Moreover, collection of additional types of data imposes a data storage and analysis burden. The system deals with this in two ways. Firstly, researchers undertaking Ethica studies select a subset of data sources to use. Every study is slightly different, but most will measure some subset of the sensors built into the phone: GPS, Bluetooth, WiFi, accelerometer, magnetometer, gyroscope and battery. Secondly, to restrict the cost of sensor-based data collection, the Ethica system automatically restricts data collection to an interval every 5 minutes. Some sensors (e.g., GPS) have additional “smarts” to reduce unnecessary data collection. Some experiments may additionally add external sensors such as wearable devices (e.g., smartwatches, wristbands, shirts), or bathroom scales (see technical documentation on the Ethica website for details). However, in a standard study, only subsets of on-phone sensors are employed. This partly reflects the fact that many such external devices are burdensome for participants, and collection of data from such devices consumes much power.
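To make the trade-off between completeness and power concrete, the 5-minute collection cycle can be sketched as a simple duty cycle. The text gives only the 5-minute period; the 60-second length of the sampling interval and the function names below are assumptions made for illustration.

```python
CYCLE_SECONDS = 5 * 60   # one collection cycle every 5 minutes, per the text
ACTIVE_SECONDS = 60      # ASSUMED length of the sampling interval (not from the text)


def sensing_active(seconds_since_start: float) -> bool:
    """True while the phone is inside the sampling interval of the current cycle."""
    return (seconds_since_start % CYCLE_SECONDS) < ACTIVE_SECONDS


def duty_fraction() -> float:
    """Share of time the sensors are on, a rough proxy for relative power cost."""
    return ACTIVE_SECONDS / CYCLE_SECONDS
```

Under these assumed values the sensors are active for one fifth of the time; shortening the sampling interval lowers power use at the cost of less complete measurement.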

Data Upload Protocol

Data is recorded on the phone, encrypted and saved to a file. The second piece of study design is determining under what circumstances and how often data should be uploaded from the phone to the server. Because the data is transmitted in its encrypted format, it can be safely transmitted over a number of networks. Because uploading is energy intensive, uploads are often restricted to times when the phone is plugged into a power source. As long as data is appropriately encrypted, there is no incremental risk to different choices for data upload protocol.
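The upload decision described above can be expressed as a small policy check. This is a hedged sketch: `PhoneState`, `UploadPolicy`, and `should_upload` are hypothetical names, and the Wi-Fi option is an assumed extra setting that the text does not mention.

```python
from dataclasses import dataclass


@dataclass
class PhoneState:
    plugged_in: bool
    on_wifi: bool
    pending_encrypted_files: int  # files already encrypted on the phone


@dataclass
class UploadPolicy:
    require_power: bool = True   # the common restriction described in the text
    require_wifi: bool = False   # ASSUMED optional restriction, for illustration


def should_upload(state: PhoneState, policy: UploadPolicy) -> bool:
    """Decide whether the phone should start an upload right now."""
    if state.pending_encrypted_files == 0:
        return False  # nothing to send
    if policy.require_power and not state.plugged_in:
        return False  # uploading is energy intensive, so wait for a charger
    if policy.require_wifi and not state.on_wifi:
        return False
    return True
```

Because only encrypted files are ever queued, changing the policy affects battery use and upload latency, not the confidentiality of the data in transit.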

Questionnaires

Questionnaires form an important part of any Ethica deployment, where they go by the name of “surveys”. While sensors can provide a great deal of information on the physical state of a participant, their knowledge, attitudes, beliefs, perceptions and intentions must be probed using more traditional survey-based techniques – with such instruments being delivered on the smartphone. Ethica provides a framework for deploying rich on-phone questionnaires while the participant is involved in data collection. These surveys can provide critical insight into participant perceptions (such as perceived barriers to physical activity), why a participant was engaging in a particular behaviour, and many other factors. In some cases, the collected data can serve to complement the quantitative data collected via the phone with qualitative instruments.

Questionnaire selection and design is one of the most important steps in designing an Ethica study and one where study designers have the greatest degree of freedom. In a typical Ethica deployment, multiple types of questionnaires are delivered via the device: eligibility, time-triggered, user-triggered, sensor-triggered, and study dropout.

Eligibility

For some studies, eligibility questionnaires are issued when a participant has expressed interest in the possibility of joining the study. Such questionnaires can enquire about the characteristics of the participant, in order to determine whether they are eligible to proceed to the stage of seeking their consent. If they are not, they are not subject to potential enrollment. If the (ex-) candidate is a member of, or is considering involvement in, another study, they may wish to retain the Ethica app on their phone; no further data will be collected for the study for which they were ineligible. Otherwise, the ex-candidate will often uninstall the app from their phone.

Eligibility questionnaires are only used in a subset of studies. In others, the eligibility (and often consent) process is pursued online or in person with research team members.

Study Dropout

Study dropout questionnaires are issued when the participant decides to drop out of the study. Such questionnaires typically enquire as to the reason for this departure. Study dropout questionnaires may also probe any changes to perception if that was part of an experimental design.

User, Time, and Sensor Triggered

These questionnaires generally fall into two types: Ecological momentary assessments (EMAs) and questionnaires proactively triggered by the user through a button push.

One of the key advantages of the Ethica system is its ability to perform ecological momentary assessments and to ground the responses in the sensed physical context of the participant. EMAs permit querying the perceived state of an individual at a particular time and place, in a manner that is often proximate to health behaviours or exposures of interest, reducing recall bias and enhancing the quality of the information secured. These questionnaires are generally specified by research team members graphically, through the Ethica website. Question types are described in more detail on the Ethica Learn. Questionnaire data is accompanied by the time and location of the questionnaire submission, as well as the time at which particular questions were answered.

To avoid participant fatigue, questionnaires should be short, and triggered relatively infrequently. Ethica offers a variety of mechanisms for scheduling EMAs, including those triggered with uniform probability over time, and scheduled at certain times. In addition, EMAs can be triggered by circumstance, as informed using sensor data.
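As an illustration of the two time-based scheduling modes mentioned above, the sketch below draws prompt times uniformly at random within a daily window and merges them with fixed times. The function names and the example window are hypothetical; this is not the scheduling logic Ethica itself uses.

```python
import random
from datetime import datetime, timedelta


def random_ema_times(start: datetime, end: datetime, count: int,
                     rng: random.Random) -> list:
    """Draw `count` prompt times uniformly at random between start and end."""
    span = (end - start).total_seconds()
    offsets = sorted(rng.uniform(0, span) for _ in range(count))
    return [start + timedelta(seconds=o) for o in offsets]


def daily_schedule(start: datetime, end: datetime, random_count: int,
                   fixed_times: list, rng: random.Random) -> list:
    """Combine randomly timed prompts with prompts at fixed clock times."""
    return sorted(random_ema_times(start, end, random_count, rng) + fixed_times)


# Example: three random prompts between 09:00 and 21:00 plus one fixed prompt.
rng = random.Random(42)  # seeded only to make the example repeatable
day_start = datetime(2024, 1, 1, 9, 0)
day_end = datetime(2024, 1, 1, 21, 0)
schedule = daily_schedule(day_start, day_end, 3,
                          [datetime(2024, 1, 1, 20, 0)], rng)
```

Keeping `random_count` small is one simple way to respect the guidance on participant fatigue.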

In addition to EMAs, many studies make use of button-triggered questionnaires. In such studies, participants are asked to press buttons on the Ethica interface to indicate the occurrence of health behaviours, symptoms or exposures of specific interest to the study, for example to signal that the participant is taking medication, has suffered from a bout of wheezing, is eating, or has removed a tick. These buttons are labeled with the type of occurrence being recorded, in accordance with the needs of the particular study.

Recruitment

Participants are either drawn from the university community for studies meant to evaluate or expand the tools or techniques employed in these types of studies, or from a specific subpopulation that is the target of the study in deployments investigating specific behaviours or exposures. Participants are often sampled with an intentional bias towards capturing existing social or contact networks to enhance data quality. Recruitment details for each study will be presented in the appropriate application or amendment.

Participant Intake

Once participants have been recruited, they typically meet one-on-one with a study team member, are given an informed consent form, and are walked through the measurement process. Informed consent is a cornerstone of our ethical framework. It is the individual’s right to donate behavioural and exposure data to researchers, but only if they understand the scope and quality of the data being gathered. The study team member is available to answer any questions. Space is typically available if the participant wants to consider the informed consent in privacy.

Once participants have signed the informed consent and data release forms, a study team member walks them through the download and installation of the app from standard download sites (the Apple App Store or the Google Play Store for iPhone and Android, respectively) and shows them how to use the app, including the pause functionality, and how to answer questions posed by the on-device questionnaires. For cases where participants are provided with phones (e.g., for low-SES populations), the phone with the app installed will be given to the participant, and the basics of phone operation may also be demonstrated, as required. After receiving their phone, participants complete study entry (baseline) questionnaires. Before the participant leaves, they are instructed to keep the phone with them at all times, and to keep it charged. Any additional study protocols are also re-affirmed at this point. As long as the participant is compliant and does not experience technical problems, this should be the last time they have direct contact with a study team member until after primary data collection is complete.

Primary Data Collection

During the primary data collection phase of the study, the device will collect sensor data as prescribed by the study configuration. Questionnaires will be issued automatically as scripted. Data will be uploaded to servers as prescribed in the study design. As data comes in, it will be monitored by study team members to keep track of compliance and the technical performance of both individual phones and the study overall.

Participant Contact

If compliance drops below a threshold value specified in the informed consent, the participant will be prompted to improve their performance and reminded of the commitment they made to the study team members. If technical issues are detected (e.g., sensor malfunction, phone no longer responding), participants in some studies will be contacted with instructions for meeting a study team member for technical assistance. For longer studies, all participants will typically receive periodic reminder emails to keep them engaged. This contact protocol requires the maintenance of a list of participant IDs and email addresses, which is destroyed after the study concludes.
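One possible way to turn the compliance threshold into a concrete check is sketched below. The definition of compliance used here (the fraction of expected collection cycles that produced data, with participant-paused cycles excluded) is an assumption for illustration; each study defines its own measure and threshold in its consent form.

```python
def adherence(expected_cycles: int, cycles_with_data: int,
              paused_cycles: int = 0) -> float:
    """Fraction of eligible collection cycles that actually produced data.

    ASSUMPTION: cycles the participant deliberately paused are not counted
    against them. The source text does not specify this.
    """
    eligible = expected_cycles - paused_cycles
    if eligible <= 0:
        return 1.0  # nothing was expected, so nothing was missed
    return min(cycles_with_data / eligible, 1.0)


def needs_reminder(rate: float, threshold: float) -> bool:
    """True when compliance has dropped below the study's threshold."""
    return rate < threshold
```

For instance, with 288 expected cycles in a day (one every 5 minutes), 12 of them paused, and 207 with data, `adherence(288, 207, 12)` gives 0.75, which would trigger a reminder under a hypothetical threshold of 0.8.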

Change in Study Parameters

During the study, it may be necessary or desirable to "patch" the study protocol. This can take place automatically when users are in contact with the network. Small fixes to the app to address bugs are downloaded in the background and will not result in user notification. Changes to the study structure (for example, decreasing the measurement frequency to combat lower than expected battery lifetime) will be communicated by email to all participants.

Participant Withdrawal

Participants can withdraw at any time until the data is fully anonymized (after the study is completed). After that point, because we cannot re-identify the data without engaging in ethically questionable activities, we cannot know which participant ID corresponds to the participant, and therefore cannot delete the corresponding data. If we are contacted prior to the destruction of the contact list, participant withdrawal can, on request, result in deletion of all of that participant's data.
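The dependence of withdrawal on the contact list can be illustrated with a short sketch. The data layout (a dictionary from email address to participant ID, and records carrying a `participant_id` field) and the function name `withdraw` are hypothetical, chosen only to show why deletion becomes impossible once the list is destroyed.

```python
from typing import Optional


def withdraw(email: str, contact_list: Optional[dict], records: list) -> list:
    """Return `records` without the withdrawing participant's rows.

    `contact_list` maps email address -> participant ID. Passing None models
    the state after the list has been destroyed at study conclusion.
    """
    if contact_list is None or email not in contact_list:
        # Without the key, nothing links the requester to a participant ID.
        raise LookupError("participant ID can no longer be resolved")
    participant_id = contact_list.pop(email)  # also drop the contact entry
    return [r for r in records if r["participant_id"] != participant_id]
```

While the contact list exists, a request from a known email address removes every record carrying that participant's ID; once the list is gone, the same request raises `LookupError`, mirroring the limitation described above.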

Study Conclusion

Following study conclusion for a given participant, study exit (post-study) questionnaires are issued to that participant, and data collection is disabled automatically by the system for that participant. For some studies, the participant may be met a final time by a study team member. This meeting is meant to debrief the participant, deal with any remaining data on the phone, and deliver any post-deployment surveys.

If the phone was lent to the participant and is being returned, the phone will be cleared of all data. Many studies also have a debriefing phase in which participants are informally and individually asked by a study team member for feedback or concerns relating to the experiment.

Data Security and Storage

Data security is taken very seriously in our studies, and data is protected on multiple levels. Only a full breach of our secure system by an attacker could result in data loss. Most institutional servers at universities are not as well protected.

On-phone

A small amount of data on the phone is kept in volatile memory (RAM) to help with context detection. This data is almost impossible to obtain, as RAM is volatile (its contents decay unless maintained) and sensitive to the starting address. Periodically, the data on the phone is stored as files, which are much easier to find and manipulate, but which are strongly encrypted and better protected than the vast majority of the data that users produce themselves (notes, contact lists, browsing history, etc.).

On-server

For internal studies, we have a highly secure server system, based in a locked server room in a private location in Canada. Data for a given study can only be accessed by explicitly designated logins. Within the databases themselves, only specific study team members are allowed to access data pertaining to specific studies, so participating in one study does not give individuals access to all others.

Summary

Building on our many previous successful studies, this general framework for Ethica studies provides a robust ethical basis for conducting studies with high-fidelity cross-linked sensor data. Our framework relies upon technical strategies for anonymization and data protection, strong informed consent of the participants, participant control of data collection through the pause functionality, and an undertaking by all researchers accessing the data to not attempt re-identification, or publish data that might be re-identifiable. Using these principles we believe we can ethically and accurately explore human behaviour for the health sciences.