Data Access & Analysis

After you design your study and test it to make sure every piece works as intended, the next step is to move the study to the field, enroll participants, and start collecting data. At this stage, your focus shifts first to monitoring study progress and the quality of the collected data, and second to exploring the collected data and planning more in-depth analysis. Here we explain how you can monitor study progress in Ethica and how you can access the collected data for analysis.

Monitoring Study Progress

Monitoring the quality of the collected data during enrollment is challenging because high-quality data is defined differently for different data sources. Here we simplify this step by focusing mainly on data quantity, on the assumption that data quantity is a reasonable proxy for data quality.

If your study collects survey responses, you should monitor how well participants adhere to the presented surveys, i.e. how many of those surveys are completed, how many are left unanswered and expire, and how many are canceled. For other data sources, which are collected without participant interaction, such as GPS, motion sensors, or app usage statistics, the amount of data collected during a certain time window is a good way to monitor participant adherence. Below we explain each of these in more detail.

Adherence Monitoring

The Researcher Dashboard's Adherence page shows a list of all participants currently enrolled in your study, along with a quick report on the quantity of data each of them has provided. To access this page, navigate to the Participation -> Adherence section of your Researcher Dashboard. Each row in the list of participants looks similar to the image below:

Ethica Participation Adherence Page - Closed Row

Here you can see when the participant joined the study and when her participation will end. As we discussed before, no data is collected before or after this period. You can also see the last time this participant uploaded any data, the number of surveys she completed successfully (shown in green), and the number of surveys that were canceled or that expired.
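The counts shown in each row can be turned into a simple completion rate for comparing participants. The following is a minimal sketch, assuming you have copied the completed, canceled, and expired counts by hand; the `SurveyCounts` class, the participant IDs, and the 50% threshold are illustrative and not part of Ethica.

```python
from dataclasses import dataclass


@dataclass
class SurveyCounts:
    """Per-participant survey tallies, as read off the Adherence page (hypothetical container)."""
    completed: int
    canceled: int
    expired: int


def completion_rate(counts: SurveyCounts) -> float:
    """Share of closed survey sessions that were completed (0.0 if none have closed yet)."""
    closed = counts.completed + counts.canceled + counts.expired
    return counts.completed / closed if closed else 0.0


def low_adherence(participants: dict[str, SurveyCounts], threshold: float = 0.5) -> list[str]:
    """Return the IDs of participants whose completion rate is below the threshold."""
    return sorted(pid for pid, c in participants.items() if completion_rate(c) < threshold)


# Illustrative numbers only.
counts_by_participant = {
    "P001": SurveyCounts(completed=18, canceled=1, expired=1),
    "P002": SurveyCounts(completed=4, canceled=2, expired=10),
}
print(low_adherence(counts_by_participant))  # ['P002']
```

The threshold is a study-specific choice; a stricter or looser cut-off may suit your protocol better.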

If you expand the row, you can see the amount of passively collected data this participant has provided over time: in total (shown as In Operation), for motion sensor data (shown as Accelerometery), for location data (shown as GPS), and for Bluetooth data (shown as Bluetooth). Note that this section is only visible if your study uses data sources other than surveys. Likewise, each of the sections mentioned above is only visible if your study uses the associated data source:

Ethica Participation Adherence Page - Expanded Row

Data Quantity Reports

If you want to know exactly how many data points were collected in each interval, navigate to the Participation -> Data Quantity Reports page. There, you can plot the amount of data uploaded to Ethica servers for a given set of data sources, from a given set of participants, during a certain time range, aggregated over a specific time interval, as shown in the image below:
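If you have already downloaded raw records, the same kind of aggregation can be reproduced offline. The sketch below counts records per fixed-length interval; it assumes each record carries a timestamp you have parsed into a `datetime`, and the function name and sample values are illustrative rather than anything Ethica provides.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_per_interval(timestamps, start, interval):
    """Count records in consecutive buckets of length `interval`, beginning at `start`."""
    counts = Counter()
    for ts in timestamps:
        if ts < start:
            continue  # ignore records before the reporting window
        # Integer division of two timedeltas gives the bucket index.
        bucket_start = start + ((ts - start) // interval) * interval
        counts[bucket_start] += 1
    return dict(sorted(counts.items()))


# Illustrative timestamps only.
window_start = datetime(2021, 1, 1)
records = [
    datetime(2021, 1, 1, 0, 10),
    datetime(2021, 1, 1, 0, 50),
    datetime(2021, 1, 1, 2, 5),
]
for bucket, n in count_per_interval(records, window_start, timedelta(hours=1)).items():
    print(bucket.isoformat(), n)
```

Intervals with no records are simply absent from the result, which makes gaps in data collection easy to spot.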

Ethica Participation Data Quantity Report

Note that you need to interpret this graph based on the type of data source it represents. For example, the sampling rate of motion sensors such as the accelerometer is around 20 hertz, which generates approximately 345k records per day per participant, while GPS generates a maximum of 18k records per day per participant. In contrast, some data sources, such as the pedometer or app usage statistics, only generate data when a relevant event occurs (e.g. the participant is walking or is using an app on her phone). You can read more about the attributes and characteristics of each data source here.
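To relate a sampling rate to a daily record count, it helps to do the arithmetic explicitly. The sketch below is a rough check, not a description of how Ethica schedules sensors: the `active_fraction` parameter is an assumption introduced here. Continuous 20 Hz sampling would give about 1.7 million records per day, so the 345k figure above is consistent with the sensor recording roughly one fifth of the time; that fraction is inferred from the numbers alone.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def expected_records_per_day(rate_hz: float, active_fraction: float = 1.0) -> int:
    """Records per day for a sensor sampled at `rate_hz` while active.

    `active_fraction` (assumed, not documented) is the share of the day
    during which the sensor is actually recording.
    """
    return round(rate_hz * SECONDS_PER_DAY * active_fraction)


def coverage(observed_records: int, expected_records: int) -> float:
    """Observed records as a fraction of the expected daily count."""
    return observed_records / expected_records if expected_records else 0.0


print(expected_records_per_day(20))       # continuous sampling: 1728000
print(expected_records_per_day(20, 0.2))  # one fifth of the time: 345600
```

Comparing a participant's observed daily count against the expected figure gives a quick, approximate sense of how complete her sensor data is.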

Survey Sessions

When a participant joins your study, Ethica schedules the times at which she is expected to receive surveys throughout her participation (only for surveys with a time-triggered triggering logic). Initially, the status of all these survey sessions is pending, as they are expected to be triggered in the future. As time goes on and the participant responds to these surveys or cancels them, or they expire, Ethica updates their status to reflect the change.
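The life cycle described above can be pictured as a small state model. This is only an illustration of the described behavior, not Ethica's implementation; the names are hypothetical, and the rule that a closed session never changes status again is an assumption.

```python
from enum import Enum


class SessionStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    CANCELED = "canceled"
    EXPIRED = "expired"


# Assumption: only a pending session can change status, and it closes exactly once.
CLOSED_STATUSES = {SessionStatus.COMPLETED, SessionStatus.CANCELED, SessionStatus.EXPIRED}


def close_session(current: SessionStatus, outcome: SessionStatus) -> SessionStatus:
    """Move a pending session to one of the three closed statuses."""
    if current is not SessionStatus.PENDING:
        raise ValueError(f"session already closed as {current.value}")
    if outcome not in CLOSED_STATUSES:
        raise ValueError(f"{outcome.value} is not a closing status")
    return outcome


status = SessionStatus.PENDING
status = close_session(status, SessionStatus.EXPIRED)
print(status.value)  # expired
```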

The Survey Sessions page allows you to see the schedule and the status of past and future surveys. For past surveys, the list includes all survey sessions, whether they were user-triggered, time-triggered, triggered by close proximity to beacons, or triggered in any other way. For future surveys, the list includes only the sessions that are expected to be triggered at a specific time. The image below shows survey sessions for 3 participants. Each participant started at a different time. That's why the first individual has gone through all her surveys up to the end of the 9th day, while the 3rd individual hasn't reached the 9th day yet.

Ethica Participation Survey Sessions

Each block on this page shows a survey session. If you hover over a block with your mouse, Ethica shows a pop-up describing when the survey was prompted and when it was closed, whether as completed, expired, or canceled. For example, in the image above, the survey was prompted on Monday at 9 am and was answered by 9:50 in the participant's local time.

Downloading Raw Data

You can download data collected for your study in raw format from the Researcher Dashboard. To do so, for survey responses, navigate to Survey -> Responses, select one or more participants, and one or more surveys, and then click on the Download button. This will open a menu asking whether you want to download the survey responses as CSV or JSON. The download will start when you click on your desired format:

Ethica Analysis Download Raw Data
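Once you have a CSV file, you can load it with standard tools. The sketch below uses Python's built-in `csv` module to group rows by participant. The column names (`participant_id`, `survey_id`, `question_id`, `answer`) and the sample rows are placeholders, so check the header of your own export and adjust them accordingly.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a downloaded file; the column names are hypothetical.
SAMPLE_CSV = """participant_id,survey_id,question_id,answer
P001,S1,Q1,Walking
P001,S1,Q2,3
P002,S1,Q1,Sitting
"""


def responses_by_participant(csv_file, id_column="participant_id"):
    """Group CSV rows (as dicts) by the value in `id_column`."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        grouped[row[id_column]].append(row)
    return dict(grouped)


grouped = responses_by_participant(io.StringIO(SAMPLE_CSV))
for participant, rows in grouped.items():
    print(participant, len(rows))
```

For a real export, replace `io.StringIO(SAMPLE_CSV)` with `open("your_export.csv", newline="")`, where the file name is whatever you saved the download as.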

To download the data collected from other data sources, navigate to Sensor Data -> Data Export, and click on New to request a new data export:

Ethica Analysis Download Raw Data

In the Export Data dialog, you can select the data source you want to export, the file format (defaults to CSV), the participants whose data you want to export, and a time range. Submitting the dialog sends a request to Ethica to export the data, compress it, and make it available for download. You can track the status of the export on the Data Export page.

Ethica Visualizations

Ethica offers a few basic visualizations that let you explore the collected data without downloading it.

To plot location data collected via GPS, navigate to the Sensor Data -> Geo-Location page. On this page, you can choose a set of participants, a time range, and how the location data should be plotted. You can either plot the data as a heatmap:

Ethica Analysis Location Heatmap

Or as a temporal graph:

Ethica Analysis Location Temporal Graph

You can also view the responses submitted by each participant. To do so, navigate to the Surveys -> Responses page, select one or more participants and surveys, and view the responses:

Ethica Accessing Survey Responses

Or you can plot them on a map. This feature puts a marker at the location where each survey was answered. Clicking on a marker shows the content of the response:

Ethica Accessing Survey Response Map

Kibana Integration

Behind the scenes, Ethica stores your study data in a data storage system called Elasticsearch. Ethica's integration with Kibana allows you to access your data directly, read it, download it, or create dynamic graphs and data dashboards with it.

The following image shows survey responses submitted to one of Ethica's sample studies. You can search the data and select the fields you want to see in the list:

Ethica Accessing Raw Data through Kibana

You can also create different types of visualizations based on your data. For example, the following image shows a bar chart of the types of activities reported by each participant throughout one of Ethica's sample studies:

Ethica Visualizing Data using Kibana

You can further put these visualizations together and create a data dashboard. Kibana will automatically connect to Ethica servers and update the dashboard with the most recent data:

Ethica Creating Data Dashboards in Kibana

Even if the graph you are looking for is not provided by default in Kibana, you can use Vega and Vega-Lite to define your own visualization. For example, the following image shows a map that plots the locations captured from a given participant when she responded to surveys and reported being at work:

Ethica Visualizing Data using VEGA, VEGA-Lite, and Kibana
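To give a feel for what such a definition looks like, here is a minimal sketch that builds a Vega-Lite specification plotting points by latitude and longitude and prints it as JSON. It uses inline sample values rather than a connection to your study data, and the field names `lat` and `lon` and the coordinates are made up for illustration. How the spec is wired to your data inside Kibana depends on your Kibana version and is not shown here.

```python
import json


def location_map_spec(points):
    """Build a Vega-Lite spec that draws one circle per point on a Mercator projection.

    `points` is a list of dicts with hypothetical `lat` and `lon` keys.
    """
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": points},
        "projection": {"type": "mercator"},
        "mark": {"type": "circle", "size": 40},
        "encoding": {
            "longitude": {"field": "lon", "type": "quantitative"},
            "latitude": {"field": "lat", "type": "quantitative"},
        },
    }


# Illustrative coordinates only.
sample_points = [
    {"lat": 52.13, "lon": -106.67},
    {"lat": 52.14, "lon": -106.64},
]
spec = location_map_spec(sample_points)
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into the online Vega editor to preview the result before adapting it to your own fields.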

You can learn more about using Kibana here.