Ethica Kibana Integration

Kibana is an open-source data visualization tool that can create graphs and charts from large amounts of data. It allows you to explore the data and create visualizations such as bar charts, line charts, scatter plots, maps, and many more. You can further combine these visualizations to create interactive dashboards.

In this document, we describe how you can use Kibana to explore and visualize your Ethica study data. This document does not intend to teach you how Kibana, Elasticsearch, or Lucene work. There are already many online resources and training videos for these technologies. We suggest you review them to better understand how you can use Kibana for your work.

For each study you create in Ethica, a set of data tables is created in a data storage system called Elasticsearch. These tables are also called indices or index patterns. Kibana allows you to access these data tables, query them, read the data, and visualize them. To access Kibana, go to the Researcher Dashboard, and from the left-side menu click on Kibana. This will take you to a page similar to the image below:

Kibana first page

There are 4 important sections in the above image:

Number 1 lists the data tables for your study. Your study may not have data in all of these tables; this depends on which data sources and which activities you have added to your study.

Number 2 shows the time range over which you are querying the data. By default, the time range just shows the data for the past 15 minutes, so there is a chance that you do not see any data. You can increase this time range to load more data.

Number 3 lists the data fields that are available in the current data table. This list is different depending on the data table that you have selected. For example, for GPS it includes the location and the speed of the movement, while for Pedometer it includes the number of steps taken.

Number 4 allows you to filter your data. You can put a filter on any data field that exists in the current data table. For example, assuming that you are looking at the Pedometer data table, you can filter data for those who have taken 100 steps or more.
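As a rough illustration of the Pedometer filter described above, the sketch below builds a filter expression in Lucene query-string syntax, which can be typed into the Kibana query bar (depending on your Kibana version, you may first need to switch the query language to Lucene). The helper name lucene_range_filter is hypothetical and is not part of Ethica or Kibana; only the steps field name comes from the Pedometer data table.

```python
def lucene_range_filter(field, minimum=None, maximum=None):
    """Build a Lucene query-string range clause, e.g. 'steps:>=100'.

    Hypothetical helper for illustration; it only formats text.
    """
    clauses = []
    if minimum is not None:
        clauses.append(f"{field}:>={minimum}")
    if maximum is not None:
        clauses.append(f"{field}:<={maximum}")
    if not clauses:
        raise ValueError("provide at least one bound")
    # Lucene joins several conditions with AND.
    return " AND ".join(clauses)


# Records where the participant has taken 100 steps or more.
print(lucene_range_filter("steps", minimum=100))
```

The printed expression is what you would paste into the query bar; the time range (Number 2) still applies on top of it.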

Ethica does not store all data sources in Elasticsearch. At the moment, only the following data sources are stored in Elasticsearch and are therefore accessible through Kibana:

  • GPS (gps index)
  • Wi-Fi (wifi index)
  • Pedometer (pedometer index)
  • Motion-based Activity Recognition (mb_activity_rec index)
  • Battery (battery index)
  • Screen State (screen_state index)
  • Bluetooth (bluetooth index)
  • Bluetooth Beacons (beacon index)
  • App Usage Statistics (app_usage index)
  • Call & SMS logs (telephonycomms index)
  • Participant Audit Logs (participant_history index)
  • Survey Responses (survey_responses index)
  • Stroop Responses (stroop_responses index)
  • Time Use Diary Responses (time_use_responses index)

Below you can read the details for each of these data sources, including available fields and their types.

Ethica clusters host the services for Kibana and Elasticsearch. Therefore, your study data does not leave Ethica servers for any of the procedures described here.

Available Data Sources

This section describes the available data sources and the fields each data source has in Kibana. The name of the field is necessary to access the field's data in Kibana. The field's type specifies the type of the data being stored in it. You can refer to Elasticsearch's field datatypes for more details about each type. You can also check the Data Sources section for a detailed definition of each field.

Common Fields

The following data fields are available in each of the data sources described below:

Study ID
Field Name: study_id
Type: Integer
Description: The ID of the study this record belongs to.

Participant ID
Field Name: user_id
Type: Integer
Description: The ID of the participant who provided this record.

Device ID
Field Name: device_id
Type: Keyword
Description: The ID of the device which provided this record. Each participant can own multiple devices during the course of the study, and each device will have a unique ID. Ethica uses this ID to tag all records coming from the same device. The ID remains the same even when the user uninstalls and reinstalls the Ethica app on their phone.

Record Time
Field Name: record_time
Type: Date
Description: The time at which this record was captured. For survey responses, record_time is identical for all responses in the same session, and it represents the time when the user pressed the Submit button (or equivalent) to finish responding to the survey.

Relative Record Time
Field Name: rel_record_time
Type: Date
Description: The number of milliseconds between the start of the participant's participation period and the time this record was captured. This field is particularly useful for studies with rolling enrollment, where each participant starts the study at a potentially different date. For example, 0 indicates the record was captured right at the start time, 1 indicates the record was captured 1 ms after the start time, and so on. Note that this field is typed as Date, so Ethica shows it as milliseconds past the Unix epoch (Jan. 1st, 1970). If you plan to query the data based on this field, you need to set the time range relative to that date.
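Because rel_record_time stores an offset in milliseconds but is typed as Date, an offset of N days into the study appears as a date N days after Jan. 1st, 1970. The following sketch shows the conversion in both directions. The function names are hypothetical, and the dates are computed in UTC; Kibana may display them in your browser's time zone instead.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def rel_ms_to_date(rel_ms):
    """Date that corresponds to a rel_record_time offset in milliseconds."""
    return EPOCH + timedelta(milliseconds=rel_ms)


def study_days_to_time_range(first_day, last_day):
    """Time range covering study days first_day..last_day (day 0 = start)."""
    start = EPOCH + timedelta(days=first_day)
    end = EPOCH + timedelta(days=last_day + 1)  # exclusive upper bound
    return start, end


# A record captured exactly 3 days after the participant started.
print(rel_ms_to_date(3 * 24 * 60 * 60 * 1000).isoformat())

# Time range to select for the first week of each participant's study.
start, end = study_days_to_time_range(0, 6)
print(start.date(), end.date())
```

With this range selected on rel_record_time, all participants are aligned on their own start date regardless of when they enrolled.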

Survey Responses

For survey responses, Ethica stores each response to a given question as a separate record. Therefore, a given survey session can contain multiple records. For example, assume your survey contains 5 questions, from question ID 1 to 5. Every time a participant responds to your survey, 5 new records will be added to this index, one for each question (assuming the participant has responded to all questions).

Also, note that not every record contains all the fields specified here. If a given record does not have a given field, it means the field was not relevant for that record. For example, if a survey response is of type text, the record will contain answer_content, but it will not contain answer_url.
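Since each answer is a separate record, analysis outside Kibana often starts by regrouping records into sessions. The sketch below is one way to do that, assuming the records have already been exported as Python dictionaries keyed by the field names listed in this section (the export step and the sample values are hypothetical). It relies on record_time being identical for all responses in the same session, and uses .get() because not every record carries every field.

```python
from collections import defaultdict


def group_into_sessions(records):
    """Group per-question records into sessions.

    Returns {(user_id, survey_id, record_time): {q_id: answer_content}}.
    """
    sessions = defaultdict(dict)
    for rec in records:
        key = (rec["user_id"], rec["survey_id"], rec["record_time"])
        # answer_content may be absent, e.g. for non-text responses.
        sessions[key][rec["q_id"]] = rec.get("answer_content")
    return dict(sessions)


# Hypothetical sample: one participant, two sessions of the same survey.
sample = [
    {"user_id": 7, "survey_id": 3, "record_time": "2020-01-01T10:00:00", "q_id": 1, "answer_content": "Yes"},
    {"user_id": 7, "survey_id": 3, "record_time": "2020-01-01T10:00:00", "q_id": 2, "answer_content": "No"},
    {"user_id": 7, "survey_id": 3, "record_time": "2020-01-02T09:30:00", "q_id": 1, "answer_url": "photo.jpg"},
]

for key, answers in group_into_sessions(sample).items():
    print(key, answers)
```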

Index name: survey_responses

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Survey ID                  survey_id           Integer
Question Set ID            questionset_id      Integer
Response Duration          duration            Integer    In minutes.
Scheduled Time             scheduled_time      Date       For Time- and Proximity-Triggered sessions, this shows the time the survey was automatically triggered. For User Triggered sessions, this shows the time the survey was started by the participant.
Prompt Time                prompt_time         Date       Same as Scheduled Time.
Response Time              resp_time           Date       The time this response was provided.
Iteration                  iteration           Integer
Loop Count                 loop_count          Integer
Question ID                q_id                Integer
Question Content           q_content           Text
Question Type              q_type              Keyword
Answer ID                  answer_id           Integer
Answer Content             answer_content      Text
Answer URL                 answer_url          Keyword
Location                   location            Geo Point
Location Accuracy          location_accu       Double
Location Speed             location_speed      Double
Preferred Unit             pref_unit           Keyword
Selector                   selector            Keyword

GPS

Index name: gps

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Provider                   provider            Keyword
Satellite Time             satellite_time      Date
Location                   location            Geo Point
Speed                      speed               Double
Accuracy                   accu                Double
Altitude                   alt                 Double
Bearing                    bearing             Double
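The location field is a Geo Point, that is, a latitude/longitude pair. As a sketch of what can be done with it outside Kibana, the function below computes the great-circle (haversine) distance between two GPS records. It assumes each location has been exported as a dictionary with lat and lon keys in degrees, which is one of the forms Elasticsearch accepts for Geo Point values; the export format Ethica actually uses is not specified here, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(a, b):
    """Great-circle distance in meters between two {'lat', 'lon'} points."""
    lat1, lon1 = math.radians(a["lat"]), math.radians(a["lon"])
    lat2, lon2 = math.radians(b["lat"]), math.radians(b["lon"])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# One degree of longitude along the equator is roughly 111 km.
print(round(haversine_m({"lat": 0.0, "lon": 0.0}, {"lat": 0.0, "lon": 1.0})))
```

Summing this distance over consecutive records of one participant, ordered by record_time, gives an estimate of the distance travelled.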

Wi-Fi

Index name: wifi

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
SSID                       ssid                Keyword
BSSID                      bssid               Keyword
Access Point Capabilities  capabilities        Keyword
Frequency                  freq                Integer
Level                      level               Integer

Pedometer

Index name: pedometer

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Steps                      steps               Integer
Accuracy                   accu                Double
Distance                   distance            Double
Average Active Pace        avg_active_pace     Double
Current Cadence            cur_cadence         Double
Current Pace               cur_pace            Double
Duration                   duration            Integer
Floor Ascended             floor_ascended      Double
Floor Descended            floor_descended     Double

Motion-based Activity Recognition

Index name: mb_activity_rec

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Activity Type              activity_type       Integer
Confidence Level           confidence_level    Integer

Battery

Index name: battery

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Level                      level               Integer
Scale                      scale               Integer
Status                     status              Integer
Plugged                    plugged             Integer
Temperature                temperature         Integer
Voltage                    voltage             Integer
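The level and scale fields are not described in detail here. If they follow the Android battery convention, where level is the current charge and scale is the maximum possible level, the charge percentage can be derived as in the sketch below. This interpretation is an assumption, and the function name is hypothetical; check the Data Sources section before relying on it.

```python
def battery_percent(level, scale):
    """Charge percentage, ASSUMING level is reported relative to scale."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return 100.0 * level / scale


print(battery_percent(45, 100))
print(battery_percent(3, 4))
```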

Screen State

Index name: screen_state

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
Screen State               state               Boolean
End Time                   end_time            Date
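As a sketch of how these fields might be combined, the function below totals screen-on time for a set of records. It makes three assumptions that this document does not confirm: that record_time marks the start of a screen state and end_time its end, that a state of true means the screen was on, and that the dates have been exported as ISO 8601 strings. The function name and sample values are hypothetical.

```python
from datetime import datetime


def screen_on_seconds(records):
    """Total seconds with the screen on (ASSUMES state=True means 'on')."""
    total = 0.0
    for rec in records:
        if not rec["state"]:
            continue  # skip intervals where the screen was off
        start = datetime.fromisoformat(rec["record_time"])
        end = datetime.fromisoformat(rec["end_time"])
        total += (end - start).total_seconds()
    return total


# Hypothetical sample: on for 90 s, off for 210 s, on for 30 s.
sample = [
    {"state": True, "record_time": "2020-01-01T10:00:00+00:00", "end_time": "2020-01-01T10:01:30+00:00"},
    {"state": False, "record_time": "2020-01-01T10:01:30+00:00", "end_time": "2020-01-01T10:05:00+00:00"},
    {"state": True, "record_time": "2020-01-01T10:05:00+00:00", "end_time": "2020-01-01T10:05:30+00:00"},
]
print(screen_on_seconds(sample))
```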

Bluetooth

Index name: bluetooth

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
MAC Address                mac                 Keyword
Device Name                dev_name            Keyword
Device Class               dev_class           Keyword
RSSI                       rssi                Integer

Bluetooth Beacons

Index name: beacon

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
MAC Address                mac                 Keyword
Device Name                dev_name            Keyword
Device Class               dev_class           Integer
Payload                    payload             Long
Team ID                    team_id             Integer
Role ID                    role_id             Integer
Subject ID                 subject_id          Integer
RSSI                       rssi                Integer

App Usage Statistics

Index name: app_usage

Index fields:

Name                       Field Name          Type       Description
Participant ID             user_id             Integer
Device ID                  device_id           Keyword
Record Time                record_time         Date
Relative Record Time       rel_record_time     Date
App Name                   app_name            Keyword
Start Time                 start_time          Date
End Time                   end_time            Date
Last Used                  last_used           Date
Foreground Time            foreground_time_ms  Integer    In milliseconds.
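Because foreground_time_ms is in milliseconds, it usually needs converting before it is reported. The sketch below totals foreground time per app in minutes, assuming the records have been exported as dictionaries keyed by the field names above and that the values of different records do not overlap in time (neither point is confirmed by this document). The function name and the sample app names are hypothetical.

```python
from collections import Counter


def foreground_minutes_by_app(records):
    """Sum foreground_time_ms per app_name and convert to minutes."""
    totals_ms = Counter()
    for rec in records:
        totals_ms[rec["app_name"]] += rec["foreground_time_ms"]
    return {app: ms / 60_000 for app, ms in totals_ms.items()}


# Hypothetical sample records for one participant.
sample = [
    {"app_name": "com.example.mail", "foreground_time_ms": 120_000},
    {"app_name": "com.example.maps", "foreground_time_ms": 30_000},
    {"app_name": "com.example.mail", "foreground_time_ms": 60_000},
]
print(foreground_minutes_by_app(sample))
```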