
Location

GPS

Supported in Android & iOS.

This data source measures the precise location of the device using GPS sources.

Each GPS record includes the following:

Satellite Time: The time at which this GPS record was received. Internally stored as satellite_time.

Provider: Indicates how this GPS record was acquired. It can contain one of the following nine options:

  • GPS: means the Avicenna app explicitly requested the user's location and received it from GPS satellites.
  • Network: means the Avicenna app explicitly requested the user's location and received it from nearby cell towers and Wi-Fi access points (only in Android).
  • Fused: means the Avicenna app is using the fused location provider API, a location API of Google Play services. This API intelligently combines different signals to provide location information.
  • GPS-Passive: means other apps running on the participant's device requested location from GPS satellites, and Avicenna received a copy as well (only in Android).
  • Network-Passive: means other apps running on the participant's device requested location from nearby cell towers and Wi-Fi access points, and Avicenna received a copy as well (only in Android).
  • Fused-Passive: means other apps running on the participant's device requested location from the fused location provider API, and Avicenna received a copy as well.
  • GPS-Reuse: means the app detected the participant did not move since the last GPS reading, and therefore the GPS records for the previous cycle are reused.
  • Network-Reuse: means the app detected the participant did not move since the last reading, and therefore the records of the previous cycle provided by the network provider are reused.
  • Fused-Reuse: means the app detected the participant did not move since the last reading, and therefore the records of the previous cycle provided by the fused provider are reused.

The provider is internally stored as provider.

Latitude: The latitude of this record, in degrees. Internally stored as lat.

Longitude: The longitude of this record, in degrees. Internally stored as lon.

Altitude: The altitude of this record, in meters above the WGS 84 reference ellipsoid. Internally stored as alt.

Speed: The speed of this record, in meters/second over the ground. Internally stored as speed.

Bearing: The bearing of this record, in degrees. Internally stored as bearing.

Accuracy: The accuracy of this reading, in meters. Internally stored as accu.

Required Permissions

When participants join a study that requires GPS, the Avicenna app will request permission to access their location data. Participants will have three options:

  • Decline the permission request.
  • Grant the permission, but only while the app is open and in use.
  • Grant permission once.

If the participant declines this permission, or grants it only while the app is in use, Avicenna will not collect any GPS data and will show the participant a notification that the missing GPS permission is interrupting their study participation.

If the participant chooses "grant permission once," Avicenna is allowed to collect GPS data for a limited time, likely one or two days. This period is determined by the Android or iOS operating system. After this period, the operating system automatically revokes the permission, and Avicenna asks the participant for permission again. The second time Avicenna asks for the GPS permission, in addition to the 3 options listed above, the participant will have a fourth option, "Allow Always". Choosing this option grants Avicenna permanent permission to access GPS data until the participant explicitly revokes it.

Data collection continues as long as the participant is actively participating in the study. Participants can revoke the GPS permission at any time, turn off their device's GPS (accessible from the home screen of most smartphones), or simply terminate the Avicenna app. In all of these cases, Avicenna will not collect any GPS data and will show the participant a notification that their study participation has been interrupted. The missing data will be visible to researchers on the Participation Page of the Researcher Dashboard. At any time, participants can turn on their GPS, grant the missing permissions, or restart the app. Each of these actions resumes data collection in Avicenna immediately and removes the related notifications from the app.

Data Collection Behavior

Compared to other data sources, monitoring GPS requires considerable power and can drain a participant's battery rapidly. To reduce this impact, the GPS data source in Avicenna uses several methods to collect as much data as possible while keeping resource consumption very low. Understanding these methods helps you interpret and analyze the collected GPS data.

Collecting GPS Records

Avicenna starts a new GPS collection cycle every 5 minutes. In each cycle, it collects GPS data until it has read 3 accurate data points or 60 seconds have passed, whichever comes first.
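The cycle described above can be sketched as a bounded loop. This is a minimal illustration, not Avicenna's actual implementation: the read_fix callable, the injectable clock, and the 20-meter accuracy threshold are all assumptions made for this example, since the documentation does not state what counts as an "accurate" data point.

```python
import time

MAX_COLLECTION_S = 60          # from the text: at most 60 seconds per cycle
TARGET_ACCURATE_POINTS = 3     # from the text: stop after 3 accurate points
ACCURACY_THRESHOLD_M = 20.0    # assumption: the real threshold is not documented


def collect_cycle(read_fix, now=time.monotonic):
    """Collect fixes until 3 accurate ones are read or 60 seconds elapse.

    read_fix is a hypothetical callable returning a dict with an "accu"
    field (meters), or None if no fix is available yet.
    """
    start = now()
    accurate = []
    while len(accurate) < TARGET_ACCURATE_POINTS and now() - start < MAX_COLLECTION_S:
        fix = read_fix()
        if fix is not None and fix["accu"] <= ACCURACY_THRESHOLD_M:
            accurate.append(fix)
    return accurate
```

A scheduler would call collect_cycle once every 5 minutes. Passing the clock as a parameter keeps the sketch testable without waiting in real time.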

Reusing GPS Records When Detecting Stationary State

Collecting fresh GPS data is resource-intensive. That's why at the beginning of each cycle, before Avicenna starts the GPS data collection, it checks whether it can confidently conclude that the participant has been stationary since the last GPS reading. If so, the app simply reuses the GPS records from the last cycle. To detect whether the participant is stationary, Avicenna uses motion-based activity recognition (MBAR) and Wi-Fi data.

In both Android and iOS, Avicenna considers the participant as stationary if all of the following conditions are met since the last cycle:

If both of the above are True, the app assumes the device has been stationary during the last cycle.

In addition to MBAR data, Avicenna on Android uses Wi-Fi data as well. Based on Wi-Fi data, Avicenna considers a device stationary if all the following conditions are true:

  • At least 3 Wi-Fi networks were detected in proximity (based on BSSID) in the previous cycle.
  • At least 3 Wi-Fi networks are detected in proximity (based on BSSID) in the current cycle.
  • The Wi-Fi network sets for the current cycle and the previous cycle have at least 30% similarity.

If all of the above 3 conditions are met, Avicenna for Android concludes that Wi-Fi data suggest the participant has been stationary.
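The three Wi-Fi conditions can be sketched as a small set comparison. The documentation does not say how "similarity" is measured, so this example assumes Jaccard similarity (shared BSSIDs divided by all distinct BSSIDs); the function name is likewise hypothetical.

```python
MIN_NETWORKS = 3        # from the text: at least 3 networks in each cycle
MIN_SIMILARITY = 0.30   # from the text: at least 30% similarity


def wifi_suggests_stationary(prev_bssids, curr_bssids):
    """Return True if the Wi-Fi scans of two consecutive cycles look alike."""
    prev, curr = set(prev_bssids), set(curr_bssids)
    if len(prev) < MIN_NETWORKS or len(curr) < MIN_NETWORKS:
        return False
    # Assumption: Jaccard similarity; the actual metric is not documented.
    similarity = len(prev & curr) / len(prev | curr)
    return similarity >= MIN_SIMILARITY
```

For instance, with previous BSSIDs {a, b, c} and current BSSIDs {a, b, d}, two of four distinct networks are shared, which gives a similarity of 0.5 under this metric.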

In each cycle, the Avicenna app skips collecting new GPS data and reuses data from the previous cycle if enough GPS data was collected in the previous cycle and at least one of the following conditions is met:

  • MBAR data exists and shows the participant has been stationary.
  • (Android only) Wi-Fi data exists and shows the participant has been stationary.

In this case, Avicenna finds the best GPS reading from the previous cycle, makes a copy of it, updates the provider value of the copied record by appending a -reuse to it, updates the record_time of the record to the current time, and uploads it as the GPS reading for this cycle. Note that in this case:

  • The satellite_time still refers to the time the data was collected originally, but the record_time refers to the time the stationary state was detected by the app, and the previous records were reused.
  • Only one record is uploaded, which is the most accurate reading from the cycle where fresh GPS data was collected.
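The reuse decision and the construction of the reused record can be sketched as follows. This is an illustration under stated assumptions: "enough GPS data" is taken to mean at least one record, None stands for a signal that is unavailable, and both function names are hypothetical.

```python
import copy


def should_reuse(prev_cycle_records, mbar_stationary, wifi_stationary=None):
    """Decide whether to skip fresh collection for this cycle.

    mbar_stationary / wifi_stationary are True, False, or None (no data).
    Assumption: "enough GPS data" means at least one previous record.
    """
    if not prev_cycle_records:
        return False
    return mbar_stationary is True or wifi_stationary is True


def make_reuse_record(prev_cycle_records, now_ms):
    """Copy the most accurate previous record and mark it as reused."""
    # A smaller accu value (meters) means a more accurate reading.
    best = min(prev_cycle_records, key=lambda r: r["accu"])
    reused = copy.deepcopy(best)
    # Assumption: the suffix is not appended twice when a reused record
    # is reused again in a later cycle.
    if not reused["provider"].endswith("-reuse"):
        reused["provider"] += "-reuse"
    reused["record_time"] = now_ms  # satellite_time is left untouched
    return reused
```

Copying the record instead of mutating it keeps the original reading available for later cycles.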

For example, consider the following GPS record:

{
  "study_id": 805,
  "user_id": 26988,
  "device_id": "a28bfb9b278sa99f",
  "record_time": 1603882030265,
  "rel_record_time": 14412439335,
  "provider": "gps-reuse",
  "satellite_time": 1603842281000,
  "location": {
    "lat": 32.693238,
    "lon": -97.175872
  },
  "speed": 0,
  "accu": 3.7900924682617188,
  "alt": 175.14569091796875,
  "bearing": 0
}

It shows that a GPS record was collected at 1603842281000 (i.e., Tuesday, October 27, 2020 23:44:41 UTC, identified by satellite_time), had an accuracy of about 3.8 meters, and was reused at 1603882030265 (i.e., Wednesday, October 28, 2020 10:47:10.265 UTC, identified by record_time). In this example, the GPS record was collected about 11 hours before it was reused. During these 11 hours, the Avicenna app sent the same "reused" GPS record once for each cycle.
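The two timestamps in this example are milliseconds since the Unix epoch. A short sketch of how they can be converted and compared during analysis (the helper name ms_to_utc is only for illustration):

```python
from datetime import datetime, timedelta, timezone


def ms_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc) + timedelta(milliseconds=ms % 1000)


collected = ms_to_utc(1603842281000)   # satellite_time
reused = ms_to_utc(1603882030265)      # record_time
age = reused - collected               # how old the reading was when reused

print(collected.isoformat())
print(reused.isoformat())
print(age)
```

For reused records, the difference between record_time and satellite_time tells you how long the participant was considered stationary at that location.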

It is also worth mentioning that the GPS data reuse described above only works if the study includes the MBAR and/or Wi-Fi data source, and the participant has granted the necessary permissions. Otherwise, no MBAR or Wi-Fi data will be available to Avicenna's GPS component, and the app will record fresh GPS data in each cycle.

Passive Data Collection in Android

Android allows apps to listen for GPS records passively. It means an app like Avicenna does not have to actively ask for fresh GPS records. Instead, it can sign up to receive a copy of the GPS record that is requested by other apps (given that Avicenna holds required permissions to collect GPS data). This way, Avicenna can collect GPS data without causing additional battery consumption.

Note that this behavior does not interfere with Avicenna's periodic GPS data collection. Avicenna collects GPS data periodically. The provider, in this case, is set to gps, network, fused, gps-reuse, network-reuse, or fused-reuse. It also listens to and records GPS records requested and received by other apps. The provider, in this case, is set to gps-passive, network-passive or fused-passive.

For example, assume the participant is navigating from Point A to Point B using Google Maps, and the navigation takes 1 hour. During this hour, Avicenna receives every GPS record requested by Google Maps and records them as passive data. It also collects its own GPS data (fresh data, as the person is moving and is not stationary) and stores it alongside the passive data.

It is worth mentioning that while GPS records collected by Avicenna are still collected in 5-minute intervals, you might find passive GPS records captured between intervals as well.
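When analyzing the data, it can help to split the provider value into its source and the way the record was obtained. The sketch below assumes the lowercase, hyphenated values listed above (such as gps, network-passive, or fused-reuse); the function name and the "active" label for periodic fresh records are choices made for this example.

```python
SOURCES = {"gps", "network", "fused"}
MODES = {"": "active", "passive": "passive", "reuse": "reuse"}


def parse_provider(provider):
    """Split a provider value such as "gps-reuse" into (source, mode)."""
    source, _, suffix = provider.lower().partition("-")
    if source not in SOURCES or suffix not in MODES:
        raise ValueError(f"unexpected provider value: {provider!r}")
    return source, MODES[suffix]


print(parse_provider("gps-reuse"))        # ('gps', 'reuse')
print(parse_provider("network-passive"))  # ('network', 'passive')
print(parse_provider("fused"))            # ('fused', 'active')
```

Filtering on the mode makes it easy to separate the 5-minute periodic records from the passive records captured between intervals.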

Mobility Mode

Avicenna trained a machine-learning model named "GPS mobility mode classification" to detect the mobility mode of GPS data points.

The algorithm preprocesses the GPS data to clean it, remove noise, and extract kinematic features. Then, it predicts the mobility mode of GPS data points using the extracted features. We first define the terminology and then explain the steps of the algorithm in detail.

GPS Trajectory: A sequence of time-stamped GPS points for one user.

GPS Segment: A GPS segment is a subdivision of a user’s trajectory, which is traveled by only one mobility mode (e.g., stationary, walking, driving).

The steps of the algorithm are as follows:

1. Cleaning and Feature Extraction

  • Remove data points whose latitude or longitude is outside the acceptable range.
  • Remove duplicated data points.
  • Downsample the data to a 5 s interval: where more than one data point falls within a 5 s interval, keep one point and remove the rest.
  • Calculate kinematic features (i.e., distance, speed, acceleration, jerk, bearing, and bearing rate).
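The cleaning and feature extraction steps can be illustrated with a simplified sketch. It is not Avicenna's actual pipeline: it assumes flat records with a time t in seconds, treats latitude in [-90, 90] and longitude in [-180, 180] as the acceptable range, keeps the first point of each 5 s interval, uses the haversine formula for distance, and computes only distance and speed.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def clean(points, bucket_s=5):
    """Drop out-of-range points and keep one point per 5 s interval."""
    seen_buckets = set()
    kept = []
    for p in sorted(points, key=lambda p: p["t"]):
        if not (-90 <= p["lat"] <= 90 and -180 <= p["lon"] <= 180):
            continue
        bucket = int(p["t"] // bucket_s)
        if bucket in seen_buckets:
            continue  # same interval; this also removes exact duplicates
        seen_buckets.add(bucket)
        kept.append(p)
    return kept


def add_features(points):
    """Add distance_fe (m) and speed_fe (m/s) relative to the previous point."""
    out, prev = [], None
    for p in points:
        q = dict(p)
        if prev is None:
            q["distance_fe"], q["speed_fe"] = 0.0, 0.0
        else:
            dt = p["t"] - prev["t"]  # > 0 because points fall in distinct intervals
            q["distance_fe"] = haversine_m(prev["lat"], prev["lon"], p["lat"], p["lon"])
            q["speed_fe"] = q["distance_fe"] / dt
        out.append(q)
        prev = p
    return out
```

Acceleration, jerk, and bearing rate follow the same pattern: each is the difference of the previous quantity between consecutive points divided by the elapsed time.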

2. Segmentation

  • Segment each GPS trajectory into GPS segments using a trained machine-learning model.
  • Split GPS segments into smaller ones at points where the travel time between two consecutive GPS points exceeds 20 minutes.
  • Split GPS segments into smaller ones at points where the number of GPS data points exceeds a predefined value (i.e., 50).
  • Finally, small GPS segments with fewer than 5 data points are skipped and labeled with mobility_mode = N/A.
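The rule-based parts of this step (the 20-minute gap, the 50-point cap, and the 5-point minimum) can be sketched as below. The machine-learning segmentation itself is not reproduced here. How a long segment is divided at the 50-point limit is not documented, so cutting it into consecutive chunks of at most 50 points is an assumption, as are the function names and the t field in seconds.

```python
MAX_GAP_S = 20 * 60   # split where consecutive points are more than 20 minutes apart
MAX_POINTS = 50       # split segments that exceed 50 points
MIN_POINTS = 5        # segments with fewer points get mobility_mode = N/A


def split_segment(points):
    """Split one time-ordered segment by time gap, then by size."""
    chunks, current = [], []
    for p in points:
        if current and p["t"] - current[-1]["t"] > MAX_GAP_S:
            chunks.append(current)
            current = []
        current.append(p)
    if current:
        chunks.append(current)
    # Assumption: oversized chunks are cut into consecutive pieces of <= 50 points.
    return [c[i:i + MAX_POINTS] for c in chunks for i in range(0, len(c), MAX_POINTS)]


def is_classifiable(segment):
    """Segments below the minimum size are skipped by the classifier."""
    return len(segment) >= MIN_POINTS
```

For example, 60 points spaced 5 seconds apart followed, after a long pause, by 3 more points would yield segments of 50, 10, and 3 points, and only the last one would be labeled N/A.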

3. Mobility Mode Classification

  • In the end, Avicenna utilizes its trained machine-learning model to predict the mobility mode of each GPS segment and appends the new data fields to the raw GPS data source. The model classifies each GPS segment to one of stationary, walking, and driving modes (internally stored as stationary, walk, and car).

The machine-learning algorithm adds the following data fields to the GPS data source:

Distance-Feature Extraction: The distance of this record from the previous record, calculated by our feature extraction algorithm. Internally stored as distance_fe.

Speed-Feature Extraction: The speed of this record, in m/s, calculated by our feature extraction algorithm. Internally stored as speed_fe.

Acceleration-Feature Extraction: The acceleration of this record, in m/s², calculated by our feature extraction algorithm. Internally stored as acceleration_fe.

Jerk-Feature Extraction: The jerk of this record, in m/s³, calculated by our feature extraction algorithm. Jerk is the rate at which an object's acceleration changes with respect to time. Internally stored as jerk_fe.

Bearing-Feature Extraction: The bearing of this record, in degrees, calculated by our feature extraction algorithm. Internally stored as bearing_fe.

Bearing rate-Feature Extraction: The bearing rate of this record, which is the rate of change of bearing with respect to time, calculated by our feature extraction algorithm. Internally stored as bearing_rate_fe.

Segment ID: Unique identifier of a GPS segment in a GPS trajectory. Internally stored as segment_id. GPS trajectories with fewer than 5 data points are excluded from segmentation and mobility mode classification steps, and their segment_id field is labeled as N/A. Note that the segment id is unique among data points of a GPS trajectory with unique (study_id, user_id, device_id) during a given month.

Mobility mode: The mobility mode of this record (e.g. stationary, walk, car) predicted by our machine-learning model. Internally stored as mobility_mode. GPS segments with fewer than 5 data points are skipped in our mobility mode classification and their mobility_mode field is labeled as N/A.

Wi-Fi

Supported in Android.

Monitors Wi-Fi signals in the surrounding environment. This data source scans different frequency channels and records the Wi-Fi networks available.

Each Wi-Fi record includes the following:

SSID: Service Set Identifier, or the name of the network. Internally stored as ssid.

BSSID: Basic Service Set Identifier, or the address of the access point in proximity. Internally stored as bssid.

Capabilities: Describes the authentication, key management, and encryption schemes supported by the access point. Internally stored as capabilities.

Frequency: The frequency (in MHz) of the channel over which the client is communicating with the access point. Internally stored as freq.

Level: The detected signal level in dBm, also known as the RSSI. Internally stored as level.

Data Collection Behavior

On Android devices, Avicenna collects Wi-Fi data in the following steps:

  • Avicenna asks the OS to scan and send a list of all SSIDs in proximity.
  • The OS sends the list, usually immediately.
  • Avicenna puts each SSID in one record.

After getting the first batch of SSIDs, Avicenna continues scanning for 1 minute, and the OS keeps sending new data points.

During data collection, the number of collected data records correlates with the proximity of the network. If a network is nearby, Avicenna will get many records for its SSID (often with slightly varying RSSI levels). If a network is far away, Avicenna collects fewer records, because the OS sees this network less often in its scans.

Note that Avicenna collects data points with different RSSI levels, frequencies, and BSSIDs for the same SSID and the same record time.
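Because one scan window can produce many records per SSID, it is often convenient to summarize them before analysis. The sketch below groups records by ssid and reports the record count, the distinct BSSIDs, and the mean level. It uses the field names listed above; the function name and the choice of summary statistics are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_ssid(records):
    """Group Wi-Fi records by SSID and summarize each group."""
    groups = defaultdict(list)
    for r in records:
        groups[r["ssid"]].append(r)
    return {
        ssid: {
            "count": len(rs),                             # more records suggests a closer network
            "bssids": sorted({r["bssid"] for r in rs}),   # one SSID may map to several access points
            "mean_level": mean(r["level"] for r in rs),   # average RSSI in dBm
        }
        for ssid, rs in groups.items()
    }
```

A high count together with a strong mean level is consistent with the proximity relationship described above, although this summary is only a heuristic.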