Side-Channel Attacks on Mobile Devices


Yan Michalevsky

Stanford University

What can side-channels enable?


Fingerprinting

Eavesdropping

Tracking

Mobile Device Identification via Sensor Fingerprinting

Hristo Bojinov, Gabi Nakibly, Yan Michalevsky, Dan Boneh

Internet services need reliable identifiers
for repeat visitors

  • Offering custom services
  • Fraud detection
  • User tracking
  • and more...

Existing identifiers

  • Android
    • Device ID - doesn't exist for devices without telephony capability
    • Serial number
    • ANDROID_ID: randomly generated 64-bit - value can change upon factory reset

Existing identifiers

  • iOS
    • UDID: IMEI - cannot be accessed on iOS
    • identifierForVendor - deleted when app uninstalled
    • advertisingIdentifier - reset upon device reset

Existing identifiers

  • MAC address - can be forged by user
  • Web cookies - can be deleted/blocked by user
  • Panopticlick: browser identification by configuration - doesn't work for mobile browsers

Robust identifiers from sensor fingerprints

  • Independent of software state
  • Survives hard reset
  • (Might depend on calibration data stored in the firmware)

We experimented with two sensor systems

  1. Speaker-microphone system
  2. Accelerometer

Gyrophone

Recognizing Speech from Gyroscope Signals

Yan Michalevsky (1), Gabi Nakibly (2), and Dan Boneh (1)

(1) Stanford University, (2) National Research and Simulation Center, Rafael Ltd.

Microphone access

Requires permissions

Gyroscope access

Does not require permissions

MEMS Gyroscopes

Major vendors:

  • STMicroelectronics (Samsung Galaxy)
  • InvenSense (Google Nexus)

Gyroscopes are susceptible to sound

70 Hz tone power spectral density

50 Hz tone power spectral density

Gyroscopes are (lousy, but still) microphones

  • Hardware sampling frequency:
    • InvenSense: up to 8000 Hz
    • STMicroelectronics: 800 Hz
  • Software sampling frequency:
    • Android: 200 Hz
    • iOS: 100 Hz

Gyroscopes are (lousy, but still) microphones

  • Very low SNR (Signal-to-Noise Ratio)
  • Acoustic sensitivity threshold: ~70 dB
    Comparable to a loud conversation.
  • Sensitive to sound angle of arrival
  • Directional microphone (due to 3 axes)

Browsers allow gyroscope access too

Problem: how do we look into higher frequencies?

Speech range

Adult male: 85 - 180 Hz
Adult female: 165 - 255 Hz

We can sense high frequency signals

Due to aliasing

The result of recording tones between 120 and 160 Hz on a Nexus 7 device
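
As a rough illustration of the aliasing effect, the sketch below computes where a pure tone lands in the sampled spectrum. The function name `aliased_frequency` is hypothetical, and the 200 Hz rate is an assumption taken from the Android software sampling rate quoted earlier. A real gyroscope recording would also contain noise and the sensor's own frequency response.

```python
def aliased_frequency(f_tone_hz: float, f_sample_hz: float) -> float:
    """Return the apparent frequency of a tone after sampling at f_sample_hz.

    A tone above the Nyquist frequency (f_sample_hz / 2) folds back into
    the range [0, f_sample_hz / 2].
    """
    f = f_tone_hz % f_sample_hz      # sampling cannot tell f from f + k * fs
    return min(f, f_sample_hz - f)   # fold the upper half onto the lower half


if __name__ == "__main__":
    fs = 200.0  # assumed: Android software sampling rate from the slides
    for tone in (120, 130, 140, 150, 160):
        print(f"{tone} Hz tone appears at {aliased_frequency(tone, fs):.0f} Hz")
```

Under this assumption, tones between 120 and 160 Hz show up between 80 and 40 Hz, so speech energy above the 100 Hz Nyquist limit is still visible, just folded.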

Experimental setup

  • Room. Simple speakers. Smartphone.
  • Subset of TIDigits speech recognition corpus
  • 10 speakers $\times$ 11 samples $\times$ 2 pronunciations = 220 total samples

Speech analysis using a single gyroscope

  • Gender identification
  • Speaker identification
  • Isolated word recognition

We can successfully identify gender


Nexus 4: 84% (DTW)
Galaxy S III: 82% (SVM)

Random guess probability is 50%

A good chance to identify the speaker


Nexus 4

  • Mixed female/male: 50% (DTW)
  • Female speakers: 45% (DTW)
  • Male speakers: 65% (DTW)

Random guess probability is 20% for one gender and 10% for a mixed set

Isolated word recognition

Speaker independent

Nexus 4

  • Mixed female/male: 17% (DTW)
  • Female speakers: 26% (DTW)
  • Male speakers: 23% (DTW)

Random guess probability is 9%

Isolated word recognition

Speaker dependent

65% (DTW)

Random guess probability is 9%

How can we leverage eavesdropping simultaneously on two devices?

Defenses

Software Defenses


  • Low-pass filter the raw samples
  • 0-20 Hz range should be enough for browser based applications (according to WebKit)
  • Access to high sampling rate should require a special permission
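
A minimal sketch of the low-pass idea, assuming a first-order IIR filter (exponential smoothing) applied to the raw samples before they reach an application. The function name `low_pass` and the choice of a first-order filter are assumptions for illustration. A real defense would likely need a steeper filter.

```python
import math


def low_pass(samples, f_sample_hz, f_cutoff_hz):
    """First-order IIR low-pass filter (exponential smoothing).

    The smoothing factor is derived from the RC time constant that
    corresponds to the requested cutoff frequency.
    """
    dt = 1.0 / f_sample_hz
    rc = 1.0 / (2.0 * math.pi * f_cutoff_hz)
    alpha = dt / (rc + dt)
    out = []
    prev = samples[0] if samples else 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)  # move a fraction alpha toward x
        out.append(prev)
    return out


if __name__ == "__main__":
    # A 100 Hz alternating signal sampled at 200 Hz, filtered at 20 Hz.
    noisy = [1.0, -1.0] * 100
    print(max(abs(v) for v in low_pass(noisy, 200.0, 20.0)[50:]))
```

Slow motion passes through almost unchanged, while content near the Nyquist frequency is attenuated to roughly a quarter of its amplitude in this configuration.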

Hardware Defenses


  • Hardware filtering of sensor signals
    (Not subject to configuration)
  • Acoustic masking
    (won't help against vibration of the surface)

PowerSpy

Location Tracking using Mobile Device Power Analysis

Yan Michalevsky (1), Gabi Nakibly (2), Dan Boneh (1), and Aaron Schulman (1)

(1) Stanford University, (2) National Research and Simulation Center, Rafael Ltd.

Smartphone location $\approx$ Owner location


Accessing location

Even coarse location based on cellular network information

Requires permissions

Reading voltage and current

Does not require permissions

  • /sys/class/power_supply/battery/voltage_now
  • /sys/class/power_supply/battery/current_now
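
The sketch below shows how an unprivileged process might turn these two files into a power reading. It assumes the files hold integers in microvolts and microamps, which is the usual convention of the Linux power_supply class. Some devices deviate from it, and the sign of the current also varies, so the code takes the absolute value. The function name `read_power_watts` is hypothetical.

```python
from pathlib import Path

# Directory quoted on the slide; availability varies by device and kernel.
BATTERY_DIR = Path("/sys/class/power_supply/battery")


def read_power_watts(battery_dir: Path = BATTERY_DIR) -> float:
    """Return instantaneous power computed from voltage_now and current_now.

    Assumes microvolts and microamps (an assumption, see the lead-in).
    """
    microvolts = int((battery_dir / "voltage_now").read_text().strip())
    microamps = int((battery_dir / "current_now").read_text().strip())
    # uV * uA = 1e-12 W; the sign of the current is device dependent.
    return abs(microvolts * microamps) / 1e12


if __name__ == "__main__":
    try:
        print(f"{read_power_watts():.3f} W")
    except OSError as exc:
        print(f"power files not readable on this system: {exc}")
```

Polling this function in a loop would yield the power profile time series used in the rest of the talk.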

$Power = f(Signal\ Strength)$

  • More power is used for transmission under low SNR
  • On the receive side: signal amplification and error correction
  • Verified experimentally in Bartendr [Schulman et al.]

Signal strength depends on geography and environment

Signal strength stability

Power profile consistency

Two phones of same model, same drive

Different models, same drive

What can we achieve with this?

  • Route distinguishability
  • Real-time motion tracking
  • New route inference

Route Distinguishability

Real-time tracking

along a known (or assumed) route

New route inference

Evaluation

Data processing

  • Standardization: $\frac{x - mean(x)}{std(x)}$
  • Smoothing: using a Moving Average filter (obtain general trends)
  • Downsampling (important for computation reduction)
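
The three steps above can be sketched as follows. The function names, the trailing form of the moving average, the use of the population standard deviation, and the default `window` and `factor` values are all assumptions for illustration. The slides do not specify them.

```python
from statistics import mean, pstdev


def standardize(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    m, s = mean(x), pstdev(x)
    if s == 0:
        return [0.0] * len(x)  # constant profile: nothing to scale
    return [(v - m) / s for v in x]


def moving_average(x, window):
    """Trailing moving average; early outputs average the samples seen so far."""
    out = []
    running = 0.0
    for i, v in enumerate(x):
        running += v
        if i >= window:
            running -= x[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def downsample(x, factor):
    """Keep every factor-th sample."""
    return x[::factor]


def preprocess(profile, window=5, factor=2):
    """Standardize, smooth, then downsample a power profile."""
    return downsample(moving_average(standardize(profile), window), factor)
```

Smoothing before downsampling keeps the general trend while limiting the aliasing that plain decimation would introduce.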

Distinguishing routes

Each power profile is a time-series

Classifier based on time series comparison using Dynamic Time Warping (DTW)

Dynamic Time Warping

Euclidean distance

DTW distance
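
To make the contrast concrete, here is a minimal sketch of both distances plus a nearest-neighbour route classifier built on DTW. It uses the textbook dynamic-programming formulation with an absolute-difference local cost. The names `dtw_distance` and `classify_route` are hypothetical, and the exact DTW variant and classifier used in the evaluation are not given in the slides.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; requires equal lengths and no time shift."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_route(profile, references):
    """Return the label whose reference profiles contain the closest match.

    references maps a route label to a list of reference power profiles.
    """
    return min(
        references,
        key=lambda label: min(dtw_distance(profile, r) for r in references[label]),
    )
```

DTW tolerates the same route being driven at different speeds, which a rigid point-by-point comparison penalizes heavily.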

We can distinguish between routes

Unique routes | # Ref. profiles/route | # Test routes | Success % | Random guess %
8             | 10                    | 55            | 85        | 13
17            | 5                     | 119           | 71        | 6
17            | 4                     | 136           | 68        | 6
21            | 3                     | 157           | 61        | 5
25            | 2                     | 182           | 53        | 4
29            | 1                     | 211           | 40        | 3

Real-time tracking

  1. Treat a window of received samples as a subsequence of the reference power profile
  2. Use subsequence DTW to determine the offset of that subsequence within the reference profile
  3. Infer the current location from the matching position in the reference profile
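
A minimal sketch of the subsequence matching step, assuming the standard subsequence-DTW formulation in which the match may start anywhere in the reference at no cost. The function name `subsequence_dtw_end` is hypothetical. It returns the index in the reference where the best match ends, which would then be mapped to a location recorded alongside the reference profile.

```python
def subsequence_dtw_end(window, reference):
    """Return (end_index, cost) of the best match of window inside reference.

    end_index is the position in reference aligned with the last sample
    of window, i.e. the estimated current position along the route.
    """
    inf = float("inf")
    m = len(reference)
    # Row 0 is all zeros so that a match may start anywhere in the reference.
    prev = [0.0] * (m + 1)
    for w in window:
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            d = abs(w - reference[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    best_j = min(range(1, m + 1), key=lambda j: prev[j])
    return best_j - 1, prev[best_j]


if __name__ == "__main__":
    reference = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(subsequence_dtw_end([4, 5, 6], reference))
```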

We can track along a route

We can track along a known route

And compensate for obvious errors...

New route inference

  • Points on map represented by nodes
  • Connecting road segments represented by edges
  • Probabilistic graphical model of location

Route inference based on road segments

Destination Localization

Route inference based on road segments

Exact Full Route Fit

Route inference based on road segments

Evaluation metric based on Levenshtein Distance

$d = 0.125$

$d = 0.25$

$d = 0.43$

Route inference based on road segments

Levenshtein Distance
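
As a sketch of how such a metric could be computed, the code below treats a route as a sequence of road-segment labels and divides the edit distance by the length of the longer route. That normalization is an assumption, chosen because it yields fractional values like the $d$ values shown. The slides do not state the exact definition, and both function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def normalized_route_distance(estimated, true_route):
    """Edit distance divided by the longer route's length (assumed normalization)."""
    longest = max(len(estimated), len(true_route))
    return levenshtein(estimated, true_route) / longest if longest else 0.0
```

With this definition, one wrong segment in an eight-segment route gives a distance of 0.125.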

Future work

  • Evaluation on larger datasets
    • More routes
    • More profiles per route
  • Improved tracking (Kalman filter?)
  • Improved route inference (HMM, Viterbi...)

Future work

  • Find better features
  • Current inference (from voltage)
  • State of Discharge (SOD) derivative as very coarse indicator
  • LTE
  • Choice of reference routes (time/condition based)

Defenses

Non-defenses

  • Adding noise
  • Limiting power sampling rate

Defenses

  • Secure hardware design
    • Exclude TX/RX chain from power measurement
  • Require superuser privileges to access power
  • Power consumption as a coarse location indicator

Conclusion

  • Giving applications direct access to hardware is dangerous
  • Permissions need to address sensor access
  • Hardware should not provide more than applications require (problematic)
  • Provide abstractions, not raw data [Jana et al.]

Thank you very much



Questions?



crypto.stanford.edu/gyrophone

It is possible to sample the gyroscope through JavaScript

Bias types

  • Linear bias: $v_m = S \cdot v_t + O$
  • Tolerance: non-linear, frequency response, etc.
  • Timing
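
For the linear model, one way to estimate $S$ and $O$ for an accelerometer axis is to take two readings with known true values, for example with the axis pointing along and against gravity. The sketch below assumes that two-orientation procedure and a hypothetical function name. The slides only state the model itself.

```python
def estimate_linear_bias(reading_up, reading_down, g=9.81):
    """Estimate sensitivity S and offset O in v_m = S * v_t + O.

    reading_up:   mean measured value with the true value v_t = +g
    reading_down: mean measured value with the true value v_t = -g
    """
    # Subtracting the two equations cancels O; adding them cancels S.
    sensitivity = (reading_up - reading_down) / (2.0 * g)
    offset = (reading_up + reading_down) / 2.0
    return sensitivity, offset


if __name__ == "__main__":
    print(estimate_linear_bias(10.15, -9.85))
```

The resulting pair $(S, O)$ is what serves as a per-device fingerprint under this model.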

Gyrophone Features

  • MFCC - Mel-Frequency Cepstral Coefficients
    • Statistical features are used (mean and variance)
    • delta-MFCC
  • Spectral centroid
  • RMS energy
  • STFT - Short-Time Fourier Transform
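
Two of the simpler features above can be sketched in a few lines. The code uses a naive DFT from the standard library to stay self-contained, where a real pipeline would use an FFT. The function names are hypothetical, and MFCC extraction is omitted because it needs a mel filterbank.

```python
import cmath
import math


def rms_energy(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0 .. n // 2."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, f_sample_hz):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * f_sample_hz / n * m for k, m in enumerate(mags)) / total


if __name__ == "__main__":
    fs = 200.0
    tone = [math.sin(2 * math.pi * 50 * t / fs) for t in range(200)]
    print(rms_energy(tone), spectral_centroid(tone, fs))
```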

Gender identification


  • Binary SVM with spectral features
  • DTW with STFT features
    • Window size: 512 samples - corresponds to 64 ms at an 8 kHz sampling rate

Speaker identification


  • Multi-class SVM and GMM with spectral features
  • DTW with STFT features (same as before)

Similar to time-interleaved ADCs

  • DC component removal
  • Normalization / use a reference signal
  • Background or foreground calibration

Non-uniform reconstruction requires knowing precise time-skews

Filterbank interpolation
based on Eldar and Oppenheim's paper

Practical compromise

Interleaving samples from multiple devices
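
A minimal sketch of the compromise, assuming each device delivers a time-ordered list of `(timestamp, value)` pairs on a shared clock. Each stream has its DC component removed, and the streams are then merged in time order. The function names and the data layout are assumptions for illustration. Gain normalization and time-skew estimation are left out.

```python
import heapq


def remove_dc(stream):
    """Subtract the stream's mean value from every sample."""
    if not stream:
        return []
    dc = sum(v for _, v in stream) / len(stream)
    return [(t, v - dc) for t, v in stream]


def interleave(streams):
    """Merge time-ordered (timestamp, value) streams into one ordered stream."""
    return list(heapq.merge(*streams, key=lambda sample: sample[0]))


if __name__ == "__main__":
    device_a = [(0.0000, 1.0), (0.0050, 3.0)]  # 200 Hz
    device_b = [(0.0025, 2.0), (0.0075, 4.0)]  # 200 Hz, offset by half a period
    print(interleave([remove_dc(device_a), remove_dc(device_b)]))
```

Two 200 Hz streams offset by half a sampling period behave roughly like one 400 Hz stream, although the spacing is not uniform in practice.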

Evaluation

(Tested for the case of speaker dependent word recognition)

Single device

Two devices



  • Exhibits improvement over using a single device
  • Using even more devices might yield even better results
  • Not a proper non-uniform reconstruction

Further attacks

  • Source separation
    • Use the 3 axes of the gyro
    • Learn the number of sound sources around
    • Use angle of arrival information for source separation
  • Ambient sound recognition
    • Is the user in a room/outdoors/on a street?

FAQ

  • Why didn't you use gyroscope/magnetometer fingerprints?
    Gyroscope offset measurement would require maintaining constant angular velocity.
    Hysteresis effects can disrupt magnetometer fingerprinting. Variability due to nearby magnetic fields.

FAQ

  • Did you experiment with an anechoic chamber?
    Yes, and did not find it beneficial at this stage.

FAQ

  • Perhaps the gyro actually measures the vibrations of the surface?
    Maybe, but tests suggest it's not only that. In any case it is still dangerous.

FAQ

  • Is it possible to use measurements from multiple devices in other ways?
    Yes. For example as in MIMO: EGC (Equal Gain Combining).
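
As a rough sketch of the equal-gain idea applied to sensor recordings, the code below averages several recordings sample by sample with equal weights. It assumes the recordings are already time-aligned and of equal length, which stands in for the co-phasing step of EGC. The function name is hypothetical.

```python
def equal_gain_combine(recordings):
    """Average time-aligned, equal-length recordings sample by sample."""
    if not recordings:
        raise ValueError("need at least one recording")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("recordings must have equal length")
    # Every device gets the same weight, hence "equal gain".
    return [sum(samples) / len(recordings) for samples in zip(*recordings)]
```

If the noise in each device is independent, averaging tends to reduce it while the common signal is preserved.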