On Inferring Browsing Activity on Smartphones via USB Power Analysis Side-Channel
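※ This post is my analysis of the paper above.
Why I chose this paper
I am working on related research and wanted to see in more detail how this study detects browser activity: which user activities it detects, what attack technique it uses, and how the environment is set up. I picked it as a related paper to fill the gaps in my own research and to understand the topic better.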
I. Contributions and Findings
To fully characterize the side channel associated with USB power consumption, the authors analyzed how webpage identification accuracy is affected by variables pertinent to mobile devices
e.g., battery charging level, wireless connection (WiFi or LTE), and taps on the screen.
Other variables examined: availability of browser cache, training and testing signals collected on different smartphones, the time elapsed between the collection of training and testing signals, geographical proximity between the user and the web server, duration of power traces, and availability of encrypted (TLS) connections.
a) Impact of Battery Charge Level and Taps on Screen:
Reduced webpage identification accuracy.
- Charging at 30% level :
Identification accuracy decreased, compared to when the battery was fully charged.
However, even with the decrease in accuracy, it was still possible to reliably infer browsing information.
- Taps on the screen :
Added significant noise to the power traces, making webpage identification challenging.
Results show that this factor caused a significant degradation in identification performance.
b) Impact of Other Variables:
- Time elapsed between collection of training and testing traces :
Training traces older than 30 days led to a significant drop in identification accuracy.
This suggests that traces used to train the classifiers should be updated frequently to improve the attack’s success rate.
- WiFi vs. LTE connectivity :
Webpages could be reliably identified under both WiFi and LTE, even when the training power traces were collected using one type of connectivity (e.g., LTE) and the testing traces were obtained with another (e.g., WiFi).
(this is not related to my research.)
- Using different smartphones for training and testing :
Accuracies dropped significantly.
However, using two smartphones for training and a different smartphone for testing reduced the drop in webpage identification accuracy.
- When the user did not tap on the screen :
Enabling browser cache improved identification accuracy.
However, for power traces collected when the user tapped on the screen and while the smartphone was charging, enabling cache led to a decrease in webpage identification accuracies.
- Increasing the geographical distance between the smartphone and the host serving the webpages :
Reduced identification accuracy.
Webpages were divided into foreign (hosted outside the continental United States) and local (hosted within the United States).
- Webpages hosted locally had slightly higher identification accuracies than foreign-hosted webpages.
- Retrieving webpages via secure connections (HTTPS or, more specifically, TLS) :
Did not have a measurable impact on identification accuracy.
c) Experiment Results:
They used machine learning algorithms to identify which webpage the user visited out of a closed set of fifty webpages. They were able to achieve identification accuracies as high as 98.8% with 2-second traces.
Even in the worst case, they achieved an identification accuracy of 54.2% with 6-second traces.
- i.e., when the cache was enabled, the user tapped on the screen, and the battery was charging from 30%,
When training and testing traces were collected using different smartphones, identification accuracy was at least 44.5%. This is significantly higher than choosing one out of fifty webpages at random (which leads to 2% baseline accuracy).
II. EXPERIMENT SETUP
- Attack Timing and attack model
Collected power traces while webpages were loading on two types of smartphones: Samsung Galaxy S4, and Samsung Galaxy S6.
- Used the homepages of the 50 most popular (non-adult) websites, based on Alexa ranking.
- To collect power traces during webpage loading :
The USB charging circuit was instrumented as shown in Figure 1.
The circuit connects a DC power supply, a smartphone, and a data acquisition card (DAQ).
It measures voltage variations (and therefore the corresponding power consumed) across a 0.1 Ω shunt resistor.
To satisfy the USB charging specifications, they connected the data pins of the USB cable using a 200 Ω resistor.
Most smartphones use lithium-ion (Li-ion) batteries due to their high energy density. The charging profile of Li-ion batteries encompasses two stages.
1) In the first stage, the smartphone charging circuit applies constant current to the battery.
This stage ends when the battery reaches a specific charging voltage (usually between 3.7 V and 4.2 V).
2) In the second stage, the battery is charged at a constant voltage, and the current gradually decreases until it reaches a termination value. Because the battery charging process could take several hours, the current used to charge the battery does not vary significantly while the smartphone is loading a webpage.
They used an Agilent E3630A DC power supply as the power source and measured the voltage drop across the shunt resistor using a National Instruments USB-6211 DAQ at a sampling rate of 200 kHz. They set the power supply to output a fixed voltage of 5.5 V. This voltage is higher than the nominal USB voltage of 5 V to compensate for the voltage drop introduced by the shunt resistor.
The resulting voltage was between 5.32 V and 5.48 V, which is within the tolerance of many modern smartphones [22]. The DAQ’s data output was connected to a laptop, which stored data for offline analysis using LabVIEW.
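The measurement chain above boils down to Ohm's law on the shunt. Below is a minimal sketch of that conversion, assuming the shunt is the only significant series drop between the supply and the phone (cable and connector resistance are ignored). The constant and function names are my own, not from the paper. Under that assumption, the reported 5.32 V to 5.48 V range corresponds to shunt drops of 0.18 V to 0.02 V.

```python
import numpy as np

R_SHUNT_OHMS = 0.1   # shunt resistor value from the setup
V_SUPPLY = 5.5       # fixed output of the DC power supply, in volts


def shunt_to_power(v_shunt):
    """Convert shunt voltage drops (V) to current (A) and phone-side power (W).

    Assumes the shunt is the only series drop between supply and phone.
    """
    v_shunt = np.asarray(v_shunt, dtype=float)
    current = v_shunt / R_SHUNT_OHMS   # Ohm's law: I = V / R
    v_phone = V_SUPPLY - v_shunt       # voltage remaining after the shunt
    power = v_phone * current          # P = V * I at the phone side
    return current, v_phone, power


# Shunt drops implied by the 5.48 V and 5.32 V endpoints
current, v_phone, power = shunt_to_power([0.02, 0.18])
```

In practice the DAQ samples would be passed in as one long array per trace, so the same function turns a raw voltage trace into a power trace.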
Figure 2 shows the power consumption traces collected while loading the homepages of google.com and youtube.com.
Collected power traces in two modes: user-actuated, and automated.
- User-actuated traces: the user initiates webpage loading by typing a URL in Mobile Chrome’s address bar.
- Automated traces: they developed an Android application that launches the Chrome browser and uses it to load the intended webpage. It allows 10 seconds for webpage loading (only the first 6 seconds of data were recorded), and then loads the next webpage.
Before each measurement
-> Closed all other applications on the smartphone
-> Set the screen brightness to a constant level.
Collected under two conditions:
a. battery level (30% vs. 100%)
b. browser cache (enabled vs. disabled).
They chose these conditions because they impact smartphone energy consumption.
When the battery is fully charged, almost all power from the charger is used to load webpages.
In contrast, when the smartphone is charging, a sizable (almost constant) amount of power is used to charge the battery, hence affecting the traces.
Cache availability was chosen because cache misses increase network activity, and therefore radio activity. Retrieving data wirelessly requires more energy than loading it from local flash memory.
Collected 40 automated traces per webpage for each of the following combinations:
30% battery, cache; 30% battery, no cache; 100% battery, cache; 100% battery, no cache.
Used four Samsung Galaxy S4 devices.
To analyze the impact of different smartphone models on the attack, also used a Samsung Galaxy S6.
III. WEBPAGE IDENTIFICATION
Webpage identification process consists of training and testing phases.
In the training phase:
(1) Extracted frequency-domain features from the power traces;
(2) Trained a classifier (Random Forest [26]) on the extracted feature vectors.
The notes below cover classification, feature extraction, and trace segmentation.
Classifier Training and Testing:
They used Random Forest to classify power traces because in their experiments it outperformed other commonly used classifiers, such as SVM and Dynamic Time Warping (DTW).
They used the WEKA implementation of Random Forest.
They experimented with four training-testing scenarios.
a. The first involved 40 power traces per webpage, collected using automated webpage loading;
This scenario is used when training and testing are performed with data from the same smartphone.
b. Trained the classifier using all 40 automatically-collected traces, and performed testing with 10 traces collected via user-actuated page loading. This scenario was used when the data was collected with user taps.
c. They trained the classifier using 40 traces per webpage, collected using automated webpage loading, on one smartphone device; then used 40 traces collected from a different smartphone device for testing.
d. They trained the classifier using 80 traces from two smartphones (40 traces from each device), and tested on 40 traces from a different smartphone.
Feature Extraction:
Each power trace was transformed to its corresponding frequency-domain representation using the Fast Fourier Transform (FFT).
To reduce the impact of noise on individual frequencies, the frequency range was divided into equal-size bins.
- The authors settled on 125 bins.
Figure 3 shows the result of feature extraction on the data in Figure 2. Each data point in Figure 3 represents a feature.
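The feature extraction step can be sketched with NumPy as follows. The notes above only say that the frequency range is split into 125 equal-size bins, so several details here are my assumptions rather than the paper's method: the mean is removed before the FFT, the DC component is dropped, each bin is summarized by its mean magnitude, and leftover frequency points that do not fill a whole bin are discarded. The function name `fft_bin_features` is hypothetical.

```python
import numpy as np


def fft_bin_features(trace, n_bins=125):
    """Return one feature per frequency bin for a 1-D power trace.

    Assumptions (not from the paper): mean removal, DC bin dropped,
    mean magnitude per bin, trailing remainder discarded.
    """
    trace = np.asarray(trace, dtype=float)
    spectrum = np.abs(np.fft.rfft(trace - trace.mean()))  # magnitude spectrum
    spectrum = spectrum[1:]                               # drop the DC component
    if len(spectrum) < n_bins:
        raise ValueError("trace too short for the requested number of bins")
    per_bin = len(spectrum) // n_bins                     # frequency points per bin
    binned = spectrum[: per_bin * n_bins].reshape(n_bins, per_bin)
    return binned.mean(axis=1)                            # one value per bin


# Synthetic 0.5-second trace at the 200 kHz sampling rate: a 1 kHz sine wave
fs = 200_000
t = np.arange(fs // 2) / fs
features = fft_bin_features(np.sin(2 * np.pi * 1000 * t))
```

Averaging inside each bin is what gives the noise reduction mentioned above: a single noisy frequency has little influence on the mean of a few hundred neighbouring frequencies.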
Trace Segmentation and Voting:
Variable network conditions, web-server load, and smartphone background applications introduce intermittent noise in power traces. To mitigate the effects of noise, each trace was divided into overlapping 0.5-second segments.
Feature extraction was performed on each segment, and the classifier was trained using segments from all traces.
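A minimal sketch of the segmentation and voting idea, using plain Python. The amount of overlap is not given in these notes, so the 0.25-second step (50% overlap) is an assumption, and `split_segments` and `vote` are hypothetical helper names. The per-segment labels would come from the trained classifier; here they are passed in directly.

```python
from collections import Counter


def split_segments(trace, fs, seg_seconds=0.5, step_seconds=0.25):
    """Split a trace into overlapping fixed-length segments.

    step_seconds is an assumed value; the overlap is not stated in the notes.
    """
    seg_len = int(round(seg_seconds * fs))
    step = int(round(step_seconds * fs))
    last_start = len(trace) - seg_len
    return [trace[i:i + seg_len] for i in range(0, last_start + 1, step)]


def vote(segment_labels):
    """Return the label predicted for the most segments of one trace."""
    # On a tie, Counter keeps the label that was seen first.
    return Counter(segment_labels).most_common(1)[0][0]


# Toy example: 16 samples at 8 Hz -> 4-sample segments every 2 samples
segments = split_segments(list(range(16)), fs=8)
trace_label = vote(["google", "youtube", "google"])
```

Voting over segments means that a burst of noise only corrupts the few segments it overlaps, instead of the whole trace.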
Evaluation of Identification Performance:
To evaluate classifier performance, calculated Rank 1 and Rank 5 identification accuracies.
- Rank 1, a trace is classified correctly if the most popular label assigned to the trace’s segments is the correct label for the trace
- Rank 5, consider a trace as correctly classified if the correct label appears within the 5 most popular labels
- For each rank, also present the Normalized Rank-n Accuracy, which is defined as follows
(Let p_n be the probability that the classifier correctly labels a trace for Rank-n. The Normalized Rank-n Accuracy is p_n/n: the probability that the adversary guesses the correct website label when choosing among the n labels in the classifier's Rank-n output.)
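The Rank-n accuracy and its normalized form described above can be computed from per-segment predictions as in the sketch below. The data layout (a list of pairs of segment labels and true label) and the function names are my own choices for illustration.

```python
from collections import Counter


def rank_n_hit(segment_labels, true_label, n):
    """True if the true label is among the n most frequent segment labels."""
    top_n = [label for label, _ in Counter(segment_labels).most_common(n)]
    return true_label in top_n


def rank_n_accuracy(traces, n):
    """Return (p_n, p_n / n) for a list of (segment_labels, true_label) pairs."""
    hits = sum(rank_n_hit(labels, truth, n) for labels, truth in traces)
    p_n = hits / len(traces)
    return p_n, p_n / n


# Toy data: four traces with made-up segment predictions
traces = [
    (["a", "a", "b"], "a"),
    (["b", "b", "a"], "a"),
    (["c", "c", "b"], "a"),
    (["a", "b", "b", "c"], "b"),
]
rank1 = rank_n_accuracy(traces, 1)
rank2 = rank_n_accuracy(traces, 2)
```

The normalization makes ranks comparable: a Rank 5 result only helps the adversary if p_5 is more than five times p_1.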
A. Identification Accuracy on Automated Dataset
- Trace Duration: Increasing the duration of the traces improved identification accuracy, although two-second traces were already sufficient for high-confidence classification on the automated dataset.
- Caching: Results show that enabling cache improved identification accuracy (see Table I).
Enabling cache improved identification accuracy for foreign-hosted websites more than for websites located within the United States. This is because the farther the host serving the content, the more network-related noise is added to the traces.
- Battery Level: Users connect their smartphones to charging ports at various battery levels. The authors were consistently able to classify traces with higher accuracy when the battery was fully charged.
If the phone is charging, a substantial amount of the available current is directed to the battery, and therefore the fluctuations in power consumption due to webpage loading are limited.
B. Identification Accuracy on User-Actuated Dataset
Once user activity in the form of taps was included, identification accuracy dropped significantly due to tap-induced noise. This is because tap characteristics (e.g., tap location on the screen, timing, and duration) are different in each trace, which leads to noisy traces.
To validate this observation, the authors computed the average (intra-class) Dynamic Time Warping (DTW) distance between pairs of user-actuated traces, and between pairs of automated traces, under different caching and charging conditions.
While two seconds were sufficient to classify webpages with high confidence using automated traces, this was not the case with traces from the user-actuated dataset. Regardless of caching and charging, they achieved good Rank 1 accuracy with six-second traces.
DTW (Dynamic Time Warping) is used to compare how similar two time series are.
Why use DTW?
- Similarity can be compared even when the two time series have different lengths
- Similarity can be compared even when the patterns are similar but offset in time (a shift)
[Reference]
DTW basic explanation and practice code (blog.kubwa.co.kr)
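To make the two properties above concrete, here is a minimal sketch of the textbook DTW recurrence with absolute difference as the local cost. It is not taken from the paper or the referenced blog, and the function name is my own. It runs in O(n·m) time, so real 200 kHz traces would need downsampling or a windowed variant first.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = best cumulative cost aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # a advances, b stays
                cost[i, j - 1],      # b advances, a stays
                cost[i - 1, j - 1],  # both advance
            )
    return float(cost[n, m])


# Same shape, shifted by one sample and with different lengths
shifted = dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])
# Constant offset that warping cannot remove
offset = dtw_distance([0, 0, 0], [1, 1, 1])
```

A small intra-class DTW distance means repeated loads of the same page produce similar traces, which is why the authors use it to show that taps make traces of the same page less alike.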
Overall
Their experiments show that although the presence of taps substantially reduces identification accuracy compared to automated collection of power traces, it is still possible to accurately classify six-second user-actuated traces.
V. IMPACT OF OTHER VARIABLES
Examined the identification accuracies according to the following variables:
(1) different smartphones used for training and testing (training traces were collected from one or more smartphones that are not used for testing);
(2) LTE and WiFi training and testing;
(3) aging of training traces;
(4) domestic vs. foreign websites
(5) websites accessible via unencrypted connections (denoted as “HTTP”) vs. accessible through TLS-encrypted links (denoted as “HTTPS”).
A. Training and Testing Traces From Different Devices
- using different smartphones for training and testing : led to a significant drop in identification accuracy.
- On the other hand, by training on two devices, and testing on a third, they were able to achieve identification accuracies above 80% with 6-second traces. ...
This is likely because the classifier generalizes better when trained on multiple devices, which account for more variety within the traces.
B. Training and Testing Using WiFi and LTE
They collected power traces while accessing websites over both WiFi and LTE and experimented with three training-testing configurations: (1) LTE training and LTE testing, (2) LTE training and WiFi testing, and (3) WiFi training and LTE testing.
Results (see Table III) show that accuracy obtained when training and testing on LTE is comparable to that of training and testing on WiFi (in Table I).
C. Aging of Training Traces
Many of the webpages considered in this work contain content that changes over time.
To determine the impact of aging on training data, they collected testing traces 32 and 70 days after training, with cache enabled and a fully charged battery.
Identification accuracy dropped significantly with the aged training traces, which suggests that training traces should be updated frequently to keep identification accuracy high.
D. Foreign vs. Domestic Websites
Tested this variable because the distance between the client and the host serving a webpage is known to affect packets’ delay and jitter.
The farther the host serving the content, the more variable will be its measured bandwidth and delay. In turn, this variability affects page loading, and hence the corresponding power traces.
-> Experiments show that the location of the host serving a webpage has a very small impact on identification accuracy.
E. HTTPS vs. HTTP Websites
Tested this variable because the use of encryption between the smartphone and the server can introduce noise in power traces.
TLS requires additional communication rounds to exchange TLS session keys before a connection can be established. This can potentially increase the variability of power traces.
Results show that there is no significant difference in identification accuracy between the two types of websites. This indicates that the attack is as effective for identifying securely transmitted webpages as with webpages transmitted without encryption.
CONCLUSION AND FUTURE WORK
In this paper, they demonstrated that it is possible to accurately infer browsing activity on a smartphone using USB power consumption measurements.
This work is the first to study this side-channel attack on smartphones, and to analyze a multitude of factors that affect the traces that are collected during the attack, such as:
battery charging level,
user interaction with the touchscreen,
trace length,
time between collection of training and testing traces,
WiFi and LTE connectivity,
training and testing device mismatch,
website characteristics such as type of connection (HTTP or HTTPS),
and location of the host serving the webpage relative to the smartphone.
Overall, results show that the attack is highly effective, because webpage loading generates power signatures that are:
(1) distinctive:
different webpages generate different power traces due to factors such as the amount of data (text, images, and videos) being retrieved, the number of TCP connections required to retrieve all webpage components, and the computational cost of the scripts running within the webpage;
(2) consistent:
each time a particular page is loaded, it generates a power trace that is similar to its previous power traces.
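While doing my own research I have been finding that setting up environment variables and measuring results is harder than I expected. This paper let me consider a wide range of variables and see how the authors handled them, which made it an interesting read.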