Volume 11 - Issue 1
Application of Deep Learning on the Characterization of Tor Traffic using Time based Features
- Clayton Johnson
Colorado Mesa University, Grand Junction, Colorado, USA
cpjohnson@mavs.coloradomesa.edu
- Bishal Khadka
Colorado Mesa University, Grand Junction, Colorado, USA
bkhadka2@mavs.coloradomesa.edu
- Ethan Ruiz
Colorado Mesa University, Grand Junction, Colorado, USA
elruiz@mavs.coloradomesa.edu
- James Halladay
Colorado Mesa University, Grand Junction, Colorado, USA
jehalladay@mavs.coloradomesa.edu
- Tenzin Doleck
Simon Fraser University, Burnaby, CA
tdoleck@sfu.ca
- Ram Basnet
Colorado Mesa University, Grand Junction, Colorado, USA
rbasnet@coloradomesa.edu
Keywords: Tor traffic, deep learning, machine learning, traffic identification, encrypted traffic
Abstract
The Onion Router (Tor) is a popular anonymity network, widely used by political dissidents and
cyber criminals alike. Tor aims to circumvent government censorship and surveillance of individuals
by concealing a message’s sender, receiver, and content. This work compares the performance
of various traditional machine learning algorithms (e.g., Random Forest, Decision Tree, k-Nearest
Neighbors) and Deep Neural Networks in detecting Tor traffic on the ISCXTor2016 time-based dataset.
The research examines two scenarios: Scenario A aims to detect Tor traffic, while Scenario B
aims to classify Tor traffic into one of eight categories. The algorithms
trained on Scenario A demonstrate high performance, with classification accuracies above 99% in most
cases. In contrast, Scenario B yields a wider range of classification accuracies (40-82%); the Random
Forest and Decision Tree algorithms outperform k-Nearest Neighbors and Deep Neural Networks.
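
The two scenarios differ only in how flows are labeled: Scenario A is a binary task, while Scenario B is an eight-class task. The sketch below illustrates that framing with a minimal k-Nearest Neighbors classifier written in plain Python. It is not the authors' implementation. The feature vectors are synthetic stand-ins rather than ISCXTor2016 data, and the helper names (make_flows, knn_predict, accuracy), the cluster centers, and the choice of k = 3 are all assumptions made for illustration.

```python
import random
from collections import Counter


def make_flows(n_per_class, centers, spread, rng):
    """Draw synthetic 'time-based' feature vectors around one center per class.

    Hypothetical stand-in data; NOT taken from ISCXTor2016.
    """
    X, y = [], []
    for label, center in enumerate(centers):
        for _ in range(n_per_class):
            X.append([rng.gauss(c, spread) for c in center])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Scenario A (assumed layout): two classes, e.g. 0 = non-Tor, 1 = Tor.
centers_a = [(0.0, 0.0), (10.0, 10.0)]
train_a = make_flows(40, centers_a, 1.0, rng)
test_a = make_flows(10, centers_a, 1.0, rng)
acc_a = accuracy(*train_a, *test_a)

# Scenario B (assumed layout): eight traffic categories with closer centers.
centers_b = [(float(i), float(i)) for i in range(8)]
train_b = make_flows(40, centers_b, 1.0, rng)
test_b = make_flows(10, centers_b, 1.0, rng)
acc_b = accuracy(*train_b, *test_b)

print(f"Scenario A (binary) accuracy on synthetic data: {acc_a:.2f}")
print(f"Scenario B (8-class) accuracy on synthetic data: {acc_b:.2f}")
```

The printed accuracies describe only the synthetic clusters and say nothing about the results reported in this work. The point of the sketch is that the same classifier and evaluation code serve both scenarios once the label set changes.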