Dataset produced from real Wireless Network logging
Full dataset
~ 160,000,000 Rows x 155 Columns
Reduced Training Set Used
~ 1.8 Million Rows x 155 Columns
Produced from 1 hour of logging
Majority of data is of “normal” class in either dataset
Project Goal
Build a classifier capable of properly classifying tuples with four specific attack types:
Amok
Deauthentication
Authentication Request
ARP
3 Major Tasks
Preprocessing/Cleaning
Feature Selection
Classification
About the Attacks
Deauthentication
A Denial of Service attack that uses unprotected deauthentication packets to spoof an entity. The attacker monitors traffic on a network to discover MAC addresses associated with specific clients. A deauthentication message is then sent to the access point on behalf of a particular MAC address, which forces that client off the network. The attacker then connects to the access point as the client that was previously disconnected.
Authentication Request
A type of flooding attack: “In this case the aggressor attempts to exhaust the AP’s resources by causing overflow to its client association table. It is based on the fact that the maximum number of clients which can be maintained in the client AP’s association table is limited and depends either on a hard-coded value on the AP or on its physical memory constraints. An entry on the AP’s client association table is inserted upon the receipt of an Authentication Request message even if the client does not complete its authentication (i.e., is still in the unauthenticated/unassociated state).” - Intrusion Detection in 802.11 Networks: Empirical Evaluation of Threats and a Public Dataset
Amok
Another flooding attack, similar to Authentication Request
ARP (Address Resolution Protocol)
“In computer networking, ARP spoofing, ARP cache poisoning, or ARP poison routing, is a technique by which an attacker sends (spoofed) Address Resolution Protocol (ARP) messages onto a local area network. Generally, the aim is to associate the attacker’s MAC address with the IP address of another host, such as the default gateway, causing any traffic meant for that IP address to be sent to the attacker instead.” - Wikipedia
Preprocessing/Cleaning
Starting Point
Wireshark Column Names
Adding Wireshark Column Names
[FILE: col_names.txt]
frame.interface_id
frame.dlt
frame.offset_shift
...
wlan.qos.buf_state_indicated
data.len
class
Replaced ‘?’ with NaN values, then dropped columns with over 60% NaN values
removed 7 columns
...
data=data.replace('?', np.nan)
...
# If over 60% of the values in a column are null, remove it
prev_num_cols = len(data.columns)
data.dropna(axis='columns', thresh=len(data.index) * 0.40, inplace=True)
print("Removed " + str(prev_num_cols - len(data.columns)) + " columns with over 60% NaN values.")
Dropped the columns in which a single constant value makes up over 50% of the entries
# Output the minimized and preprocessed dataset to a ZIP file
# (with no index column added)
data.to_csv(
    Path(resource_dir, 'preproc_dataset.zip'),
    sep=',',
    index=False,
    compression='zip')
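The constant-value filter mentioned before the export step is not shown in the snippet above. A minimal sketch of one way to read it follows, assuming "constant" means that the single most frequent value covers more than half of a column's rows. The function name `drop_dominant_columns`, the `max_share` parameter and the toy values are hypothetical, and plain Python is used so the example runs without any dependencies.

```python
from collections import Counter


def drop_dominant_columns(columns, max_share=0.50):
    """Keep only columns whose most common value covers at most max_share of the rows.

    `columns` maps a column name to its list of values (hypothetical layout).
    """
    kept = {}
    for name, values in columns.items():
        if not values:
            continue  # an empty column carries no information
        # Count of the single most frequent value in this column
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) <= max_share:
            kept[name] = values
    return kept


# Toy data for illustration only, not taken from the real logs
toy = {
    "frame.dlt": [127, 127, 127, 127],  # one value in 100% of rows
    "wlan.duration": [0, 44, 314, 0],   # most common value in 50% of rows
}
filtered = drop_dominant_columns(toy)
print(list(filtered))
```

With the threshold at exactly 50%, a column is removed only when its dominant value appears in strictly more than half of the rows, which matches the "over 50%" wording.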
Perform min-max normalization on attributes used for classification (range 0-1)
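Min-max normalization rescales each value as (x - min) / (max - min), so the smallest value in a column maps to 0 and the largest to 1. The sketch below illustrates that formula on a single column. It is not the project's own code: the function name and the sample values are hypothetical, and mapping a constant column to all zeros is an assumption made here to avoid dividing by zero.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column has no spread, so map it to 0.0
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Hypothetical sample of a numeric attribute such as wlan.duration
durations = [0, 44, 314, 157]
scaled = min_max_normalize(durations)
print(scaled)
```

The minimum and maximum would normally be computed on the training data and reused for the test data, so that both sets share the same scale.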
We examined the distinct values in the remaining columns and chose the columns that have more distinct values in the normal class than in the attack classes
Using a little SQL magic…
select count(DISTINCT(wlan_fc_moredata))
from AWID_REMOVED_NULL where class='normal';
select count(DISTINCT(wlan_fc_moredata))
from AWID_REMOVED_NULL where class='arp';
select count(DISTINCT(wlan_fc_moredata))
from AWID_REMOVED_NULL where class='amok';
select count(DISTINCT(wlan_fc_moredata))
from AWID_REMOVED_NULL where class='authentication_request';
select count(DISTINCT(wlan_fc_moredata))
from AWID_REMOVED_NULL where class='deauthentication';
select wlan_fc_moredata
from AWID_REMOVED_NULL where class='normal';
Then, chose the following 3 columns for our analysis:
wlan.fc.type
frame.time_delta_displayed
wlan.duration
Classification
Isolated the attack types
...
ATTACKTYPE <- "amok"
# Keep only the target class and the normal packets
wifiLog2 <- wifiLog2[wifiLog2$class == "normal" | wifiLog2$class == ATTACKTYPE, ]
wifiLog2$class <- as.character(wifiLog2$class)
wifiLog2$class[wifiLog2$class == "normal"] <- as.character("0")
wifiLog2$class[wifiLog2$class == ATTACKTYPE] <- as.character("1")
wifiLog2$class <- as.factor(wifiLog2$class)
...
Separate files to handle each attack type
Partitioned dataset into 66.6% training data and 33.3% test data
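The partitioning code itself is not shown here. As a rough illustration of a two-thirds / one-third split, the sketch below shuffles the rows with a fixed seed and slices off the test portion. Shuffling, the seed value, the function name and the toy rows are all assumptions made for this example, and it is written in plain Python rather than the R used for the classification scripts.

```python
import random


def split_train_test(rows, test_fraction=1 / 3, seed=42):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    shuffled = list(rows)  # copy so the caller's data keeps its order
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical rows already relabelled to "0" (normal) and "1" (attack)
rows = [{"wlan.duration": i, "class": "1" if i % 10 == 0 else "0"}
        for i in range(300)]
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))
```

Because the normal class dominates the data, a plain random split can leave very few attack rows in the test set. Splitting each class separately (a stratified split) is one way to keep the class proportions similar in both partitions.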