A novel intelligent approach for detecting DoS flooding attacks in software-defined networks

Software-defined networking (SDN) is a novel networking architecture that brings advanced programming capabilities, enabling more sophisticated network behavior through a dedicated control plane. Unlike traditional network architectures, the SDN architecture manages forwarding elements through a centralized controller that maintains a global view of the network. Consequently, this approach opens the door to developing more flexible network services and applications. On the other hand, the increased usage of the Internet requires us to protect all network devices involved in the communication process. A Denial of Service (DoS) attack is a major threat in which the intruder attempts to prevent legitimate users from accessing network resources. This study uses an artificial-intelligence-based approach to deal with DoS attacks by taking advantage of the central management provided by the SDN approach.


Introduction
The SDN approach is the fruit of many years' effort to create a programmable network managed by a centralized controller. Separating the control plane from the data plane requires an interface between the controller and the forwarding elements. OpenFlow is a vendor-independent interface that translates the high-level orders sent by the controller into low-level behaviors that can be understood by the switches, and it handles L2-L4 network flows. However, OpenFlow has to be extended in order to handle L5-L7 flows [1]. The control plane provides a global view of the network, which helps in achieving a more enhanced control mechanism for the forwarding plane. A basic SDN architecture is illustrated in Fig. 1.
In the SDN architecture, forwarding decisions are made based on flow entries. Each flow entry consists of: (i) header fields, used for matching an incoming packet with this flow, (ii) counters, which provide useful statistics related to that flow, and (iii) actions that will be applied to the matched packets [2]. The controller is responsible for adding, modifying and deleting the flow entries. Furthermore, the controller can make proactive or reactive decisions [3]. Proactive decisions can be used in multipath forwarding applications, which require less switch-controller communication because all flow entries are added before the incoming traffic arrives at the switch. In reactive decisions, by contrast, unmatched packets are encapsulated in PACKET_IN messages and forwarded to the controller which, in turn, replies with the appropriate decision by adding a new flow entry to the related switches, so that the switch itself can handle this type of packet [3]. Security applications are an example of the reactive approach. In addition, the SDN architecture introduces new security challenges (which are outside the scope of this work), such as attacks that target the controller through programming vulnerabilities, configuration errors and DoS attacks on the controller-switch secure channel [4]. Because the data plane is managed remotely, both the DoS attack and its distributed version (DDoS) have a significant impact in the SDN paradigm. Such an attack floods the switch with illegitimate flow entries and consequently prevents the controller from responding to legitimate flows [5].
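The flow-entry structure and the reactive path described above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the class names (FlowEntry, Switch, Controller), the dictionary-based packet representation and the fixed "output:1" action are hypothetical and do not correspond to the OpenFlow wire format or to any controller API.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    """A flow entry: header fields to match, actions, and per-flow counters."""
    match: dict            # header fields, e.g. {"ip_src": ..., "ip_dst": ...}
    actions: list          # actions applied to matched packets
    packet_count: int = 0  # counter: packets matched so far
    byte_count: int = 0    # counter: bytes matched so far

    def matches(self, packet: dict) -> bool:
        # A packet matches when every header field of the entry agrees.
        return all(packet.get(k) == v for k, v in self.match.items())


class Controller:
    """Hypothetical reactive controller: answers each table miss with a new entry."""

    def __init__(self):
        self.packet_in_count = 0

    def packet_in(self, packet: dict) -> FlowEntry:
        self.packet_in_count += 1
        match = {"ip_src": packet["ip_src"], "ip_dst": packet["ip_dst"]}
        return FlowEntry(match=match, actions=["output:1"])


class Switch:
    def __init__(self, controller: Controller):
        self.controller = controller
        self.flow_table = []

    def receive(self, packet: dict, size: int) -> list:
        for entry in self.flow_table:
            if entry.matches(packet):
                entry.packet_count += 1
                entry.byte_count += size
                return entry.actions
        # Table miss: hand the packet to the controller (the PACKET_IN path)
        # and install the flow entry it returns.
        entry = self.controller.packet_in(packet)
        self.flow_table.append(entry)
        entry.packet_count += 1
        entry.byte_count += size
        return entry.actions
```

In this model only the first packet of a flow reaches the controller; later packets are handled by the switch and merely update the counters. A flood of packets with many distinct header values, however, causes one table miss (and one new flow entry) per packet, which is the overload situation described above.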
A simple way to detect a DoS attack in SDN is to monitor the volume of each flow using the flow-based statistics provided by the SDN controller. Such a mechanism was introduced by YuHunag et al. [6], where the detection loop consists of two stages. If the total number of received packets exceeds 3000 packets per second, the system moves on to the next stage, in which incoming packets are dropped when the total number of received packets is equal to or higher than 800 packets per second for 5 consecutive times. Braga et al. [7] employed Self-Organizing Maps (SOM) for detecting the distributed version of the DoS attack. The following features were obtained from each collected flow: Average of Packets per flow (APf), Average of Bytes per flow (ABf), Average of Duration per flow (ADf), Percentage of Pair-flows (PPf), Growth of Single-flows (GSf), and Growth of Different Ports (GDP). This approach showed a lower overhead compared to other traditional approaches.
Chen and Yu [8] used a back-propagation neural network in the SDN paradigm to detect DoS attacks. The experimental study showed that growth of the network topology leads to an increase in the attack detection rate. Wang et al. [9] proposed a graphical probabilistic inference model with an updating phase in order to cope with the dataset shift problem, which appears when the network traffic conditions differ from the actual conditions.
Taekyoung and Yanghee [10] employed a content-oriented networking architecture in which each access router is called an "agent". Each host belongs to a specific agent and sends a content request to that agent, which, in turn, delivers the requested contents. The agent solicits the contents itself in order to prevent arbitrary packet forwarding, which improves the security of the network. Furthermore, the agent can keep track of the requested contents as well as the host that creates these requests. The agent can discover DDoS attacks if the rate of content requests exceeds a threshold or a malicious request is discovered.
Seungwon et al. [11] proposed a simple algorithm inspired by the SYN cookie algorithm for handling TCP-ACK packets. This type of packet is used in the TCP handshake and can be used to perform a DDoS attack on a targeted server. Moreover, fuzzy logic, through its ability to derive conclusions from insufficient or imprecise data, provides an effective way to deal with DoS attacks [12]-[14]. On the other hand, information theory-based metrics play a significant role in the detection of DoS attacks due to their low computation overhead [15]. During a DoS attack, entropy values decrease significantly [16].
A fast entropy algorithm was introduced by David and Thomas [17] in order to implement a lightweight DDoS detection method. The first step in this algorithm is flow aggregation, in which the flow count of each connection is collected over a particular time interval. The next step is entropy calculation. The last step is applying an adaptive algorithm to improve the detection accuracy. However, the main disadvantage of this approach is that it mainly depends on calculating the entropy values over a specific time window, which forms a potential overhead if adopted by an SDN controller.
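To make the entropy step concrete, the following sketch computes the plain Shannon entropy of the per-connection flow counts collected in one time window. It is an illustration only: it is not the fast entropy variant of [17], and the function name and the example counts are hypothetical.

```python
import math


def shannon_entropy(flow_counts):
    """Shannon entropy (in bits) of per-connection flow counts in one window.

    flow_counts maps a connection identifier to the number of flows or
    packets observed for it during the time interval.
    """
    total = sum(flow_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in flow_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical windows: evenly spread traffic versus one dominating connection.
normal_window = {"c1": 25, "c2": 25, "c3": 25, "c4": 25}
attack_window = {"c1": 970, "c2": 10, "c3": 10, "c4": 10}
```

When one connection dominates the window, as in the hypothetical attack_window, the distribution becomes concentrated and the entropy falls below that of the evenly spread normal_window, which is the behavior reported in [16].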
In this study, we employ the flow statistics, which can be easily obtained by the SDN controller, in order to calculate the packet rate of each flow and subsequently single out the potentially malicious flows that may contribute to a DoS flooding attack. After that, we use the Support Vector Machine (SVM) algorithm to classify the collected traffic.

Support Vector Machine (SVM)
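Support vector machine (SVM) [18] is a statistical machine learning algorithm based on finding the optimal separating hyperplane for the labeled instances of a given data set, which leads to better generalization on unseen data (Fig. 2). SVM and its various modified versions [19], [20] are widely used in intrusion detection systems. For training instances $x_i$ with class labels $y_i \in \{-1, +1\}$, a weight vector $w$ and a bias $b$, the set of hyperplanes in SVM can be written as:

$$w \cdot x_i + b \geq +1 \quad \text{for } y_i = +1 \tag{1}$$

$$w \cdot x_i + b \leq -1 \quad \text{for } y_i = -1 \tag{2}$$

These hyperplanes can be combined by the following inequality:

$$y_i (w \cdot x_i + b) \geq 1 \tag{3}$$

Maximizing the margin between the two hyperplanes leads to an optimization problem that can be written as:

$$\min_{w, b} \; \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i (w \cdot x_i + b) \geq 1 \tag{4}$$

A non-separable case can be handled by introducing slack variables $\xi_i$, which allow us to select a hyperplane with minimum errors:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to} \quad y_i (w \cdot x_i + b) \geq 1 - \xi_i, \; \xi_i \geq 0 \tag{5}$$

where C is a regularization parameter which affects the trade-off between the errors allowed in the training phase and the margin size. Large values of C lead to a smaller margin, which may cause over-fitting. Furthermore, to deal with nonlinear cases, a kernel function can be used to map the data set to a higher-dimensional space through a mapping $\phi$, as shown in (6).

$$K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j) \tag{6}$$

The most commonly used kernel functions are the polynomial and Radial Basis Function (RBF) kernels. The RBF kernel, which is used by our SVM classifier, is given by (7).

$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \tag{7}$$

where γ is the kernel parameter.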
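As a small numerical illustration of the RBF kernel and of how a trained kernel SVM labels a new sample, consider the sketch below. It assumes the usual RBF form exp(-γ‖x − z‖²) and the usual kernel decision rule (the sign of a weighted sum of kernel values plus a bias); the support vectors, coefficients and bias used in the example are hypothetical values, not parameters of the classifier trained in this study.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Label x with +1 or -1 from a weighted sum of kernel values.

    dual_coefs[i] stands for alpha_i * y_i of the i-th support vector.
    """
    score = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return 1 if score + bias >= 0 else -1


# Hypothetical model with one support vector per class.
support_vectors = [(0.0, 0.0), (3.0, 3.0)]
dual_coefs = [1.0, -1.0]
label_near_first = svm_decision((0.1, 0.1), support_vectors, dual_coefs, 0.0, gamma=0.5)
label_near_second = svm_decision((2.9, 3.0), support_vectors, dual_coefs, 0.0, gamma=0.5)
```

A larger γ makes the kernel value decay faster with distance, so each support vector influences a smaller neighborhood; together with C, it is one of the two parameters that are typically tuned for an RBF-kernel SVM.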

Dataset
This experimental study is conducted by generating both volume-based and protocol-based DoS attacks, taking into consideration the case in which the attacker may send benign traffic alongside the malicious traffic. In this way, a data set consisting of 321 instances of normal traffic and 639 instances of attack traffic was generated during the evaluation stage, where the D-ITG tool [21] is used to generate UDP and ICMP flooding attacks, whereas the scapy library (secdev.org) is used for generating SYN flooding attacks. In this study, we generate DoS attacks at different rates (100, 500 and 1000 pps). Using the D-ITG tool, one can generate UDP and ICMP floods with different payload sizes.
As shown in Fig. 3, we generated a UDP flood to a destination host with the IP address 10.0.0.4, using a constant payload size (500 bytes) and a constant packet rate (100 pps) for 1 second (1000 ms). We also assume that the IP addresses of each packet were not spoofed. Similarly, the scapy library is used to generate SYN floods. As shown in Fig. 4, we launch a SYN flood attack via a Python/scapy script. Finally, we wrote a Python script that periodically, every 4 seconds, collects packet features from each corresponding host. The script is based on the scapy library, which comes with a built-in packet sniffer and is widely used as a packet manipulation, capture, modification and replay library [22]. In addition, the collected features were normalized using equation (8).
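The normalization step can be sketched as follows. Since equation (8) is not reproduced in this text, the sketch assumes min-max scaling of each feature to the range [0, 1], which is one common choice and may differ from the formula actually used; the function name and the example rows are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every feature column of rows to the range [0, 1].

    Assumption: min-max scaling; the normalization used in the study may differ.
    A constant column is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    minimums = [min(col) for col in columns]
    maximums = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, minimums, maximums)
        ])
    return normalized


# Hypothetical feature rows: [packet rate in pps, payload size in bytes].
scaled = min_max_normalize([[100, 500], [500, 500], [1000, 500]])
```

Scaling matters for an RBF-kernel SVM because the kernel depends on Euclidean distances, so a feature with a large numeric range would otherwise dominate the others.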

Experimental Setup
In order to evaluate the system, we used Mininet, an open-source SDN emulator. The experiment was conducted in a virtual environment (Linux Ubuntu 14.04/VirtualBox) on an Intel i5 machine with 12 GB of RAM. The controller part was implemented using the POX controller. The packet inspection unit, on the other hand, was implemented using a combination of the scapy and sklearn Python libraries. As shown in Fig. 5, our testbed consists of 9 virtual hosts connected to different OpenFlow switches. The bandwidth of each link is 1 Mb/s. The connection between the SDN controller and the hosts was made locally, so that the SDN controller can communicate with any device and trigger the packet inspection unit. To this end, one can use the "util/m" script provided by Mininet to control Mininet hosts. For instance, we can write the following command in a Linux Ubuntu terminal to access host1 in the Mininet environment:

    mininet/util/m h1 ifconfig

Instead of running the ifconfig command, the SDN controller (i.e. the POX controller) runs a Python script, which represents our packet inspection unit. In this context, it is worth noting that securing the communication between the SDN controller and the corresponding host is out of the scope of this study.

Performance Evaluation
To evaluate our classifier, we applied 10-fold cross-validation, where the dataset is partitioned into ten equally sized folds. We train the model on 90% of the samples and then test the class labels of the remaining 10%. This procedure is repeated 10 times. The accuracy for each fold is determined by (9).

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

where True Positives (TP) is the total number of abnormal traffic instances correctly classified as abnormal traffic; True Negatives (TN) is the total number of normal traffic instances correctly classified as normal traffic; False Positives (FP) is the total number of normal traffic instances falsely classified as abnormal traffic; and False Negatives (FN) is the total number of abnormal traffic instances falsely classified as normal traffic. The accuracies of all folds are averaged to obtain the final accuracy. In addition, the false alarm rate for each fold is calculated by (10).

$$\text{False alarm rate} = \frac{FP}{FP + TN} \tag{10}$$
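The evaluation procedure can be sketched in a few lines. The sketch assumes the usual definitions, accuracy = (TP + TN) / (TP + TN + FP + FN) and false alarm rate = FP / (FP + TN), and splits the sample indices into contiguous folds without shuffling or stratification; the function names and the confusion counts in the example are hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Return k (train_indices, test_indices) pairs over range(n_samples).

    Folds are contiguous blocks; in practice the data would normally be
    shuffled (and possibly stratified by class) before splitting.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def accuracy(tp, tn, fp, fn):
    """Fraction of all instances that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_alarm_rate(fp, tn):
    """Fraction of normal instances that were flagged as abnormal."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# The data set has 321 + 639 = 960 instances, so each fold holds 96 of them.
folds = k_fold_indices(321 + 639, k=10)

# Hypothetical confusion counts for a single 96-instance test fold.
fold_accuracy = accuracy(tp=60, tn=30, fp=2, fn=4)
fold_far = false_alarm_rate(fp=2, tn=30)
```

Averaging fold_accuracy and fold_far over the ten folds gives the final figures of the kind reported for the classifier.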

The Proposed Method
In this section, we present our proposed approach which consists of two stages.

The First Stage
This stage includes the packet rate calculation based on the flow statistics collected by the SDN controller. Once the packet rate exceeds a predefined threshold, the controller activates the packet inspection unit, which is responsible for collecting packet statistics, applying the SVM algorithm and notifying the SDN controller about the classification results. As depicted in Fig. 6, the packet inspection unit is implemented on each host separately, where the controller can activate the packet inspection unit remotely and take the proper decision based on a binary classification step conducted by a previously trained SVM classifier.
In volume-based attacks, the intruder sends numerous packets from different sources to the target server. This consumes a huge amount of network bandwidth. In protocol-based attacks, by contrast, the attacker exploits a weakness in the Internet protocols, which can lead to a complete denial of service on the targeted server's resources. The packet rate is a significant metric which was used in [12] for implementing a host-based DoS attack detection system using Mamdani's fuzzy inference model. In our proposed method, however, we calculate the packet rate using equation (11). If the packet rate is greater than or equal to 100 packets per second, the system moves into the next stage, which is discussed in the next subsection. The reason for selecting a low packet rate (to trigger the next stage) is that during our experimental study we found that the performance of the selected controller (i.e. the POX controller) was affected significantly by flooding attacks that exceed 100 packets per second, and therefore we concluded that it is more efficient to detect these types of DoS attacks starting from the mentioned threshold.
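The first stage can be sketched as follows. Equation (11) is not legible in this text, so the sketch assumes that the packet rate of a flow is the growth of its packet counter between two consecutive statistics polls divided by the polling interval, taking the 10-second collection period and the 100 pps threshold from the paper; the function names and flow identifiers are hypothetical.

```python
THRESHOLD_PPS = 100   # stage-one threshold stated in the text
POLL_INTERVAL_S = 10  # flow statistics are collected every 10 seconds


def packet_rate(previous_count, current_count, interval_s=POLL_INTERVAL_S):
    """Packets per second between two consecutive flow-statistics polls.

    Assumption: rate = counter growth / polling interval. A counter that
    went backwards (e.g. a re-installed flow) is treated as zero growth.
    """
    return max(current_count - previous_count, 0) / interval_s


def suspicious_flows(previous_stats, current_stats, threshold=THRESHOLD_PPS):
    """Return {flow_id: rate} for every flow at or above the threshold."""
    flagged = {}
    for flow_id, current_count in current_stats.items():
        rate = packet_rate(previous_stats.get(flow_id, 0), current_count)
        if rate >= threshold:
            flagged[flow_id] = rate
    return flagged


# Hypothetical packet counters from two consecutive polls.
previous_poll = {"h1->h4": 0, "h2->h4": 0}
current_poll = {"h1->h4": 1500, "h2->h4": 200}
flagged = suspicious_flows(previous_poll, current_poll)
```

Here only the hypothetical flow h1->h4 (150 pps) would trigger the second stage, while h2->h4 (20 pps) stays below the threshold and causes no further work.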

The Second Stage
The aim of the second stage is to reduce the false alarm rate. Once the packet rate exceeds the predefined threshold, the controller activates the packet inspection unit. At this point, packet-based statistics are collected over a 4-second period by the packet inspection unit so that they can be used by our SVM classifier, which then determines whether an attack has occurred. The features collected in the second stage are shown in Table 1.
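How the two stages fit together can be sketched with the inspection step passed in as a callable. This is an assumption-laden outline rather than the implemented system: two_stage_detect and inspect_host are hypothetical names, and inspect_host stands in for the 4-second feature collection followed by the SVM classification on the host.

```python
def two_stage_detect(packet_rates, inspect_host, threshold_pps=100.0):
    """Run stage one on every host and stage two only where it is needed.

    packet_rates maps a host to its packet rate in pps (stage-one input).
    inspect_host(host) returns True when the host-side classifier reports
    an attack (stage two).
    Returns (inspected_hosts, hosts_under_attack).
    """
    inspected = []
    alerts = []
    for host, rate in packet_rates.items():
        if rate < threshold_pps:
            continue  # stage one: below the threshold, no inspection
        inspected.append(host)
        if inspect_host(host):  # stage two: packet-level classification
            alerts.append(host)
    return inspected, alerts


# Hypothetical rates; the stub classifier reports an attack only on h4.
rates = {"h1": 20.0, "h4": 500.0, "h7": 120.0}
inspected, alerts = two_stage_detect(rates, inspect_host=lambda host: host == "h4")
```

In this example h7 exceeds the threshold but is cleared by the second stage, which is how the classification step keeps a high-rate benign flow from raising a false alarm.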

Results and Discussion
The accuracy and false alarm rate obtained by our SVM classifier for each fold are shown in Table 2. The average accuracy for 10-fold cross-validation is 96.25%, whereas the average false alarm rate is 0.26%. As a result, the SVM classifier produced very few false positives compared to the total number of false negatives. Fig. 7 shows the detection results based on the implementation of our two-stage approach, where the SDN controller updates the flow table of the corresponding switch by sending a flow-mod instruction that deletes the malicious flows. Thereafter, the controller updates the firewall module by adding the corresponding malicious source MAC address to the blocked list. In addition, Table 3 shows a comparison between the SVM approach and other supervised machine learning approaches. In terms of false alarm rate, the SVM approach has shown the best results. In terms of accuracy, the random forest classifier has shown better accuracy than the SVM approach. However, the highest accuracy was achieved using a Multilayer Perceptron (MLP) neural network (a single hidden layer with 4 neurons).
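The mitigation step described above (deleting the malicious flows and blocking the source MAC address) can be modeled with plain Python data structures. The sketch does not use the POX API; the dictionary-based flow table, the field name eth_src and both function names are hypothetical stand-ins for the flow-mod message and the firewall module.

```python
def mitigate(flow_table, blocked_macs, malicious_mac):
    """Delete the flows of a malicious source and add it to the blocked list.

    flow_table is a list of dicts with an "eth_src" field; blocked_macs is a
    set that is updated in place. Returns the remaining flow entries.
    """
    remaining = [flow for flow in flow_table if flow["eth_src"] != malicious_mac]
    blocked_macs.add(malicious_mac)
    return remaining


def is_allowed(packet, blocked_macs):
    """Firewall check: drop packets whose source MAC is on the blocked list."""
    return packet["eth_src"] not in blocked_macs


# Hypothetical flow table with one malicious and one benign source.
table = [
    {"eth_src": "00:00:00:00:00:01", "actions": ["output:2"]},
    {"eth_src": "00:00:00:00:00:02", "actions": ["output:2"]},
]
blocked = set()
table = mitigate(table, blocked, "00:00:00:00:00:01")
```

Blocking by source MAC address is only effective while the attacker does not forge its addresses, which matches the assumption made for the data set.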

Conclusion
In this study, we have proposed, designed and implemented a novel two-stage DoS flooding attack detection method in the SDN paradigm. In the first stage, when the packet rate exceeds a predefined threshold, the system moves on to the next stage, which employs the SVM algorithm to investigate the suspected traffic. In our experimental studies, we showed that the controller can communicate with any device and trigger the next stage (the packet inspection unit) in the Mininet environment. Our proposed system was able to detect DoS flooding attacks with 96.25% accuracy and a 0.26% false alarm rate based on an SVM classifier with an RBF kernel. Compared to other classifiers, the SVM approach has shown the lowest false positive rate, while the MLP neural network approach has shown the highest accuracy. We conclude that the integration of the SDN approach and machine learning techniques forms a promising solution for providing more secure networks. Our future work will focus on using the SDN paradigm for detecting application-layer DoS attacks as well as investigating the efficiency of other multi-stage approaches in order to identify more complex types of network attacks.

Fig. 4. Sending a SYN flood using python script with scapy library.
Equation (11), packet rate: computed based on the flow statistics collected periodically every 10 seconds by the SDN controller.

Fig. 7. Results for implementing our proposed system based on the POX controller.

Table 1. Collected features for the classification stage

Table 3. Comparison between SVM and other supervised machine learning approaches