Mitigating Network Denial-of-Service through Traffic Clustering
Network congestion research over the past 20 years includes adaptive strategies to manage bandwidth consumption and mechanisms to segregate traffic into quality-of-service classes. To date, proposed mechanisms require co-operation among various network components - co-operation that is no longer guaranteed, whether because of malicious network activity or because of non-malicious applications that consume enormous amounts of bandwidth for purposes they consider legitimate (e.g. peer-to-peer traffic). While a typical approach is to attempt to identify malicious behavior, maliciousness has become very difficult to distinguish (e.g. solicited commercial bulk email resembles spam; peer-to-peer networks resemble the communication networks of compromised machines; flash crowds resemble distributed denial-of-service attacks).
We seek a network management strategy and mechanism permitting legitimate communication even when there is overwhelming demand on network services, not by distinguishing legitimate from illegitimate traffic, but rather by distinguishing disruptive from non-disruptive traffic. We pursue two questions: is there some way to automatically and efficiently discriminate between disruptive and non-disruptive network traffic; and if so, can this be used to better manage network resources? The main challenge is the variable nature of, and the difficulty of identifying, disruptive traffic - loosely defined, that which consumes disproportionate network bandwidth. We propose to "cluster" packets into sets of similar traffic, with aggregation based on a suitable measure of similarity, and then to regulate each set of traffic separately. With proper dynamic quality-of-service configurations, outgoing bandwidth can be apportioned between applications, limiting the amount of network resources that may be consumed by any individual cluster. Our classification need not be highly accurate; as long as it assigns most disruptive traffic to a small subset of the total number of classes, we can ensure that non-disruptive traffic gains a reasonable share of network resources (e.g. bandwidth), thus limiting the collateral damage due to malicious activity. While time is less critical for the method of discovering classes, a very fast method (e.g. sub-linear in packet size) is needed for determining in real time into which class a packet fits.
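The apportionment idea above - regulating each cluster separately so that no single cluster starves the rest - can be sketched with per-class token buckets. This is only an illustrative shaping scheme under assumed parameters (equal shares, a fixed burst size); the proposal itself leaves the quality-of-service configuration open.

```python
import time

class TokenBucket:
    """Token-bucket shaper: each traffic class gets its own bucket,
    so a disruptive cluster can exhaust only its own share."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps          # refill rate, bytes per second
        self.capacity = burst_bytes   # maximum burst size in bytes
        self.tokens = burst_bytes     # start with a full bucket
        self.last = time.monotonic()

    def allow(self, packet_len):
        """Return True if a packet of packet_len bytes may be sent now."""
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_len:
            self.tokens -= packet_len
            return True
        return False  # over this class's share: drop or queue the packet

def make_shapers(num_classes, total_bps, burst_bytes=1500):
    """Split the total outgoing bandwidth evenly across traffic classes
    (an assumed policy; shares could instead be weighted dynamically)."""
    share = total_bps / num_classes
    return [TokenBucket(share, burst_bytes) for _ in range(num_classes)]
```

With such shapers in place, even a cluster capturing most disruptive traffic is confined to its own share, leaving the remaining bandwidth to the other classes.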
For the classification part, we propose to use a novel enhancement of the basic "n-gram analysis" method used in the past few years by others for the detection of worms and related malicious traffic. We include the offset of n-grams (i.e. target substrings) in a packet as part of our characterization. Preliminary unpublished experiments to classify traces of real network traffic offer promise; however, our initial experiments indicate that more advanced (to be determined) mathematical techniques and statistical measures will be required for recognizing patterns that allow us to properly segregate packets into suitable classes. This research is motivated by earlier work [1].
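A minimal sketch of the offset-augmented characterization described above: each packet payload is summarized by its (offset, n-gram) pairs, hashed into a fixed-size count vector, and assigned to the most similar class centroid. The bucket count, hash-based projection, and dot-product similarity are illustrative placeholders for the more advanced statistical measures the proposal leaves open.

```python
def ngram_offset_features(payload: bytes, n: int = 3, buckets: int = 256):
    """Summarize a packet by its n-grams together with their offsets,
    hashed into a fixed-size count vector (buckets=256 is an assumed,
    illustrative dimension)."""
    vec = [0] * buckets
    for off in range(len(payload) - n + 1):
        gram = payload[off:off + n]
        # The offset is part of the feature, per the proposed enhancement.
        vec[hash((off, gram)) % buckets] += 1
    return vec

def assign_class(vec, centroids):
    """Assign a feature vector to the class with the most similar centroid,
    using plain dot-product similarity as a stand-in measure."""
    best, best_sim = 0, float("-inf")
    for i, centroid in enumerate(centroids):
        sim = sum(a * b for a, b in zip(vec, centroid))
        if sim > best_sim:
            best, best_sim = i, sim
    return best
```

Feature extraction here is linear in packet size; the sub-linear real-time assignment the text calls for would need further work, e.g. sampling a fixed number of offsets per packet.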
[1] M. Vargas Martin, J.M. Robert, P.C. van Oorschot, "A Monitoring System for Detecting Repeated Packets with Applications to Computer Worms", Carleton Univ. C.S. Tech. Report TR-04-02, June 2004.