Why Poissonian traffic models matter more now than ever, part 2

My first encounters with queueing theory and non-Poisson network traffic statistics came at the same time, in early 2001. I was less than a year into my first real job and was tasked with helping a DoD development effort on a then-novel network intrusion detection system. As a mathematician sometimes masquerading as a physicist while sometimes applying these fields to communications security, I was then a total neophyte (and am still very much an amateur) when it comes to the details of network infrastructure, protocol stacks and the like.

At any rate, I arrived at an Army base in Hawaii with not a lot to do immediately: the prototype IDS was still in the early stages of implementation. So I decided a good thing to do with my time would be to simulate its performance. And I figured the best way to do that would be to try to simulate the Internet first. I was a young guy and didn’t know any better.

I rapidly worked through a bunch of papers describing new observations of power-law distributions in both network packet interarrivals and link topologies. And I built a long simulation based on this stuff in MATLAB. Probably I never got rid of half the mistakes in my code, but given an hour on my laptop, it could output a hundred or so nodes producing extremely simple surrogates for network traffic. Depending on the parameters I’d see link queues in equilibrium, or overflowing, but nothing of real utility for what I was really there to work on. So I abandoned the exercise, because by then the earliest iteration of the system was operational and I had some real data to analyze.

The point of this personal background is to highlight that, from the very beginning of my professional experience with computer networks, I was naturally predisposed to accept the long-prevailing views about power laws and multifractal traffic and all that. But over time that changed. It started when a DoD researcher whom I know well and whose talents I respect enormously mentioned offhandedly that he’d heard that network operators monitoring backbones said that, unlike on most links, their traffic was “pure Poisson”. That claim of emergent Poisson behavior at high speeds made sense to me.

As I later told a roomful of smart people in early 2009 (some of them incredulous because they’d read about fractal interarrival statistics over the years): if you had enough monkeys typing “W-O-R-D” over TELNET sessions on the same link, the fact that each monkey’s keystrokes weren’t independent and identically distributed shouldn’t keep the overall sequence of packet interarrival times from looking Poissonian.

In fact there were both theoretical work based on models from the eighties (notably Sriram and Whitt’s 1986 paper “Characterizing Superposition Arrival Processes in Packet Multiplexers for Voice and Data”, which emphasized the importance of the relationship between the number of flows, aggregate traffic rate, and relevant timescales) and measurements from 2001 suggesting essentially the same thing. Two 2001 papers from Bell Labs by Cao et al., titled “On the Nonstationarity of Internet Traffic” and “The Effect of Statistical Multiplexing on the Long-Range Dependence of Internet Packet Traffic”, argued that as the number of flows increased, traffic would become closer to the Poissonian ideal. They pointed out that binning interarrival data would not reveal this emergent behavior, and demonstrated the tendency towards Poisson behavior in models and with real data. About the only thing that appeared as if it might affect these results in practice was link saturation, but even this was conjecture.
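The monkeys argument is easy to try numerically. What follows is only a minimal sketch under my own assumptions, not a reconstruction of anything in Sriram and Whitt or Cao et al.: each flow is an independent renewal stream with heavy-tailed (Lomax) gaps, all flows start at time zero, and the function names, the shape parameter 2.5 and the horizons are made up for illustration. The quantity to watch is the coefficient of variation of the merged interarrival times, which equals 1 for a Poisson process.

```python
import numpy as np

def flow_arrivals(horizon, rng, shape=2.5):
    """Arrival times on [0, horizon) for one flow with heavy-tailed gaps."""
    chunks, t = [], 0.0
    while t < horizon:
        # rng.pareto draws Lomax (Pareto II) gaps; shape > 2 keeps the variance finite.
        times = t + np.cumsum(rng.pareto(shape, size=256))
        chunks.append(times)
        t = times[-1]
    times = np.concatenate(chunks)
    return times[times < horizon]

def superpose(n_flows, horizon, rng, shape=2.5):
    """Merge n_flows independent flows onto one link, sorted by arrival time."""
    flows = [flow_arrivals(horizon, rng, shape) for _ in range(n_flows)]
    return np.sort(np.concatenate(flows))

def interarrival_cv(times):
    """Coefficient of variation of the gaps; 1.0 for exponential gaps."""
    gaps = np.diff(times)
    return gaps.std() / gaps.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    for n in (1, 10, 200):
        horizon = 20000.0 / n  # keep the total packet count roughly constant
        merged = superpose(n, horizon, rng)
        print(n, "flows: CV =", round(interarrival_cv(merged), 3))
```

A single flow should show a coefficient of variation well above 1, and the merged stream should drift towards 1 as flows are added. This only checks the marginal distribution of the gaps and says nothing about their correlations.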

So what you had at this point was an opportunity for an academic catfight, but one that could have real consequences for network protocol and infrastructure design. But the fight apparently was decided outside the journals and conferences. There was a quietly solidifying consensus in favor of fractal traffic. (As an aside, an engrossing but massive study by Harry Collins called Gravity’s Shadow details how consensus is reached within research communities, using the gravitational wave detection community as its example.)

But that consensus, which still exists among the broader community, is apparently being dismantled and replaced with a new one. The first sign of this might have been a paper by Veitch et al. called “Multifractality in TCP/IP traffic: the case against”, where some of the more exotic claimed traffic behaviors were called into question. Veitch et al. also highlighted the utility of “Poisson cluster processes” (basically, Poissonian flow arrivals, but with distinct intra-flow statistics) in accurately modeling network traffic, including the reproduction of pseudoscaling. They said that

“backbone traffic has been said to tend to a Poisson process with increasing traffic rate. While it is true that the distribution of inter-arrival times tends to exponential as [the number of flows] increases, the inter-arrivals remain correlated…This contradicts the necessary assumption of independent inter-arrival times of a Poisson process…One must be careful of the subtle fact that when examining inter-arrival times as traffic rates increase, one is in fact shifting the focus of observation to smaller and smaller scales.”

That is, Veitch et al. argued that the correlations amongst intra-flow interarrivals would be preserved within the aggregate traffic even as the overall rate continued to increase. Although this is a subtle argument, the emergence of exponential interarrival behavior was already clear.
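To make the distinction concrete, here is a rough sketch of a Poisson cluster process. It is a toy under my own assumptions and not the model Veitch et al. fitted: flows start as a Poisson process, each flow carries a geometrically distributed number of packets, and packets within a flow are separated by exponential gaps. All names and parameter values are hypothetical. The sketch compares the coefficient of variation of the interarrival times with the index of dispersion of packet counts in fixed-width windows; both equal 1 for a true Poisson process.

```python
import numpy as np

def cluster_arrivals(flow_rate, horizon, rng, mean_pkts=20, mean_gap=0.05, warmup=10.0):
    """Packet times on [warmup, horizon) for a toy Poisson cluster process."""
    n_flows = rng.poisson(flow_rate * horizon)
    if n_flows == 0:
        return np.empty(0)
    starts = rng.uniform(0.0, horizon, size=n_flows)     # Poissonian flow arrivals
    sizes = rng.geometric(1.0 / mean_pkts, size=n_flows)  # packets per flow, >= 1
    pieces = []
    for start, size in zip(starts, sizes):
        # Assumed intra-flow statistics: exponential gaps between a flow's packets.
        offsets = np.concatenate(([0.0], np.cumsum(rng.exponential(mean_gap, size - 1))))
        pieces.append(start + offsets)
    times = np.sort(np.concatenate(pieces))
    # Drop the start-up transient and anything past the horizon.
    return times[(times >= warmup) & (times < horizon)]

def interarrival_cv(times):
    """Coefficient of variation of the gaps; 1.0 for exponential gaps."""
    gaps = np.diff(times)
    return gaps.std() / gaps.mean()

def index_of_dispersion(times, window, start, stop):
    """Variance-to-mean ratio of packet counts in fixed windows; 1.0 for Poisson."""
    edges = np.arange(start, stop + window / 2, window)
    counts, _ = np.histogram(times, bins=edges)
    return counts.var() / counts.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    for rate, horizon in ((0.2, 5000.0), (200.0, 100.0)):
        t = cluster_arrivals(rate, horizon, rng)
        print("flow rate", rate,
              "CV =", round(interarrival_cv(t), 3),
              "IDC(2 s) =", round(index_of_dispersion(t, 2.0, 10.0, horizon), 2))
```

In this toy, raising the flow arrival rate should pull the interarrival coefficient of variation towards 1 while the count dispersion at a fixed two-second window stays far above 1. The gaps look exponential because they probe ever shorter timescales, but the within-flow structure is still there at the timescale of a flow.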

The next post in this series will deal with the most recent work on network packet arrival statistics.
