## Why Poissonian traffic models matter more now than ever, part 5

To sketch how the large-deviation approach discussed in the last post in this series can actually work, consider a process with conditional rate function 1 for the first $N = 10000$ arrivals and $q_*$ thereafter. Take the maximum-likelihood rate estimate over the previous $n$ arrivals and see how the resulting putative martingales behave using the following MATLAB code:

function y=blogf(q,N,n,plotflag)

% Piecewise-constant inhomogeneous Poisson
% process at unit rate for N steps, then at
% rate q for N steps. MLE of rate obtained
% at each timestep using a window of n steps,
% then time rescaling theorem applied to
% obtain a putative unit-rate Poisson process.
% Finally the estimated martingale is formed
% and a large-deviation result applied
% to obtain the probability of the observed
% behavior at the beginning and middle
% (i.e., where the jump occurs).

% NB. There is no error checking done (e.g.,
% for n, N) and the code is far from perfect,
% but should be OK for illustrative purposes

% unit rate Poisson process
at1 = cumsum(-log(rand(1,N)));
% rate q
at2 = cumsum(-log(rand(1,N))/q);
at = [at1, at1(end)+at2];   % arrival times
iat = [at(1), diff(at)];    % interarrivals

% empirical rate estimate over last n arrivals
qe = zeros(1,2*N);
for j = 1:n
qe(j) = j/at(j);
end
for j = (n+1):2*N
qe(j) = n/(at(j) - at(j-n));
end

% time rescaled process w/ empirical rate
tau = zeros(1,2*N);
tau(1) = iat(1)*qe(1);
for j = 2:2*N
tau(j) = tau(j-1) + iat(j)*qe(j);
end

% putative martingale from empirical rescaling
m = (1:2*N) - tau;

t = 1;  % timescale
% j1 and j2 are respectively the indices
% of the first arrival further than n*t from 0
% and from the Nth arrival
j1 = find(tau > n*t, 1, 'first');
j2 = find(tau > tau(N)+n*t, 1, 'first');
% martingale differences
dm1 = m(j1) - m(1); dm2 = m(j2) - m(N);
% a, x, yy vars from large-deviation result
a1 = dm1/n; a2 = dm2/n;
x1 = a1/t + 1; x2 = a2/t + 1;
yy = t*n;   % (note: y is output var.)
% log of asymp. prob. of fluctuations
logp1 = yy*(-x1*log(x1) + x1 - 1);
logp2 = yy*(-x2*log(x2) + x2 - 1);

y = [logp1 logp2];

if plotflag
figure;plot(tau,m,'k');hold;
mmmm = [min(m) max(m)];
plot(tau(N)*ones(1,2),mmmm,'k:');
axis tight
xlabel('\tau'),ylabel('N_\tau - \tau')
s1 = 'actual rate changes from 1 to ';
s2 = sprintf('%g',q);
s3 = ' at ';
s4 = sprintf('%d',N);
s5 = 'th arrival; MLE rate over ';
s6 = sprintf('%d',n);
s7 = ' arrivals';
title([s1 s2 s3 s4 s5 s6 s7]);
end

Two typical outputs are shown below.

Typical graphical output of blogf for $q_* = 1.5, N = 10000, n = 1000.$

As above, except $q_* = .67$.

The following code snippet (which omits the necessary plotting commands) can be used inline for a quick evaluation of the behavior of the overall technique as a function of $q_*$ and $n:$

% L = number of runs to average over
N = 10000; L = 40;
q = .65:.05:1.65; n = ceil(logspace(1,3,25));
for j = 1:21
qj = q(j);
for k = 1:25
nk = n(k); temp = zeros(L,2);
for l = 1:L
temp(l,:) = blogf(qj,N,nk,0);
end
z1(j,k) = mean(temp(:,1));    % initial
z2(j,k) = mean(temp(:,2));    % middle
end
end

Using this, it’s easy to see the difference between the log-likelihood of observed fluctuations before and during an abrupt rate change.

Average log-likelihood of the initial $n$ steps over 40 observed runs as a function of $n$ and $q_*$. Note that there should be no dependence on $q_*$ in this case, and that in fact no such dependence is evident.

Average log-likelihood of the $n$ steps immediately following a rate change over 40 observed runs. Note the tame behavior around $q_* = 1$.

From the figures, it’s pretty clear that this kind of mathematically oriented technique can work effectively to detect and estimate the likelihood of even relatively small but abrupt rate changes quickly, without relying on heuristics.

But it can do more. Suppose the rate is piecewise linear (for convenience, as a function of arrivals), changing from 1 to $q_*$ linearly over $N$ arrivals and then remaining there for another $N$ arrivals. The relevant changes to the MATLAB code basically consist of using the lines

qq = linspace(1,q,N);
at2 = cumsum(-log(rand(1,N))./qq);
at3 = cumsum(-log(rand(1,N))/q);
at12end = at1(end)+at2(end);
at = [at1, at1(end)+at2, at12end+at3];

and changing instances of $2N$ to $3N$ along with revised plotting commands.

Here the size of the window used makes a big difference:

Typical graphical output of the modified blogf code for $q_* = 1.5, N = 10000, n = 1000$.

Typical graphical output of the modified blogf code for $q_* = 1.5, N = 10000, n = 5000$.

The average likelihood after the rate begins to change for $q_* = 1.5, N = 10000$ is approximately $10^{-0.48}$ for $n = 1000$ and $10^{-17}$ for $n = 5000$. Of course, the rate estimation window needn’t equal the window used for measuring the likelihoods of fluctuations, but this just adds another variable to the analysis.

The overall point is that as arrivals get closer to Poissonian, so that an arrival rate can be well-defined and estimated, the picture for characterization and detection algorithms improves substantially. The next and final post in this series will give a little bit of perspective.

### 3 Responses to Why Poissonian traffic models matter more now than ever, part 5

1. [...] The question for now is, once you’ve got a finite Markov process, what do you do with it? There are some obvious things. For example, you could apply a Chebyshev-type inequality to detect when the traffic parameters change or the underlying assumptions break down (which, if the model is halfway decent, by definition indicates something interesting is going on–even if it’s not malicious). This idea has been around in network security at least since Denning’s 1986-7 intrusion detection article, though, so it’s not likely to bear any more fruit (assuming it ever did). A better idea is to construct and exploit martingales. One way to do this to advantage starting with an inhomogeneous Poisson process (or in principle, at least, more general one-dimensional point processes) was outlined here and here. [...]

2. test says:

I am actually excited into possessing my own personal double adjustable piano bench. Thank you for this article. It is extremely handy

3. testwebsite says:

Appreciating the persistence you put into your website and in depth information you provide. It’s great to come across a blog every once in a while that isn’t the same out of date rehashed information. Wonderful read! I’ve saved your site and I’m including your RSS feeds to my Google account