To sketch how the large-deviation approach discussed in the last post in this series can actually work, consider a process with conditional rate 1 for the first N arrivals and q thereafter. Take the maximum-likelihood rate estimate over the previous n arrivals and see how the resulting putative martingales behave using the following MATLAB code:
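For reference, the two facts the code leans on can be written out explicitly (notation mine, chosen to match the code's variables). The time-rescaling theorem turns the arrivals t_j of an inhomogeneous Poisson process with rate \lambda(s) into arrivals of a unit-rate process, and for a unit-rate process the large-deviation estimate for a fluctuation of size a (of either sign) of N_\tau - \tau over a window of length y is, roughly,

```latex
\tau_j = \int_0^{t_j} \lambda(s)\,ds, \qquad
\log \mathbb{P}\!\left( N_y - y \approx a \right) \approx y\left( -x \log x + x - 1 \right),
\quad x = 1 + \frac{a}{y}.
```

As I read the code, y corresponds to yy = t*n, a to the martingale differences dm1 and dm2, and x to x1 and x2.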
function y = blogf(q,N,n,plotflag)
% Piecewise-constant inhomogeneous Poisson process at unit rate for N
% steps, then at rate q for N steps. MLE of rate obtained at each
% timestep using a window of n steps, then time rescaling theorem
% applied to obtain a putative unit-rate Poisson process. Finally the
% estimated martingale is formed and a large-deviation result applied
% to obtain the probability of the observed behavior at the beginning
% and middle (i.e., where the jump occurs).
% NB. There is no error checking done (e.g., for n, N) and the code is
% far from perfect, but should be OK for illustrative purposes.

% unit rate Poisson process
at1 = cumsum(-log(rand(1,N)));
% rate q
at2 = cumsum(-log(rand(1,N))/q);
at = [at1, at1(end)+at2]; % arrival times
iat = [at(1), diff(at)];  % interarrivals
% empirical rate estimate over last n arrivals
qe = zeros(1,2*N);
for j = 1:n
    qe(j) = j/at(j);
end
for j = (n+1):2*N
    qe(j) = n/(at(j) - at(j-n));
end
% time rescaled process w/ empirical rate
tau = zeros(1,2*N);
tau(1) = iat(1)*qe(1);
for j = 2:2*N
    tau(j) = tau(j-1) + iat(j)*qe(j);
end
% putative martingale from empirical rescaling
m = (1:2*N) - tau;
t = 1; % timescale
% j1 and j2 are respectively the indices of the first arrival
% further than n*t from 0 and from the Nth arrival
j1 = find(tau > n*t, 1, 'first');
j2 = find(tau > tau(N)+n*t, 1, 'first');
% martingale differences
dm1 = m(j1) - m(1);
dm2 = m(j2) - m(N);
% a, x, yy vars from large-deviation result
a1 = dm1/n;
a2 = dm2/n;
x1 = a1/t + 1;
x2 = a2/t + 1;
yy = t*n; % (note: y is output var.)
% log of asymp. prob. of fluctuations
logp1 = yy*(-x1*log(x1) + x1 - 1);
logp2 = yy*(-x2*log(x2) + x2 - 1);
y = [logp1 logp2];
if plotflag
    figure; plot(tau,m,'k'); hold on;
    mmmm = [min(m) max(m)];
    plot(tau(N)*ones(1,2), mmmm, 'k:');
    axis tight
    xlabel('\tau'), ylabel('N_\tau - \tau')
    s1 = 'actual rate changes from 1 to ';
    s2 = sprintf('%g',q);
    s3 = ' at ';
    s4 = sprintf('%d',N);
    s5 = 'th arrival; MLE rate over ';
    s6 = sprintf('%d',n);
    s7 = ' arrivals';
    title([s1 s2 s3 s4 s5 s6 s7]);
end
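For readers without MATLAB, here is a rough NumPy port I've sketched of the same steps (the function name, signature, and structure are mine, and the plotting is omitted); it should be read as an illustrative translation, not a drop-in replacement:

```python
import numpy as np

def blogf(q, N, n, rng=None):
    """Rough port (mine) of the MATLAB blogf above, without plotting.

    Simulates a piecewise-constant Poisson process (unit rate for N
    arrivals, then rate q for N arrivals), estimates the rate by MLE
    over a sliding window of n arrivals, applies the time-rescaling
    theorem with the estimated rate, and returns the log asymptotic
    probabilities of the putative-martingale fluctuations at the
    beginning and at the jump. No error checking, as in the original.
    """
    rng = np.random.default_rng() if rng is None else rng
    at1 = np.cumsum(rng.exponential(1.0, N))       # unit-rate arrivals
    at2 = np.cumsum(rng.exponential(1.0 / q, N))   # rate-q arrivals
    at = np.concatenate([at1, at1[-1] + at2])      # arrival times
    iat = np.diff(at, prepend=0.0)                 # interarrival times

    # empirical (MLE) rate estimate over the last n arrivals
    j = np.arange(1, 2 * N + 1)
    qe = np.empty(2 * N)
    qe[:n] = j[:n] / at[:n]
    qe[n:] = n / (at[n:] - at[:-n])

    # time-rescaled process and putative martingale
    tau = np.cumsum(iat * qe)
    m = j - tau

    t = 1.0                                        # timescale
    j1 = np.argmax(tau > n * t)                    # first index past n*t
    j2 = np.argmax(tau > tau[N - 1] + n * t)       # ... past Nth arrival + n*t
    dm1 = m[j1] - m[0]                             # martingale differences
    dm2 = m[j2] - m[N - 1]

    # large-deviation estimate of the fluctuation probabilities
    yy = t * n
    out = []
    for dm in (dm1, dm2):
        x = dm / n / t + 1.0
        out.append(yy * (-x * np.log(x) + x - 1.0))
    return out
```

Since the large-deviation rate function x log x - x + 1 is nonnegative, both returned log-probabilities should be nonpositive.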
Two typical outputs are shown below.
The following code snippet (which omits the necessary plotting commands) can be used inline for a quick evaluation of the behavior of the overall technique as a function of q and n:
% L = number of runs to average over
N = 10000; L = 40;
q = .65:.05:1.65;
n = ceil(logspace(1,3,25));
for j = 1:21
    qj = q(j);
    for k = 1:25
        nk = n(k);
        temp = zeros(L,2);
        for l = 1:L
            temp(l,:) = blogf(qj,N,nk,0);
        end
        z1(j,k) = mean(temp(:,1)); % initial
        z2(j,k) = mean(temp(:,2)); % middle
    end
end
Using this, it’s easy to see the difference between the log-likelihood of observed fluctuations before and during an abrupt rate change.
From the figures, it’s clear that this kind of mathematically principled technique can quickly detect even relatively small but abrupt rate changes, and quantify the likelihood of the observed fluctuations, without relying on heuristics.
But it can do more. Suppose the rate is piecewise linear (for convenience, as a function of arrivals), changing from 1 to q linearly over N arrivals and then remaining there for another N arrivals. The relevant changes to the MATLAB code basically consist of using the lines
qq = linspace(1,q,N);
at2 = cumsum(-log(rand(1,N))./qq);
at3 = cumsum(-log(rand(1,N))/q);
at12end = at1(end)+at2(end);
at = [at1, at1(end)+at2, at12end+at3];
and changing instances of 2*N to 3*N, along with revising the plotting commands.
Here the size of the window used makes a big difference:
The average likelihood after the rate begins to change differs substantially between the two window sizes shown. Of course, the rate-estimation window needn’t equal the window used for measuring the likelihoods of fluctuations, but this just adds another variable to the analysis.
The overall point is that as arrivals get closer to Poissonian, so that an arrival rate can be well-defined and estimated, the picture for characterization and detection algorithms improves substantially. The next and final post in this series will give a little bit of perspective.