Martingales from finite Markov processes, part 1

15 February 2010

In an earlier series of posts the emerging inhomogeneous Poissonian nature of network traffic was detailed. One implication of this trend is that not only network flows but also individual packets will be increasingly well described by Markov processes of various sorts. At EQ, we use some ideas from the edifice of information theory and the renormalization group to provide a mathematical infrastructure for viewing network traffic as (e.g.) realizations of inhomogeneous finite Markov processes (or countable Markov processes with something akin to a finite universal cover). An essentially equation-free (but idea-heavy) overview of this is given in our whitepaper “Scalable visual traffic analysis”, and more details and examples will be presented over time.

The question for now is, once you’ve got a finite Markov process, what do you do with it? There are some obvious things. For example, you could apply a Chebyshev-type inequality to detect when the traffic parameters change or the underlying assumptions break down (which, if the model is halfway decent, by definition indicates something interesting is going on–even if it’s not malicious). This idea has been around in network security at least since Denning’s 1986-7 intrusion detection article, though, so it’s not likely to bear any more fruit (assuming it ever did). A better idea is to construct and exploit martingales. One way to do this to advantage starting with an inhomogeneous Poisson process (or in principle, at least, more general one-dimensional point processes) was outlined here and here.

Probably the most well-known general technique for constructing martingales from Markov processes is the Dynkin formula. Although we don’t use this formula at present (after having done a lot of tinkering and evaluation), a more general result similar to it will help us introduce the Girsanov theorem for finite Markov processes and thereby one of the tools we’ve developed for detecting changes in network traffic patterns.

The sketch below of a fairly general version of this formula for finite processes is adapted from a preprint of Ford (see Rogers and Williams IV.20 for a more sophisticated treatment).

Consider a time-inhomogeneous Markov process $X_t$ on a finite state space. Let $Q(t)$ denote the generator, and let $P(s,t)$ denote the corresponding transition kernel, i.e. $P(s,t) = U^{-1}(s)U(t),$ where the Markov propagator is

$U(t) := \mathcal{TO}^* \exp \int_0^t Q(s) \ ds$

and $\mathcal{TO}^*$ indicates the formal adjoint or reverse time-ordering operator. Thus, e.g., an initial distribution $p(0)$ is propagated as $p(t) = p(0)U(t).$ (NB. Kleinrock‘s queueing theory book omits the time-ordering, which is a no-no.)

Let $f_t(X_t)$ be bounded and such that the map $t \mapsto f_t$ is $C^1.$ Write $t_0 \equiv 0$ and $t_m = t.$ Now

$f_t(X_t)-f_0(X_0) \equiv f_{t_m}(X_{t_m})-f_{t_0}(X_{t_0})$

$= \sum_{j=0}^{m-1} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right],$

and the Markov property gives that

$\mathbb{E} \left(f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j}) \ \big| \ \mathcal{F}_{t_j}\right)$

$= \sum_{X_{t_{j+1}}} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right] \cdot P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1}).$

The notation $\mathcal{F}_t$ just indicates the history of the process (i.e., its natural filtration) at time $t.$ The transition kernel satisfies a generalization of the time-homogeneous formula $P(t) = e^{tQ}:$

$P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1})$

$= \delta_{X_{t_j},X_{t_{j+1}}} + (t_{j+1} - t_j) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j) + o(t_{j+1} - t_j)$

so the RHS of the previous equation is $t_{j+1} - t_j$ times

$\frac{f_{t_{j+1}}(X_{t_j}) - f_{t_j}(X_{t_j})}{t_{j+1} - t_j} + \sum_{X_{t_{j+1}}} f_{t_{j+1}}(X_{t_{j+1}}) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j)$

plus a term that vanishes in the limit of vanishing mesh. The fact that the row sums of a generator are identically zero has been used to simplify the result.

Summing over $j$ and taking the limit as the mesh of the the partition goes to zero shows that

$\boxed{\mathbb{E} \left(f_t(X_t)-f_0(X_0)\right) = \mathbb{E} \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds.}$

That is,

$M_t^f := f_t(X_t)-f_0(X_0)- \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds$

is a local martingale, or if $Q$ is well behaved, a martingale.

This can be generalized (see Rogers and Williams IV.21 and note that the extension to inhomogeneous processes is trivial): if $X$ is an inhomogeneous Markov process on a finite state space $\{1,\dots,n\}$ and $g : \mathbb{R}_+ \times \{1,\dots,n\} \times \{1,\dots,n\} \times \Omega \longrightarrow \mathbb{R}$ is such that $(t, \omega) \mapsto g(t,j,k,\omega)$ is locally bounded and previsible and $g(t,j,j,\omega) \equiv 0$ for all $j,k,$ then $M_t^g(\omega)$ given by

$\sum_{0 < s \le t} g(s,X_{s-},X_s,\omega) - \int_{(0,t]} \sum_k Q_{X_{s-},k}(s) \cdot g(s,X_{s-},k,\omega) \ ds$

is a local martingale. Conversely, any local martingale null at 0 can be represented in this form for some $g$ satisfying the conditions above (except possibly local boundedness).

To reiterate, this result will be used to help introduce the Girsanov theorem for finite Markov processes in a future post, and later on we’ll also show how Girsanov can be used to arrive at a genuinely simple, scalable likelihood ratio test for identifying changes in network traffic patterns.

12 February 2010

10 February 2010

Random bits

4 February 2010

Hacking for Fun and Profit in China’s Underworld

Google + NSA Information Assurance Directorate

“Every user in the world is convinced they need security features, not security procedures.”

Advanced Persistent Threat highlighted by DNI; Mandiant report gives details. Mandiant have coined the APT term, and it’s probably because they deal with that sort of thing constantly: they’re very good at what they do. We hired them for internal test and eval work as well as usability input as our software began taking shape, and I came away impressed. It’s not surprising to see them tackling high-profile events.

Quantum energy teleportation

Random bits

2 February 2010

“The Internet is a Hobbesian ‘state of nature’ where anything goes, where even government attacks maintain ‘plausible deniability,’ and where 80 percent of industrial control software is hooked into an IP network.”

Congressional Research Service overview of cybersecurity legislation, executive initiatives, and options (PDF)