Effective statistical physics of Anosov systems

14 September 2010

We’ve just posted a paper titled “Effective statistical physics of Anosov systems” that details the physical relevance of the techniques we’ve used to characterize network traffic. The idea is that there appears to be a unique, well-defined effective temperature (and energy spectrum) for physical systems that are typical under the so-called chaotic hypothesis. Another whitepaper, also available on the arXiv through our downloads page, demonstrates how statistical physics can be used to detect malicious or otherwise anomalous network traffic. The current paper completes the circle and presents evidence indicating that the same ideas can be fruitfully applied to nonequilibrium steady states.


Random bits

19 May 2010

“like-sign dimuon charge asymmetry…in disagreement with the prediction of the standard model by 3.2 standard deviations”

OK, now VMs are totally safe! No need to worry about escape attacks or rootkits…but seriously, it’s good that not everyone takes hypervisor security for granted.

“there is now a significant body of work showing how to break conventional quantum cryptography systems based on various practical weaknesses in the way they are set up…while the known loopholes can be papered over, it’s the unknown ones that represent threats in the future…[researchers have shown that it's easy] with a little malicious intent to bend the assumptions behind perfect quantum cryptography.”


Random bits

7 May 2010

Principles of Robust Timing Over the Internet

“[An IPv4 address space] black market already exists, albeit on a small scale…[currently] IPv4 addresses are still relatively easy to get…[some believe] that regional registries such as ARIN should head off a potentially deleterious black market by creating a “white market” with established rules for trading IPv4 addresses at market-established costs…But the opportunity to cleanly switch from IPv4 to IPv6 passed many years ago. The current transition strategy, called “dual stack,” requires businesses to remain connected to both IPv4 and IPv6 networks until most of the Internet gets to “the other side” — a process expected to take at least five years.”

“Frosted windows may never be private again”

“a fundamental limit to the level of privacy that is possible when social networks are mined for recommendations”

“The 605-page [NSA IAD] PDF document reads like a listing of the pros and cons for a huge array of defensive and counterintelligence approaches and technologies that an entity might adopt in defending its networks…[one] section delves into the challenges of attributing the true origin(s) of a computer network attack”


Random bits

5 April 2010

“A low-complexity approach for reconstructing average packet arrival rates and instantaneous packet counts at a router in a communication network, where the arrivals of packets in each flow follow a Poisson process”

“It’s safe to say that when someone pays that much for a bug, they’re not going to tell the vendor to patch it.”

“Regulation is not the primary driver for new technology, new investment, or new training; the threats are”

Protecting Europe against large-scale cyber attacks

Would you have spotted this ATM skimmer?

DoE and power grid security


Random bits

15 March 2010

“I do believe NSA is still ahead, but not by much — a handful of years”

“[A researcher] gave a talk on his then current project to prove a certain OS kernel was secure…they hoped in two years to have a proof of the OS’s correctness. What struck me during his talk was he could write down on the board, a [formula that] captured the notion of data security: if a certain function f had this property, then he would be able to assert his OS could not leak any information…At the end of his talk I asked him if he wanted a proof now that his function f satisfied the formula. He looked at me puzzled, as did everyone else. He pointed out his f was defined by his OS, so how could I possibly prove it satisfied his formula—the f was thousands of lines of code. He added they were working hard on proving this formula, and hoped to have a full proof in the next 24 months…I walked to the board and wrote out a short set theory proof to back up my claim—any f had his property…I thought he would be shocked. I thought he might be upset, or even embarrassed his formula was meaningless. He was not at all. [He] just said they would have to find another formula to prove.”

“it’s possible to focus light through opaque materials and detect objects hidden behind them, provided you know enough about the material”


Random bits

4 March 2010

Narus develops a scary sleuth for social media

An invisible quantum tripwire

Aspects of CNCI declassified

IPv6 thoughts from Arbor

“Hackers who breached Google and other companies in January targeted source-code management systems”


Random bits

2 March 2010

Ryan Singel’s cri de coeur about cyberwar hype is too juicy to merely provide a link. A few choice excerpts:

The Washington Post gave [former DIRNSA and DNI] McConnell free space to declare that we are losing some sort of cyberwar…But that’s not warfare. That’s espionage…Those enamored with the idea of “cyberwar” aren’t dissuaded by fact-checking…[if the DoS attack on Estonia] was cyberwar, it’s pretty clear the net will be just fine. In fact, none of [the commonly cited examples] demonstrate the existence of a cyberwar, let alone that we are losing it. But this battle isn’t about truth. It’s about power…

the problem with developing cyberweapons…is that you need to know where to point them…The military needs targets…Never shy of extending its power, the military industrial complex wants to turn the internet into yet another venue for an arms race. And it’s waging a psychological warfare campaign on the American people to make that so. The military industrial complex is backed by sensationalism, and a gullible and pageview-hungry media…

There is no cyberwar and we are not losing it. The only war going on is one for the soul of the internet. But if…self-interested exaggerators dominate our nation’s discourse about online security, we will lose that war — and the open internet will be its biggest casualty.

On the opposite end of the nuance spectrum: more than 41% of the zeros of the zeta function are on the critical line.


Random bits

23 February 2010

“Understanding what normalcy looks like on your network so you can pinpoint abnormality is what is really important in the current threat environment,” he says. “Don’t trust only your existing security controls, and get eyes on your network.”

“IT security has evolved into a classic broken windows business. It exists to repair things that shouldn’t break in the first place. Furthermore, every dollar that a business spends on Security subtracts a dollar from expenditure on more worthwhile alternatives—product innovation, improved public services, higher salaries, dividends to investors, etc.”

“US analysts believe they have identified the Chinese author of the critical programming code used in the alleged state-sponsored hacking attacks on Google and other western companies, making it far harder for the Chinese government to deny involvement.”

“[Researchers have designed] a true random number generator that uses an extra layer of randomness by making a computer memory element, a flip-flop, twitch randomly between its two states 1 or 0. Immediately prior to the switch, the flip-flop is in a “metastable state” where its behaviour cannot be predicted. At the end of the metastable state, the contents of the memory are purely random.”

“Cyber ShockWave…featured a number of former US government officials who played the part of senior members of the NSC. The exercise sought to examine how the NSC would react to a major cyber attack in real time…the source of the attack remained unclear during the event…The mock NSC even discussed potentially nationalizing power companies and service providers if they failed to act in the national interest. Ultimately, in the several hours that the war game lasted, the US was increasingly beset by attack with little knowledge of who perpetrated it.” More reaction from Richard Bejtlich.


Martingales from finite Markov processes, part 1

15 February 2010

In an earlier series of posts the emerging inhomogeneous Poissonian nature of network traffic was detailed. One implication of this trend is that not only network flows but also individual packets will be increasingly well described by Markov processes of various sorts. At EQ, we use some ideas from the edifice of information theory and the renormalization group to provide a mathematical infrastructure for viewing network traffic as (e.g.) realizations of inhomogeneous finite Markov processes (or countable Markov processes with something akin to a finite universal cover). An essentially equation-free (but idea-heavy) overview of this is given in our whitepaper “Scalable visual traffic analysis”, and more details and examples will be presented over time.

The question for now is, once you’ve got a finite Markov process, what do you do with it? There are some obvious things. For example, you could apply a Chebyshev-type inequality to detect when the traffic parameters change or the underlying assumptions break down (which, if the model is halfway decent, by definition indicates something interesting is going on, even if it’s not malicious). This idea has been around in network security at least since Denning’s 1986-7 intrusion detection article, though, so it’s not likely to bear any more fruit (assuming it ever did). A better idea is to construct and exploit martingales. One advantageous way to do this, starting with an inhomogeneous Poisson process (or in principle, at least, more general one-dimensional point processes), was outlined here and here.
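For concreteness, the Chebyshev-type check mentioned above might look like the following minimal sketch. It assumes a baseline mean and variance for some per-interval traffic statistic are already known (for Poisson counts both equal the rate); the function name `chebyshev_alarms` and the numbers are purely illustrative and not taken from any of our tools.

```python
import numpy as np

def chebyshev_alarms(counts, mean, var, alpha=0.01):
    """Flag observations whose deviation from the baseline mean is too large.

    Chebyshev's inequality gives P(|X - mean| >= k * sigma) <= 1 / k**2 for any
    distribution with finite variance, so choosing k = 1 / sqrt(alpha) bounds
    the per-observation false-alarm probability by alpha under the baseline.
    """
    k = 1.0 / np.sqrt(alpha)
    threshold = k * np.sqrt(var)
    return np.abs(np.asarray(counts, dtype=float) - mean) >= threshold

# Hypothetical packet counts per interval against a Poisson baseline of rate 100
# (mean = variance = 100); with alpha = 0.01 the threshold is 10 standard deviations.
flags = chebyshev_alarms([100, 95, 250], mean=100.0, var=100.0, alpha=0.01)
```

The bound is distribution-free, which is also why it is loose: it only reacts to very large deviations.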

Probably the most well-known general technique for constructing martingales from Markov processes is the Dynkin formula. Although we don’t use this formula at present (after having done a lot of tinkering and evaluation), a more general result similar to it will help us introduce the Girsanov theorem for finite Markov processes and thereby one of the tools we’ve developed for detecting changes in network traffic patterns.

The sketch below of a fairly general version of this formula for finite processes is adapted from a preprint of Ford (see Rogers and Williams IV.20 for a more sophisticated treatment).

Consider a time-inhomogeneous Markov process X_t on a finite state space. Let Q(t) denote the generator, and let P(s,t) denote the corresponding transition kernel, i.e. P(s,t) = U^{-1}(s)U(t), where the Markov propagator is

U(t) := \mathcal{TO}^* \exp \int_0^t Q(s) \ ds

and \mathcal{TO}^* indicates the formal adjoint or reverse time-ordering operator. Thus, e.g., an initial distribution p(0) is propagated as p(t) = p(0)U(t). (NB: Kleinrock’s queueing theory book omits the time-ordering, which is a no-no.)
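A rough numerical sketch of the reverse time-ordered exponential: with the row-vector convention p(t) = p(0)U(t), the propagator can be approximated by an ordered product of short-time factors in which later times multiply on the right. The 3-state generator below is a made-up example, and the second-order Taylor factor is just one convenient choice of short-time approximation.

```python
import numpy as np

def generator(t):
    """Hypothetical 3-state generator Q(t); every row sums to zero."""
    a = 1.0 + 0.5 * np.sin(t)  # illustrative time-varying rate
    b = 2.0
    return np.array([[-a, a, 0.0],
                     [b, -(a + b), a],
                     [0.0, b, -b]])

def propagator(gen, t, steps=2000):
    """Approximate U(t), where p(t) = p(0) U(t), by an ordered product."""
    dt = t / steps
    n = gen(0.0).shape[0]
    U = np.eye(n)
    for j in range(steps):
        Q = gen((j + 0.5) * dt)  # generator at the midpoint of the j-th step
        # Second-order approximation of exp(dt * Q); its rows still sum to one.
        step = np.eye(n) + dt * Q + 0.5 * dt**2 * (Q @ Q)
        U = U @ step  # reverse time-ordering: later factors go on the right
    return U

U = propagator(generator, 2.0)
p_t = np.array([1.0, 0.0, 0.0]) @ U  # distribution at t = 2 starting in state 0
```

Because Q(s) and Q(s') generally do not commute, simply exponentiating the integral of Q would give a different (and wrong) matrix; the ordered product is what the \mathcal{TO}^* notation stands for.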

Let f_t(X_t) be bounded and such that the map t \mapsto f_t is C^1. Write t_0 \equiv 0 and t_m = t. Now

f_t(X_t)-f_0(X_0) \equiv f_{t_m}(X_{t_m})-f_{t_0}(X_{t_0})

= \sum_{j=0}^{m-1} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right],

and the Markov property gives that

\mathbb{E} \left(f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j}) \ \big| \ \mathcal{F}_{t_j}\right)

= \sum_{X_{t_{j+1}}} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right] \cdot P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1}).

The notation \mathcal{F}_t just indicates the history of the process (i.e., its natural filtration) at time t. The transition kernel satisfies a generalization of the time-homogeneous formula P(t) = e^{tQ}:

P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1})

= \delta_{X_{t_j},X_{t_{j+1}}} + (t_{j+1} - t_j) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j) + o(t_{j+1} - t_j)

so the RHS of the conditional expectation above is t_{j+1} - t_j times

\frac{f_{t_{j+1}}(X_{t_j}) - f_{t_j}(X_{t_j})}{t_{j+1} - t_j} + \sum_{X_{t_{j+1}}} f_{t_{j+1}}(X_{t_{j+1}}) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j)

plus a term that vanishes in the limit of vanishing mesh. The fact that the row sums of a generator are identically zero has been used to simplify the result.

Summing over j and taking the limit as the mesh of the partition goes to zero shows that

\boxed{\mathbb{E} \left(f_t(X_t)-f_0(X_0)\right) = \mathbb{E} \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds.}

That is,

M_t^f := f_t(X_t)-f_0(X_0)- \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds

is a local martingale, or if Q is well behaved, a martingale.
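As a sanity check, the boxed identity can be verified numerically without any simulation, since both sides are deterministic once the distribution p(s) is known. The sketch below uses a made-up 3-state generator and a made-up test function f_t(x) = cos(t) x^2 + t; it propagates p(s) with a midpoint scheme and compares the two sides.

```python
import numpy as np

STATES = np.array([0.0, 1.0, 2.0])  # numeric labels of a hypothetical 3-state chain

def Q(t):
    """Hypothetical time-dependent generator; every row sums to zero."""
    a, b = 1.0 + 0.5 * np.sin(t), 2.0
    return np.array([[-a, a, 0.0],
                     [b, -(a + b), a],
                     [0.0, b, -b]])

def f(t):
    """Illustrative test function f_t, stored as a vector over the states."""
    return np.cos(t) * STATES**2 + t

def df_dt(t):
    """Time derivative of f_t, again as a vector over the states."""
    return -np.sin(t) * STATES**2 + 1.0

def dynkin_sides(p0, t, steps=4000):
    """Return (E[f_t(X_t) - f_0(X_0)], E[integral of (d/ds + Q(s)) f_s(X_s) ds])."""
    dt = t / steps
    p = np.array(p0, dtype=float)
    integral = 0.0
    for j in range(steps):
        s = (j + 0.5) * dt
        Qs = Q(s)
        p_mid = p + 0.5 * dt * (p @ Qs)                   # distribution at the midpoint
        integral += dt * (p_mid @ (df_dt(s) + Qs @ f(s)))  # midpoint quadrature
        p = p + dt * (p_mid @ Qs)                          # midpoint (RK2) update of dp/ds = p Q(s)
    lhs = p @ f(t) - np.asarray(p0, dtype=float) @ f(0.0)
    return lhs, integral

lhs, rhs = dynkin_sides([1.0, 0.0, 0.0], 2.0)
```

The two returned numbers should agree up to discretization error, which is just the statement that M_t^f has zero mean.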

This can be generalized (see Rogers and Williams IV.21 and note that the extension to inhomogeneous processes is trivial): if X is an inhomogeneous Markov process on a finite state space \{1,\dots,n\} and g : \mathbb{R}_+ \times \{1,\dots,n\} \times \{1,\dots,n\} \times \Omega \longrightarrow \mathbb{R} is such that (t, \omega) \mapsto g(t,j,k,\omega) is locally bounded and previsible for all j,k, and g(t,j,j,\omega) \equiv 0 for all j, then M_t^g(\omega) given by

\sum_{0 < s \le t} g(s,X_{s-},X_s,\omega) - \int_{(0,t]} \sum_k Q_{X_{s-},k}(s) \cdot g(s,X_{s-},k,\omega) \ ds

is a local martingale. Conversely, any local martingale null at 0 can be represented in this form for some g satisfying the conditions above (except possibly local boundedness).
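The simplest instance takes g(s,j,k,\omega) = 1 for j \ne k (and 0 for j = k), so that the sum counts jumps and the compensator is the integrated exit rate -Q_{X_s,X_s}. The Monte Carlo sketch below checks that the resulting M_t has mean close to zero; for brevity it assumes a time-homogeneous generator (a special case of the result above), and the matrix, path count, and seed are arbitrary illustrative choices.

```python
import numpy as np

# Hypothetical time-homogeneous generator on three states (rows sum to zero).
Q = np.array([[-1.0, 1.0, 0.0],
              [2.0, -3.0, 1.0],
              [0.0, 2.0, -2.0]])

def jump_martingale(rng, t_end, x0=0):
    """One sample of M_t = (number of jumps in (0, t]) - integral of the exit rate."""
    t, x = 0.0, x0
    jumps, compensator = 0, 0.0
    while True:
        rate = -Q[x, x]                     # exit rate of the current state
        hold = rng.exponential(1.0 / rate)  # exponential holding time
        if t + hold >= t_end:
            compensator += rate * (t_end - t)
            return jumps - compensator
        compensator += rate * hold
        t += hold
        probs = Q[x].clip(min=0.0) / rate   # jump distribution over the other states
        x = rng.choice(len(probs), p=probs)
        jumps += 1

def sample_mean(n_paths=5000, t_end=2.0, seed=0):
    """Average M_t over independent simulated paths."""
    rng = np.random.default_rng(seed)
    return np.mean([jump_martingale(rng, t_end) for _ in range(n_paths)])
```

Calling `sample_mean()` should return a value near zero, with fluctuations on the order of the Monte Carlo standard error.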

To reiterate, this result will be used to help introduce the Girsanov theorem for finite Markov processes in a future post, and later on we’ll also show how Girsanov can be used to arrive at a genuinely simple, scalable likelihood ratio test for identifying changes in network traffic patterns.


Random bits

10 February 2010

Snowstorm round-up edition…

PRC busts a hacker ring…convenient timing for a PR-friendly move. But don’t look too soon…

Verizon blocks 4chan

Phishing .gov and .mil

Mobile phone communication patterns

Graphene superconducting at 90 K

Apparently some people think steganography is nontrivial

Hackers steal $4M in carbon credits

Botnet vs. botnet

Iran’s big day: Thursday

