Random bits

4 March 2010

Random bits

23 February 2010

“Understanding what normalcy looks like on your network so you can pinpoint abnormality is what is really important in the current threat environment,” he says. “Don’t trust only your existing security controls, and get eyes on your network.”

“IT security has evolved into a classic broken windows business. It exists to repair things that shouldn’t break in the first place. Furthermore, every dollar that a business spends on Security subtracts a dollar from expenditure on more worthwhile alternatives—product innovation, improved public services, higher salaries, dividends to investors, etc.”

“US analysts believe they have identified the Chinese author of the critical programming code used in the alleged state-sponsored hacking attacks on Google and other western companies, making it far harder for the Chinese government to deny involvement.”

“[Researchers have designed] a true random number generator that uses an extra layer of randomness by making a computer memory element, a flip-flop, twitch randomly between its two states 1 or 0. Immediately prior to the switch, the flip-flop is in a “metastable state” where its behaviour cannot be predicted. At the end of the metastable state, the contents of the memory are purely random.”

“Cyber ShockWave…featured a number of former US government officials who played the part of senior members of the NSC. The exercise sought to examine how the NSC would react to a major cyber attack in real time…the source of the attack remained unclear during the event…The mock NSC even discussed potentially nationalizing power companies and service providers if they failed to act in the national interest. Ultimately, in the several hours that the war game lasted, the US was increasingly beset by attack with little knowledge of who perpetrated it.” More reaction from Richard Bejtlich.


Martingales from finite Markov processes, part 1

15 February 2010

In an earlier series of posts the emerging inhomogeneous Poissonian nature of network traffic was detailed. One implication of this trend is that not only network flows but also individual packets will be increasingly well described by Markov processes of various sorts. At EQ, we use some ideas from the edifice of information theory and the renormalization group to provide a mathematical infrastructure for viewing network traffic as (e.g.) realizations of inhomogeneous finite Markov processes (or countable Markov processes with something akin to a finite universal cover). An essentially equation-free (but idea-heavy) overview of this is given in our whitepaper “Scalable visual traffic analysis”, and more details and examples will be presented over time.

The question for now is, once you’ve got a finite Markov process, what do you do with it? There are some obvious things. For example, you could apply a Chebyshev-type inequality to detect when the traffic parameters change or the underlying assumptions break down (which, if the model is halfway decent, by definition indicates something interesting is going on–even if it’s not malicious). This idea has been around in network security at least since Denning’s 1986-7 intrusion detection article, though, so it’s not likely to bear any more fruit (assuming it ever did). A better idea is to construct and exploit martingales. One way to do this to advantage starting with an inhomogeneous Poisson process (or in principle, at least, more general one-dimensional point processes) was outlined here and here.

Probably the most well-known general technique for constructing martingales from Markov processes is the Dynkin formula. Although we don’t use this formula at present (after having done a lot of tinkering and evaluation), a more general result similar to it will help us introduce the Girsanov theorem for finite Markov processes and thereby one of the tools we’ve developed for detecting changes in network traffic patterns.

The sketch below of a fairly general version of this formula for finite processes is adapted from a preprint of Ford (see Rogers and Williams IV.20 for a more sophisticated treatment).

Consider a time-inhomogeneous Markov process X_t on a finite state space. Let Q(t) denote the generator, and let P(s,t) denote the corresponding transition kernel, i.e. P(s,t) = U^{-1}(s)U(t), where the Markov propagator is

U(t) := \mathcal{TO}^* \exp \int_0^t Q(s) \ ds

and \mathcal{TO}^* indicates the formal adjoint or reverse time-ordering operator. Thus, e.g., an initial distribution p(0) is propagated as p(t) = p(0)U(t). (NB. Kleinrock’s queueing theory book omits the time-ordering, which is a no-no.)

Let f_t(X_t) be bounded and such that the map t \mapsto f_t is C^1. Write t_0 \equiv 0 and t_m = t. Now

f_t(X_t)-f_0(X_0) \equiv f_{t_m}(X_{t_m})-f_{t_0}(X_{t_0})

= \sum_{j=0}^{m-1} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right],

and the Markov property gives that

\mathbb{E} \left(f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j}) \ \big| \ \mathcal{F}_{t_j}\right)

= \sum_{X_{t_{j+1}}} \left[f_{t_{j+1}}(X_{t_{j+1}}) - f_{t_j}(X_{t_j})\right] \cdot P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1}).

The notation \mathcal{F}_t just indicates the history of the process (i.e., its natural filtration) at time t. The transition kernel satisfies a generalization of the time-homogeneous formula P(t) = e^{tQ}:

P_{X_{t_j},X_{t_{j+1}}}(t_j,t_{j+1})

= \delta_{X_{t_j},X_{t_{j+1}}} + (t_{j+1} - t_j) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j) + o(t_{j+1} - t_j)

so the RHS of the previous equation is t_{j+1} - t_j times

\frac{f_{t_{j+1}}(X_{t_j}) - f_{t_j}(X_{t_j})}{t_{j+1} - t_j} + \sum_{X_{t_{j+1}}} f_{t_{j+1}}(X_{t_{j+1}}) \cdot Q_{X_{t_j},X_{t_{j+1}}}(t_j)

plus a term that vanishes in the limit of vanishing mesh. The fact that the row sums of a generator are identically zero has been used to simplify the result.

Summing over j and taking the limit as the mesh of the the partition goes to zero shows that

\boxed{\mathbb{E} \left(f_t(X_t)-f_0(X_0)\right) = \mathbb{E} \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds.}

That is,

M_t^f := f_t(X_t)-f_0(X_0)- \int_0^t \left(\partial_s + Q(s)\right)f_s \circ X_s \ ds

is a local martingale, or if Q is well behaved, a martingale.

This can be generalized (see Rogers and Williams IV.21 and note that the extension to inhomogeneous processes is trivial): if X is an inhomogeneous Markov process on a finite state space \{1,\dots,n\} and g : \mathbb{R}_+ \times \{1,\dots,n\} \times \{1,\dots,n\} \times \Omega \longrightarrow \mathbb{R} is such that (t, \omega) \mapsto g(t,j,k,\omega) is locally bounded and previsible and g(t,j,j,\omega) \equiv 0 for all j,k, then M_t^g(\omega) given by

\sum_{0 < s \le t} g(s,X_{s-},X_s,\omega) - \int_{(0,t]} \sum_k Q_{X_{s-},k}(s) \cdot g(s,X_{s-},k,\omega) \ ds

is a local martingale. Conversely, any local martingale null at 0 can be represented in this form for some g satisfying the conditions above (except possibly local boundedness).

To reiterate, this result will be used to help introduce the Girsanov theorem for finite Markov processes in a future post, and later on we’ll also show how Girsanov can be used to arrive at a genuinely simple, scalable likelihood ratio test for identifying changes in network traffic patterns.


Random bits

12 February 2010

Random bits

10 February 2010

Random bits

29 January 2010

The Clinton doctrine

25 January 2010

After the fallout from Aurora, US Secretary of State Hillary Clinton gave a major speech last Thursday at the Newseum in DC. Highlights below:

The spread of information networks is forming a new nervous system for our planet…in many respects, information has never been so free…[but] modern information networks and the technologies they support can be harnessed for good or for ill…

There are many other networks in the world. Some aid in the movement of people or resources, and some facilitate exchanges between individuals with the same work or interests. But the internet is a network that magnifies the power and potential of all others. And that’s why we believe it’s critical that its users are assured certain basic freedoms. Freedom of expression is first among them…

…a new information curtain is descending across much of the world…

Governments and citizens must have confidence that the networks at the core of their national security and economic prosperity are safe and resilient…Disruptions in these systems demand a coordinated response by all governments, the private sector, and the international community. We need more tools to help law enforcement agencies cooperate across jurisdictions when criminal hackers and organized crime syndicates attack networks for financial gain…

States, terrorists, and those who would act as their proxies must know that the United States will protect our networks. Those who disrupt the free flow of information in our society or any other pose a threat to our economy, our government, and our civil society. Countries or individuals that engage in cyber attacks should face consequences and international condemnation. In an internet-connected world, an attack on one nation’s networks can be an attack on all [ed. see article 5 of the North Atlantic Treaty]. And by reinforcing that message, we can create norms of behavior among states and encourage respect for the global networked commons.

China denies everything and is trying to change the subject.

The tone of this speech was remarkable. While it is natural to expect that most nations conduct offensive computer network operations against foreign governments and organizations, getting publicly called on it is rare. Most observers have no doubt that the PRC has been infiltrating and attacking US government and commercial networks for strategic ends, and the NSA would not be doing its job if it were not doing the same thing abroad. So even if everything isn’t Marquis of Queensberry you wouldn’t expect to see folks complain too loudly.

But human rights and censorship is another story. There is a simple reason why Cold War rhetoric was recycled in this speech. Regardless of whether Google capitulates or leaves China (any other outcome is unlikely), by going public instead of leaking to the press they have put the PRC on the defensive. As I remarked earlier, Google surely must have known it had the (at least implicit) backing of the US before it (effectively) named names. The administration must have seen this as a golden opportunity to seize the moral high ground. When force of arms cannot be decisive, the justness of a cause still might be.


Random bits

20 January 2010

China and Google

14 January 2010

Time for the (n+1)th dissection of Google’s recent announcement concerning cyberattacks and censorship. (You’ve got to love recursion!)

As Galrahn points out, discounting Google’s market share relative to Baidu isn’t really sensible. They’ve got a lot of market share there, especially for non-search services without strong competitors—but many of these services (YouTube, Picasa, and often Blogger) have been blocked by the Chinese government. That speaks to two things in China: an opportunity for user base consolidation and to a governmental approach to information that is inimical to Google’s business model. More to the point:

For what amounts to only 2% of revenue, Google is threatening to disrupt the internet behavior of at minimum 118 million internet savvy Chinese and believes that fact alone has value in negotiations.

Source: http://www.flickr.com/photos/dong/4271035989/ / CC BY 2.0

Is this really a funeral, or will a hundred flowers blossom?

That is, Google is using a casus belli to force an issue that predates their entry into the Chinese market. It doesn’t cost them much to do so. They’ve already got the explicit backing of some other heavyweight Western companies (e.g., Yahoo) and network effects may induce many others to climb on board the bandwagon. They surely have the implicit backing of the US government in pushing back against China (and am I the only one who is thinking about the possibility of honeypots here? No way).

The bottom line is that this is not about a moral stand. By taking things public, Google is creating a negotiating opportunity for what it’s wanted all along from China. The real issue here is not who is “right” or “wrong” but who is going to win. For Google to thrive in China, the Chinese Communist Party’s control over information has to be weakened. For the CCP to thrive in China, it has to retain a monopoly on political power, and this requires controlling the flow of information. Moreover, and as I’ve mentioned before, there is a clear path from China’s cyber strategy to the foundations of its politics. So Google will probably not win much if anything in this skirmish.

The larger point is much more interesting, though. After a decade of undeclared cyber war with Chinese characteristics, this is the first overt public response. China has less to lose from cyberwarfare than the West does. But as it finds what it’s looking for with rampant cyberespionage, China may also find that it is hurting itself.


Random bits

13 January 2010