Jaynes and the Gibbs paradox

Ed Jaynes (see also here) quoted Eugene Wigner as saying that “entropy is an anthropomorphic concept” and provided his own elaboration of this idea:

It is necessary to decide at the outset of a problem which macroscopic variables or degrees of freedom we shall measure and/or control; and within the context of the thermodynamic system thus defined, entropy will be some function S(X_1,\dots, X_n) of whatever variables we have chosen. We can expect this to obey the second law T dS \ge dQ only as long as all experimental manipulations are confined to that chosen set. If someone, unknown to us, were to vary a macrovariable X_{n+1} outside that set, he could produce what would appear to us as a violation of the second law, since our entropy function S(X_1,\dots, X_n) might decrease spontaneously, while his S(X_1,\dots, X_n, X_{n+1}) increases.

As part of his MAXENT program, Jaynes discoursed at some length on the Gibbs paradox. In one version of this, a box contains a monatomic ideal gas of 2N atoms in equilibrium. Now divide the box in half with an impermeable membrane. A naïve computation of the entropy suggests that S_{box} - 2S_{half} = 2Nk\ln 2 \ne 0, even though nothing physical has changed. The resolution is found in the recognition that introducing the membrane effectively distinguishes atoms on one side of the membrane from those on the other. But the atoms in the undivided volume are indistinguishable, and if we divide each partition function by the right factorial (N! for N indistinguishable atoms), the paradox disappears.
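To make the arithmetic concrete, here is a minimal sketch in Python of the naïve count versus the Gibbs-corrected one, keeping only the volume-dependent part of the ideal-gas entropy (the temperature-dependent terms cancel in the difference). The function names and natural units are my own illustration, not anything from Jaynes:

```python
import math

k = 1.0  # Boltzmann constant in natural units

def S_naive(N, V):
    """Volume-dependent part of the ideal-gas entropy for *distinguishable*
    atoms: S/k = N ln V (plus T-dependent terms that cancel below)."""
    return k * N * math.log(V)

def S_gibbs(N, V):
    """Same, but with the 1/N! indistinguishability factor.
    Stirling's approximation ln N! ~ N ln N - N gives
    S/k = N ln(V/N) + N (plus the same T-dependent terms)."""
    return k * (N * math.log(V / N) + N)

N, V = 1.0e23, 1.0
for S in (S_naive, S_gibbs):
    print(S.__name__, S(2 * N, 2 * V) - 2 * S(N, V))
# S_naive yields 2*N*k*ln(2) ~ 1.386e23: a spurious "entropy of mixing"
# S_gibbs yields 0.0: the paradox disappears
```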

Jaynes analyzed the Gibbs paradox in detail and concluded that

The rules of thermodynamics are valid and correctly describe the measurements that it is possible to make by manipulating the macrovariables within the set that we have chosen to use… The entropy of mixing does indeed represent human information; just the information needed to predict the work available from the mixing.

Lawrence Sklar cites an example of the Gibbs paradox with hydrogen gas in singlet and triplet states that is attractive but, I think, misleading: molecular collisions in a gas will induce transitions between the spin states that would greatly complicate attempts to deal with entropies of mixing. However, Sklar’s point that entropies depend on the level of physical description is correct, as an amended example with (e.g.) molecular enantiomers (i.e., chiral “mirror images”) quite clearly shows.

The dependence of entropies on levels of description is fully manifested in information theory, and it is important to keep issues like this in mind when using entropy methods for anomaly detection or, more generally, techniques like some of ours, in which network traffic is mapped onto the model thermodynamic system of a (typically single-particle) Bose gas. The practical utility of this sort of technique was again anticipated by Jaynes:

A physical system always has more macroscopic degrees of freedom beyond what we control or observe, and by manipulating them a trickster can always make us see an apparent violation of the second law.

Therefore the correct statement of the second law is not that an entropy decrease is impossible in principle, or even improbable; rather that it cannot be achieved reproducibly by manipulating the macrovariables \{X_1,\dots, X_n\} that we have chosen to define our macrostate. Any attempt to write a stronger law than this will put one at the mercy of a trickster, who can produce a violation of it.

But recognizing this should increase rather than decrease our confidence in the future of the second law, because it means that if an experimenter ever sees an apparent violation, then instead of issuing a sensational announcement, it will be more prudent to search for that unobserved degree of freedom. That is, the connection of entropy with information works both ways; seeing an apparent decrease of entropy signifies ignorance of what were the relevant macrovariables.

One of the things that we’ve done is to demonstrate (see the data from the paper “Effective temperature for finite systems” on our downloads page) that the mathematical apparatus of statistical physics can be used quite effectively to define macrovariables reflecting bulk traffic characteristics from a small set of source/destination attributes of packets, and to look for “tricksters.”
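As a loose illustration of the entropy-monitoring idea only (the Bose-gas mapping and effective-temperature analysis live in the paper, not in this sketch), one can track the Shannon entropy of a per-window distribution of some packet attribute and flag windows that deviate sharply from the baseline. The windowing, the z-score test, and the threshold below are hypothetical choices of mine:

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def flag_windows(windows, z_thresh=2.5):
    """Flag time windows whose attribute entropy is a z-score outlier.

    windows: list of lists, each inner list holding one attribute
    (e.g. a destination port) per observed packet in that window.
    The z-score threshold is a hypothetical choice, not a prescription.
    """
    H = [shannon_entropy(w) for w in windows]
    mean = sum(H) / len(H)
    std = (sum((h - mean) ** 2 for h in H) / len(H)) ** 0.5
    return [i for i, h in enumerate(H) if std and abs(h - mean) / std > z_thresh]

# Example: nine windows of mixed port traffic, one swamped by a single port
normal = [[80, 443, 80, 8080, 443, 80, 22, 443] for _ in range(9)]
attack = [[6667] * 8]
print(flag_windows(normal + attack))  # -> [9]
```

A sudden collapse of attribute entropy, as in the toy example above where one destination port swamps a window, is the kind of deviation such a monitor would surface; a spike can be just as informative.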

I have mixed feelings about Jaynes’ legacy. The two things he is best known for are his advocacy of Bayesian inference (I personally feel that conditional probability is not a big deal, and I have never understood the fuss over choosing a justified prior, though I have seen lots of smart people mess this sort of thing up) and maximum entropy (I think it’s usually a good approximation technique that has to be used with some care, sometimes more than its practitioners employ). But I’ve got to hand it to the man for his ability to ferret out hidden assumptions and to transform “obvious” facts into powerful tools. A referee said about his original maximum entropy work that

It has no apparent practical application whatsoever. The problem and the point of view are not familiar to physicists. As a physicist I would raise the question whether the point of view is entirely new, whether it has been discussed explicitly by Information Theory people, or whether it is implicit in the work of Information Theory experts. I would guess that it is at least implicit in their thinking.

Jaynes framed this review and hung it on his wall. I think it’s safe to say he got the last laugh there.
