David Layzer - The Arrow of Time

Unpublished manuscript, June 24, 1971. This early paper (eight years after his strong cosmological principle) contains a clear account of Layzer's greatest cosmological accomplishment - the explanation of the growth of order in the universe from an original thermal equilibrium state.

The physicist's world is made up of events localized in space and time, which he views as components of a four-dimensional continuum. Intuitively, we tend to take a quite different view. We think of the world as extended in space but as unfolding in time. In the physicist's four-dimensional picture all moments in time are on the same footing. But intuitively we assign special significance to the present moment, which we think of as the crest of a wave continually advancing into the future and transforming it into the past. We regard the future as being radically different from the past. We know the past through tangible records, including those contained in our own nervous systems, but we can only make more or less incomplete predictions about the future. Moreover, we believe that we cannot change the past but that we can influence the future, and we base our ethical and judicial systems on this premise. For such notions as praise and blame, reward and punishment, would be meaningless if the future were not only unknown but also in some degree indeterminate.

The idea that the world is unfolding in time cannot be dismissed by the physicist as being purely subjective. Biological evolution exemplifies the unfolding process on a vast scale. Starting with the formation of simple organic compounds in a chemically favorable environment, it has generated, along continually branching paths, an immense and constantly increasing variety of ever more complex living forms. Nor are evolutionary processes confined to the biosphere. The earth's crust records the vicissitudes of four and one-half billion years of evolutionary change, while the moon, lacking a protective atmosphere, preserves in its pitted surface a chronological record of countless meteor impacts. On a larger scale, the regularities that characterize the orbits of the planets and their satellites have resulted from evolutionary processes not yet completely understood by astronomers. Stellar evolution, on the other hand, is the subject of a detailed and highly successful mathematical theory that has brought to the study of stars the kind of order that the theory of evolution has brought to the study of living organisms. Astronomers now recognize red giants, supernovae, and white dwarfs as successive stages in the evolution of a single star rather than distinct stellar "species." Thus biology, geology, and astronomy all offer support for the intuitively appealing picture of an evolving universe, growing more complex as it unfolds in time.

This picture, however, is strikingly at variance with that currently accepted by most physicists. The physicist's picture has two distinct aspects, corresponding to two distinct levels of description. At the macroscopic level, the second law of thermodynamics defines a preferred direction in time, for it implies that "there exists in nature a quantity which changes always in the same direction in all natural processes."* This quantity, the entropy, is commonly interpreted as a measure of disorder. Thus the Second Law seems to imply that the universe is continually growing more chaotic.

*Treatise on Thermodynamics, Max Planck, 3rd edition translated by A. Ogg, Dover Publications, 1945.

At the microscopic level, however, the laws of physics do not distinguish between the two directions of time. The Second Law has no meaning at this level because entropy is a purely macroscopic quantity whose definition hinges on the distinction between heat and mechanical energy. Thus when large-scale organized motion is converted into irregular molecular motion, as by friction, energy is conserved but heat (a measure of the energy associated with irregular molecular motions) and entropy are generated. The distinction between heat and mechanical energy, and with it the possibility of defining entropy, disappears when we include individual molecules and their motions in our description.

Thus we are faced with three radically different views of the world, each with a substantial claim to generality. Intuition, supported by biology, geology, and astronomy, tells us that the world is growing more complex, or winding up. Macroscopic physics, on the other hand, asserts that the universe is growing increasingly chaotic, or running down. Finally, at the deepest and most detailed level of description all traces of time's arrow seem to have vanished; the universe is neither winding up nor running down. How are these apparently contradictory views to be reconciled?

It is important to recognize at the outset that the intuitive character of the first view does not argue strongly in its favor. On the contrary, every significant advance in physics has demonstrated the falsity of some widely held intuitive notion. If Galileo's claim that heavy and light objects fall at the same rate in a vacuum no longer seems as absurd to us as it did to Galileo's contemporaries, twentieth century common sense still refuses to accept the basic truth that a particle can be, and usually is, in two places at the same time, or the equally basic truth that two particles cannot be in different places at the same time (in all frames of reference).

[Note: It is only the probability of finding the particle that exists in many places at the same time.]

Yet the intuitive view of time does, as we have seen, have an objective basis. The phenomenon of memory, which is intimately bound up with our conscious perception of time, is not in its broadest sense confined to living systems. Many nonliving systems, including most astronomical systems, preserve a partial record of their past states, and all such records, whether they occur in living or nonliving systems, fit into a single interlocking pattern characterized by a continual growth of complexity or order.

The existence of a wide class of order-generating processes does not contradict the second law of thermodynamics, as it may on first sight appear to. The Second Law requires only that the entropy of any closed system never decreases with time. (A closed system is one that does not communicate or interact with the outside world.) All living systems - and indeed all astronomical systems - are open. They can therefore diminish their entropy in various ways, the simplest of which is by giving off heat to the environment. (The flow of entropy into or out of a system is directly proportional to the flow of heat.) But the Second Law requires that when an open system loses entropy, a more than compensating increase of entropy occur elsewhere. Thus the processes responsible for reducing entropy locally, for example in living systems, invariably generate entropy on a wider scale.
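
The entropy bookkeeping in this paragraph can be illustrated with a small numerical sketch. It assumes the standard Clausius relation (entropy flow equals heat flow divided by absolute temperature) and idealizes both the system and its environment as staying at fixed temperatures; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def entropy_changes(heat_out, t_system, t_environment):
    """Entropy bookkeeping when an open system gives off heat.

    Assumes dS = Q / T with both temperatures held fixed (an idealization).
    heat_out is in joules; temperatures are in kelvin.
    """
    ds_system = -heat_out / t_system             # the system loses entropy
    ds_environment = heat_out / t_environment    # the cooler environment gains more
    return ds_system, ds_environment, ds_system + ds_environment


# Illustrative values only: a warm system shedding 100 J to cooler surroundings.
ds_sys, ds_env, ds_total = entropy_changes(100.0, t_system=310.0, t_environment=290.0)
print(ds_sys, ds_env, ds_total)
```

The local decrease is more than offset by the increase in the surroundings whenever heat flows from the warmer body to the cooler one, which is the case the paragraph describes.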

On the other hand, the fact that order-generating processes are compatible with the second law of thermodynamics does not imply, as some authors have claimed, that they owe their existence to it. Consider the order-generating process of cell division. When a living cell divides, heat is given off, and the entropy of a larger, effectively closed, system consisting of the cell and some portion of its environment increases, in conformity with the Second Law. But cell division does not occur because it generates entropy. For suppose that the cell were dead. Then instead of dividing it would decompose - a process that also generates entropy. Whether entropy generation occurs through division or decomposition does not in this example depend on the environment, which is the same in both cases, but on a subtle and elusive property of the cell itself, essentially unconnected with the Second Law. This and similar examples suggest that the existence of order-generating processes is a property of the universe that is consistent with, but not explained by, the second law of thermodynamics.

This apparent gap in our current physical description of the universe has been largely ignored by physicists. On the other hand, it has been seized upon by certain philosophers, notably Henri Bergson and Teilhard de Chardin, who maintain that what I have called order-generating processes are manifestations of a movement in nature that cannot be described by existing physical laws. Bergson, indeed, held this movement to be rationally incomprehensible, though capable of being grasped by an act of intuition. I shall argue in this essay that neither the physicist's neglect of evolutionary processes nor the vitalist's rejection of physics as the means of understanding them is justified. We shall find that both the macroscopic picture of a universe evolving toward total disorder and the microscopic picture of a universe that changes but does not evolve rest, not on fundamental physical laws, but on complicated and inconsistent auxiliary assumptions, most of them implicit. I propose to replace these assumptions by simpler and more consistent ones. The resulting physical picture of the universe, though radically different from the one currently accepted by most physicists, reconciles the second law of thermodynamics with the existence of order-generating processes and with the time-symmetric character of physical laws at the microscopic level.

To uncover the assumptions and contradictions inherent in the physicist's current world picture let us begin by analyzing the apparent contradiction between its macroscopic and microscopic aspects. At first sight it may appear that the emergence of a preferred direction in time at the macroscopic level of description, though no asymmetry is present at the microscopic level, presents no special problem. When we view a painting by Seurat at the distance intended by the artist, we perceive a variety of shapes and shades of color, subtly integrated into a coherent pattern. However, when we view the same painting at close range, the "macroscopic" shapes and colors resolve into a complex array of uniformly colored dots, individually unrelated to the "macroscopic" color fields in which they lie. The "macroscopic" colors and shapes are, in effect, statistical properties of the "microscopic" distribution of colored dots. Can we then interpret entropy as a shape and the second law of thermodynamics as a pattern that figure in a macroscopic description of the world but not in a microscopic description?

For the analogy to be a fair one, the "microscopic" description of the painting must be complete. It must contain, therefore, in addition to a precise description of the colored dots and their positions on the canvas, formulas relating the "macroscopic" colors to appropriate statistical properties of the "microscopic" distribution of colored dots. Given such formulas - which in one form or another must have been utilized by the artist himself - we could in fact deduce the "macroscopic" colors and shapes from the "microscopic" description.

To derive macroscopic physical laws from microscopic ones we must systematically ignore certain kinds of information - for example, information about the precise positions and velocities of individual molecules. We must view the microscopic picture through a lens that blurs its details. Now, the act of blurring can bring out patterns not apparent in the microscopic description, as in the analogy of the Seurat painting. But it cannot create a preferred direction in time (or, for that matter, in space) if none was present in the initial microscopic description, for the blurring process itself does not distinguish between the two directions of time (nor between different directions in space). Thus the second law of thermodynamics is not implicit in a time-symmetric microscopic description of nature in the way that "macroscopic" patterns are implicit in a Seurat painting. Because it distinguishes between the direction of the past and the direction of the future, the second law of thermodynamics must contain a new element, a seed of irreversibility, not present in the "complete" microscopic description. What is this seed?

It cannot be a new microscopic law, for we have good reason to believe that our current laws, though admittedly incomplete, afford an adequate basis for describing macroscopic phenomena. A new microscopic law distinguishing between the two directions of time - and recent experimental results on the decay of the K0 meson strongly suggest that the laws governing weak interactions are indeed not perfectly time-symmetric - would have virtually no observable consequences at the macroscopic level.

What then is the nature of the new element? To answer this question I must make some preliminary remarks about the structure of physical theories. Every physical theory of a specific phenomenon or group of phenomena has two distinct kinds of elements, which I shall call laws and constraints. The laws describe the regularities underlying the phenomena. They are few in number and each applies over a wide domain.

For example, Newton's laws of motion apply to mechanical systems composed of particles moving at speeds much less than the speed of light; the laws of thermodynamics to physical systems for which such concepts as temperature, internal energy, and heat can be defined; Maxwell's laws to electromagnetic phenomena in continuous media; and so on. In short, the laws of physics define what is possible.

The constraints serve to select from the set of all events governed by a given set of laws the particular phenomenon one wishes to describe. They define what is relevant.

Constraints often take the form of initial conditions. For example, a theoretical description of planetary motions could consist of (a) Newton's laws of motion and his law of gravitation, together with (b) a list of the positions and velocities of the planets at a given instant of time. The laws predict the positions and velocities of the planets at all earlier and later times, while the initial conditions serve to distinguish the solar system from all other systems of gravitating masses, and its actual history from all possible histories.

Boundary conditions make up a second important class of constraints. Whenever the physicist carves out for special consideration a finite portion of the universe, he must specify what is happening at the interface between the system he is studying and its environment. For example, a physicist studying the energy budget of the earth's upper atmosphere must allow for the inward flow of radiation, fast particles, and plasma from the sun as well as for the outward flow of radiation; he does this by specifying boundary conditions at some conveniently defined level representing the "top" of the atmosphere.

A third class of constraints, of which I shall have more to say later, is made up of symmetry conditions. These are usually less restrictive than initial or boundary conditions. For example, we may assume that the temperature, density and chemical composition at any point in a nonrotating star depend only on that point's distance, and not on its direction, from the center of the star. This symmetry condition - which exemplifies the utility of Occam's razor ("Do not multiply hypotheses needlessly") in scientific problems - enormously simplifies the mathematical treatment of stellar structure and evolution. Symmetry conditions may express temporal as well as spatial constraints. Stationary flows (exemplified by steady currents in the ocean or in the atmosphere) and periodic motions (exemplified by waves of all kinds) are each defined by a particular kind of temporal symmetry.

Laws and constraints play complementary roles in the structure of reality. The laws give phenomena their underlying coherence and unity; the constraints are responsible for their diversity and individuality. Physicists have usually attached little theoretical importance to constraints.* When one is dealing with processes at the atomic or subatomic level, this attitude is clearly justified. To the experimental physicist constraints represent something to be manipulated rather than studied.

*See in this connection E. P. Wigner's Nobel Prize Lecture, "Events, Laws of Nature, and Invariance Principles" (in "Symmetries and Reflections", Indiana University Press, Bloomington and London, 1967).

At the macroscopic - and especially at the astronomical - level, however, constraints appear in a different light. Consider the constraints that define an individual star. These include, among other data, its initial mass and chemical composition and its age. By combining theory with appropriate observations, one can evaluate these properties. In themselves the resulting data have no special meaning; they merely serve to distinguish the particular star selected for study from others that might equally well have been selected. But if one examines data pertaining to appropriately selected groups of stars, one notices certain regularities. For example, stars belonging to the same cluster have always been found to have nearly the same age and the same chemical composition. Different clusters, moreover, usually have different ages and chemical compositions, and these differences depend systematically on the form of the cluster and its location in the Galaxy. Thus the constraints that serve to distinguish one star from another exhibit statistical regularities.

Although these regularities clearly differ in kind from those we call laws, they present a similar challenge to the theorist. To explain them we would need to have a theory for the formation of stars and star clusters. Such a theory would itself involve initial conditions - constraints - that would, presumably, exhibit statistical regularities inviting theoretical explanation at a still deeper level. Following this course, we would be led to formulate a sequence of increasingly general cosmogonic problems whose solutions would yield increasingly general explanations of the statistical regularities characterizing the astronomical universe. Does this hypothetical chain of theories ultimately terminate in an irreducible cosmogonic hypothesis? And if so, what is its nature? I shall return to these questions later.

We are now in a position to resume our discussion of the second law of thermodynamics and its relation to the time-symmetric microscopic description of nature. We have seen that the act of viewing the microscopic picture through a lens that blurs its details cannot destroy its symmetry. A new element must be added, and this new element cannot be a microscopic law. It must therefore be a constraint. Now, the second law of thermodynamics is rightly regarded as one of the most general and profound of all physical laws. Since it owes its unique character - the fact that it singles out a preferred direction in time - to a constraint, this constraint must reflect some equally general and profound property of the physical universe. Oddly enough, there is no consensus among physicists as to the nature of this property - or even as to the fact of its existence. In the following discussion I shall try to elucidate not only the constraint itself but also the reasons why physicists have been unable to agree about its nature.

[Note: Layzer's perfume bottle is one of the most important visualizations of physical processes of all time.]

Let us try to isolate the seed of irreversibility in a simple example. In one corner of a perfectly still room I open a bottle of perfume. A short time later, my partner, standing in the opposite corner, reports that he can smell the perfume. The perfume cannot have been transported by an air current because I have taken pains to ensure that the air is not in motion. But the molecules that compose the air, as well as the perfume molecules, are themselves incessantly in motion. From time to time, individual perfume molecules escape from their container and make their way along independent zigzag trajectories to all corners of the room. If we wait long enough, the perfume will all evaporate and perfume molecules will be found evenly distributed throughout the room. This process, known as molecular diffusion, is clearly irreversible. Our past experience of similar processes has bred in us the conviction that, no matter how long we wait, the perfume-molecules will not spontaneously reassemble in their original container.

Yet it can be argued that such an occurrence is not impossible. Imagine that the experiment just described could be recorded on film in microscopic detail so that we could follow the motion of each individual molecule. If such a film were reversed it would show the perfume-molecules converging along their complicated zigzag trajectories onto the perfume bottle, and there uniting into a liquid. And the trajectory of every molecule in the reversed film would conform in its smallest detail to the laws of physics, for the laws governing molecular motion do not distinguish between the two directions of time. From the closest examination of individual molecular trajectories, we could never distinguish between the actual and the reversed films. Why then would we be reluctant to accept the reversed film as the record of an actual occurrence?

[Note: If physical molecules were motion-reversed, they would not return to the bottle because of quantum erasure of path information.]

The obvious answer is that the initial conditions (individual molecular positions and velocities) in the reversed film are distinguished by an exceedingly special property, unlikely to be found in the real world, namely, the property of generating a final state in which all the perfume-molecules have reassembled in the perfume bottle. But this answer immediately raises two more basic questions. Why are these initial conditions so special? And why are they unlikely to be found in the real world?

The first question is comparatively easy to answer. What fraction of macroscopically uniform initial states have the special property noted above? If an initial state does not have this property, then, in the final state, not all the perfume-molecules will have found their way back into the bottle. Hence the fraction of initial states that do have the property is the same as the fraction of possible final states in which all the perfume-molecules are in the bottle. This fraction is easy to calculate. Let v be the volume of the perfume bottle and V the volume of the room. For a single perfume-molecule the fraction of "bottled" states is v/V. For a pair of perfume-molecules, the fraction of possible states in which both perfume-molecules are in the bottle is (v/V)², and for N perfume-molecules the fraction of "bottled" states is (v/V)^N. Now, for a small bottle of perfume and a medium-sized room, a reasonable value of v/V is 10^-8, and of N, 10²³. Thus the fraction of possible final states in which all the perfume-molecules are reunited in the bottle - which is the same as the fraction of possible initial states that generate final states with this property - is (10^-8)^(10²³) = 10^(-8×10²³).
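
The arithmetic behind this estimate can be checked with a short sketch. The fraction (v/V)^N is far smaller than the smallest positive floating-point number, so the sketch works with its base-10 logarithm, N log10(v/V), instead of the fraction itself. The function and constant names are illustrative choices, not part of the original text.

```python
import math


def log10_bottled_fraction(v_over_V, n_molecules):
    """Base-10 logarithm of (v/V)**N, the fraction of 'bottled' states."""
    return n_molecules * math.log10(v_over_V)


# Values quoted in the text: v/V = 10^-8 and N = 10^23.
exponent = log10_bottled_fraction(1e-8, 1e23)

# For comparison: picking one marked atom out of roughly 10^80 has probability 10^-80.
LOG10_LUCKY_ATOM = -80.0

print(f"fraction of bottled states = 10^({exponent:.3g})")
print("smaller than the lucky-atom chance:", exponent < LOG10_LUCKY_ATOM)
```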

It is difficult to form an adequate conception of how small this number is. The theoretically observable universe contains about 10⁸⁰ atoms. If one of these atoms bore a lucky number, the chance of selecting it at random would be about 10^-80, which is incomparably greater than the chance of selecting at random a "favorable" initial distribution of perfume molecules. [As Ludwig Boltzmann calculated in the late nineteenth century.]

Yet we have just seen that the final state in the reversed film, with all the perfume-molecules reunited, is equally special, in the sense that, of all possible final states, only the fraction 10^(-8×10²³) has this property. Why then do we not find the existence of a full bottle of perfume astonishing? Let us examine more closely the information needed to describe the initial and final states. The quantity of information contained in the statement "All the perfume-molecules are in the bottle" is exactly equal to that contained in the statement "The perfume-molecules are uniformly distributed throughout the room but will presently reassemble in the perfume bottle." But the quality of the information conveyed by these two statements is different. The information conveyed by the first statement is purely macroscopic; that conveyed by the second is purely microscopic.

This qualitative distinction has a practical aspect. We can prepare the physical state described in the first statement by a series of macroscopic operations, namely, those involved in manufacturing the perfume, bottling it, and bringing it into the room. None of these operations concerns itself with the positions and velocities of individual perfume-molecules. On the other hand, it is precisely these microscopic properties that matter when we seek to realize the physical situation described by the second statement. In practice, of course, it would be impossible to prepare a collection of perfume-molecules that would reassemble in a bottle, while bottled perfume is relatively easy to come by.

We can now give a preliminary answer to our second question: why are the initial conditions in the reversed film unlikely to occur in the real world? The preceding discussion suggests the answer: because the information specifying initial conditions that occur in nature is macroscopic and not microscopic. This statement is in fact the link we have been seeking between microphysics and macrophysics, the seed of irreversibility hidden in the second law of thermodynamics. To make this claim plausible, however, I must explain the technical meaning of "information" and the distinction between macro-information and micro-information.

The concept of information, one of the most powerful and far-reaching in the lexicon of the exact sciences, expresses those aspects of more general concepts like order, organization, and complexity that are amenable to precise mathematical formulation. Underlying all these concepts is the notion of probability. Consider, for example, the concept of order as it relates to the arrangement of books in a bookcase. In the least orderly arrangement the books are distributed on the shelves at random. The number of possible random arrangements is large (if there are many books) and the probability of finding any given book in a particular place is low. A more orderly arrangement would separate the collection into a few broad categories - history, literature, and science, say - leaving the order within a given category random. The number of possible arrangements of this kind is less than in the previous case, and so, therefore, is our uncertainty as to where to look for a particular book. A still more orderly arrangement would have the books ordered alphabetically by author within each category. This arrangement minimizes our uncertainty as to where to look for a particular book. Thus the notions of order and uncertainty are closely related, at least in this example; the more orderly an arrangement, the smaller the uncertainty associated with it. The notion of uncertainty is in turn closely related to that of information, since a gain of information represents a loss of uncertainty and a loss of information represents a gain of uncertainty. Thus uncertainty and information are two sides of the same coin.
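
The bookcase comparison can be made quantitative with a rough count. The sketch below assumes, purely for illustration, 300 books, three equal categories that each occupy a fixed block of shelf places, and a unique alphabetical order; none of these figures come from the text. It reports the base-10 logarithm of the number of possible arrangements in each case.

```python
import math


def log10_arrangements(block_sizes):
    """log10 of the number of orderings when books may be permuted only
    within their own block of places (product of factorials of block sizes)."""
    # lgamma(n + 1) equals ln(n!), which avoids computing huge factorials.
    return sum(math.lgamma(size + 1) for size in block_sizes) / math.log(10)


random_shelf = log10_arrangements([300])            # no ordering at all
by_category = log10_arrangements([100, 100, 100])   # sorted into three subjects
alphabetized = log10_arrangements([1] * 300)        # every book has one place

print(random_shelf, by_category, alphabetized)
```

Each step toward a more orderly arrangement shrinks the count, and with it the uncertainty about where a given book sits.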

Suppose that the bookcase contains n books occupying n numbered places. In general we are not certain in which place a particular book can be found. (The book represents a physical system; the numbered places, its possible states.) To give more precise mathematical expression to our condition of uncertainty, let us assign definite probabilities p_1, p_2, ..., p_n to each of the n places. The p_k are numbers lying between 0 and 1 whose sum is equal to 1. An event with probability p = 1 is certain to occur, one with probability p = 0 is certain not to occur; intermediate values of p correspond intuitively to intermediate likelihoods of occurrence - but this statement should not be interpreted as a prescription for evaluating them. The probability that a given book will be found either in the jth or the kth place is the sum (p_j + p_k) of the individual probabilities. The sum of all the probabilities p_k equals 1, because a given book in the collection is certain to be found in one of the n places.

Suppose that a complete set of probabilities p_1, p_2, ..., p_n, or {p_k} for short, is given - leaving aside, for the moment, the important questions of how these numbers can be assigned theoretically and estimated experimentally. We wish to calculate from these probabilities a numerical measure U of the uncertainty associated with this assignment. U must have certain obvious properties: it must depend in a continuous (i.e. smooth) way on all the probabilities in the set; it must assume its minimum value (which we may take to be zero) when any one of the p_k is equal to 1 and the others are equal to 0; it must assume its maximum value when all the probabilities are equal, that is, when p_k = 1/n for all values of k. The uncertainty measure should also satisfy some less obvious requirements. Suppose, for example, that we cannot recall whether a certain book - let us say a popular book on science - has been classified as science or literature, nor whether its author is Smith or Psmith. Here we have two separate though related sources of uncertainty, and we may reasonably demand that they should be separately measurable and that their sum should be equal to the total uncertainty. It turns out that there is one and only one uncertainty measure that satisfies this and the preceding requirements. It is given by the extremely simple formula U = Σ p_k log(1/p_k) [the summation is from k = 1 to n].
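
A minimal sketch of this uncertainty measure follows, with checks of the properties listed above. The function name and the sample probabilities are illustrative. The additivity check covers only the special case in which the two sources of uncertainty (subject and author) are statistically independent, which is an assumption of the sketch rather than a claim of the text.

```python
import math


def uncertainty(probs, base=10):
    """U = sum over k of p_k * log(1/p_k); terms with p_k = 0 contribute nothing."""
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)


# Minimum: one place is certain.
u_certain = uncertainty([1.0, 0.0, 0.0, 0.0])

# Maximum: all four places equally likely, which gives log(4).
u_uniform = uncertainty([0.25] * 4)

# Additivity for two independent sources of uncertainty.
subject = [0.7, 0.3]   # science or literature (hypothetical values)
author = [0.6, 0.4]    # Smith or Psmith (hypothetical values)
joint = [s * a for s in subject for a in author]
u_joint = uncertainty(joint)
u_sum = uncertainty(subject) + uncertainty(author)

print(u_certain, u_uniform, u_joint, u_sum)
```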

The uncertainty U takes its maximum value U_max when p_k = 1/n for all values of k. From the formula just given it follows that U_max = log n. We define an information measure I by the formula I = U_max - U. This definition makes a loss of uncertainty equivalent to a gain of information and equates to zero the information present in a maximally noncommittal probability distribution.

The probability distributions that convey the most information are those in which one of the probabilities, say p_j, is equal to 1 and the rest are equal to zero. This probability distribution corresponds to the statement, "the book is definitely in the jth place." The corresponding value of U is 0 and the corresponding value of I is I_max = U_max = log n. Suppose that the number of possible places n = 10^k where k is an integer. Then, to specify a particular place j we need a k-digit number. According to our formula, the information needed to specify a particular place is log(10^k) = k, which is just the number of decimal digits needed to convey the information. (If instead of using ordinary logarithms in the definition of I, we had used logarithms with base 2, we would have needed a string of log₂ n binary digits, or bits, to convey the same information.)
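
The link between information and digit counts can be tried out numerically. The sketch below defines I as the maximum uncertainty minus the uncertainty, as in the text, and evaluates it for a bookcase with n = 10³ places; the function names, the choice of n, and the index of the "certain" place are illustrative.

```python
import math


def uncertainty(probs, base=10):
    """U = sum over k of p_k * log(1/p_k), skipping zero-probability states."""
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)


def information(probs, base=10):
    """I = U_max - U, where U_max = log(n) for n possible states."""
    return math.log(len(probs), base) - uncertainty(probs, base)


n = 10**3
certain = [0.0] * n
certain[41] = 1.0        # "the book is definitely in this place" (arbitrary index)
uniform = [1.0 / n] * n  # maximally noncommittal distribution

digits = information(certain)        # close to 3 decimal digits
bits = information(certain, base=2)  # close to log2(1000), about 10 bits
nothing = information(uniform)       # close to 0

print(digits, bits, nothing)
```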

This concrete interpretation of information carries over to the general case when the probabilities p_k are not all equal. A basic result in information theory, first obtained by Claude Shannon, states that in general I represents the average number of digits needed to convey the information that a system is in a particular state, when this information has been encoded as efficiently as possible. The last proviso is important. A long book need not contain more information than a short one if the ratio of text to meaning is not the same for both.

From the formulae given above, we can calculate U and I, given a complete set of probabilities p_k. But how do we go about estimating these probabilities? Particularly for systems with a very large number of possible states, this would seem to present a serious practical problem. But the concept of information, edged with Occam's razor, helps us to cut through the difficulty. Suppose, to take the simplest case, that we knew nothing about the probability distribution except the number of possible states. Since we have no information, Occam's razor demands that none should be present in our theoretical description. We must therefore assign values to the probabilities p_k so as to minimize the value of I and maximize the value of U, subject only to the condition that the total number of possible states be equal to n. This is a well-defined mathematical problem whose solution, as we already know, is the maximally noncommittal probability distribution with p_k = 1/n for all values of k.
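
The claim that equal probabilities maximize U can be probed numerically. The sketch below is not a proof: it draws many random probability distributions over n states and confirms that none has an uncertainty above log n. The sampling scheme (normalizing uniform random weights), the fixed seed, and the choice n = 6 are arbitrary assumptions of the sketch.

```python
import math
import random


def uncertainty(probs):
    """U = sum over k of p_k * ln(1/p_k), using natural logarithms."""
    return sum(p * math.log(1.0 / p) for p in probs if p > 0)


def random_distribution(n, rng):
    """A random probability distribution over n states (normalized weights)."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 6
u_max = math.log(n)      # uncertainty of the distribution with p_k = 1/n

trials = [uncertainty(random_distribution(n, rng)) for _ in range(10_000)]
largest = max(trials)

print(f"largest sampled U = {largest:.6f}, log n = {u_max:.6f}")
```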

The same procedure works when we do have some information about the system we wish to describe. If this information can be expressed symbolically, we can find the probability distribution that contains it and no more. The mathematical problem is well-defined, and its solution yields not only the probabilities themselves but also the numerical values of U and I; it tells us, literally, how much our information is worth. We shall encounter an important practical example of this procedure later.
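As an illustration of this procedure, the sketch below finds the maximally noncommittal distribution for a six-sided die when the only information available is its average score. The die, the average of 4.5, and the function names are hypothetical choices made for the example, and the uncertainty is again assumed to have the form U = -Σ p_k log p_k (natural logarithms here). The maximizing distribution is known to be exponential in the constrained quantity, so only one multiplier has to be found.

```python
import math

def maxent_given_mean(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-uncertainty distribution over `values` with a prescribed mean.

    The maximizing distribution has the form p_k proportional to exp(lam * x_k).
    The mean increases with lam, so lam can be found by bisection.
    """
    def dist(lam):
        exps = [lam * x for x in values]
        m = max(exps)                       # subtract the max for numerical stability
        weights = [math.exp(e - m) for e in exps]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * x for p, x in zip(dist(lam), values))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

def uncertainty(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

faces = [1, 2, 3, 4, 5, 6]
p_fair = maxent_given_mean(faces, 3.5)     # this average carries no information: uniform
p_biased = maxent_given_mean(faces, 4.5)   # hypothetical measured average

u_max = math.log(len(faces))
print([round(p, 4) for p in p_biased])
print("information carried by the average:", u_max - uncertainty(p_biased))
```

The printed information is the amount, in natural-log units, by which knowing the average reduces the uncertainty below its maximum value log 6.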

We are now almost ready to return to our discussion of the origins of macroscopic irreversibility. One further point, already touched upon, needs to be made more explicit: the distinction between macro-information and micro-information. If, in the bookcase analogy, the numbered places, ordered according to subject and author, represent microstates, then the set of places assigned to a particular subject represents a macrostate. The macrostates are assumed to be non-overlapping and exhaustive. The probability associated with a given macrostate is the sum of the probabilities assigned to the microstates that compose that macrostate. From these macro-probabilities, we can construct the macro-uncertainty Ū and the macro-information Ī = Ū_max - Ū.* We now define the micro-uncertainty U' = U - Ū as the amount by which the total uncertainty exceeds the macro-uncertainty. Similarly, we define the micro-information I' = I - Ī. To see whether these definitions are reasonable, suppose that every macrostate contains an equal number, say r, of equally probable microstates. We should then expect the micro-uncertainty to have the value appropriate to a set of r equally probable states, namely, log r - and it does. The mathematically inclined reader can check that U - Ū really does equal log r in this case. As I mentioned earlier, it is this consistency property that serves to distinguish our formula for U from other plausible formulae.

[Note: Layzer wrote the macro-uncertainty and macro-information with a macron (a bar) over the letter. They appear here as Ū and Ī; plain U and I denote the total uncertainty and total information.]
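The check left to the mathematically inclined reader can also be done numerically. The sketch below assumes the form U = -Σ p_k log p_k for the uncertainty; the three macro-probabilities and the value r = 4 are arbitrary illustrative choices.

```python
import math

def uncertainty(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def macro_probs(micro, macrostates):
    """Sum the micro-probabilities over each macrostate (a list of index lists)."""
    return [sum(micro[i] for i in group) for group in macrostates]

# Three macrostates ("subjects"), each made of r = 4 equally probable places.
r = 4
macro_weights = [0.5, 0.3, 0.2]     # illustrative macro-probabilities
micro = [w / r for w in macro_weights for _ in range(r)]
macrostates = [list(range(j * r, (j + 1) * r)) for j in range(len(macro_weights))]

u_total = uncertainty(micro)
u_macro = uncertainty(macro_probs(micro, macrostates))
print(u_total - u_macro, math.log(r))   # the two numbers agree up to rounding
```

The difference between total and macro-uncertainty equals log r whatever macro-probabilities are chosen, which is the consistency property referred to in the text.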

Our excursion into the realm of pure information theory is now ended, for in defining the concepts of macro-uncertainty and macro-information we have, without noticing it, made contact with classical thermodynamics. We shall see that the macro-uncertainty inherent in a statistical description of a physical system is identical with the entropy.

This identification serves to vindicate, as well as clarify, the idea that entropy is a measure of disorder, for, as we have already seen, uncertainty and disorder are closely related concepts.

Let us examine the reasons for identifying macro-uncertainty with entropy. Consider a closed system composed of interacting particles, and let its possible microscopic states be numbered from 1 to n. If we assign probabilities to these states, we can calculate the corresponding information measure I and ask how it varies with time. A rigorous mathematical argument yields the intuitively satisfying answer that, so long as the system is not interfered with, I remains constant. The system does not of course remain in its initial microscopic state - for air at room temperature and pressure, the average lifetime of a microscopic state is only about 10⁻¹¹ second - and even its macroscopic state may change greatly.

Classical dynamical laws conserve the information. Quantum erasure destroys the micro-information of molecular correlations.

Nevertheless, these changes, governed by time-symmetric microscopic laws, neither generate nor destroy information. Our thought-experiment on the diffusion of perfume-molecules suggested, however, that the quality of the information changes during this process. In the initial state, with the perfume still in the bottle, the information was wholly macroscopic. As the perfume-molecules diffused through the room, macro-information was converted into micro-information. In the final state all traces of large-scale order had disappeared; the information was then purely microscopic in quality. To prove rigorously that such a flow of information actually occurs in macroscopic systems under appropriate initial conditions presents an exceedingly difficult problem. The first proof was given only in 1955, by the Dutch physicist van Hove, who considered a specific class of many-particle systems and used a specific prescription for separating macro- and micro-information. Closely related theorems, applying to other kinds of physical systems, have subsequently been established by other authors. All these theorems assert that the macro-information decreases monotonically with time (and hence the macro-uncertainty increases monotonically with time), provided that micro-information is absent in the initial state. As in van Hove's theorem, the precise definition of micro-information depends on the physical system to which the theorem applies. For example, for an ordinary dilute gas, micro-information means information pertaining to correlations between the positions and/or velocities of two or more molecules.

When the macro-information has dropped to zero and the macro-uncertainty has reached its greatest possible value, we can use the mathematical formulation of Occam's razor described earlier to obtain a complete statistical description of the system. Consider, for example, a small region inside a gas that is maintained at a fixed density and temperature.

Because molecules are continually entering and leaving the region, their total number and total energy vary with time, but the average values of these quantities are constants, determined respectively by the density and temperature of the gas. The mathematical expression of this statement imposes two conditions on the underlying probability distribution. Maximizing the macro-uncertainty subject to these conditions yields a complete statistical description of the gas. This description turns out to be identical in every detail with the classical description, based on the two laws of thermodynamics - provided that we identify macro-uncertainty with entropy. Thus the statistical theory includes the whole of classical thermodynamics. It is more general than classical thermodynamics, however, because (a) it does not postulate the law of non-decreasing entropy in closed systems but derives it from more primitive assumptions, and (b) it applies to any system that can be described statistically - even one that consists of a single particle.

Although macro-uncertainty and entropy have identical mathematical properties in all contexts where both are defined, many physicists refuse to accept their physical identity. The reason is that we tend to attribute a high degree of objectivity to concepts like heat, work, and entropy, while information and uncertainty seem tinged with subjectivity. Though less familiar to nonscientists than the concepts of work and heat, the concept of entropy is equally essential and equally concrete to the engineer who designs engines and refrigerators. (That entropy plays a part in everyday experience is illustrated by the following example: when the temperature in a room drops, the energy of the air does not change, but the entropy decreases.) Most of us do not attribute the same degree of objectivity to notions like uncertainty and information. By checking the mathematics, we may convince ourselves that a flow of macro-uncertainty into a physical system entails a corresponding flow of heat, but the conviction remains that uncertainty is easier to dispel than heat.

Viewed naively, quantities like mass and energy seem to be objective attributes of physical systems, while quantities like uncertainty and information obviously depend on the observer as well. But this distinction is too simple. Mass and energy are quantities that figure in a mathematical description of nature. Because they are capable of being measured by impersonal procedures, they may properly be described as "objective", and to remind ourselves of this important property we find it convenient to think of them as residing in the object, whereas in fact they belong partly to the object and partly to the description. The same thing is true of the statistical quantities that figure in physical theories. They too can be measured by impersonal procedures and they too dwell simultaneously in the object and the description. But, owing to the subjective connotations of terms like "probability", "uncertainty", and "information", we find it harder to externalize these concepts.

Yet the statistical interpretation of entropy and the second law of thermodynamics does present a real difficulty, which is related to the level of the statistical description. Let us consider a simple and trite example of a macroscopic process described statistically, the throw of a die. In the absence of a detailed physical description of the die and of the circumstances of the throw, the outcome is uncertain, and we assign probabilities p₁, p₂,..., p₆ to the possible outcomes. Each of these probabilities is an "inherent" property of the die. Its value can be calculated theoretically from sufficiently detailed data on the die's physical properties, and it can also be measured experimentally to any desired degree of precision. (To measure the probabilities p_k we could measure the frequencies of occurrence of the various outcomes in a long series of throws or in a single throw of a large number of identical dice. These frequencies fluctuate about the corresponding probabilities, but the amplitude of the fluctuations decreases in a predictable manner as the length of the series increases. Precisely the same procedure is used to measure nonstatistical quantities like mass.)
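The way measured frequencies settle toward the underlying probabilities can be seen in a small simulation. This is a sketch under the assumption of an ideal fair die, with a pseudo-random generator standing in for the throws; the sample sizes are arbitrary. For independent throws the typical fluctuation is expected to shrink roughly as the inverse square root of the number of throws.

```python
import random

def face_frequencies(n_throws, rng):
    """Relative frequency of each face in n_throws simulated throws of a fair die."""
    counts = [0] * 6
    for _ in range(n_throws):
        counts[rng.randrange(6)] += 1
    return [c / n_throws for c in counts]

rng = random.Random(0)   # fixed seed so that the run is repeatable
for n in (100, 10_000, 1_000_000):
    freqs = face_frequencies(n, rng)
    worst = max(abs(f - 1 / 6) for f in freqs)
    print(n, round(worst, 5))   # largest deviation from 1/6 among the six faces
```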

We can describe the process of radioactive decay in similar statistical terms. The probability that a single radium atom will decay during a small time-interval Δt is Δt/T, where the constant T is the same for all radium atoms. The value of T can be predicted theoretically (by means of quantal calculations based on a suitable model of the radium nucleus) and measured experimentally (through observations of the decay of a sample containing a large number of radium atoms). The quantity 1/T, which represents the probability per unit time for the decay of a single radium atom, is thus analogous to the probabilities p_k in the previous example.
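The rule "probability Δt/T of decay in each small interval Δt" can be turned directly into a simulation. The sketch below uses arbitrary units with T = 1 and a hypothetical sample of 20,000 atoms; none of the numbers refers to real radium data. It then recovers T from the surviving fraction, assuming the exponential survival law exp(-t/T) that the rule implies in the limit of small Δt.

```python
import math
import random

def surviving_fraction(n_atoms, T, dt, t_end, rng):
    """Step a sample forward in time; each surviving atom decays
    with probability dt/T during every step of length dt."""
    alive = n_atoms
    for _step in range(round(t_end / dt)):
        decays = sum(1 for _atom in range(alive) if rng.random() < dt / T)
        alive -= decays
    return alive / n_atoms

rng = random.Random(0)
T_true, dt, t_end = 1.0, 0.01, 1.0          # arbitrary units
fraction = surviving_fraction(20_000, T_true, dt, t_end, rng)
T_estimated = -t_end / math.log(fraction)   # invert fraction = exp(-t_end / T)
print(fraction, T_estimated)                # fraction near exp(-1), estimate near T_true
```

No individual decay time is predicted anywhere in the simulation; only the constant T is recovered, and only from the behavior of the sample as a whole.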

But there is a profound difference between these two examples. From a sufficiently accurate description of the die's initial state, we could in principle predict the outcome of any given throw. Even if we could not predict it with certainty, we would greatly reduce the uncertainty associated with complete ignorance of the initial conditions. By contrast, the decay of an individual radium atom is thought by physicists to be inherently unpredictable. According to the current laws governing atomic and subatomic phenomena, a complete description of a radium atom would enable us to calculate T but not to predict when any given atom would decay. Thus the probability of radioactive decay is irreducible while that of throwing double sixes with a pair of dice is not. Most physicists believe that the uncertainty inherent in all quantal descriptions is irreducible, in the sense that there does not exist a more fundamental description that reduces the quantal uncertainty. The uncertainty inherent in a statistical description of a macroscopic system, on the other hand, can be reduced.

We have now penetrated to the heart of the contradiction between the physicist's two views of the world: the reversible microscopic view and the irreversible macroscopic view. We have seen that macroscopic irreversibility as expressed by the second law of thermodynamics hinges on a statistical property of naturally occurring macroscopic systems: the initial absence of micro-information. It is clearly impossible to derive this property from a description of the world that does not distinguish between "initial" and "final" and in which micro-uncertainty is reducible to the basic quantal uncertainty inherent in every microscopic description.

Our discussion has shown that the contradiction between the macroscopic and microscopic pictures does not stem from a conflict between laws but from a conflict between constraints. We have identified the constraint that lies at the core of macroscopic irreversibility. The constraint underlying the reversible microscopic picture of the world can be stated formally in these terms: "In principle there exists, at any given instant of time, a complete microscopic description of the universe." (By "a complete microscopic description" I mean, of course, a quantal description which, as we have seen, contains an irreducible - but also unvarying - element of uncertainty.) This statement, which has some of the aura of a self-evident truth, seems never to have been seriously questioned by physicists. I shall, however, propose to abandon it in favor of a postulate that, I shall argue, is both simpler and more plausible.

What considerations guide the choice of constraints for the universe as a whole? We saw earlier that the constraints appropriate to astronomical systems are characterized by regularity as well as diversity. The regularities are statistical in character, and their explanation seems to require a sequence of increasingly general cosmogonic theories. The diversity permitted by these statistical regularities is reflected in the diversity of the phenomena. However, we may expect the constraints that apply to the universe as a whole to differ from those appropriate to finite physical systems.

In the first place, if the cosmic constraints have regularities, these must be in some sense self-explanatory, for we cannot invoke a more comprehensive theory to explain them. The chain of cosmogonic problems generated by attempts to explain the statistical regularities of astronomical systems must terminate in - or start with - the constraints that characterize the universe as a whole.

In the second place, the universe, unlike any finite system, is the only observable member of its class. Of course it may be instructive to work out the consequences of different cosmic constraints, just as it may be instructive to work out the consequences of different physical laws, but the ultimate objective is the same in both cases: to find the simplest set (of constraints or laws) on which to base a theoretical description that agrees with the available observational and experimental evidence.

Finally, cosmic constraints are distinguished from those we impose on finite systems by the fact that the universe affords the framework, as well as the arena, for the laws of physics. This fact suggests that there may be a close connection, a kind of harmony, between the laws of physics and the cosmic constraints.

Let us pursue this idea. A fundamental property common to all currently held laws of physics is the negative one of not distinguishing between different positions and different directions in space. In mathematical language, the laws of physics are invariant under translations (changes of position) and rotations (changes of orientation) of the frame of reference. This symmetry principle does not of course prevent the phenomena themselves from singling out preferred positions or directions in space. Thus terrestrial phenomena clearly single out the vertical direction, but this preference reflects a lack of symmetry in the distribution of mass and not in the form of Newton's law of gravitation.

The observed distribution of matter and motion in the universe is nonuniform and anisotropic on scales smaller than a few billion light years. At substantially larger scales, however, statistical uniformity and isotropy seem to prevail. This fundamental property of the astronomical universe was first deduced by Edwin P. Hubble from photographic observations made during the 1920's and early 1930's with the 100-inch telescope on Mt. Wilson. Subsequent observational studies based on data collected with still more powerful optical telescopes have supported Hubble's conclusion, as have also data collected by radio astronomers on the distribution of extragalactic radio sources, and observations of the cosmic microwave background. Thus the available observational evidence strongly and consistently supports the assumption that, apart from local irregularities, the universe is spatially homogeneous and isotropic. ("Local" in this context means "having a scale less than a few million light years.") This assumption is usually called the cosmological principle.

It is aesthetically pleasing to find that the spatial symmetries inherent in our current physical laws also underlie the large-scale structure of the astronomical universe. But the cosmological principle also has certain practical consequences, two of which are especially relevant to the present discussion.

1. Newtonian physics is based on the concepts of absolute space and absolute time. Because absolute space provided, for Newton, an invisible and intangible but ever-present backdrop, all motion, uniform as well as accelerated, is absolute in his theory. Einstein's two great theories exorcized absolute space and time from the physical arena. Special relativity welded space and time into a four-dimensional continuum in which only space-time intervals, and not space-intervals and time-intervals separately, have an invariant meaning; but it maintained the Newtonian distinction between accelerated and nonaccelerated frames of reference. By enlarging the scope of space-time geometry to include gravitational phenomena, general relativity did away with this remaining distinction. The cosmological principle in a sense reinstates absolute space and absolute time - not as a basis but as a framework. Unlike Newtonian absolute space, the preferred frame of reference introduced by the cosmological principle is defined by the physical contents of the universe. It is the frame in which the distribution of matter appears to be uniform and the distribution of motion isotropic, apart from local irregularities.

2. Einstein's theory of gravitation (general relativity) predicts that a universe satisfying the cosmological principle cannot be static. It must either expand from a singular state of infinite density in the finite past or contract toward such a state in the finite future. The mathematical theory of an expanding (or contracting) universe satisfying the cosmological principle was published in 1922 by the Russian mathematician Friedmann. Seven years later, Hubble, unaware of Friedmann's theoretical work, advanced the hypothesis of a uniform, isotropic cosmic expansion to account for his and other astronomers' measurements of systematically red-shifted spectral lines in the spectra of distant galaxies. Subsequent observational work has strongly confirmed Hubble's hypothesis of a uniformly expanding universe.

As formulated above, the cosmological principle says nothing about "local irregularities". In effect, it applies to a fictitious uniform and uniformly expanding medium, the substratum, in which the density equals the average density of matter in the actual universe, and the velocity at any given point equals the average velocity in the neighborhood of that point in the actual universe. Now, the mean density field and mean velocity field are merely the two simplest ingredients in a statistical description of the universe. The question immediately arises, whether any other statistical property defines a preferred position or direction in space. Observational evidence bearing on this question is necessarily incomplete, but so far no indication of a preferred position or direction has been found. Allowing ourselves to be guided by Occam's razor, we accordingly postulate that no statistical property of the distribution of matter and motion in the universe serves to define a preferred position or direction in space. I shall refer to this postulate of statistical uniformity and isotropy as the strong cosmological principle.

The strong cosmological principle attributes to the statistical constraints that help to define the real universe the same spatial symmetry properties that characterize the underlying laws. What about the nonstatistical constraints? To answer this question, consider an idealized one-dimensional "universe" consisting of point masses distributed in a statistically uniform way along a straight line extending to infinity in both directions. ("Statistical uniformity" is defined as follows: the probability of finding a point mass in any given line-segment is (a) independent of the distribution of point masses outside the segment under consideration, (b) the same for all segments having the same length, and (c) proportional to the length of the segment, if the segment is sufficiently short.) This statistical distribution, called the Poisson distribution, is completely defined by a single quantity, the average number of point masses per unit length. Moreover, it contains no information other than that needed to specify that average number; its statistical description, given above, is maximally noncommittal in the technical sense discussed earlier.

To describe our one-dimensional universe in microscopic detail, we divide the line into segments of equal length and record the number of mass points contained in each segment. The description then takes the form of an infinite sequence of occupation numbers without beginning or end - for example: ...00010110010... .

Given such a sequence, we can measure its defining parameter, the mean occupation number, with any desired accuracy by averaging over successively larger samples. In an analogous way we can estimate any other average property of the distribution.

Next, suppose we are given two sequences of occupation numbers and asked to decide whether they describe the same one-dimensional "universe". From what has just been said, we can discover, through measurement, whether the statistical properties of the two descriptions tally. Suppose we conclude that both sequences describe a Poisson distribution characterized by the same average density of mass points. Do the two distributions differ in some nonstatistical way? This question would be easy enough to answer if our one-dimensional "universe" were finite, or even if it extended to infinity in one direction only. Each sequence of occupation numbers would then have at least one endpoint and we could decide immediately, by matching them, whether or not the two sequences were identical in detail. But if the sequences are infinite in both directions, there is no unique way of matching them. In fact, given any finite subsequence of occupation numbers in the first sequence, we can find arbitrarily many matching subsequences in the second sequence. Since this statement is true for any finite subsequence, the two descriptions are operationally indistinguishable. In short, a sequence of occupation numbers describing a Poisson distribution on an infinite line contains no nonstatistical information; it is specified completely by a single parameter, the mean occupation number.
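The matching argument can be illustrated with finite stretches of two such sequences. The sketch below makes several assumptions that are not in the text: the segments are taken short enough that each holds at most one mass point (so the occupation numbers are 0 or 1), the mean occupation number is set to an arbitrary 0.4, and stretches of 200,000 segments stand in for the infinite line.

```python
import random

def occupation_sequence(length, mean_occupation, rng):
    """Finite stretch of 0/1 occupation numbers; each segment is occupied
    independently with probability mean_occupation."""
    return "".join("1" if rng.random() < mean_occupation else "0"
                   for _ in range(length))

def count_matches(pattern, sequence):
    """Count (possibly overlapping) occurrences of pattern in sequence."""
    count, start = 0, sequence.find(pattern)
    while start != -1:
        count += 1
        start = sequence.find(pattern, start + 1)
    return count

a = occupation_sequence(200_000, 0.4, random.Random(1))
b = occupation_sequence(200_000, 0.4, random.Random(2))

# The statistical parameter can be measured from either sequence.
print(a.count("1") / len(a), b.count("1") / len(b))   # both near 0.4

# Any finite window of the first sequence turns up repeatedly in the second.
window = a[1000:1008]
print(window, count_matches(window, b))
```

Lengthening the stretches increases the number of matches for any fixed window, which is the finite counterpart of the statement that arbitrarily many matching subsequences exist on the infinite line.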

Exactly analogous arguments apply to more realistic models of the universe. If the universe satisfies the strong cosmological principle, if it is infinite and unbounded, and if the scale of local irregularities is finite, then the following two statements hold. (1) All statistical properties can be measured with arbitrary precision. (2) No additional information can be either obtained or specified.

Neither of these properties is shared by any finite subsystem. Consider, for example, an enclosed gas in thermodynamic equilibrium. We can measure its instantaneous temperature, but not with arbitrary precision. For the temperature of a gas is a measure of the average kinetic energy of its molecules, and the number of molecules in a sample determines how accurately such an average can be estimated. On the other hand, we can acquire nonstatistical information about a finite sample of gas - information about the positions and velocities of individual molecules, for example - because the walls of the container define a fixed frame of reference. It is for this reason that the statistical description of any finite subsystem of the universe is reducible to a detailed microscopic description. The situation is exactly reversed for the universe as a whole, if it satisfies the conditions mentioned above. All the statistical information needed to define the instantaneous state of such a universe is present in the actual distribution of matter and motion, while nonstatistical information is not merely unobtainable but absent; the statistical description is complete.

The strong cosmological principle thus reduces to the simple statement that there are no preferred positions or directions in space. A description that conforms to this requirement is necessarily statistical, and such a description is also complete.

Thus the strong cosmological principle flatly contradicts the commonly held assumption that a complete microscopic description of the universe is possible in principle. And it implies the existence of a new kind of irreducible uncertainty, distinct from the quantal uncertainty exemplified by the random character of radioactive decay. This new kind of uncertainty does not prevent the acquisition of microscopic information about finite subsystems of the universe; it shows up only at the cosmic level of description. Returning to our analogy of the Seurat painting, we may say that the information conveyed by the painting resides in its "statistical" properties - the colors, shapes, and patterns perceived by an observer who views the picture at the distance intended by the artist. Different "microscopic" arrangements of color dots could have produced the same "macroscopic" effects, and we may presume that those "microscopic" details that do not contribute to the "macroscopic" effect were not part of the artist's plan but were the results of chance. In the same way, there is an irreducibly random element in all natural phenomena, over and above that introduced by quantal uncertainty.

In discarding the assumption that there exists a complete microscopic description of the universe and adopting in its place the strong cosmological principle, we have taken an important step toward an understanding of the constraint implied by the second law of thermodynamics. We have shown that micro-uncertainty, one key element in this constraint, is an irreducible ingredient of macroscopic systems. We have still to understand, however, why the absence of micro-information and the presence of macro-information characterize the initial states of naturally occurring systems. To answer this question, let us shift our attention to the other key element in the thermodynamic constraint: macro-information. What is its origin?

At first sight it may seem that the second law of thermodynamics answers this question. This law implies that all natural processes generate entropy.

Does this not imply that the quantity of information in the world - or rather, since this quantity is infinite, the information per unit mass - is continually decreasing, and that it must therefore have attained its greatest value in the initial instant of the cosmic expansion? A simple example will show that the answer is no.

Let us consider three model universes, the first filled with a uniform gas, the second with a uniform radiation field (or photon gas), the third with a uniform mixture of gas and radiation. Let us assume further that each universe is in a state of thermodynamic equilibrium at some given instant of time, not the initial instant. We recall that the information associated with the state of thermodynamic equilibrium is zero and that the entropy takes its maximum value consistent with the prescribed values of the temperature and mean density. Let us now see what happens to information and entropy in each universe as it expands away from the state of thermodynamic equilibrium.

In the first universe, the gas remains in thermodynamic equilibrium as it expands (provided that its temperature is not too high), but its temperature decreases. In fact, any given element of the gas behaves exactly as if it were expanding against a movable piston in a perfectly insulated cylinder. Thus in the first universe, micro-information and entropy remain constant during the expansion. The same is true of the radiation-filled universe, every element of which also behaves as if it were confined to an insulated cylinder fitted with a movable piston. However, the rate at which the temperature decreases with decreasing density is not the same in the two universes. In the gas-filled universe the temperature varies as the 2/3 power of the density (for a monatomic gas), while in the radiation-filled universe it varies as the 1/3 power of the density; the corresponding powers for the pressure are 5/3 and 4/3.
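The difference between the two cooling laws can be made concrete with the standard adiabatic relation for an element expanding behind an insulated piston: the pressure scales as the γ power of the density and the temperature as the (γ - 1) power, where γ is the adiabatic index. The sketch below assumes γ = 5/3 for a monatomic ideal gas and γ = 4/3 for radiation, takes "density" to mean the inverse of the volume occupied by a given element, and starts both fluids at the same temperature in arbitrary units.

```python
GAMMA_GAS = 5.0 / 3.0        # adiabatic index of a monatomic ideal gas
GAMMA_RADIATION = 4.0 / 3.0  # adiabatic index of a photon gas

def adiabatic_temperature(t0, density_ratio, gamma):
    """Temperature after adiabatic expansion: T = T0 * (rho / rho0) ** (gamma - 1)."""
    return t0 * density_ratio ** (gamma - 1.0)

t0 = 1.0  # common starting temperature, arbitrary units
for density_ratio in (1.0, 0.1, 0.01, 0.001):
    t_gas = adiabatic_temperature(t0, density_ratio, GAMMA_GAS)
    t_rad = adiabatic_temperature(t0, density_ratio, GAMMA_RADIATION)
    print(density_ratio, round(t_gas, 4), round(t_rad, 4))
```

As the density falls, the gas cools faster than the radiation, so two fluids that start at a common temperature and expand without exchanging energy drift steadily apart in temperature.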

Because the law of temperature decrease is not the same in a universe filled with radiation as in one filled with gas, the behavior of the third universe, in which both gas and radiation are present, is more complicated than that of the other two. If we mix gas at one temperature with radiation at a different temperature in a non-expanding container, the mixture will relax to a state of thermodynamic equilibrium at some intermediate temperature. During this process, which I shall call thermalization, the energy of the mixture stays constant, but its entropy increases. (Thermalization is closely related to the process of molecular diffusion considered earlier.) The rate at which thermalization occurs depends on the composition, density, and temperature of the gas and on the temperature and density of the radiation. Suppose that the thermalization rate is much smaller than the cosmic expansion rate in the third universe. Then the temperature of the gas will decrease nearly in accordance with the law appropriate to a gas-filled universe, while the temperature of the radiation will decrease nearly in accordance with the law appropriate to a radiation-filled universe. Hence a temperature difference will develop between the two constituents. This implies that information has been generated by the expansion. The information associated with a given element of the mixture is just the difference between the actual entropy of that element and the entropy it would have if the expansion were halted and the mixture of gas and radiation were allowed to come to thermodynamic equilibrium. The rate of generation of information is thus exactly equal to the rate at which entropy would need to be generated to maintain the mixture in a state of thermodynamic equilibrium despite the expansion. In the opposite case, when the thermalization rate greatly exceeds the expansion rate, entropy is actually generated at this maximum rate, and no information is generated.

Layzer's explanation of the growth of order.

In the intermediate case, when the rates of thermalization and expansion are comparable, both entropy and information are generated.

This example shows that entropy generation on a cosmic scale does not imply a destruction of information; information and entropy can be generated simultaneously. The reason is that the sum of information and entropy is not constant, as in more familiar contexts, but changes as the universe expands. The example also illustrates another important point, that the rate at which information can be generated depends on the relative rates of entropy-generation and cosmic expansion.

From the above example it may appear that cosmic expansion plays the main role in the production of information, but this is not the case. Had we reversed the direction of time and considered a universe contracting, instead of expanding, from the initial state of thermodynamic equilibrium, we would have reached exactly the same conclusion: information and entropy are both generated at rates depending upon the relative values of the thermalization and expansion rates. Thus in our example the arrows defined by the growth of entropy and the growth of information always point in the same direction, but they need not coincide with the arrow defined by the cosmic expansion. On the other hand, they invariably point away from the state of thermodynamic equilibrium.

In order to apply these considerations to the actual universe, we must investigate whether it is reasonable to suppose that it was ever in a state of thermodynamic equilibrium. So far as local conditions are concerned, we can answer this question by comparing the rate of cosmic expansion with the rates of atomic and subatomic processes that tend to establish thermodynamic equilibrium locally. Both the expansion rate and the thermalization rate increase very rapidly with increasing density, but calculation shows that the thermalization rate increases faster, so that we may expect local thermodynamic equilibrium to have prevailed during the earliest stages of the cosmic expansion.
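A rough scaling argument suggests why this comparison can come out as stated. It is an order-of-magnitude sketch, not the calculation the text refers to. It assumes the expansion rate H follows the Friedmann relation with curvature neglected, and that equilibrium is established by two-body encounters with a hypothetical cross-section σ and typical relative speed v, symbols introduced here only for illustration.

```latex
% Assumed symbols: \rho = mass density, n = number density (n \propto \rho),
% \sigma = interaction cross-section, v = typical relative speed.
\begin{align*}
  H &= \sqrt{\tfrac{8\pi G}{3}\,\rho} \;\propto\; \rho^{1/2}, \\
  \Gamma_{\text{therm}} &\sim n\,\sigma\,v \;\propto\; \rho\,\sigma v, \\
  \frac{\Gamma_{\text{therm}}}{H} &\propto\; \rho^{1/2}\,\sigma v .
\end{align*}
```

Under these assumptions the ratio grows with density unless the product σv falls off faster than the inverse square root of the density, so local equilibrium becomes easier to maintain the further back one goes.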

This conclusion does not exclude the possibility that large-scale variations of such physical quantities as density, temperature, composition, and velocity were present during the earliest stages of the expansion. Some writers have, indeed, argued that such irregularities are needed to account for the present structure of the universe. But these arguments all depend - implicitly or explicitly - on the belief that the second law of thermodynamics requires more information to be present in the initial state of the universe than in subsequent states. We have seen that this belief is false.

Cosmologists have sadly not given Layzer's theory the careful consideration that it deserved. See the history of Layzer's theory of the growth of order.

The Second Law does not in fact exclude the possibility that the present-day universe, with its complex hierarchical structure, has evolved from a structureless initial state - a state of global thermodynamic equilibrium. Because this is the simplest conceivable hypothesis, it deserves careful consideration.

Attempts to base a cosmogonic theory on the hypothesis of a structureless initial state divide into two main groups. Theories belonging to the first group take as their common starting point the assumption that the temperature during the early stages of the cosmic expansion was so high that the energy of the radiation field greatly exceeded that of the matter. The proponents of the hot big bang, as this hypothesis is usually called, interpret the cosmic microwave background - an approximately thermal radiation field with a temperature of 2.7°K, discovered by A. A. Penzias and R. W. Wilson in 1965 - as a remnant of the primordial fireball, degraded in temperature and energy by the cosmic expansion.

The main difficulty with the hypothesis is that high temperatures militate against the formation and growth of density fluctuations. For this reason, most authors who favor a hot initial state also postulate that large-scale irregularities were either present initially or were formed during a stage of the expansion when the cosmic density was so high that current physical laws cannot be applied with confidence.

The alternative hypothesis, which my students and I have been developing, is that of a cold initial state. A detailed consideration of the cosmogonic theory based on this hypothesis would take us too far afield, and, indeed, in its present form the theory has weaknesses and gaps. But a brief outline of it will help to fix ideas. The early history of the cold universe divides into three phases, dominated respectively by elementary-particle interactions, atomic and molecular interactions, and gravitational interactions.

Throughout phase one, the density of the cosmic medium remains uniform, but interactions among elementary particles cause its composition to change. By the time the mean cosmic density has dropped to 10⁸ gm/cc (100 tons per cubic inch), the "primordial" chemical composition of the universe has been established. Matter is then largely in the form of protons (hydrogen nuclei) and electrons, with a smattering of helium nuclei and a sprinkling of some other light nuclei. In spite of its low temperature and high density, the matter remains gaseous during this phase, owing to "zero-point" motions of the nuclei - motions of quantal origin that are present even at zero temperature and that increase with increasing density.

Phase two begins when the mean cosmic density has dropped to a value of about 100 gm/cc. At this point the medium "freezes" because the zero-point motions of the nuclei are no longer large enough to prevent crystallization. Detailed calculations by Dr. Ray Hively indicate that the cosmic medium becomes a metallic hydrogen-helium alloy. Metallic hydrogen has not yet been produced experimentally, but the calculations indicate - though they do not conclusively prove - that the metallic state is stable under the physical conditions that prevail in the cold universe at this stage.

The solid state can exist only over a comparatively narrow range of densities. At a certain point in the cosmic expansion the medium must shatter into fragments. Mr. Hively's calculations indicate that this occurs at a density of about 0.4 gm/cc. I have estimated that the resulting fragments will have masses comparable to that of the earth (about 10²⁷ gm) - though of course they differ radically from terrestrial material in their chemical composition. So ends phase two.

At the beginning of phase three the cosmic medium resembles a uniform, cold gas in which solid fragments of planetary mass play the part of molecules. Such a gas differs from ordinary gases in one crucial respect. The forces exerted by the molecules of an ordinary gas have limited range. Two molecules interact appreciably only if their separation is less than this range, which is normally much smaller than the average distance between the molecules. Thus a molecule in an ordinary gas spends most of its time in free flight between relatively brief binary encounters. The "molecules" of the phase-three cosmic "gas", on the other hand, interact gravitationally. Gravitation is a long-range force. It decreases with distance, but so slowly that the qualitative behavior of a gas composed of gravitating particles differs radically from that of an ordinary gas. Every particle interacts strongly and continuously with a large number of other particles. I have argued that the phase-three gas cannot persist in its initially uniform state (uniform, that is, on scales large compared with the diameter of a typical fragment); density fluctuations develop spontaneously at all scales on which coherent gravitational interactions are possible. The fact that gravitational signals propagate with the speed of light sets an upper limit on the scale of these fluctuations.

As the universe continues to expand, the density fluctuations grow in amplitude. The mean cosmic density decreases steadily, of course, but the contrast between regions of comparatively high and low density grows. Ultimately, discrete self-gravitating systems separate out of the expanding medium. The theory indicates that the smallest self-gravitating systems separate out first and that successively larger systems separate out as clusters of smaller systems. Thus newly formed self-gravitating systems have a complex hierarchical structure - much of which is destroyed during their subsequent evolution. At the present time, second-order clusters of galaxies are in process of separating out, and still larger systems will be formed in the future.

Although this picture of cosmic evolution rests on quantitative and semi-quantitative theoretical arguments, I must stress that it is still highly speculative. Nevertheless, it serves to illustrate a possible sequence of events leading from a structureless initial state to the complex structure of the present-day astronomical universe.

Let us now return to the questions raised at the beginning of this essay. What is the origin of time's arrow, and how are its various aspects related?

Events are shaped partly by physical laws and partly by constraints. The laws do not distinguish strongly, if at all, between the two directions of time. The origin of temporal asymmetry must therefore be sought in the constraints. These exhibit statistical regularities which are most conspicuous at the astronomical level. I have argued that all such regularities must flow from a set of constraints pertaining to the universe as a whole. These cosmic constraints differ from the constraints that define discrete astronomical systems not only because they are, by definition, irreducible and must therefore, in some sense, provide their own justification, but also because they define the framework, as well as the arena, for the laws.

The cosmic constraint proposed in this essay has two parts. The first part, a strengthened version of the Einstein-Hubble cosmological principle, affirms the basic equivalence of all positions and of all directions in space. This postulate, which attributes to the universe itself a symmetry property inherent in all current physical laws, has three important consequences. (1) It leads to a unique splitting of the space-time continuum into absolute cosmic space and absolute cosmic time - but only at the cosmic level of description. (2) It implies a nonstatic universe, expanding from or contracting toward a singular state of infinite density. (3) Finally, it implies that a complete description of the universe contains no microscopic information. The uncertainty associated with this absence of microscopic information is irreducible and is distinct from the quantal uncertainty associated with the complete microscopic description of a finite subsystem.

The second part of the cosmic constraint proposed in this essay refers specifically to the singular state. It asserts that no information whatever (and hence no structure) is present in this state.

Both parts of the constraint are implicit in the assertion: - The universe possesses one information-free state. This perhaps overly terse statement of the constraint serves to emphasize its simplicity and economy.

The singular state marks a natural boundary of the time-coordinate, which we may appropriately designate as the beginning of time. Even if the laws of physics are perfectly time-symmetric, they generate a unique sequence of events issuing from this state, and hence a unique direction in time.

The expansion of the universe away from its initial state generates information and, ultimately, structure. At the same time, it generates entropy. Both movements are irreversible, and jointly they define the arrow of time. Of the two, the growth of information is perhaps the more fundamental because it supplies the order that entropy-generating processes seek to destroy. The growth of entropy reflects a one-way flow toward smaller and smaller scales of the information generated by the cosmic expansion.

Philosophers and - less frequently - physicists have sometimes questioned whether the second law of thermodynamics can properly be extrapolated from discrete closed systems to the universe as a whole. The present discussion indicates that such an extrapolation is entirely justified. For if we consider a sufficiently large region of the universe, the flow of entropy into the region must balance the flow out of it, apart from statistical fluctuations, so that entropy generation in the universe at large is determined entirely by local processes, which in turn must conform to the second law of thermodynamics. By contrast, no extrapolation from the behavior of discrete closed systems can disclose the irreversible growth of information in the universe as a whole. It is for this reason that the fundamental role played by this basic tendency in cosmic evolution has gone unrecognized for so long.

When a discrete astronomical system separates out of the cosmic medium, it contains macroscopic structure (and hence information) but no microscopic information. These conditions are both necessary and sufficient for the second law of thermodynamics to apply in the system. Although the instant when a discrete system comes into being is to some extent a matter of definition, once it has been defined it serves as a point of reference in time. An arrow in time pointing from the initial instant to the present moment points, by definition, in the direction of the future. We have shown that this direction coincides with the direction of entropy generation within the system and with the common direction of entropy generation and information growth in the universe as a whole.

In an ideal isolated system, information present initially at the macroscopic level is degraded with the passage of time into information relating to the system's microscopic state, as in our earlier example of the diffusion of perfume. In any real system, however, the flow of information does not stop at the system's boundaries. Once information has been degraded to the molecular level, it can no longer be confined; it diffuses rapidly - indeed with the speed of light - into outer space. Thus, even if we shield a system perfectly against all forms of energy and mass flow, we could not shield it from the fluctuating gravitational forces exerted by distant matter. Small as they are, these forces dissipate almost instantaneously any microscopic information that may be present in a macroscopic sample of gas. This means that in practice the constraint "macro-information initially present, micro-information initially absent" applies not only at the initial instant, but also at all subsequent times. Moreover, it also applies to subsystems of the original system, provided that we consistently classify all the information that went into shaping a subsystem's initial state as macroscopic.

At the beginning of this essay I contrasted the generation of entropy in closed systems with the growth of information in open systems - both living and nonliving - endowed with memory. The existence of such systems, capable of preserving and extending a record of their past states in the face of the universal tendency toward the degradation and dissipation of information, is ultimately a consequence of the equally universal tendency toward the growth of information on the cosmic scale. Moreover, the fact that such records, when they exist, define a common arrow whose sense agrees with that defined by entropy-generating processes is explained by the same considerations that account for the common direction of entropy-generating processes in different closed systems: - The instant at which a discrete system is said to come into being serves to define a preferred direction in time, which coincides with that determined by the initial cosmic expansion. On the other hand, the specific mechanisms of information growth in open systems vary widely. In astronomical systems, gravitational contraction, accompanied by radiative energy loss, often plays a key role, especially in early evolutionary phases. Some systems, the moon for example, gain information by accretion. Living systems actively extract information from their environment and use it to preserve and extend exceedingly complex forms of internal organization. Finally, in any open system the ever-present entropy-generating processes can at any moment gain the upper hand and put an end to the growth of information. These considerations make it seem unlikely that there can exist a general principle, comparable in breadth to the second law of thermodynamics, governing the growth of information in open systems. 
The present discussion does show, however, that (a) the growth of information in discrete systems is not, as some authors have surmised, a consequence of entropy generation, but a directly competitive process, related - though indirectly - to the growth of information in the universe at large; and (b) the temporal direction defined by information growth in discrete systems coincides with that defined by entropy generation.

A century and a half ago, the great French mathematician Pierre-Simon Laplace expressed what I have called the physicist's microscopic view of the world in terms that have since become classic: - *

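Une intelligence qui pour un instant donné connaîtrait toutes les forces dont la nature est animée et la situation respective des êtres qui la composent, si d'ailleurs elle était assez vaste pour soumettre ces données à l'analyse, embrasserait dans la même formule les mouvements des plus grands corps de l'univers et ceux du plus léger atome: rien ne serait incertain pour elle et l'avenir comme le passé serait présent à ses yeux.
[An intelligence which, for a given instant, knew all the forces by which nature is animated and the respective situation of the beings that compose it, if moreover it were vast enough to submit these data to analysis, would embrace in the same formula the movements of the greatest bodies of the universe and those of the lightest atom: nothing would be uncertain for it, and the future, like the past, would be present to its eyes.]
*Essai Philosophique sur les Probabilités, Gauthier-Villars et Cie, Editeurs, Paris, 1921.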

The last clause, in particular, underscores that in this view of the world - still generally accepted by physicists - there is nothing that corresponds to the flow of time; to Laplace's imaginary being all temporal states would be present simultaneously. The world picture that we have been led to in the present essay differs radically from Laplace's. Because the total quantity of information in the universe increases monotonically with time from an initial value of zero, the present state always contains information that was not and could not have been present in any past state. At the same time, it contains less information than any future state. Thus the future is never wholly predictable. And, although cosmic evolution is governed by immutable physical laws, the present instant always contains an element of genuine novelty. Because life itself involves a continual growth of information, and because consciousness enables us to experience this growth directly, the intuitive perception of the world as unfolding in time captures one of its most deep-seated yet elusive properties.

David Layzer June 24, 1971
