How to Use Entropy

Entropy is an important property in science but it can be somewhat challenging. It is commonly understood as “disorder”, which is fine as an analogy but there are better ways to think about it. As with many concepts, especially complex ones, better understanding comes with repeated use and application. Here we look at how to use and quantify entropy in applications with steam and chemical reactions.

Entropy is rather intimidating. It’s important to the sciences of physics and chemistry but it’s also highly abstract. There are, no doubt, more than a couple of students who graduate with college degrees in the physical sciences or in engineering who don’t have much of an understanding of what it is or what to do with it. We know it’s there and that it’s a thing but we’re glad not to have to think about it any more after we’ve crammed for that final exam in thermodynamics. I think one reason for that is because entropy isn’t something that we often use. And using things is how we come to understand them, or at least get used to them.

Ludwig Wittgenstein argued in his later philosophy that the way we learn words is not with definitions or representations but by using them, over and over again. We start to learn “language games” as we play them, whether as babies or as graduate students. I was telling my daughters the other day that we never really learn all the words in a language. There are lots of words we’ll never learn and that, if we happen to hear them, mean nothing to us. To use a metaphor from Wittgenstein again, when we hear these words they’re like wheels that turn without anything else turning with them. I think entropy is sometimes like this. We know it’s a thing but nothing else turns with it. I want to plug it into the mechanism. I think we can understand entropy better by using it to solve physical problems, to see how it interacts (and “turns”) with things like heat, temperature, pressure, and chemical reactions. My theory is that using entropy in this way will help us get used to it and be more comfortable with it. So that maybe it’s a little less intimidating. That’s the object of this episode.

I’ll proceed in three parts.

1. Define what entropy is

2. Apply it to problems using steam

3. Apply it to problems with chemical reactions

What is Entropy?

I’ll start with a technical definition that might be a little jarring but I promise I’ll explain it.

Entropy is a measure of the number of accessible microstates in a system that are macroscopically indistinguishable. The equation for it is:

S = k ln W

Here S is entropy, k is the Boltzmann constant, and W is the number of accessible microstates in a system that are macroscopically indistinguishable.

Most people, if they’ve heard of entropy at all, haven’t heard it described in this way, which is understandable because it’s not especially intuitive. Entropy is often described informally as “disorder”. Like how your bedroom will get progressively messier if you don’t actively keep it clean. That’s probably fine as an analogy but it is only an analogy. I prefer to dispense with the idea of disorder altogether as it relates to entropy. I think it’s generally more confusing than helpful.

But the technical, quantifiable definition of entropy is a measure of the number of accessible microstates in a system that are macroscopically indistinguishable.

S = k ln W

Entropy S has units of energy divided by temperature; I’ll use units of J/K. The Boltzmann constant k is 1.38 x 10^-23 J/K. The Boltzmann constant has the same units as entropy so those cancel, leaving W as just a number with no dimensions.

W is the number of accessible microstates in a system that are macroscopically indistinguishable. So we need to talk about macrostates and microstates. An example of a macrostate is the temperature and pressure of a system. The macrostate is something we can measure with our instruments: temperature with a thermometer and pressure with a pressure gauge. But at the microscopic or molecular level the system is composed of trillions of molecules and it’s the motion of these molecules that produce what we see as temperature and pressure at a macroscopic level. The thermal energy of the system is distributed between its trillions of molecules and every possible, particular distribution of thermal energy between each of these molecules is an individual microstate. The number of ways that thermal energy of a system can be distributed among its molecules is an unfathomably huge number. But the vast majority of them make absolutely no difference at a macroscopic level. The vast majority of the different possible microstates correspond to the same macrostate and are macroscopically indistinguishable.

To dig a little further into what this looks like at the molecular level, the motion of a molecule can take the form of translation, rotation, and vibration. Actually, in monatomic molecules it only takes the form of translation, which is just its movement from one position to another. Polyatomic molecules can also undergo rotation and vibration, with the number of vibrational patterns increasing as the number of atoms increases and shape of the molecule becomes more complicated. All these possibilities for all the molecules in a system are potential microstates. And there’s a huge number of them. Huge, but also finite. A fundamental postulate of quantum mechanics is that energy is quantized. Energy levels are not continuous but actually come in discrete levels. So there is a finite number of accessible microstates, even if it’s a very huge finite number.

For a system like a piston we can set its entropy by setting its energy (U), volume (V), and number of atoms (N); its U-V-N conditions. If we know these conditions we can predict what the entropy of the system is going to be. The reason for this is that these conditions set the number of accessible microstates. The reason that the number of accessible microstates would correlate with the number of atoms and with energy should be clear enough. Obviously having more atoms in a system will make it possible for that system to be in more states. The molecules these atoms make up can undergo translation, rotation, and vibration and more energy makes more of that motion happen. The effect of volume is a little less obvious but it has to do with the amount of energy separating each energy level. When a set number of molecules expand into a larger volume the energy difference between the energy levels decreases. So there are more energy levels accessible for the same amount of energy. So the number of accessible microstates increases.

The entropies for many different substances have been calculated at various temperatures and pressures. There’s especially an abundance of data for steam, which has had the most practical need for such data in industry. Let’s look at some examples with water at standard pressure and temperature conditions. The entropy of

Solid Water (Ice): 41 J/mol-K

Liquid Water: 69.95 J/mol-K

Gas Water (Steam): 188.84 J/mol-K

One mole of water is 18 grams. So how many microstates does 18 grams of water have in each of these cases?

First, solid water (ice):

S = k ln W

41 J/K = 1.38 x 10^-23 J/K * ln W

Divide 41 J/K by 1.38 x 10^-23 J/K and the units cancel:

ln W = 2.97 x 10^24

That’s already a big number but we’re not done yet.

Raise e (about 2.718) to the power of both sides

W = 10^(1.29 x 10^24) microstates

W = 10^1,290,000,000,000,000,000,000,000 microstates

That is an insanely huge number.

Using the same method, the value for liquid water is:

W = 10^(2.2 x 10^24) microstates

W = 10^2,200,000,000,000,000,000,000,000 microstates

And the value for steam is:

W = 10^(5.94 x 10^24) microstates

W = 10^5,940,000,000,000,000,000,000,000 microstates

In each case the increased thermal energy makes additional microstates accessible. The fact that these are all really big numbers makes it a little difficult to see that, since these are differences in exponents, each number is astronomically larger than the previous one. Liquid water has 10^(9.1 x 10^23) times as many accessible microstates as ice. And steam has 10^(3.74 x 10^24) times as many accessible microstates as liquid water.
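If you want to check this arithmetic yourself, here’s a short Python sketch. It inverts S = k ln W to get the base-10 exponent of W, using the molar entropy values quoted above:

```python
import math

k = 1.38e-23  # Boltzmann constant, J/K

def log10_microstates(s):
    """Invert S = k ln W to get log10(W); W itself is far too large for a float."""
    return s / (k * math.log(10))

# Molar entropies quoted above, J/K for one mole (18 g) of water
for phase, s in [("ice", 41.0), ("liquid water", 69.95), ("steam", 188.84)]:
    print(phase, "W = 10^(", round(log10_microstates(s) / 1e24, 2), "x 10^24 )")
```

The exponents come out to about 1.29 x 10^24, 2.2 x 10^24, and 5.94 x 10^24, matching the values above.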

With these numbers in hand let’s stop a moment to think about the connection between entropy and probability. Let’s say we set the U-V-N conditions for a system of water such that it would be in the gas phase. So we have a container of steam. We saw that 18 grams of steam has 10^(5.94 x 10^24) microstates. The overwhelming majority of these microstates are macroscopically indistinguishable. In most of the microstates the distribution of the velocities of the molecules is Gaussian; they’re not all at identical velocity but they are distributed around a mean along each spatial axis. That being said, there are possible microstates with different distributions. For example, there are 10^(1.29 x 10^24) microstates in which that amount of water would be solid ice. That’s a lot! And they’re still accessible. There’s plenty of energy there to access them. And a single microstate for ice is just as probable as a single microstate for steam. But there are 10^(4.65 x 10^24) times as many microstates for steam than there are for ice. It’s not that any one microstate for steam is more probable than any one microstate for ice. It’s just that there are a lot, lot more microstates for steam. The percentage of microstates that take the form of steam is not 99% or 99.99%. It’s much, much closer than that to 100%. Under the U-V-N conditions that make those steam microstates accessible they will absolutely dominate at equilibrium.

What if we start away from equilibrium? Say we start our container with half ice and half steam by mass. But with the same U-V-N conditions for steam. So it has the same amount of energy. What will happen? The initial conditions won’t last. The ice will melt and boil until the system just flips among the vast number of microstates for steam. If the energy of the system remains constant it will never return to ice. Why? It’s not actually absolutely impossible in principle. But it’s just unimaginably improbable.

That’s what’s going on at the molecular level. Macroscopically entropy is a few levels removed from tangible, measured properties. What we see macroscopically are relations between heat flow, temperature, pressure, and volume. But we can calculate the change in entropy between states using various equations expressed in terms of these macroscopic properties that we can measure with our instruments.

For example, we can calculate the change in entropy of an ideal gas using the following equation:

Δs = cp ln(T2/T1) − R ln(P2/P1)

Here s is entropy, cp is heat capacity at constant pressure, T is temperature, R is the ideal gas constant, and P is pressure. We can see from this equation that, all other things being equal, entropy increases with temperature and decreases with pressure. And this matches what we saw earlier. Recall that if the volume of a system of gas increases with a set quantity of material, the energy difference between the energy levels decreases and there are more energy levels accessible for the same amount of energy. Under those circumstances the pressure decreases as the entropy increases; it’s the same inverse relationship, so entropy decreases as pressure increases.
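As a quick numerical illustration of the pressure term, consider one mole of an ideal gas compressed isothermally, so T2 = T1 and the temperature term vanishes (the isothermal setup is my own example, not from the text):

```python
import math

R = 8.314  # ideal gas constant, J/mol-K

# Isothermal doubling of pressure: T2 = T1, so delta_s = -R ln(P2/P1)
delta_s = -R * math.log(2)
print(round(delta_s, 2))  # about -5.76 J/mol-K: entropy falls as pressure rises
```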

For solids and liquids we can assume that they are incompressible and leave off the pressure term. So the change in entropy for a solid or liquid is given by the equation:

Δs = c ln(T2/T1)

Let’s do an example with liquid water. What’s the change in entropy, and the increase in the number of accessible microstates, that comes from increasing the temperature of liquid water one degree Celsius? Let’s say we’re increasing 1 mole (18 grams) of water from 25 to 26 degrees Celsius. At this temperature the heat capacity of water is 75.3 J/mol-K.

Δs = 75.3 ln(299.15/298.15) = 0.252 J/mol-K

Now that we have the increase in entropy we can find the increase in the number of microstates using the equation

ΔS = k ln(W2/W1)

Setting this equal to 0.252 J/mol-K:

ln(W2/W1) = 0.252 / (1.38 x 10^-23) = 1.83 x 10^22

W2/W1 = 10^(7.9 x 10^21)

The increase is not as high as it was with phase changes, but it’s still a very big change.
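The two steps above, the entropy change and then the microstate ratio, can be sketched in a few lines of Python:

```python
import math

k = 1.38e-23   # Boltzmann constant, J/K
cp = 75.3      # heat capacity of liquid water, J/mol-K

# Heating one mole of water from 25 C (298.15 K) to 26 C (299.15 K)
delta_s = cp * math.log(299.15 / 298.15)   # entropy change, J/mol-K
exponent = delta_s / (k * math.log(10))    # log10 of the microstate ratio W2/W1
print(round(delta_s, 3), exponent)
```

The entropy change is about 0.252 J/mol-K and the microstate ratio is about 10^(7.9 x 10^21).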

We’ll wrap up the definition section here but conclude with some general intuitions we can gather from these equations and calculations:

1. All other things being equal, entropy increases with temperature.

2. All other things being equal, entropy decreases with pressure.

3. Entropy increases with phase changes from solid to liquid to gas.

Keeping these intuitions in mind will help as we move to applications with steam.

Applications with Steam

The first two examples in this section are thermodynamic cycles. A basic thermodynamic cycle has 4 processes.

1. Compression

2. Heat addition

3. Expansion

4. Heat rejection

These processes circle back on each other so that the cycle can be repeated. Think, for example, of pistons in a car engine. Each cycle of the piston is going through each of these processes over and over again, several times per second.

There are many kinds of thermodynamic cycles. The idealized cycle is the Carnot cycle, which gives the upper limit on the efficiency of conversion from heat to work. Otto cycles and Diesel cycles are the cycles used in gasoline and diesel engines. Our steam examples will be from the Rankine cycle. In a Rankine cycle the 4 processes take the following form:

1. Isentropic compression

2. Isobaric heat addition

3. Isentropic expansion

4. Isobaric heat rejection

An isobaric process is one that occurs at constant pressure. An isentropic process is one that occurs at constant entropy.

An example of a Rankine cycle is a steam turbine or steam engine. Liquid water passes through a boiler, the steam passes through a turbine, expanding and turning it, the fluid passes through a condenser, and then is pumped back to the boiler, where the cycle repeats. In such problems the fact that entropy is the same before and after expansion through the turbine reduces the number of unknown variables in our equations.

Let’s look at an example problem. Superheated steam at 6 MPa and 600 degrees Celsius expands through a turbine at a rate of 2 kg/s and drops in pressure to 10 kPa. What’s the power output from the turbine?

We can take advantage of the fact that the entropy of the fluid is the same before and after expansion. We just have to look up the entropy of superheated steam in a steam table. The entropy of steam at 6 MPa and 600 degrees Celsius is:

The entropy of the fluid before and after expansion is the same, but during expansion some of the fluid condenses. This isn’t good for the turbines but it happens nonetheless. Ideally most of the fluid is still vapor; the ratio of the mass that is saturated vapor to the total fluid mass is called “quality”. The entropies of saturated liquid, sf, and of evaporation, sfg, are very different, so we can use algebra to calculate the quality, x2, of the fluid. The total entropy of the expanded fluid is given by the equation:

s2 = sf + x2 sfg

s2 we already know because the entropy of the fluid exiting the turbine is the same as that of the fluid entering the turbine. And we can look up the other values in steam tables.

Solving for quality:

x2 = (s2 − sf) / sfg

Now that we know the quality we can find the work output from the turbine. The equation for the work output of the turbine, where ṁ is the mass flow rate, is:

Ẇ = ṁ (h1 − h2)

h1 and h2 are the enthalpies before and after expansion. If you’re not familiar with enthalpy don’t worry about it (we’re getting into enough for now). It roughly corresponds to the substance’s energy. We can look up the enthalpy of the superheated steam in a steam table.

For the fluid leaving the turbine we need to calculate the enthalpy using the quality, since it’s part liquid, part vapor. We need the enthalpy of saturated liquid, hf, and of evaporation, hfg. The total enthalpy of the fluid leaving the turbine is given by the formula

h2 = hf + x2 hfg

From the steam tables


And now we can plug this in to get the work output of the turbine.

So here’s an example where we used the value of entropy to calculate other observable quantities in a physical system. Since the entropy was the same before and after expansion we could use that fact to calculate the quality of the fluid leaving the turbine, use quality to calculate the enthalpy of the fluid, and use the enthalpy to calculate the work output of the turbine.
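The whole chain can be sketched in Python. A caution: the specific numbers below (s1, h1, and the saturation properties at 10 kPa) are typical published steam-table values that I’m supplying for illustration; they’re assumptions, so check them against your own steam tables.

```python
m_dot = 2.0                  # mass flow rate, kg/s

# Assumed steam-table values (verify against your own tables):
s1 = 7.1677                  # kJ/kg-K, superheated steam at 6 MPa, 600 C
h1 = 3658.4                  # kJ/kg,   superheated steam at 6 MPa, 600 C
sf, sfg = 0.6493, 7.5009     # kJ/kg-K, saturation properties at 10 kPa
hf, hfg = 191.83, 2392.8     # kJ/kg,   saturation properties at 10 kPa

x2 = (s1 - sf) / sfg         # quality after isentropic expansion (s2 = s1)
h2 = hf + x2 * hfg           # enthalpy of the part-liquid, part-vapor exit fluid
power = m_dot * (h1 - h2)    # turbine power output, kW
print(round(x2, 3), round(power))
```

With these table values the quality comes out to about 0.87 and the power output to roughly 2.8 MW.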

A second example.  Superheated steam at 2 MPa and 400 degrees Celsius expands through a turbine to 10 kPa. What’s the maximum possible efficiency from the cycle? Efficiency is work output divided by heat input. We have to input work as well to compress the fluid with the pump so that will subtract from the work output from the turbine. Let’s calculate the work used by the pump first. Pump work is:

Where v is the specific volume of water, 0.001 m^3/kg. Plugging in our pressures in kPa:

wpump = v (P2 − P1) = 0.001 × (2000 − 10) = 1.99 kJ/kg

So there’s our pump work input.

The enthalpy of saturated liquid is:

Adding the pump work input gives the enthalpy of the water entering the boiler:

Now we need heat input. The enthalpy of superheated steam at 2 MPa and 400 degrees Celsius is:

So the heat input required is:

The entropy before and after expansion through the turbine is the same, so it equals the entropy of superheated steam at 2 MPa and 400 degrees Celsius:

As in the last example, we can use this to calculate the quality of the steam with the equation:

s2 = sf + x2 sfg

Looking up these values in a steam table:

Plugging these in we get:


Now we can calculate the enthalpy of the expanded fluid.

And the work output of the turbine.

So we have the work input of the pump, the heat input of the boiler, and the work output of the turbine. The maximum possible efficiency is:

η = (wturbine − wpump) / qin

So efficiency is 32.32%.

Again, we used entropy to get quality, quality to get enthalpy, enthalpy to get work, and work to get efficiency. In this example we didn’t even need the mass flow rate of the system. Everything was on a per-kilogram basis. But that was sufficient to calculate efficiency.
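Here’s the efficiency example as a Python sketch. As before, the steam-table numbers are typical published values assumed for illustration, not taken from the text:

```python
# Assumed steam-table values (verify against your own tables):
h1 = 3247.6                  # kJ/kg,   superheated steam at 2 MPa, 400 C
s1 = 7.1271                  # kJ/kg-K, superheated steam at 2 MPa, 400 C
sf, sfg = 0.6493, 7.5009     # kJ/kg-K, saturation properties at 10 kPa
hf, hfg = 191.83, 2392.8     # kJ/kg,   saturation properties at 10 kPa
v = 0.001                    # specific volume of liquid water, m^3/kg

w_pump = v * (2000 - 10)             # pump work input, kJ/kg
q_in = h1 - (hf + w_pump)            # boiler heat input, kJ/kg
x2 = (s1 - sf) / sfg                 # quality after isentropic expansion
h2 = hf + x2 * hfg                   # exit enthalpy, kJ/kg
w_turbine = h1 - h2                  # turbine work output, kJ/kg
efficiency = (w_turbine - w_pump) / q_in
print(round(100 * efficiency, 2))
```

With these values the efficiency comes out to roughly 32%, consistent with the figure above.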

One last example with steam. The second law of thermodynamics has various forms. One form is that the entropy of the universe can never decrease. It is certainly not the case that entropy can never decrease at all. Entropy decreases all the time within certain systems. In fact, all the remaining examples in this episode will be cases in which entropy decreases within certain systems. But the total entropy of the universe cannot decrease. Any decrease in entropy must have a corresponding increase in entropy somewhere else. It’s easier to see this in terms of an entropy balance.

The entropy change in a system can be negative, but the balance of the change in system entropy, entropy in, entropy out, and entropy of the surroundings will never be negative. We can look at the change of entropy of the universe as the sum of the entropy change of a system and the entropy change of the system’s surroundings:

ΔSuniverse = ΔSsystem + ΔSsurroundings ≥ 0

So let’s look at an example. Take 2 kg of superheated steam at 400 degrees Celsius and 600 kPa and condense it to saturated liquid by pulling heat out of the system. The surroundings have a constant temperature of 25 degrees Celsius. From steam tables the entropies of the superheated steam and the saturated liquid are:

With these values we can calculate the change in entropy inside the system using the following equation:

ΔSsystem = m (s2 − s1)

The entropy decreases inside the system. Nothing wrong with this. Entropy can definitely decrease locally. But what happens in the surroundings? We condensed the steam by pulling heat out of the system and into the surroundings. So there is positive heat flow, Q, out into the surroundings. We can find the change in entropy in the surroundings using the equation:

ΔSsurroundings = Q / T

We know the surroundings have a constant temperature, so we know T. We just need the heat flow Q. We can calculate the heat flow into the surroundings by calculating the heat flow out of the system using the equation

Q = m Δh = m (h1 − h2)

So we need the enthalpies of the superheated steam and the saturated liquid.

And plugging these in

Q = mΔh = (2)(3270.2 − 670.6) = 5199 kJ

Now that we have Q we can find the change in entropy in the surroundings:

ΔSsurroundings = 5199 / 298.15 = 17.4 kJ/K

The entropy of the surroundings increases. And the total entropy change of the universe is:

So even though entropy decreases in the system the total entropy change in the universe is positive.
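The whole entropy balance as a sketch. The enthalpies match the ones used above; the two specific entropies are assumed table values for 600 kPa, so verify them against your own tables:

```python
m = 2.0          # kg of steam
T_surr = 298.15  # K, constant surroundings temperature (25 C)

# Assumed steam-table values at 600 kPa (verify against your own tables):
h1, s1 = 3270.2, 7.7079   # superheated at 400 C: kJ/kg, kJ/kg-K
h2, s2 = 670.6, 1.9312    # saturated liquid:     kJ/kg, kJ/kg-K

dS_system = m * (s2 - s1)        # kJ/K, negative: the steam condenses
Q = m * (h1 - h2)                # kJ of heat rejected to the surroundings
dS_surroundings = Q / T_surr     # kJ/K, positive
dS_universe = dS_system + dS_surroundings
print(round(dS_system, 2), round(dS_surroundings, 2), round(dS_universe, 2))
```

The system loses about 11.6 kJ/K, the surroundings gain about 17.4 kJ/K, and the universe comes out ahead by roughly 5.9 kJ/K.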

I like these examples with steam because they’re very readily calculable. The thermodynamics of steam engines have been extensively studied for over 200 years, with scientists and engineers gathering empirical data. So we have abundant data on entropy values for steam in steam tables. I actually think just flipping through steam tables and looking at the patterns is a good way to get a grasp on the way entropy works. Maybe it’s not something you’d do for light reading on the beach but if you’re ever unable to fall asleep you might give it a try.

With these examples we’ve looked at entropy for a single substance, water, at different temperatures, pressures, and phases, and observed the differences of the value of entropy at these different states. 

To review some general observations:

1. All other things being equal, entropy increases with temperature.

2. All other things being equal, entropy decreases with pressure.

3. Entropy increases with phase changes from solid to liquid to gas.

In the next section we’ll look at entropies for changing substances in chemical reactions.

Applications with Chemical Reactions

The most important equation for the thermodynamics of chemical reactions is the Gibbs free energy equation:

ΔG = ΔH − TΔS


Where H, T, and S are enthalpy, temperature, and entropy. ΔG is the change in Gibbs free energy. Gibbs free energy is a thermodynamic potential; it is minimized when a system reaches chemical equilibrium. For a reaction to be spontaneous the value of ΔG has to be negative, meaning that during the reaction the Gibbs free energy is decreasing and moving closer to equilibrium.

We can see from the Gibbs free energy equation that the value of the change in Gibbs free energy is influenced by both enthalpy and entropy. The change in enthalpy tells us whether a reaction is exothermic (negative ΔH) or endothermic (positive ΔH). Exothermic reactions release heat while endothermic reactions absorb heat. This has to do with the total change in the chemical bond energies of all the reactants against all the products. In exothermic reactions the energy released in forming new chemical bonds is greater than the energy required to break the old ones, and the extra energy is released as heat. We can see from the Gibbs free energy equation that exothermic reactions are more thermodynamically favored. Nevertheless, entropy can override enthalpy.

The minus sign in front of the TΔS term tells us that an increase in entropy, a positive ΔS, makes a reaction more thermodynamically favored. This makes sense with what we know about entropy from the second law of thermodynamics and from statistical mechanics. The effect is proportional to temperature. At low temperatures entropy won’t have much influence and enthalpy will dominate. But at higher temperatures entropy will start to dominate and override enthalpic effects. This makes it possible for endothermic reactions to proceed spontaneously. If the increase in entropy for a chemical reaction is large enough and the temperature is high enough, endothermic reactions can proceed spontaneously, even though the energy required to break the chemical bonds of the reactants is more than the energy released in forming the chemical bonds of the products.

Let’s look at an example. The chemical reaction for the production of water from oxygen and hydrogen is:

H2 + ½ O2 → H2O

We can look up the enthalpies and entropies of the reactants and products in chemical reference literature. What we need are the standard enthalpies of formation and the standard molar entropies of each of the components.

The standard enthalpies of formation of oxygen and hydrogen are both 0 kJ/mol. By definition, all elements in their standard states have a standard enthalpy of formation of zero. The standard enthalpy of formation for water is -241.83 kJ/mol. The total change in enthalpy for this reaction is

ΔH = −241.83 − (0 + 0) = −241.83 kJ/mol

It’s negative which means that the reaction is exothermic and enthalpically favored.

The standard molar entropies for hydrogen, oxygen, and water are, respectively, 130.59 J/mol-K, 205.03 J/mol-K, and 188.84 J/mol-K. The total change in entropy for this reaction is

ΔS = 188.84 − 130.59 − ½(205.03) = −44.27 J/mol-K

It’s negative so entropy decreases in this reaction, which means the reaction is entropically disfavored. So enthalpy and entropy oppose each other in this reaction. Which will dominate depends on temperature. At 25 degrees Celsius (298 K) the change in Gibbs free energy is

ΔG = −241.83 − (298)(−0.04427) ≈ −228.6 kJ/mol

The reaction is thermodynamically favored. Even though entropy is reduced in this reaction, at this temperature that effect is overwhelmed by the favorable reduction in enthalpy as chemical bond energy of the reactants is released as thermal energy.
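The ΔG arithmetic, using the standard-state values quoted above for the reaction of hydrogen and oxygen to form water vapor:

```python
T = 298.15       # K
dH = -241.83     # kJ/mol, standard enthalpy of formation of water vapor

# Standard molar entropies quoted above, J/mol-K
S_H2, S_O2, S_H2O = 130.59, 205.03, 188.84

dS = (S_H2O - S_H2 - 0.5 * S_O2) / 1000   # kJ/mol-K, for H2 + 1/2 O2 -> H2O
dG = dH - T * dS                          # kJ/mol
print(round(dS * 1000, 1), round(dG, 1))
```

ΔS is about −44.3 J/mol-K and ΔG about −228.6 kJ/mol: negative, so the reaction is spontaneous at this temperature.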

Where’s the tradeoff point where entropy overtakes enthalpy? This is a question commonly addressed in polymer chemistry with what’s called the ceiling temperature. Polymers are macromolecules in which smaller molecular constituents called monomers are consolidated into larger molecules. We can see intuitively that this kind of molecular consolidation constitutes a reduction in entropy. It corresponds with the rough analogy of order emerging from “disorder” as disparate parts are assembled into a more organized totality. And that analogy isn’t bad. So in polymer production it’s important to run polymerization reactions at temperatures where exothermic, enthalpic effects dominate. The upper end of this temperature range is the ceiling temperature.

The ceiling temperature is easily calculable from the Gibbs free energy equation for polymerization:

ΔGp = ΔHp − TΔSp

Set ΔGp to zero:

0 = ΔHp − Tc ΔSp

And solve for Tc:

Tc = ΔHp / ΔSp

At this temperature enthalpic and entropic effects are balanced. Below this temperature polymerization can proceed spontaneously. Above this temperature depolymerization can proceed spontaneously.

Here’s an example using polyethylene. The enthalpies and entropies of polymerization for polyethylene are

Using our equation for the ceiling temperature we find

So for a polyethylene polymerization reaction you want to run the reaction below 610 degrees Celsius so that the exothermic, enthalpic benefit overcomes the decrease in entropy.
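A quick ceiling-temperature sketch. A caution: the ΔHp and ΔSp below are illustrative values I’ve chosen to be consistent with the roughly 610 degrees Celsius figure above; they are assumptions, not authoritative data, so consult a polymer handbook for measured values.

```python
# Illustrative (assumed) polymerization values for polyethylene:
dH_p = -108.4e3   # J/mol   (assumed)
dS_p = -122.8     # J/mol-K (assumed)

# Both are negative, so Tc is positive: below Tc the enthalpy term wins
T_c = dH_p / dS_p            # ceiling temperature, K
print(round(T_c), round(T_c - 273.15))
```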


A friend and I used to get together on weekends to take turns playing the piano, sight reading music. We were both pretty good at it and could play songs reasonably well on a first pass, even though we’d never played or seen the music before. One time when someone was watching us she asked, “How do you do that?” My friend had a good explanation I think. He explained it as familiarity with the patterns of music and the piano. When you spend years playing songs and practicing scales you just come to know how things work. Another friend of mine said something similar about watching chess games. He could easily memorize entire games of chess because he knew the kinds of moves that players would tend to make. John Von Neumann once said: “In mathematics you don’t understand things. You just get used to them.” I would change that slightly to say that you understand things by getting used to them. Also true for thermodynamics. Entropy is a complex property and one that’s not easy to understand. But I think it’s easiest to get a grasp on it by using it.

Philosophy of Structure, Part 1: Thinking About Structure

This is the first in a series of episodes exploring a philosophy of structure. To introduce the concept and to start thinking about structure I make use of ideas from thermodynamics, phase spaces, information theory, The Three-Body Problem, the Hebrew Bible, the Book of Mormon, Daniel Dennett, and Jorge Luis Borges. Drawing from a variety of sources the objective is to find patterns in structures across multiple fields, to understand a general structure of structure.

The Hebrew Bible describes the condition of the earth before creation as תֹהוּ וָבֹהוּ (tohu va-bohu), “formless and void”. Both terms, tohu and bohu, convey formlessness and emptiness. It’s an interesting pair of ideas. And carrying these ideas beyond their original narrative setting, I’m intrigued by the thought that lack of form, or structure, could be understood also as a kind of emptiness, or nothingness.

With this episode I’d like to start what I intend to be a series of episodes about structure, looking at a philosophy of structure. Today I just want to introduce some general ideas and then explore particular examples of structure in more detail in later episodes, looking at structure in music, chemistry, biology, and other fields.

To introduce the subject I’d like to pull together some ideas from different subjects that range from highly technical and quantitatively rigorous to conceptual and qualitative. There are tools in physics and in information theory that can give very specific measures of certain kinds of structures. And those tools are also conceptually instructive. But I don’t think those measures exhaust or cover everything that we mean by or understand structure to be. So all of this will fall into a diverse toolbox of ways to think about structure and to approach the topic.

Going back to the Hebrew Bible and the primordial condition of tohu va-bohu, if we think of this condition as a lack of structure, the kind of emptiness or nothingness I imagine in the lack of structure is not absolute nothingness, whatever that might be. But it’s nothingness of the sort of there not being anything very interesting. Even if there’s “stuff” there there’s not really anything going on. Or even if there’s stuff going on, like as a whirling, chaotic mass, it still amounts to uniformity with all the pieces just canceling each other out, adding up to not very much.

Part of the lack is an aesthetic lack. An absence of engaging content. There’s a great literary illustration of this idea in Cixin Liu’s novel The Three-Body Problem. This is from a scene where one of the characters, who is suffering from a mental illness, is meditating and trying to heal his troubled mind:

“In my mind, the first ‘emptiness’ I created was the infinity of space. There was nothing in it, not even light. But soon I knew that this empty universe could not make me feel peace. Instead, it filled me with a nameless anxiety, like a drowning man wanting to grab on to anything at hand. So I created a sphere in this infinite space for myself: not too big, though possessing mass. My mental state didn’t improve, however. The sphere floated in the middle of ‘emptiness’—in infinite space, anywhere could be the middle. The universe had nothing that could act on it, and it could act on nothing. It hung there, never moving, never changing, like a perfect interpretation for death. I created a second sphere whose mass was equal to the first one’s. Both had perfectly reflective surfaces. They reflected each other’s images, displaying the only existence in the universe other than itself. But the situation didn’t improve much. If the spheres had no initial movement—that is, if I didn’t push them at first—they would be quickly pulled together by their own gravitational attraction. Then the two spheres would stay together and hang there without moving, a symbol for death. If they did have initial movement and didn’t collide, then they would revolve around each other under the influence of gravity. No matter what the initial conditions, the revolutions would eventually stabilize and become unchanging: the dance of death. I then introduced a third sphere, and to my astonishment, the situation changed completely… This third sphere gave ‘emptiness’ life. The three spheres, given initial movements, went through complex, seemingly never-repeating movements. The descriptive equations rained down in a thunderstorm without end. Just like that, I fell asleep. The three spheres continued to dance in my dream, a patternless, never-repeating dance. Yet, in the depths of my mind, the dance did possess a rhythm; it was just that its period of repetition was infinitely long. 
This mesmerized me. I wanted to describe the whole period, or at least a part of it. The next day I kept on thinking about the three spheres dancing in ‘emptiness.’ My attention had never been so completely engaged.”

And that’s a very imaginative description of what’s known in physics and mathematics as the three-body problem, which, unlike the two-body problem, has no general closed-form solution; that is the reason for the unending, non-repeating motion. What I like about this story is the way the character responds to the increasing structure in his mental space. As structure increases he becomes increasingly engaged. I think this subjective response to structure will have to be an indispensable aspect of any philosophy of structure.

Another literary, or rather scriptural, example of this idea is in Latter-day Saint scripture in the Book of Mormon. A prophet named Lehi talks about how existence itself depends on the tension between opposites: “For it must needs be, that there is an opposition in all things. If not so… righteousness could not be brought to pass, neither wickedness, neither holiness nor misery, neither good nor bad. Wherefore, all things must needs be a compound in one; wherefore, if it should be one body it must needs remain as dead, having no life neither death, nor corruption nor incorruption, happiness nor misery, neither sense nor insensibility.” (2 Nephi 2:11)

There’s a similar idea here of “death” as with Cixin Liu’s character who finds only death in his static or repetitive mental structures.

Another idea that comes up in both the aesthetic and technical instances of structure is that of distinction. Lehi talks about opposition, setting one thing against another. We could say that the opposing entities endow each other with definition and identity. The Hebrew Bible also contrasts the formlessness and emptiness, the תֹהוּ וָבֹהוּ (tohu va-bohu), with separation. Elohim brings order to the earth by separating things; the Biblical verb of separation is בָּדל (badal). וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָאֹ֖ור וּבֵ֥ין הַחֹֽשֶׁךְ (va-yavdel elohim ben ha-or u-ben ha-choshek); “and God separated the light from the darkness” (Genesis 1:4). God separated the sky from the sea, the day from the night. Through separation what was a formless void came to have structure.

What are some examples of structure from a more technical side, scientifically and philosophically? Interestingly enough there are actually some concepts that overlap with these literary and scriptural ones, sharing notions of both distinction and form.

One way to think about a system is by using a phase space. A phase space is an abstract space in which all possible states of a system are represented, with each possible state corresponding to one unique point in the phase space. This is also called a state space. To get the general concept of an abstract space you can think of a graph with a horizontal axis and a vertical axis, each axis representing some property. The points on the graph represent different combinations of the two properties. That’s a kind of phase space.

To give a very simple example, consider a system of 2 particles in 1-dimensional space, which is just a line. The state space containing all possible arrangements of a system of n particles in 1-dimensional space will have n dimensions, one dimension per particle. Such a space dealing strictly with positions is also called a configuration space. So for our 2-particle system the configuration space will have 2 dimensions. We can represent this on a graph using a horizontal axis for one particle and a vertical axis for the second particle. Any point on the graph represents a single combination of positions for particles 1 and 2. That example is nice because it’s visualizable. When we expand to more than 3 dimensions we lose that visualizability but the basic principles still apply.
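To make the bookkeeping concrete, here is a minimal sketch in Python (the function names are mine, purely for illustration): a state of the two-particle system is just a pair of coordinates, one point in a 2-dimensional configuration space, and the dimension count generalizes directly.

```python
# A state of 2 particles on a line is a point (x1, x2) in a
# 2-dimensional configuration space.
def state(x1, x2):
    """One point in configuration space: the positions of both particles."""
    return (x1, x2)

# For n particles in d-dimensional space the configuration space
# has n * d dimensions: one coordinate per particle per spatial axis.
def dimensions(n_particles, spatial_dims=1):
    return n_particles * spatial_dims

print(dimensions(2))      # 2-D configuration space for our two-particle example
print(dimensions(5, 3))   # 15-D for 5 particles in ordinary 3-D space
```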

A classic example of a phase space is for a system of gas particles. Say we have a gas with n particles. These particles can have several arrangements. That’s putting it mildly. The collection of all possible arrangements makes up a configuration space of 3n dimensions, 3 dimensions for every particle. Such a space could have billions upon billions of dimensions. This is not even remotely visualizable but the principles are the same as in our 2 dimensional configuration space above. A single point in this configuration space is one possible arrangement of all n particles. It’s like a snapshot of the system at a single instant in time. All the points in the configuration space comprise all possible arrangements of these n particles.

To get a more complete picture of the system a phase space adds, for each particle, 3 more dimensions for its momentum in the 3 spatial directions. So the phase space will have 6n dimensions. A snapshot of the system, with the positions and momenta of every particle at one instant, will occupy a single point in the 6n-dimensional phase space, and the entire 6n-dimensional phase space will contain all possible combinations of positions and momenta for the system. The evolution of a system through successive states will trace out a path in its phase space. It’s just a mind-bogglingly enormous space. There’s no way we can actually imagine this in any sort of detail. But just the concept is useful.

The sum total of all possible states that a system can take constitutes a tremendous amount of information, but most states in a phase space aren’t especially interesting, and I’d suggest that this is because they aren’t very structured. One useful affordance of phase spaces is that we can collect groups of states and categorize them according to criteria of interest to us. A complete specification of a single state is called a microstate; it contains all the information about the system’s state. So for example, in the case of a system of gas particles the microstate gives the position and momentum of every particle in the system. But for practical purposes that’s too much information. To see if there’s anything interesting going on we need to look at the state of a system at a lower resolution, at its macrostate. The procedure of moving from the maximal information of microstates to the lower resolution of macrostates is called coarse graining. In coarse graining we divide the phase space up into large regions that contain groups of microstates that are macroscopically indistinguishable. We can represent this pictorially as a surface divided up into regions of different sizes. The states in a shared region are not microscopically identical; in the case of a system of particles, the states have different configurations of positions and momenta for the particles composing them. But the states in the shared region are macroscopically indistinguishable, meaning that they share some macroscopic property. Examples of such macroscopic properties for a gas are temperature, pressure, volume, and density.

The size of a macrostate is given by the number of microstates included in it. Macrostates of phase spaces can have very different sizes. Some macrostates are very special and occupy tiny regions of the phase space. Others are very generic and occupy enormous regions. A smaller macrostate is one with fewer microstates that could produce it; it’s more unique. Larger macrostates are more generic. An example of an enormous macrostate region is thermodynamic equilibrium. Thermodynamic equilibrium is a condition in which the macroscopic properties of a system do not change with time. So, for example, macroscopic properties like temperature, pressure, volume, and density would not change in thermodynamic equilibrium. The reason the region of thermodynamic equilibrium is huge is that it contains a huge number of macroscopically indistinguishable microstates. What this means is that a condition of thermodynamic equilibrium can be realized in an enormous number of ways. In a gas, for instance, the particles can have an enormous number of different configurations of positions and momenta that make no difference to the macroscopic properties and that all manifest as a condition of thermodynamic equilibrium. The system will continue to move through different microstates with time, tracing out a curve in phase space. But because the macrostate of thermodynamic equilibrium is so huge, the curve will remain in that region. The system is not going to naturally evolve from thermodynamic equilibrium to some more unique state. That is so statistically unlikely as to be, for all practical purposes, impossible.
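A toy calculation can make the difference in macrostate sizes vivid. Here is a rough Python sketch (my own toy model, not anything from the text above): imagine 100 particles, each independently in the left or right half of a box, and coarse grain by counting how many are on the left. The balanced, equilibrium-like macrostate contains astronomically more microstates than the extreme ones.

```python
from math import comb

N = 100  # toy gas: 100 particles, each in the left or right half of a box

# Macrostate: how many particles are in the left half.
# The number of microstates realizing that macrostate is "N choose k".
def microstates(k, n=N):
    return comb(n, k)

# The balanced macrostate (equilibrium-like) vs. an extreme one:
print(microstates(50))   # ~1.0e29 microstates
print(microstates(0))    # exactly 1 microstate: every particle on the right

# Fraction of ALL 2^100 microstates that sit in the balanced macrostate:
print(microstates(50) / 2**N)
```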

I think of the Biblical תֹהוּ וָבֹהוּ (tohu va-bohu) as a kind of thermodynamic equilibrium. It’s not necessarily that there’s nothing there. But in a sense, nothing is happening. Sure, individual particles may be moving around, but not in any concerted way that will produce anything macroscopically interesting.

This thermodynamic equilibrium is the state toward which systems naturally tend. The number of indistinguishable microstates in a macrostate, the size of a macrostate, is quantified as the property called entropy. Sometimes we talk about entropy informally as a measure of disorder. And that’s well enough. It also corresponds nicely, albeit inversely, to the notion of structure. More formally, the entropy of a macrostate is correlated (logarithmically) to the number of microstates corresponding to that macrostate. Using the intuition of the informal notion of disorder you might see how a highly structured macrostate would have fewer microstates corresponding to it. There aren’t as many ways to put the pieces together into a highly structured state as there are to put them into a disordered state.

Some notions related to structure, like meaning or function, are fairly informal. But in the case of entropy it’s actually perfectly quantifiable. And there are equations for it in physics. If the number of microstates for a given macrostate is W, then the entropy of that macrostate is proportional to the logarithm of W, the logarithm of the number of microstates. This is Boltzmann’s equation for entropy:

S = k log W

in which the constant k is the Boltzmann constant, 1.381 × 10⁻²³ J/K, and entropy has units of J/K. This equation for entropy holds when all microstates of a given macrostate are equally probable. But if this is not the case then we need another equation to account for the different probabilities. That equation is:

S = -k ∑ pᵢ log pᵢ

or equivalently,

S = k ∑ pᵢ log (1/pᵢ)

where pᵢ is the probability of each microstate. This reduces to the first equation if the probabilities of all the microstates are equal and pᵢ = 1/W.
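As a quick numerical check, here is a small Python sketch of both formulas (using the natural logarithm, with k the Boltzmann constant; the function names are mine). With equal probabilities the general formula reduces to Boltzmann’s, as stated above.

```python
from math import log

k_B = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_entropy(W):
    """S = k log W, for W equally probable microstates."""
    return k_B * log(W)

def gibbs_entropy(probs):
    """S = -k sum(p_i log p_i), for microstates with probabilities p_i."""
    return -k_B * sum(p * log(p) for p in probs if p > 0)

W = 1000
uniform = [1.0 / W] * W

# With all p_i = 1/W the two formulas agree:
print(boltzmann_entropy(W))
print(gibbs_entropy(uniform))
```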

How might W look differently for different macrostates? It’s fairly easy to imagine that a state of thermodynamic equilibrium would have a huge number of indistinguishable microstates. But what if the system has some very unusual macrostate? For example, say all the gas particles in a container were compressed into a tiny region of the available volume. This could still be configured in multiple ways, with many microstates, but far fewer than if they were distributed evenly throughout the entire volume. Under such constraints the particles have far fewer degrees of freedom and the entropy of that unusual configuration would be much lower.
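We can put a rough number on that intuition with a toy counting model (my own simplification, not a rigorous derivation: distinguishable particles, positions coarse-grained into discrete cells). Halving the available volume halves the cells available to each particle, and the entropy drops by N k ln 2:

```python
from math import log

k_B = 1.380649e-23  # Boltzmann constant, J/K
N = 6.022e23        # one mole of gas particles
M = 1e30            # number of coarse-grained position cells (assumed value)

# Distinguishable particles, each in one of M cells: W = M**N microstates,
# so S = k log(M**N) = N k log M. Working with log W directly avoids
# astronomically large numbers.
def entropy(n_particles, n_cells):
    return k_B * n_particles * log(n_cells)

S_full = entropy(N, M)       # particles spread through the whole volume
S_half = entropy(N, M / 2)   # particles squeezed into half the cells

# Squeezing a mole of gas into half its volume costs N k ln 2 of entropy:
print(S_full - S_half)       # ≈ 5.76 J/K
```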

Let’s think about the different sizes of macrostates and the significance of those different sizes in another way, using an informal, less technical, literary example. One of my favorite short stories is La biblioteca de Babel, “The Library of Babel”, by Argentine author Jorge Luis Borges. Borges was a literary genius and philosophers love his stories. This is probably the story referred to most, and for good reason. In La biblioteca de Babel Borges portrays a universe composed of “an indefinite and perhaps infinite number of hexagonal galleries”. This universe is one vast library of cosmic extension. And the library contains all possible books. “Each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters… All the books, no matter how diverse they might be, are made up of the same elements: the space, the period, the comma, the twenty-two letters of the alphabet.” So there are bounds set to the states this library or any of its books can take. But this still permits tremendous variability. As an analogy with statistical mechanics we can think of the Library of Babel as a phase space and of each book as a microstate.

Daniel Dennett has referred to the Library of Babel in his philosophy and proposed some of the books that, under the conditions set by Borges, must be understood to exist in this library: “Somewhere in the Library of Babel is a volume consisting entirely of blank pages, and another volume is all question marks, but the vast majority consist of typographical gibberish; no rules of spelling or grammar, to say nothing of sense, prohibit the inclusion of a volume… It is amusing to think about some of the volumes that must be in the Library of Babel somewhere. One of them is the best, most accurate 500-page biography of you, from the moment of your birth until the moment of your death. Locating it, however, would be all but impossible (that slippery word), since the Library also contains kazillions of volumes that are magnificently accurate biographies of you up till your tenth, twentieth, thirtieth, fortieth… birthday, and completely false about subsequent events… Moby Dick is in the Library of Babel, of course, but so are 100,000,000 mutant impostors that differ from the canonical Moby Dick by a single typographical error. That’s not yet a Vast number, but the total rises swiftly when we add the variants that differ by 2 or 10 or 1,000 typos.” (Darwin’s Dangerous Idea)

A key takeaway from this fantastical story is that only an infinitesimal portion of its volumes are even remotely meaningful to readers. The vast majority of the books are complete nonsense. The Library of Babel is a little easier for me to think about in certain ways than phase space. For many things I’m not sure how to generate a phase space by picking out specific properties and assigning them axes onto which individual states would project with numerical coordinates. A lot of things don’t easily lend themselves to that kind of technical breakdown. But thinking of microstates and macrostates more informally, let’s just take the macrostate of all the books in the Library of Babel that are completely meaningless. This would be a huge macrostate comprising the vast majority of the library, the vast majority of its books, i.e. microstates. As with thermodynamic equilibrium this is the most likely macrostate to be in. And a system evolving through the library, moving from one book to the next, will more than likely never leave it, i.e. will never find a book with any meaningful text.

But the Library of Babel does contain meaningful texts. And we could coarse grain in such a way as to assign books to different macrostates based on the amount of meaningful text they contain. After the macrostate containing books of complete nonsense, the next largest macrostate will contain books with a few meaningful words. The macrostates for books with more and more meaningful words get successively smaller, and smaller still when those words are put into meaningful sentences and then paragraphs. The smallest macrostates will have entire books of completely meaningful text. But as any book browser knows, books vary in quality. Even among books of completely meaningful text some will be about as interesting as an online political flame war. The macrostates of literary classics and of interesting nonfiction will be comparatively minuscule indeed.

We can think of a book in the Library of Babel as a kind of message. And that starts us thinking in terms of another technical field that I think is relevant to a philosophy of structure: information theory. Information theory has some interesting parallels to statistical mechanics, and it even makes use of a concept of entropy that is very similar to the thermodynamic concept of entropy. In information theory this is sometimes called Shannon entropy, named after the great mathematician Claude Shannon. The Shannon entropy of a random variable is the average level of “information”, “surprise”, or “uncertainty” inherent in the variable’s possible outcomes. It’s calculated in a very similar way to thermodynamic entropy. If Shannon entropy is H and a discrete random variable X has possible outcomes x₁,…,xₙ, with probabilities P(x₁),…,P(xₙ), then the entropy is calculated by the equation:

H(X) = – ∑ P(xᵢ) logb P(xᵢ)

That equation should look very familiar because it’s identical in form to the equation for thermodynamic entropy, in the case where the microstates have different probabilities. The base of the logarithm is b, often base 2, with the resulting entropy being given in units of bits. A bit is a basic unit of data that represents a logical state with one of two possible values. So for example, whether a value is 0 or 1 is a single bit of information. Another example is whether a coin toss result is heads or tails.

An equation of this form gives an average quantity. The value –log pᵢ, or equivalently log (1/pᵢ), is the “surprise” for a single outcome, and it has a higher value when the outcome’s probability is lower, which makes sense: more improbable outcomes should be more surprising. When the surprise values for all the outcomes, each weighted by its probability, are summed together, the result is the average surprise, which is the Shannon entropy. This average quantity can also be used to calculate the total information a message contains: the entropy per symbol, multiplied by the length of the message in symbols, gives the total information content of the message.
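Here is a minimal Python sketch of the calculation (the helper name is my own): entropy in bits per symbol for a fair and a biased coin, and the total information of a message as entropy per symbol times message length.

```python
from math import log2

def shannon_entropy(probs):
    """H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # fair coin: 1.0 bit per toss
print(shannon_entropy([0.9, 0.1]))   # biased coin: ~0.469 bits per toss

# Total information of a message: entropy per symbol times message length.
message_length = 1000  # e.g. 1000 fair coin tosses
print(shannon_entropy([0.5, 0.5]) * message_length)  # 1000.0 bits
```

Note how the biased coin carries less information per toss: its outcomes are more predictable, so on average each one is less surprising.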

This is an extremely useful way to quantify information and this is just a taste of the power of information theory. Even so I don’t think it exhausts our notions of what information is, or can be. Daniel Dennett makes a distinction between Shannon information and semantic information (From Bacteria to Bach and Back). Shannon information is the kind of information studied in information theory. To explore this distinction let’s return to Borges’s library.

One thing I like about La biblioteca de Babel is the way it conveys the intense human reaction to semantic meaning, or the incredible lack of it in the case of the library’s inhabitants. The poor souls of the Library of Babel are starving for meaning and tortured by the utter senselessness of their universe, the meaningless strings of letters and symbols they find in book after book, shelf after shelf, floor after floor. There’s a brilliant literary expansion of Borges’s library in the novella A Short Stay in Hell by Steven L. Peck, in which one version of Hell itself actually is the Library of Babel, with the single horrific difference that its inhabitants can never die.

In terms of Shannon information most books on the shelves of the Library of Babel contain a lot of information. But almost all the books contain no semantic information whatsoever. This is an evaluation we are only able to make as meaning-seeking creatures. Information theory doesn’t need to make distinctions about semantic meaning; it accomplishes its scope of work without them. But when we’re thinking about structures, with meaning and functions, in the way I’m trying to, we need that extra level of evaluation that is, at least for now, only available to humans.

That’s not to say there’s an absolute, rigid split between the objective and subjective. Information theory makes use of the subjective phenomena of human perceptions like sight and sound. This is critical to perceptual coding. We’re all beneficiaries of perceptual coding when we use jpeg images, mp3 audio files, and mp4 video files. These are all file types that compress data, with loss, by disposing of information that is determined to be imperceptible to human senses.

That’s getting a little closer to semantic information. Semantic information doesn’t have to be transmitted word for word. Someone can get the “gist” of a message and convey it with quite high fidelity, in a certain sense anyway. The game of telephone notwithstanding, we can share stories with each other without recording and replaying precise scripts of characters or sound wave patterns. We can recreate the stories in our own words at the moment of the telling.

That’s not to say that structure has to be about perception. Something like a musical composition or narrative has a lot to do with perception and aesthetic receptivity. But even compositions and narratives can contain structure that few people or even no people pick up on. And there are also structures in nature and in mathematics that remain hidden from human knowledge until they are discovered.

I think there are some affinities between what I will informally call the degree of structure in the hidden and discovered structures of nature and mathematics, and the degree to which the outputs of those structures can be compressed. Data compression is the process of encoding information using fewer bits than the original representation. How is that possible? Data compression programs exploit regularities and patterns in the data and produce code to represent the data more efficiently. Such a program creates a new file using a new, shorter code, along with a dictionary for the code so that the original data can be restored. It’s these regularities and patterns that I see as characteristic of structure. This can be quantified in terms of Kolmogorov complexity.

The Kolmogorov complexity of an object is the length of the shortest computer program that produces the object as output. For an object with no structure that is completely random the only way for a computer program to produce the object as an output is just to reproduce it in its entirety. Because there are no regularities or patterns to exploit. But for a highly structured object the computer program to produce it can be much shorter. This is especially true if the output is the product of some equation or algorithm.
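Compressed size under a general-purpose compressor is a practical, computable stand-in for Kolmogorov complexity (an upper bound on it, roughly speaking). A quick Python sketch with the standard zlib module: a highly patterned byte string collapses to almost nothing, while random bytes barely compress at all.

```python
import random
import zlib

random.seed(0)  # fixed seed so the example is reproducible

structured = b"ABAB" * 2500                                # 10,000 highly patterned bytes
random_data = bytes(random.randrange(256) for _ in range(10000))  # 10,000 random bytes

# The compressor exploits regularities; compressed size is a rough
# proxy for how much structure (or lack of it) the data contains.
print(len(zlib.compress(structured)))    # tiny: the repeating pattern compresses well
print(len(zlib.compress(random_data)))   # close to 10,000: no pattern to exploit
```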

For example an image of part of the Mandelbrot set fractal might take 1.61 million bytes to store the 24-bit color of each pixel. But the Mandelbrot set is also the output of a simple function that is actually fairly easy to program. It’s not necessary to reproduce the 24-bit color of each pixel. Instead you can just encode the function and the program will produce the exact output. The Mandelbrot set is a good example for illustration because the fractal it produces is very elegant. But the same kind of process will work with any kind of function. Usually the program for a function will be much shorter than the data set of its output.
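The point is easy to demonstrate: the program below is a few lines of Python (a crude escape-time sketch, with grid and iteration choices of my own), yet its output traces the familiar shape whose full-color renderings take megabytes to store.

```python
# The Mandelbrot set: a short program whose output is a large, intricate image.
# A point c is (approximately) in the set if z -> z*z + c stays bounded.
def in_mandelbrot(c, max_iter=50):
    z = 0
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:      # escaped: c is not in the set
            return False
    return True

# Render a tiny 20x40 character view of the set:
for row in range(20):
    y = 1.2 - row * 0.12
    line = "".join(
        "#" if in_mandelbrot(complex(-2 + col * 0.075, y)) else " "
        for col in range(40)
    )
    print(line)
```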

Often scientific discovery is a matter of finding natural structures by working backward from the outputs to infer the functions that produce them. This is the project of trying to discover the laws of nature. Laws are the regularities and patterns at work in nature. The process can be tricky because there are often many confounding factors and variables are rarely isolated. But sorting through all that is part of the scientific process. As a historical example, Johannes Kepler had in his possession a huge collection of astronomical data that had been compiled over decades, much of it inherited from his mentor Tycho Brahe. What Kepler was ultimately able to do was figure out that the paths traced out by the recorded positions of the planets in space were ellipses. The equation for an ellipse is fairly simple. Knowing that underlying regularity makes it possible not only to reproduce Brahe and Kepler’s original data sets, but also to retrodict and predict the positions of planets outside those data sets, because we have the governing equations.
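To illustrate how compactly the regularity encodes the data, here is a small Python sketch using the polar equation of an ellipse with the Sun at one focus; the orbital values are Mars-like numbers I’ve plugged in for illustration. Two numbers, the semi-major axis a and the eccentricity e, stand in for an entire table of recorded positions.

```python
from math import cos, radians

# Polar equation of an ellipse with the Sun at one focus:
#   r = a(1 - e^2) / (1 + e cos(theta))
# where a is the semi-major axis and e the eccentricity.
def orbital_radius(a, e, theta_deg):
    return a * (1 - e**2) / (1 + e * cos(radians(theta_deg)))

# Mars-like values, assumed for illustration: a in astronomical units
a, e = 1.524, 0.0934

# The formula lets us retrodict or predict the distance at ANY angle,
# not just the angles in some recorded data set:
print(orbital_radius(a, e, 0))     # perihelion, closest approach: ~1.38 AU
print(orbital_radius(a, e, 180))   # aphelion, farthest point:     ~1.67 AU
```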

That kind of pattern-finding often works well in discerning natural structures. It’s less relevant to human structures where creativity, novelty, and unpredictability can actually be features of greater aesthetic structure. It’s for reasons like this that my approach to a philosophy of structure is highly varied and somewhat unsystematic, pulling pieces together from several places.

Structure seems especially important in the arts and a philosophy of structure in the arts will necessarily overlap with the study of aesthetics. It’s really creative, artistic structures that I find most interesting of all.

Dieter F. Uchtdorf talked about human creativity in a way that I think touches on the key aesthetic features of structure. He said: “The desire to create is one of the deepest yearnings of the human soul… We each have an inherent wish to create something that did not exist before… Creation brings deep satisfaction and fulfillment. We develop ourselves and others when we take unorganized matter into our hands and mold it into something of beauty.” (italics added) A number of important ideas here. I’ll focus on two: (1) that creation is bringing into existence something that did not exist before and (2) that creation is a process of taking unorganized matter and molding it into something of beauty. This coheres with the idea I proposed earlier of the Hebrew creation story, that the lack of form, or structure, in the primordial chaos could be understood also as a kind of emptiness, or nothingness. By imposing a new structure onto raw, unorganized materials it’s possible to bring into existence something that did not exist before.

This is similar to Aristotle’s idea of a formal cause. In Aristotle’s metaphysics he identified four kinds of causes: material, formal, efficient, and final. We’ll just look at the first two here. The material cause is the raw material that composes whatever is being brought about. If we want to understand how a wooden table is created the material cause is the wood used to make it. That’s the unorganized matter. The formal cause is the form, arrangement, shape, or structure, into which this material is fashioned. Clearly the formal cause is just as important to bringing the object about.

The ways we evaluate structure and its aesthetic virtues, its beauty, is a complex subject. Are aesthetic criteria objective or subjective? The aesthetic response is certainly a subjective process. But is the subjective response a consistent and law-like process that correlates to objective features? It’s difficult to say.

David Bentley Hart said of aesthetics: “The very nature of aesthetic enjoyment resists conversion into any kind of calculable economy of personal or special benefits. We cannot even isolate beauty as an object among other objects, or even as a clearly definable property; it transcends every finite description. There have, admittedly, been attempts in various times and places to establish the ‘rules’ that determine whether something is beautiful, but never with very respectable results… Yes, we take pleasure in color, integrity, harmony, radiance, and so on; and yet, as anyone who troubles to consult his or her experience of the world knows, we also frequently find ourselves stirred and moved and delighted by objects whose visible appearances or tones or other qualities violate all of these canons of aesthetic value, and that somehow ‘shine’ with a fuller beauty as a result. Conversely, many objects that possess all these ideal features often bore us, or even appall us, with their banality. At times, the obscure enchants us and the lucid leaves us untouched; plangent dissonances can awaken our imaginations far more delightfully than simple harmonies that quickly become insipid; a face almost wholly devoid of conventionally pleasing features can seem unutterably beautiful to us in its very disproportion, while the most exquisite profile can do no more than charm us… Whatever the beautiful is, it is not simply harmony or symmetry, or consonance or ordonnance or brightness, all of which can become anodyne or vacuous of themselves; the beautiful can be encountered—sometimes shatteringly—precisely where all of these things are deficient or largely absent. Beauty is something other than the visible or audible or conceptual agreement of parts, and the experience of beauty can never be wholly reduced to any set of material constituents. It is something mysterious, prodigal, often unanticipated, even capricious.” (The Experience of God)

These are good points. Aesthetic judgment is difficult to systematize. And I can’t say I know of any theory that successfully defines precise evaluative procedures from objective criteria. But neither is that to say that aesthetic judgment is arbitrary. There are easy cases where there is near universal agreement that artistic creations are of high or low quality. And there are also harder cases where appreciation for high quality art requires refined tastes, refined through training and initiation into an artistic practice. Even the best critics are not able to fully articulate their reasons for making the judgments they do. And they may have imprecise vocabulary that is incomprehensible to those outside the practice. Sommeliers and wine tasters, for example, have a vocabulary for their craft that goes completely over my head (and taste buds). But I don’t doubt that the vocabulary is meaningful to them. I believe all these artforms have structures to which we can refer, if only imprecisely.

Having looked briefly in this episode at some general ideas pertaining to structure, what I want to do in following episodes for the series is look closely at examples of structure in more detail, focusing on individual fields, one at a time. Like music, chemistry, biology, language, social and political organizations, and mathematics. I expect that the characteristics of structure in these different cases will be varied. But I hope that as the coverage gets more comprehensive it will give more opportunity for insight into the general nature of structure. I hope through some inductive and abductive reasoning to infer general patterns of structure across these various domains, to understand a general structure of structure.