How to Use Entropy

Entropy is an important property in physics and chemistry, but it can be challenging to grasp. It is commonly understood as “disorder”, which is fine as an analogy, but there are better ways to think about it. As with many concepts, especially complex ones, better understanding comes with repeated use and application. Here we look at how to use and quantify entropy in applications with steam and chemical reactions.

Entropy is rather intimidating. It’s important to the sciences of physics and chemistry but it’s also highly abstract. There are, no doubt, more than a couple of students who graduate with college degrees in the physical sciences or in engineering who don’t have much of an understanding of what it is or what to do with it. We know it’s there and that it’s a thing but we’re glad not to have to think about it any more after we’ve crammed for that final exam in thermodynamics. I think one reason for that is because entropy isn’t something that we often use. And using things is how we come to understand them, or at least get used to them.

Ludwig Wittgenstein argued in his later philosophy that the way we learn words is not with definitions or representations but by using them, over and over again. We start to learn “language games” as we play them, whether as babies or as graduate students. I was telling my daughters the other day that we never really learn all the words in a language. There are lots of words we’ll never learn and that, if we happen to hear them, mean nothing to us. To use a metaphor from Wittgenstein again, when we hear these words they’re like wheels that turn without anything else turning with them. I think entropy is sometimes like this. We know it’s a thing but nothing else turns with it. I want to plug it into the mechanism. I think we can understand entropy better by using it to solve physical problems, to see how it interacts (and “turns”) with things like heat, temperature, pressure, and chemical reactions. My theory is that using entropy in this way will help us get used to it and be more comfortable with it. So that maybe it’s a little less intimidating. That’s the object of this episode.

I’ll proceed in three parts.

1. Define what entropy is

2. Apply it to problems using steam

3. Apply it to problems with chemical reactions

What is Entropy?

I’ll start with a technical definition that might be a little jarring but I promise I’ll explain it.

Entropy is a measure of the number of accessible microstates in a system that are macroscopically indistinguishable. The equation for it is:

S = k ln W

Here S is entropy, k is the Boltzmann constant, and W is the number of accessible microstates in a system that are macroscopically indistinguishable.

Most people, if they’ve heard of entropy at all, haven’t heard it described in this way, which is understandable because it’s not especially intuitive. Entropy is often described informally as “disorder”. Like how your bedroom will get progressively messier if you don’t actively keep it clean. That’s probably fine as an analogy but it is only an analogy. I prefer to dispense with the idea of disorder altogether as it relates to entropy. I think it’s generally more confusing than helpful.

But the technical, quantifiable definition of entropy is a measure of the number of accessible microstates in a system that are macroscopically indistinguishable.

S = k ln W

Entropy S has units of energy divided by temperature; I’ll use units of J/K. The Boltzmann constant k is 1.38 x 10^-23 J/K. The Boltzmann constant has the same units as entropy so those will cancel, leaving W as a pure number with no dimensions.

W is the number of accessible microstates in a system that are macroscopically indistinguishable. So we need to talk about macrostates and microstates. An example of a macrostate is the temperature and pressure of a system. The macrostate is something we can measure with our instruments: temperature with a thermometer and pressure with a pressure gauge. But at the microscopic or molecular level the system is composed of trillions of molecules and it’s the motion of these molecules that produce what we see as temperature and pressure at a macroscopic level. The thermal energy of the system is distributed between its trillions of molecules and every possible, particular distribution of thermal energy between each of these molecules is an individual microstate. The number of ways that thermal energy of a system can be distributed among its molecules is an unfathomably huge number. But the vast majority of them make absolutely no difference at a macroscopic level. The vast majority of the different possible microstates correspond to the same macrostate and are macroscopically indistinguishable.

To dig a little further into what this looks like at the molecular level, the motion of a molecule can take the form of translation, rotation, and vibration. In monatomic gases it only takes the form of translation, which is just movement from one position to another. Polyatomic molecules can also undergo rotation and vibration, with the number of vibrational patterns increasing as the number of atoms increases and the shape of the molecule becomes more complicated. All these possibilities for all the molecules in a system are potential microstates. And there’s a huge number of them. Huge, but also finite. A fundamental postulate of quantum mechanics is that energy is quantized: it does not vary continuously but comes in discrete levels. So there is a finite number of accessible microstates, even if it’s a very large finite number.

For a system like a piston we can set its entropy by setting its energy (U), volume (V), and number of atoms (N); its U-V-N conditions. If we know these conditions we can predict what the entropy of the system is going to be. The reason for this is that these conditions set the number of accessible microstates. The reason that the number of accessible microstates would correlate with the number of atoms and with energy should be clear enough. Obviously having more atoms in a system will make it possible for that system to be in more states. The molecules these atoms make up can undergo translation, rotation, and vibration and more energy makes more of that motion happen. The effect of volume is a little less obvious but it has to do with the amount of energy separating each energy level. When a set number of molecules expand into a larger volume the energy difference between the energy levels decreases. So there are more energy levels accessible for the same amount of energy. So the number of accessible microstates increases.

The entropies of many different substances have been measured and tabulated at various temperatures and pressures. There’s an especially rich body of data for steam, because industry has had the most practical need for it. Let’s look at some examples with water at standard temperature and pressure conditions. The standard molar entropies are:

Solid Water (Ice): 41 J/mol-K

Liquid Water: 69.95 J/mol-K

Water Vapor (Steam): 188.84 J/mol-K

One mole of water is 18 grams. So how many microstates does 18 grams of water have in each of these cases?

First, solid water (ice):

S = k ln W

41 J/K = (1.38 x 10^-23 J/K) * ln W

Divide 41 J/K by 1.38 x 10^-23 J/K and the units cancel:

ln W = 2.97 x 10^24

That’s already a big number but we’re not done yet.

Now exponentiate: raise e (about 2.718) to the power of each side.

W = 10^(1.29 x 10^24) microstates

W = 10^1,290,000,000,000,000,000,000,000 microstates

That is an insanely huge number.

Using the same method, the value for liquid water is:

W = 10^(2.2 x 10^24) microstates

W = 10^2,200,000,000,000,000,000,000,000 microstates

And the value for steam is:

W = 10^(5.94 x 10^24) microstates

W = 10^5,940,000,000,000,000,000,000,000 microstates

In each case the increased thermal energy makes additional microstates accessible. The fact that these are all really big numbers makes it a little difficult to see that, since these are differences in exponents, each number is astronomically larger than the previous one. Liquid water has 10^(9.1 x 10^23) times as many accessible microstates as ice. And steam has 10^(3.74 x 10^24) times as many accessible microstates as liquid water.
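
If you want to reproduce these back-of-the-envelope numbers, here’s a minimal Python sketch of the same calculation, using the molar entropy values quoted above. (The function name log10_microstates is just an illustrative choice.)

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K

def log10_microstates(molar_entropy):
    """log10(W) for one mole: from S = k ln W, log10(W) = S / (k ln 10)."""
    return molar_entropy / (K_BOLTZMANN * math.log(10))

# Molar entropies quoted above, in J/mol-K, for 18 grams (one mole) of water
for phase, s in [("ice", 41.0), ("liquid water", 69.95), ("steam", 188.84)]:
    print(f"{phase}: W is about 10^({log10_microstates(s):.3g})")
# Prints roughly 10^(1.29e+24), 10^(2.2e+24), and 10^(5.94e+24) microstates.
```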

With these numbers in hand let’s stop a moment to think about the connection between entropy and probability. Let’s say we set the U-V-N conditions for a system of water such that it would be in the gas phase. So we have a container of steam. We saw that 18 grams of steam has 10^(5.94 x 10^24) microstates. The overwhelming majority of these microstates are macroscopically indistinguishable. In most of the microstates the distribution of the velocities of the molecules is Gaussian; they’re not all at identical velocity but they are distributed around a mean along each spatial axis. That being said, there are possible microstates with different distributions. For example, there are 10^(1.29 x 10^24) microstates in which that amount of water would be solid ice. That’s a lot! And they’re still accessible. There’s plenty of energy there to access them. And a single microstate for ice is just as probable as a single microstate for steam. But there are 10^(4.65 x 10^24) times as many microstates for steam as there are for ice. It’s not that any one microstate for steam is more probable than any one microstate for ice. It’s just that there are a lot, lot more microstates for steam. The percentage of microstates that take the form of steam is not 99% or 99.99%. It’s much, much closer than that to 100%. Under the U-V-N conditions that make those steam microstates accessible they will absolutely dominate at equilibrium.

What if we start away from equilibrium? Say we start our container with half ice and half steam by mass, but with the same U-V-N conditions as before, so it has the same amount of energy. What will happen? The initial conditions won’t last. The ice will melt and vaporize until the system just flips among the vast number of microstates for steam. If the energy of the system remains constant it will never return to ice. Why? It’s not absolutely impossible in principle. It’s just unimaginably improbable.

That’s what’s going on at the molecular level. Macroscopically entropy is a few levels removed from tangible, measured properties. What we see macroscopically are relations between heat flow, temperature, pressure, and volume. But we can calculate the change in entropy between states using various equations expressed in terms of these macroscopic properties that we can measure with our instruments.

For example, we can calculate the change in entropy of an ideal gas using the following equation:

Δs = cp ln(T2/T1) - R ln(P2/P1)

Here s is specific entropy, cp is the heat capacity at constant pressure, T is temperature, R is the ideal gas constant, and P is pressure; states 1 and 2 are before and after the change. We can see from this equation that, all other things being equal, entropy increases with temperature and decreases with pressure. And this matches what we saw earlier. Recall that if the volume of a system of gas increases with a set quantity of material, the energy difference between the energy levels decreases and there are more energy levels accessible for the same amount of energy. In that situation the pressure decreases and the entropy increases, which is just another way of saying that entropy decreases as pressure increases.
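
As a quick illustration of how that equation behaves, here’s a small Python sketch. The numbers for air are illustrative assumptions of mine, not values from the episode.

```python
import math

R = 8.314  # J/mol-K, ideal gas constant

def delta_s_ideal_gas(cp, t1, t2, p1, p2):
    """Entropy change per mole of an ideal gas: ds = cp ln(T2/T1) - R ln(P2/P1)."""
    return cp * math.log(t2 / t1) - R * math.log(p2 / p1)

# Illustrative assumption: air (cp ~ 29.1 J/mol-K) heated from 300 K to 600 K
# while being compressed from 100 kPa to 200 kPa.
ds = delta_s_ideal_gas(cp=29.1, t1=300, t2=600, p1=100, p2=200)
print(f"ds = {ds:.2f} J/mol-K")  # heating raises entropy, compression lowers it
```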

For solids and liquids we can assume that they are incompressible and leave off the pressure term. So the change in entropy for a solid or liquid, with heat capacity c, is given by the equation:

Δs = c ln(T2/T1)

Let’s do an example with liquid water. What’s the change in entropy, and the increase in the number of accessible microstates, that comes from increasing the temperature of liquid water one degree Celsius? Let’s say we’re increasing 1 mole (18 grams) of water from 25 to 26 degrees Celsius. At this temperature the heat capacity of water is 75.3 J/mol-K. Plugging in the temperatures in kelvin:

Δs = 75.3 ln(299.15/298.15) = 0.252 J/mol-K

Now that we have the increase in entropy we can find the increase in the number of microstates using the equation

ΔS = k ln(W2/W1)

Setting this equal to 0.252 J/K (for our one mole of water) and dividing by the Boltzmann constant gives ln(W2/W1) = 1.83 x 10^22, so W2/W1 = 10^(7.9 x 10^21).

The increase is not as high as it was with phase changes, but it’s still a very big change.
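
Here’s a minimal sketch of that one-degree calculation, using the same numbers as above.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K
CP_WATER = 75.3         # J/mol-K, liquid water near 25 C

# Entropy change for 1 mol of liquid water heated from 25 C to 26 C:
delta_s = CP_WATER * math.log(299.15 / 298.15)
print(f"delta S = {delta_s:.3f} J/K")  # about 0.252 J/K

# Ratio of accessible microstates, from delta S = k ln(W2/W1):
log10_ratio = delta_s / (K_BOLTZMANN * math.log(10))
print(f"W2/W1 is about 10^({log10_ratio:.2g})")  # roughly 10^(7.9e21)
```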

We’ll wrap up the definition section here but conclude with some general intuitions we can gather from these equations and calculations:

1. All other things being equal, entropy increases with temperature.

2. All other things being equal, entropy decreases with pressure.

3. Entropy increases with phase changes from solid to liquid to gas.

Keeping these intuitions in mind will help as we move to applications with steam.

Applications with Steam

The first two examples in this section come from thermodynamic cycles. The basic thermodynamic power cycles consist of 4 processes:

1. Compression

2. Heat addition

3. Expansion

4. Heat rejection

These processes circle back on each other so that the cycle can be repeated. Think, for example, of pistons in a car engine. Each cycle of the piston is going through each of these processes over and over again, several times per second.

There are many kinds of thermodynamic cycles. The idealized cycle is the Carnot cycle, which gives the upper limit on the efficiency of conversion from heat to work. Otto cycles and diesel cycles are the cycles used in gasoline and diesel engines. Our steam examples will be from the Rankine cycle. In a Rankine cycle the 4 processes take the following form:

1. Isentropic compression

2. Isobaric heat addition

3. Isentropic expansion

4. Isobaric heat rejection

An isobaric process is one that occurs at constant pressure. An isentropic process is one that occurs at constant entropy.

An example of a Rankine cycle is a steam turbine or steam engine. Liquid water is pumped into a boiler, where it becomes steam; the steam expands through the turbine, turning it; the expanded fluid passes through a condenser; and the condensed water is pumped back to the boiler, where the cycle repeats. In such problems the fact that entropy is the same before and after expansion through the turbine reduces the number of unknown variables in our equations.

Let’s look at an example problem. Superheated steam at 6 MPa and 600 degrees Celsius expands through a turbine at a rate of 2 kg/s and drops in pressure to 10 kPa. What’s the power output from the turbine?

We can take advantage of the fact that the entropy of the fluid is the same before and after expansion. We just have to look up the entropy of superheated steam at 6 MPa and 600 degrees Celsius in a steam table; call that value s1.

The entropy of the fluid before and after expansion is the same, but some of it condenses. This isn’t good for the turbine blades but it happens nonetheless. Ideally most of the fluid is still vapor; the ratio of the mass that is saturated vapor to the total fluid mass is called the “quality”. The entropies of saturated liquid, sf, and of evaporation, sfg, are very different, so we can use algebra to calculate the quality, x2, of the fluid. The total entropy of the expanded fluid is given by the equation:

s2 = sf + x2 sfg

s2 we already know because the entropy of the fluid exiting the turbine is the same as that of the fluid entering the turbine. And we can look up the other values in steam tables.

Solving for quality: x2 = (s2 - sf)/sfg.

Now that we know the quality we can find the work output from the turbine. The equation for the work output of the turbine is:

W = m(h1 - h2)

where m is the mass flow rate.

h1 and h2 are the enthalpies before and after expansion. If you’re not familiar with enthalpy don’t worry about it (we’re getting into enough for now). It roughly corresponds to the substance’s energy. We can look up the enthalpy of the superheated steam in a steam table.

For the fluid leaving the turbine we need to calculate the enthalpy using the quality, since it’s part liquid, part vapor. We need the enthalpy of saturated liquid, hf, and of evaporation, hfg. The total enthalpy of the fluid leaving the turbine is given by the formula

h2 = hf + x2 hfg

From the steam tables we can read off sf, sfg, hf, and hfg at 10 kPa, solve for the quality, compute h2, and then plug everything in to get the work output of the turbine. (The sketch below walks through the numbers.)

So here’s an example where we used the value of entropy to calculate other observable quantities in a physical system. Since the entropy was the same before and after expansion we could use that fact to calculate the quality of the fluid leaving the turbine, use quality to calculate the enthalpy of the fluid, and use the enthalpy to calculate the work output of the turbine.
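
To tie the steps together, here’s a minimal Python sketch of this turbine calculation. The steam-table values in it are approximate numbers I’ve filled in from standard tables as assumptions (they aren’t quoted in the episode), so treat the outputs, roughly 0.87 for the quality and roughly 2.8 MW for the power, as approximate.

```python
# Sketch of the first turbine example. Steam-table numbers below are
# approximate values from standard tables, filled in as assumptions.

m_dot = 2.0          # kg/s, mass flow rate

# State 1: superheated steam at 6 MPa, 600 C (approximate steam-table values)
h1 = 3658.4          # kJ/kg
s1 = 7.1677          # kJ/kg-K

# State 2: 10 kPa, saturated mixture (approximate steam-table values)
sf, sfg = 0.6493, 7.5009   # kJ/kg-K
hf, hfg = 191.83, 2392.8   # kJ/kg

s2 = s1                        # isentropic expansion
x2 = (s2 - sf) / sfg           # quality of the exiting fluid
h2 = hf + x2 * hfg             # enthalpy of the exiting fluid

power = m_dot * (h1 - h2)      # kW
print(f"quality x2 = {x2:.3f}")           # about 0.87
print(f"turbine power = {power:.0f} kW")  # roughly 2.8 MW
```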

A second example. Superheated steam at 2 MPa and 400 degrees Celsius expands through a turbine to 10 kPa. What’s the maximum possible efficiency from the cycle? Efficiency is work output divided by heat input. We have to input work as well to compress the fluid with the pump so that will subtract from the work output from the turbine. Let’s calculate the work used by the pump first. Pump work is:

wpump = v(P2 - P1)

Where v is the specific volume of liquid water, 0.001 m^3/kg. Plugging in our pressures in kPa: wpump = (0.001)(2000 - 10) = 1.99 kJ/kg.

So there’s our pump work input.

The enthalpy of the water entering the boiler is the enthalpy of saturated liquid at 10 kPa, hf, plus the pump work input. Now we need the heat input. That’s the enthalpy of superheated steam at 2 MPa and 400 degrees Celsius, h1, minus the enthalpy of the water entering the boiler:

qin = h1 - (hf + wpump)

The entropy before and after expansion through the turbine is the entropy of superheated steam at 2 MPa and 400 degrees Celsius, which we can look up in the steam tables.

As in the last example, we can use this to calculate the quality of the steam with the equation s2 = sf + x2 sfg, which rearranges to x2 = (s2 - sf)/sfg.

Looking up sf and sfg at 10 kPa in a steam table and plugging them in gives us the quality. From the quality we can calculate the enthalpy of the expanded fluid, h2 = hf + x2 hfg, and from that the work output of the turbine, wturbine = h1 - h2. (The sketch below fills in the numbers.)

So we have the work input of the pump, the heat input of the boiler, and the work output of the turbine. The maximum possible efficiency is:

efficiency = (wturbine - wpump)/qin

So efficiency is 32.32%.

Again, we used entropy to get quality, quality to get enthalpy, enthalpy to get work, and work to get efficiency. In this example we didn’t even need the mass flow rate of the system. Everything was on a per-kilogram basis. But that was sufficient to calculate efficiency.
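
Here’s a matching sketch for the efficiency example, again with approximate steam-table values filled in as assumptions of mine; it lands on roughly 32%, consistent with the result above.

```python
# Sketch of the Rankine-cycle efficiency example, on a per-kilogram basis.
# Steam-table numbers are approximate values filled in as assumptions.

v_f = 0.001                    # m^3/kg, specific volume of liquid water
p_boiler, p_cond = 2000, 10    # kPa

w_pump = v_f * (p_boiler - p_cond)    # kJ/kg, about 2 kJ/kg

# Saturated state at 10 kPa (approximate steam-table values)
hf, hfg = 191.83, 2392.8       # kJ/kg
sf, sfg = 0.6493, 7.5009       # kJ/kg-K

# Superheated steam at 2 MPa, 400 C (approximate steam-table values)
h1, s1 = 3247.6, 7.1271        # kJ/kg, kJ/kg-K

h_boiler_in = hf + w_pump      # enthalpy entering the boiler
q_in = h1 - h_boiler_in        # heat added in the boiler

x2 = (s1 - sf) / sfg           # quality after isentropic expansion to 10 kPa
h2 = hf + x2 * hfg
w_turbine = h1 - h2

efficiency = (w_turbine - w_pump) / q_in
print(f"efficiency = {efficiency:.1%}")   # about 32%
```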

One last example with steam. The second law of thermodynamics has various forms. One form is that the entropy of the universe can never decrease. It is certainly not the case that entropy can never decrease at all. Entropy decreases all the time within certain systems. In fact, all the remaining examples in this episode will be cases in which entropy decreases within certain systems. But the total entropy of the universe cannot decrease. Any decrease in entropy must have a corresponding increase in entropy somewhere else. It’s easier to see this in terms of an entropy balance.

The entropy change in a system can be negative but the balance of the change in system entropy, entropy in, entropy out, and entropy of the surroundings will never be negative. We can look at the change of entropy of the universe as a function of the entropy change of a system and the entropy change of the system’s surroundings.

So let’s look at an example. Take 2 kg of superheated steam at 400 degrees Celsius and 600 kPa and condense it by pulling heat out of the system. The surroundings have a constant temperature of 25 degrees Celsius. From the steam tables we can look up the entropy of the superheated steam, s1, and of the condensed saturated liquid, s2.

With these values we can calculate the change in entropy inside the system using the following equation:

ΔSsystem = m(s2 - s1)

The entropy decreases inside the system. Nothing wrong with this. Entropy can definitely decrease locally. But what happens in the surroundings? We condensed the steam by pulling heat out of the system and into the surroundings. So there is positive heat flow, Q, out into the surroundings. We can find the change in entropy in the surroundings using the equation:

ΔSsurroundings = Q/T

We know the surroundings have a constant temperature, so we know T. We just need the heat flow Q. We can calculate the heat flow into the surroundings by calculating the heat flow out of the system using the equation

Q = m(h1 - h2)

So we need the enthalpies of the superheated steam, h1, and of the condensed saturated liquid, h2, which we can also look up in the steam tables.

And plugging these in

Q = mΔh = (2)(3270.2 - 670.6) = 5199 kJ

Now that we have Q we can find the change in entropy in the surroundings: ΔSsurroundings = 5199 kJ / 298.15 K = 17.4 kJ/K.

The entropy of the surroundings increases. And the total entropy change of the universe is:

ΔSuniverse = ΔSsystem + ΔSsurroundings

So even though entropy decreases in the system the total entropy change in the universe is positive.
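
And here’s a sketch of the entropy balance. The enthalpies match the Q calculation above; the entropy values are approximate numbers from standard steam tables that I’ve filled in as assumptions.

```python
# Sketch of the entropy-balance example: 2 kg of superheated steam at
# 600 kPa and 400 C condensed to saturated liquid, surroundings at 25 C.

m = 2.0              # kg
T_surr = 298.15      # K, 25 degrees Celsius

# Superheated steam at 600 kPa, 400 C (approximate steam-table values)
h1, s1 = 3270.2, 7.7086     # kJ/kg, kJ/kg-K
# Saturated liquid at 600 kPa (approximate steam-table values)
h2, s2 = 670.6, 1.9312      # kJ/kg, kJ/kg-K

dS_system = m * (s2 - s1)          # kJ/K, negative: entropy leaves the system
Q = m * (h1 - h2)                  # kJ rejected to the surroundings
dS_surroundings = Q / T_surr       # kJ/K, positive

dS_universe = dS_system + dS_surroundings
print(f"dS_system       = {dS_system:.2f} kJ/K")        # about -11.5 kJ/K
print(f"dS_surroundings = {dS_surroundings:.2f} kJ/K")  # about +17.4 kJ/K
print(f"dS_universe     = {dS_universe:.2f} kJ/K")      # positive, as required
```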

I like these examples with steam because they’re very readily calculable. The thermodynamics of steam engines have been extensively studied for over 200 years, with scientists and engineers gathering empirical data. So we have abundant data on entropy values for steam in steam tables. I actually think just flipping through steam tables and looking at the patterns is a good way to get a grasp on the way entropy works. Maybe it’s not something you’d do for light reading on the beach but if you’re ever unable to fall asleep you might give it a try.

With these examples we’ve looked at entropy for a single substance, water, at different temperatures, pressures, and phases, and observed the differences of the value of entropy at these different states. 

To review some general observations:

1. All other things being equal, entropy increases with temperature.

2. All other things being equal, entropy decreases with pressure.

3. Entropy increases with phase changes from solid to liquid to gas.

In the next section we’ll look at entropies for changing substances in chemical reactions.

Applications with Chemical Reactions

The most important equation for the thermodynamics of chemical reactions is the Gibbs Free Energy equation:

ΔG=ΔH-TΔS

Where ΔH is the change in enthalpy, T is the temperature, ΔS is the change in entropy, and ΔG is the change in Gibbs free energy. Gibbs free energy is a thermodynamic potential. It is minimized when a system reaches chemical equilibrium. For a reaction to be spontaneous the value for ΔG has to be negative, meaning that during the reaction the Gibbs free energy is decreasing and moving closer to equilibrium.

We can see from the Gibbs free energy equation

ΔG=ΔH-TΔS

That the value of the change in Gibbs free energy is influenced by both enthalpy and entropy. The change in enthalpy tells us whether a reaction is exothermic (negative ΔH) or endothermic (positive ΔH). Exothermic reactions release heat while endothermic reactions absorb heat. This has to do with the difference between the total chemical bond energies of the reactants and of the products. In exothermic reactions the energy released in forming the new chemical bonds of the products is greater than the energy required to break the chemical bonds of the reactants. The extra energy is released as heat. We can see from the Gibbs free energy equation that exothermic reactions are more thermodynamically favored. Nevertheless, entropy can override enthalpy.

The minus sign in front of the TΔS term tells us that an increase in entropy (a positive ΔS) is thermodynamically favored. This makes sense with what we know about entropy from the second law of thermodynamics and from statistical mechanics. The effect is proportional to temperature. At low temperatures entropy won’t have much influence and enthalpy will dominate. But at higher temperatures entropy will start to dominate and override enthalpic effects. This makes it possible for endothermic reactions to proceed spontaneously. If the increase in entropy for a chemical reaction is large enough and the temperature is high enough, endothermic reactions can proceed spontaneously, even though the energy required to break the chemical bonds of the reactants is more than the energy released in forming the chemical bonds of the products.

Let’s look at an example. The chemical reaction for the production of water vapor from oxygen and hydrogen is:

H2 + ½ O2 → H2O

We can look up the enthalpies and entropies of the reactants and products in chemical reference literature. What we need are the standard enthalpies of formation and the standard molar entropies of each of the components.

The standard enthalpies of formation of oxygen and hydrogen are both 0 kJ/mol. By definition, all elements in their standard states have a standard enthalpy of formation of zero. The standard enthalpy of formation for water vapor is -241.83 kJ/mol. The total change in enthalpy for this reaction is

ΔH = -241.83 - (0 + 0) = -241.83 kJ/mol

It’s negative which means that the reaction is exothermic and enthalpically favored.

The standard molar entropies for hydrogen, oxygen, and water vapor are, respectively, 130.59 J/mol-K, 205.03 J/mol-K, and 188.84 J/mol-K. The total change in entropy for this reaction is

ΔS = 188.84 - (130.59 + ½ x 205.03) = -44.3 J/mol-K

It’s negative so entropy decreases in this reaction, which means the reaction is entropically disfavored. So enthalpy and entropy oppose each other in this reaction. Which one dominates depends on temperature. At 25 degrees Celsius (298 K) the change in Gibbs free energy is

ΔG = -241.83 - (298)(-0.0443) = -228.6 kJ/mol

The reaction is thermodynamically favored. Even though entropy is reduced in this reaction, at this temperature that effect is overwhelmed by the favorable reduction in enthalpy as chemical bond energy of the reactants is released as thermal energy.
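
Here’s a minimal sketch of that arithmetic, using the standard enthalpies of formation and molar entropies quoted above.

```python
# Gibbs free energy for H2 + 1/2 O2 -> H2O (vapor), using the standard
# enthalpies of formation and molar entropies quoted above.

T = 298.0                  # K, about 25 degrees Celsius

dH = -241.83               # kJ/mol: products minus reactants (elements are zero)
dS = 188.84 - (130.59 + 0.5 * 205.03)   # J/mol-K: products minus reactants

dG = dH - T * (dS / 1000.0)   # convert dS to kJ/mol-K before combining
print(f"dS = {dS:.1f} J/mol-K")   # about -44.3: entropically disfavored
print(f"dG = {dG:.1f} kJ/mol")    # about -228.6: still thermodynamically favored
```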

Where’s the tradeoff point where entropy overtakes enthalpy? This is a question commonly addressed in polymer chemistry with what’s called the ceiling temperature. Polymers are macromolecules in which smaller molecular constituents called monomers are consolidated into larger molecules. We can see intuitively that this kind of molecular consolidation constitutes a reduction in entropy. It corresponds with the rough analogy of greater order from “disorder” as disparate parts are assembled into a more organized totality. And that analogy isn’t bad. So in polymer production it’s important to run polymerization reactions at temperatures where exothermic, enthalpic effects dominate. The upper end of this temperature range is the ceiling temperature.

The ceiling temperature is easily calculable from the Gibbs free energy equation for polymerization:

ΔGp = ΔHp - TΔSp

Set ΔGp to zero:

0 = ΔHp - TcΔSp

And solve for Tc:

Tc = ΔHp/ΔSp

At this temperature enthalpic and entropic effects are balanced. Below this temperature polymerization can proceed spontaneously. Above this temperature depolymerization can proceed spontaneously.

Here’s an example using polyethylene. The enthalpy and entropy of polymerization for polyethylene can be looked up in the polymer literature; both are negative, since polymerization is exothermic and reduces entropy.

Using our equation for the ceiling temperature, Tc = ΔHp/ΔSp, we find a ceiling temperature of about 610 degrees Celsius.

So for a polyethylene polymerization reaction you want to run the reaction below 610 degrees Celsius so that the exothermic, enthalpic benefit overcomes your decrease in entropy.
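
If you want to play with the enthalpy-entropy tradeoff yourself, here’s a small helper for the ceiling-temperature formula. The numbers in the example call are placeholders of my own, not the polyethylene values used above, so the printed temperature is hypothetical.

```python
# Ceiling temperature Tc = dHp / dSp, converted to Celsius.
# The example values below are hypothetical placeholders, not the
# polyethylene numbers used in the episode (those give about 610 C).

def ceiling_temperature_c(dH_kj_per_mol, dS_j_per_mol_k):
    """Ceiling temperature in Celsius from polymerization enthalpy and entropy."""
    tc_kelvin = (dH_kj_per_mol * 1000.0) / dS_j_per_mol_k
    return tc_kelvin - 273.15

# Hypothetical monomer: dHp = -100 kJ/mol, dSp = -140 J/mol-K
print(f"Tc = {ceiling_temperature_c(-100.0, -140.0):.0f} C")  # about 441 C
```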

Conclusion

A friend and I used to get together on weekends to take turns playing the piano, sight reading music. We were both pretty good at it and could play songs reasonably well on a first pass, even though we’d never played or seen the music before. One time when someone was watching us she asked, “How do you do that?” My friend had a good explanation I think. He explained it as familiarity with the patterns of music and the piano. When you spend years playing songs and practicing scales you just come to know how things work. Another friend of mine said something similar about watching chess games. He could easily memorize entire games of chess because he knew the kinds of moves that players would tend to make. John Von Neumann once said: “In mathematics you don’t understand things. You just get used to them.” I would change that slightly to say that you understand things by getting used to them. Also true for thermodynamics. Entropy is a complex property and one that’s not easy to understand. But I think it’s easiest to get a grasp on it by using it.

Evolutionary Biology With Molecular Precision

Evolutionary biology benefits from a non-reductionist focus on real biological systems at the macroscopic level of their natural and historical contexts. This high-level approach makes sense since selection pressures operate at the level of phenotypes, the observed physical traits of organisms. Still, it is understood that these traits are inherited in the form of molecular gene sequences, the purview of molecular biology. The approach of molecular biology is more reductionist, focusing at the level of precise molecular structures. Molecular biology thereby benefits from a rigorous standard of evidence-based inference by isolating variables in controlled experiments. But it necessarily sets aside much of the complexity of nature. A combination of these two, in the form of evolutionary biochemistry, targets a functional synthesis of evolutionary biology and molecular biology, using techniques such as ancestral protein reconstruction to physically ‘resurrect’ ancestral proteins with precise molecular structures and to observe their resulting expressed traits experimentally.

I love nerdy comics like XKCD and Saturday Morning Breakfast Cereal (SMBC). For the subject of this episode I think there’s a very appropriate XKCD comic. It shows the conclusion of a research paper that says, “We believe this resolves all remaining questions on this topic. No further research is needed.” And the caption below it says, “Just once, I want to see a research paper with the guts to end this way.” And of course, the joke is that no research paper is going to end this way because further research is always needed. I’m sure this is true in all areas of science but I think in two particular fields it’s especially true. One is neuroscience, where there is still so much that we don’t know. And the other is evolutionary biology. The more I dig into evolutionary biology the more I appreciate how much we don’t understand. And that’s OK. The still expansive frontiers in each of these fields are what make them especially interesting to me. Far from being discouraging, unanswered questions and prodding challenges should be exciting. With this episode I’d like to look at evolutionary biology at its most basic, nuts-and-bolts level: the level of chemistry. This combines the somewhat different approaches of both evolutionary biology and molecular biology.

As described in the summary above, evolutionary biochemistry targets a functional synthesis of the two: it uses techniques such as ancestral protein reconstruction to physically ‘resurrect’ ancestral proteins with precise molecular structures and to observe their resulting expressed traits experimentally. This enables evolutionary science to be more empirical and experimentally grounded.

In what follows I’d like to focus on the work of biologist Joseph Thornton, who is especially known for his lab’s work on ancestral sequence reconstruction. One review paper of his that I’d especially recommend is his 2007 paper, Mechanistic approaches to the study of evolution: the functional synthesis, published in Nature Reviews Genetics and co-authored with Antony Dean.

Before getting to Thornton’s work I should mention that Thornton has been discussed by biochemist Michael Behe, in particular in his fairly recent 2019 book Darwin Devolves: The New Science About DNA That Challenges Evolution. Behe discusses Thornton’s work in the eighth chapter of that book. I won’t delve into the details of the debate between the two of them, simply because that’s its own topic and not what directly interests me here. But I’d just like to comment that I personally find Behe’s work quite instrumentally useful to evolutionary science. He’s perceived as something of a nemesis to evolutionary biology but I think he makes a lot of good points. I could certainly be wrong about this but I suspect that many of the experiments I’ll be going over in this episode were designed and conducted in response to Behe’s challenges to evolutionary biology. Maybe these kinds of experiments wouldn’t have been done otherwise. And if that’s the case Behe has done a great service.

Behe’s major idea is “irreducible complexity”. An irreducibly complex system is “a single system which is composed of several well-matched, interacting parts that contribute to the basic function, and where the removal of any one of the parts causes the system to effectively cease functioning.” (Darwin’s Black Box: The Biochemical Challenge to Evolution) How would such a system evolve by successive small modifications if no less complex a system would function? That’s an interesting question. And I think that experiments designed to answer that question are quite useful.

Behe and I are both Christians and we both believe that God created all things. But we have some theological and philosophical differences. My understanding of the natural and supernatural is heavily influenced by the thought of Thomas Aquinas, such that in my understanding nature is actually sustained and directed by continual divine action. I believe nature, as divine creation, is rationally ordered and intelligible, since it is a product of divine Mind. As such, I expect that we should, at least in principle, be able to understand and see the rational structure inherent in nature. And this includes the rational structure and process of the evolution of life. Our understanding of it may be minuscule. But I think it is comprehensible at least in principle. Especially since it is comprehensible to God. So I’m not worried about a shrinking space for some “god of the gaps”. Still, I think it’s useful for someone to ask probing questions at the edge of our scientific understanding, to poke at our partial explanations and ask, “how exactly?” But, perhaps different from Behe, I expect that we’ll continually be able to answer such questions better and better, even if there will always be a frontier of open questions and problems.

With complete admission that what I’m about to say is unfair, I do think that some popular understanding of evolution lacks a certain degree of rigor and doesn’t adequately account for the physical constraints of biochemistry. Evolution can’t just proceed in any direction to develop any trait to fill any adaptive need, even if there is a selection pressure for a trait that would be nice to have. OK, well that’s why it’s popular rather than academic, right? Like I said, not really fair. Still, let’s aim for rigor, shall we? Behe gets at this issue in his best known 1996 book Darwin’s Black Box: The Biochemical Challenge to Evolution. In one passage  he comments on what he calls the “fertile imaginations” of evolutionary biologists:

“Given a starting point, they almost always can spin a story to get to any biological structure you wish. The talent can be valuable, but it is a two edged sword. Although they might think of possible evolutionary routes other people overlook, they also tend to ignore details and roadblocks that would trip up their scenarios. Science, however, cannot ultimately ignore relevant details, and at the molecular level all the ‘details’ become critical. If a molecular nut or bolt is missing, then the whole system can crash. Because the cilium is irreducibly complex, no direct, gradual route leads to its production. So an evolutionary story for the cilium must envision a circuitous route, perhaps adapting parts that were originally used for other purposes… Intriguing as this scenario may sound, though, critical details are overlooked. The question we must ask of this indirect scenario is one for which many evolutionary biologists have little patience: but how exactly?”

“How exactly?” I actually think that’s a great question. And I’d say Joseph Thornton has made the same point to his fellow biologists, maybe even in response to Behe. In the conclusion of their 2007 paper he and Antony Dean had this wonderful passage:

“Functional tests should become routine in studies of molecular evolution. Statistical inferences from sequence data will remain important, but they should be treated as a starting point, not the centrepiece or end of analysis as in the old paradigm. In our opinion, it is now incumbent on evolutionary biologists to experimentally test their statistically generated hypotheses before making strong claims about selection or other evolutionary forces. With the advent of new capacities, the standards of evidence in the field must change accordingly. To meet this standard, evolutionary biologists will need to be trained in molecular biology and be prepared to establish relevant collaborations across disciplines.”

Preach it! That’s good stuff. One of the things I like about the conclusion to their paper is that it talks about all the work that still needs to be done. It’s a call to action (reform?) to the field of evolutionary biology. 

Behe has correctly pointed out that their research doesn’t yet answer many important questions and doesn’t reduce the “irreducible complexity”. True, but it’s moving in the right direction. No one is going to publish a research paper like the one in the XKCD comic that says, “We believe this resolves all remaining questions on this topic. No further research is needed.” Nature and evolution are extremely complex. And I think it’s great that Thornton and his colleagues call for further innovations. For example, I really like this one:

“A key challenge for the functional synthesis is to thoroughly connect changes in molecular function to organismal phenotype and fitness. Ideally, results obtained in vitro should be verified in vivo. Transgenic evolutionary studies identifying the functional impact of historical mutations have been conducted in microbes and a few model plant and animal species, but an expanded repertoire of models will be required to reach this goal for other taxa. By integrating the functional synthesis with advances in developmental genetics and neurobiology, this approach has the potential to yield important insights into the evolution of development, behaviour and physiology. Experimental studies of natural selection in the laboratory can also be enriched by functional approaches to characterize the specific genetic changes that underlie the evolution of adaptive phenotypes.”

For sure. That’s exactly the kind of work that needs to be done. And it’s the kind of work Behe has challenged evolutionary biologists to do. I think that’s great. Granted, that kind of work is going to be very difficult and take a long time. But that’s a good target. And we should acknowledge the progress that has been made. For example, earlier in the paper they note:

“The Reverend William Paley famously argued that, just as the intricate complexity of a watch implies a design by a watchmaker, so complexity in Nature implies design by God. Evolutionary biologists have typically responded to this challenge by sketching scenarios by which complex biological systems might have evolved through a series of functional intermediates. Thornton and co-workers have gone much further: they have pried open the historical and molecular ‘black box’ to reconstruct in detail — and with strong empirical support — the history by which a tightly integrated system evolved at the levels of sequence, structure and function.”

Yes. That’s a big improvement. It’s one thing to speculate, “Well, you know, maybe this, that, and the other” (again, being somewhat unfair, sorry). But it’s another thing to actually reconstruct ancestral sequences and run experiments with them. That’s moving things to a new level. And I’ll just mention in passing that I do in fact think that all the complexity in Nature was designed by God. And I don’t think that reconstructing that process scientifically does anything to reduce the grandeur of that. If anything, such scientific understanding facilitates what Carl Sagan once called “informed worship” (The Varieties of Scientific Experience: A Personal View of the Search for God). 

With all that out of the way now, let’s focus on Thornton’s very interesting work in evolutionary biochemistry.

First, a very quick primer on molecular biology. The basic process of molecular biology is that DNA makes RNA, and RNA makes proteins. Living organisms are made of proteins. DNA is the molecule that contains the information needed to make the proteins. And RNA is the molecule that takes the information from DNA to actually make the proteins. The process of making RNA from DNA is called transcription. And the process of making proteins from RNA is called translation. These are very complex and fascinating processes. Evolution proceeds through changes to the DNA molecule called mutations. And some changes to DNA result in changes to the composition and structure of proteins. These changes can have macroscopically observable effects.

In Thornton’s work with ancestral sequence reconstruction the idea is to look at a protein as it is in an existing organism, try to figure out what that protein might have been like in an earlier stage of evolution, and then to make it. Reconstruct it. By actually making the protein you can look at its properties. As described in the 2007 review article:

“Molecular biology provides experimental means to test these hypotheses decisively. Gene synthesis allows ancestral sequences, which can be inferred using phylogenetic methods, to be physically ‘resurrected’, expressed and functionally characterized. Using directed mutagenesis, historical mutations of putative importance are introduced into extant or ancestral sequences. The effects of these mutations are then assessed, singly and in combination, using functional molecular assays. Crystallographic studies of engineered proteins — resurrected and/or mutagenized — allow determination of the structural mechanisms by which amino-acid replacements produce functional shifts. Transgenic techniques permit the effect of specific mutations on whole-organism phenotypes to be studied experimentally. Finally, competition between genetically engineered organisms in defined environments allows the fitness effects of specific mutations to be assessed and hypotheses about the role of natural selection in molecular evolution to be decisively tested.”

What’s great about this kind of technique is that it spans a number of levels of ontology. Evolution by natural selection acts on whole-organism phenotypes. So it’s critical to understand what these look like between all the different versions of a protein. We don’t just want to know that we can make all these different kinds of proteins. We want to know what they do, how they function. Function is a higher-level ontology. But we also want to be precise about what is there physically. And we have that as well, down to the molecular level. Atom for atom we know exactly what these proteins are.

To dig deeper into these experimental methods I’d like to refer to another paper, Evolutionary biochemistry: revealing the historical and physical causes of protein properties, published in Nature Reviews Genetics in 2013 by Michael Harms and Joseph Thornton. In this paper the authors lay out three strategies for studying the evolutionary trajectories of proteins.

The first strategy is to explicitly reconstruct “the historical trajectory that a protein or group of proteins took during evolution.”

“For proteins that evolved new functions or properties very recently, population genetic analyses can identify which genotypes and phenotypes are ancestral and which are derived. For more ancient divergences, ancestral protein reconstruction (APR) uses phylogenetic techniques to reconstruct statistical approximations of ancestral proteins computationally, which are then physically synthesized and experimentally studied… Genes that encode the inferred ancestral sequences can then be synthesized and expressed in cultured cells; this approach allows for the structure, function and biophysical properties of each ‘resurrected’ protein to be experimentally characterized… By characterizing ancestral proteins at multiple nodes on a phylogeny, the evolutionary interval during which major shifts in those properties occurred can be identified. Sequence substitutions that occurred during that interval can then be introduced singly and in combination into ancestral backgrounds, allowing the effects of historical mutations on protein structure, function and physical properties to be determined directly.”

This first strategy is a kind of top-down, highly directed approach. We’re trying to follow exactly the path that evolution followed and only that path to see what it looks like.

The second strategy is more bottom-up. It is “to use directed evolution to drive a functional transition of interest in the laboratory and then study the mechanisms of evolution.” The goal is not primarily to follow the exact same path that evolution followed historically but rather to stimulate evolution, selecting for a target property, to see what path it follows. 

“A library of random variants of a protein of interest is generated and then screened to recover those with a desired property. Selected variants are iteratively re-mutagenized and are subject to selection to optimize the property. Causal mutations and their mechanisms can then be identified by characterizing the sequences and functions of the intermediate states realized during evolution of the protein.”

If the first strategy is top-down and the second strategy is bottom-up, the third strategy is to cast a wide net. “Rather than reconstructing what evolution did in the past, this strategy aims to reveal what it could do.” In this approach:

“An initial protein is subjected to random mutagenesis, and weak selection for a property of interest is applied, enriching the library for clones with the property and depleting those without it. The population is then sequenced; the degree of enrichment of each clone allows the direct and epistatic effects of each mutation on the function to be quantitatively characterized.”

Let’s look at an example from Thornton’s work, which followed the first, top-down approach. The most prominent work so far has been on the evolution of glucocorticoid receptors (GRs) and mineralocorticoid receptors (MRs). See for example the 2006 paper Evolution of Hormone-Receptor Complexity by Molecular Exploitation, published in Science by Jamie Bridgham, Sean Carroll, and Joseph Thornton.

Glucocorticoid receptors and mineralocorticoid receptors bind with glucocorticoid and mineralocorticoid steroid hormones. The two steroid hormones studied in Thornton’s work are cortisol and aldosterone. Cortisol activates the glucocorticoid receptor to regulate metabolism, inflammation, and immunity. Aldosterone activates the mineralocorticoid receptor to regulate electrolyte homeostasis of plasma sodium and potassium levels. Glucocorticoid receptors and mineralocorticoid receptors share common origin and Thornton’s work was to reconstruct ancestral versions of these proteins along their evolutionary path and test their properties experimentally.

Modern mineralocorticoid receptors can be activated by both aldosterone and cortisol but modern glucocorticoid receptors are activated only by cortisol in bony vertebrates. So in their evolution GRs developed an insensitivity to aldosterone.

The evolutionary trajectory is as follows. There are versions of MR and GR extant in tetrapods, teleosts (fish), and elasmobranchs (sharks). GRs and MRs trace back to a common protein from 450 million years ago, the ancestral corticoid receptor (AncCR). The ancestral corticoid receptor is thought to have been activated by deoxycorticosterone (DOC), the ligand for MRs in extant fish.

Phylogeny tells us that the ancestral corticoid receptor gave rise to GR and MR in a gene-duplication event. Interestingly enough this was before aldosterone had even evolved. In tetrapods and teleosts, modern GR is only sensitive to cortisol; it is insensitive to aldosterone.

Thornton and his team reconstructed the ancestral corticoid receptor (AncCR) and found that it is sensitive to DOC, cortisol, and aldosterone. Phylogenetic analysis revealed that precisely two mutations, amino acid substitutions, resulted in the glucocorticoid receptor phenotype: aldosterone insensitivity and cortisol sensitivity. These amino acid substitutions are S106P, from serine to proline at site 106, and L111Q, from leucine to glutamine at site 111. Thornton synthesized these different proteins to observe their properties. The protein with just the L111Q mutation did not bind to any of the ligands: DOC, cortisol, or aldosterone. So it is unlikely that the L111Q mutation would have occurred first. The S106P mutation reduces aldosterone and cortisol sensitivity but it remains highly DOC-sensitive. With both the S106P and L111Q mutations in series aldosterone sensitivity is reduced even further but cortisol sensitivity is restored to levels characteristic of extant GRs. A mutational path beginning with S106P followed by L111Q thus converts the ancestor to the modern GR phenotype by functional intermediate steps and is the most likely evolutionary scenario.

Michael Behe has commented that this is an example of a loss of function whereas his challenge to evolutionary biology is to demonstrate how complex structures evolved in the first place. That’s a fair point. Still, this is a good example of the kind of molecular precision we can get in our reconstruction of evolutionary processes. This does seem to show, down to the molecular level, how these receptors evolved. And that increases our knowledge. We know more about the evolution of these proteins than we did before. That’s valuable. We can learn a lot more in the future using these methods and applying them to other examples. 

One of the things I like about this kind of research is that it not only shows what evolutionary paths are possible but also which ones are not. Another one of Thornton’s papers worth checking out is An epistatic ratchet constrains the direction of glucocorticoid receptor evolution, published in Nature in 2009, co-authored by Jamie Bridgham and Eric Ortlund. The basic idea is that in certain cases once a protein acquires a new function “the evolutionary path by which this protein acquired its new function soon became inaccessible to reverse exploration”. In other words, certain evolutionary processes are not reversible. This is similar to Dollo’s Law of Irreversibility, proposed in 1893: “an organism never returns exactly to a former state, even if it finds itself placed in conditions of existence identical to those in which it has previously lived … it always keeps some trace of the intermediate stages through which it has passed.” In that 2009 paper Bridgham, Ortlund, and Thornton state: “We predict that future investigations, like ours, will support a molecular version of Dollo’s law: as evolution proceeds, shifts in protein structure-function relations become increasingly difficult to reverse whenever those shifts have complex architectures, such as requiring conformational changes or epistatically interacting substitutions.”

This is really important. It’s important to understand that evolution can’t just do anything. Nature imposes constraints both physiologically and biochemically. I think in some popular conceptions we imagine that “life finds a way” and that evolution is so robust that organisms will evolve whatever traits they need to fit their environments. But very often they don’t, and they go extinct. And even when they do, their evolved traits aren’t necessarily perfect. Necessity or utility can’t push evolution beyond natural constraints. A good book on the subject of physiological constraints on evolution is Alex Bezzerides’s 2021 book Evolution Gone Wrong: The Curious Reasons Why Our Bodies Work (Or Don’t). Our anatomy doesn’t always make the most sense. It’s possible to imagine more efficient ways we could be put together. But our evolutionary history imposes constraints that don’t leave all options open, no matter how advantageous they would be. And the same goes for biochemistry. The repertoire of proteins and nucleic acids in the living world is determined by evolution. But the properties of proteins and nucleic acids are determined by the laws of physics and chemistry.

One way to think about this is with a protein sequence space. This is an abstract multidimensional space. Michael Harms and Joseph Thornton describe this in their 2013 paper.

“Sequence space is a spatial representation of all possible amino acid sequences and the mutational connections between them. Each sequence is a node, and each node is connected by edges to all neighbouring proteins that differ from it by just one amino acid. This space of sequences becomes a genotype–phenotype space when each node is assigned information about its functional or physical properties; this representation serves as a map of the total set of relations between sequence and those properties. As proteins evolve, they follow trajectories along edges through the genotype–phenotype space.”

What’s crucial to consider in this kind of model is that most nodes are non-functional states. This means that possible paths through sequence space will be highly constrained. Not just any path is possible. There may be some excellent nodes in the sequence space that would be perfect for a given environment. But if they’re not connected to an existing node via a path through functional states they’re not going to occur through evolution.
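
To make the idea of constrained paths through sequence space concrete, here’s a toy Python sketch. The alphabet, the sequences, and the “functional” set are hypothetical illustrations, not data from the papers discussed; the point is just that an “excellent” node can be unreachable if no path of functional single-step neighbors leads to it.

```python
from collections import deque

AMINO_ACIDS = "ACDE"  # toy alphabet; real proteins use 20 amino acids

def neighbors(seq):
    """All sequences that differ from seq at exactly one position (one edge away)."""
    for i, aa in enumerate(seq):
        for alt in AMINO_ACIDS:
            if alt != aa:
                yield seq[:i] + alt + seq[i+1:]

def accessible(start, target, functional):
    """Can target be reached from start by single steps that stay functional?"""
    seen, queue = {start}, deque([start])
    while queue:
        seq = queue.popleft()
        if seq == target:
            return True
        for nxt in neighbors(seq):
            if nxt in functional and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical toy landscape: only these length-3 sequences are functional.
functional = {"AAA", "AAC", "ACC", "CCC", "EEE"}
print(accessible("AAA", "CCC", functional))  # True: a path of functional intermediates exists
print(accessible("AAA", "EEE", functional))  # False: EEE is isolated in sequence space
```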

To conclude, it’s an exciting time for the evolutionary sciences. Compared with a century ago, our understanding of the actual physical mechanisms of inheritance and evolution, down to the molecular level, is leaps and bounds ahead of where it was. Darwin and his associates had no way of knowing the kinds of things we know now about the structures of nucleic acids and proteins. This makes a big difference. It’s certainly not the case that we have it all figured out. That’s why I put evolutionary biology in the same class as neuroscience when it comes to what we understand compared to how much there is to understand. We’re learning more and more all the time just how much we don’t know. But that’s still progress. We are developing the tools to get very precise and detailed in what we can learn about evolution.

Ontological Pluralism

Jared and Todd talk about ontological pluralism: What exists? How do we categorize what exists? Are those categories intrinsic or man-made? A related idea is perspectival realism. We discuss the ideas of William Wimsatt and Scott Page, among others. Is reality monistic, dualistic, or pluralistic? Is the question even meaningful? And what (if any) practical implications would there be?

Outline – Ontological Pluralism

  1. People
    1. William Wimsatt
    2. Scott Page  
    3. Johannes Jaeger
    4. Lawrence Cahoone
    5. Spencer Greenberg 
  2. Ideas
    1. Rainforest Ontology (Wimsatt)
    2. Realms of Truth (Greenberg)
    3. Perspectival realism
      1. Meta-modernism (post-postmodernism)
  3. Ontologies
    1. Monism
    2. Dualism
      1. Matter
      2. Mind
    3. Trialism (Penrose)
      1. Physical world
      2. Mental world
      3. Platonic mathematical world
    4. Pluralism
  4. Reductionism
    1. Ontological: reality is composed of a minimum number of kinds of entities and substances
    2. Epistemological: reality is best explained by reduction to its most basic kinds of entities and substances
    3. Todd: in-principle epistemological reductionist but not an ontological reductionist. Everything that happens in a physical system evolves according to physical laws but those physical processes don’t constitute all there is.
    4. Can a macro-scale entity really be completely inexplicable in terms of micro-scale entities?
    5. Micro-scale events may only make sense in terms of macro-scale events.
      1. Ex: Enzymes and reactants
        1. Enzyme is larger and more complex than the reactants
        2. The speed of the reaction only makes sense by accounting for the enzyme
        3. But the enzyme is still explained in terms of smaller-scale entities (amino acids, atoms, etc.)
  5. Seven Realms of Truth – Spencer Greenberg
    1. Some things “exist” in the sense that they are in physical reality, like atoms (in “Matter Space”).
    2. Other things may “exist” in the sense that they are real experiences conscious beings have, like the taste of pineapple (in “Experience Space”).
    3. Still, other things may “exist” in the sense that they are shared constructs across multiple minds, like the value of money (in “Consensus Space”).
    4. Other things may “exist” in the sense of being conclusions derived from frameworks or sets of premises, like consequences of economic theories (in “Theory Space”).
    5. Some may “exist” in the sense that they are represented in systems that store or process information, such as the information in a database (in “Representation Space”).
    6. If universal moral truths “exist” (e.g. objective facts about what is right and wrong), then we can talk about moral rules existing (in “Morality Space”).
    7. Finally, if supernatural entities “exist”, such as spirits (meaning that not all beings inhabit Matter Space), then these beings are in a different realm than us (in “Supernatural Space”).
  6. Tropical Rainforest Ontology (Wimsatt)
    1. Contra Quine
      1. Willard van Orman Quine once said that he had a preference for a desert ontology.
    2. Robustness
      1. Criterion for what is real
      2. “Things are robust if they are accessible (detectable, measurable, derivable, definable, produceable, or the like) in a variety of independent ways.”
      3. Local
        1. Criteria used by working scientists
        2. “The nitty-gritty details of actual theory, actual inferences from actual data, the actual conditions under which we poised and detected entities, calibrated and ‘burned in’ instruments, identified and rejected artifacts, debugged programs and procedures, explained the mechanisms behind regularities, judged correlations to be spurious, and in general, the real complexities and richness of actual scientific practice.”
    3. Levels
      1. Dissipative wave (pro-reductionistic)
      2. Sharpening wave (pro-holistic)
    4. Perspectives
      1. “As long as there are well-defined levels of organization, there are relatively unambiguous inclusion or compositional relations relating all of the things described at different levels of organization… But conversely, when neat compositional relations break down, levels become less useful as ways of characterizing the organization of systems–or at least less useful if they are asked to handle the task alone. At this point, other ontological structures enter, either as additional tools, or as a replacement. These are what I have called perspectives–intriguingly quasi-subjective (or at least observer, technique or technology-relative) cuts on the phenomena characteristic of a system, which needn’t be bound to given levels.”
      2. “What I am calling perspectives is probably a diverse category of things which nonetheless appear to have at least some of the properties of being ‘from a point of view’ or to have a subjective or quasi-subjective character.”
    5. Causal Thickets
      1. “This term is intended to indicate a situation of disorder and boundary ambiguities. Perspectives may still seem to have an organizing power (just as viewing a thicket or shrub from different sides will reveal a shape to its bushy confusion), but there will be too many boundary disputes.”

Causal and Emergent Models

Models are critical tools that enable us to think about, qualify, and quantify features of many processes. And as with any kind of tool, different kinds of models are better suited to different circumstances. Here we look at two kinds of models for understanding transport phenomena: causal and emergent models. In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge.

For the video version of this episode, which includes some visual aids, see it on YouTube.

Since my university studies I’ve been fascinated by the ways we use models to understand and even make quantitative descriptions and predictions about the world. I don’t remember when exactly, but at some point I really began to appreciate how the pictures of chemical and physical processes I had in my head were not the way things “really” were (exactly) but were useful models for thinking about things and solving problems.

Conceptual models in science, engineering, economics, etc. are similar to toy models like model cars or model airplanes in that they aren’t the things themselves but have enough in common with the things they are modeling to still perform in similar ways. As long as a model enables you to get the information and understanding you need it is useful, at least for the scale and circumstances you’re interested in. Models are ubiquitous in the sciences and one of the major activities in the sciences is to improve models, generate new models, and create more models to apply to more conditions.

Something to bear in mind when working with a model is the set of conditions in which it works well. That’s important because a model may work very well under a certain set of conditions but then break down outside those conditions. Outside those conditions it may give less accurate results or just not describe well qualitatively what’s going on in the system we’re trying to understand. This could be something like being outside a temperature or pressure range, extremes in velocity or gravitational field strength, etc. And often it’s a matter of geometric scale, like whether we’re dealing in meters or nanometers. The world looks different at the microscopic and molecular scale than at the macroscopic scale of daily life.

I’m really a pluralist when it comes to models. I’m in favor of several types to meet the tasks at hand. Is a classical, Newtonian model for gravity superior to a relativistic model for gravity? I don’t think so. Yeah, a Newtonian model breaks down under certain conditions. But it’s much easier and more intuitive to work with under most conditions. It doesn’t make sense to just throw away a Newtonian model after relativity. And we don’t. We can’t. It would be absurdly impractical. And practicality is a major virtue of models. That’s not to say there’s no such thing as better or worse models. A Newtonian model of planetary motion is better than a Ptolemaic one because it’s both more accurate and simpler to understand. So I don’t embrace pluralism without standards of evaluation. I suppose there’d be an infinite number of really bad models in the set of all possible models. Even so, there are still multiple models that do work well, that overlap and cover similar systems.

I studied chemical engineering at university and one of my textbooks was Transport Phenomena by Bird, Stewart, and Lightfoot, sort of a holy trinity of the discipline. Transport phenomena covers fluids, heat, and diffusion, which all share many features and whose models share a very similar structure. One of the ideas I liked in that book is its systematic study of processes at three scales: macroscopic, microscopic, and molecular. I’ll quote the book’s explanations of these different scales.

“At the macroscopic level we write down a set of equations called the ‘macroscopic balances,’ which describe how the mass, momentum, energy, and angular momentum in the system change because of the introduction and removal of these entities via the entering and leaving streams, and because of various other inputs to the system from the surroundings. No attempt is made to understand all the details of the system.”

“At the microscopic level we examine what is happening to the fluid mixture in a small region within the equipment. We write down a set of equations called the ‘equations of change,’ which describe how the mass, momentum, energy, and angular momentum change within this small region. The aim here is to get information about velocity, temperature, pressure, and concentration profiles within the system. This more detailed information may be required for the understanding of some processes.”

“At the molecular level we seek a fundamental understanding of the mechanisms of mass, momentum, energy, and angular momentum transport in terms of molecular structure and intermolecular forces. Generally this is the realm of the theoretical physicist or physical chemist, but occasionally engineers and applied scientists have to get involved at this level.”

I came across an interesting paper recently from a 2002 engineering education conference titled How Chemical Engineering Seniors Think about Mechanisms of Momentum Transport by Ronald L. Miller, Ruth A. Streveler, and Barbara M. Olds. It caught my attention since I’ve been a chemical engineering senior, so I wanted to see how it compared to my experience. And it tracked my experience pretty well, actually. Their idea is that one of the things that starts to click for seniors in their studies, something that often hadn’t clicked before, is a conceptual understanding of many fundamental molecular-level and atomic-level phenomena including heat, light, diffusion, chemical reactions, and electricity. I’ll refer mostly to the examples from this paper by Miller, Streveler, and Olds but I’ll mention that they base much of their presentation on the work of Michelene Chi, who is a cognitive and learning scientist. In particular they refer to her work on causal versus emergent conceptual models for these physical processes. Her paper on this is titled Misconceived Causal Explanations for Emergent Processes. Miller, Streveler, and Olds propose that chemical engineering students start out using causal models to understand many of these processes but then move to more advanced, emergent models later in their studies.

In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an elastic collision for instance, a moving object collides with a previously stationary object and transfers its momentum to it. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge. Electricity, fluid flow, heat transfer and molecular equilibrium are examples of emergent processes. Miller, Streveler, and Olds correlate causal and emergent explanations with macroscopic and molecular models respectively. As Bird, Stewart, and Lightfoot had said in their descriptions of their three scales, it’s at the molecular level that “we seek a fundamental understanding of the mechanisms.” But at the macroscopic scales we aren’t looking at so fundamental an explanation.      

Miller, Streveler, and Olds use diffusion, i.e. mass transport, as an example to show the difference between causal and emergent explanations. Say we have a glass of water and we add a drop of color dye to it. The water is a solvent and the color dye is a solute. This color dye solute starts to diffuse, or spread, into the water solvent and we can explain this diffusion process in both causal and emergent ways; or we could also say in macroscopic and molecular ways.

First, a quick overview of diffusion. The mathematical model for diffusion is Fick’s Law of Diffusion. The equation for this is:       

J = -D(dC/dx)

Where,
J is the diffusion flux
C is concentration
x is position
D is diffusivity, the applicable constant of proportionality in this case

The basic logic of this equation is that the diffusion of a solute is proportional to the gradient of the concentration of that solute in a solvent. If the solute is evenly distributed in the solution the concentration is the same everywhere in the solution, so there is no concentration gradient and no diffusion. But there is a gradient if the solute concentration is different at different positions in the space, for example, if it is highly concentrated at one point and less concentrated as you move away from that point. The diffusion flux is proportional to the steepness of that decrease, that gradient. If a drop of dye has just been placed in a glass of water the flux of diffusion is going to be very high at the boundary between that drop and the surrounding water because there is a huge difference in the concentration of the dye there.
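
To make that logic a bit more concrete, here is a minimal sketch in Python (with NumPy) that evaluates Fick’s law numerically for a one-dimensional concentration profile. The diffusivity and the shape of the profile are illustrative values I’ve chosen, not data from any real system; the point is just that the computed flux is largest where the profile is steepest and zero where the concentration is flat.

import numpy as np

# Fick's law in one dimension: J = -D * dC/dx
# Illustrative values only; D is a typical order of magnitude for a small solute in water.
D = 1.0e-9                                  # diffusivity, m^2/s
x = np.linspace(0.0, 0.01, 101)             # positions across 1 cm, in meters
C = np.exp(-((x - 0.005) ** 2) / 1.0e-6)    # a concentration "bump" centered in the domain (arbitrary units)

dC_dx = np.gradient(C, x)                   # numerical concentration gradient
J = -D * dC_dx                              # diffusion flux at each position

print("Flux is largest in magnitude at x =", x[np.argmax(np.abs(J))], "m")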

So that’s the logic of Fick’s Law of Diffusion. But why does this happen? And here we can look at the two different kinds of explanations, causal and emergent explanations.         

Here are a few examples of both:

Causal Explanation: “Dye molecules move towards water molecules.”
Emergent Explanation: “All molecules exercise Brownian motion.”

Causal Explanation: “Dye molecules flow from areas of high concentration to areas of low concentration.”
Emergent Explanation: “All molecules move at the same time.”

Causal Explanation: “Dye molecules are ‘pushed’ into the water by other dye molecules.”
Emergent Explanation: “Molecules collide independently of prior collisions. What happens to one molecule doesn’t affect interactions with other molecules.”

Causal Explanation: “Dye molecules want to mix with water molecules.”
Emergent Explanation: “The local conditions around each molecule affect where it moves and at what velocity.”

Causal Explanation: “Dye molecules stop moving when dye and water become mixed.”
Emergent Explanation: “Molecular interactions continue when equilibrium is reached.”

This gives something of a flavor of the two different kinds of explanations. Causal explanations have more of a top-down approach, looking for the big forces that make things happen, and may even speak in metaphorical terms of volition, like what a molecule “wants” to do. Emergent explanations have more of a bottom-up approach, looking at all the things going on independently in a system and how that results in the patterns we observe.

I remember Brownian motion being something that really started pushing me to think of diffusion in a more emergent way. Brownian motion is the random motion of particles suspended in a medium, like a liquid or a gas. If you just set a glass of water on a table it may look stationary, but at the molecular scale there’s still a lot of movement. The water molecules are moving around in random directions. If you add a drop of color dye to the water the molecules in the dye also have Brownian motion, with all those molecules moving in random directions. So what’s going to happen in this situation? Well, things aren’t just going to stay put. The water molecules are going to keep moving around in random directions and the dye molecules are going to keep moving around in random directions. What kind of pattern should we expect to see emerge from this?

Let’s imagine imposing a three-dimensional grid onto this space, dividing the glass up into cube volumes or voxels. Far away from the drop of dye, water molecules will still be moving around randomly between voxels but those voxels will continue to look about the same. Looking at the space around the dye, voxels in the middle of the drop will be all dye. Voxels on the boundary will have some dye molecules and some water molecules. And voxels with a lot of dye molecules will be next to voxels with few dye molecules. As water molecules and dye molecules continue their random motion we’re going to see the most state changes in the voxels that are different from each other. Dye molecules near a voxel with mostly water molecules can very likely move into one of those voxels and change its state from one with few or no dye molecules to one with some or more dye molecules. And the biggest state changes will occur in regions where voxels near to each other are most different, just because they can be so easily (albeit randomly) changed.

This is a very different way of looking at the process of diffusion. Rather than there being some rule imposed from above, telling dye molecules that they should move from areas of high concentration to areas of low concentration, all these molecules are moving around randomly. And over time areas with sharp differences tend to even out, just by random motion. From above and from a distance this even looks well-ordered and like it could be directed. The random motion of all the components results in an emergent macro-level pattern that can be modeled and predicted by a fairly simple mathematical expression. The movement of each individual molecule is random and unpredictable, but the resulting behavior of the system, the aggregate of all those random motions, is ordered and highly predictable. I just think that’s quite elegant!
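
Here is a toy version of that picture as a quick Python sketch, under my own simplifying assumptions: a one-dimensional “glass” divided into bins, all of the dye particles starting in the middle bin, and each particle taking independent random steps. Nothing in the code tells particles to move down a concentration gradient, yet the initially sharp peak in the histogram flattens out, which is the emergent pattern Fick’s law describes.

import numpy as np

rng = np.random.default_rng(0)

# Toy emergent model of diffusion: independent random walkers in a 1D "glass".
# Particle count, bin count, and step count are arbitrary illustrative choices.
n_particles = 10_000
n_bins = 50
n_steps = 2_000

positions = np.full(n_particles, n_bins // 2)            # all dye starts in the center bin

for _ in range(n_steps):
    steps = rng.choice([-1, 1], size=n_particles)        # each particle steps left or right at random
    positions = np.clip(positions + steps, 0, n_bins - 1)    # keep particles inside the glass

counts, _ = np.histogram(positions, bins=np.arange(n_bins + 1))
print(counts)   # the sharp initial peak has spread out across the bins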

Miller, Streveler, and Olds give another example that neatly illustrates different ways of understanding a physical process at the three different scales: macroscopic, microscopic, and molecular. Their second example is of momentum transport. An example of momentum transport is pumping a fluid through a pipe. As a brief overview, when a fluid like water is moved through a pipe under pressure the velocity of the fluid is highest at the center of the pipe and lowest near the walls. This is a velocity gradient, often called a “velocity profile”: picture a cross-sectional view of the pipe showing velocity vectors of different magnitudes at different positions along the radius. When you have this velocity gradient there is also a transfer of momentum from areas of high momentum to areas of low momentum. So in this case momentum will transfer from the center of the pipe toward the walls of the pipe.
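
As a rough picture of that profile, here is a short Python sketch of the parabolic velocity profile you get for fully developed laminar flow in a pipe, v(r) = v_max(1 - (r/R)^2). The radius and centerline velocity are made-up illustrative numbers; what matters is the shape: fastest at the centerline, zero at the wall.

import numpy as np

# Parabolic (laminar) velocity profile: v(r) = v_max * (1 - (r/R)^2)
# R and v_max are illustrative values, not taken from the paper.
R = 0.05                        # pipe radius, m
v_max = 2.0                     # centerline velocity, m/s
r = np.linspace(0.0, R, 11)     # radial positions from centerline to wall

v = v_max * (1.0 - (r / R) ** 2)    # highest at the center, zero at the wall (no-slip)

for ri, vi in zip(r, v):
    print(f"r = {ri:.3f} m  ->  v = {vi:.2f} m/s")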

The model for momentum transport has a similar structure to the model for mass transport. Recall that in Fick’s Law of Diffusion, mass transport, i.e. diffusion, was proportional to the concentration gradient and the constant of proportionality was this property called diffusivity. The equation was:

J = -D(dC/dx)

The corresponding model for momentum transport is Newton’s law of viscosity (Newton had a lot of laws). The equation for that is:

τ = -μ(dv/dx)

Where

τ is shear stress, the flux of momentum transport
v is velocity
x is position
μ is viscosity, the applicable constant of proportionality in this case

So in Newton’s law of viscosity the momentum transport, i.e. shear stress, is proportional to the velocity gradient and the constant of proportionality is viscosity. You have higher momentum transport with a higher gradient, i.e. change, in velocity along the radius of the pipe. Why does that happen?
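
Continuing the earlier sketch, here is Newton’s law of viscosity applied in Python to that same parabolic profile. The viscosity is roughly the order of magnitude of room-temperature water, but again these are illustrative numbers. The shear stress, the momentum flux, comes out to zero at the centerline and largest at the wall, where the velocity changes most steeply with radius.

import numpy as np

# Newton's law of viscosity: tau = -mu * dv/dr, applied to a parabolic laminar profile.
# Illustrative values; mu is roughly the order of magnitude of water at room temperature.
mu = 1.0e-3                     # viscosity, Pa·s
R = 0.05                        # pipe radius, m
v_max = 2.0                     # centerline velocity, m/s
r = np.linspace(0.0, R, 11)

v = v_max * (1.0 - (r / R) ** 2)
dv_dr = np.gradient(v, r)       # numerical velocity gradient along the radius
tau = -mu * dv_dr               # momentum flux: zero at the centerline, largest at the wall

print(tau)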

So they actually asked some students to explain this in their own words to see on what geometric scales they would make their descriptions. The prompt was: “Explain in your own words (no equations) how momentum is transferred through a fluid via viscous action.” And they evaluated each description as being at one of the three scales (or a mixture of them) using a rubric. So here are examples from the rubric of explanations at each of those scales:

Macroscopic explanation: The pressure at the pipe inlet is increased (usually by pumping) which causes the fluid to move through the pipe. Friction between fluid and pipe wall results in a pressure drop in the direction of flow along the pipe length. The fluid at the wall does not move (no-slip condition) while fluid furthest away from the wall (at the pipe centerline) flows the fastest, so momentum is transferred from the center (high velocity and high momentum) to the wall (no velocity and no momentum).

Microscopic explanation: Fluid in laminar flow moves as a result of an overall pressure drop causing a velocity profile to develop (no velocity at the wall, maximum velocity at the pipe centerline). Therefore, at each pipe radius, layers of fluid flow past each other at different velocities. Faster flowing layers tend to speed up [and move] slower layers along resulting in momentum transfer from faster layers in the middle of the pipe to slower layers closer to the pipe walls.

Molecular explanation: Fluid molecules are moving in random Brownian motion until a pressure is applied at the pipe inlet causing the formation of a velocity gradient from centerline to pipe wall. Once the gradient is established, molecules that randomly migrate from an area of high momentum to low momentum will take along the momentum they possess and will transfer some of it to other molecules as they collide (increasing the momentum of the slower molecules). Molecules that randomly migrate from low to high momentum will absorb some momentum during collisions. As long as the overall velocity gradient is maintained, the net result is that momentum is transferred by molecular motion from areas of high momentum to areas of low momentum and ultimately to thermal dissipation at the pipe wall.

With these different descriptions, as we move from larger to smaller scales we also move from causal to emergent explanations. At the macroscopic level we’re looking at bulk motion of fluid. At the microscopic scale it’s getting a little more refined. We’re thinking in terms of multiple layers of fluid flow. We’re seeing the gradient at a higher resolution. And we can think of these layers of fluid rubbing past each other, with faster layers dragging slower layers along, and slower layers slowing faster layers down. It’s like spreading out a deck of cards. In these explanations momentum moves along the velocity gradient because of a kind of drag along the radial direction.

But with the molecular description we leave behind that causal explanation of things being dragged along. There’s only one major top-down, causal force in this system and that’s the pressure or force that’s being applied in the direction of the length of the pipe. With a horizontal pipe we can think of this force being applied along its horizontal axis. But there’s not a top-down, external force being applied along the vertical or radial axis of the pipe. So why does momentum move from the high-momentum region in the center of the pipe to the low-momentum region near the pipe wall? It’s because there’s still random motion along the radial or vertical axis, which is perpendicular to the direction of the applied pressure. So molecules are still moving randomly between regions with different momentum. So if we think of these layers, these cylindrical sheets that are dividing up the sections of the pipe at different radii, these correspond to our cube voxels in the diffusion example. Molecules are moving randomly between these sheets. The state of each layer is characterized by the momentum of the molecules in it. As molecules move between layers and collide with other molecules they transfer momentum. As in the diffusion example the overall pattern that emerges here is the result of random motion of the individual molecular components.
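
As a hedged toy sketch of that molecular picture (my own mean-field simplification in Python, not the authors’ model): treat the cylindrical sheets as a row of layers, each characterized by its momentum. On average, random molecular hops between neighboring layers amount to each layer handing a small fraction of its momentum to each neighbor per step. With the centerline held at high momentum (the applied pressure) and the wall held at zero (no-slip), momentum spreads outward until a smooth gradient is established and maintained.

import numpy as np

# Mean-field toy of momentum transport between "layers" (cylindrical sheets).
# Layer count, exchange fraction, and step count are arbitrary illustrative choices.
n_layers = 10
exchange = 0.05                          # fraction of a layer's momentum handed to each neighbor per step

momentum = np.ones(n_layers)             # start as "plug flow": every layer at high momentum...
momentum[-1] = 0.0                       # ...except the wall layer (no-slip)

for _ in range(5000):
    handed = exchange * momentum         # momentum handed to each neighbor on average
    from_inner = np.roll(handed, 1)      # received from the layer closer to the centerline
    from_inner[0] = 0.0                  # (no layer inside the centerline)
    from_outer = np.roll(handed, -1)     # received from the layer closer to the wall
    from_outer[-1] = 0.0                 # (no layer outside the wall)
    momentum = momentum - 2.0 * handed + from_inner + from_outer
    momentum[0] = 1.0                    # applied pressure keeps the centerline moving
    momentum[-1] = 0.0                   # the wall keeps absorbing momentum

print(momentum)   # a smooth, maintained gradient from centerline to wall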

So, does this matter? My answer to that question is usually that “it”, whatever it may be, matters when and where it matters. Miller, Streveler, and Olds say: “If the macroscopic and microscopic models are successful in describing the global behavior of simple systems, why should we care if students persist in incorrectly applying causal models to processes such as dye diffusion into water? The answer is simple – the causal models can predict some but not all important behavioral characteristics of molecular diffusional processes.” And I think that’s a good criterion for evaluation. I actually wouldn’t say, as they do, that the application of causal models is strictly “incorrect”. But I take their broader point. Certainly macroscopic and causal models have their utility. For one thing, I think they’re easier to understand starting off. But as with all models, you have to keep in mind their conditions of applicability. Some apply more broadly than others.

One thing to notice about these transport models is that they have proportionality constants. And whenever you see a constant like that in a model it’s important to consider what all might be wrapped up into it because it may involve a lot of complexity. And that is the case with both the diffusion coefficient and viscosity. Both are heavily dependent on specific properties of the system. For the value of viscosity you have to look it up for a specific substance and then also for the right temperature range. Viscosity varies widely between different substances. And even for a single substance it can still vary widely with temperature. For diffusivity you have to consider not only one substance but two, at least. If you look up a coefficient of diffusivity in a table it’s going to be for a pair of substances. And that will also depend on temperature.
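
To show how much can hide inside one of these “constants”, here is a hedged Python illustration using an Arrhenius-type correlation, μ(T) = A·exp(B/T), which is a common empirical form for liquid viscosity. The parameters A and B below are placeholders, not fitted values for any real fluid; for real work you would look up or fit them for a specific substance over a specific temperature range.

import math

# Illustrative Arrhenius-type viscosity correlation: mu(T) = A * exp(B / T).
# A and B are placeholder parameters, not values for any particular fluid.
def viscosity(T_kelvin, A=1.0e-6, B=2000.0):
    """Viscosity in Pa·s as a function of absolute temperature (illustrative only)."""
    return A * math.exp(B / T_kelvin)

for T in (280.0, 300.0, 320.0, 340.0):
    print(f"{T:.0f} K  ->  {viscosity(T):.2e} Pa·s")   # viscosity falls as temperature rises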

At a macroscopic scale it’s not clear why the rates of mass transport and momentum transport would depend on temperature or the type of substances involved. But at a microscopic scale you can appreciate how different types of molecules would have different sizes and would move around at different velocities at different temperatures, and how that would all play into the random movements of particles and the interactions between particles that produce, from that molecular scale, the emergent processes of diffusion and momentum transport that we observe at the macroscopic scale.

Once you open up that box, to see what is going on behind these proportionality constants, it opens up a whole new field of scientific work to develop – you guessed it – more and better models to qualify and quantify these phenomena.