Causal and Emergent Models

Models are critical tools that enable us to think about, qualify, and quantify features of many processes. And as with any kind of tool, different kinds of models are better suited to different circumstances. Here we look at two kinds of models for understanding transport phenomena: causal and emergent models. In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge.

For the video version of this episode, which includes some visual aids, see it on YouTube.

Since my university studies I’ve been fascinated by the ways we use models to understand and even make quantitative descriptions and predictions about the world. I don’t remember when exactly, but at some point I really began to appreciate how the pictures of chemical and physical processes I had in my head were not the way things “really” were (exactly) but were useful models for thinking about things and solving problems.

Conceptual models in science, engineering, economics, etc. are similar to toy models like model cars or model airplanes in that they aren’t the things themselves but have enough in common with the things they are modeling to still perform in similar ways. As long as a model enables you to get the information and understanding you need it is useful, at least for the scale and circumstances you’re interested in. Models are ubiquitous in the sciences and one of the major activities in the sciences is to improve models, generate new models, and create more models to apply to more conditions.

Something to bear in mind when working with a model is the set of conditions in which it works well. That’s important because a model may work very well under a certain set of conditions but then break down outside those conditions. Outside those conditions it may give less accurate results or just not describe well qualitatively what’s going on in the system we’re trying to understand. This could be something like being outside a temperature or pressure range, extremes in velocity or gravitational field strength, etc. And often it’s a matter of geometric scale, like whether we’re dealing in meters or nanometers. The world looks different at the microscopic and molecular scale than at the macroscopic scale of daily life.

I’m really a pluralist when it comes to models. I’m in favor of several types to meet the tasks at hand. Is a classical, Newtonian model for gravity superior to a relativistic model for gravity? I don’t think so. Yeah, a Newtonian model breaks down under certain conditions. But it’s much easier and more intuitive to work with under most conditions. It doesn’t make sense to just throw away a Newtonian model after relativity. And we don’t. We can’t. It would be absurdly impractical. And practicality is a major virtue of models. That’s not to say there’s no such thing as better or worse models. A Newtonian model of planetary motion is better than a Ptolemaic one because it’s both more accurate and simpler to understand. So I don’t embrace pluralism without standards of evaluation. I suppose there’d be an infinite number of really bad models in the set of all possible models. Even so, there are still multiple that do work well, that overlap and cover similar systems.

I studied chemical engineering in the university and one of my textbooks was Transport Phenomena by Bird, Stewart, and Lightfoot, sort of a holy trinity of the discipline. Transport phenomena covers fluids, heat, and diffusion, which all share many features and whose models share a very similar structure. One of the ideas I liked in that book is its systematic study of processes at three scales: macroscopic, microscopic, and molecular. I’ll quote from the book for their explanations of these different scales.

“At the macroscopic level we write down a set of equations called the ‘macroscopic balances,’ which describe how the mass, momentum, energy, and angular momentum in the system change because of the introduction and removal of these entities via the entering and leaving streams, and because of various other inputs to the system from the surroundings. No attempt is made to understand all the details of the system.”

“At the microscopic level we examine what is happening to the fluid mixture in a small region within the equipment. We write down a set of equations called the ‘equations of change,’ which describe how the mass, momentum, energy, and angular momentum change within this small region. The aim here is to get information about velocity, temperature, pressure, and concentration profiles within the system. This more detailed information may be required for the understanding of some processes.”

“At the molecular level we seek a fundamental understanding of the mechanisms of mass, momentum, energy, and angular momentum transport in terms of molecular structure and intermolecular forces. Generally this is the realm of the theoretical physicist or physical chemist, but occasionally engineers and applied scientists have to get involved at this level.”

I came across an interesting paper recently from a 2002 engineering education conference titled How Chemical Engineering Seniors Think about Mechanisms of Momentum Transport by Ronald L. Miller, Ruth A. Streveler, and Barbara M. Olds. It caught my attention since I’ve been a chemical engineering senior so I wanted to see how it compared to my experience. And it tracked it pretty well actually. Their idea is that one of the things that starts to click for seniors in their studies, something that often hadn’t clicked before, is a conceptual understanding of many fundamental molecular-level and atomic-level phenomena including heat, light, diffusion, chemical reactions, and electricity. I’ll refer mostly to the examples from this paper by Miller, Streveler, and Olds but I’ll mention that they base much of their presentation on the work of Michelene Chi, who is a cognitive and learning scientist. In particular they refer to her work on causal versus emergent conceptual models for these physical processes. Her paper on this is titled Misconceived Causal Explanations for Emergent Processes. Miller, Streveler, and Olds propose that chemical engineering students start out using causal models to understand many of these processes but then move to more advanced, emergent models later in their studies.

In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an elastic collision for instance, a moving object collides with a previously stationary object and transfers its momentum to it. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge. Electricity, fluid flow, heat transfer and molecular equilibrium are examples of emergent processes. Miller, Streveler, and Olds correlate causal and emergent explanations with macroscopic and molecular models respectively. As Bird, Stewart, and Lightfoot had said in their descriptions of their three scales, it’s at the molecular level that “we seek a fundamental understanding of the mechanisms.” But at the macroscopic scales we aren’t looking at so fundamental an explanation.      

Miller, Streveler, and Olds use diffusion, i.e. mass transport, as an example to show the difference between causal and emergent explanations. Say we have a glass of water and we add a drop of color dye to it. The water is a solvent and the color dye is a solute. This color dye solute starts to diffuse, or spread, into the water solvent and we can explain this diffusion process in both causal and emergent ways; or we could also say in macroscopic and molecular ways.

First, a quick overview of diffusion. The mathematical model for diffusion is Fick’s Law of Diffusion. The equation for this is:       

J = -D(dC/dx)

J is the diffusion flux
C is concentration
x is position
D is diffusivity, the applicable constant of proportionality in this case

The basic logic of this equation is that the diffusion of a solute is proportional to the gradient of the concentration of that solute in a solvent. If the solute is evenly distributed in the solution the concentration is the same everywhere in the solution, so there is no concentration gradient and no diffusion. But there is a gradient if the solute concentration is different at different positions in the space, for example, if it is highly concentrated at one point and less concentrated as you move away from that point. The diffusion flux is proportional to the steepness of that decrease, that gradient. If a drop of dye has just been placed in a glass of water the flux of diffusion is going to be very high at the boundary between that drop and the surrounding water because there is a huge difference in the concentration of the dye there.
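To put some numbers on that logic, here is a minimal sketch in Python that estimates the flux between neighboring sample points using a simple finite difference. The concentration profile, the spacing, and the diffusivity value are all made up for illustration (the diffusivity is just an assumed order of magnitude for a small molecule in water), and the function name `fick_flux` is my own.

```python
def fick_flux(concentrations, dx, diffusivity):
    """Approximate J = -D * (dC/dx) between neighboring sample points.

    The difference is written as (left - right) so the minus sign in
    Fick's law is already folded in.
    """
    return [
        diffusivity * (c_left - c_right) / dx
        for c_left, c_right in zip(concentrations, concentrations[1:])
    ]

# Hypothetical dye concentrations (arbitrary units) sampled every millimeter:
# flat inside the drop, a sharp fall at its edge, then clear water.
profile = [1.0, 1.0, 0.2, 0.0, 0.0]
D = 1e-9   # m^2/s, assumed value for illustration only
dx = 1e-3  # m, assumed spacing between sample points

fluxes = fick_flux(profile, dx, D)
for i, flux in enumerate(fluxes):
    print(f"between points {i} and {i + 1}: J = {flux:.2e}")
```

The flux is zero wherever the profile is flat and largest across the steepest drop, at the edge of the dye. A positive value here means dye is moving in the direction of increasing x, away from the drop.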

So that’s the logic of Fick’s Law of Diffusion. But why does this happen? And here we can look at the two different kinds of explanations, causal and emergent explanations.         

Here are a few examples of both:

Causal Explanation: “Dye molecules move towards water molecules.”
Emergent Explanation: “All molecules exercise Brownian motion.”

Causal Explanation: “Dye molecules flow from areas of high concentration to areas of low concentration.”
Emergent Explanation: “All molecules move at the same time.”

Causal Explanation: “Dye molecules are ‘pushed’ into the water by other dye molecules.”
Emergent Explanation: “Molecules collide independently of prior collisions. What happens to one molecule doesn’t affect interactions with other molecules.”

Causal Explanation: “Dye molecules want to mix with water molecules.”
Emergent Explanation: “The local conditions around each molecule affect where it moves and at what velocity.”

Causal Explanation: “Dye molecules stop moving when dye and water become mixed.”
Emergent Explanation: “Molecular interactions continue when equilibrium is reached.”

This gives something of a flavor of the two different kinds of explanations. Causal explanations have more of a top-down approach, looking for the big forces that make things happen, and may even speak in metaphorical terms of volition, like what a molecule “wants” to do. Emergent explanations have more of a bottom-up approach, looking at all the things going on independently in a system and how that results in the patterns we observe.

I remember Brownian motion being something that really started pushing me to think of diffusion in a more emergent way. Brownian motion is the random motion of particles suspended in a medium, like a liquid or a gas. If you just set a glass of water on a table it may look stationary, but at the molecular scale there’s still a lot of movement. The water molecules are moving around in random directions. If you add a drop of color dye to the water the molecules in the dye also have Brownian motion, with all those molecules moving in random directions. So what’s going to happen in this situation? Well, things aren’t just going to stay put. The water molecules are going to keep moving around in random directions and the dye molecules are going to keep moving around in random directions. What kind of pattern should we expect to see emerge from this?

Let’s imagine imposing a three-dimensional grid onto this space, dividing the glass up into cube volumes or voxels. Far away from the drop of dye, water molecules will still be moving around randomly between voxels but those voxels will continue to look about the same. Looking at the space around the dye, voxels in the middle of the drop will be all dye. Voxels on the boundary will have some dye molecules and some water molecules. And voxels with a lot of dye molecules will be next to voxels with few dye molecules. As water molecules and dye molecules continue their random motion we’re going to see the most state changes in the voxels that are different from each other. Dye molecules near a voxel with mostly water molecules can very likely move into one of those voxels and change its state from one with few or no dye molecules to one with some or more dye molecules. And the biggest state changes will occur in regions where voxels near to each other are most different, just because they can be so easily (albeit randomly) changed.

This is a very different way of looking at the process of diffusion. Rather than there being some rule imposed from above, telling dye molecules that they should move from areas of high concentration to areas of low concentration, all these molecules are moving around randomly. And over time areas with sharp differences tend to even out, just by random motion. From above and from a distance this even looks well-ordered and like it could be directed. The random motion of all the components results in an emergent macro-level pattern that can be modeled and predicted by a fairly simple mathematical expression. The movement of each individual molecule is random and unpredictable, but the resulting behavior of the system, the aggregate of all those random motions, is ordered and highly predictable. I just think that’s quite elegant!
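You can see this kind of order coming out of randomness in a toy simulation. The sketch below is my own illustration, not something from the paper: it collapses the voxel grid down to one dimension, starts every dye molecule in the same cell, and lets each one hop a single cell left or right at random on every step. The function name, the number of molecules, and the step counts are all arbitrary choices.

```python
import random

def random_walk_spread(n_molecules=1000, n_steps=200, seed=0):
    """Return (mean position, mean squared distance from that mean)."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    positions = [0] * n_molecules  # every dye molecule starts in the drop at x = 0
    for _ in range(n_steps):
        # Each molecule independently hops one cell left or right.
        positions = [x + rng.choice((-1, 1)) for x in positions]
    mean = sum(positions) / n_molecules
    spread = sum((x - mean) ** 2 for x in positions) / n_molecules
    return mean, spread

mean_50, spread_50 = random_walk_spread(n_steps=50)
mean_200, spread_200 = random_walk_spread(n_steps=200)
print(f"after  50 steps: mean = {mean_50:.2f}, spread = {spread_50:.1f}")
print(f"after 200 steps: mean = {mean_200:.2f}, spread = {spread_200:.1f}")
```

No molecule is told where to go, and on average the cloud of dye doesn't drift anywhere. But its spread grows steadily: for unit hops like these, the expected mean squared displacement equals the number of steps, so four times as many steps should give roughly four times the spread. That steady, predictable widening is the macro-level pattern.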

Miller, Streveler, and Olds give another example that neatly illustrates different ways of understanding a physical process at the three different scales: macroscopic, microscopic, and molecular. Their second example is of momentum transport. An example of momentum transport is pumping a fluid through a pipe. As a brief overview, when a fluid like water is moved through a pipe under pressure the velocity of the fluid is highest at the center of the pipe and lowest near the walls. This is a velocity gradient, often called a “velocity profile”, where you have this cross-sectional view of a pipe showing the velocity vectors of different magnitudes at different positions along the radius of the pipe. When you have this velocity gradient there is also a transfer of momentum from areas of high momentum to areas of low momentum. So in this case momentum will transfer from the center of the pipe toward the walls of the pipe.

The model for momentum transport has a similar structure to the model for mass transport. Recall that in Fick’s Law of Diffusion, mass transport, i.e. diffusion, was proportional to the concentration gradient and the constant of proportionality was this property called diffusivity. The equation was:

J = -D(dC/dx)

The corresponding model for momentum transport is Newton’s law of viscosity (Newton had a lot of laws). The equation for that is:

τ = -μ(dv/dx)


τ is shear stress, the flux of momentum transport
v is velocity
x is position
μ is viscosity, the applicable constant of proportionality in this case

So in Newton’s law of viscosity the momentum transport, i.e. shear stress, is proportional to the velocity gradient and the constant of proportionality is viscosity. You have higher momentum transport with a higher gradient, i.e. change, in velocity along the radius of the pipe. Why does that happen?
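Before getting to the why, here is a small worked sketch of the equation itself. It assumes the standard parabolic velocity profile for steady laminar flow of a Newtonian fluid in a round pipe, which is an assumption I'm adding for illustration and not something taken from the paper. The pipe radius and centerline velocity are made-up numbers, the viscosity is roughly that of water at room temperature, and the function names are my own.

```python
def velocity(r, radius, v_max):
    """Assumed parabolic laminar profile: fastest at r = 0, zero at the wall."""
    return v_max * (1.0 - (r / radius) ** 2)

def shear_stress(r, radius, v_max, viscosity):
    """Newton's law of viscosity, tau = -mu * (dv/dr), for the profile above."""
    dv_dr = -2.0 * v_max * r / radius ** 2  # derivative of the parabolic profile
    return -viscosity * dv_dr

R = 0.01      # m, hypothetical pipe radius
V_MAX = 0.1   # m/s, hypothetical centerline velocity
MU = 1.0e-3   # Pa*s, roughly water at room temperature

for r in (0.0, 0.005, 0.01):
    v = velocity(r, R, V_MAX)
    tau = shear_stress(r, R, V_MAX, MU)
    print(f"r = {r:.3f} m: v = {v:.3f} m/s, tau = {tau:.4f} Pa")
```

Under these assumptions the shear stress is zero at the centerline, where the velocity profile is flat, and grows linearly to its largest value at the wall, where the velocity is changing most steeply.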

So they actually asked some students to explain this in their own words to see on what geometric scales they would make their descriptions. The prompt was: “Explain in your own words (no equations) how momentum is transferred through a fluid via viscous action.” And they evaluated the descriptions as being at one of the three scales (or a mixture of them) using a rubric. So here are examples from the rubric of explanations at each of those scales:

Macroscopic explanation: The pressure at the pipe inlet is increased (usually by pumping) which causes the fluid to move through the pipe. Friction between fluid and pipe wall results in a pressure drop in the direction of flow along the pipe length. The fluid at the wall does not move (no-slip condition) while fluid furthest away from the wall (at the pipe centerline) flows the fastest, so momentum is transferred from the center (high velocity and high momentum) to the wall (no velocity and no momentum).

Microscopic explanation: Fluid in laminar flow moves as a result of an overall pressure drop causing a velocity profile to develop (no velocity at the wall, maximum velocity at the pipe centerline). Therefore, at each pipe radius, layers of fluid flow past each other at different velocities. Faster flowing layers tend to speed up [and move] slower layers along resulting in momentum transfer from faster layers in the middle of the pipe to slower layers closer to the pipe walls.

Molecular explanation: Fluid molecules are moving in random Brownian motion until a pressure is applied at the pipe inlet causing the formation of a velocity gradient from centerline to pipe wall. Once the gradient is established, molecules that randomly migrate from an area of high momentum to low momentum will take along the momentum they possess and will transfer some of it to other molecules as they collide (increasing the momentum of the slower molecules). Molecules that randomly migrate from low to high momentum will absorb some momentum during collisions. As long as the overall velocity gradient is maintained, the net result is that momentum is transferred by molecular motion from areas of high momentum to areas of low momentum and ultimately to thermal dissipation at the pipe wall.

With these different descriptions as we move from larger to smaller scales we also move from causal to emergent explanations. At the macroscopic level we’re looking at bulk motion of fluid. At the microscopic scale it’s getting a little more refined. We’re thinking in terms of multiple layers of fluid flow. We’re seeing the gradient at a higher resolution. And we can think of these layers of fluid rubbing past each other, with faster layers dragging slower layers along, and slower layers slowing faster layers down. It’s like spreading out a deck of cards. In these explanations momentum moves along the velocity gradient because of a kind of drag along the radial direction.

But with the molecular description we leave behind that causal explanation of things being dragged along. There’s only one major top-down, causal force in this system and that’s the pressure or force that’s being applied in the direction of the length of the pipe. With a horizontal pipe we can think of this force being applied along its horizontal axis. But there’s not a top-down, external force being applied along the vertical or radial axis of the pipe. So why does momentum move from the high-momentum region in the center of the pipe to the low-momentum region near the pipe wall? It’s because there’s still random motion along the radial or vertical axis, which is perpendicular to the direction of the applied pressure. So molecules are still moving randomly between regions with different momentum. So if we think of these layers, these cylindrical sheets that are dividing up the sections of the pipe at different radii, these correspond to our cube voxels in the diffusion example. Molecules are moving randomly between these sheets. The state of each layer is characterized by the momentum of the molecules in it. As molecules move between layers and collide with other molecules they transfer momentum. As in the diffusion example the overall pattern that emerges here is the result of random motion of the individual molecular components.
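Here is a deliberately stripped-down sketch of that picture. It is my own toy model with a lot of simplifying assumptions: there are only five layers, every molecule just carries a single number for its velocity along the pipe, molecules swap places between adjacent layers at random, and there are no collisions, no driving pressure, and no wall. All of the names and numbers are made up for illustration.

```python
import random

def layer_means(layers):
    """Average axial velocity of the molecules in each layer."""
    return [sum(layer) / len(layer) for layer in layers]

def exchange_molecules(layers, n_swaps, rng):
    """Swap randomly chosen molecules between randomly chosen adjacent layers."""
    for _ in range(n_swaps):
        i = rng.randrange(len(layers) - 1)     # interface between layer i and i + 1
        a = rng.randrange(len(layers[i]))      # a molecule in the inner layer
        b = rng.randrange(len(layers[i + 1]))  # a molecule in the outer layer
        # Each molecule takes its own velocity (momentum) along with it.
        layers[i][a], layers[i + 1][b] = layers[i + 1][b], layers[i][a]

rng = random.Random(1)  # fixed seed so the run is repeatable
n_layers, per_layer = 5, 200
# Layer 0 is the centerline: start with all the momentum there, none elsewhere.
layers = [[1.0] * per_layer] + [[0.0] * per_layer for _ in range(n_layers - 1)]

total_before = sum(sum(layer) for layer in layers)
exchange_molecules(layers, 20000, rng)
total_after = sum(sum(layer) for layer in layers)
means = layer_means(layers)

print("layer means, center to wall:", [round(m, 2) for m in means])
print("total momentum before and after:", total_before, total_after)
```

Nothing in this model pushes momentum outward. Molecules just wander between layers, yet the outer layers end up with momentum that started at the centerline, and the total is conserved because swapping only moves it around. In the real pipe the applied pressure keeps feeding momentum in and the wall keeps taking it out, which is what sustains the gradient; this sketch leaves both of those out.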

So, does this matter? My answer to that question is usually that “it”, whatever it may be, matters when and where it matters. Miller, Streveler, and Olds say: “If the macroscopic and microscopic models are successful in describing the global behavior of simple systems, why should we care if students persist in incorrectly applying causal models to processes such as dye diffusion into water? The answer is simple – the causal models can predict some but not all important behavioral characteristics of molecular diffusional processes.” And I think that’s a good criterion for evaluation. I actually wouldn’t say, as they do, that the application of causal models is strictly “incorrect”. But I take their broader point. Certainly macroscopic and causal models have their utility. For one thing, I think they’re easier to understand starting off. But as with all models, you have to keep in mind their conditions of applicability. Some apply more broadly than others.

One thing to notice about these transport models is that they have proportionality constants. And whenever you see a constant like that in a model it’s important to consider what all might be wrapped up into it because it may involve a lot of complexity. And that is the case with both the diffusion coefficient and viscosity. Both are heavily dependent on specific properties of the system. For the value of viscosity you have to look it up for a specific substance and then also for the right temperature range. Viscosity varies widely between different substances. And even for a single substance it can still vary widely with temperature. For diffusivity you have to consider not only one substance but two, at least. If you look up a coefficient of diffusivity in a table it’s going to be for a pair of substances. And that will also depend on temperature.

At a macroscopic scale it’s not clear why the rates of mass transport and momentum transport would depend on temperature or the type of substances involved. But at a molecular scale you can appreciate how different types of molecules would have different sizes and would move around at different velocities at different temperatures and how that would all play into the random movements of particles and the interactions between particles that produce, from that molecular scale, the emergent processes of diffusion and momentum transport that we observe at the macroscopic scale.

Once you open up that box, to see what is going on behind these proportionality constants, it opens up a whole new field of scientific work to develop – you guessed it – more and better models to qualify and quantify these phenomena.