Set Theory

Jakob and Todd talk about set theory, its historical origins, Georg Cantor, trigonometric series, cardinalities of number systems, the continuum hypothesis, cardinalities of infinite sets, set theory as a foundation for mathematics, Cantor’s paradox, Russell’s paradox, axiomatization, the Zermelo–Fraenkel axiomatic system, the axiom of choice, and the understanding of mathematical objects as “sets with structure”.

Causal and Emergent Models

Models are critical tools that enable us to think about, qualify, and quantify features of many processes. And as with any kind of tool, different kinds of models are better suited to different circumstances. Here we look at two kinds of models for understanding transport phenomena: causal and emergent models. In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge.

For the video version of this episode, which includes some visual aids, see YouTube.

Since my university studies I’ve been fascinated by the ways we use models to understand and even make quantitative descriptions and predictions about the world. I don’t remember when exactly, but at some point I really began to appreciate how the pictures of chemical and physical processes I had in my head were not the way things “really” were (exactly) but were useful models for thinking about things and solving problems.

Conceptual models in science, engineering, economics, etc. are similar to toy models like model cars or model airplanes in that they aren’t the things themselves but have enough in common with the things they are modeling to still perform in similar ways. As long as a model enables you to get the information and understanding you need it is useful, at least for the scale and circumstances you’re interested in. Models are ubiquitous in the sciences and one of the major activities in the sciences is to improve models, generate new models, and create more models to apply to more conditions.

Something to bear in mind when working with a model is the set of conditions in which it works well. That’s important because a model may work very well under a certain set of conditions but then break down outside those conditions. Outside those conditions it may give less accurate results or just not describe well qualitatively what’s going on in the system we’re trying to understand. This could be something like being outside a temperature or pressure range, extremes in velocity or gravitational field strength, etc. And often it’s a matter of geometric scale, like whether we’re dealing in meters or nanometers. The world looks different at the microscopic and molecular scale than at the macroscopic scale of daily life.

I’m really a pluralist when it comes to models. I’m in favor of several types to meet the tasks at hand. Is a classical, Newtonian model for gravity superior to a relativistic model for gravity? I don’t think so. Yeah, a Newtonian model breaks down under certain conditions. But it’s much easier and more intuitive to work with under most conditions. It doesn’t make sense to just throw away a Newtonian model after relativity. And we don’t. We can’t. It would be absurdly impractical. And practicality is a major virtue of models. That’s not to say there’s no such thing as better or worse models. A Newtonian model of planetary motion is better than a Ptolemaic one because it’s both more accurate and simpler to understand. So I don’t embrace pluralism without standards of evaluation. I suppose there’d be an infinite number of really bad models in the set of all possible models. Even so, there are still multiple that do work well, that overlap and cover similar systems.

I studied chemical engineering in the university and one of my textbooks was Transport Phenomena by Bird, Stewart, and Lightfoot, sort of a holy trinity of the discipline. Transport phenomena covers fluids, heat, and diffusion, which all share many features and whose models share a very similar structure. One of the ideas I liked in that book is its systematic study of processes at three scales: macroscopic, microscopic, and molecular. I’ll quote from the book for their explanations of these different scales.

“At the macroscopic level we write down a set of equations called the ‘macroscopic balances,’ which describe how the mass, momentum, energy, and angular momentum in the system change because of the introduction and removal of these entities via the entering and leaving streams, and because of various other inputs to the system from the surroundings. No attempt is made to understand all the details of the system.”

“At the microscopic level we examine what is happening to the fluid mixture in a small region within the equipment. We write down a set of equations called the ‘equations of change,’ which describe how the mass, momentum, energy, and angular momentum change within this small region. The aim here is to get information about velocity, temperature, pressure, and concentration profiles within the system. This more detailed information may be required for the understanding of some processes.”

“At the molecular level we seek a fundamental understanding of the mechanisms of mass, momentum, energy, and angular momentum transport in terms of molecular structure and intermolecular forces. Generally this is the realm of the theoretical physicist or physical chemist, but occasionally engineers and applied scientists have to get involved at this level.”

I came across an interesting paper recently from a 2002 engineering education conference titled How Chemical Engineering Seniors Think about Mechanisms of Momentum Transport by Ronald L. Miller, Ruth A. Streveler, and Barbara M. Olds. It caught my attention since I’ve been a chemical engineering senior so I wanted to see how it compared to my experience. And it tracked it pretty well actually. Their idea is that one of the things that starts to click for seniors in their studies, something that often hadn’t clicked before, is a conceptual understanding of many fundamental molecular-level and atomic-level phenomena including heat, light, diffusion, chemical reactions, and electricity. I’ll refer mostly to the examples from this paper by Miller, Streveler, and Olds but I’ll mention that they base much of their presentation on the work of Michelene Chi, who is a cognitive and learning scientist. In particular they refer to her work on causal versus emergent conceptual models for these physical processes. Her paper on this is titled Misconceived Causal Explanations for Emergent Processes. Miller, Streveler, and Olds propose that chemical engineering students start out using causal models to understand many of these processes but then move to more advanced, emergent models later in their studies.

In a causal process there is some kind of distinct, sequential, goal-oriented event with an observable beginning and end. In an elastic collision for instance, a moving object collides with a previously stationary object and transfers its momentum to it. In an emergent process there are uniform, parallel, independent events with no beginning or end but in which observable patterns eventually emerge. Electricity, fluid flow, heat transfer and molecular equilibrium are examples of emergent processes. Miller, Streveler, and Olds correlate causal and emergent explanations with macroscopic and molecular models respectively. As Bird, Stewart, and Lightfoot had said in their descriptions of their three scales, it’s at the molecular level that “we seek a fundamental understanding of the mechanisms.” But at the macroscopic scales we aren’t looking at so fundamental an explanation.      

Miller, Streveler, and Olds use diffusion, i.e. mass transport, as an example to show the difference between causal and emergent explanations. Say we have a glass of water and we add a drop of color dye to it. The water is a solvent and the color dye is a solute. This color dye solute starts to diffuse, or spread, into the water solvent and we can explain this diffusion process in both causal and emergent ways; or we could also say in macroscopic and molecular ways.

First, a quick overview of diffusion. The mathematical model for diffusion is Fick’s Law of Diffusion. The equation for this is:       

J = -D(dC/dx)

Where,
J is the diffusion flux
C is concentration
x is position
D is diffusivity, the applicable constant of proportionality in this case

The basic logic of this equation is that the diffusion of a solute is proportional to the gradient of the concentration of that solute in a solvent. If the solute is evenly distributed in the solution the concentration is the same everywhere in the solution, so there is no concentration gradient and no diffusion. But there is a gradient if the solute concentration is different at different positions in the space, for example, if it is highly concentrated at one point and less concentrated as you move away from that point. The diffusion flux is proportional to the steepness of that decrease, that gradient. If a drop of dye has just been placed in a glass of water the flux of diffusion is going to be very high at the boundary between that drop and the surrounding water because there is a huge difference in the concentration of the dye there.
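To make that logic concrete, here is a small sketch that evaluates J = -D(dC/dx) numerically on a grid. The concentration profile, the diffusivity value, and the helper name `fick_flux` are all made up for illustration; they don't come from any real dye or from the paper.

```python
import numpy as np

def fick_flux(concentration, x, diffusivity):
    """Return J = -D * dC/dx at each grid point, using finite differences."""
    return -diffusivity * np.gradient(concentration, x)

# Hypothetical setup: dye concentrated near x = 0 and falling off with distance.
x = np.linspace(0.0, 1.0, 101)      # position (arbitrary units)
C = np.exp(-(x / 0.1) ** 2)         # made-up concentration profile
D = 1.0e-3                          # made-up diffusivity

J = fick_flux(C, x, D)                           # flux for the dye-drop profile
uniform_J = fick_flux(np.ones_like(x), x, D)     # flux for an evenly mixed solution

print("largest flux occurs near x =", x[np.argmax(J)])
print("max flux in the evenly mixed case:", np.abs(uniform_J).max())
```

The evenly mixed case gives zero flux everywhere, and in the dye-drop case the flux peaks where the concentration falls off most steeply, which is the behavior described above.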

So that’s the logic of Fick’s Law of Diffusion. But why does this happen? And here we can look at the two different kinds of explanations, causal and emergent explanations.         

Here are a few examples of both:

Causal Explanation: “Dye molecules move towards water molecules.”
Emergent Explanation: “All molecules exercise Brownian motion.”

Causal Explanation: “Dye molecules flow from areas of high concentration to areas of low concentration.”
Emergent Explanation: “All molecules move at the same time.”

Causal Explanation: “Dye molecules are ‘pushed’ into the water by other dye molecules.”
Emergent Explanation: “Molecules collide independently of prior collisions. What happens to one molecule doesn’t affect interactions with other molecules.”

Causal Explanation: “Dye molecules want to mix with water molecules.”
Emergent Explanation: “The local conditions around each molecule affect where it moves and at what velocity.”

Causal Explanation: “Dye molecules stop moving when dye and water become mixed.”
Emergent Explanation: “Molecular interactions continue when equilibrium is reached.”

This gives something of a flavor of the two different kinds of explanations. Causal explanations have more of a top-down approach, looking for the big forces that make things happen, and may even speak in metaphorical terms of volition, like what a molecule “wants” to do. Emergent explanations have more of a bottom-up approach, looking at all the things going on independently in a system and how that results in the patterns we observe.

I remember Brownian motion being something that really started pushing me to think of diffusion in a more emergent way. Brownian motion is the random motion of particles suspended in a medium, like a liquid or a gas. If you just set a glass of water on a table it may look stationary, but at the molecular scale there’s still a lot of movement. The water molecules are moving around in random directions. If you add a drop of color dye to the water the molecules in the dye also have Brownian motion, with all those molecules moving in random directions. So what’s going to happen in this situation? Well, things aren’t just going to stay put. The water molecules are going to keep moving around in random directions and the dye molecules are going to keep moving around in random directions. What kind of pattern should we expect to see emerge from this?

Let’s imagine imposing a three-dimensional grid onto this space, dividing the glass up into cube volumes or voxels. Far away from the drop of dye, water molecules will still be moving around randomly between voxels but those voxels will continue to look about the same. Looking at the space around the dye, voxels in the middle of the drop will be all dye. Voxels on the boundary will have some dye molecules and some water molecules. And voxels with a lot of dye molecules will be next to voxels with few dye molecules. As water molecules and dye molecules continue their random motion we’re going to see the most state changes in the voxels that are different from each other. Dye molecules near a voxel with mostly water molecules can very likely move into one of those voxels and change its state from one with few or no dye molecules to one with some or more dye molecules. And the biggest state changes will occur in regions where voxels near to each other are most different, just because they can be so easily (albeit randomly) changed.

This is a very different way of looking at the process of diffusion. Rather than there being some rule imposed from above, telling dye molecules that they should move from areas of high concentration to areas of low concentration, all these molecules are moving around randomly. And over time areas with sharp differences tend to even out, just by random motion. From above and from a distance this even looks well-ordered and like it could be directed. The random motion of all the components results in an emergent macro-level pattern that can be modeled and predicted by a fairly simple mathematical expression. The movement of each individual molecule is random and unpredictable, but the resulting behavior of the system, the aggregate of all those random motions, is ordered and highly predictable. I just think that’s quite elegant!
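A toy simulation shows this emergent picture. It's only a sketch under simplifying assumptions that I'm adding here: one dimension instead of three, a lattice of "voxels", and dye molecules that each hop one voxel left or right at random on every step, with no interactions. No molecule is told where to go, yet the cloud spreads in a very regular way.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the sketch is repeatable

n_molecules = 10_000
n_steps = 400
positions = np.zeros(n_molecules, dtype=int)   # every dye molecule starts in the same voxel

for _ in range(n_steps):
    # Each molecule independently steps left or right with equal probability.
    positions += rng.choice([-1, 1], size=n_molecules)

mean_position = positions.mean()   # stays near 0: there is no preferred direction
variance = positions.var()         # grows in proportion to the number of steps

print("mean position:", mean_position)
print("variance of positions:", variance, "after", n_steps, "steps")
```

The average position stays close to where the drop started, while the spread (variance) of the cloud comes out close to the number of steps taken. That steady, predictable spreading is the macro-level pattern, even though every individual path is random.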

Miller, Streveler, and Olds give another example that neatly illustrates different ways of understanding a physical process at the three different scales: macroscopic, microscopic, and molecular. Their second example is of momentum transport. An example of momentum transport is pumping a fluid through a pipe. As a brief overview, when a fluid like water is moved through a pipe under pressure the velocity of the fluid is highest at the center of the pipe and lowest near the walls. This is a velocity gradient, often called a “velocity profile”, where you have this cross-sectional view of a pipe showing the velocity vectors of different magnitudes at different positions along the radius of the pipe. When you have this velocity gradient there is also a transfer of momentum from areas of high momentum to areas of low momentum. So in this case momentum will transfer from the center of the pipe toward the walls of the pipe.

The model for momentum transport has a similar structure to the model for mass transport. Recall that in Fick’s Law of Diffusion, mass transport, i.e. diffusion, was proportional to the concentration gradient and the constant of proportionality was this property called diffusivity. The equation was:

J = -D(dC/dx)

The corresponding model for momentum transport is Newton’s law of viscosity (Newton had a lot of laws). The equation for that is:

τ = -μ(dv/dx)

Where

τ is shear stress, the flux of momentum transport
v is velocity
x is position
μ is viscosity, the applicable constant of proportionality in this case

So in Newton’s law of viscosity the momentum transport, i.e. shear stress, is proportional to the velocity gradient and the constant of proportionality is viscosity. You have higher momentum transport with a higher gradient, i.e. change, in velocity along the radius of the pipe. Why does that happen?
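Before getting to the why, here is a quick numerical sketch of the equation itself. I'm assuming a parabolic velocity profile, which is the standard result for steady laminar flow in a round pipe, and the radius, centerline velocity, and viscosity are illustrative numbers I picked, not values from the paper.

```python
import numpy as np

mu = 1.0e-3     # viscosity in Pa*s (roughly water near room temperature; illustrative)
R = 0.01        # pipe radius in m (made up)
v_max = 0.1     # centerline velocity in m/s (made up)

r = np.linspace(0.0, R, 201)            # radial position, from centerline to wall
v = v_max * (1.0 - (r / R) ** 2)        # assumed parabolic laminar velocity profile

# Newton's law of viscosity with the radius r playing the role of position x.
tau = -mu * np.gradient(v, r, edge_order=2)

tau_wall_exact = 2.0 * mu * v_max / R   # analytic value of -mu * dv/dr at r = R

print("shear stress at centerline:", tau[0])
print("shear stress at wall:", tau[-1], "analytic:", tau_wall_exact)
```

The momentum flux is zero at the centerline, where the profile is flat, and largest at the wall, where the velocity changes most steeply.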

So they actually asked some students to explain this in their own words to see on what geometric scales they would make their descriptions. The prompt was: “Explain in your own words (no equations) how momentum is transferred through a fluid via viscous action.” And they evaluated the descriptions as being of one of the three scales (or a mixture of them) using a rubric. So here are examples from the rubric of explanations at each of those scales:

Macroscopic explanation: The pressure at the pipe inlet is increased (usually by pumping) which causes the fluid to move through the pipe. Friction between fluid and pipe wall results in a pressure drop in the direction of flow along the pipe length. The fluid at the wall does not move (no-slip condition) while fluid furthest away from the wall (at the pipe centerline) flows the fastest, so momentum is transferred from the center (high velocity and high momentum) to the wall (no velocity and no momentum).

Microscopic explanation: Fluid in laminar flow moves as a result of an overall pressure drop causing a velocity profile to develop (no velocity at the wall, maximum velocity at the pipe centerline). Therefore, at each pipe radius, layers of fluid flow past each other at different velocities. Faster flowing layers tend to speed up [and move] slower layers along resulting in momentum transfer from faster layers in the middle of the pipe to slower layers closer to the pipe walls.

Molecular explanation: Fluid molecules are moving in random Brownian motion until a pressure is applied at the pipe inlet causing the formation of a velocity gradient from centerline to pipe wall. Once the gradient is established, molecules that randomly migrate from an area of high momentum to low momentum will take along the momentum they possess and will transfer some of it to other molecules as they collide (increasing the momentum of the slower molecules). Molecules that randomly migrate from low to high momentum will absorb some momentum during collisions. As long as the overall velocity gradient is maintained, the net result is that momentum is transferred by molecular motion from areas of high momentum to areas of low momentum and ultimately to thermal dissipation at the pipe wall.

With these different descriptions as we move from larger to smaller scales we also move from causal to emergent explanations. At the macroscopic level we’re looking at bulk motion of fluid. At the microscopic scale it’s getting a little more refined. We’re thinking in terms of multiple layers of fluid flow. We’re seeing the gradient at a higher resolution. And we can think of these layers of fluid rubbing past each other, with faster layers dragging slower layers along, and slower layers slowing faster layers down. It’s like spreading out a deck of cards. In these explanations momentum moves along the velocity gradient because of a kind of drag along the radial direction.

But with the molecular description we leave behind that causal explanation of things being dragged along. There’s only one major top-down, causal force in this system and that’s the pressure or force that’s being applied in the direction of the length of the pipe. With a horizontal pipe we can think of this force being applied along its horizontal axis. But there’s not a top-down, external force being applied along the vertical or radial axis of the pipe. So why does momentum move from the high-momentum region in the center of the pipe to the low-momentum region near the pipe wall? It’s because there’s still random motion along the radial or vertical axis, which is perpendicular to the direction of the applied pressure. So molecules are still moving randomly between regions with different momentum. So if we think of these layers, these cylindrical sheets that are dividing up the sections of the pipe at different radii, these correspond to our cube voxels in the diffusion example. Molecules are moving randomly between these sheets. The state of each layer is characterized by the momentum of the molecules in it. As molecules move between layers and collide with other molecules they transfer momentum. As in the diffusion example the overall pattern that emerges here is the result of random motion of the individual molecular components.

So, does this matter? My answer to that question is usually that “it”, whatever it may be, matters when and where it matters. Miller, Streveler, and Olds say: “If the macroscopic and microscopic models are successful in describing the global behavior of simple systems, why should we care if students persist in incorrectly applying causal models to processes such as dye diffusion into water? The answer is simple – the causal models can predict some but not all important behavioral characteristics of molecular diffusional processes.” And I think that’s a good criterion for evaluation. I actually wouldn’t say, as they do, that the application of causal models is strictly “incorrect”. But I take their broader point. Certainly macroscopic and causal models have their utility. For one thing, I think they’re easier to understand starting off. But as with all models, you have to keep in mind their conditions of applicability. Some apply more broadly than others.

One thing to notice about these transport models is that they have proportionality constants. And whenever you see a constant like that in a model it’s important to consider what all might be wrapped up into it because it may involve a lot of complexity. And that is the case with both the diffusion coefficient and viscosity. Both are heavily dependent on specific properties of the system. For the value of viscosity you have to look it up for a specific substance and then also for the right temperature range. Viscosity varies widely between different substances. And even for a single substance it can still vary widely with temperature. For diffusivity you have to consider not only one substance but two, at least. If you look up a coefficient of diffusivity in a table it’s going to be for a pair of substances. And that will also depend on temperature.

At a macroscopic scale it’s not clear why the rates of mass transport and momentum transport would depend on temperature or the type of substances involved. But at a molecular scale you can appreciate how different types of molecules would have different sizes and would move around at different velocities at different temperatures and how that would all play into the random movements of particles and the interactions between particles that produce, from that molecular scale, the emergent processes of diffusion and momentum transport that we observe at the macroscopic scale.

Once you open up that box, to see what is going on behind these proportionality constants, it opens up a whole new field of scientific work to develop – you guessed it – more and better models to qualify and quantify these phenomena.

Category Theory

Jakob and Todd discuss category theory, an important field in modern mathematics that focuses on the relations (morphisms) between mathematical objects. We discuss the importance of abstraction and the development in the history of mathematics beyond solving particular problems to studying the general nature of mathematical structures as such, the kinds of problems that can and can’t be solved, their properties, etc. We also consider the significance of a relation-centered approach to other fields, how things like languages, theories, and beliefs can be analyzed by the relations between their constituent elements.

For the visual aids referred to in the discussion see the video version on YouTube.

Stokes’ Beautiful Theorem: Differential Forms, Boundaries, Exterior Derivatives, and Manifolds

Stokes’ Theorem in its general form is a remarkable theorem with many applications in calculus, starting with the Fundamental Theorem of Calculus. The pattern in each case is that the integral of a function over a region is equal to the integral of a related function over the boundary of the region. We can use information about the boundary of a region to get information about the entire region, which is both useful and mathematically elegant.

For the visual aids that go with this episode see the video version on YouTube.

I’ve been reviewing some calculus recently but I’ve been approaching it in a different way than I have before, like when I learned it in school to be an engineer. Instead of studying it for the purpose of solving specific problems I’m looking at it more from a high level, trying to understand it conceptually to see the general structure and patterns. It’s in line with my philosophical penchant to take things up a level or look behind things at another level of abstraction. One of the patterns I’ve seen across calculus has been something that falls under the generalized version of Stokes’ Theorem. I find the generalized Stokes’ Theorem quite beautiful, in that way that mathematics can be beautiful. It can express concisely and compactly a concept that has broad and varied applications in its more particular forms.

I’ll go through the generalized Stokes’ Theorem and some of its special applications. Since a lot of this is better understood visually I’ve made a YouTube video to go along with this, so those listening to the podcast might want to check it out as well if this stuff is hard to picture.

The generalized Stokes’ Theorem states that the integral of some differential form of dimension k-1 over the boundary of some orientable manifold of dimension k is equal to the integral of that differential form’s exterior derivative over the whole of that orientable manifold.
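Written out in symbols, that statement is very compact. The letters are my own choice of notation here: Ω for the orientable manifold of dimension k, ∂Ω for its boundary, ω for the differential form of dimension k-1, and dω for its exterior derivative.

```latex
\int_{\partial \Omega} \omega = \int_{\Omega} d\omega
```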

The four key concepts here (the ones listed in the subtitle of this episode) are:

1. Differential forms
2. Boundaries
3. Exterior derivatives
4. Manifolds

That’s very abstract, not that there’s anything wrong with that. We need to be abstract to be general. The key concept, the idea I want to drive home with this episode, is that under certain conditions information about a boundary can give you information about the entire region that it bounds. In the generalized form we’re getting information about a whole manifold from the boundary of that manifold. That will be the case in all these examples. But now let’s look at examples to illustrate. Interestingly enough this comes into play from the very beginning of calculus, with the Fundamental Theorem of Calculus.

The Fundamental Theorem of Calculus

As a very quick crash course in calculus for the uninitiated, in calculus the two most important operations are derivatives and integrals. When you take the derivative of a function it produces another function that tells you the instantaneous rate of change of the original function at any point. So for example, if you have an equation for distance from some starting point with respect to time, the derivative of that function will tell you what the velocity is at any point in time. Very useful. You can also do the opposite of that, which is an antiderivative, or integral. Say you were starting with the function of velocity with respect to time. You could take the antiderivative of that function and get a function for the position at any point in time. You’d just need to know what your starting point was and add that to it. One of the most important applications of these operations uses the Fundamental Theorem of Calculus.

The Fundamental Theorem of Calculus is probably the most recognizable thing to calculus students, even if they don’t remember that that’s what it was called. It’s the principle behind finding the area under a curve. For example, if you have a function for the velocity with respect to time plotted on a graph, the area under that curve, between the curve and the horizontal axis, sweeps out an area that will give you the value of the distance traveled between any two points in time you choose. The cool thing is that you only need to know the values of the antiderivative at your starting time and at your ending time. You don’t need to know the values in between them.

So finding the area under a curve, finding it analytically as opposed to numerically, is really quite a remarkable thing if you think about it. Because you’re basically taking some function, doing an operation on it, and then applying the resulting function only to the two points bounding the region you’re interested in. If you’re integrating from point a to point b you’re only paying attention to points a and b, not to any of the points in between them. But you’re still getting information about the whole region. You’re getting the area under all those points in between a and b. I think that’s quite remarkable. And that’s what happens in each particular version of Stokes’ Theorem. You’re able to get information about an entire region from its boundaries.

We might forget this sometimes if we’re using numerical methods more than analytical methods of integration. Using numerical methods like the trapezoidal or the rectangle method we actually do go in and add up all the regions between the boundaries that approximate the total area under the curve. But for analytical integration we don’t need to do that. We only need the antiderivative and the boundary points. And that tells us everything about the region in between those points. It seems almost magical.
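Here is a small sketch of that contrast. The velocity function v(t) = 3t² and its antiderivative t³ are just a convenient example I picked, and `trapezoid` is a plain hand-written trapezoidal rule rather than a library routine.

```python
def velocity(t):
    """Example velocity function, v(t) = 3 t^2."""
    return 3.0 * t ** 2

def position(t):
    """An antiderivative of the velocity, x(t) = t^3."""
    return t ** 3

def trapezoid(f, a, b, n):
    """Approximate the integral of f from a to b using n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)    # every interior point gets visited
    return total * h

a, b = 1.0, 3.0
analytic = position(b) - position(a)          # uses only the two boundary points
numeric = trapezoid(velocity, a, b, 1000)     # uses a thousand slices in between

print("analytic:", analytic)
print("numeric: ", numeric)
```

The numerical method has to evaluate the velocity at every interior grid point and still only approximates the answer. The analytic method touches nothing but the two endpoints.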

Circling back to those four key concepts in the generalized Stokes’ Theorem, with the Fundamental Theorem of Calculus what we have is a differential form of dimension 0 and a manifold of dimension 1. For a function, lower case f(x), the antiderivative, upper case F, is the (0-form) differential form. The closed interval from a to b is a 1-dimensional manifold. (For the sake of simplicity think of a manifold as a surface of some dimension. A 1-dimensional manifold here being a line.) The boundaries a and b are 0-dimensional; they’re points. And the exterior derivative is lower case f(x)dx.
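Putting those pieces together, the Fundamental Theorem of Calculus reads as follows. The "integral" of F over the two boundary points is just F(b) minus F(a), with the sign coming from the orientation of the interval.

```latex
\int_{[a,b]} f(x)\,dx = \int_{[a,b]} dF = \int_{\partial [a,b]} F = F(b) - F(a)
```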

Green’s Theorem

What other applications are there of this generalized Stokes’ Theorem? There’s also Green’s Theorem. Green’s Theorem has a similar form to the Fundamental Theorem of Calculus but instead of looking at a curve bounded by points we’re looking at a plane region bounded by a curve. With the Fundamental Theorem of Calculus we have the integral of a 0-dimensional differential form over the boundary of a 1-dimensional manifold. With Green’s Theorem we have the integral of a 1-dimensional differential form over the boundary of a 2-dimensional manifold.

As with all these theorems we’re looking at, in Green’s Theorem we are able to determine features of a region by looking at its boundaries. If we have a closed curve C that surrounds a region D we can figure out the area of D from the closed curve C. This is actually how planimeters work. And planimeters are pretty cool. A planimeter is a device that you can use to trace out, with a mechanical arm, a curve of any shape, and when you return to the position you started at it calculates the area that that shape encloses. This is exactly what Green’s theorem does.

So now to state Green’s Theorem: Say we have a closed curve C, and functions L and M defined on a region containing D, the region enclosed by C. Green’s Theorem states that the double integral of the partial derivative of M with respect to x, minus the partial derivative of L with respect to y, ∂M/∂x – ∂L/∂y, over the region D is equal to the line integral of Ldx + Mdy over the curve C, traversed counterclockwise. This is that perimeter to area connection, where we can get an area, normally found using a double integral, from a perimeter, found using a line integral. The differential form is Ldx + Mdy. The region D is the two-dimensional manifold. The curve C is the 1-dimensional boundary of the manifold. And the exterior derivative is (∂M/∂x – ∂L/∂y) dx dy.
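As a rough sketch of the planimeter idea: choosing L = -y/2 and M = x/2 turns the line integral of Ldx + Mdy into one half of the integral of x dy - y dx around C, and that is exactly the enclosed area. For a polygon the line integral becomes a simple sum over the edges, often called the shoelace formula. The function name and the test shapes are my own.

```python
def enclosed_area(vertices):
    """Area enclosed by a closed polygon, computed from its boundary alone.

    This is a discrete version of (1/2) * line integral of (x dy - y dx).
    Vertices listed counterclockwise give a positive area.
    """
    area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]    # wrap around to close the curve
        area += x0 * y1 - x1 * y0
    return 0.5 * area

rectangle = [(0, 0), (3, 0), (3, 2), (0, 2)]    # 3 by 2 rectangle, counterclockwise
triangle = [(0, 0), (4, 0), (0, 3)]             # right triangle with legs 4 and 3

print(enclosed_area(rectangle))
print(enclosed_area(triangle))
```

Tracing the same boundary clockwise flips the sign of the result, which is the orientation of the manifold showing up in the arithmetic.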

The Divergence Theorem

We can bump all this up another dimension to get Gauss’s Theorem, also known as the Divergence Theorem. With Green’s Theorem we have the integral of a 1-dimensional differential form over the boundary of a 2-dimensional manifold. With the Divergence Theorem we have the integral of a 2-dimensional differential form over the boundary of a 3-dimensional manifold.

With the Divergence Theorem we’re able to get information about a 3-dimensional region from its 2-dimensional boundary. Call the 3-dimensional region V and the 2-dimensional surface S. Then we have a vector field F. The Divergence Theorem states that the triple integral or volume integral of the divergence of vector field F over the 3-dimensional region V is equal to the surface integral of F over the surface S.

In terms of the generalized Stokes’ Theorem, with the Divergence Theorem the differential form is F·dS. The region V is the 3-dimensional manifold. The surface S is the 2-dimensional boundary of the manifold. And the exterior derivative is the divergence div F dV.

One way to understand this is that the net flux out of the region gives the sum of all sources of the field in a region. And this has many physical applications. For example, the Divergence Theorem has application to the first two of Maxwell’s four equations in physics. All four of Maxwell’s Equations have an integral form and a differential form. But the integral and differential forms are really equivalent. For the first two of Maxwell’s Equations the Divergence Theorem shows the equivalence between these two forms.

The first of Maxwell’s Equations, Gauss’s Law, relates an electric field to its source charge. This is a perfect application for the Divergence Theorem because the divergence operator gives information about sources and sinks. And an electric charge is a source. Gauss’s Law states that the net outflow of the electric field through any closed surface is proportional to the charge enclosed by the surface. In the integral form the way this is expressed is that the surface integral of electric field E over a closed surface is equal to the enclosed charge divided by the permittivity of free space. In the differential form the way this is expressed is that the divergence of the electric field is equal to the charge density divided by the permittivity of free space.

The Divergence Theorem tells us that the triple integral of the divergence of electric field E over a volume is equal to the surface integral of the electric field E over the surface boundary. Since by the differential form of Gauss’s Law the divergence of the electric field is equal to the charge density divided by the permittivity of free space, if we take the triple integral of both sides we see that the triple integral of the divergence is equal to the charge divided by the permittivity of free space. By the integral form of Gauss’s Law the surface integral of electric field E is also equal to the charge divided by the permittivity of free space. So both the surface integral of the electric field E over the surface boundary and the triple integral of the divergence of the electric field E are equal to the charge divided by the permittivity of free space, and so they are equal to each other, which is exactly what the Divergence Theorem says. So these two forms are actually equivalent. 
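Written out in symbols, the chain of equalities in this argument looks roughly like the following. The notation is an assumption for illustration: ρ for charge density, ε₀ for the permittivity of free space, Q for the total charge enclosed by S, and V for the volume bounded by S.

```latex
\oint_S \mathbf{E} \cdot d\mathbf{S}
  = \iiint_V (\nabla \cdot \mathbf{E}) \, dV
  = \iiint_V \frac{\rho}{\varepsilon_0} \, dV
  = \frac{Q}{\varepsilon_0}
```

The first equality is the Divergence Theorem, the second is the differential form of Gauss’s Law, and the end result is the integral form.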

The second of Maxwell’s Equations, Gauss’s Law for Magnetism, has a similar form but demonstrates that there are no magnetic monopoles. The surface integral of a magnetic field B over any closed surface S is always equal to 0. Magnetic field lines neither begin nor end but make loops or extend to infinity and back. Any magnetic field line that enters a given volume must somewhere exit that volume. In the integral form the way this is expressed is that the surface integral of magnetic field B over a closed surface is equal to 0. In the differential form the way this is expressed is that the divergence of the magnetic field is equal to zero. The Divergence Theorem tells us that the triple integral of the divergence of magnetic field B over a volume is equal to the surface integral of the magnetic field B over the surface boundary. If we take the triple integral of the divergence of the magnetic field this is still equal to zero, as is the surface integral of the magnetic field over the closed surface. So again these two forms are equivalent.

Kelvin-Stokes’ Theorem

The last of the particular applications of the generalized Stokes’ Theorem is also called Stokes’ Theorem or Kelvin-Stokes’ Theorem. With Kelvin-Stokes’ Theorem as with Green’s Theorem we have the integral of a 1-dimensional differential form over the boundary of a 2-dimensional manifold, but in R3, i.e. 3-dimensional space. Given a vector field F the theorem states that the double integral or surface integral of the curl of the vector field over some surface is equal to the line integral of the vector field around the boundary of that surface. Here again, the boundary gives us information about the region inside it. 

In terms of the generalized Stokes’ Theorem, with Kelvin-Stokes’ Theorem the differential form is F·dr. The surface S is the 2-dimensional manifold. The curve C is the 1-dimensional boundary of the manifold. And the exterior derivative is curl F·dS.

Curl is another vector operator, like divergence, and it’s easier to get the gist of it from physical examples, which we can get from the other two of Maxwell’s Equations.

The third of Maxwell’s Equations is also known as Faraday’s Law of Induction. Faraday’s Law describes how a time varying magnetic field creates, or induces, an electric field, which is the reason we’re able to generate electricity from turbines. In the integral form the way this is expressed is that the line integral of electric field E is equal to the negative derivative with respect to time of the surface integral of the magnetic field B. In the differential form the way this is expressed is that the curl of the electric field E is equal to the negative derivative of the magnetic field B with respect to time. Kelvin-Stokes’ Theorem tells us that the surface integral of the curl of the electric field E over some surface is equal to the line integral of the electric field E around the boundary of that surface. If we take the surface integral of the curl of the electric field E this is equal to the surface integral of the negative partial derivative of the magnetic field B with respect to time. And by the integral form of Faraday’s Law this is also equal to the line integral of the electric field around the surface boundary. So these two forms are also equivalent.
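In symbols, the argument for Faraday’s Law can be sketched as below. This assumes a surface S that does not change with time, with boundary curve C, so that the time derivative can be moved outside the surface integral.

```latex
\oint_C \mathbf{E} \cdot d\mathbf{r}
  = \iint_S (\nabla \times \mathbf{E}) \cdot d\mathbf{S}
  = \iint_S \left( -\frac{\partial \mathbf{B}}{\partial t} \right) \cdot d\mathbf{S}
  = -\frac{d}{dt} \iint_S \mathbf{B} \cdot d\mathbf{S}
```

The first equality is Kelvin-Stokes’ Theorem and the second is the differential form of Faraday’s Law. Together they give the integral form.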

The fourth of Maxwell’s Equations is also known as Ampère’s Law. Ampère’s Law describes how a magnetic field can be generated by (a) an electric current and (b) a changing electric field. In the integral form the way this is expressed is that the line integral of magnetic field B around a closed curve is equal to the permeability of free space times the sum of two terms: the surface integral of current density J, and the permittivity of free space times the derivative with respect to time of the surface integral of the electric field E. In the differential form the way this is expressed is that the curl of the magnetic field B is equal to the permeability of free space times the sum of the current density J and the permittivity of free space times the partial derivative of the electric field E with respect to time. Kelvin-Stokes’ Theorem tells us that the surface integral of the curl of the magnetic field B over some surface is equal to the line integral of the magnetic field B around the boundary of that surface. If we take the surface integral of the curl of the magnetic field B, by the differential form this is equal to the permeability of free space times the sum of the surface integral of current density J and the permittivity of free space times the derivative with respect to time of the surface integral of the electric field E. And by the integral form of Ampère’s Law this is also equal to the line integral of magnetic field B around the surface boundary. So these two forms are also equivalent.
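The same argument in symbols, as a sketch, again assuming a surface S fixed in time with boundary curve C, and writing μ₀ for the permeability and ε₀ for the permittivity of free space. Note that the permeability multiplies both the current term and the changing electric field term.

```latex
\oint_C \mathbf{B} \cdot d\mathbf{r}
  = \iint_S (\nabla \times \mathbf{B}) \cdot d\mathbf{S}
  = \mu_0 \iint_S \mathbf{J} \cdot d\mathbf{S}
    + \mu_0 \varepsilon_0 \frac{d}{dt} \iint_S \mathbf{E} \cdot d\mathbf{S}
```

The first equality is Kelvin-Stokes’ Theorem applied to B, and the second substitutes the differential form of Ampère’s Law for the curl of B.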

Maxwell’s Equations and Differential Forms

All of Maxwell’s equations actually simplify considerably in the language of differential forms. I’m just going to brush over this quickly without going into detail. We can describe both the electric and magnetic fields jointly by a 2-form, F, in a 4-dimensional spacetime manifold. And we can describe electric current by a 3-form, J. Then we’ll need the exterior derivative operator, d, and the Hodge star operator, *. And Maxwell’s Equations are just:

dF = 0
d*F = J

That’s it. And one benefit of this is that thinking of the equations in terms of differential forms lets them generalize more easily to manifolds and relativistic settings.

To summarize, the general pattern with all these forms of Stokes’ Theorem is that the integral of a function over a region is equal to the integral of a related function over the boundary of the region. We can get information about an entire region from its boundary. And this is something that applies in interesting ways at different dimensions. Mathematically it’s aesthetically satisfying and elegant.