Set Theory

Jakob and Todd talk about set theory, its historical origins, Georg Cantor, trigonometric series, cardinalities of number systems, the continuum hypothesis, cardinalities of infinite sets, set theory as a foundation for mathematics, Cantor’s paradox, Russell’s paradox, axiomatization, the Zermelo–Fraenkel axiomatic system, the axiom of choice, and the understanding of mathematical objects as “sets with structure”.

Philosophy of Structure, Part 3: Chemistry

Part 3 in this series on the philosophy of structure looks at examples and general principles of structure in chemistry. Subjects covered include quantum mechanics, the Schrödinger equation, wave functions, orbitals, molecules, functional groups, and the multiple levels of structure in proteins. General principles discussed include the nature of functions, the embedding of lower-level structures into higher-level structures, and the repeated use of a limited number of lower-level structures in the formation of higher-level structures.

In this third episode on the philosophy of structure I’d like to look at examples of structure in the field of chemistry. I’d like to see how some of the general principles of structure discussed in previous episodes apply to chemistry and to see what new general principles we can pick up from the examples of chemical structures. I’ll proceed along different scales, from the smallest and conceptually most fundamental components of chemical structure up to the larger, multicomponent chemical structures. For basic principles I’ll start with quantum mechanics, the Schrödinger equation, and its wave function solutions, which constitute atomic and molecular orbitals. From there I’ll look at different functional groups that occur repeatedly in molecules. And lastly I’ll look at the multiple levels of structure of proteins, the embedding of chemical structures, and the use of repeatable units in the formation of multicomponent chemical structures.

One aspect from previous discussions that won’t really show up in chemistry is an aesthetic dimension of structure. That’s not to say that chemical structures lack beauty. I find them quite beautiful and the study and application of chemical structures has actually been the primary subject of my academic and professional work. In other words, I probably find chemistry more aesthetically satisfying than most people commonly would. But what I’m coming to think of as the philosophical problem of systematizing the aesthetic dimension of structure, in fields like music, art, and literature, isn’t so directly applicable here. I’ll get back to that problem in future episodes. The aesthetic dimension is not so intrinsic to the nature of the subject in the case of chemistry.

So let’s start getting into chemical structures by looking at the smallest and most conceptually fundamental scale.

Matter Waves

There is, interestingly enough, an intriguing point of commonality between music and chemistry at the most fundamental level; and that is in the importance of waveforms. Recall that the fundamental building block of a musical composition is a sound wave, a propagation of variations in the local pressure in which parts of the air are compacted and parts of the air are rarified. Sound waves are governed by the wave equation, a second-order partial differential equation, and its solutions, in which a series of multiple terms are added together in a superposition, with each individual term in that summation representing a particular harmonic or overtone. There are going to be a lot of similarities to this in the basic building blocks of chemical structures.

One of the key insights and discoveries of twentieth century science was that matter also takes the form of waves. This is foundational to quantum mechanics and it is known as the de Broglie hypothesis. This was a surprising and strange realization but it goes a long way in explaining much of what we see in chemistry. Because a particle is a wave it also has a wavelength. 

Recall that in acoustics, with mechanical waves propagating through a medium, wavelength is related to the frequency and speed of the wave’s propagation. That relation is:

λ = v/f

Where λ is the wavelength, f is frequency, and v is the wave propagation velocity. With this kind of mechanical wave the wave is not a material “thing” but a process, a disturbance, occurring in a material medium.

But with a matter wave the wave is the matter itself. And the wavelength of the matter wave is related to the particle’s momentum, a decidedly material property. A particle’s wavelength is inversely proportional to its momentum. This relation is stated in the de Broglie equation:

λ = h/p

In which λ is the wavelength, h is a constant called Planck’s constant (6.626×10−34 J·s), and p is momentum. Momentum is the product of mass and velocity:

p = mv

Because the wavelength of a matter wave is inversely proportional to momentum the wavelength for the matter waves of macroscopic particles, the kinds of objects we see and interact with in our normal experience, is going to be very, very short, so as to be completely negligible. But for subatomic particles their wavelengths are going to be comparable to the scale of the atom itself, which will make their wave nature very significant to their behavior.
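To get a feel for the scale difference, here’s a quick Python sketch of the de Broglie relation (the baseball’s mass and speed are just illustrative round numbers; the electron’s speed is roughly its speed in a hydrogen atom):

```python
h = 6.626e-34  # Planck's constant, J*s

def de_broglie_wavelength(mass_kg, velocity_m_s):
    """Wavelength from the de Broglie relation: lambda = h / (m * v)."""
    return h / (mass_kg * velocity_m_s)

# A thrown baseball: ~0.145 kg at ~40 m/s
print(de_broglie_wavelength(0.145, 40.0))       # ~1.1e-34 m: utterly negligible

# An electron: 9.109e-31 kg at ~2.2e6 m/s
print(de_broglie_wavelength(9.109e-31, 2.2e6))  # ~3.3e-10 m: atomic scale
```

The baseball’s wavelength is vastly smaller than even an atomic nucleus, while the electron’s is on the order of the atom itself.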

One interesting consequence of the wave nature of matter is that the precision of simultaneous values for momentum and position of a matter wave is limited. This is known as the Uncertainty Principle. There’s actually a similar limit to the precise specification of both wavelength and position for waves in general, i.e. for any and all waves. But because wavelength is related to momentum in matter waves this limitation gets applied to momentum as well.

Recall that with sound waves a musical pitch can be a superposition of multiple frequencies or wavelengths. This superposition is expressed by the multiple terms in a Fourier Series. Essentially any reasonably well-behaved function can be approximated over an interval using a Fourier Series, expressed in terms of added sinusoidal (oscillating) waves. A function that is already sinusoidal can be matched quite easily. The Fourier Series can converge on more complicated functions as well, but those require more terms (that’s important). In the case of musical pitches the resulting functions were periodic waves that repeated endlessly. But a Fourier Series can also describe pulses that are localized to specific regions. The catch is that more localized pulses, confined to tighter regions, require progressively more terms in the series, which means a greater number of wavelengths.

Bringing this back to matter waves, these same principles apply. Under the de Broglie formula wavelength is related to momentum. A pure sine wave that repeats endlessly has only one wavelength. But it also covers an infinite region. As a matter wave this would be a perfect specification of momentum with no specification of position. A highly localized pulse is confined to a small region but requires multiple terms and wavelengths in its Fourier Series. So its position is highly precise but its momentum is much less precise.
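We can see this tradeoff numerically. The sketch below (my own illustration, using a Gaussian pulse of adjustable width) narrows the pulse and measures how the spread of its frequency content grows:

```python
import numpy as np

def spectral_spread(pulse_width, n=4096, dx=0.01):
    """RMS spread of the frequency content of a Gaussian pulse of a given width."""
    x = (np.arange(n) - n / 2) * dx
    pulse = np.exp(-x**2 / (2 * pulse_width**2))   # a localized pulse
    spectrum = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(n, d=dx)
    weight = spectrum / spectrum.sum()             # normalize magnitudes into weights
    mean_f = (freqs * weight).sum()
    return np.sqrt(((freqs - mean_f) ** 2 * weight).sum())

for width in (1.0, 0.2, 0.05):
    print(width, spectral_spread(width))   # narrower pulse -> broader frequency spread
```

Tightening the pulse in space broadens its spectrum, and vice versa: exactly the tradeoff between position and wavelength just described.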

The limit of the simultaneous specification of momentum and position for matter waves is given by the equation:

σxσp ≥ h/(4π)

Where σx is the standard deviation of position, σp is the standard deviation of momentum, and h is Planck’s constant. The product of these two standard deviations has a lower limit. At this lower limit it’s only possible to decrease the standard deviation of one by increasing the standard deviation of the other. And this is a consequence of the wave nature of matter.
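As a worked example (with round-number values), confining an electron to about the width of an atom already forces a large spread in momentum, and hence in velocity:

```python
import math

h = 6.626e-34     # Planck's constant, J*s
m_e = 9.109e-31   # electron mass, kg

sigma_x = 1e-10   # confine the electron to ~0.1 nm, about the size of an atom
sigma_p = h / (4 * math.pi * sigma_x)   # minimum sigma_p from sigma_x * sigma_p >= h/(4*pi)

print(sigma_p)        # ~5.3e-25 kg*m/s
print(sigma_p / m_e)  # ~5.8e5 m/s: an enormous velocity uncertainty
```

For a macroscopic mass the same σp would amount to a laughably small velocity uncertainty, which is why we never notice the principle in everyday life.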

The most important application of these wave properties and quantum mechanical principles in chemistry is with the electron. Protons and neutrons are also important particles in chemistry, and significantly more massive than electrons. But it’s with the electrons that most of the action happens. Changes to protons and neutrons are the subject of nuclear chemistry, which is interesting but not something we’ll get into this time around. In non-nuclear chemical reactions it’s the electrons that are being arranged into the various formations that make up chemical structures. The behavior of an electron is described by a wave function, and that wave function is governed by the Schrödinger equation.

The Schrödinger equation is quite similar to the classical wave equation that governs sound waves. Recall that the classical wave equation is:

d2u/dx2 = (1/v2) * d2u/dt2 

Where u is the wave displacement from the mean value, x is distance, t is time, and v is velocity. A solution to this equation can be found using a method of separation of variables. The solution u(x,t) can be written as the product of a function of x and a sinusoidal function of time. We can write this solution as:

u(x,t) = ψ(x) * cos (2πft)

Where f is the frequency of the wave in cycles per unit time and ψ(x) is the spatial factor of the amplitude of u(x,t), the spatial amplitude of the wave. Substituting ψ(x) * cos (2πft) into the differential wave equation gives the following equation for the spatial amplitude ψ(x).

d2ψ/dx2 + 4π2f2/v2 * ψ(x) = 0

And since frequency multiplied by wavelength is equal to velocity (fλ = v) we can rewrite this in terms of wavelength, λ:

d2ψ/dx2 + 4π2/λ2 * ψ(x) = 0

So far this is just applicable to waves generally. But where things get especially interesting is the application to matter waves, particularly to electrons. Recall from the de Broglie formula that:

λ = h/p

In which h is a constant called Planck’s constant (6.626×10−34 J·s) and p is momentum. We can express the total energy of a particle in terms of momentum by the equation:

E = p2/2m + V(x)

Where E is total energy, m is mass, and V(x) is potential energy as a function of distance. Using this equation we can also express momentum in these terms:

p = {2m[E – V(x)]}1/2

And since,

λ = h/p

The differential equation becomes

d2ψ/dx2 + 2m/ħ2 * [E – V(x)] ψ(x) = 0

Where

ħ = h/(2π)

This can also be written as

-ħ2/2m * d2ψ/dx2 + V(x) ψ(x) = E ψ(x)

This is the Schrödinger equation. Specifically, it’s the time-independent Schrödinger equation. So what do we have here? The relationship here parallels the classical case: the classical wave equation is a differential equation whose solution u(x,t) characterizes a mechanical wave. The Schrödinger equation is also a differential equation, and its solution, ψ(x), is a wave function that characterizes a matter wave. It describes a particle of mass m moving in a potential field described by V(x). Of special interest to chemistry is the description of an electron moving in the potential field around an atomic nucleus.

Let’s rewrite the Schrödinger equation using a special expression called an operator. An operator is a symbol that tells you to do something to whatever follows the symbol. The operator we’ll use here is called a Hamiltonian operator, which has the form:

H = -ħ2/2m * d2/dx2 + V(x)

Where H is the Hamiltonian operator. It corresponds to the total energy of a system, including terms for both the kinetic and potential energy. We can express the Schrödinger equation much more concisely in terms of the Hamiltonian operator, in the following form:

H ψ(x) = E ψ(x)

There are some special advantages to expressing the Schrödinger equation in this form. One is that this takes the form of what is called an eigenvalue problem. An eigenvalue problem is one in which an operator is applied to an eigenfunction and the result returns the same eigenfunction, multiplied by some constant called the eigenvalue. In this case the operator is the Hamiltonian, H. The eigenfunction is the wave function, ψ(x). And the eigenvalue is the observable energy, E. These are all useful pieces of information to have that relate to each other very nicely, when expressed in this form.
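To see the eigenvalue structure concretely, here’s a minimal numerical sketch (in dimensionless units with ħ = m = 1, for a particle in a box; this is a toy model, not how chemists actually treat atoms): discretize the Hamiltonian as a matrix and ask a linear algebra routine for its eigenvalues and eigenfunctions.

```python
import numpy as np

# Particle in a box of length L = 1 (V = 0 inside, infinite walls), hbar = m = 1.
n = 500
dx = 1.0 / (n + 1)

# Second derivative d2/dx2 by central differences: a tridiagonal matrix.
d2 = (np.diag(np.full(n, -2.0)) +
      np.diag(np.ones(n - 1), 1) +
      np.diag(np.ones(n - 1), -1)) / dx**2

H = -0.5 * d2                 # the Hamiltonian as a matrix (kinetic term only, V = 0)
E, psi = np.linalg.eigh(H)    # solve H psi = E psi for eigenvalues and eigenfunctions

# Compare with the exact box energies E_k = (k * pi)^2 / 2 in these units.
for k in (1, 2, 3):
    print(E[k - 1], (k * np.pi) ** 2 / 2)
```

The numerical eigenvalues land very close to the exact ones, and each eigenfunction is a standing wave: the allowed energies fall out of the eigenvalue problem exactly as the H ψ(x) = E ψ(x) form suggests.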

Orbitals

In chemistry the wave functions of electrons in atoms and molecules are called atomic or molecular orbitals. And these are also found using the Schrödinger equation; they are solutions to the Schrödinger equation. The inputs to these wave functions are coordinates for points in space. The output from these wave functions, ψ, is some value, whose meaning is a matter of interpretation. The prevailing interpretation is the Born Rule, which gives a probabilistic interpretation. Under the Born Rule the value of ψ is a probability amplitude and the square modulus of the probability amplitude, |ψ|2, is called a probability density. The probability density defines for each point in space the probability of finding an electron at that point, if measured. So it has a kind of conditional, operational definition. More particularly, we could say, reducing the space to a single dimension, x, that |ψ(x)|2 dx gives the probability of finding the electron between x and x + dx. Going back to 3 dimensions, the wave function assigns a probability amplitude value, ψ, and a probability density value, |ψ|2, to each point in space. Informally, we might think of the regions of an orbital with the highest probability density as the regions where an electron “spends most of its time”.
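As a concrete sketch of the Born Rule (using the known closed-form 1s orbital of hydrogen, in atomic units where the Bohr radius a0 = 1), we can integrate |ψ|2 over a region of space to get a probability:

```python
import numpy as np

a0 = 1.0  # Bohr radius, in atomic units

def psi_1s(r):
    """Hydrogen 1s wave function (spherically symmetric, atomic units)."""
    return np.exp(-r / a0) / np.sqrt(np.pi * a0**3)

# Born rule: the probability of finding the electron within radius R is the
# volume integral of |psi|^2, i.e. the integral of |psi|^2 * 4*pi*r^2 dr.
r, dr = np.linspace(0.0, a0, 100_000, retstep=True)
density = np.abs(psi_1s(r)) ** 2 * 4 * np.pi * r**2
print((density * dr).sum())   # ~0.32: roughly a 32% chance within one Bohr radius
```

The exact value is 1 − 5e^(−2), about 0.323, so the electron is found within one Bohr radius of the nucleus about a third of the time.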

The Schrödinger equation can be solved exactly for the hydrogen atom; its solutions are the electron wave functions. For other atoms and molecules it cannot be solved analytically, but its solutions can be approximated to high precision using methods like the variational method and perturbation theory. And again, we call these wave functions orbitals. I won’t get into the specifics of the methods for finding the exact solutions for the hydrogen atom but I’ll make some general comments. For an atom the Cartesian (x,y,z) coordinates for the three dimensions of space aren’t so convenient, so we convert everything to spherical coordinates (r,θ,φ) in which r is a radial distance and θ and φ are angles with respect to Cartesian axes. The term for potential, V(r), in the Hamiltonian operator will be defined by the relation between a proton and an electron. And the mass of the electron also gets plugged into the Hamiltonian. Solving for the wave function makes use of various mathematical tools like spherical harmonics and radial wave functions. Radial wave functions in turn make use of Laguerre polynomials. The solutions for the hydrogen atom will then be expressed in terms of spherical harmonic functions and radial wave functions, with the overall wave function being a function of the variables (r,θ,φ).

Because the orbitals are functions of r, θ, and φ they can be difficult to visualize and represent. But partial representations can still give an idea of their structure. An orbital is often represented as a kind of cloud taking some kind of shape in space; a probability density cloud. The intensity of the cloud’s shading or color represents varying degrees of probability density.

The shapes of these clouds vary by the type of orbital. Classes of orbitals include s-orbitals, p-orbitals, d-orbitals, and f-orbitals. These different kinds of orbitals are grouped by their orbital angular momentum. s-orbitals are sphere shaped, nested shells. p-orbitals have a kind of “dumbbell” shape with lobes running along the x, y, and z axes. d-orbitals are even more unusual, most having four lobes in a cloverleaf arrangement, and one even having a kind of “donut” torus around its two lobes. Although we sometimes imagine atoms as microscopic solar systems with electrons orbiting in circles around the nucleus their structure is much more unusual, with these oddly shaped probability clouds all superimposed over each other. The structure of atoms into these orbitals has important implications for the nature of the elements and their arrangements into molecules. But before getting into that let’s pause a moment to reflect on the nature of the structure discussed up to this point.

Reflection on the Structure of the Wave Function

As with a sound wave, the function for an electron wave function is a solution to a differential equation, in this case the Schrödinger equation. This wave function ψ, is a function of position. In spherical coordinates of r, θ, and φ this function is ψ(r,θ,φ). In the most basic terms a function is a rule that assigns elements in a set, or a combination of elements from multiple sets, to a single element in another set. This rule imposes additional structure on relations between these sets. So in our case we have a set for all r values, a set for all θ values, a set for all φ values, and a set for all ψ values. Prior to the imposition of structure by any function we could combine elements from these sets in any way we like. In a four-dimensional (abstract) phase space or state space with axes r, θ, φ, and ψ all points are available, any ordered quadruple (r,θ,φ,ψ) is an option. That’s because an ordered triplet (r,θ,φ) can be associated with any value of ψ. There’s no rule in place limiting which values of ψ the ordered triplet (r,θ,φ) can associate with. The entire phase space is available; all states are available. But with the imposition of the function ψ(r,θ,φ) the region of permissible states conforming to this rule is significantly smaller. An ordered triplet (r,θ,φ) can be assigned to one and only one value of ψ.

It’s useful here to distinguish between logical possibility and physical possibility. In what sense are all ordered quadruples (r,θ,φ,ψ) in the state space “possible”? Most of them are not really physically possible for the electron in an atom because they would violate the laws of physics, the laws of quantum mechanics. That’s because a function, the wave function, is in fact imposed. But in the theoretical case that it were not imposed, any ordered quadruple (r,θ,φ,ψ) would be logically possible; there’s no contradiction in such a combination. At least, not until we start to develop the assumptions that lead to the Schrödinger equation and its solutions. But since the actual, physical world follows physical laws, only the states satisfying the function ψ(r,θ,φ) are physically possible.

This distinction between logical possibility and physical possibility highlights one, very basic source of structure: structure that arises from physical laws. Atomic orbitals are not man-made structures. There certainly are such things as man-made structures as well. But atomic orbitals are not an example of that. I say all this to justify including atomic orbitals as examples of structure in the first place, since in a physical sense they seem “already there” anyway, or as something that couldn’t be otherwise. But in light of the much more vast state space of logically possible states I think it makes sense to think of even these physically given states as highly structured when compared to the logically limitless states from which they stand apart.

I’d like to make one point of clarification here, especially considering the reputation quantum mechanics has for being something especially inexact or even anti-realist. What is it that the wave function specifies at each point in space, for each ordered triplet (r,θ,φ)? It’s certainly not the position of the electron. That indeed isn’t specified. But what is specified is the amplitude, ψ. And the square modulus of the amplitude, |ψ|2, is the probability density for finding the electron at that position, (r,θ,φ). The wave function doesn’t specify the electron’s exact position. Does this mean that chaos reigns for the electron? The electron could, after all, be anywhere in the universe (with the exception of certain nodes). But that infinite extension of possible positions doesn’t mean that chaos reigns or that the electron isn’t bound by structure. The probability density of the electron’s position in space is very precisely defined and governs the way the electron behaves. It’s not the case that just anything goes. Certain regions of space are highly probable and most regions of space are highly improbable.

This is something of a matter of perspective and it’s a philosophical rather than scientific matter. But still just as interesting, for me at least. It pertains to the kinds of properties we should expect to see in different kinds of systems. What kinds of properties should we expect quantum systems to have? What are quantum properties? Do quantum systems have definite properties? I’ve addressed this in another episode on the podcast, drawing on the thought of Sunny Auyang. In her view there’s an important distinction to be made between classical properties and quantum properties. Even if quantum systems don’t have definite classical properties that’s not to say they don’t have properties at all. They just have properties of a different kind, properties that are more removed from the kinds of classical properties we interact with on a daily basis. We’re used to interacting with definite positions and definite momenta at our macroscopic scale of experience. At the quantum level such definite predicates are not found for position and momentum, but they are found for the position representation and momentum representation of a system’s wave function. Quoting Auyang:

“Are there predicates such that we can definitely say of a quantum system, it is such and so? Yes, the wavefunction is one. The wavefunction of a system is a definite predicate for it in the position representation. It is not the unique predicate; a predicate in the momentum representation does equally well. Quantum properties are none other than what the wavefunctions and predicates in other representations describe.” (How Is Quantum Field Theory Possible?)

I think of this as moving our perspective “up a level”, looking not at position itself but at the wave function that gives the probability amplitude, ψ, and probability density, |ψ|2, of position. That is where we find definite values governed by the laws of physics. It’s appropriate to look at this level for these kinds of quantum systems, because of the kind of things that they are. Expecting something else from them would be to expect something from a thing that is not appropriate to expect from the kind of thing that it is.

Molecular Orbitals

Let’s move now to molecules. Molecules are groups of atoms held together by chemical bonds. This makes use of a concept discussed in the last episode that is pertinent to structure generally: that of embedding. Lower-level structures get embedded, as kinds of modules, into higher-level structures. The lower-level structures remain but their combinations make possible a huge proliferation of new kinds of structures. As we move from the level of atoms to molecules the number of possible entities will expand dramatically. There are many more kinds of molecules than there are kinds of atoms. As of 2021 there are 118 different kinds of atoms, called elements. That’s impressive. But this is minuscule compared to the number of molecules that can be made from combinations and arrangements of these elements. To give an idea, the Chemical Abstracts Service, which assigns a unique CAS registry number to different chemicals, currently has a database of 177 million different chemical substances. These are just molecules that we’ve found or made. There are many more that will be made and could be made.

Electrons are again the key players in the formation of molecules. The behavior of electrons, their location probability densities, and wave-like behavior continue to be defined by mathematical wave functions and abide by the Schrödinger equation. A wave function, ψ, gives a probability amplitude and its square modulus, |ψ|2, gives the probability of finding an electron in a given region. So many of the same principles apply. But the nature of these functions at the molecular level is more complex. In molecules the wave functions take two new important orbital forms: hybridized orbitals and molecular orbitals.

Hybridized orbitals are combinations of regular atomic orbitals that combine to form hybrids. So where before we had regular s-type and p-type orbitals these can combine to form hybrids such as sp3, sp2, and sp orbitals. With a carbon atom for instance, in the formation of various organic molecules, the orbitals of the valence electrons will hybridize.

Molecular orbitals are the wave functions for electrons in the chemical bonds between the atoms that make up a molecule. Molecular orbitals are formed by combining atomic orbitals or hybrid atomic orbitals from the atoms in the molecule. The wave functions for molecular orbitals don’t have analytic solutions to the Schrödinger equation, so they are calculated approximately.

A methane molecule is a good example to look at. A methane molecule consists of 5 atoms: 1 carbon atom and 4 hydrogen atoms. Its chemical formula is CH4. A carbon atom has 6 electrons with 4 valence electrons that are available to participate in chemical bonds. In the case of a methane molecule these 4 valence electrons will participate in 4 bonds with 4 hydrogen atoms. In its ground state the 4 valence electrons occupy one 2s orbital and two 2p orbitals. In order to form 4 bonds there need to be 4 identical orbitals available. So the one 2s orbital and the three 2p orbitals hybridize to form 4 sp3 hybrid orbitals. An sp3 orbital, as a hybrid, is a kind of mixture of an s-type and p-type orbital. The dumbbell shape of a p-orbital combines with the spherical shape of an s-orbital to form a kind of lopsided dumbbell. It’s these hybrid sp3 orbitals that then combine with the 1s orbitals of the hydrogen atoms to form molecular orbitals. In this case the type of molecular orbitals that form are called σ-bonds.

The 2s and 2p orbitals in the carbon atom can also hybridize in other ways to form two or three bonds. For example, a carbon atom can bond with 2 hydrogen atoms and 1 other carbon atom. When it does this the 2s orbital hybridizes with just 2 of the 2p orbitals to form 3 sp2 orbitals, which bond with the 2 hydrogens and the other carbon. The remaining 2p orbital then overlaps with the corresponding 2p orbital on the other carbon atom. This makes two sets of combining orbitals and two molecular bonds: a σ-bond and what is called a π-bond. When a σ-bond and a π-bond form between atoms it is called a double bond. Carbon atoms can also form triple bonds, in which two sp orbitals are formed from the 2s orbital and one 2p orbital. This leaves two 2p orbitals to combine with their counterparts in another carbon atom to form a triple bond, composed of 1 σ-bond and 2 π-bonds. Single bonds, double bonds, and triple bonds all have their own geometrical properties, like bond angles and freedom of rotation. This has effects on the properties of the resulting molecule.

Functional Groups

σ-bonds, π-bonds, single bonds, double bonds, and triple bonds make possible several basic molecular structures called functional groups. Functional groups are specific groupings of atoms within molecules that have their own characteristic properties. What’s useful about functional groups is that they occur in larger molecules and contribute to the overall properties of the parent molecule to which they belong. There are functional groups containing just carbon, but also functional groups containing halogens, oxygen, nitrogen, sulfur, phosphorus, boron, and various metals. Some of the most common functional groups include: alkyls, alkenyls, alkynyls, and phenyls (which contain just carbon); fluoros, chloros, and bromos (which contain halogens); hydroxyls, carbonyls, carboxyls, and ethers (which contain oxygen); carboxamides and amines (which contain nitrogen); sulfhydryls and sulfides (which contain sulfur); phosphates (which contain phosphorus); and so forth.

Repeatable Units

The last subject I’d like to address with all this is the role of repeatable units in the formation of complex chemical structures. Let’s come at this from a different direction, starting at the scale of a complex molecule and working our way down. One of the most complex, sophisticated kinds of molecules is a protein. Proteins are huge by molecular standards. Cytochrome c, for example, has a molecular weight of about 12,000 daltons. (For comparison, methane, discussed previously, has a molecular weight of 16 daltons). What we find with such molecules is that they are highly reducible to a limited number of repeatable units. But we could imagine it being otherwise; a macromolecule with no discernible repeating components, irreducible below the level of its overall macrostructure. Let’s imagine a hypothetical, counterfactual case in which a macromolecule of that size is just a chaotic lump. Imagine going to a landfill and gathering a bunch of trash from a heap with all sorts of stuff in it, gathering it all together, rolling it into a ball, and binding it with hundreds of types of unmixed adhesives. Any spatial region or voxel of that lump would have different things in it. You might find some cans and wrappers in one part, computer components in another, shredded office papers in another, etc. We could imagine a macromolecule looking like that; a completely heterogeneous assembly. We could imagine further such a heterogeneous macromolecule being able to perform the kinds of functions that proteins perform. Proteins can in fact be functionally redundant; there’s more than one way to make a protein that performs a given function. So we might imagine a maximally heterogeneous macromolecule that is able to perform all the functions that actually existing proteins perform. But this kind of maximal heterogeneity is not what we see in proteins.

Instead, proteins are composed of just 20 repeatable units, a kind of protein-forming alphabet. These are amino acids. All the diversity we see in protein structure and function comes from different arrangements of these 20 amino acids. Why would proteins be limited to such a small number of basic components? The main reason is that proteins have to be put together and before that they have to be encoded. And it’s much more tractable to build from and encode a smaller number of basic units, as long as it gives you the structural functionality that you’ll need in the final macrostructure. It might be possible in principle to build a macromolecule without such a limited number of repeatable units. But it would never happen. The process to build such a macromolecule would be intractable.
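The arithmetic here is striking. Even with only 20 units the space of possible sequences explodes with chain length (the 104-residue figure below is roughly the length of human cytochrome c):

```python
n_amino_acids = 20
chain_length = 104                 # roughly the length of human cytochrome c

sequences = n_amino_acids ** chain_length
print(len(str(sequences)))         # 136 digits: around 10^135 possible sequences
```

A tiny alphabet loses nothing in expressive range; the combinatorics more than make up for it.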

This is an example of a general principle I’d like to highlight that we find in chemistry and in structure generally. And it’s related to embedding. But it’s a slightly different aspect of it. Complex, high-level structures are composed by the embedding of lower-level structures. And the higher-level structures make use of a limited number of lower-level structures that get embedded repeatedly.

In the case of a protein, the protein is the higher-level structure. Amino acids are the lower-level structures. The structures of the amino acids are embedded into the structure of the protein. And the higher-level structure of the protein uses only a limited number of lower-level amino acid structures.

A comparison to writing systems comes to mind here. It’s possible to represent spoken words in written form in various ways. For example, we can give each word its own character. That would take a lot of characters, several hundred to several thousand. And such a writing system takes several years to be able to use with any competence. But it’s also possible to limit the number of characters used in a writing system by using the same characters for units of sound common to all words, like syllables or phonemes. Many alphabets, for example, only have between 20 and 30 characters. And it’s possible to learn to use an alphabet fairly quickly. And here’s the key: there’s no functional representational loss in using such a limited number of characters. The representational “space” is the same. It’s just represented using a much smaller set of basic components.

Biochemists mark out four orders of biomolecular structure: primary, secondary, tertiary, and quaternary. And this is a perfect illustration of structural embedding.

The primary structure of a protein is its amino acid sequence. The primary structure is conceptually linear since there’s no branching. So you can “spell” out a protein’s primary structure using an amino acid alphabet, one amino acid after another. Like, MGDVEK: methionine, glycine, aspartic acid, valine, glutamic acid, and lysine. Those are the first 6 amino acids in the sequence for human Cytochrome c. What’s interesting about amino acids is that they have different functional groups that give them properties that will contribute to the functionality of the protein. We might think of this as a zeroth-level protein structure (though I don’t know of anyone calling it that). Every amino acid has a carboxyl group and an amino group. That’s the same in all of them. But they each have their own side chain or R-group in addition to that. And these can be classified by properties like polarity, charge, and other functional groups they contain. For example, methionine is sulfuric, nonpolar, and neutral; asparagine is an amide, polar, and neutral; phenylalanine is aromatic, nonpolar, and neutral; lysine is basic, polar, and positively charged. These are important properties that contribute to a protein’s higher-level structure.
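To keep the “alphabet” picture concrete, here’s a small lookup sketch for just the six residues named above (an excerpt of my own construction, not the full 20-letter table, with the properties as described):

```python
# One-letter codes for the first six residues of human cytochrome c,
# with side-chain properties as described above (an excerpt of the 20).
residues = {
    "M": ("methionine",    "sulfur-containing, nonpolar, neutral"),
    "G": ("glycine",       "nonpolar, neutral"),
    "D": ("aspartic acid", "acidic, polar, negatively charged"),
    "V": ("valine",        "nonpolar, neutral"),
    "E": ("glutamic acid", "acidic, polar, negatively charged"),
    "K": ("lysine",        "basic, polar, positively charged"),
}

for letter in "MGDVEK":
    name, props = residues[letter]
    print(f"{letter}: {name} ({props})")
```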

The secondary structure of a protein consists of three-dimensional, local structural elements. The interesting thing about secondary structures in the context of embedding and repeatable units is that these local structures take common shapes that occur all the time in protein formation. The two most important structural elements are alpha helices and beta sheets. True chemical bonds, covalent bonds, only occur between the amino acid units of the primary structure. But in the higher-level structures the electrostatic forces arising from differences in charge distribution throughout the primary structure make certain regions of the primary structure attracted to each other. These kinds of attractions are called hydrogen bonds, in which a hydrogen atom bound to a more electronegative atom or group is attracted to another electronegative atom bearing a lone pair of electrons. In the case of amino acids such hydrogen bonding occurs between the amino hydrogen and carboxyl oxygen atoms in the peptide backbone.

In an alpha helix these hydrogen bonds form in a way that makes the amino acids wrap around in a helical shape. In a beta sheet strands of amino acids will extend linearly for some length and then turn back onto themselves, with the new strand segment extending backward and forming hydrogen bonds with the previous strand. These hydrogen-bonded strands of amino acids then form planar sheet-like structures. What’s interesting is that these kinds of secondary structures are very common and get used repeatedly, much like amino acids get used repeatedly in primary structures. Secondary structures, like alpha helices and beta sheets (among others), then get embedded in even higher-level structures.

The tertiary structure of a protein is its full three-dimensional structure that incorporates all the lower-level structures. Tertiary structures are often represented using coils for the alpha helix components and thick arrows for the beta sheet components. The way a protein is oriented in three-dimensional space is determined by the properties of its lower-level structures all the way down to the functional groups of the amino acids. Recall that the different amino acids can be polar or nonpolar. This is really important because proteins reside in aqueous environments with highly polar water molecules. Nonpolar groups are said to be hydrophobic because conditions that minimize the surface area of exposure, the contact, between nonpolar groups and polar water molecules are entropically favored. Because of this, polar and nonpolar molecules will appear to repel each other, a hydrophobic effect. Think of the separation of oil and water as an example. Water is polar and oil is nonpolar. This is the same effect occurring at the scale of individual functional group units in the protein. Proteins can fold in such a way as to minimize the surface area of nonpolar functional groups exposed to water molecules. One way this can happen is that nonpolar amino acid sections fold over onto each other so that they interact with each other, rather than with water molecules, and so that water molecules can interact with each other rather than with the nonpolar amino acid sections. These same kinds of effects driven by the properties of functional groups were also the ones bringing about the secondary structures of alpha helices and beta sheets.

Some proteins also have a quaternary structure in which multiple folded protein subunits come together to form a multi-subunit complex. Hemoglobin is an example of this. Hemoglobin is made up of four subunits; two alpha subunits and two beta subunits.

There’s a pattern here of structure building upon structure. But it does so with a limited set of repeatable structures. I’d like to address this again. Why should proteins be built out of only 20 amino acid building blocks? Certainly there could be (at least in theory) a macromolecule that has similar functionality and global structure, using the same functional group properties to get it to fold and form in the needed way, without the use of repeatable lower-level structural units. But that’s not what we see. Why? One important reason is that proteins need to be encoded.

Proteins are made from genes. Genes are sections of DNA that get transcribed into RNA, which then gets translated into proteins. That’s a gene’s primary function: to encode proteins. DNA and RNA have even simpler components: only four types of nucleotides in each: guanine, adenine, cytosine, and thymine in DNA and guanine, adenine, cytosine, and uracil in RNA. These nucleotides have to match up with the proteins that they encode, and it’s going to be very difficult to do that without dividing up the protein into units that can be encoded in a systematic way. There’s a complex biochemical process for translating an RNA nucleotide sequence into a protein. But since these are, at bottom, automatic chemical processes they have to proceed in systematic, repeatable ways. A cell can’t have an entire intracellular biochemical system dedicated to just one macromolecule alone. For one thing, there are too many proteins for that. The same biochemical machinery for translation has to be able to make any protein. So all proteins have to be made up of the same basic units.

The way this works in translation is that molecules called transfer RNA (tRNA) are dedicated to specific combinations of the 4 basic RNA nucleotides. These combinations are called codons. A codon is a combination of 3 nucleotides. Since there are 4 kinds of nucleotides and each codon has 3 of them, there are 4^3, or 64, possible combinations. Different codons correspond to different amino acids. Since they only code for 20 amino acids there is obviously some redundancy, also called degeneracy (which isn’t meant to be insulting, by the way). The way that codons get translated into amino acids is that the tRNA molecules that match the nucleotide sequences of the various codons in the RNA also convey their encoded amino acids. These tRNA molecules come together at the site of translation, the ribosome, and link the amino acids together into the chains that form the primary structure of the protein. This is just a part of the biochemical machinery of the process. What’s important to note here is that although there are a number of tRNA types it’s not unmanageable. There are at most 64 possible codon sequences. So there doesn’t have to be a unique set of translation machinery dedicated to each and every kind of protein, which would be insane. The components only have to be dedicated to codon sequences and amino acids, which are much more manageable.
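A sketch of that mapping in code may make it clearer. The dictionary below holds only a few of the 64 codon entries, just enough to translate the MGDVEK fragment from earlier; the real genetic code table fills in the rest:

```python
# A small excerpt of the 64-entry genetic code (RNA codons -> amino acids).
codon_table = {
    "AUG": "M (methionine)",                       # also the start codon
    "GGU": "G (glycine)", "GGC": "G (glycine)",    # redundancy: multiple codons, one acid
    "GAU": "D (aspartic acid)", "GAC": "D (aspartic acid)",
    "GUU": "V (valine)",
    "GAA": "E (glutamic acid)",
    "AAA": "K (lysine)",
}

def translate(rna):
    """Read an RNA sequence one codon (three nucleotides) at a time."""
    return [codon_table[rna[i:i + 3]] for i in range(0, len(rna), 3)]

print(4 ** 3)                             # 64 possible codons from 4 nucleotides, 3 at a time
print(translate("AUGGGUGAUGUUGAAAAA"))    # -> M, G, D, V, E, K
```

The machinery only ever has to know 64 codons and 20 amino acids, no matter which of the millions of possible proteins it is building.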

Key Takeaways

I’d like to summarize the foregoing with 4 key takeaways from this analysis of structure in chemistry that I think apply to a general philosophy of structure.

1. Structure can be modeled using functions

Recall that a function is a relation between sets that associates an element in one set or combination of elements from multiple sets to exactly one element in another set. The source sets are called domains and the target sets they map onto are called codomains. One example of a function we’ve looked at in both the previous episode on music and in this episode on chemistry is the waveform function. In chemistry mathematical functions called orbitals assign to each point in space (the domain) an amplitude value (the codomain).

2. Functions occupy only a small portion of a phase space

Functions, by nature, impose limitations. A relation that associated an element in a domain with more than one element in a codomain would not be a function. A function associates each domain element with only one codomain element. In this way functions are very orderly. To give an example, in an orbital a given point in space (a domain element) can have only one amplitude value (the codomain element). This is highly limited. To illustrate this, imagine a phase space of all possible combinations of domain and codomain values. Or to give a simpler comparison, imagine a linear function on an x-y plane; for example, the function y = x. This is a straight line at a 45 degree angle to the x and y axes. The straight line is the function. But the phase space is the entire plane. The plane contains all possible combinations of x and y values. But the function is restricted to only those points where y = x. A similar principle applies to orbitals. The corresponding phase space would be, not a plane, but a 4-dimensional hyperspace with axes r, θ, φ, and ψ. The phase space is the entire hyperspace. But the wave function, or orbital, is restricted to a 3-dimensional hypersurface within this 4-dimensional hyperspace. This kind of restriction of functions to small portions of phase spaces is a characteristic feature of structure generally.

3. Structural embedding

Embedding is a feature of structure that came up in music and has come up again in even more obvious form in chemistry. Just looking at proteins, the different orders of structure are quite obvious and well known to biochemists, with their conceptual division of proteins into primary, secondary, tertiary, and quaternary structures, each level incorporating the lower-level structures embedded into it. Using proteins as an example, even primary structures have embedded into them several layers of additional structure: functional groups, molecular orbitals, atomic orbitals, and the general structure of the wave function itself. One key feature of such embedding is that properties and functionality of the lower-level structures are taken up and integrated into the higher-level structures into which they are embedded. We saw, for example, how the three-dimensional tertiary structure of a protein takes the form that it does because of the properties of functional groups in the side chains of individual amino acids, in particular polarity and nonpolarity.

4. Repeatable units

A final key takeaway is the use of repeatable units in the process of structural embedding. In retrospect this is certainly something that is applicable to music as well. We see repeatable units in the form of pitches and notes. In chemistry we see repeatable units in macromolecules like polymers and proteins. Polymers, like polyethylene, PVC, ABS, polyester, etc. certainly use repeatable units; in some cases a single repeating unit, or sometimes two or three. Proteins make use of more repeatable units but even there they make use of a limited number: 20 amino acids. We see here an important general principle of structure: that high-level structures tend to be composed through the repeated use of a limited number of lower-level structures rather than by forming as a single, bulk, irreducible macrostructure. The use of lower-level repeatable units in the higher-level structure facilitates the encoding and construction of high-level structures.

And that wraps up this study of structure in chemistry. Thank you for listening.

Philosophy of Structure, Part 2: Music

This is the second episode in a series on philosophy of structure and focuses on the nature of structure in music in particular. Topics covered include the physics of sound, the wave equation, overtones, Fourier transforms, physiology of the inner ear, intervals, chords, group theory, modular arithmetic, transformations, invariance, musical forms, a Borgesian musical Library of Babel (Library of Vienna), musical phenomenology, serialism, and artificial neural networks. The overall objective of the series is to find patterns in structures across multiple fields, like music, to understand a general structure of structure.

This is the second episode in a series on a philosophy of structure. In the previous and first episode I gave a general overview of structure and some of the ideas I was looking to develop. With this episode I’ll start to get into specific fields in which structure plays a significant role. The first I’d like to look at is music.

I’d like to look at music at three levels:  

1. at the level of physics, acoustics, and physiology
2. at the level of musical theory, and 
3. at the level of musical expressivity and sensitivity. 

Each level has its own set of structures. And between levels a structure at a lower level will get wrapped up and translated into a new kind of element in the structures of a higher level; for example, in the move from physical frequencies to musical pitches. There’s some homology in this to modular programming, in which lower-level operations in a computer program are performed by modules, or subroutines, programs within programs.

One useful distinction that will come into play is between two senses of sound, as I will use them.

In one sense sound is a purely objective thing or event that occurs independent of any human perception. This is physical sound, vibration that propagates as a wave of pressure changes through a transmission medium like air. So if a tree falls in a forest and no one hears it does it make a sound? In this first sense, yes. Sound is just the vibration propagating as a wave through the air in the forest. Doesn’t matter if anyone hears it or not.

The other sense of sound is the perception of sound, the way the ear and brain respond and ultimately produce a subjective experience of sound. It is in this sense that a tree falling in a forest with no one to hear it doesn’t make a sound. These two senses overlap but not perfectly. There are physical sounds that are not perceptible. And there are perceptions of sound that don’t directly correspond to any physical kind of sound. This is why it makes sense to speak in terms of different levels of sound that have their own native concepts and vocabularies.

Before getting into these different kinds of musical structures I’d like to propose one more framework for thinking about structures generally, in addition to the ideas I got into in the introductory episode. And this is an idea from abstract algebra, a subject I’ll get into in more detail in a later episode. In abstract algebra an algebraic structure is understood to be an arbitrary set, with one or more operations defined on it. Say we have a set of elements. These elements could be anything. And in lumping these elements together we get a set. What other features does this set need to have in order to have structure? For a set that is an algebraic structure it has one or more operations defined on it. So we have to think about what an operation is. Informally, an operation on a set is a way of combining any two elements of the set to produce another element in the same set. More formally – and this will get a little dense here for a moment but bear with me – let A be any set:

An operation * on A is a rule which assigns to each ordered pair (a, b) of elements of A exactly one element a * b in A.

What this means is that any set with a rule, or rules, for combining its elements, is an algebraic structure. With just a few more conditions, which we’ll bypass for now, such a set is also a group, which is another important kind of algebraic structure. The set is no longer just a collection of unrelated elements. There are rule-like relations between its elements. It’s a way of defining how all its pieces fit together. This is structure. We’ll see quite nicely in musical structures the way such relations between elements work themselves out. This will be especially evident at the level of musical theory.
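Here’s a minimal sketch of that definition in code, with a musical flavor: take the twelve pitch classes {0, ..., 11} as the set and addition modulo 12 (transposition) as the operation. (This particular example is mine, not part of the formal definition above.)

```python
# The set: twelve pitch classes, 0 through 11 (C = 0, C# = 1, ..., B = 11).
Z12 = set(range(12))

def op(a, b):
    """The operation *: assigns to each ordered pair (a, b) exactly one element of Z12."""
    return (a + b) % 12

# Closure: combining any two elements lands back in the same set.
print(all(op(a, b) in Z12 for a in Z12 for b in Z12))   # True

print(op(9, 4))   # 1: transposing pitch class 9 (A) up four semitones gives 1 (C#)
```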

But first, let’s look at the acoustics of physical sound. Physical sound is vibration that propagates as a wave of pressure changes through a transmission medium. In general a wave is a propagating dynamic disturbance from equilibrium of one or more quantities. Some quantity is oscillating around the equilibrium position. In the case of a sound wave there is an equilibrium pressure, which would just be the global, average pressure, say in a room or surrounding environment. Then the sound wave is the propagation of variations of the local pressure; parts of the air are compacted, and parts of the air are rarified.

Features of waves include frequency, wavelength, and amplitude. Frequency is the number of cycles per unit time. The hertz is a common unit for frequency; one hertz is one cycle per second. So for example, a sound wave with a frequency of 440 Hz cycles 440 times per second. Wavelength is inversely proportional to frequency; high-frequency sounds have shorter wavelengths and low-frequency sounds have longer wavelengths. The constant of proportionality is the speed of the wave’s propagation. So in our case that’s the speed of sound. The equation for this relation is:

fλ = v

Where f is frequency, λ is wavelength, and v is the speed of sound.

The speed of sound is about 340 meters per second, with some variation depending on the air conditions. So if a sound wave has a frequency of 440 Hz, i.e. 440 cycles per second, then the wavelength of the wave is about 77 cm.
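That calculation in code, for the record (340 m/s is approximate; the speed varies with temperature and humidity):

```python
v = 340.0      # approximate speed of sound in air, m/s
f = 440.0      # frequency of concert A, Hz
print(v / f)   # ~0.77 m, i.e. about 77 cm
```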

Amplitude is the maximum displacement of a quantity from equilibrium; how high and low the wave goes. For sound the metric used is the sound pressure level. This is the local pressure deviation from the ambient atmospheric pressure, caused by a sound wave. Sound pressure is the difference between the average local pressure and the pressure in the sound wave. When you hear sound volume being spoken of in units of decibels this is what is being quantified. The equation to calculate the sound pressure level in decibels is:

Lp = 20 log10 (p/p0) dB

Where Lp is the sound pressure level in decibels, log10 is a base 10 logarithm, p0 is a reference pressure (in air, the standard reference is 20 micropascals, near the threshold of hearing), and p is the root mean square sound pressure, which is a function of the amplitude. Basically a sound wave with larger changes in sound pressure will have a higher sound pressure level in decibels.
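In code (using the standard 20 micropascal reference pressure for air; the rms pressures below are rough, illustrative figures):

```python
import math

P0 = 20e-6   # standard reference sound pressure in air, pascals

def spl_db(p_rms, p0=P0):
    """Sound pressure level in decibels from an rms sound pressure in pascals."""
    return 20 * math.log10(p_rms / p0)

print(spl_db(20e-6))   # 0 dB: a sound at the reference pressure itself
print(spl_db(0.02))    # 60 dB: each factor of 10 in pressure adds 20 dB
print(spl_db(2.0))     # 100 dB
```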

The behavior of a wave is characterized by the wave equation, which has the following form:

d2u/dt2 = v2 * d2u/dx2

Where x is distance, t is time, and v is propagation velocity. And u is some scalar function of x and t. The quantity u may be pressure in a medium or the displacement of particles of a vibrating solid, a string for example, away from their resting positions. Both will be relevant to music. Let’s think of it in terms of pressure difference from the mean. This is a second-order partial differential equation, containing second derivatives with respect to distance (d2u/dx2) and with respect to time (d2u/dt2). A differential equation is an equation that relates functions and their derivatives. A derivative gives the rate at which a value changes with respect to some variable. A second derivative repeats this process to give a rate of change of a rate of change. In this case the function is the multivariable function of both distance and time: u(x,t). And because u(x,t) depends on more than one variable this differential equation is called a partial differential equation. Solving this differential equation is the process of finding the equation for the function u(x,t). In the case of air pressure, solving the differential equation will give us an equation for the air pressure difference from the mean with respect to distance and time.

Coming up with a solution to the wave equation involves specifying certain boundary conditions that will correspond to the physical conditions to which it will apply. For example, if we’re finding a solution for a vibrating string one of the conditions will be the length of the string. So a solution u(x,t) will depend on the precise conditions. I’ll skip over the process of finding a solution, as interesting as that is, and just skip to some examples.

Applying some appropriate boundary conditions one solution to the wave equation is:

u(x,t) = sum( ak * sin(kπx) * cos(kπt),1,∞)

where

ak = 2 * integral( f(x) * sin(kπx) * dx, 0,1) 

The solution u(x,t) is a series with a number of terms that get added together. Each term in the series gives a harmonic or overtone for the wave, and each term has a coefficient, the ak term, a Fourier series coefficient that gives it its weight. The complete wave is a superposition of multiple waves that add up linearly. When a string vibrates it’s actually vibrating at multiple frequencies. Using a process of mathematical analysis called Fourier analysis we can break a complete wave into its component waves and see the amplitudes of each frequency. And the breakdown of this Fourier analysis has important implications for the quality or timbre of the sound.
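As a numerical sketch of those Fourier coefficients (my own toy example: a triangular “plucked string” initial shape f(x), peaked at the midpoint of a string of length 1):

```python
import numpy as np

# Fourier coefficients a_k = 2 * integral from 0 to 1 of f(x) sin(k*pi*x) dx,
# approximated numerically for a triangular "plucked string" shape f(x).
x, dx = np.linspace(0.0, 1.0, 100_000, retstep=True)
f = np.where(x < 0.5, x, 1.0 - x)   # string plucked at its midpoint

for k in range(1, 6):
    a_k = 2 * (f * np.sin(k * np.pi * x) * dx).sum()
    print(k, round(a_k, 5))
```

The odd harmonics dominate and fall off quickly (roughly as 1/k^2), and each nonzero ak weights one term of the superposition, i.e. one harmonic of the vibrating string.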

So let’s pause here a moment and think about all this in terms of structure, both to reflect on everything I’ve said so far about physical sound and to think about what I’ll get into next about overtones. We can see here that even with something as seemingly simple as a sound there’s a lot of structure wrapped into it. Let’s think of this in terms of algebraic structures.

We start with sets containing elements of values for air pressure, string displacement, distance, and time. So there’s a set for air pressure values, a set for string displacement values, a set for distance values, and a set for time values. Even at this level, before doing anything with these sets, it’s worth noting that there’s already structure there. These sets come equipped with operations, so they are already algebraic structures; within each set we can add and multiply values together. But where things get really interesting is where we start to see the structure of the relations between sets, using functions.

A function is a binary relation between two sets that associates every element of the first set to exactly one element of the second set. Functions can also be multivariate and associate more than just two sets. For example, a bivariate function can associate every element of a first set and every element of a second set to exactly one element of a third set. This is what we have with the wave equation and its solution u(x,t). We have the set X of all distance values, x. We have the set T of all time values, t. And we have the set U of all pressure difference values, u. The function u(x,t) takes elements of X and T and assigns each pair (x,t) to exactly one element in the set of pressure difference values U.

What’s the philosophical significance of that? We can imagine an alternate state of affairs where no such relations obtain. For some system we could have a set of distance values, a set of time values, and a set of pressure difference values with no structure of relations between their elements. That would be much less restricted. Imagine a phase space with all logically possible states of this system. This phase space would not be constrained by physical possibility, so we could have any combination of distance, time, and pressure values. We could match any pair of elements (x,t) from sets X and T to any element u in set U. No restrictions. But the phase space region occupied by physically possible states would be subject to the constraints of the wave equation and would be a much smaller region. When the function u(x,t) is applied to the sets X and T every pair (x,t) is matched to only one element, u, in the set U.

There’s also a great deal of structure embedded within most sound waves. A sound wave can be as simple as a pure sine wave. But most sounds are superpositions of multiple frequencies and Fourier analysis allows us to break this down and look at the underlying structure. Recall that the solution to the wave equation is a series of multiple terms. That equation again is:

u(x,t) = Σ_{k=1}^{∞} a_k sin(kπx) cos(kπt)

This is a series because we add up the terms for each value of k, going from 1 to ∞. For a musical note produced on an instrument, let’s say on a string, the first term will be for the fundamental frequency. The fundamental frequency is the lowest frequency. Musically it’s also the frequency of the musical pitch that we perceive. And this is where we start to slide gradually over into the perception of sound versus just physical sound. For example, an A played on a piano has a fundamental frequency of 440 Hz. But there are also other frequencies produced at integer multiples of 440 Hz. And these multiples are the harmonics or overtones. Mathematically they show up in the subsequent terms of the series.
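To make the series concrete, here’s a minimal Python sketch, with made-up coefficients purely for illustration, that evaluates a partial sum of this solution at a given position and time:

```python
import numpy as np

def u(x, t, a):
    """Partial sum of the wave equation solution:
    u(x, t) = sum over k of a_k * sin(k*pi*x) * cos(k*pi*t)."""
    return sum(a_k * np.sin(k * np.pi * x) * np.cos(k * np.pi * t)
               for k, a_k in enumerate(a, start=1))

# Hypothetical coefficients: a strong fundamental (k = 1) plus two
# weaker overtones (k = 2, 3).
coeffs = [1.0, 0.5, 0.25]
print(u(0.3, 0.0, coeffs))  # displacement at x = 0.3 when t = 0
```

Each a_k sets how much the k-th harmonic contributes to the superposition.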

Sine and cosine are trigonometric functions that oscillate. One interesting feature of these trigonometric functions is that sums of them can be used to approximate almost any function over a certain interval. And for periodic functions, like waves, a summation of trigonometric functions can approximate the entire periodic function. This is the work of Fourier analysis. The resulting series of trigonometric functions is called a Fourier series. This kind of analysis can work in two directions. We can start from the bottom up and build a composite function from sinusoidal components. Or we can start with the composite function and break it down into its sinusoidal components. I’ll focus on the second.

Breaking a sound wave down into its component sine or cosine waves is done using a Fourier transform. One algorithm for this is the Fast Fourier Transform (FFT). We can look at the output of a fast Fourier transform graphically on an FFT spectrum, with frequency on the horizontal axis and amplitude on the vertical axis. This shows how much of each frequency composes the complete wave. The frequencies occur at different amplitudes. The way these different frequencies add up with different weights affects the way the sound sounds to our ears. This is what we call timbre. Timbre is the difference between an A played on a piano and an A played on a trumpet. Even though they both have the same fundamental frequency at 440 Hz, the amplitudes of their harmonic frequencies relative to the fundamental are quite different. The FFT spectra of an A played on a piano and an A played on a trumpet will look different. And that’s why they sound different.
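As a rough illustration, here’s a Python/NumPy sketch that builds a toy tone from a 440 Hz fundamental plus two overtones (with made-up amplitudes) and then recovers those frequencies and weights with an FFT:

```python
import numpy as np

fs = 44100                      # sample rate in Hz, a common audio value
t = np.arange(0, 1.0, 1 / fs)   # one second of samples

# A toy "instrument" tone: fundamental plus two overtones with
# made-up relative amplitudes.
wave = (1.0 * np.sin(2 * np.pi * 440 * t)
        + 0.5 * np.sin(2 * np.pi * 880 * t)
        + 0.2 * np.sin(2 * np.pi * 1320 * t))

amps = np.abs(np.fft.rfft(wave)) / (len(wave) / 2)  # sinusoid amplitudes
freqs = np.fft.rfftfreq(len(wave), 1 / fs)          # frequency axis in Hz

# The spectrum peaks at 440, 880, and 1320 Hz with the weights we put in.
for f in (440, 880, 1320):
    i = np.argmin(np.abs(freqs - f))
    print(f, round(float(amps[i]), 2))
```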

Let’s look at a few examples. When we play an A on the piano its fundamental frequency is 440 Hz. The harmonics will be at 880 Hz, 1320 Hz, 1760 Hz, 2200 Hz, and so on, going up by 440 Hz for each harmonic. Let’s look at the amplitudes of the sound waves at each harmonic, relative to the fundamental. So we’ll say the fundamental frequency has an amplitude of 1 relative to itself. How do the amplitudes of the harmonics compare?

For an A played on a piano, the amplitudes of the first few harmonics relative to the fundamental are, in order:

H1: 0.1
H2: 0.325
H3: 0.06
H4: 0.05
H5: 0.045
H6: 0
H7: 0.01

How about the same note on a guitar? Those amplitudes are:

H1: 0.68
H2: 1.27
H3: 0.13
H4: 0.13
H5: 0.12
H6: 0.01
H7: 0.02
H8: 0.2
H9: 0.05

Speaking structurally, each of these spectra is a kind of fingerprint, a signature that we recognize as an instrument-specific timbre. These instrumental notes have, in addition to the waveform structure of a single frequency, additional structure composed of multiple frequencies with regular relative amplitudes.
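As a sketch of how such fingerprints could be reconstructed, the following Python/NumPy snippet synthesizes rough approximations of these two tones from the relative amplitudes listed above (ignoring the attack, decay, and phase details of real instruments):

```python
import numpy as np

fs = 44100                     # sample rate in Hz
t = np.arange(0, 1.0, 1 / fs)  # one second of audio

def tone(f0, harmonic_amps):
    """A fundamental at relative amplitude 1 plus harmonics at
    2*f0, 3*f0, ... with the given relative amplitudes."""
    wave = np.sin(2 * np.pi * f0 * t)
    for n, a in enumerate(harmonic_amps, start=2):
        wave += a * np.sin(2 * np.pi * n * f0 * t)
    return wave

# Relative amplitudes H1..H7 (piano) and H1..H9 (guitar) from above.
piano  = tone(440, [0.1, 0.325, 0.06, 0.05, 0.045, 0, 0.01])
guitar = tone(440, [0.68, 1.27, 0.13, 0.13, 0.12, 0.01, 0.02, 0.2, 0.05])
# Same fundamental, different weightings of the harmonics: the
# difference between these two waveforms is a difference of timbre.
```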

And now let’s move further into the subject of the perception of sound. Our perception of sound depends on a system of anatomical structures, both in the ear and in the brain. We’ll look just at the ear for now. The interesting thing about the ear and our perception of timbre is that our ears basically perform a Fourier transform on the composite sound wave.

The cochlea of the inner ear is a coiled, conical structure of varying diameter, getting narrower as it spirals inward toward the apex. The varying dimensions of the cochlea mean that different parts of it resonate at different frequencies. So as a sound wave enters the cochlea, the component frequencies of the sound wave will cause different parts of the ear to resonate at those component frequencies, at different amplitudes. The cochlea is effectively performing a Fourier transform by breaking down the composite sound wave into its component frequencies, weighted by their amplitudes. Hair cells along the basilar membrane then transduce these resonances into neural signals that are conveyed to the brain. In our brains we recognize the different spectra of these sound waves as different instrumental timbres. When the cochlea breaks down a sound wave into its component frequencies we respond to those stimuli in our brain by recognizing them as the sounds made by different instruments: as a piano, or as a guitar, etc. The structure of the sound waves produced by the instruments interacts with anatomical structures in our ear and brain so that we are able to perceive and distinguish different timbres of sound.

This is an interesting example of how, as we translate between physical events and our mental perception of them, the complex structure of the physical event gets embedded in the perception. When we hear an A played on a piano we aren’t consciously aware of all the detailed physical structure discussed here. All of that structure gets wrapped up into a kind of mental module that we perceive as “an A played on a piano”. We perceive that as a musical pitch with the timbre of a certain instrument. The complexity of the structure doesn’t disappear but it gets packaged in a way. And as we move away from the level of physical sound to the level of musical theory this becomes very useful. In musical theory we can refer to these pitches on different instruments without having to get into all the structure that goes into them every time. So let’s move to that level now, the level of musical theory.

What are the elements that make up a piece of music? Certainly there are pitches or musical notes. Also durations, timbres, and dynamics, to name a few. Musical composers have at their disposal a wide array of raw materials to work with, to draw upon and organize into an ordered composition.

Let’s look at pitches first. An important feature of pitches is the way they relate to each other. The difference in pitch between two sounds is called an interval. One of the most important intervals is the octave. An octave is the interval between one musical pitch and another with double its frequency. So taking the 440 Hz A pitch, one octave above that would be 880 Hz. This is also an A, but it sounds higher. It is also the first harmonic of the 440 Hz fundamental. The human ear tends to hear both notes as being essentially “the same”, due to their closely related harmonics. All octaves are harmonics but not all harmonics are octaves. This is because harmonics increase linearly in frequency while octaves increase exponentially, doubling with each octave. So, for example, the first four octaves above 440 Hz are 880 Hz, 1760 Hz, 3520 Hz, and 7040 Hz.

In musical notation pitches separated by an octave are given the same note name, so both 440 Hz and 880 Hz are called “A”, though we can distinguish them as A4 and A5 respectively. We can also select pitches between these two pitches to make up a scale. There are various possible scales but let’s look first at the chromatic scale, which includes all the notes of most other scales. A chromatic scale is composed of 12 pitches. These would be all the keys on a piano within one octave, all the white keys and all the black keys. The interval between adjacent pitches in a chromatic scale is a semitone or half step. The difference in frequency between half steps actually increases for higher pitches. Recall that octave frequencies increase exponentially, doubling with each octave. That’s a ratio of 2. For half steps the ratio of frequencies from one to the next is 2^(1/12), which is about 1.059. One important feature of a scale is that when it arrives at the next octave it can be understood to return to its starting point, albeit in a higher octave.
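A quick Python check of this half-step ratio:

```python
# In equal temperament each half step multiplies frequency by 2**(1/12),
# so twelve half steps double it: one octave.
ratio = 2 ** (1 / 12)                          # about 1.059
scale = [440 * ratio ** n for n in range(13)]  # A4 up through A5
print([round(f, 1) for f in scale])
# Ends at 880.0 Hz: twelve half steps return to A, one octave higher.
```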

This has the form of modular arithmetic. We can think of the set of pitches as a group: the integers modulo 12. Recall that a group is a set with an operation. Our set has 12 pitches that we can number in this way:

0 = A
1 = A#
2 = B
3 = C
4 = C#
5 = D
6 = D#
7 = E
8 = F
9 = F#
10 = G
11 = G#
0 = A

And with this set we can assign an operation called addition modulo 12. We can label this group Z12. Under this operation of addition the elements eventually “cycle back” on themselves. A clock face, for example, is also modulo 12. After 12 o’clock there’s no 13 o’clock. You start over again at 1. Similarly, in this group of musical pitches there’s no pitch H. Rather it starts again at A. It’s helpful to visualize this kind of group in a kind of clock face representation, with all the notes arranged a half step apart and then circling back on themselves. In modular arithmetic when you add two numbers h and k you start with h on the clock face and move clockwise k additional units around the circle: h + k is where you end up. For example, 3 + 5 = 8, 7 + 2 = 9. That’s normal. But also, 10 + 5 = 3, 5 + 11 = 4, and 7 + 12 = 7. Those are a little more unusual. But those are the correct sums under this modular arithmetic.
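Here’s a minimal Python sketch of this clock-face arithmetic, using the pitch numbering above:

```python
PITCHES = ["A", "A#", "B", "C", "C#", "D", "D#",
           "E", "F", "F#", "G", "G#"]

def add12(h, k):
    """Addition in Z12: start at h and move k steps around the clock."""
    return (h + k) % 12

print(add12(3, 5))    # 8 -- the "normal" case
print(add12(10, 5))   # 3 -- wraps past the top of the clock
print(add12(7, 12))   # 7 -- adding a full octave lands on the same pitch
print(PITCHES[add12(0, 4)])  # C#, four half steps (a major third) above A
```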

Let’s see how this pertains to some other musical structures. One important musical structure is the melody. Melodies include rhythm as well but let’s just focus on the pitches for now. Melodies include a sequence of pitches. So we’re taking elements from our set of available pitches and arranging them in some new order. For example, the sequence of pitches for “Mary Had A Little Lamb”.

As a sequence of notes this is:

{C#,B,A,B,C#,C#,C#,B,B,B,C#,E,E,C#,B,A,B,C#,C#,C#,C#,B,B,C#,B,A}

Or in numerical form:

{4,2,0,2,4,4,4,2,2,2,4,7,7,4,2,0,2,4,4,4,4,2,2,4,2,0}

An important feature of such musical melodies is that they can undergo transformations, or in musical terminology, transpositions that preserve the melodic structure, even if they use a different subset of pitches. The melodic structure is invariant under the transformation. For example, let’s add 3 to each element of the melody:

{7,5,3,5,7,7,7,5,5,5,7,10,10,7,5,3,5,7,7,7,7,5,5,7,5,3}

Which translates into the musical pitches:

{E,D,C,D,E,E,E,D,D,D,E,G,G,E,D,C,D,E,E,E,E,D,D,E,D,C}

It’s the same melody. It’s just transposed into a different key. The first was in the key of A Major and the second is in the key of C Major. Let’s do another transposition that shows the modular arithmetic in particular at work. Let’s add 10 to each element of the first melody:

{2,0,10,0,2,2,2,0,0,0,2,5,5,2,0,10,0,2,2,2,2,0,0,2,0,10}

Which translates into the musical pitches:

{B,A,G,A,B,B,B,A,A,A,B,D,D,B,A,G,A,B,B,B,B,A,A,B,A,G}

And this is the melody in the key of G Major. Something to note with this transposition is that adding 10 to most of the elements gives a sum of 12 or more in the regular additive group of integers, Z. But in Z12, the group of integers modulo 12, we see sums like 4 + 10 = 2 and 2 + 10 = 0. Even if a transposition crosses over that point of wrapping back onto itself, it doesn’t matter: under this transposition the structure of the melody is invariant all the same.
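A short Python sketch of these transpositions, which reproduces the transposed sequences above:

```python
mary = [4, 2, 0, 2, 4, 4, 4, 2, 2, 2, 4, 7, 7,
        4, 2, 0, 2, 4, 4, 4, 4, 2, 2, 4, 2, 0]

def transpose(melody, n):
    """Shift every pitch by n half steps in Z12. The intervals between
    successive notes are unchanged, so the melodic structure survives."""
    return [(p + n) % 12 for p in melody]

print(transpose(mary, 3))   # the C Major version
print(transpose(mary, 10))  # the G Major version: 4 + 10 = 2, 2 + 10 = 0
```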

We can also look at pitches played simultaneously, which make harmonic intervals (2 notes) or chords (3 or more notes). As with melodies, harmonic intervals and chords can be transposed and still preserve their essential structure. What defines them is not the absolute pitches that compose them but the spacing between them in the Z12 group. For example, the notes in a major third will always be 4 semitones apart, regardless of the specific notes used. The following are examples of major third intervals:

{0,4} {A,C#}

{3,7} {C,E}

{5,9} {D,F#}

All have the form {n,n+4}.

Such arrays can have more notes to make up chords, such as a major chord of the form {n,n+4,n+7}:

{0,4,7} {A,C#,E}

{3,7,10} {C,E,G}

{5,9,0} {D,F#,A}

Or a dominant seventh chord of the form {n,n+4,n+7,n+10}:

{0,4,7,10} {A,C#,E,G}

{3,7,10,1} {C,E,G,B♭}

{5,9,0,3} {D,F#,A,C}
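These interval and chord forms are straightforward to generate in code. A brief Python sketch, again using the mod-12 numbering (A# and B♭ name the same pitch):

```python
PITCHES = ["A", "A#", "B", "C", "C#", "D", "D#",
           "E", "F", "F#", "G", "G#"]

def build(root, offsets):
    """A chord as half-step offsets above a root, wrapped mod 12."""
    return [(root + i) % 12 for i in offsets]

MAJOR_THIRD      = [0, 4]
MAJOR_TRIAD      = [0, 4, 7]
DOMINANT_SEVENTH = [0, 4, 7, 10]

for root in (0, 3, 5):  # roots A, C, D
    chord = build(root, DOMINANT_SEVENTH)
    print(chord, [PITCHES[p] for p in chord])
# [0, 4, 7, 10] ['A', 'C#', 'E', 'G']
# [3, 7, 10, 1] ['C', 'E', 'G', 'A#']
# [5, 9, 0, 3] ['D', 'F#', 'A', 'C']
```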

Pitches, along with their arrangements and relations in scales, intervals, and chords, seem to have been the most theorized aspects of musical structure. Or at least I’m most familiar with the theorization of these aspects. Other musical elements like rhythm (the duration of pitches), dynamics (volume), timbre, etc. are certainly parts of musical compositions. I won’t get into those in terms of sets, operations, and groups, as I have with pitches, but it’s certainly possible to see, even informally, from the highly ordered form of musical compositions that all these elements are features of musical structure.

For example, the sequence of pitches in “Mary Had a Little Lamb” could have various rhythms. In what is called a 4/4 time signature, where a quarter note is equivalent to one beat, the traditional melody has the following sequence of note durations:

{1,1,1,1,1,1,2,1,1,2,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,4}

With some half notes occurring in the sequence. But there are infinitely many possible ways to assign the durations of each pitch. For example, the melody could have this rhythm:

{1.5,0.5,1.5,0.5,1,1,2,1,1,2,1,1,2,1.5,0.5,1.5,0.5,1,1,1,1,1,1,1,1,4}

Adding some dotted quarter notes and eighth notes. Those are very common note durations, so nothing too crazy there. But we could, in theory, make these values any real positive number. We could have a note with an irrational duration like the square root of 2 or pi beats, for example. Not that anyone would ever do that. I don’t even know how you’d play something like that. It’s theoretically possible. But in practice we restrict ourselves to an infinitesimal fraction of possible note durations with manageable beat values like 1,2,1/2,1/3,1/16, etc.

Composers can also assign notes, or more commonly whole sections of music, dynamic values or volume levels. These go by names like pianissimo, piano, mezzo-piano, mezzo-forte, forte, fortissimo. And there are transitions between them like crescendo and decrescendo. There are also instructions of articulation like legato, staccato, tenuto, marcato. These are related to duration and dynamics and we might think of them as musical modules into which these structures are embedded for ease of reference. When musicians see a legato marking they already understand intuitively what that means and don’t have to think down to the more basic structures of note duration and dynamics.

So there are a variety of elements on hand to use and arrange into musical compositions: notes, chords, rhythm, dynamics, articulation, timbre, different types of instruments if a composition uses an ensemble. What makes musical composition an art is that we distinguish structured compositions from random assemblages of all these components. Let’s think about the ways structured and “meaningful” musical compositions look in comparison to the set of all possible musical compositions. And this will start to move us into the third level of musical structure: the level of musical expressivity and sensitivity.

Recall from the previous, introductory episode the literary device of the Library of Babel from the short story by Jorge Luis Borges. Let’s adapt that story for musical compositions. We have a library with books containing every possible musical composition. Right away we must see that this library is more complex than Borges’s library of Babel. The Library of Babel was limited to a certain number of characters, arranged unidimensionally. Books in the Library of Babel can’t have more than one character at a time. The characters don’t have different durations, dynamics, or articulations. Maybe that kind of information could be encoded using Borges’s system but musical notation already has that structure embedded into it. A single bar of music, which is a unit of duration in musical time, could have one staff with one note at a time, one staff with multiple notes at a time, multiple staffs for a single instrument – like a piano or organ – or multiple staffs for several instruments – such as in a full orchestra – all playing at once. Just a single beat has myriad possible forms. Since this library is different enough let’s call it the Library of Vienna, in honor of Mozart and Beethoven.

Recall that in Borges’s Library of Babel most of the words were meaningless gibberish. But in that case there was a standard by which to determine whether a string of characters was gibberish or not. Characters would be considered gibberish if they didn’t make up a word in a language. But it wasn’t quite so simple, because there are many languages. So even if a string of characters didn’t make up a word in Borges’s Spanish or in my English, that’s not to say it couldn’t be a word in some other language. Maybe even in a language that doesn’t use a Roman alphabet, since it could be a Romanization, like Pinyin for Chinese. In Borges’s story some of the characters thought they were able to find patterns in other languages.

“For a long time it was believed that these impenetrable books corresponded to past or remote languages. It is true that the most ancient men, the first librarians, used a language quite different from the one we now speak… Five hundred years ago, the chief of an upper hexagon came upon a book as confusing as the others, but which had nearly two pages of homogeneous lines. He showed his find to a wandering decoder who told him the lines were written in Portuguese; others said they were Yiddish. Within a century, the language was established: a Samoyedic Lithuanian dialect of Guarani, with classical Arabian inflections.”

And I just have to insert here that – as a Guarani speaker – I’m a big fan of that part of the story.

The existence of multiple languages means there are multiple standards of meaningfulness, which makes the process of discerning meaningfulness more complicated. But in principle there is still a standard. But what about with music? Is there such a standard of meaningfulness for music? Is there some standard by which to say that a composition in the Library of Vienna is complete gibberish? If somebody makes up a word we can easily dismiss it as gibberish, unless for some reason it catches on and becomes a new word. But in music innovation is valued. We value the experience of hearing a melody that we have never heard before and praise composers for producing them. That is, if the new melody is musically pleasing, whatever that might mean. So in one sense the process of discerning meaningful music from musical gibberish is more complicated in the Library of Vienna. But in another sense it might also be easier, if less immediately susceptible to systematization. Whereas for words we have to know all the words in all the languages that a string of characters may or may not match, in music the musicality of a melody or more complex composition is more intuitively discerned. There’s no virtual repository of musical “words” that a possible segment has to match. So the actual practice of discerning meaningfulness from gibberish might be easier in the Library of Vienna than in the Library of Babel. But it would seem to be more difficult to rationally reconstruct exactly what that process is and how it actually works.

What are some of the compositions we might find in the Library of Vienna? Some of them are composed by my toddler when she bangs on the piano keys. Some of them are orchestral works for which the parts of each instrument are taken from my toddler’s improvisations. A few very special compositions will be just one note played over and over again. And each of these would have different orchestrations, rhythmic arrangements, and lengths. Some compositions would have all 88 notes on the piano played simultaneously, over and over again, also with different orchestrations, rhythmic arrangements, and lengths. Hidden somewhere in the lightyears of shelving is the score to Beethoven’s Ninth Symphony. Somewhere else is Howard Shore’s score to The Lord of the Rings. And there are also infinite variations on these works, with variations ranging from slight to significant. But the vast majority are completely random compositions, with every possible arrangement of note sequences, chords, orchestrations, rhythms, and lengths. 

From one perspective we might say, “What a wealth of fresh and original compositions!” But how much of this music would you like to listen to? I’m pretty sure that, like the residents of Borges’s Library of Babel, we would be much more excited about finding the score to Beethoven’s Ninth Symphony, or even some variation of it, than any given random tome from the shelves. Why? What are the features of Beethoven’s Ninth Symphony that distinguish it from random arrangements of pitches and rhythms? Informally we can say that the Ninth Symphony just has an exceptional degree of musicality. It moves and flows in ways that make musical sense, whatever that might mean.

Are there ways to systematize these kinds of musical, aesthetic intuitions? One way to do this is to pick out patterns in the enduring musical compositions to see what common features characterize our musical grammar. I’ll look at two lists of such features from Roger Scruton and Dmitri Tymoczko.

Roger Scruton in his philosophy talks of music having its own internal logic and moving in its own kind of abstract space called musical space. This musical space has its own set of rules and “physics”, so to speak. Here is a description from Scruton:

“Consider the simple theme that opens Beethoven’s Third Piano Concerto. From the point of view of science this consists of a series of pitched sounds, one after the other, each identified by frequency. But we do not hear a sequence of pitched sounds. We hear a melody, which begins on the first note and moves upward from the C to G, via E-flat, and then stepwise downward to the starting point. But somehow the movement hasn’t stopped, and Beethoven decides to nail it down with two emphatic dominant-tonic commas. Then comes an answering phrase, harmonized this time, and leading up to A-flat construed as a dissonant minor ninth on G. We hear a sudden increase in tension, and a strong gravitational force pulling that A-flat downward on to G, although the melody doesn’t rest there, since it is looking for the answer to the two dominant-tonic commas that we heard earlier, and it finds this answer in another pair of such commas, though this time in the key of G. You could go on describing these few bars for a whole book, and you won’t have exhausted all that they contain by way of musical significance. The point I want to emphasize, however, is that you cannot describe what is going on in this theme without speaking of movement in musical space, of gravitational forces, of answering phrases and symmetries, of tension and release, and so on. In describing the music, you are not describing sounds heard in sequence; you are describing a kind of action in musical space, in which things move up and down in response to each other and against resisting fields of force. These fields of force order the one-dimensional space of music, in something like the way gravity orders the spatiotemporal continuum. In describing pitched sounds as music, we are situating them in another order of events than the order of nature.” (The Soul of the World)

While it may be possible to translate some of these ideas into the language of physics, Scruton seems more or less satisfied with the self-sufficiency and internal coherence of musical space. The notions of “movement in musical space, of gravitational forces, of answering phrases and symmetries, of tension and release” don’t easily map onto other methods of description outside the musical discipline. But they are coherent in that discipline. Scruton doesn’t dismiss other, external descriptions of these things. Rather he valorizes both with what he calls “cognitive dualism”. Even if different methods are incommensurable – not fully intertranslatable – they can still both be accurate for their own purposes, as far as they can go.

An important aspect of Scruton’s philosophy of music, for my purposes, is the role of structure, or organization, in it. And he understands the rules of such organization to be highly delimited, while still permitting endless variety:

“We should recognize here that music is not just an art of sound. We might combine sounds in sequence as we combine colors on an abstract canvas, or flowers in a flowerbed. But the result will not be music. It becomes music only if it also makes musical sense. Leaving modernist experiments aside, there is an audible distinction between music and mere sequences of sounds, and it is not just a distinction between types of sound (e.g. pitched and unpitched, regular and random). Sounds become music as a result of organization, and this organization is something that we perceive and whose absence we immediately notice, regardless of whether we take pleasure in the result. This organization is not just an aesthetic matter, not simply a style. It is more like a grammar, in being the precondition of our response to the result as music. We must therefore acknowledge that tonal music has something like a syntax—a rule-guided process linking each episode to its neighbors, which we grasp in the act of hearing, and the absence of which leads to a sense of discomfort or incongruity.” (The Space of Music: Review Essay of Dmitri Tymoczko’s A Geometry of Music)

I think this idea can help with the Library of Vienna and the reasons for why most of its compositions are not musically satisfying. As in the Library of Babel, compositions in the Library of Vienna can have or not have musical sense. It’s the absence of this musical sense that is gibberish. Musical sense doesn’t translate into semantic sense. Even with program music, in which there is a purported semantic meaning, such meaning doesn’t really come from the music itself. The meaning of music is internal to itself and expressible in its own terms. Scruton lists some structural features of music that give it coherence and musical sense. But before going through Scruton’s list let’s take a look at Tymoczko, who also lists out critical features of Western music and gives an interesting point of musicological comparison.

In his book, A Geometry of Music, Tymoczko lists five features that contribute to a sense of tonality. These are:

1. Conjunct melodic motion
2. Acoustic consonance
3. Harmonic consistency
4. Limited macroharmony
5. Centricity

Conjunct melodic motion is a term for the way that “melodies tend to move short distances from note to note”. We could think here of the melody in “Mary Had a Little Lamb” as an excellent example of this.

Acoustic consonance is the tendency for consonant harmonies to be preferred to dissonant harmonies, and to be used as points of musical stability. This has to do with the kinds of harmonic intervals we perceive to be consonant and dissonant. Highly consonant intervals include unisons, octaves, fourths, and fifths. Dissonant intervals include tritones (diminished fifths), minor seconds, and major sevenths. As points of musical stability we can think about the kinds of harmonies on which lines will tend to resolve. The idea of musical resolution actually is this movement from dissonance to consonance. Dissonance will give a sense of being incomplete and the movement to consonance a sense of satisfaction.

Harmonic consistency is the tendency for the harmonies in a passage of music to be structurally similar to each other. Musical compositions will use consonant sequences or dissonant sequences, but not a scrambled mixture of both.

Tymoczko uses the term “limited macroharmony” to refer to “the total collection of notes heard over moderate spans of musical time. Tonal music tends to use relatively small macroharmonies, often involving five to eight notes”. Another way of putting this is that pitches are organized as scales within the octave; they’re in a certain key.

Centricity is the tendency for one note to be heard as more prominent than the others over small spans of musical time. These central notes occur more frequently and serve as a goal for musical motion.

Tymoczko says that his collection of these five components is empirical, theoretical, and historical. We might consider here an analogy to language, playing off Scruton’s idea of musical grammar. Learning a language is taking part in a linguistic tradition or practice. And I think much the same can be said for tonal music. The reason for the prominence of these five features may be partially innate, but regardless they are certainly refined and reinforced by the musical tradition and community of musicians and listeners. Tymoczko has an interesting comment that I think pertains well to the Library of Vienna:

“My central conclusion is that the five features impose much stronger constraints than we would intuitively expect. For example, if we want to combine melodic motion and harmonic consistency, then we have only a few options, the most important of which involve acoustically consonant sonorities. And if we want to combine harmonic consistency with limited macroharmony then we are led to a collection of very familiar scales and modes.”

Because of the strictness of the constraints that these five features impose on tonal music we would expect only an infinitesimal subset of the compositions in the Library of Vienna to conform to these constraints.

Let’s look at the features Scruton picks out for tonal music, or what he calls “diatonic” music, which includes music that uses the major and minor scales as well as various modes, like the Dorian and Phrygian modes. Scruton lists six features (Music as an Art):

1. Closure
2. The musical boundary
3. The topology of musical space
4. The distinction between (a) subject, thesis, and theme and (b) what is built on it
5. The distinction between harmonies and simultaneities
6. The specific phenomenology of diatonic space

These certainly require some explanation.

By closure Scruton has in mind the tendency of diatonic music to come to points of conclusion or pauses. He compares these to “colons, semi-colons, and full stops”. And there are two reasons for this. The first is that the scale has a point of rest in the tonic. The second is that harmonic progressions can lead to chords without tension; they can resolve. “Hence tonal music admits of both melodic and harmonic cadences – sequences in which accumulated tension is released, as when a suspension is resolved by neighbor-note movement”.

The feature of the musical boundary is very similar to closure but is just the most final form of it, as the ultimate closure of a composition, which defines the temporal stage on which the musical work takes place. “Musical elements in the diatonic language have temporal boundaries: they begin and they end, and between those two points they are in continuous movement. This is true even if there is no sound to be heard. Tonal music moves through silences, and is on its way to closure even when there is nothing to be heard.” It’s worth noting here how this feature of music also gives it a narrative structure, even if the musical meaning is only intelligible in its own terms; it’s only “about” itself. But still there’s a kind of narrative progression, development, and resolution. This kind of narrative structure would be lacking from a random composition in the Library of Vienna.

Scruton says of the topology of musical space: “Thanks to octave equivalence the one-dimensional space of music is folded over at the octave, coming back to its point of departure at every twelfth semitone… it creates a kind of lattice on which melodies and harmonies are arranged and transposed.” This is a point covered under the modular arithmetic of chromatic scales. The modular chromatic scale is a musical structure that supports musical compositions, tonal compositions but also atonal compositions for that matter.

By the distinction between (a) subject, thesis, and theme and (b) what is built on it, Scruton is referring to the way musical compositions are often structured to refer back to principal themes on which variations are made. “We recognize the return of the theme, its occurrence in other places, and the various augmentations, diminutions, ornamentations, and variants that made it a mutating presence in the work, a personality that can change its dress and its manner but remain always in essence the same.”

I’d like to dwell on this point a bit because it marks another important feature of structure in music. One very non-random feature of musical compositions is that they tend to contain repetitive elements. We can easily hear this in music containing repeating verses and a chorus. Many songs follow an AABA structure with two sections: an A section that occurs twice, followed by a contrasting B section, before returning again to the opening A section. Think “Over the Rainbow” as an example. There’s a definite narrative arc to that kind of structure.

Twelve bar blues has a similar kind of structure in its chord progression, using tonic (I), subdominant (IV), and dominant chords (V). For example, the progression:

{I7,I7,I7,I7,IV7,IV7,I7,I7,V7,IV7,I7,I7}

Another narrative arc, with a little more complexity, moving first to the subdominant and back down, then to the dominant, and resolving again at the tonic.

The sonata form of classical period music has this similar arc structure. It consists of three main sections: exposition, development, and recapitulation. A theme is presented in the exposition upon which the sonata will be based. A development section elaborates and contrasts with the theme presented in the exposition and this development section can make for some of the most challenging and interesting material in the composition. Be that as it may, it is elevated to a higher energy state that is not restful, and the recapitulation section returns things to the ground state and resolves the piece, not simply as a repetition but as a musical section that functions grammatically as a response and summary of the whole composition. Like in the hero’s journey, the hero returns home, but wiser and more mature.

The fugue is a form with one of the most sophisticated musical structures. Johann Sebastian Bach is rightfully considered the master of this form as found, for example, in The Art of Fugue and The Well-Tempered Clavier. These works have variations on a principal subject of increasing complexity. For example, in The Art of Fugue the first four fugues are called “Simple Fugues” on a principal subject. Nevertheless even these simple fugues consist of 4 voices and progressively employ inversions, intense chromaticism, and counter-subjects. All the fugues employ counterpoint, in which two or more musical lines play against one another and weave together. This is typical of a Bach fugue in which the overwhelming impression is that there’s just a lot going on at once. But there’s a great deal of method in it. Inversion is a kind of rearrangement in intervals, chords, voices, and melodies. In the case of melodies we can think of it as flipping a melody “upside-down” and reversing its contour, a kind of mirror image. And Bach uses this frequently. The later fugues in The Art of Fugue use their principal subjects simultaneously in regular, inverted, augmented, and diminished forms (doubling or halving the duration of the notes), and they start including two or three subjects. Musicologist Christoph Wolff said of The Art of Fugue that: “The governing idea of the work was an exploration in depth of the contrapuntal possibilities inherent in a single musical subject.” Bach was pressing the form as far as he could take it.

So all that was to comment on Scruton’s fourth feature of diatonic music: the distinction between the subject, thesis, or theme and what is built on it. Hopefully those examples give an idea of how important and pervasive those are in musical tradition and practice.

Something to consider in terms of structure with all these forms, especially with the fugue, is that there are musical structures that can undergo transformation, much like other kinds of structures, such as vectors or images. With inversions, augmentations, and diminutions the elements of the musical structure are undergoing reflections, expansions, and contractions that preserve the original in recognizable form. That’s part of the idea, to refer everything back to the original. There are changes to the structure but it’s not a random or disjunct transition between unrelated sets of elements. There is order to it. It reminds me of Carl Sagan’s line that “things change alright, but according to patterns, [and] rules”. There’s a non-arbitrariness to the change that makes the term transformation an appropriate description, as something structure-preserving.

The fifth of Scruton’s features of diatonic music is the distinction between harmonies and simultaneities. What is the difference between these? Any composition in our Library of Vienna is very likely to have notes occurring simultaneously. This is the toddler banging on the keys or all 88 keys on the piano being played at once. Those are simultaneities. But Scruton is making the case that those kinds of simultaneities are not harmonies. He says: “This is perhaps one of the most important of all the marks of tonality. Tones sounding together strike us only rarely as simultaneous but separate events, and more often as parts of a single complex event.” A C-Major chord, for example, is in one sense a simultaneous occurrence of 3 pitches. But musically we experience it as a single entity, as a C-Major chord to which those 3 pitches belong together as parts of a whole. We might think of this as a kind of embedding or creation of musical modules, similar to the way frequencies and overtones get embedded into musical concepts of notes and timbres.

Scruton’s sixth and final feature of diatonic music is the specific phenomenology of diatonic space. This may be the most interesting and also challenging feature, and it is very characteristic of Scruton’s phenomenological interests. Phenomenology here means the philosophical study of the subjective, first-person experience one has of things, the “what it’s like” quality of an experience. Scruton says of diatonic space and of most musical space in general that “tones seem to move into each other, to compel each other’s appearance, to belong together by a kind of magnetism that makes one tone an introduction to the other and the other a fitting sequel… there is a phenomenological ‘belong together’ that leads us spontaneously to distinguish right from wrong in what we hear.” Recall Scruton’s description of Beethoven’s Third Piano Concerto and its movement through musical space. This kind of abstract space has its own logic and laws with gravitational forces, answering phrases and symmetries, tension and release, and so on.

To give an idea of the unique and self-sufficient nature of musical space I’ll quote another passage from Scruton that I really like:

“Ask yourself just what it is that moves, when music moves. The melody of the Beethoven began on C and moved up to E-flat. But what moved? Not C, which is stuck forever at C. Nor did anything release itself from that C and travel to E-flat—there is no musical ectoplasm that travels across the void between the semitones. If you go on pushing questions like those, you will soon come to the conclusion that there is something contradictory in the idea that a note can move along the pitch spectrum—no note can be identified independently of the place that it occupies, which makes it seem as though the idea of a place is in some way illegitimate. In all kinds of ways musical space defies our ordinary understanding of movement: for example, octave equivalence means that a theme can return to its starting point even though moving constantly upward—a kind of Escher paradox, which has no equivalent in ordinary geometry. Musical space has other interesting topological features. For example, things can rarely be moved through musical space in such a way as to coincide with their mirror image, any more than the left hand, to take Kant’s famous example, can be turned in physical space so as to coincide with the right hand. Thus no asymmetrical chord can be transposed onto its mirror form. The net result of those and similar reflections is to conclude that nothing literally moves in musical space, but that in some way the idea of space cannot be eliminated from our experience of music. We are dealing with an entrenched metaphor—but not a metaphor of words, exactly, for we are not talking about how people describe music; we are talking about how they experience it. It is as though there is a metaphor of space and movement embedded within our experience and cognition of music. This metaphor cannot be ‘translated away,’ and what it says cannot be said in the language of physics—for example, by talking instead of the pitches and timbre of sounds in physical space. Yet what it describes, the musical movement, is a real presence—and not just for me: for anyone with a musical ear.” (The Soul of the World)

This phenomenological structure of music is the most difficult to describe and, if Scruton is correct about cognitive dualism and incommensurability, may have to be described in its own terms. Still the limiting forces of structure seem to apply in this phenomenological space as well. I mentioned in the last episode how some of the criteria distinguishing structure might be more difficult to pin down. David Bentley Hart noted how many aesthetic criteria we might define have a tendency to exclude artistic creations of merit or to include formulaic works without vision. Still, we know that there are such standards. The residents of both the Library of Babel and the Library of Vienna know and feel the differences between the gibberish and the structurally meaningful.

The difficulty of translation between the objective/physical and subjective/phenomenological aspects of structure calls for attention and it’s something I want to think more about and revisit in future episodes. But some ideas occur to me now that I’ll touch on. The basic meta-structure, i.e. the structure of structure, at work at the bridge or gulf between the physical and phenomenological is one in which the operations occurring between inputs and outputs are obscured. There’s a kind of homology here to the basic structure of an artificial neural network (ANN). An artificial neural network is a computational model that consists of several processing elements that receive inputs and deliver outputs based on their predefined activation functions. It’s modeled after the biological neural networks that constitute animal brains. But what’s interesting for our purposes here is that the structure of an artificial neural network consists of three basic layers: (1) an input layer, (2) an output layer, and (3) a hidden layer.

In a traditional computer programming scheme the pattern is to (1) write the algorithm or rules for the program to follow, (2) feed the program the data to process and, (3) get the desired output. The programming scheme of machine learning works in the other direction. With machine learning you (1) start with the desired output, (2) feed the program the data to process and, (3) have the computer generate the algorithm or rules for the program to follow. This can be a much more convenient scheme for many types of problems. In particular, it works well when you don’t know exactly what the rules should be but you have a pretty good idea of what the general patterns should look like. That actually seems pretty analogous to the gap between the physical and phenomenological conceptions of music.

Let’s look at this from the phenomenological side of the canyon facing the physical side. Why the gap? There are many things that humans are very good at doing without being able to describe what we’re doing. Facial recognition is a good example. We’re really good at it, but we don’t know how we do it. Because we don’t know how we do it, it’s really hard to verbalize an explicit rule that a machine could use to solve the same task. And I think we could say something similar about our aesthetic musical judgment. The fundamental insight of neural networks is that we can represent very big, very general programs capable of making many subtle distinctions, yet in a form that is simple and regular and therefore amenable to systematic optimization. A neural network consists of (1) units, (2) weighted connections between the units, and (3) activation functions in each unit. This is a very general program but where the specificity can arise is in the weights.

It’s the weights that are adjusted during the learning process, and this adjustment constitutes the generation of new “rules”, so to speak, to achieve the desired outcomes. The network learns by receiving inputs from examples. During training, the examples have known inputs and outputs. With these examples the network determines the difference between a processed output from the network and the target output, which is the error. Then the network uses the error to adjust the weights assigned to the connections between units. Over repeated training periods and exposure to more examples the network will adjust these weights to better match the target output. The idea is that after the training period the network will be able to process real data for which the output isn’t previously known and produce an accurate output, using the weights that were developed during training.
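To make the three-layer picture concrete, here’s a minimal from-scratch sketch in Python/NumPy. The task (learning XOR), the layer sizes, and the learning rate are all illustrative assumptions, not anything specific to the aesthetic problem under discussion:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training examples with known inputs and known target outputs (XOR).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# (1) units in input, hidden, and output layers,
# (2) weighted connections between the layers,
# (3) a sigmoid activation function in each unit.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    hidden = sigmoid(X @ W1 + b1)        # forward pass
    out = sigmoid(hidden @ W2 + b2)
    error = out - y                      # processed vs. target output
    # Backward pass: use the error to adjust the connection weights.
    g_out = error * out * (1 - out)
    g_hid = (g_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * hidden.T @ g_out
    b2 -= 0.5 * g_out.sum(axis=0)
    W1 -= 0.5 * X.T @ g_hid
    b1 -= 0.5 * g_hid.sum(axis=0)

print(out.round(2).ravel())  # should approach [0, 1, 1, 0] after training
```

The trained weights are the generated “rules”: they solve the task, but nothing in them reads like a verbal description of what was learned.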

What’s interesting to me about all this in terms of a philosophy of structure is that this kind of process seems to be able to accurately identify what we perceive to be certain kinds of structures without having to give explicit criteria or descriptions of those structures. Nevertheless, there are rules there in the weights. It’s just that those rules take a form that doesn’t easily lend itself to verbal description, much like the phenomenological properties themselves. Still, it does seem to be closer to a kind of bridge between the physical and phenomenological.

The key for me is that the phenomenological aspects of structure cannot be left out. And this brings up an example of structure in musical history that cannot go without mention. And that is the compositional method of serialism. Serialism is highly relevant to the present study because structure is arguably its most salient feature. Serialism is a compositional technique in which a fixed series of notes – and sometimes rhythms, dynamics, timbres, or other musical elements – is used to generate the harmonic and melodic basis of a composition and is subject to change only in specific, heavily rule-governed ways. Twentieth century serialism was highly controversial on many fronts, including in terms of aesthetics. And it makes for an interesting case study on structure and musical aesthetics. In a certain sense no form of music is more structured. Pierre Boulez may be the greatest example of tying every detail of pitch, rhythm, and dynamics down to its most systematized and structured form. He even titled two of his most important compositions Structures I and Structures II. So these cry out for attention under the present discussion.

Both the twelve-tone technique and serialism are easy to criticize but I don’t want to simplistically pile on. This is a subject on which my opinion has changed and Robert Greenberg’s Great Courses series Great Music of the 20th Century was very helpful in that regard. Greenberg shared a few insights that I found very helpful to better appreciate serialism:

1. Even if serialism sounds dated today it served an important role in the historical development of music, which was part of Arnold Schoenberg’s intention from the start.

2. Much serialist music sounds the same and mediocre because, as with all types of music, much of it is mediocre. But…

3. In the hands of certain musical geniuses like Arnold Schoenberg and Igor Stravinsky, who brought consummate musical creativity and sensitivity to the somewhat algorithmic nature of the twelve-tone technique, serial compositions could be as remarkable and aesthetically estimable as the greatest canonical works.

Arnold Schoenberg felt compelled as a matter of duty to play the role of the great German innovators like Beethoven and Wagner, to break free from the constraints of habit and custom in order to move musical development forward. And he arguably succeeded. But his vision was never about removing the creativity of the composer, removing the human element, from the process of composition. Schoenberg always remained as the flesh-and-blood craftsman of his compositions even as he constrained his compositions according to new structural forms. He continued to shape his music for purposes of eliciting response in the listener. And in this way he was still attuned to the phenomenological aspects of musical structure.

Pierre Boulez had a different vision, aiming to make music much more abstract, incorporeal and removed from the flesh-and-blood world of composers and listeners. He said of his composition Structures:

“I wanted to eradicate from my vocabulary absolutely every trace of the conventional, whether it concerned figures and phrases, or development and form; I then wanted gradually, element after element, to win back the various stages of the compositional process, in such a manner that a perfectly new synthesis might arise, a synthesis that would not be corrupted from the very outset by foreign bodies—stylistic reminiscences in particular.”

Robert Greenberg described music of the type Boulez promoted as “music that is only ‘about’ its compositional process.” In this compositional theory the method is the content. Along similar lines, Ernst Krenek said, in favor of serialist composition:

“Actually the composer has come to distrust his inspiration because it is not really as innocent as it was supposed to be, but rather conditioned by a tremendous body of recollection, tradition, training, and experience. In order to avoid the dictations of such ghosts, he prefers to set up an impersonal mechanism which will furnish, according to premeditated patterns, unpredictable situations… while the preparation and the layout of the material as well as the operations performed therein are the consequence of serial premeditation, the audible results of these procedures were not visualized as the purpose of the procedures. Seen from this angle, the results are incidental.” (Extents and Limits of Serial Techniques, 1960)

This is about as far removed from Scruton’s phenomenological feature of music as you can go. I’m tempted to say it coheres well with a kind of eliminativist theory of mind that dispenses with notions of self-consciousness and first-person subjective experience. At the very least those are understood to be superfluous to the composition of the music.

Quoting Robert Greenberg again:

“The end result, the actual piece of music, is a manifestation of its formula, and as a result, it is not so much a piece of music as it is a document. Like most conceptual art, the real substance of such a composition lies in its ‘idea’, its formula, its row, rather than in its actual ‘execution’ in real time.”

Well, I said I had no desire to pile on the criticism of serialism. And I really don’t. My point here is not so much criticism of Boulezian serialism as the identification of those features that are most salient to the criticisms that have been made. And they are the same features that are not applicable to the most enduring and exceptional works of the serialists, for whom the methods still permitted great freedom in which to compose with musical creativity. Both a random composition in the Library of Vienna and the most highly algorithmic and aspirationally composer-less serialist composition might have a high degree of structure in one sense. But a high degree of structure and high information content alone don’t translate into high musicality or exhibit the degree of musical structure of the kind described by Scruton and Tymoczko. There are more requirements imposed on musical structures from the boundary conditions of musical space.

So let’s return to the larger project of which this whole discussion of music is a part, a study of structure. What are some features at work here?

One important feature of structure is the relation between the parts and the whole, the elements and the sets they compose. In music, one example of individual elements is pitches or notes. And these are basic building blocks of music. Other examples include note durations (units of rhythm), dynamics, and timbre.

Another important feature of structure is the relationship between parts. If a system is structured it is more than a matter of there being multiple parts composing a whole. It’s also a matter of the way the parts relate to each other. In music we have pitches but we also understand these pitches to relate to each other in well-defined ways. They are separated by intervals. They can combine into harmonic intervals and chords. And they wrap back onto themselves across octaves and scales.

Another important feature of structure is embedding. Structures can be embedded as elements into higher structures. They can act as kinds of modules by which the structure of the embedded element can be “called” and run as a kind of subroutine, without having to attend to all the details of the embedded structure. For example, the physical details of a composite sound wave produced by a particular instrument, with its harmonics and Fourier series coefficients, can be wrapped up into a module and embedded as a pitch element in a musical set. Musically we only have to think of it as an “A”, or whatever note it might be. The lower-level structure is still present but it’s conveniently packaged to be used in the higher-level operations.

One final feature of structure mentioned here that I find fascinating and want to explore in more detail later is the kind of structure that spans the gap between the objective/physical aspects of structure and subjective/phenomenological aspects of structure. This is one of the most perplexing features of structure. And it’s this part that overlaps with aesthetic philosophy that is so pertinent to musical structure. In music the important musical structures are not necessarily heavy in information content. They may in fact be highly regular and compressible in terms of information theory. But they are heavy in meaning. And how can that be defined in a rigorous way? Whether or not it ends up being intelligible it may be that some kind of neural-network-like structure, with weights generated through aesthetic training might at least quantify some form of structure that spans this gap.

So that’s an exploration of a philosophy of structure as it pertains to music. I’ll pick up some of the ideas developed here and carry them on to more subjects and continue the process of thinking about the structure of structure.

Philosophy of Structure, Part 1: Thinking About Structure

This is the first in a series of episodes exploring a philosophy of structure. To introduce the concept and to start thinking about structure I make use of ideas from thermodynamics, phase spaces, information theory, The Three-Body Problem, the Hebrew Bible, the Book of Mormon, Daniel Dennett, and Jorge Luis Borges. Drawing from a variety of sources the objective is to find patterns in structures across multiple fields, to understand a general structure of structure.

The Hebrew Bible describes the condition of the earth before creation as תֹהוּ וָבֹהוּ (tohu va-bohu), “formless and void”. Both terms, tohu and bohu, convey formlessness and emptiness. It’s an interesting pair of ideas. And carrying these ideas beyond their original narrative setting, I’m intrigued by the thought that lack of form, or structure, could be understood also as a kind of emptiness, or nothingness.

With this episode I’d like to start what I intend to be a series of episodes about structure, looking at a philosophy of structure. Today I just want to introduce some general ideas and then explore particular examples of structure in more detail in later episodes, looking at structure in music, chemistry, biology, and other fields.

To introduce the subject I’d like to pull together some ideas from different subjects that range from highly technical and quantitatively rigorous to conceptual and qualitative. There are tools in physics and in information theory that can give very specific measures of certain kinds of structures. And those tools are also conceptually instructive. But I don’t think those measures exhaust or cover everything that we mean by or understand structure to be. So all of this will fall into a diverse toolbox of ways to think about structure and to approach the topic.

Going back to the Hebrew Bible and the primordial condition of tohu va-bohu, if we think of this condition as a lack of structure, the kind of emptiness or nothingness I imagine in the lack of structure is not absolute nothingness, whatever that might be. But it’s nothingness of the sort of there not being anything very interesting. Even if there’s “stuff” there, there’s not really anything going on. Or even if there is stuff going on, as in a whirling, chaotic mass, it still amounts to uniformity, with all the pieces just canceling each other out, adding up to not very much.

Part of the lack is an aesthetic lack. An absence of engaging content. There’s a great literary illustration of this idea in Cixin Liu’s novel The Three-Body Problem, in a scene where one of the characters, who is suffering from a mental illness, meditates and tries to heal his troubled mind:

“In my mind, the first ‘emptiness’ I created was the infinity of space. There was nothing in it, not even light. But soon I knew that this empty universe could not make me feel peace. Instead, it filled me with a nameless anxiety, like a drowning man wanting to grab on to anything at hand. So I created a sphere in this infinite space for myself: not too big, though possessing mass. My mental state didn’t improve, however. The sphere floated in the middle of ‘emptiness’—in infinite space, anywhere could be the middle. The universe had nothing that could act on it, and it could act on nothing. It hung there, never moving, never changing, like a perfect interpretation for death. I created a second sphere whose mass was equal to the first one’s. Both had perfectly reflective surfaces. They reflected each other’s images, displaying the only existence in the universe other than itself. But the situation didn’t improve much. If the spheres had no initial movement—that is, if I didn’t push them at first—they would be quickly pulled together by their own gravitational attraction. Then the two spheres would stay together and hang there without moving, a symbol for death. If they did have initial movement and didn’t collide, then they would revolve around each other under the influence of gravity. No matter what the initial conditions, the revolutions would eventually stabilize and become unchanging: the dance of death. I then introduced a third sphere, and to my astonishment, the situation changed completely… This third sphere gave ‘emptiness’ life. The three spheres, given initial movements, went through complex, seemingly never-repeating movements. The descriptive equations rained down in a thunderstorm without end. Just like that, I fell asleep. The three spheres continued to dance in my dream, a patternless, never-repeating dance. Yet, in the depths of my mind, the dance did possess a rhythm; it was just that its period of repetition was infinitely long. This mesmerized me. I wanted to describe the whole period, or at least a part of it. The next day I kept on thinking about the three spheres dancing in ‘emptiness.’ My attention had never been so completely engaged.”

And that’s a very imaginative description of what’s known in physics and mathematics as the three-body problem, which, unlike the two-body problem, has no closed-form solution; that is the reason for the unending, non-repeating motion. What I like about this story is the way the character responds to the increasing structure in his mental space. As structure increases he becomes increasingly engaged. I think this subjective response to structure will have to be an indispensable aspect of any philosophy of structure.

Another literary, or scriptural, example of this idea is in Latter-day Saint scripture, in the Book of Mormon. A prophet named Lehi talks about how existence itself depends on the tension between opposites: “For it must needs be, that there is an opposition in all things. If not so… righteousness could not be brought to pass, neither wickedness, neither holiness nor misery, neither good nor bad. Wherefore, all things must needs be a compound in one; wherefore, if it should be one body it must needs remain as dead, having no life neither death, nor corruption nor incorruption, happiness nor misery, neither sense nor insensibility.” (2 Nephi 2:11)

There’s a similar idea here of “death” as with Cixin Liu’s character who finds only death in his static or repetitive mental structures.

Another idea that comes up in both the aesthetic and technical instances of structure is that of distinction. Lehi talks about opposition, setting one thing against another. We could say that the opposing entities endow each other with definition and identity. The Hebrew Bible also contrasts the formlessness and emptiness, the תֹהוּ וָבֹהוּ (tohu va-bohu), with separation. Elohim brings order to the earth by separating things; the verb of separation in the Bible is בָּדל (badal). וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָאֹ֖ור וּבֵ֥ין הַחֹֽשֶׁךְ (va-yavdel elohim ben ha-or u-ben ha-choshek); “and God separated the light from the darkness” (Genesis 1:4). God separated the sky from the sea, the day from the night. Through separation what was a formless void came to have structure.

What are some examples of structure from a more technical side, scientifically and philosophically? Interestingly enough there are actually some concepts that overlap with these literary and scriptural ones, sharing notions of both distinction and form.

One way to think about a system is by using a phase space. A phase space is an abstract space in which all possible states of a system are represented, with each possible state corresponding to one unique point in the phase space. This is also called a state space. To get the general concept of an abstract space you can think of a graph with a horizontal axis and a vertical axis, each axis representing some property. The points on the graph represent different combinations of the two properties. That’s a kind of phase space.

To give a very simple example, consider a system of 2 particles in 1-dimensional space, which is just a line. The state space containing all possible arrangements of a system of n particles in 1-dimensional space will have n dimensions, one dimension per particle. Such a space dealing strictly with positions is also called a configuration space. So for our 2-particle system the configuration space will have 2 dimensions. We can represent this on a graph using a horizontal axis for one particle and a vertical axis for the second particle. Any point on the graph represents a single combination of positions for particles 1 and 2. That example is nice because it’s visualizable. When we expand to more than 3 dimensions we lose that visualizability but the basic principles still apply.

A classic example of a phase space is for a system of gas particles. Say we have a gas with n particles. These particles can have several arrangements. That’s putting it mildly. The collection of all possible arrangements makes up a configuration space of 3n dimensions, 3 dimensions for every particle. Such a space could have billions upon billions of dimensions. This is not even remotely visualizable but the principles are the same as in our 2-dimensional configuration space above. A single point in this configuration space is one possible arrangement of all n particles. It’s like a snapshot of the system at a single instant in time. All the points in the configuration space comprise all possible arrangements of these n particles.

To get a more complete picture of the system a phase space will have 3 additional dimensions per particle for its momentum in the 3 spatial directions. So the phase space will have 6n dimensions. A snapshot of the system, with the positions and momenta of every particle at one instant, will occupy a single point in the 6n-dimensional phase space, and the entire 6n-dimensional phase space will contain all possible combinations of positions and momenta for the system. The evolution of a system through successive states will trace out a path in its phase space. It’s just a mind-bogglingly enormous space. There’s no way we can actually imagine this in any sort of detail. But just the concept is useful.
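To make the bookkeeping concrete, here’s a minimal Python sketch of a phase-space snapshot; the particle count and the random values are arbitrary choices for illustration, not anything from the physics itself:

import numpy as np

n = 4                                  # number of particles (tiny, for illustration)
rng = np.random.default_rng(0)

positions = rng.uniform(0, 1, (n, 3))  # x, y, z for each particle
momenta = rng.normal(0, 1, (n, 3))     # px, py, pz for each particle

# One microstate of the system = one point in 6n-dimensional phase space.
microstate = np.concatenate([positions.ravel(), momenta.ravel()])
print(microstate.shape)                # (24,), i.e. 6 * n dimensions

The evolution of the system is then a curve through this 6n-dimensional space, one such point per instant.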

The sum total of all possible states that a system can take constitutes a tremendous amount of information, but most states in a phase space aren’t especially interesting and I’d suggest that this is because they aren’t very structured. One useful affordance of phase spaces is that we can collect groups of states and categorize them according to criteria of interest to us. The complete information about a single state is called a microstate; a microstate has all the information about that system’s state. So for example, in the case of a system of gas particles the microstate gives the position and momentum of every particle in the system. But for practical purposes that’s too much information. To see if there’s anything interesting going on we need to look at the state of a system at a lower resolution, at its macrostate. The procedure of moving from the maximal information of microstates to the lower resolution of macrostates is called coarse graining. In coarse graining we divide the phase space up into large regions that contain groups of microstates that are macroscopically indistinguishable. We can represent this pictorially as a surface divided up into regions of different sizes. The states in a shared region are not microscopically identical. In the case of a system of particles, the states have different configurations of positions and momenta for the particles composing them. But the states in the shared region are macroscopically indistinguishable, meaning that they share some macroscopic property. Examples of such macroscopic properties for a gas are temperature, pressure, volume, and density.

The size of a macrostate is given by the number of microstates included in it. Macrostates of a phase space can have very different sizes. Some states are very unusual and occupy tiny regions of the phase space. Other states are very generic and occupy enormous regions. A smaller macrostate is one with fewer microstates that could produce it; it’s more unique. Larger macrostates are more generic. An example of an enormous macrostate region is thermodynamic equilibrium. Thermodynamic equilibrium is a condition in which the macroscopic properties of a system do not change with time. So, for example, macroscopic properties like temperature, pressure, volume, and density would not change in thermodynamic equilibrium. The reason the region of thermodynamic equilibrium is huge is that it contains a huge number of macroscopically indistinguishable microstates. What this means is that a condition of thermodynamic equilibrium can be realized in an enormous number of ways. In a gas, for instance, the particles can have an enormous number of different configurations of positions and momenta that make no difference to the macroscopic properties and that all manifest as a condition of thermodynamic equilibrium. The system will continue to move through different microstates with time, tracing out a curve in phase space. But because the macrostate of thermodynamic equilibrium is so huge the curve will remain in that region. The system is not going to naturally evolve from thermodynamic equilibrium to some more unique state. That is so statistically unlikely as to be, for all practical purposes, a non-possibility.

I think of the Biblical תֹהוּ וָבֹהוּ (tohu va-bohu) as a kind of thermodynamic equilibrium. It’s not necessarily that there’s nothing there. But in a sense, nothing is happening. Sure, individual particles may be moving around, but not in any concerted way that will produce anything macroscopically interesting.

This thermodynamic equilibrium is the state toward which systems naturally tend. The number of indistinguishable microstates in a macrostate, the size of a macrostate, is quantified as the property called entropy. Sometimes we talk about entropy informally as a measure of disorder. And that’s well enough. It also corresponds nicely, albeit inversely, to the notion of structure. More formally, the entropy of a macrostate is correlated (logarithmically) to the number of microstates corresponding to that macrostate. Using the intuition of the informal notion of disorder you might see how a highly structured macrostate would have fewer microstates corresponding to it. There aren’t as many ways to put the pieces together into a highly structured state as there are to put them into a disordered state.

Some notions related to structure, like meaning or function, are fairly informal. But in the case of entropy it’s actually perfectly quantifiable. And there are equations for it in physics. If the number of microstates for a given macrostate is W, then the entropy of that macrostate is proportional to the logarithm of W, the logarithm of the number of microstates. This is Boltzmann’s equation for entropy:

S = k log W

in which the constant k is the Boltzmann constant, 1.381 × 10⁻²³ J/K, and entropy has units of J/K. This equation for entropy holds when all microstates of a given macrostate are equally probable. But if this is not the case then we need another equation to account for the different probabilities. That equation is:

S = −k ∑ pᵢ log pᵢ

or equivalently,

S = k ∑ pᵢ log (1/pᵢ)

where pᵢ is the probability of each microstate. This reduces to the first equation when the probabilities of all the microstates are equal and pᵢ = 1/W.
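As a quick check on those formulas, here’s a small Python sketch (the state count W is an arbitrary choice for illustration) showing that the general equation reduces to Boltzmann’s when all microstates are equally probable:

import math

k = 1.381e-23  # Boltzmann constant, J/K

def entropy(probs):
    # S = -k * sum(p_i * log(p_i)), skipping zero-probability microstates
    return -k * sum(p * math.log(p) for p in probs if p > 0)

W = 1000
uniform = [1 / W] * W
print(entropy(uniform))  # -k * sum((1/W) log(1/W)) = k log W
print(k * math.log(W))   # same value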

How might W look different for different macrostates? It’s fairly easy to imagine that a state of thermodynamic equilibrium would have a huge number of indistinguishable microstates. But what if the system has some very unusual macrostate? For example, say all the gas particles in a container were compressed into a tiny region of the available volume. This could still be configured in multiple ways, with many microstates, but far fewer than if the particles were distributed evenly throughout the entire volume. Under such constraints the particles have far fewer degrees of freedom and the entropy of that unusual configuration would be much lower.

Let’s think about the different sizes of macrostates and the significance of those different sizes in another way, using an informal, less technical, literary example. One of my favorite short stories is La biblioteca de Babel, “The Library of Babel”, by Argentine author Jorge Luis Borges. Borges was a literary genius and philosophers love his stories. This is probably the story referred to most, and for good reason. In La biblioteca de Babel Borges portrays a universe composed of “an indefinite and perhaps infinite number of hexagonal galleries”. This universe is one vast library of cosmic extension. And the library contains all possible books. “Each book is of four hundred and ten pages; each page, of forty lines, each line, of some eighty letters… All the books, no matter how diverse they might be, are made up of the same elements: the space, the period, the comma, the twenty-two letters of the alphabet.” So there are bounds set to the states this library or any of its books can take. But this still permits tremendous variability. As an analogy with statistical mechanics we can think of the Library of Babel as a phase space and of each book as a microstate.
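The numbers in the story let us estimate just how vast this phase space is. Here’s a back-of-the-envelope Python sketch using Borges’s own figures; the total number of books is far too large to print directly, so we count its digits instead:

import math

symbols = 25                    # 22 letters plus the space, period, and comma
chars_per_book = 410 * 40 * 80  # pages x lines x characters = 1,312,000

# Total books = symbols ** chars_per_book; count the digits of that number.
digits = int(chars_per_book * math.log10(symbols)) + 1
print(digits)                   # about 1.83 million digits

A number with around 1.83 million digits dwarfs, for comparison, the roughly 80-digit estimate of the number of atoms in the observable universe.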

Daniel Dennett has referred to the Library of Babel in his philosophy and proposed some of the books that, under the conditions set by Borges, must be understood to exist in this library: “Somewhere in the Library of Babel is a volume consisting entirely of blank pages, and another volume is all question marks, but the vast majority consist of typographical gibberish; no rules of spelling or grammar, to say nothing of sense, prohibit the inclusion of a volume… It is amusing to think about some of the volumes that must be in the Library of Babel somewhere. One of them is the best, most accurate 500-page biography of you, from the moment of your birth until the moment of your death. Locating it, however, would be all but impossible (that slippery word), since the Library also contains kazillions of volumes that are magnificently accurate biographies of you up till your tenth, twentieth, thirtieth, fortieth… birthday, and completely false about subsequent events… Moby Dick is in the Library of Babel, of course, but so are 100,000,000 mutant impostors that differ from the canonical Moby Dick by a single typographical error. That’s not yet a Vast number, but the total rises swiftly when we add the variants that differ by 2 or 10 or 1,000 typos.” (Darwin’s Dangerous Idea)

A key takeaway from this fantastical story is that only an infinitesimal portion of its volumes are even remotely meaningful to readers. The vast majority of the books are complete nonsense. The Library of Babel is a little easier for me to think about in certain ways than phase space. For many things I’m not sure how to generate a phase space by picking out specific properties and assigning them axes onto which individual states would project with numerical coordinates. A lot of things don’t easily lend themselves to that kind of technical breakdown. But thinking of microstates and macrostates more informally, let’s just take the macrostate of all the books in the Library of Babel that are completely meaningless. This would be a huge macrostate comprising the vast majority of the library, the vast majority of its books, i.e. microstates. As with thermodynamic equilibrium this is the most likely macrostate to be in. And the evolution of the system, moving from one book to the next, will more than likely never leave it, i.e. will never find a book with any meaningful text.

But the Library of Babel does contain meaningful texts. And we could coarse grain in such a way as to assign books to different macrostates based on the amount of meaningful text they contain. After the macrostate containing books of complete nonsense, the next largest macrostate will contain books with a few meaningful words. The macrostates for books with more and more meaningful words get successively smaller, and smaller still when those words are put into meaningful sentences and then paragraphs. The smallest macrostates will have entire books of completely meaningful text. But as any book browser knows, books vary in quality. Even among books of completely meaningful text some will be about as interesting as an online political flame war. The macrostates of literary classics and of interesting nonfiction will be comparatively miniscule indeed.

We can think of a book in the Library of Babel as a kind of message. And this is to start thinking about it in terms of another technical field that I think is relevant to a philosophy of structure: information theory. Information theory has some interesting parallels to statistical mechanics. It even makes use of a concept of entropy that is very similar to the thermodynamic concept of entropy. In information theory this is sometimes called Shannon entropy, named after the great mathematician Claude Shannon. The Shannon entropy of a random variable is the average level of “information”, “surprise”, or “uncertainty” inherent in the variable’s possible outcomes. It’s calculated in a very similar way to thermodynamic entropy. If Shannon entropy is H and a discrete random variable X has possible outcomes x₁,…,xₙ, with probabilities P(x₁),…,P(xₙ), then the entropy is calculated by the equation:

H(X) = −∑ P(xᵢ) log_b P(xᵢ)

That equation should look very familiar because it’s identical in form to the equation for thermodynamic entropy, in the case where the microstates have different probabilities. The base of the logarithm is b, often base 2, with the resulting entropy being given in units of bits. A bit is a basic unit of data that represents a logical state with one of two possible values. So for example, whether a value is 0 or 1 is a single bit of information. Another example is whether a coin toss result is heads or tails.

An equation of this form gives an average quantity. The value −log pᵢ, or equivalently log (1/pᵢ), is the “surprise” for a single outcome, and so has a higher value when its probability is lower, which makes sense: more improbable outcomes should be more surprising. When the surprise values for all the outcomes, each multiplied by its probability, are summed together, the result is the average surprise, the Shannon entropy. This average can also be used to calculate the total information a message contains: the entropy per symbol, multiplied by the length of the message, gives the total information content of the message.
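Here’s a minimal Python sketch of that calculation for a short message, using the empirical symbol frequencies as the probabilities; the message itself is just an arbitrary example:

import math
from collections import Counter

def entropy_per_symbol(message):
    # H = -sum(p_i * log2(p_i)) over the symbol frequencies, in bits
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

msg = "abracadabra"
h = entropy_per_symbol(msg)
print(h)             # average surprise, in bits per symbol
print(h * len(msg))  # total information content of the message, in bits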

This is an extremely useful way to quantify information and this is just a taste of the power of information theory. Even so I don’t think it exhausts our notions of what information is, or can be. Daniel Dennett makes a distinction between Shannon information and semantic information (From Bacteria to Bach and Back). Shannon information is the kind of information studied in information theory. To explore this distinction let’s return to Borges’s library.

One thing I like about La biblioteca de Babel is the way it conveys the intense human reaction to semantic meaning, or the incredible lack of it in the case of the library’s inhabitants. The poor souls of the Library of Babel are starving for meaning and tortured by the utter senselessness of their universe, the meaningless strings of letters and symbols they find in book after book, shelf after shelf, floor after floor. There’s a brilliant literary expansion of Borges’s library in the novella A Short Stay in Hell by Steven L. Peck, in which one version of Hell itself actually is the Library of Babel, with the single horrific difference that its inhabitants can never die.

In terms of Shannon information most books on the shelves of the Library of Babel contain a lot of information. But almost all the books contain no semantic information whatsoever. This is an evaluation we are only able to make as meaning-seeking creatures. Information theory doesn’t need to make distinctions about semantic meaning. It doesn’t need to and is able to accomplish its scope of work without it. But when we’re thinking about structures, with meaning and functions, in the way I’m trying to, we need that extra level of evaluation that is, at least for now, only available to humans.

That’s not to say there’s an absolute, rigid split between the objective and subjective. Information theory makes use of the subjective phenomena of human perceptions like sight and sound. This is critical to perceptual coding. We’re all beneficiaries of perceptual coding when we use jpeg images, mp3 audio files, and mp4 video files. These are all file types that compress data, with loss, by disposing of information that is determined to be imperceptible to human senses.

That’s getting a little closer to semantic information. Semantic information doesn’t have to be transmitted word for word. Someone can get the “gist” of a message and convey it with quite high fidelity, in a certain sense anyway. The game of telephone notwithstanding, we can share stories with each other without recording and replaying precise scripts of characters or sound wave patterns. We can recreate the stories in our own words at the moment of the telling.

That’s not to say that structure has to be about perception. Something like a musical composition or narrative has a lot to do with perception and aesthetic receptivity. But even compositions and narratives can contain structure that few people or even no people pick up on. And there are also structures in nature and in mathematics that remain hidden from human knowledge until they are discovered.

I think there are some affinities between what I will informally call the degree of structure in the hidden and discovered structures in nature and mathematics and the degree to which the outputs of those structures can be compressed. Data compression is the process of encoding information using fewer bits than the original representation. How is that possible? Data compression programs exploit regularities and patterns in the data and produce code to represent the data more efficiently. Such a program creates a new file using a new, shorter code, along with a dictionary for the code so that the original data can be restored. It’s these regularities and patterns that I see as characteristic of structure. This can be quantified in terms of Kolmogorov complexity.

The Kolmogorov complexity of an object is the length of the shortest computer program that produces the object as output. For an object with no structure, one that is completely random, the only way for a computer program to produce the object as an output is just to reproduce it in its entirety, because there are no regularities or patterns to exploit. But for a highly structured object the computer program to produce it can be much shorter. This is especially true if the output is the product of some equation or algorithm.
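A rough way to see this in practice is to run a structured object and a random one through an off-the-shelf compressor. This Python sketch uses zlib, with the data and sizes chosen arbitrarily for illustration:

import os
import zlib

structured = b"ABAB" * 25000            # 100,000 bytes of obvious pattern
random_data = os.urandom(100000)        # 100,000 bytes with no pattern to exploit

print(len(zlib.compress(structured)))   # a few hundred bytes
print(len(zlib.compress(random_data)))  # close to 100,000 bytes

The patterned data collapses to a tiny fraction of its original size, while the random data is essentially incompressible, which is just the Kolmogorov point restated with a practical tool.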

Take the Mandelbrot set, for example. An image of part of the fractal might take 1.61 million bytes to store the 24-bit color of each pixel. But the Mandelbrot set is also the output of a simple function that is actually fairly easy to program. It’s not necessary to reproduce the 24-bit color of each pixel. Instead you can just encode the function and the program will produce the exact output. The Mandelbrot set is a good example for illustration because the fractal it produces is very elegant. But the same kind of process will work with any kind of function. Usually the program for a function will be much shorter than the data set of its output.
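To illustrate how short such a program can be, here’s a tiny Python sketch that renders a coarse ASCII view of the Mandelbrot set; the resolution and iteration cap are arbitrary choices for a terminal-sized picture:

for row in range(24):
    line = ""
    for col in range(72):
        c = complex(-2.0 + 3.0 * col / 72, -1.2 + 2.4 * row / 24)
        z = 0j
        for _ in range(30):      # escape-time iteration: z -> z^2 + c
            z = z * z + c
            if abs(z) > 2:
                line += " "      # escaped: point is outside the set
                break
        else:
            line += "*"          # never escaped: point is (likely) inside
    print(line)

A dozen or so lines of code stand in for what would otherwise be a large pixel-by-pixel data set, which is exactly the asymmetry that Kolmogorov complexity captures.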

Often scientific discovery is a matter of finding natural structures by working backward from the outputs to infer the functions that produce them. This is the project of trying to discover the laws of nature. Laws are the regularities and patterns at work in nature. The process can be tricky because there are often many confounding factors and variables are rarely isolated. But sorting through all that is part of the scientific process. As a historical example, Johannes Kepler had in his possession a huge collection of astronomical data that had been compiled over decades, much of it inherited from his mentor Tycho Brahe. What Kepler was ultimately able to do was figure out that the paths traced out by the recorded positions of the planets in space were ellipses. The equation for an ellipse is fairly simple. Knowing that underlying regularity makes it possible not only to reproduce Brahe and Kepler’s original data sets but also to retrodict and predict the positions of planets outside those data sets, because we have the governing equations.
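For reference, one common way to write that ellipse, in polar form with the sun at one focus (my notation here, not Kepler’s), is:

r = p / (1 + e cos θ)

where p is the semi-latus rectum and e the eccentricity. Just a couple of fitted parameters regenerate the whole curve of recorded positions in the orbital plane and extend beyond them.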

That kind of pattern-finding often works well in discerning natural structures. It’s less relevant to human structures where creativity, novelty, and unpredictability can actually be features of greater aesthetic structure. It’s for reasons like this that my approach to a philosophy of structure is highly varied and somewhat unsystematic, pulling pieces together from several places.

Structure seems especially important in the arts and a philosophy of structure in the arts will necessarily overlap with the study of aesthetics. It’s really creative, artistic structures that I find most interesting of all.

Dieter F. Uchtdorf talked about human creativity in a way that I think touches on the key aesthetic features of structure. He said: “The desire to create is one of the deepest yearnings of the human soul… We each have an inherent wish to create something that did not exist before… Creation brings deep satisfaction and fulfillment. We develop ourselves and others when we take unorganized matter into our hands and mold it into something of beauty.” (italics added) A number of important ideas here. I’ll focus on two: (1) that creation is bringing into existence something that did not exist before and (2) that creation is a process of taking unorganized matter and molding it into something of beauty. This coheres with the idea I proposed earlier of the Hebrew creation story, that the lack of form, or structure, in the primordial chaos could be understood also as a kind of emptiness, or nothingness. By imposing a new structure onto raw, unorganized materials it’s possible to bring into existence something that did not exist before.

This is similar to Aristotle’s idea of a formal cause. In Aristotle’s metaphysics he identified four kinds of causes: material, formal, efficient, and final. We’ll just look at the first two here. The material cause is the raw material that composes whatever is being brought about. If we want to understand how a wooden table is created the material cause is the wood used to make it. That’s the unorganized matter. The formal cause is the form, arrangement, shape, or structure, into which this material is fashioned. Clearly the formal cause is just as important to bringing the object about.

How we evaluate structure and its aesthetic virtues, its beauty, is a complex subject. Are aesthetic criteria objective or subjective? The aesthetic response is certainly a subjective process. But is the subjective response a consistent and law-like process that correlates to objective features? It’s difficult to say.

David Bentley Hart said of aesthetics: “The very nature of aesthetic enjoyment resists conversion into any kind of calculable economy of personal or special benefits. We cannot even isolate beauty as an object among other objects, or even as a clearly definable property; it transcends every finite description. There have, admittedly, been attempts in various times and places to establish the ‘rules’ that determine whether something is beautiful, but never with very respectable results… Yes, we take pleasure in color, integrity, harmony, radiance, and so on; and yet, as anyone who troubles to consult his or her experience of the world knows, we also frequently find ourselves stirred and moved and delighted by objects whose visible appearances or tones or other qualities violate all of these canons of aesthetic value, and that somehow ‘shine’ with a fuller beauty as a result. Conversely, many objects that possess all these ideal features often bore us, or even appall us, with their banality. At times, the obscure enchants us and the lucid leaves us untouched; plangent dissonances can awaken our imaginations far more delightfully than simple harmonies that quickly become insipid; a face almost wholly devoid of conventionally pleasing features can seem unutterably beautiful to us in its very disproportion, while the most exquisite profile can do no more than charm us… Whatever the beautiful is, it is not simply harmony or symmetry, or consonance or ordonnance or brightness, all of which can become anodyne or vacuous of themselves; the beautiful can be encountered—sometimes shatteringly—precisely where all of these things are deficient or largely absent. Beauty is something other than the visible or audible or conceptual agreement of parts, and the experience of beauty can never be wholly reduced to any set of material constituents. It is something mysterious, prodigal, often unanticipated, even capricious.” (The Experience of God)

These are good points. Aesthetic judgment is difficult to systematize. And I can’t say I know of any theory that successfully defines precise evaluative procedures from objective criteria. But neither is that to say that aesthetic judgment is arbitrary. There are easy cases where there is near universal agreement that artistic creations are of high or low quality. And there are also harder cases where appreciation for high quality art requires refined tastes, refined through training and initiation into an artistic practice. Even the best critics are not able to fully articulate their reasons for making the judgments they do. And they may have imprecise vocabulary that is incomprehensible to those outside the practice. Sommeliers and wine tasters, for example, have a vocabulary for their craft that goes completely over my head (and taste buds). But I don’t doubt that the vocabulary is meaningful to them. I believe all these artforms have structures to which we can refer, if only imprecisely.

Having looked briefly in this episode at some general ideas pertaining to structure, what I want to do in following episodes of the series is look closely at examples of structure in more detail, focusing on individual fields one at a time: music, chemistry, biology, language, social and political organizations, and mathematics. I expect that the characteristics of structure in these different cases will be varied. But I hope that as the coverage gets more comprehensive it will give more opportunity for insight into the general nature of structure. I hope through some inductive and abductive reasoning to infer general patterns of structure across these various domains, to understand a general structure of structure.