No seriously what is Entropy

I always found the popular science description of entropy as ‘disorder’ a bit unsatisfying.

It has a level of subjectivity that the other physical quantities don’t. Temperature, for example, is easy: we all experience low and high temperatures, so can readily accept that there’s a number which quantifies it. It’s a similar story for things like pressure and energy. But no one ever said ‘ooh this coffee tastes very disordered.’

Yet entropy is, in a way, one of the most important concepts in physics, not least because it’s attached to the famous second law of thermodynamics, whose significance towers over the other laws of thermodynamics (which are, in relation, boring as shit). It states that the entropy of an isolated system (one that exchanges nothing with its surroundings) can never decrease over time.

But what does that mean?! What is entropy really? If you dig deep enough, it has an intuitive definition. I’ll start with the most general definition of entropy. Then, applying it to some everyday situations, we can build up an idea of what physicists mean when they say ‘entropy’.

Missing Information

Fundamentally, entropy is not so much the property of a physical system, but a property of our description of that system. It quantifies the difference between the amount of information stored in our description, and the total quantity of information in the system, i.e. the maximum information that could in principle be extracted via an experiment.

Usually in physics, it’s too difficult to model things mathematically without approximations. If you make approximations in your model, or description, you can no longer make exact predictions of how the system will behave. You can instead work out the probabilities of various outcomes.

Consider the general setup of an experiment with n possible outcomes, which one can label outcome 1, outcome 2, …, outcome n. Each outcome has a probability p1, p2, p3, …, pn assigned to it. Each p has a value between 0 (definitely won’t happen) and 1 (definitely will happen). The Gibbs entropy S one assigns to the description of a system is a function of these p’s (see mathsy section at the end for the explicit definition).

Imagine we thought up a perfect theory to describe the system under study using no approximations, so we could use this theory to predict with certainty that outcome 1 would occur. Then p1 = 1 and all other p’s would be zero. One can plug these probabilities into the Gibbs entropy, and find that in this case, S = 0. There is no missing information. In contrast, if we had no information at all, then all the probabilities would have the same value. How could we say one outcome is more or less likely than any other? In that case, S ends up with its maximum possible value.
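To make the two extremes concrete, here is a minimal sketch. It assumes the standard form of the Gibbs entropy, S = -Σ p log(p), measured in units where the constant in front is 1; the function name and the example probabilities are made up for illustration.

```python
import math

def gibbs_entropy(probs):
    """Gibbs entropy S = -sum(p * ln p), treating 0 * ln(0) as 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(-p * math.log(p) for p in probs if p > 0)

n = 4
certain = [1.0, 0.0, 0.0, 0.0]   # perfect theory: outcome 1 definitely happens
uniform = [1.0 / n] * n          # no information: every outcome equally likely
partial = [0.7, 0.1, 0.1, 0.1]   # some information, somewhere in between

print("certain:", gibbs_entropy(certain))
print("uniform:", gibbs_entropy(uniform), "(maximum is ln n =", math.log(n), ")")
print("partial:", gibbs_entropy(partial))
```

The certain description gives zero, the uniform one gives the largest value possible for four outcomes, and anything in between lands between the two.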

There’s a classic example that’s always used to describe the Gibbs entropy: tossing a coin. There are two possible outcomes, heads or tails. Usually we consider the outcome to be pretty much random, so we say that they’re equally likely: p(heads) = 1/2, p(tails) = 1/2. This description contains no information; a prediction using these probabilities is no better than a random guess. What if we discovered that one of the sides of the coin was weighted? Then one outcome would be more likely than the other, and we could make a more educated prediction of the outcome. The entropy of the description has been reduced.

Going further, if we modelled the whole thing properly with Newton’s laws, and knew exactly how strongly it was flipped, its initial position, etc, we could make a precise prediction of the outcome and S would shoot towards zero.


Fig.1: S for different predictions of the outcome of a coin flip.
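The trend in the coin example can be reproduced with a few lines. This is a rough sketch, again assuming the standard Gibbs form S = -Σ p log(p) in units where the constant in front is 1; the particular values of p(heads) are arbitrary choices, not taken from the figure.

```python
import math

def coin_entropy(p_heads):
    """Entropy of a coin-flip prediction with P(heads) = p_heads."""
    total = 0.0
    for p in (p_heads, 1.0 - p_heads):
        if p > 0:  # an impossible outcome contributes nothing
            total -= p * math.log(p)
    return total

# From "pretty much random" through "weighted" to "modelled perfectly".
for p_heads in (0.5, 0.6, 0.75, 0.9, 0.99, 1.0):
    print(f"p(heads) = {p_heads:.2f}  S = {coin_entropy(p_heads):.3f}")
```

S is largest for the fair coin and falls to zero as the prediction becomes certain.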

Working with this definition, the second law of thermodynamics comes pretty naturally. Imagine we were studying the physics of a cup of coffee. If we had perfect information, and knew the exact positions and velocities of all the particles in the coffee, and exactly how they will evolve in time, then S = 0, and would stay at 0. We always know exactly where all the particles are at all times. However, what if there was one rogue particle we didn’t have information about? Then S is small but non-zero. As that particle (possibly) collides with other particles around it, we become less sure what the position and velocity of those neighbours could be. The neighbours may collide with further particles, so we don’t know their velocities either. The uncertainty would spread like a virus, and S can only increase. It can never go the other way.
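A toy model shows the same one-way behaviour. This is only a sketch under my own assumptions, not a simulation of coffee: a single particle sits on a ring of sites, we start out knowing exactly where it is, and at each step our probability for its position gets smeared onto the neighbouring sites. The entropy uses the standard Gibbs form S = -Σ p log(p).

```python
import math

def gibbs_entropy(probs):
    """Gibbs entropy S = -sum(p * ln p), treating 0 * ln(0) as 0."""
    return sum(-p * math.log(p) for p in probs if p > 0)

def spread(probs):
    """One time step: each site keeps half its probability and
    passes a quarter to each of its two neighbours on the ring."""
    n = len(probs)
    return [0.5 * probs[i] + 0.25 * probs[(i - 1) % n] + 0.25 * probs[(i + 1) % n]
            for i in range(n)]

n_sites = 20          # hypothetical size of the ring
probs = [0.0] * n_sites
probs[0] = 1.0        # we start out knowing exactly where the particle is

entropies = []
for step in range(200):
    entropies.append(gibbs_entropy(probs))
    probs = spread(probs)

print("start:", entropies[0])
print("after one step:", entropies[1])
print("after many steps:", entropies[-1], "(maximum is", math.log(n_sites), ")")
```

S starts at zero and creeps up towards its maximum, the value for "could be anywhere".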

I said before that this S is about a description, rather than a physical quantity. But entropy is usually considered to be a property of the stuff we’re studying. What’s going on there? This brings us to…

Microstates and Macrostates

In physics, we can separate models into two broad groups. The first, with “perfect” information, is aiming to produce exact predictions. This is the realm containing, for example, Newton’s laws. The specification of a “state” in one of these models contains all possible information about what it’s trying to describe, and is called a microstate.

The second group of models are those with “imperfect information”, containing only some of the story. Included in the second set is thermodynamics. Thermodynamics seeks not to describe the positions and velocities of every particle in the coffee, but more coarse quantities like the temperature and total energy, which only give an overall impression of the system. A thermodynamic description is missing any microscopic information about particles and forces between them, so is called a macrostate.

A microstate specifies the position and velocity of all the atoms in the coffee.

A macrostate specifies temperature, pressure on the side of the cup, total energy, volume, and stuff like that.

In general, one macrostate corresponds to many microstates. There are many different ways you could rearrange the atoms in the coffee, and it would still have the same temperature. Each of those configurations of atoms corresponds to a microstate, but they all represent a single macrostate.

Some macrostates are “bigger” than others, containing lots of microstates, and some contain few. We can loosely refer to the number of ways you could rearrange the atoms while remaining in a macrostate as its size.

What does all this have to do with entropy? If I were to tell you that your coffee is in a certain macrostate, this gives you information. It narrows down the set of possible microstates the coffee could be in. But you still don’t know for sure exactly what’s going on in the coffee, so there is missing information, and a non-zero entropy. But if the coffee was in a smaller macrostate, our thermodynamic description would give more information, since we’ve narrowed down further the number of microstates the coffee could be in. Then our description contains more information, so this is a lower entropy macrostate.
Hence, the entropy of a macrostate (called the Boltzmann entropy) is defined to grow with its size: it is proportional to the logarithm of the number of microstates the macrostate contains. For messy thermodynamic systems like the coffee, entropy is a measure of how many different ways you can rearrange its constituents without changing its macroscopic behaviour. The Boltzmann entropy can be derived from the Gibbs entropy. It is not a different definition, but a special case of Gibbs, the case where we’re interested only in macroscopic physics.

Working with this definition, the second law of thermodynamics comes reasonably naturally. Over time, a hot and messy system like a cup of coffee will explore its microstates randomly, as the molecules move around producing different configurations. Without any knowledge of what’s going on with the individual atoms, we can only assume that each microstate is equally likely. What macrostate is the system most likely to end up in? The one containing the most microstates. Which is also the one of highest entropy.

Consider the milk in your coffee. Soon after you add the milk, it ends up evenly spread out through the coffee, since the macrostate of ‘evenly spread out milk’ is the biggest, so has the highest entropy. There are many different ways the molecules in the milk could arrange themselves, while conspiring to present a macroscopic air of spreadoutedness.
You don’t expect all the milk to suddenly pool up into one side of your cup, since this would be a state of low entropy. There are few ways the milk molecules could configure themselves while making sure they all stayed in that side. The second law predicts that you will basically never see your coffee naturally partition like this.
Fig.2: Cups of coffee in different microstates. The little blobs represent molecules of milk.
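The sizes of these macrostates can be counted directly in a stripped-down version of the cup. The sketch below rests on simplifying assumptions of my own: only 20 milk molecules, each of which is either in the left half or the right half of the cup, with a macrostate labelled by how many sit on the left. The entropy is taken as log(size), with the constant in front set to 1.

```python
import math

n_molecules = 20  # hypothetical, and tiny compared with a real cup

def size_of_macrostate(n_left):
    """Number of microstates: ways to choose which molecules are in the left half."""
    return math.comb(n_molecules, n_left)

for n_left in (0, 5, 10, 15, 20):
    omega = size_of_macrostate(n_left)
    print(f"{n_left:2d} on the left: size = {omega:6d}, log(size) = {math.log(omega):.2f}")

# Fraction of all microstates that belong to the evenly-spread macrostate,
# compared with the one where every molecule has pooled on the right.
total = 2 ** n_molecules
print("evenly spread:", size_of_macrostate(10) / total)
print("all pooled on one side:", size_of_macrostate(0) / total)
```

Even with 20 molecules the evenly spread macrostate dwarfs the pooled one, and the gap grows enormously as the number of molecules goes up.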

The Mystery Function of Thermodynamics

When one talks about the Boltzmann entropy, there is naturally a transition from considering entropy a property of the description to considering it a property of the physics. Different states in thermodynamics can be assigned different entropies depending on how many microstates they represent.

Once we stop thinking at all about what is going on with individual atoms, we are left with a somewhat mysterious quantity, S.

The “original” entropy, now known as the thermodynamic entropy, is a property of a system related to its temperature and energy. This was defined by Clausius in 1854, before the nature of matter at the microscopic level was even understood. Back then, not everyone was convinced that atoms were even a thing.

Thermodynamic entropy is what people most commonly mean when they refer to entropy, but, since it is defined without any consideration of the microscopic world, its true meaning is obscured. I hope it’s slightly less obscured for you now.

Why entropy is at the heart of information theory

All of thermodynamics can be derived from entropy

The arrow of time: why the second law causes a paradox


The Equations – for those who like maths

A system can be in n possible states. The probability of it being in state i is a number between 0 and 1, called p_i. The Gibbs entropy is defined by:

$$ S = -k_B \sum_{i=1}^{n} p_i \log(p_i) $$

Here k_B is Boltzmann’s constant, which just sets the units. log is the natural logarithm, and all you need to know about it is that log(p_i) = 0 when p_i = 1. If the system is definitely in state 1, p_1 = 1 and the rest are 0. Then the first term disappears since log(p_1) becomes zero, and the rest of the terms also disappear, not because of the log but because the factor of p_i at the front is zero. We end up with S = 0, corresponding to the fact that there is no missing information: we can perfectly predict the behaviour of the system.

In the more realistic situation, the p’s are a bunch of non-zero numbers, representing non-perfect information and leading to a non-zero S. Applying this equation to the coin flipping scenario, you end up with Fig. 1.

If we’re only interested in the macroscopic nature of a system, we would model it to be in a macrostate, which contains Ω microstates. The Boltzmann entropy is defined by:

$$ S = k_B \log(\Omega) $$

(k_B being Boltzmann’s constant.) If we’re in a macrostate with Ω = 1, there is a single microstate it can be in, so we know everything about the system. Ω = 1 causes the log, and therefore S, to become 0. All pretty consistent. For Ω larger than 1, the log increases as Ω gets bigger. This leads to S increasing as the number of possible states increases, i.e., we become less sure about which state the system is in.
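The claim that Boltzmann is a special case of Gibbs can be checked in a couple of lines. This is a sketch that assumes the standard forms of the two entropies, S = -k_B Σ p_i log(p_i) and S = k_B log(Ω) with k_B being Boltzmann’s constant, and that all Ω microstates in the macrostate are equally likely.

```latex
\begin{align*}
p_i &= \frac{1}{\Omega} \quad \text{for } i = 1, \dots, \Omega \\
S &= -k_B \sum_{i=1}^{\Omega} p_i \log p_i
   = -k_B \sum_{i=1}^{\Omega} \frac{1}{\Omega} \log\frac{1}{\Omega}
   = -k_B \log\frac{1}{\Omega}
   = k_B \log \Omega
\end{align*}
```

The sum has Ω identical terms, so the factor of 1/Ω cancels and only the log survives.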

The thermodynamic entropy is defined by

$$ \Delta S = \frac{\Delta Q}{T} $$

In words, it says that when ΔQ of heat energy is added to a system, the change in entropy, ΔS, will be equal to ΔQ divided by the temperature T. It’s actually the same number as the Boltzmann entropy, just written in terms of purely thermodynamic quantities.

This isn’t a very illuminating equation in my opinion. The best I can offer to help is the following: imagine again pouring a drop of milk into your coffee. If, by some twist of fate, the cup was instantly cooled until both the milk and coffee froze, the milk would be frozen into the pretty pattern it made when it hit the coffee. This is quite a special, low entropy state.

You’re annoyed by the sudden freezing of your coffee so you shove it in the microwave to add ΔQ worth of heat. As it melts, the milk is allowed to mix more and more with the coffee, heading towards states of higher mixedness: ΔQ leads to ΔS. The division by T? As it approaches a state one would happily drink (T getting larger), adding more ΔQ leads to less of an increase in S. The milk is close to being fully mixed in, so heating it more has less of an effect on S.
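The effect of the division by T is easy to see with numbers. A minimal sketch, assuming the heat is small enough that the temperature barely changes while it is added; the 100 joules and the three temperatures are illustrative values, not measurements.

```python
def entropy_change(heat_joules, temperature_kelvin):
    """Change in entropy, dS = dQ / T, for heat added at a roughly constant temperature."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive (in kelvin)")
    return heat_joules / temperature_kelvin

heat = 100.0  # joules, the same dose of heat each time
for temperature in (275.0, 300.0, 350.0):  # just thawed, lukewarm, drinkable
    print(f"T = {temperature:.0f} K  dS = {entropy_change(heat, temperature):.3f} J/K")
```

The same dose of heat buys less entropy the hotter the drink already is.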



