2024 · Physics

Memory as a landscape: the physics behind machine learning

Awarded to John J. Hopfield and Geoffrey Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks”.

What was the 2024 Nobel Prize in Physics awarded for?

The 2024 Physics prize honours the physics that underpins machine learning. John Hopfield showed that a simple network of connected nodes can store a memory as the low point of an energy landscape and recall it from a noisy or partial clue. Geoffrey Hinton extended that idea into the Boltzmann machine, a network that learns the hidden patterns in data on its own. That work helped launch today's deep learning.

Predict first

A network is shown a blurry, half-erased photo of a face it has seen before. With no database lookup, it cleans the image up and returns the original. How can a web of simple on/off nodes do that?

It treats memory as a downhill roll. Each stored pattern sits at the bottom of a valley in an energy landscape. The blurry photo starts partway up a slope, and the network keeps flipping nodes to lower its energy, so the state slides down into the nearest valley. The valley it lands in is the stored pattern most similar to the clue, which is the cleaned-up face.

Predict first

You never tell the network what a cat is. You just show it thousands of pictures. Later it can sketch a brand new cat-like image on its own. What did it actually learn?

The statistics of the data, not a list of rules. A Boltzmann machine has hidden nodes that are not pinned to the picture. By adjusting its connection strengths until the patterns it tends to produce match the patterns it was shown, it captures the recurring features of cats. Because those features now live in its weights, it can generate fresh examples that share them.

A memory is a valley in the network's energy landscape. A noisy clue dropped on a slope rolls downhill into the nearest valley, recovering the stored pattern closest to it.

Imagine a hilly landscape with a few deep valleys. Roll a ball anywhere on it and the ball runs downhill until it settles at the bottom of the nearest valley.

John Hopfield showed that a network of tiny connected switches can work the same way. Each thing you want it to remember, like a picture, becomes its own valley. If you then hand the network a smudged or half-missing version of that picture, it acts like the rolling ball. It keeps adjusting its switches to move downhill until it reaches the nearest valley, which is the clean memory. That is how it fills in and completes the clue.
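The rolling-ball picture above can be tried out in a few lines of Python. This is a minimal sketch, not Hopfield's original setup: the two 8-node patterns, the function names, and the choice to update one node at a time in a fixed order are all assumptions made for illustration. Nodes take the values +1 or -1, the weights come from the Hebbian rule, and recall keeps flipping nodes until nothing wants to change.

```python
def train_hebbian(patterns):
    """Build symmetric weights so that each +1/-1 pattern becomes a valley."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no node connects to itself
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def energy(weights, state):
    """E = -1/2 * sum over i != j of w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def recall(weights, clue, max_sweeps=20):
    """Update one node at a time until no node changes (the bottom of a valley)."""
    state = list(clue)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two made-up patterns to remember (hypothetical example data).
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = train_hebbian(stored)

# Smudge the first pattern by flipping one node, then let the network settle.
noisy = list(stored[0])
noisy[0] = -noisy[0]
recalled = recall(weights, noisy)

print(recalled == stored[0])                               # True
print(energy(weights, noisy), energy(weights, recalled))   # -1.5 -3.0
```

The energy drops from -1.5 for the smudged clue to -3.0 for the recalled pattern, which is the ball reaching the valley floor. With many patterns stored in a small network the valleys start to interfere, so this tidy result should not be expected in general.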

The big idea in one line

Remembering is rolling downhill

A memory is stored as the bottom of a valley. Give the network a noisy hint and it slides down to the closest valley, recalling the full pattern. No searching through a list is needed.

Geoffrey Hinton took this further. He built a network that studies many examples and quietly learns the patterns hiding inside them, so it can even make new examples of its own. These physics ideas about energy and chance are the seeds of the machine learning we use every day.
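The learning idea can also be sketched in code, with some loud caveats. The toy below is an assumption-heavy illustration, not Hinton's implementation: it has two visible nodes and one hidden node, no bias terms, a temperature of 1, and made-up training data. Because the network is so small, it adds up every possible state exactly instead of sampling by chance, which a real Boltzmann machine cannot afford to do. The weight update is the standard one: strengthen a connection when two nodes agree more often while the data is held in place than when the network runs freely.

```python
import itertools
import math

N_VISIBLE, N_HIDDEN = 2, 1          # toy sizes, chosen for illustration
N = N_VISIBLE + N_HIDDEN
PAIRS = [(i, j) for i in range(N) for j in range(i + 1, N)]
ALL_STATES = list(itertools.product([-1, 1], repeat=N))
HIDDEN_STATES = list(itertools.product([-1, 1], repeat=N_HIDDEN))


def energy(w, s):
    """Energy of one full state (visible + hidden), each pair counted once."""
    return -sum(w[(i, j)] * s[i] * s[j] for i, j in PAIRS)


def distribution(w, states):
    """Boltzmann distribution over the given states at temperature 1."""
    factors = [math.exp(-energy(w, s)) for s in states]
    total = sum(factors)
    return [f / total for f in factors]


def correlations(w, states):
    """Average of s_i * s_j for every pair, under the Boltzmann distribution."""
    probs = distribution(w, states)
    return {(i, j): sum(p * s[i] * s[j] for p, s in zip(probs, states))
            for i, j in PAIRS}


def train(data, steps=500, learning_rate=0.1):
    # Small non-zero starting weights (arbitrary) so the hidden node
    # is not stuck at zero by symmetry.
    w = {pair: 0.1 for pair in PAIRS}
    for _ in range(steps):
        free = correlations(w, ALL_STATES)           # network runs freely
        clamped = {pair: 0.0 for pair in PAIRS}      # visible nodes held to data
        for v in data:
            c = correlations(w, [v + h for h in HIDDEN_STATES])
            for pair in PAIRS:
                clamped[pair] += c[pair] / len(data)
        for pair in PAIRS:
            w[pair] += learning_rate * (clamped[pair] - free[pair])
    return w


def visible_marginal(w):
    """Probability of each visible pattern, summing over the hidden node."""
    marginal = {}
    for p, s in zip(distribution(w, ALL_STATES), ALL_STATES):
        v = s[:N_VISIBLE]
        marginal[v] = marginal.get(v, 0.0) + p
    return marginal


# Hypothetical data: the two visible nodes always agree.
data = [(1, 1), (-1, -1)]
w = train(data)
marginal = visible_marginal(w)
for pattern, prob in sorted(marginal.items()):
    print(pattern, round(prob, 3))
```

After training, nearly all of the probability sits on the two patterns the network was shown, so anything it generates will look like its training data. This data is simple enough that the hidden node is not strictly needed. It is included only to show where hidden nodes enter the update.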

Worth knowing

A trillion parameters grew from fewer than 500

Hopfield's original 1982 network had 30 nodes and fewer than 500 parameters to adjust, simple enough to run on the computers of the day. Today's large language models, which grew out of the same line of work on neural networks, juggle more than a trillion parameters, a jump of more than a billionfold in barely four decades.
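The numbers are easy to check. The short calculation below assumes that every node is connected to every other node, and it treats "a trillion" and "500" as round figures rather than exact counts.

```python
nodes = 30
# Each pair of nodes shares one connection, so count the pairs once.
connections = nodes * (nodes - 1) // 2
print(connections)  # 435

early_parameters = 500        # upper bound quoted for the 1982 network
modern_parameters = 10**12    # "a trillion", taken as a round figure
print(modern_parameters // early_parameters)  # 2000000000
```

So 30 fully connected nodes give 435 connections, comfortably under 500, and the growth factor comes out at about two billion.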

Check yourself

In a Hopfield network, what does a stored memory correspond to?

Why: Each stored pattern is written into the weights so that it sits at a local minimum of the network's energy. Recall is the network rolling downhill into the nearest such valley.

You feed a Hopfield network a noisy, partial version of a stored pattern. What happens?

Why: The update rule never raises the total energy, so the state slides down to a nearby minimum, which is typically the stored pattern most similar to the noisy clue. That is how the network completes and cleans up the input.

What did Hinton's Boltzmann machine add beyond the Hopfield network?

Why: The Boltzmann machine introduces hidden units and uses the Boltzmann distribution, training the weights until the network's own samples match the data. This lets it discover features on its own and generate new examples, not just store fixed memories.

Key terms

Hopfield network
A recurrent network where every node connects to every other with symmetric weights. It stores patterns as low-energy states and recalls them by settling into the nearest one.
Energy function
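A single number for the whole network, E = -1/2 times the sum of w_ij s_i s_j over all i ≠ j, where s_i is the state of node i (+1 or -1) and w_ij is the weight of the connection between nodes i and j. The factor 1/2 corrects for counting each pair twice. The form is borrowed from the physics of magnetic spins. Stored patterns sit at its local minima.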
Associative memory
A memory addressed by content rather than by location. Given part of a pattern, it returns the complete stored pattern that best matches.
Hebbian learning
A rule for setting connection weights so that nodes which should be active together are linked positively. It carves each stored pattern into a valley of the energy landscape.
Boltzmann machine
A network with hidden units that learns the statistical structure of data using the Boltzmann distribution, and can generate new examples like those it was trained on.
Hidden units
Nodes in a Boltzmann machine that are not fixed to the input. They let the network represent features that are not directly given in the data.
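For readers who want the key terms as formulas, here they are in one place. The symbols w_ij and s_i come from the energy function above. Everything else is notation assumed here for illustration: N is the number of nodes, \xi^\mu is the \mu-th stored pattern, h_k is the input arriving at node k, T is a temperature, Z is a normalising constant, and \eta is a learning rate.

```latex
% Energy of a network state (symmetric weights, no self-connections)
E = -\frac{1}{2} \sum_{i \neq j} w_{ij}\, s_i s_j

% Hebbian rule: store patterns \xi^{\mu} by linking nodes that agree
w_{ij} = \frac{1}{N} \sum_{\mu} \xi_i^{\mu}\, \xi_j^{\mu}, \qquad w_{ii} = 0

% Updating node k to s_k' = \mathrm{sign}(h_k) never raises the energy
h_k = \sum_{j \neq k} w_{kj}\, s_j, \qquad
\Delta E = -(s_k' - s_k)\, h_k \le 0

% Boltzmann distribution: low-energy states are more probable
P(s) = \frac{e^{-E(s)/T}}{Z}, \qquad Z = \sum_{s'} e^{-E(s')/T}

% Boltzmann machine learning: match model statistics to data statistics
\Delta w_{ij} = \eta \left( \langle s_i s_j \rangle_{\text{data}}
              - \langle s_i s_j \rangle_{\text{model}} \right)
```

The third line is the reason recall always ends in a valley: if a node flips, it flips to agree with its input h_k, so the change in energy is -2|h_k|, which is never positive.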

The laureates

John J. Hopfield
Princeton University, Princeton, NJ, USA

Born in Chicago in 1933, Hopfield is a physicist who moved into biology at Caltech. In 1982 he showed that a network of simple connected nodes can act as an associative memory, storing patterns as low points of an energy landscape and recalling them from noisy or partial clues.

Photo: bhadeshia123, CC BY 3.0 (via Wikimedia Commons)

Geoffrey Hinton
University of Toronto, Toronto, Canada

Born in London in 1947, Hinton is a computer scientist long based at the University of Toronto. He built the Boltzmann machine on top of Hopfield's idea, using statistical physics to let a network learn the structure of data by itself, work that helped start the modern growth of machine learning.

Photo: Cmichel67, CC BY-SA 4.0 (via Wikimedia Commons)

Sources

Facts are pinned from the official Nobel Prize API. The explanations were written from these sources:
