
Anatomy of a Formal Proof

Jeremy Avigad
Johan Commelin
Heather Macbeth
Adam Topaz

Introduction

It has been a long day and you are making your way through a paper related to your work. You suddenly come across the following remark: “…since x and y are eigenvectors of f with distinct eigenvalues, they are linearly independent.” Wait—how does the proof go? You should really know this. Here x and y are nonzero elements of a vector space and f is a linear map. You force yourself to pick up a pen and write down the following argument:

Let f x = μ x and f y = ν y with μ ≠ ν. Suppose a x + b y = 0. Applying f and using linearity, we have a μ x + b ν y = 0. Multiplying the original equation by ν, we have a ν x + b ν y = 0. Subtracting the two yields a (μ − ν) x = 0, and since μ − ν and x are nonzero, we have a = 0. The corresponding argument with x and y swapped yields b = 0, so the only linear combination of x and y that yields 0 is the trivial one.

That works, doesn’t it?

Your colleagues have all gone home and there is nobody around to discuss this with. So, instead, you turn to your computer and start up Lean, the proof assistant you happen to use. Can you prove the claim formally? As you type, the information window in your editor complains about every misstep—the syntax is fiddly, and you have to get the notation and the instructions just right—but that’s okay, because working through the proof is relaxing and kind of fun. Lean often makes you spell out arguments that are painfully obvious, but you have found that if you set things up just right, it will cheerfully fill in some of the details. After a short while, you have success! Lean signs off on the proof, confirming you have managed to construct a formal derivation in the system’s axiomatic foundation.

import Mathlib.LinearAlgebra.LinearIndependent

variable {K V : Type} [Field K] [AddCommGroup V] [Module K V]

example (f : V →ₗ[K] V)
    (μ ν : K) (hμν : μ ≠ ν)
    (x y : V) (hx₀ : x ≠ 0) (hy₀ : y ≠ 0)
    (hx : f x = μ • x) (hy : f y = ν • y) :
    ∀ a b : K,
      a • x + b • y = 0 → a = 0 ∧ b = 0 := by
  intro a b hab
  have :=
  calc (μ - ν) • a • x
      = (a • μ • x + b • ν • y) -
        ν • (a • x + b • y) := by module
    _ = f (a • x + b • y) -
        ν • (a • x + b • y) := by simp [hx, hy]
    _ = 0 := by simp [hab]
  simp_all [sub_eq_zero]

What a lovely proof! It took some effort to work it out, but it was a pleasure seeing the steps play out formally, and you are proud of the result.

Understanding the Proof

Let’s take a look at what you have done. You are using Lean’s mathematical library, Mathlib, a communally developed and maintained repository of formally verified mathematics. At the time of writing, Mathlib contains more than 80,000 definitions and more than 160,000 theorems, starting from axiomatic primitives and developing mathematics from the bottom up. It also contains dozens of user-contributed automated reasoning procedures that help with the formalization process, as well as notation declarations and configuration information, all of which codify different aspects of our mathematical understanding.

To avoid having to load the entire library into memory at once, Lean asks you to tell the system what parts of the library you want to use. So the first line of the proof, the one that starts with the keyword import, tells Lean that you want to use the file with the given name. That pulls in that file and all the files in the library that it depends on, which is quite a lot, including basic algebra, properties of scalars and vectors, and so on.

The next line, the one that begins with the word variable, declares some of the objects you want to work with: a field, K, and a vector space, V. You could have, alternatively, put these declarations after the example keyword on line 5; using the variable command to declare them separately is especially convenient when multiple definitions and theorems share the same data and hypotheses. It’s a quirk of the library’s design that you declare that V is an additive commutative group, coupled with a scalar action that turns V into a K-module. (Remember, a vector space is nothing more than a module whose ring of scalars happens to be a field.) Mathlib includes thousands of axiomatically declared structures, and the contributors to the library take great care to maximize reusability by breaking structural hypotheses into reusable pieces that can easily be configured and composed.

Next comes the keyword example (line 5), which indicates that the result you are proving is not intended for future use. Using example is good for experimentation. To prove a lemma or theorem that you intend to use later, you would instead use lemma or theorem and provide a name. In that case, once Lean has processed the proof, it is stored in the environment, which means that any file that imports this one can see it. If you think you have proved a theorem that ought to go into Mathlib, you can issue a pull request to add it to the library. Mathlib is overseen by teams of maintainers and reviewers who moderate and update the contents, ensuring the quality and stability of the library.
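For instance, replacing example with theorem and a name makes the result available later. Here is a minimal sketch with a toy lemma of our own (the name my_two_mul is our invention, not part of Mathlib):

```lean
import Mathlib.Tactic

-- a hypothetical named lemma; `my_two_mul` is our own name
theorem my_two_mul (n : ℕ) : 2 * n = n + n := by ring

-- once Lean has processed it, the lemma can be invoked by name,
-- here and in any file that imports this one
example (n : ℕ) : 2 * n = n + n := my_two_mul n
```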

You begin stating the claim by introducing the key players:

a linear map f from V to itself. Lean’s notation for such linear maps is V →ₗ[K] V,

two elements of K, denoted μ and ν, which will end up being your eigenvalues, and a hypothesis called hμν that says μ and ν are distinct,

two vectors called x and y, as well as hypotheses hx₀ and hy₀ that say that they are nonzero,

and, finally, your two key assumptions, hx and hy, which say that f x = μ • x and f y = ν • y.

In the notation V →ₗ[K] V, the arrow →ₗ is a fixed symbol signifying linear maps, whereas the parameter in brackets indicates that we mean linearity over K.

At this point, you have set up your context, that is, the data and hypotheses you need to state your claim. You have your K-vector space V, a linear map f, two scalars μ and ν, and eigenvectors x and y for μ and ν respectively. You are now ready to state the conclusion (lines 9 and 10), namely, that x and y are linearly independent: ∀ a b : K, a • x + b • y = 0 → a = 0 ∧ b = 0. You have decided to use the definition of linear independence for a pair of vectors: Whenever a and b are scalars, if a • x + b • y = 0 then a = 0 and b = 0.

Notice that you have used the symbol : in expressions like x y : V (line 7) to express that x and y are elements of V, rather than writing x ∈ V and y ∈ V. In Lean’s axiomatic foundation, a version of dependent type theory, every object has a fundamental data type. In this example, we would say that x and y are terms of the type V. Interestingly, we use the same notation in an expression like hx₀ : x ≠ 0 (also line 7) to express that hx₀ is a label for the assumption x ≠ 0. In dependent type theory, assertions like x ≠ 0 (also known as propositions) are analogous to data types, and they are handled by the same fundamental mechanisms. Lean checks that an expression like a • x + b • y is a well-formed expression of type V, given information about the variables and symbols involved, and, in the same way, it checks that a formal expression is a well-formed proof of the proposition ∀ a b : K, a • x + b • y = 0 → a = 0 ∧ b = 0, given the data and assumptions it depends on. That is why the latter appears after the colon in the example’s conclusion: the statement of the example announces your intent to construct a formal proof of the conclusion, given the data and assumptions that come before.

What comes after the symbol := (line 10) is the proof itself, or, more precisely, instructions that tell Lean how to construct the proof. If you are successful, the corresponding expression is stored in memory and checked for correctness by Lean’s trusted kernel. The keyword by after the := instructs Lean to enter tactic mode, which means that the text that follows should be interpreted as a list of instructions that tell Lean how to build the required proof. A tactic is a procedure that automatically fills in a chain of formal inferences that is needed to justify a reasoning step. Such a chain can be as short as a single logical axiom or rule, but it can also be quite long and involved. Tactics thus play an important role in bridging the gap between the kind of reasoning that is intuitively clear and natural to mathematicians and the stringent axiomatic rules embodied by a formal system.

The first tactic you use is intro a b hab (line 11). This introduces the two variables a : K, b : K which your statement quantifies over, as well as the antecedent of the implication, that is, the assumption hab : a • x + b • y = 0. As you type or move your cursor around a proof, Lean displays the tactic state, i.e., information that is relevant at that point in the proof, in its infoview window. After the intro tactic, the tactic state looks something like this:

K : Type
V : Type
f : V →ₗ[K] V
μ ν : K
hμν : μ ≠ ν
x y : V
hx₀ : x ≠ 0
hy₀ : y ≠ 0
hx : f x = μ • x
hy : f y = ν • y
a b : K
hab : a • x + b • y = 0
⊢ a = 0 ∧ b = 0

This is a nice summary of where you are in the proof, including the objects and assumptions you started with as well as the objects a, b and the assumption hab that you introduced in the first step. The line that starts with ⊢ indicates that your current goal is to prove the conjunction a = 0 ∧ b = 0.

While writing the proof, you notice that Lean complains with warnings and error messages. This is expected, since the proof is incomplete. You can appease Lean by apologizing for the incomplete proof: If you use the sorry tactic on the final line of the proof, Lean will not raise an error over the fact that the proof is incomplete, but it will still raise a gentle warning about the use of sorry.
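A minimal sketch of this workflow, with a placeholder goal of our own:

```lean
import Mathlib.Tactic

example (a b : ℕ) (h : a = b) : b = a := by
  -- Lean accepts the declaration but warns: "declaration uses 'sorry'"
  sorry
```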

The next step in the proof starts with have := calc (line 12). This introduces a calculation, similar to a \begin{align*} …\end{align*} block in LaTeX. The calculation itself is very similar to the one in the proof sketch above, but notice that the individual steps in the calculation are justified by short subproofs like by module (line 15) or by simp [hx, hy] (line 17). The module tactic proves equalities of universal linear expressions, in other words, linear equalities that hold in all modules and do not rely on specific facts about the module at hand. The simp tactic is a powerful tool that uses a database of equalities and equivalences from Mathlib, together with user-specified equations (like hx and hy), to rewrite its goal into a simpler form. Happily, in this case, the goal becomes a trivial equality, which is therefore closed. In a moment, we will consider the steps that simp has taken in greater detail.

After this calculation, you completed your proof with the simp_all tactic (line 19), which is a variant of simp that recursively uses all the hypotheses to simplify all hypotheses and the goal. Now if you leave a sorry after the simp_all, Lean will complain that there is nothing to be sorry about and insist that you remove the apology. Lean checks the correctness of the proof and confirms it with its silence: the absence of errors is the proclamation that you have succeeded.
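The behavior of simp_all can be seen in a small sketch of our own devising, where one hypothesis simplifies the other and the result closes the goal:

```lean
import Mathlib.Tactic

-- simp_all uses each hypothesis to simplify the others and the goal:
-- h₂ rewrites h₁ to a = 0, which then closes the goal
example (a b : ℕ) (h₁ : a = b) (h₂ : b = 0) : a = 0 := by
  simp_all
```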

Using Automation

So what exactly does the simp tactic do? The following provides a manual, more detailed proof of one step of the calculation in your proof, with the direction of the equality reversed.

variable (f : V →ₗ[K] V)
  (μ ν : K) (x y : V) (a b : K)

example (hx : f x = μ • x) (hy : f y = ν • y) :
    f (a • x + b • y) = (a • μ • x + b • ν • y) := by
  calc f (a • x + b • y)
      = f (a • x) + f (b • y) := by
        rw [map_add]
    _ = a • f x + b • f y := by
        rw [map_smul, map_smul]
    _ = (a • μ • x + b • ν • y) := by
        rw [hx, hy]

This example demonstrates the rewrite tactic, denoted rw. In the calculation, we first rewrite with the lemma map_add, which states that for any linear map f, the equation f (v + w) = f v + f w holds. The next step in the calculation is justified by rewriting twice with the lemma map_smul, which states that for any linear map f, scalar c, and vector v, the equation f (c • v) = c • f v holds. (Here smul stands for “scalar multiplication.”) The underscores before each subsequent step are part of the calc syntax, indicating to the parser that the calculation continues. The proof concludes by rewriting with the hypotheses hx and hy, which assert that x and y are eigenvectors of f with eigenvalues μ and ν respectively.

The lemmas map_add and map_smul are part of the Mathlib library, and they are labeled with the attribute @[simp]. This attribute tells Lean that the lemma should be added to the database of lemmas that simp uses to simplify expressions. And that is the reason why the simp tactic could prove the goal in one go: it chained together the rw steps that we spelled out step by step in the calculation above.
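The mechanism can be illustrated with a toy simp lemma of our own (the name my_add_sub and the statement are our invention, chosen only to show the attribute at work):

```lean
import Mathlib.Tactic

-- a hypothetical simp lemma; @[simp] registers it in simp's database
@[simp] theorem my_add_sub (n : ℕ) : n + 1 - 1 = n := by omega

-- simp now applies it, together with built-in simp lemmas, automatically
example (n : ℕ) : (n + 1 - 1) + 0 = n := by simp
```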

One of the challenges of formalization is that we often need to spell out, in painful detail, inferences that seem obvious or straightforward to us. The more we can get the computer to fill in, the better. The simplifier is an important example of automated reasoning that can help in this respect. The module tactic is another.

Broadly speaking, there are two classes of automation. First, there is general-purpose automation. One example that we have seen is the simp tactic, the simplifier. Another example is aesop, a tactic that provides “Automated Extensible Search for Obvious Proofs.” In the proof assistant Isabelle, there is a tool called sledgehammer, which can search for proofs using a large database of lemmas. And recently AI copilots have demonstrated the ability to suggest tactics and fill in parts of proofs in Lean. The wide applicability of these general-purpose tools is balanced by the fact that, at least for now, they can only assist with proofs that are relatively straightforward.

The other class of automation is domain-specific automation. Here we have also seen an example: the module tactic. Other examples include the ring tactic, which solves equations in commutative rings, the linarith tactic, which solves linear arithmetic problems, and the fun_prop tactic, which proves that functions satisfy a given property such as “continuity” or “measurability.” These tactics are less general; they typically have a well-defined and narrow scope, but they can be very powerful in their domain of applicability. A prime example of this approach comes from the use of software tools known as SAT solvers and SMT solvers: if a claim can be encoded as a boolean formula or a formula in some decidable theory, then these solvers can be used to justify it automatically. This has been fruitfully applied in nontrivial ways; see, for example, [BHMN22].
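Two such tactics in action, on small goals of our own choosing:

```lean
import Mathlib.Tactic

-- ring: equalities that hold in any commutative ring
example (a b : ℤ) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := by
  ring

-- linarith: goals that follow by linear arithmetic from the hypotheses
example (x y : ℚ) (h₁ : x ≤ 2 * y) (h₂ : y ≤ 3) : x ≤ 6 := by
  linarith
```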

Generalizing the Hypotheses

Often, one of the first things we do when we have proved a lemma or a theorem is check whether the hypotheses can be weakened, in order to increase its applicability. Proof assistants are especially helpful in this respect, because they enable us to tinker with hypotheses interactively and see what breaks. Returning to our example, upon verifying the initial result, you might wonder: what are the minimal assumptions that we need to make this proof go through? Let’s experiment. The proof never mentions inverses of scalars, so it should work for a larger class of rings. In a first attempt, we might try to replace the vector space V by an arbitrary module M as follows:

variable {R M : Type}
  [Ring R] [AddCommGroup M] [Module R M]

But Lean complains at the first step in the calculation, which requires b • ν • y = ν • b • y. We can address this by assuming that the ring of scalars is commutative. However, after making that change, the final step of the proof is still broken. A bit of reflection shows that this step uses the result of the calculation steps together with the assumptions hμν : μ ≠ ν and hx₀ : x ≠ 0 to prove that a = 0. In other words, we need the additional property that r • m = 0 implies that r = 0 or m = 0 for every r in R and m in M. Mathlib expresses this as the property NoZeroSMulDivisors R M, and we can add that assumption as follows:

variable {R M : Type}
  [CommRing R] [AddCommGroup M] [Module R M]
  [NoZeroSMulDivisors R M]

example (f : M →ₗ[R] M)
  (μ ν : R) (hμν : μ ≠ ν)
  (x y : M) (hx₀ : x ≠ 0) (hy₀ : y ≠ 0)
  (hx : f x = μ • x) (hy : f y = ν • y) :
  ∀ a b : R, a • x + b • y = 0 → a = 0 ∧ b = 0

Lean accepts this statement with the same proof as before! We have therefore obtained a more general theorem without changing a single character of the proof.

Strengthening the Theorem

Let’s see if we can strengthen your result even further. You started with a basic version involving vector spaces over fields and only two eigenvectors, and we just generalized it to suitable modules over commutative rings. Where can we go from here?

Let’s try to generalize the theorem to arbitrary families of eigenvectors. To do this, we will need to think carefully about how to express, in Lean, the fact that an arbitrary family of vectors is linearly independent. We could write down a definition from scratch, but it makes sense to see if we can take advantage of things that are already in the library. If we navigate to the Mathlib documentation webpage and start typing linear independent in the search box, one of the first few results that comes up is the following:

def LinearIndependent {ι : Type}
  (R : Type) {M : Type}
  (v : ι → M) [Semiring R]
  [AddCommMonoid M] [Module R M] :
  Prop

Here R and M are the relevant ring and module, as before, ι is an indexing type, and v is a family of elements of M indexed by ι. It is standard, and convenient, to represent a family of elements of M as a function v : ι → M, in which case the ith element is simply written v i. All the arguments in curly and square brackets are generally left implicit, which is to say, we expect to write LinearIndependent R v and have Lean figure out the rest. The annotation Prop means that the expression LinearIndependent R v is a proposition, namely, the proposition that the family v is linearly independent over R. At the moment, we don’t need to know the body of the definition; we can use it as a black box.
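For instance, the two-vector case from before can be packaged as a family indexed by Fin 2, the type with exactly two elements, using Mathlib’s vector notation:

```lean
import Mathlib

-- the pair x, y viewed as a family indexed by `Fin 2`:
-- `![x, y]` is the function sending 0 to x and 1 to y
example (V : Type) (x y : V) : Fin 2 → V := ![x, y]
```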

To do so, we will add an indexing type ι to the statement of our theorem, and since we also want to say that the vectors are all eigenvectors, we will use ι to index a family of scalars that will act as the eigenvalues. We can now formulate the statement we are after as follows:

example {ι R M : Type} [CommRing R]
    [AddCommGroup M] [Module R M]
    [NoZeroSMulDivisors R M]
    (f : M →ₗ[R] M)
    (μ : ι → R) (hμ : Function.Injective μ)
    (v : ι → M) (hv : ∀ i, v i ≠ 0)
    (h : ∀ i, f (v i) = μ i • v i) :
    LinearIndependent R v

Note that we have formulated the fact that we are considering distinct eigenvalues by assuming that μ is injective.

We should check to see whether something like this theorem is already in the library. If we go back to the documentation webpage and search for eigenvector linear independent, only two results come up:

theorem
  Module.End.eigenvectors_linearIndependent
  {R : Type} {M : Type} [CommRing R]
  [AddCommGroup M] [Module R M]
  [NoZeroSMulDivisors R M]
  (f : Module.End R M)
  (μs : Set R) (xs : ↑μs → M)
  (h_eigenvec : ∀ (μ : ↑μs),
    f.HasEigenvector (↑μ) (xs μ)) :
  LinearIndependent (ι := ↑μs) R xs

and the variant

theorem
  Module.End.eigenvectors_linearIndependent'
  {R : Type} {M : Type} [CommRing R]
  [AddCommGroup M] [Module R M] {ι : Type}
  [NoZeroSMulDivisors R M]
  (f : Module.End R M)
  (μ : ι → R) (hμ : Function.Injective μ)
  (v : ι → M)
  (h_eigenvec : ∀ (i : ι),
    f.HasEigenvector (μ i) (v i)) :
  LinearIndependent R v

The difference between the two is that the first is about a set of scalars and a function assigning an eigenvector to each scalar, whereas the second one is about indexed families of scalars and eigenvectors. The second one is more promising for our application, since we also chose to use indexed families of vectors and scalars. To use this theorem to prove our version, we invoke the apply tactic.

example {ι R M : Type} [CommRing R]
    [AddCommGroup M] [Module R M]
    [NoZeroSMulDivisors R M]
    (f : M →ₗ[R] M)
    (μ : ι → R) (hμ : Function.Injective μ)
    (v : ι → M) (hv : ∀ i, v i ≠ 0)
    (h : ∀ i, f (v i) = μ i • v i) :
    LinearIndependent R v := by
  apply Module.End.eigenvectors_linearIndependent'

This leaves us with a number of goals:

1. Function.Injective ?μ, asking us to prove that something is injective.

2. ∀ (i : ι), ?f.HasEigenvector (?μ i) (v i), which looks like it should have something to do with our assumptions h and hv.

3. Module.End R M, asking us to provide an endomorphism of M; this should just be our linear map f.

4. And finally, ι → R, which will be our μ.

The ?μ in the first two goals and the ?f in the second goal mean that Lean does not yet know how to instantiate the variables μ and f in the theorem we have invoked. It left those tasks as the third and fourth goals; Lean expects that it will be more convenient for us to provide that information implicitly when we solve the other goals.

We have made progress, but we still need to provide the information requested. We should clearly use hμ to solve the first goal, which we do by adding the next line to our proof:

apply Module.End.eigenvectors_linearIndependent'
exact hμ

As you might guess, the exact tactic tells Lean to use the assumption hμ to close the goal. As a side effect, that also closes goal 4: Lean is now able to infer that the family in question is μ. We then have two goals left:

1. ∀ (i : ι), ?f.HasEigenvector (μ i) (v i), and

2. Module.End R M.

We know that the endomorphism of M should be f, so we swap the order of the goals and give Lean this information:

apply Module.End.eigenvectors_linearIndependent'
exact hμ
swap ; exact f

This leaves us with one last goal, namely, ∀ (i : ι), Module.End.HasEigenvector f (μ i) (v i). We need to use our assumptions h and hv, but it’s unclear how to package them together to satisfy the definition of Module.End.HasEigenvector. Instead of going back to the documentation page and looking up the definition, we can ask Lean what we have to do.

apply Module.End.eigenvectors_linearIndependent'
exact hμ
swap ; exact f
intro i ; constructor

As before, the intro tactic introduces an arbitrary i, and then the constructor tactic tells Lean that we are ready to provide the information needed to show that v i is an eigenvector of f with eigenvalue μ i. This makes progress, but we still have the following goals to fulfill:

1. v i ∈ Module.End.eigenspace f (μ i)

2. v i ≠ 0, which is obviously an application of hv.

For 1, the hypothesis h : ∀ i, f (v i) = μ i • v i should do the trick, but applying it directly doesn’t work. After digging into the details, we see that this is because the eigenspace of f with respect to a scalar μ is defined as the kernel of f − μ • id, so we will need to convert h to this form. We could do this manually, but we can first check whether we can use a preexisting lemma from the library. The fastest way to do this, in the middle of a Lean proof, is to try the exact? tactic. This will search for ways of applying existing lemmas to close the goal exactly. This tactic doesn’t always work, but it can’t hurt to try. Aha! In this case it tells us right away that we can close the first goal by writing exact Module.End.mem_eigenspace_iff.mpr (h i). The second goal is also easy to close by applying hv, so now we have a complete proof:

apply Module.End.eigenvectors_linearIndependent'
exact hμ
swap ; exact f
intro i
constructor
exact Module.End.mem_eigenspace_iff.mpr (h i)
apply hv

The beginning of our proof was a bit messy, but we can clean things up to obtain a nice final result:

example {ι R M : Type} [CommRing R]
    [AddCommGroup M] [Module R M]
    [NoZeroSMulDivisors R M]
    (f : M →ₗ[R] M)
    (μ : ι → R) (hμ : Function.Injective μ)
    (v : ι → M) (hv : ∀ i, v i ≠ 0)
    (h : ∀ i, f (v i) = μ i • v i) :
    LinearIndependent R v := by
  apply
    Module.End.eigenvectors_linearIndependent' f μ hμ
  intro i
  constructor
  · exact Module.End.mem_eigenspace_iff.mpr (h i)
  · apply hv

The apply tactic now supplies f, μ, and hμ right away, leaving only one remaining goal.

We have thus achieved the level of generality we were after. Should this theorem be added to Mathlib? Upon consideration, we should conclude that our theorem is not substantially different from Module.End.eigenvectors_linearIndependent'. In our proof, we only added some plumbing and changed the way we talk about eigenvectors to match the theorem in the library. Now that we know about Module.End.HasEigenvector, it seems that this is the standard way to talk about eigenvectors in Mathlib. If we were to modify our statement to use this instead of our bespoke h and hv, our theorem would be a direct application of Module.End.eigenvectors_linearIndependent'. In other words, we have come to realize that the theorem in the library is what we really wanted all along. But the good news is that we have learned a lot in the process, and we are now much more comfortable reasoning about linear independence and eigenvectors with Mathlib.

Using Version Control

Programmers and computer scientists have long made use of version control platforms like GitHub to work on large, collaborative software projects. Formalization has brought the same tools and methodologies to mathematics. At the time of writing, Mathlib comprises approximately 5,000 files and 1.5 million lines, written by over 300 contributors. These contributions go through an open review process on GitHub, before they are merged into the main repository. In total there have been a bit more than 30,000 contributions since Mathlib’s inception in 2017.

A nice thing about version control tools is that they maintain the entire history of the project, allowing us to see what has changed and when. For example, we can trace the history of the concept of “linear independence” in Mathlib. It all started on December 7, 2017, when Johannes Hölzl committed the file algebra/linear_algebra/basic.lean to the repository. The file was 708 lines long, and on line 186 it contained a definition of linear independence.

def linear_independent (s : set β) : Prop :=
  ∀ l : lc α β, (∀x∉s, l x = 0) →
    l.sum (λv c, c • v) = 0 → l = 0

The variable l ranges over lc α β, the type of all formal linear combinations of elements of β with coefficients in α.

On March 10, 2018, an administrative operation moved linear_algebra/ out of algebra/ so that it became a top-level folder. Another such move occurred on January 15, 2019, when Simon Hudon moved all the mathematical content into src/ to separate it from the tests and other auxiliary files.

On July 3, 2019, Alexander Bentkamp morphed the definition into

def linear_independent : Prop :=
  (finsupp.total ι M R v).ker = ⊥

In other words, a collection of vectors v in M is linearly independent if the natural map from the free module generated by the vectors v to M has trivial kernel.

Then, on October 5, 2020, Anne Baanen split linear independence and the accumulated supporting theory into a separate file: src/linear_algebra/linear_independent.lean. The file was 918 lines long. The story continued on February 23, 2023, when Pol’tta / Miyahara Kō ported the file to Lean 4 as part of a massive collaborative effort to move all of Mathlib to the new version of Lean. The filename is now Mathlib/LinearAlgebra/LinearIndependent.lean, and it lives in the new mathlib4 repository on GitHub. The definition has not changed substantially since the change by Bentkamp. At the time of writing, it reads as follows:

def LinearIndependent : Prop :=
  LinearMap.ker (Finsupp.total ι M R v) = ⊥

Conclusions

The earliest programs for checking mathematical proofs include Nicolaas de Bruijn’s Automath system, launched in 1967, and Andrzej Trybulec’s Mizar system, launched in 1973. Since then, dozens of proof assistants have been developed; Coq, Isabelle, and HOL Light are among the more prominent ones still in use today. The Lean project, launched by Leonardo de Moura in 2013, is a relative newcomer. A special issue of the Notices, with articles by John Harrison [Har08], Thomas Hales [Hal08], and Freek Wiedijk [Wie08], surveyed the state of the field in 2008. We have come a long way since then.

Proof assistants are now commonly used in industry to verify hardware, software, network protocols, cryptographic protocols, cyberphysical systems, and more. Mathematicians have only recently begun to embrace the technology, and it is becoming clear that there are several benefits to representing mathematics in digital form. Just as the word processor opened up new opportunities for written expression and communication, the digitization of mathematics opens up new opportunities for mathematical research and teaching. The practical benefits are not the only motivation; to many of us, formalizing mathematics feels like the right thing to do. Mathematical definitions and theorems deserve to be rendered digitally.

See [Mas21], [Avi24], [Buz23], and the recent pair of special issues of the Bulletin of the American Mathematical Society (Volume 61, Numbers 2 and 3), “Will Machines Change Mathematics?”

It is also becoming clear that the technology is here to stay. Mathlib currently has roughly 1.5 million lines of code and continues to grow. Important results, including foundations for Clausen and Scholze’s condensed mathematics, the polynomial Freiman–Ruzsa conjecture, and an exponentially improved upper bound to Ramsey’s theorem, have been formally verified before journal referees signed off on them. A number of collaborative verification projects have been launched, including a proof of the sphere eversion theorem and a proof of a strengthened version of Carleson’s theorem on pointwise almost everywhere convergence of Fourier series.

We expect that, in the years to come, AI copilots that combine neural and symbolic methods will significantly ease the burden of formalization. More dramatically, we expect that the technology we have discussed here will play a significant role in the discovery of new mathematics. Note that DeepMind’s AlphaProof, which was deemed to have performed at the level of a silver medalist at the most recent International Mathematical Olympiad, was trained to find formal proofs in Lean. It is exciting to think about what a synergetic combination of machine learning, symbolic methods, and user interaction will bring to mathematics in the years ahead.

You can find online documentation and tutorials for all the proof assistants we have just mentioned. Lean’s lively social media channel on the Zulip platform is welcoming to newcomers, and the Lean community web pages contain links to learning resources, like the Natural Number Game, to help you get started. Proof assistants are not easy to use, and learning to formalize mathematics requires significant time and effort. Interaction with proof assistants like Lean comes naturally, however, to those who have grown up immersed in computational technology. We have therefore found that one of the best ways to take advantage of proof assistants is to have our students help us out. Formal mathematics is a language, and only they can claim to be among the first generation of native speakers, while the rest of us struggle to master the grammar and intonation.

With all the changes looming, we ought to be concerned about the ways that proof assistants and AI will change the mathematics that we know and love. It is therefore all the more important for those of us who are more settled in our careers to play an active role in the adoption of the new technologies, using our mathematical values and expertise to guide our students as they negotiate the changing landscape. The new developments offer us a wonderful opportunity to lead from behind, and it falls on all of us to support the next generation of mathematicians as they forge a path into the digital future.

Acknowledgment

We are grateful to the Hausdorff Research Institute for Mathematics for hosting three of us for the trimester program, “Prospects of Formal Mathematics,” in the summer of 2024, during which most of this article was written. We are also grateful to Paul Buckingham and three anonymous referees for helpful comments, corrections, and suggestions.

References

[Avi24]
Jeremy Avigad, Mathematics and the formal turn, Bull. Amer. Math. Soc. (N.S.) 61 (2024), no. 2, 225–240, DOI 10.1090/bull/1832. MR4726989.
[BHMN22]
Joshua Brakensiek, Marijn Heule, John Mackey, and David Narváez, The resolution of Keller’s conjecture, J. Automat. Reason. 66 (2022), no. 3, 277–300, DOI 10.1007/s10817-022-09623-5. MR4449705.
[Buz23]
Kevin Buzzard, What is the point of computers? A question for pure mathematicians, ICM—International Congress of Mathematicians. Vol. 2. Plenary lectures, EMS Press, Berlin, 2023, pp. 578–608. MR4680264.
[Hal08]
Thomas C. Hales, Formal proof, Notices Amer. Math. Soc. 55 (2008), no. 11, 1370–1380. MR2463990.
[Har08]
John Harrison, Formal proof—theory and practice, Notices Amer. Math. Soc. 55 (2008), no. 11, 1395–1406. MR2463992.
[Mas21]
Patrick Massot, Why formalize mathematics, 2021, https://www.imo.universite-paris-saclay.fr/~patrick.massot/files/exposition/why_formalize.pdf.
[Wie08]
Freek Wiedijk, Formal proof—getting started, Notices Amer. Math. Soc. 55 (2008), no. 11, 1408–1417. MR2463993.

Credits

Photo of Jeremy Avigad is courtesy of Carnegie Mellon University.

Photo of Johan Commelin is courtesy of Johan Commelin.

Photo of Heather Macbeth is courtesy of Peter Insley.

Photo of Adam Topaz is courtesy of Adam Topaz.