# Differentiating by Prime Numbers

Jack Jeffries

Communicated by Notices Associate Editor Steven Sam

It is likely a fair assumption that you, the reader, are not only familiar with but even quite adept at differentiating by $x$. What about differentiating by 13? That certainly didn’t come up in my calculus class! From a calculus perspective, this is ridiculous: are we supposed to take a limit as 13 changes?

One notion of differentiating by 13, or any other prime number $p$, is the notion of $p$-derivation discovered independently by Joyal [Joy85] and Buium [Bui96]. $p$-derivations have been put to use in a range of applications in algebra, number theory, and arithmetic geometry. Despite the wide range of sophisticated applications, and the fundamentally counterintuitive nature of the idea of differentiating by a number, $p$-derivations are elementary to define and inviting for exploration.

In this article, we will introduce $p$-derivations and give a few basic ways in which they really do act like derivatives by numbers; our hope is that you will be inspired and consider adding $p$-derivations to your own toolkit!

## $p$-Derivations on ${\mathbb{Z}}$

First we want to discuss differentiating one number, $n$, by another, $p$; i.e., what we will call $p$-derivations on $\mathbb{Z}$. Before we succeed, we need to abandon the notion of derivative as a limit as the input varies by a small amount: the thing that we are differentiating by does not vary, and the thing that we are differentiating does not even have an input! Instead, we take a little inspiration from elementary number theory.

Let $p$ be a prime number. By Fermat’s little theorem, for any integer $n$, we have

$$n^p \equiv n \pmod{p},$$

so we can divide the difference $n - n^p$ by $p$. The starting point of our journey is that not only can we divide by $p$ here, but we should. The $p$-derivation on $\mathbb{Z}$ is the result of this process. Namely:

$$\delta_p : \mathbb{Z} \to \mathbb{Z}, \qquad \delta_p(n) = \frac{n - n^p}{p}.$$

So, in particular, there is the $2$-derivation on $\mathbb{Z}$ and the $3$-derivation on $\mathbb{Z}$ given respectively by

$$\delta_2(n) = \frac{n - n^2}{2} \qquad\text{and}\qquad \delta_3(n) = \frac{n - n^3}{3}.$$

Let’s plug in a few values:

| $n$ | $-4$ | $-3$ | $-2$ | $-1$ | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ | $6$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| $\delta_2(n)$ | $-10$ | $-6$ | $-3$ | $-1$ | $0$ | $0$ | $-1$ | $-3$ | $-6$ | $-10$ | $-15$ |
| $\delta_3(n)$ | $20$ | $8$ | $2$ | $0$ | $0$ | $0$ | $-2$ | $-8$ | $-20$ | $-40$ | $-70$ |

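Before going further, it may help to experiment. The following is a minimal Python sketch of $\delta_p$ (the function name `delta` is ours, purely for illustration):

```python
def delta(p: int, n: int) -> int:
    """The p-derivation on the integers: delta_p(n) = (n - n**p) / p."""
    # Fermat's little theorem guarantees that p divides n - n**p,
    # so the integer division below is exact.
    assert (n - n**p) % p == 0
    return (n - n**p) // p

# The outputs of delta_2 are the negatives of the triangular numbers:
print([delta(2, n) for n in range(7)])  # [0, 0, -1, -3, -6, -10, -15]

# delta_3 is an odd function:
print(delta(3, -2), delta(3, 2))        # 2 -2
```

Any computer algebra system will do just as well; the point is that $\delta_p$ is a single exact integer formula.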
A quick look at this table suggests a few observations, easily verified from the definition:

Numbers are no longer “constants” in the sense of having derivative zero, but at least $0$ and $1$ are.

These functions are neither additive nor multiplicative, e.g.:

$$\delta_2(1+1) = -1 \neq 0 = \delta_2(1) + \delta_2(1) \qquad\text{and}\qquad \delta_2(2 \cdot 3) = -15 \neq 3 = \delta_2(2)\,\delta_2(3).$$

$\delta_p$ is an odd function, at least for $p$ odd.

The outputs of $\delta_2$ are just the negatives of the triangular numbers.

We might also note that the outputs are very large in absolute value, and think that this operation is simply making a mess of our numbers. However, something more informative occurs if we think about largeness of the outputs from the point of view of $p$, namely, the $p$-adic order of an integer: the number of copies of $p$ in its prime factorization. Writing $n = p^e m$ with $p \nmid m$, if $e \geq 1$, we get

$$\delta_p(n) = \frac{p^e m - p^{pe} m^p}{p} = p^{e-1}\left(m - p^{(p-1)e}\, m^p\right).$$

Since $e \geq 1$ and $p \nmid m$, we must have $m - p^{(p-1)e} m^p \equiv m \pmod{p}$, so $p$ does not divide $m - p^{(p-1)e} m^p$. In particular, the $p$-derivation decreases the $p$-adic order of a multiple of $p$ by exactly one. This leads to our first comparison with old-fashioned $\frac{d}{dx}$:

**Comparison 1.** If $p^e$ exactly divides $n$, with $e \geq 1$, then $p^{e-1}$ exactly divides $\delta_p(n)$; just as, if $(x-a)^e$ exactly divides a polynomial $f(x)$, with $e \geq 1$, then $(x-a)^{e-1}$ exactly divides $\frac{df}{dx}$.

In particular, if $a$ is a simple root of $f(x)$ or $p$ is a simple factor of $n$, then it is no longer a root or factor of $\frac{df}{dx}$ or $\delta_p(n)$, respectively.

Let’s check this against our table: the numbers $2$, $6$, and $-2$ that were divisible by $2$ but not $4$ result in odd numbers when we apply $\delta_2$, whereas $4$ and $-4$ returned even numbers no longer divisible by $4$. Note that this order-decreasing property says nothing about what happens when you apply $\delta_2$ to an odd number, and indeed, based on the table we observe that both even and odd numbers can result. You can convince yourself that $\delta_2(n)$ is odd if and only if $n$ is congruent to $2$ or $3$ modulo $4$.
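Both the order-decreasing property and this parity pattern are easy to confirm by machine; here is a quick sketch (the helper names `delta` and `ord_p` are ours):

```python
def delta(p, n):
    # the p-derivation on the integers
    return (n - n**p) // p

def ord_p(p, n):
    # the p-adic order: the number of factors of p in n (for n != 0)
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

# the p-derivation drops the p-adic order of a multiple of p by exactly one
for p in (2, 3, 5):
    for e in (1, 2, 3):
        for m in (1, 2, 7, -9):
            if m % p:  # only test m not divisible by p
                assert ord_p(p, delta(p, p**e * m)) == e - 1

# delta_2(n) is odd exactly when n is 2 or 3 mod 4
for n in range(-50, 50):
    assert (delta(2, n) % 2 == 1) == (n % 4 in (2, 3))
print("order and parity checks pass")
```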

We’ve observed already that these $p$-derivations on $\mathbb{Z}$ are not additive. This can be a bit unsettling for those of us (like myself) who are usually accustomed to the luxury of additive operators. However, any function satisfying the order-decreasing property of Comparison 1 above must not be additive, since an additive function has to take multiples of $p^2$ to multiples of $p^2$. However, the error term can be made concrete:

$$(m+n)^p = m^p + n^p + \sum_{i=1}^{p-1} \binom{p}{i} m^i n^{p-i}.$$

All of the binomial coefficients appearing above are multiples of $p$, so this last sum, divided by $p$, is, given a particular value of $p$, a particular polynomial in $m$ and $n$ with integer coefficients; let’s call it $C_p(m,n)$ for convenience. This gives us the following “sum rule” for $\delta_p$:

$$\delta_p(m+n) = \delta_p(m) + \delta_p(n) - C_p(m,n).$$

Products satisfy a rule with a similar flavor:

$$\delta_p(mn) = m^p\,\delta_p(n) + n^p\,\delta_p(m) + p\,\delta_p(m)\,\delta_p(n).$$
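Both rules can be checked by brute force over many inputs; a sketch (the helper names `delta` and `C` are our own):

```python
from itertools import product
from math import comb

def delta(p, n):
    # the p-derivation on the integers
    return (n - n**p) // p

def C(p, m, n):
    # C_p(m, n): the sum of the middle binomial terms of (m+n)**p, divided
    # by p; each comb(p, i) with 0 < i < p is a multiple of p, so this is exact
    return sum(comb(p, i) * m**i * n**(p - i) for i in range(1, p)) // p

for p in (2, 3, 5):
    for m, n in product(range(-6, 7), repeat=2):
        # sum rule
        assert delta(p, m + n) == delta(p, m) + delta(p, n) - C(p, m, n)
        # product rule
        assert delta(p, m * n) == m**p * delta(p, n) + n**p * delta(p, m) \
            + p * delta(p, m) * delta(p, n)
print("sum and product rules verified")
```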

The fact that we have rules to break things down into sums and products gives the basis for another comparison with old-fashioned $\frac{d}{dx}$:

**Comparison 2.** Like $\frac{d}{dx}$, the operation $\delta_p$ satisfies a sum rule and a product rule, though each involves extra terms.

We might pause to ask whether we could have hoped for a simpler way to differentiate by 13. If we want Comparison 2 to hold, then the following theorem of Buium provides a definitive answer.

That is, any function satisfying a sum rule and a product rule is a mild variation on a $p$-derivation.

With the properties of $p$-derivations we have so far, we can recreate analogues of some familiar aspects of calculus. For example, from the product rule and a straightforward induction, we obtain a power rule:

$$\delta_p(a^n) = \sum_{i=1}^{n} \binom{n}{i}\, p^{i-1}\, a^{p(n-i)}\, \delta_p(a)^i.$$

Note that the $i = 1$ term in the sum above, $n\, a^{p(n-1)}\, \delta_p(a)$, looks a bit like the power rule for usual derivatives. If we allow ourselves to extend $\delta_p$ to a map on $\mathbb{Q}$, then we get an analogue of the quotient rule:

$$\delta_p\!\left(\frac{m}{n}\right) = \frac{n^p\,\delta_p(m) - m^p\,\delta_p(n)}{n^p\left(n^p + p\,\delta_p(n)\right)}.$$
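Since both formulas are exact identities, they too can be machine-checked; this sketch uses Python’s `Fraction` for exact rational arithmetic in the quotient rule (all names are ours):

```python
from fractions import Fraction
from math import comb

def delta(p, n):
    # the same formula defines delta_p on rational numbers
    return (n - n**p) / Fraction(p)

# power rule for a**n, as a sum indexed by i = 1, ..., n
p, a, n = 3, 2, 4
lhs = delta(p, a**n)
rhs = sum(comb(n, i) * p**(i - 1) * a**(p * (n - i)) * delta(p, a)**i
          for i in range(1, n + 1))
assert lhs == rhs

# quotient rule for m/q in Q
m, q = 7, 5
lhs = delta(p, Fraction(m, q))
rhs = (q**p * delta(p, m) - m**p * delta(p, q)) \
    / (q**p * (q**p + p * delta(p, q)))
assert lhs == rhs
print("power and quotient rules verified")
```

Note that $n^p + p\,\delta_p(n) = n$, so the denominator in the quotient rule is just $n^{p+1}$; the displayed form is chosen to echo the classical quotient rule.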

Of the main cast of characters in a first class on derivatives, perhaps the most conspicuous one missing at this point is the chain rule. Since there is no way to compose a number with a number, we will need a notion of -derivations for functions to state a sensible analogue of the chain rule.

## $p$-Derivations for General Commutative Rings

One can define $p$-derivations for commutative rings with identity.

**Definition.** Let $p$ be a prime number and $R$ a commutative ring. A $p$-derivation on $R$ is a map $\delta : R \to R$ such that $\delta(1) = 0$ and, for all $x, y \in R$,

$$\delta(x+y) = \delta(x) + \delta(y) - C_p(x,y) \qquad\text{and}\qquad \delta(xy) = x^p\,\delta(y) + y^p\,\delta(x) + p\,\delta(x)\,\delta(y),$$

where $C_p(X,Y) = \frac{(X+Y)^p - X^p - Y^p}{p} \in \mathbb{Z}[X,Y]$.

Evidently, the functions we defined on $\mathbb{Z}$ above are $p$-derivations. In fact, for a fixed $p$, a simple induction and the sum rule show that for any $p$-derivation $\delta$ on a ring $R$ and any $n$ in the prime subring (image of $\mathbb{Z}$) of $R$, $\delta(n) = \delta_p(n)$.

The other basic example is as follows. Take the ring of polynomials in $n$ variables with integer coefficients, $\mathbb{Z}[x_1,\dots,x_n]$. For any polynomial $f$, we can consider its $p$th power $f(x_1,\dots,x_n)^p$, or we can plug in $p$th powers of the variables as inputs to get $f(x_1^p,\dots,x_n^p)$. These are different, but they agree modulo $p$ as a consequence of the “freshman’s dream.” Namely, in the quotient ring $\mathbb{Z}[x_1,\dots,x_n]/(p)$,

$$(f+g)^p = f^p + g^p$$

since each binomial coefficient $\binom{p}{i}$ with $0 < i < p$ is a multiple of $p$, and

$$(fg)^p = f^p g^p$$

as a consequence of commutativity, so the map $f \mapsto f^p$ is a ring homomorphism in $\mathbb{Z}[x_1,\dots,x_n]/(p)$, called the Frobenius map. Thus, in $\mathbb{Z}[x_1,\dots,x_n]/(p)$, taking $p$th powers before doing polynomial operations is just as good as after. So, back in $\mathbb{Z}[x_1,\dots,x_n]$ we can divide the difference by $p$, and we will! Namely, we can define the function

$$\widetilde{\delta}_p(f) = \frac{f(x_1^p,\dots,x_n^p) - f(x_1,\dots,x_n)^p}{p},$$

and this function is a $p$-derivation. Just so we can refer to this function later, let’s call this the standard $p$-derivation on $\mathbb{Z}[x_1,\dots,x_n]$ and denote it by $\widetilde{\delta}_p$ (though this notation is not at all standard).

For example,

$$\widetilde{\delta}_2(x+y) = \frac{(x^2 + y^2) - (x+y)^2}{2} = -xy.$$
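For univariate polynomials the standard $p$-derivation is easy to compute from the definition. This sketch represents $f \in \mathbb{Z}[x]$ as a list of integer coefficients, constant term first (all helper names are ours):

```python
def poly_mul(f, g):
    # multiply two coefficient lists
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_pow(f, k):
    out = [1]
    for _ in range(k):
        out = poly_mul(out, f)
    return out

def plug_in_pth_powers(f, p):
    # f(x) -> f(x**p): spread the coefficients out by a factor of p
    out = [0] * ((len(f) - 1) * p + 1)
    for i, a in enumerate(f):
        out[i * p] = a
    return out

def delta_std(f, p):
    # (f(x**p) - f(x)**p) / p, coefficient by coefficient; the difference
    # is divisible by p by the "freshman's dream"
    fp = poly_pow(f, p)
    fxp = plug_in_pth_powers(f, p)
    n = max(len(fp), len(fxp))
    fp += [0] * (n - len(fp))
    fxp += [0] * (n - len(fxp))
    return [(a - b) // p for a, b in zip(fxp, fp)]

# delta-tilde_2(x + 1) = ((x**2 + 1) - (x + 1)**2) / 2 = -x
print(delta_std([1, 1], 2))  # [0, -1, 0], i.e. -x
```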

As this operator measures the failure of the freshman’s dream, one might think of this as a freshman’s nightmare. In fact, in large generality, $p$-derivations all arise from some freshman’s nightmare. Let’s make this precise. Given a ring $R$, we say that a map $\Phi : R \to R$ is a lift of Frobenius if it is a ring homomorphism and the induced map on $R/pR$ is just the Frobenius map, i.e., $\Phi(x) \equiv x^p \pmod{pR}$ for all $x \in R$. Given a $p$-derivation $\delta$ on $R$, the map $\Phi_\delta$ given by

$$\Phi_\delta(x) = x^p + p\,\delta(x)$$
is a lift of Frobenius. Indeed, the congruence condition is automatic, and the sum rule and product rule for $\delta$ translate exactly to the conditions that $\Phi_\delta$ respects addition and multiplication. Conversely, if $p$ is a nonzerodivisor on $R$, and $\Phi$ is a lift of Frobenius, then the map $\delta_\Phi(x) = \frac{\Phi(x) - x^p}{p}$ is a $p$-derivation: the freshman’s nightmare associated to the lift of Frobenius $\Phi$.
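On $\mathbb{Z}$ itself this correspondence is especially simple: since $n^p + p\,\delta_p(n) = n$, the lift of Frobenius attached to $\delta_p$ is the identity map, and the identity lifts Frobenius precisely because of Fermat’s little theorem. A quick sketch (names ours):

```python
def delta(p, n):
    # the p-derivation on the integers
    return (n - n**p) // p

def Phi(p, n):
    # the lift of Frobenius attached to delta_p
    return n**p + p * delta(p, n)

for p in (2, 3, 5, 13):
    for n in range(-20, 21):
        assert Phi(p, n) == n              # on Z, Phi is the identity map
        assert (Phi(p, n) - n**p) % p == 0  # and it lifts Frobenius
print("ok")
```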

It is worth noting that not every ring admits a $p$-derivation. For a quick example, no nonzero ring of characteristic $p$ admits a $p$-derivation, since we would have

$$0 = \delta(0) = \delta(p) = 1 - p^{p-1} = 1$$

in $R$. Much more subtle obstructions exist, and it is an interesting question to determine which rings admit $p$-derivations; see [AWZ21] for some recent work on related questions.

Note that the power rule from before follows for any $p$-derivation on any ring, since we just used the product rule to see it. The order-decreasing property holds in general, too, at least if $p$ is a nonzerodivisor on $R$: this follows from writing $a = p^e u$ with $e \geq 1$ and applying the product rule:

$$\delta(p^e u) = p^{pe}\,\delta(u) + u^p\,\delta(p^e) + p\,\delta(p^e)\,\delta(u),$$

where $\delta(p^e) = \delta_p(p^e) = p^{e-1}\left(1 - p^{(p-1)e}\right)$ has $p$-adic order exactly $e - 1$.

Let’s wrap up our cliffhanger from the previous section. Now that we have $p$-derivations of polynomials, we have the ingredients needed for a chain rule: given a polynomial $f \in \mathbb{Z}[x]$ and a number $a \in \mathbb{Z}$, we will think of the number $f(a)$ as the composition of the function $f$ and the number $a$, and we can try to compare $\delta_p(f(a))$ with $\widetilde{\delta}_p(f)$ and $\delta_p(a)$. Here’s the chain rule:

$$\delta_p(f(a)) = \widetilde{\delta}_p(f)(a) + \sum_{i \geq 1} \frac{f^{(i)}(a^p)}{i!}\, p^{i-1}\, \delta_p(a)^i.$$

This is a bit more complicated than the original, but let’s notice in passing that the $i = 1$ term in the sum, $f'(a^p)\,\delta_p(a)$, looks pretty close to the classic chain rule, besides the $p$th power on $a$. The curious reader is encouraged¹ to prove the formula above.

¹ For a hint, consider the lift of Frobenius on $\mathbb{Z}[x]$ that sends $x \mapsto x^p + p\,\delta_p(a)$, and use Taylor expansion to rewrite the associated $p$-derivation in terms of the standard $p$-derivation and derivatives of $f$.
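The chain rule is also an exact identity, so it can be machine-checked. This sketch uses the integer-coefficient divided derivatives $f^{(i)}/i!$ (often called Hasse derivatives), whose coefficients are binomial coefficients; all helper names are ours:

```python
from math import comb

def delta(p, n):
    # the p-derivation on the integers
    return (n - n**p) // p

def poly_eval(f, t):
    # f is a list of integer coefficients, constant term first
    return sum(c * t**j for j, c in enumerate(f))

def hasse_derivative(f, i):
    # the i-th Hasse derivative f^(i)/i!, which has integer coefficients
    return [comb(j, i) * c for j, c in enumerate(f) if j >= i]

def delta_std_at(f, p, a):
    # the standard p-derivation of f, evaluated at a:
    # (f(a**p) - f(a)**p) / p
    return (poly_eval(f, a**p) - poly_eval(f, a)**p) // p

f = [1, -3, 0, 2]          # f(x) = 2x**3 - 3x + 1
p, a = 3, 2

lhs = delta(p, poly_eval(f, a))
rhs = delta_std_at(f, p, a) + sum(
    poly_eval(hasse_derivative(f, i), a**p) * p**(i - 1) * delta(p, a)**i
    for i in range(1, len(f)))
assert lhs == rhs
print("chain rule verified")
```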

We have collected a decent set of analogues for the basics of differential calculus for $p$-derivations. One can ask how far this story goes, and the short answer is very far. Buium has developed extensive theories of arithmetic differential equations and arithmetic differential geometry, building analogues of the classical (nonarithmetic) versions of these respective theories with $p$-derivations playing the role of usual derivatives. The reader is encouraged to check out [Bui05, Bui17] to learn more about these beautiful theories, though our story now diverges from these. Instead, we will turn our attention towards using $p$-derivations to give some algebraic results with geometric flavors.

## A Jacobian Criterion

One natural geometric consideration is whether, and where, a shape has singularities: points that locally fail to look flat, due to some sort of crossing or crinkle (or some harder-to-imagine higher-dimensional analogue of a crossing or crinkle). For example, the double cone cut out by the equation $x^2 + y^2 = z^2$ has a singularity at the origin where the two cones meet, but any other point on the cone is not a singularity; see Figure 1.

We are going to consider shapes like this that are cut out by polynomial equations, though to state the classical Jacobian criterion, we will consider their solution sets over the complex numbers.

Since it is difficult to envision higher-dimensional shapes (and impossible to envision what we’re doing next!), it will be useful to give a somewhat more algebraic heuristic definition of singularity. We will say that a point $z$ is a nonsingular point in $X$ if within $X$ one can locally cut out $z$ by exactly $\dim(X)$-many equations without taking roots, and singular otherwise. For example, the point $P = (3,4,5)$ in the cone is nonsingular, and I claim that the two equations $x = 3$, $y = 4$ “work” for our definition: with these two equations and the equation for the cone, we get

$$z^2 = 3^2 + 4^2 = 25.$$
Substituting in, we get $(z-5)(z+5) = 0$, and “near $P$,” $z + 5$ is nonzero, so we can divide out and get $z - 5 = 0$, so $z = 5$. On the other hand, the origin is singular, and the two equations $x = 0$, $y = 0$ don’t “work” for our definition: we have

$$z^2 = 0^2 + 0^2 = 0,$$

so $z^2 = 0$, but we need to take a root to get $z = 0$.

The classical Jacobian criterion gives a recipe for the locus of all singularities of a shape cut out by complex polynomials. In the case of a hypersurface $X = \{f = 0\}$, it says that the singular points of $X$ are exactly the points of $X$ at which all of the partial derivatives $\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}$ vanish.

For example, for the Whitney umbrella cut out by the polynomial $x^2 - y^2 z$, the singular locus is cut out by t