Bounds on the number and sizes of conjugacy classes in finite Chevalley groups with Applications to Derangements

We present explicit upper bounds for the number and size of conjugacy classes in finite Chevalley groups and their variations. These results have been used by many authors to study zeta functions associated to representations of finite simple groups, random walks on Chevalley groups, the final solution to the Ore conjecture about commutators in finite simple groups and other similar problems. In this paper, we solve a strong version of the Boston-Shalev conjecture on derangements in simple groups for most of the families of primitive permutation group representations of finite simple groups (the remaining cases are settled in two other papers of the authors and applications are given in a third).


Introduction
One might expect that there is nothing more to be done with the study of conjugacy classes of finite Chevalley groups. For instance, over forty years ago Wall [W] determined the conjugacy classes and their sizes for the unitary, symplectic, and orthogonal groups. However, the formulas involved are complicated and it is not automatic to derive upper bounds on numbers of classes or their sizes. Moreover, many applications seem to require such bounds (in particular, universal explicit bounds of the form $cq^r$ where $r$ is the rank of the ambient algebraic group and $q$ is the size of the field of definition). To convince the reader of this, we mention some places in the literature where bounds on the number of conjugacy classes in finite classical groups were needed: (1) The work of Gluck [Gl] on convergence rates of random walks on finite classical groups. His bounds were of the form $cq^{3r}$. (2) The work of Liebeck and Pyber [LiP] on the number of conjugacy classes in arbitrary groups; for finite groups of Lie type their bound was $(6q)^r$.
(3) The work of Maslen and Rockmore [MR] on computations of Fourier transforms; they obtained a bound of $q^n$ for $GL(n, q)$ and $8.26q^n$ for $U(n, q)$. These bounds are of the type we prove here, namely $cq^r$ where $c$ is explicit. (4) Liebeck and Shalev [LS1] have used bounds in the current paper to study probabilistic results about homomorphisms of certain Fuchsian groups into Chevalley groups, and random walks on Chevalley groups [LS2]. Shalev used these results in a crucial way to study the images of word maps [Sh]. Our bounds were also critical in the solution of the Ore conjecture on commutators in finite simple groups [LOST]. (5) Our results have been used in studying various versions of Brauer's $k(GV)$ problem [GT1], in particular the noncoprime version and some new related conjectures of Geoff Robinson [R]. They were used in [GR] in obtaining new results about the commuting probability in finite groups. In particular, these results will be useful in improving results of Liebeck-Pyber [LiP] and Maróti [Mar] about the number of conjugacy classes in completely reducible linear groups over finite fields and in permutation groups.
We also use our results to prove a large part of a conjecture of Boston et al. [Bo] and Shalev stating that the proportion of fixed point free elements of a finite simple group in a transitive action on a finite set $X$ with $|X| > 1$ is bounded away from zero. This immediately reduces to the case of primitive actions (and so to studying maximal subgroups of simple groups). This conjecture has applications to random generation of groups [FG4] and to maps between varieties over finite fields [GW]. In fact, we prove much stronger results for many actions of almost simple groups in this paper as a consequence of our bounds on class numbers and centralizer sizes. See [FG1] where the bounded rank case was handled. The remaining cases are treated in [FG2,FG3].
We now state some of the main results of the paper. If G is a finite group, we let k(G) denote the number of conjugacy classes of G. See [GLS] for background on Chevalley groups.
Theorem 1.1. Let G be a connected simple algebraic group of rank r over a field of positive characteristic. Let F be a Steinberg-Lang endomorphism of G with G F a finite Chevalley group over the field F q .
(2) $k(G^F) \le q^r + 68q^{r-1}$. In particular, $\lim_{q \to \infty} k(G^F)/q^r = 1$ and the convergence is uniform with respect to $r$.
(3) The number of conjugacy classes of $G^F$ that are not semisimple is at most $68q^{r-1}$.
(4) The number of conjugacy classes of $G$ that are $F$-stable is between $q^r$ and $27.2q^r$.
There are much better bounds on the constants in Theorem 1.1 for many of the families. The correct upper bound for part 1 is about $15.2q^r$ (for $Sp(2n, 2)$). We give limiting values for $k(G^F)/q^r$ (as $r \to \infty$ with $q$ fixed) for each of the families of classical groups. In particular, we see that this ratio does not tend to 1 for $q$ fixed. See the tables in Section 4 for a summary of the results. There are precise formulas for the number of conjugacy classes for the exceptional Chevalley groups; see [Lu2] (and similarly, one can work out such formulas for the low rank classical groups). As we have already noted earlier in the introduction, the existence of a $C(q)$ that depends on $r$ has already been proved (and this is straightforward). In fact, using the results of Lusztig and others about unipotent classes, it is easy to prove that $k(G^F)$ is bounded above by a monic polynomial in $q$ of degree $r$ (independently of $q$), whence for a fixed $r$, it follows that $k(G^F)/q^r \to 1$ as $q \to \infty$. One of the key features of our result is that our bounds are independent of the rank of the group, and many of the applications depend on this.
We show that one can get a similar bound for almost simple Chevalley groups allowing all types of outer automorphisms.
Corollary 1.2. Let G be an almost simple group with socle S, a Chevalley group of rank r defined over F q .
Recall that a permutation is called a derangement if it has no fixed points. If $G$ is a transitive group on a finite set Ω, define δ(G, Ω) to be the proportion of derangements in $G$. By an old theorem of Jordan, it follows that δ(G, Ω) > 0 if |Ω| > 1. By a much more recent (but still elementary) theorem [CC], δ(G, Ω) ≥ 1/|Ω| for |Ω| > 1. See Serre [Se] for many applications of Jordan's theorem. See [GW] for applications of better bounds of δ(G, Ω) and bounds on derangements in a given coset. If Ω is the coset space $G/H$, we write δ(G, Ω) as δ(G, H).
Theorem 1.3. Let $G$ be a classical Chevalley group defined over $\mathbb{F}_q$ of rank $r$. Let $H$ be a maximal subgroup of $G$ that acts irreducibly and primitively on the natural module and does not preserve an extension field structure. Then there is a universal constant δ > 0 such that δ(G, H) > δ. Moreover, δ(G, H) → 1 as $r \to \infty$.
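As a concrete illustration of the definition of δ(G, Ω), the following minimal Python sketch computes the proportion of derangements by brute force and checks the elementary bound δ(G, Ω) ≥ 1/|Ω|. The choice of the symmetric group $S_n$ in its natural action on $n$ points and the function name `derangement_proportion` are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations


def derangement_proportion(n):
    """Proportion of fixed-point-free permutations in S_n acting on n points."""
    total = 0
    derangements = 0
    for perm in permutations(range(n)):
        total += 1
        # perm is a derangement when no point i is mapped to itself
        if all(perm[i] != i for i in range(n)):
            derangements += 1
    return Fraction(derangements, total)


for n in range(2, 8):
    delta = derangement_proportion(n)
    # Elementary lower bound: delta >= 1/|Omega| with |Omega| = n
    assert delta >= Fraction(1, n)
    print(n, delta, float(delta))
```

For $S_n$ the proportion tends to $1/e$, so this action is far from extremal for the bound $1/|Ω|$; the sketch is only meant to make the quantity δ concrete.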
The case where r is bounded (including the case of exceptional groups) was dealt with in [FG1]. In that case, the first statement about the existence of δ is the same. The second statement is valid if and only if H does not contain a maximal torus. We give another proof here. In fact, we prove a much stronger result than Theorem 1.3. See Theorems 7.3 and 7.7. We also prove some results about derangements in cosets of simple groups. See Section 7.
The three remaining families of maximal subgroups (reducible subgroups -including parabolic subgroups, groups preserving an extension field structure and imprimitive groups) are dealt with in [FG2,FG3]. The Boston-Shalev conjecture was proved for alternating and symmetric groups in [LuP]. See also [D], [DFG], and [FG3].
Two other results of interest that we prove and use in the preceding result are: Theorem 1.4. Let G be a connected simple algebraic group of rank r of adjoint type over a field of positive characteristic. Let F be a Steinberg-Lang endomorphism of G with G F a finite Chevalley group over the field F q . There is an absolute constant A such that for all x ∈ G F , .
See Section 6 for bounds of the form in Theorem 1.4, with explicit constants in all cases. The result holds for the finite simple Chevalley groups as well except that if $G = PSL(n, q)$ or $PSU(n, q)$, then $q^{n-1}$ needs to be replaced by $q^{n-2}$. The result also holds for orthogonal groups except that in even dimension, $q^r$ needs to be replaced by $2q^{r-1}$ (but only for elements outside $SO$).
Theorem 1.5. Let $G$ be a finite simple Chevalley group defined over $\mathbb{F}_q$ of rank $r$ with $q$ a power of the prime $p$. The number of conjugacy classes of maximal subgroups of $G$ is at most $Ar(r + r^{1/2}p^{3r^{1/2}} + \log\log q)$ for some constant $A$.
In fact, we conjecture that the $r^{1/2}p^{3r^{1/2}}$ term can be removed above. If $r$ is bounded, a much stronger result is given in [LMS].
The organization of this paper is as follows. Section 2 studies the number of conjugacy classes in a given coset of a normal subgroup. In particular, we give a very short proof (Lemma 2.2) of a generalization of results in [BW, I] on the distribution of conjugacy classes over cosets of some normal subgroup. Section 3 obtains explicit and sharp upper bounds and asymptotics for the number of conjugacy classes in finite classical groups (some of these were announced in the survey [FG1]). This mostly involves a careful analysis of Wall's generating functions for class numbers, but we do obtain new generating functions for groups such as SO ± (2n, q) and Ω ± (2n, q) with q odd. Section 4 tabulates some of the results from previous sections and summarizes corresponding results for exceptional Chevalley groups, due to Lübeck [Lu2] and others. In Section 5, we turn to almost simple groups, proving Theorem 1.1, Corollary 1.2, and some related results. Section 6 derives explicit lower bounds on centralizer sizes (and so upper bounds on the sizes of conjugacy classes) in finite classical groups. In Section 7, we get upper bounds for the number of conjugacy classes in a maximal subgroup aside from three families of maximal subgroups. We then combine those results with Theorem 6.15 to obtain Theorem 1.3, Theorem 7.7, and related results on derangements.

Outer Automorphisms
In this section, we prove some results about the number of conjugacy classes in a given coset. This will allow us to pass between various forms of our group. We first recall an elementary result of Gallagher [Ga].
The following lemma will be useful in getting some better bounds for almost simple groups. See also [BW, I, K] for similar but somewhat weaker results. Many of the proofs of related results use character (or Brauer character) theory (in particular, Brauer's Lemma), and thus do not immediately extend to the case of π-elements.
Our method of proof is entirely different: it is shorter, more elementary, and based on a very easy variant of what is known as Burnside's Lemma.
Lemma 2.2. Let $N$ be a normal subgroup of the finite group $G$ with $G/N$ cyclic and generated by $aN$. Let π be a set of primes containing all prime divisors of $|G/N|$. Set α to be the number of $G$-invariant conjugacy classes of π-elements of $N$.
(1) The number of $G$-conjugacy classes of π-elements in the coset $bN$ that are a single $N$-orbit is equal to α for any coset $bN$.
(2) The number of $G$-conjugacy classes of π-elements in the coset $aN$ is α.
Proof. Note that $G$ acts on the coset $bN$ by conjugation, and thus on the π-elements in that coset. We want to calculate the number of common $G$, $N$ orbits on the π-elements of $bN$. By a slight variation of Burnside's Lemma (with essentially the same proof; see [FGS, §13]), this is the average number of fixed points of an element in the coset $aN$. Let $x \in aN$. It follows that $C_G(x) \cap a^iN = x^iC_N(x)$. Let $y = x^e$ where $e \equiv 1 \bmod [G : N]$ and $y$ is a π-element (this is possible since π contains all prime divisors of $[G : N]$). Then $y^iN = x^iN$ for all $i$, and so if and only if w is. Thus, the number of π-elements in $C_G(x) \cap a^iN$ is the number of π-elements in $C_N(x)$; in particular, this number is independent of the coset. This proves (1).
If $x \in aN$, then $G = NC_G(x)$ and so $x^N = x^G$, whence every $G$-conjugacy class in $aN$ is a single $N$-orbit. So (1) implies (2).
This allows us to prove a generalization of part of Lemma 2.1. If π is a set of primes and $X$ is a finite group, let $k_\pi(X)$ be the number of conjugacy classes of π-elements of $X$.
Lemma 2.3. Let $N$ be a normal subgroup of the finite group $G$. Let π be a set of primes. Let $x_iN$ denote a set of representatives of the π-conjugacy classes of $G/N$. Let $f(x_i)$ denote the number of $N$-conjugacy classes of π-elements that are $x_i$-invariant.
Proof. If y ∈ G is a π-element, then yN is conjugate to x i N for some i.
Set $G_i = \langle N, x_i \rangle$ and note that every prime divisor of $[G_i : N]$ is in π. By Lemma 2.2, the number of $G_i$-conjugacy classes of π-elements in $x_iN$ is at most $f(x_i) \le k_\pi(N)$. Thus, the number of $G$-conjugacy classes of π-elements that intersect $x_iN$ is at most $f(x_i)$. This completes the proof.
Corollary 2.4. Let N be a normal subgroup of G with π a set of primes containing all prime divisors of |G/N |. If each π-element g of G satisfies G = N C G (g), then G/N is abelian and the number of π-conjugacy classes of G in any coset of N is k π (N ).
Proof. We first show that $G/N$ is abelian. Consider a coset $xN$. Let $x \in G$. Then $xN = yN$ where $y$ is a π-element (as in the previous proof). Then The hypothesis implies that every π-class of $G$ is a single $N$-orbit. Applying Lemma 2.2 to the subgroup $H := \langle N, x \rangle$ shows that the number of common $H$, $N$-orbits on π-elements of $xN$ is the number of common $H$, $N$-orbits on π-elements in $N$. The hypothesis implies that all $G$-orbits on π-elements are $N$-orbits, whence the number of conjugacy classes of π-elements in $xN$ is $k_\pi(N)$.
The hypotheses apply to the case where N is a quasisimple Chevalley group in characteristic p and G is contained in the group of inner diagonal automorphisms of N with π consisting of all primes other than p. See [S1, 2.12]. Thus, we have: Corollary 2.5. Let S be a quasisimple Chevalley group. Assume that S ≤ G ≤ Inndiag (S). Then the number of semisimple conjugacy classes in each coset of S in G is the same.
Another easy consequence of Lemma 2.2 is showing that the finiteness of unipotent classes in a disconnected reductive group follows from the result for connected reductive groups. There have been several proofs of the finiteness of the number of unipotent classes in the connected case -see [Lus1]. The result is also known for the disconnected case (see [Gu] for a generalization).
Lemma 2.6. Let G be an algebraic group defined over a finite field L of characteristic p. Let H be its connected component. Suppose that H has finitely many conjugacy classes of unipotent elements. Then G has finitely many conjugacy classes of unipotent elements.
Proof. Let U be the variety of unipotent elements in G. So U is defined over L. Let k be the algebraic closure of L. Suppose that H has m conjugacy classes of unipotent elements. Let L ′ /L be a finite extension. By Lang's theorem, H(L ′ ) has at most me conjugacy classes of unipotent elements, where e is the maximal number of connected components in C H (u) for u a unipotent element of H.
Let s be the number of conjugacy classes of p-elements in G/H (which is isomorphic to G(L ′ )/H(L ′ )). By Lemma 2.3, G(L ′ ) has at most sme conjugacy classes of p-elements.
Since $G(k)$ is the union of the $G(L')$ as $L'$ ranges over all finite extensions of $L$, it follows that $G(k)$ has at most $sme$ conjugacy classes of unipotent elements.
By [GLMS, Prop 1.1], it follows that the number of unipotent classes of $G$ is the same as the number of $G(k)$ classes of unipotent elements.
The following must surely be known, but it follows easily from Lemma 2.2.
Proof. Let a be the number of A m classes that are stable under S m . Let b be the number of S m classes in A m that are not A m classes. Clearly k(A m ) = a + 2b, and by Lemma 2.2, k(S m ) = 2a + b. So we only need to show that a > b. If m = 4, 5, the result is clear. So we show that a > b for m > 5.
Note that b is precisely the number of classes where all cycle lengths are distinct and odd. There clearly is an injection into stable classes; namely since m > 4 the largest cycle is odd of length j ≥ 5, so one can replace it by a product of two 2-cycles and j − 4 fixed points. The image misses an element of order 4, and so the injection is not surjective.
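The counts in this proof can be checked for small $m$ with a short Python sketch. It enumerates the partitions of $m$ (cycle types), counts the even cycle types with all cycle lengths distinct and odd as $b$, counts the remaining even cycle types as $a$, and compares $a$ with the number of odd cycle types, which is what Lemma 2.2 predicts for the nontrivial coset of $A_m$ in $S_m$. The function names are hypothetical, and the sketch assumes the standard splitting criterion for classes of $A_m$ with $m \ge 2$.

```python
def partitions(m, max_part=None):
    """Yield the partitions of m as non-increasing tuples of positive integers."""
    if max_part is None:
        max_part = m
    if m == 0:
        yield ()
        return
    for first in range(min(m, max_part), 0, -1):
        for rest in partitions(m - first, first):
            yield (first,) + rest


def class_counts(m):
    """Return (a, b, odd) for the cycle types of S_m, assuming m >= 2.

    a   = even cycle types that stay a single A_m class (S_m-stable classes)
    b   = even cycle types that split into two A_m classes
    odd = cycle types of odd permutations
    """
    a = b = odd = 0
    for lam in partitions(m):
        if (m - len(lam)) % 2 == 1:
            odd += 1
        elif len(set(lam)) == len(lam) and all(part % 2 == 1 for part in lam):
            b += 1
        else:
            a += 1
    return a, b, odd


for m in range(4, 16):
    a, b, odd = class_counts(m)
    k_alt = a + 2 * b   # k(A_m) = a + 2b
    k_sym = 2 * a + b   # k(S_m) = 2a + b
    assert odd == a     # classes in the odd coset match the stable classes
    assert a > b        # the inequality established in the proof
    print(m, k_alt, k_sym)
```

The assertion `odd == a` is the instance of Lemma 2.2(2) used in the proof, and `a > b` is the inequality the proof establishes, checked here only for $4 \le m \le 15$.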
Another easy consequence of Lemma 2.2 is: Corollary 2.8. Let N be a normal subgroup of the finite group G. Let K be a subgroup of G containing N with K/N cyclic and central in G/N . Let π be a set of primes containing all prime divisors of |K/N |. Let ∆ be the set of G-conjugacy classes of π-elements such that K = N C G (g). Then ∆ is equally distributed among the cosets of N contained in K.
Proof. Let Γ be the union of the conjugacy classes in ∆. Note that g ∈ Γ implies that g ∈ K.
Let α be the number of K-stable conjugacy classes of π-elements of N . By the proof of Lemma 2.2, it follows that K has precisely α orbits on Γ ∩ gN for each g ∈ K. Since G/K acts freely on the K orbits on Γ, it follows that there are precisely α/[G : K] elements of ∆ in each coset of K/N .
In certain cases, one can describe the conjugacy classes in a coset very nicely using the Shintani correspondence. See [K, §2]. We first need some notation. Let G be a connected algebraic group. Let F be a Lang-Steinberg endomorphism of G (i.e. the fixed points G F form a finite group). We first recall the well known result of Lang-Steinberg.
Lemma 2.9. Let G be a connected linear algebraic group, and let F be a surjective endomorphism of G such that G F is finite. Then the map f : Note that if F is such an endomorphism of a simple connected algebraic group G, then we can attach a prime power q = q F of the characteristic to F . Then G F is said to be defined over q. We write G F = G(q) (of course, there may be more than one endomorphism associated with the same q -in particular, this is the case if G admits a graph automorphism). The Shintani correspondence is: We may define ψ as follows. Given x ∈ H F . This depended upon the choice of α x , but another choice preserves the conjugacy class, and ψ defines the desired bijection on classes. It is straightforward to see that this bijection has the properties described in the theorem.
Combining this theorem together with Lemma 2.2 gives: Proof. The previous theorem implies that the first two quantities are equal. Lemma 2.2 implies that the second and third quantities are equal.

Number of conjugacy classes in classical groups
In this section, we obtain upper bounds for k(G) with G a classical group. Subsection 3.1 develops some preliminary tools. Type A groups are treated in Subsections 3.2 and 3.3; symplectic and orthogonal groups are treated in Subsections 3.4 and 3.5 respectively.
3.1. Preliminaries. The following result of Steinberg [St1,14.8,14.10] gives lower bounds for k(G) with G a Chevalley group. See also [C, p. 102]. The inequality is not stated explicitly though.
Theorem 3.1. Let $G$ be a connected reductive group of semisimple rank $r > 0$. Let $F$ be a Frobenius endomorphism of $G$ associated to $q$. Let $Z_0$ denote the connected component of the center of $G$. The number of semisimple conjugacy classes in $G^F$ is at least $|Z_0^F|q^r$ with equality if $G'$ is simply connected. In particular, $k(G^F) > q^r$.
Proof. By [St1,14.8], the number of $F$-stable semisimple conjugacy classes is exactly $|Z_0^F|q^r$. If $G'$ is simply connected, then the centralizer of any semisimple element is connected, whence there is a bijection between stable conjugacy classes of semisimple elements and semisimple conjugacy classes in $G^F$.
On the other hand, every $F$-stable class has a representative in $G^F$ by [St1,14.10], and so there are at least $q^r$ semisimple conjugacy classes in $G^F$ (with equality in the simply connected case). Since there must be at least 1 stable class of nontrivial unipotent elements, the last statement follows.
The following two asymptotic lemmas will be useful.
Lemma 3.2. (Darboux [O]) Suppose that $f(u)$ is analytic for $|u| < r$, $r > 0$ and has a finite number of simple poles on $|u| = r$. Let $w_j$ denote the poles, and suppose that f (u) = j g j (u)
Lemma 3.3. ([O]) Suppose that $f(u)$ is analytic for $|u| < R$. Let $M(r)$ denote the maximum of $|f|$ restricted to the circle $|u| = r$. Then for any $0 < r < R$, the coefficient of $u^n$ in $f(u)$ has absolute value at most $M(r)/r^n$.
The following lemma is Euler's pentagonal number theorem (see for instance page 11 of [A1]).
Throughout this section, quantities which can be easily re-expressed in terms of the infinite product $\prod_{i=1}^{\infty} (1 - 1/q^i)$ will often arise, and Lemma 3.4 gives arbitrarily accurate upper and lower bounds on these products. Hence we will state bounds such as $\prod_{i \ge 1} (1 - 1/4^i)/(1 - 1/2^i) \le 2.4$ without explicitly mentioning Euler's pentagonal number theorem on each occasion.
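The way the pentagonal number theorem yields such numerical bounds can be sketched in Python. The sketch assumes the standard form of the theorem, $\prod_{i \ge 1}(1 - x^i) = 1 + \sum_{k \ge 1} (-1)^k (x^{k(3k-1)/2} + x^{k(3k+1)/2})$ with $x = 1/q$, and uses the fact that consecutive partial sums bracket the product. The function names and the truncation lengths are illustrative assumptions.

```python
def euler_product(q, terms=200):
    """Approximate prod_{i>=1} (1 - q^{-i}) by its first `terms` factors."""
    value = 1.0
    for i in range(1, terms + 1):
        value *= 1.0 - float(q) ** (-i)
    return value


def pentagonal_partial_sum(q, k_max):
    """1 + sum_{k=1}^{k_max} (-1)^k (x^{k(3k-1)/2} + x^{k(3k+1)/2}) with x = 1/q."""
    x = 1.0 / q
    total = 1.0
    for k in range(1, k_max + 1):
        sign = -1.0 if k % 2 == 1 else 1.0
        total += sign * (x ** (k * (3 * k - 1) // 2) + x ** (k * (3 * k + 1) // 2))
    return total


for q in (2, 3, 4, 5):
    product = euler_product(q)
    lower = pentagonal_partial_sum(q, 1)   # 1 - 1/q - 1/q^2
    upper = pentagonal_partial_sum(q, 2)   # lower + 1/q^5 + 1/q^7
    assert lower < product < upper
    print(q, lower, product, upper)

# The ratio quoted in the text as an example of such a bound
ratio = euler_product(4) / euler_product(2)
assert ratio <= 2.4
print(ratio)
```

Taking more terms of the series tightens the bracket, which is the sense in which the bounds are arbitrarily accurate.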
3.2. GL(n, q) and its relatives. To begin we discuss $GL(n, q)$. By a formula of Feit and Fine [FF, M1], the number of conjugacy classes in $GL(n, q)$ is the coefficient of $t^n$ in the generating function
$$\prod_{i \ge 1} \frac{1 - t^i}{1 - qt^i}.$$
Using clever reasoning and Euler's pentagonal number theorem, it is proved in [MR] that the number of conjugacy classes of $GL(n, q)$ is less than $q^n$.
To this we add the following simple proposition.
Proof. The generating function for conjugacy classes of $GL(n, q)$ gives that $k(GL(n, q))/q^n$ is the coefficient of $t^n$ in
$$\prod_{i \ge 1} \frac{1 - t^i/q^i}{1 - t^i/q^{i-1}}.$$
For the first assertion, use Lemma 3.2. For the second assertion, the upper bound on $k(GL(n, q))$ was mentioned earlier, and the lower bound holds since $GL(n, q)$ has $q^n - q^{n-1}$ semisimple conjugacy classes, corresponding to the possible characteristic polynomials.
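The two bounds used in this proof can be checked numerically. The following Python sketch extracts $k(GL(n, q))$ as a power series coefficient, assuming the Feit-Fine generating function has the standard form $\prod_{i \ge 1} (1 - t^i)/(1 - qt^i)$, and then verifies $q^n - q^{n-1} \le k(GL(n, q)) < q^n$ for a few small cases. The function name `k_gl_list` is hypothetical.

```python
def k_gl_list(n_max, q):
    """Coefficients of t^0..t^n_max in prod_{i>=1} (1 - t^i)/(1 - q t^i).

    Under the assumed generating function, entry n is k(GL(n, q)).
    """
    coeffs = [1] + [0] * n_max
    for i in range(1, n_max + 1):
        # Multiply by 1/(1 - q t^i): update in increasing degree
        for n in range(i, n_max + 1):
            coeffs[n] += q * coeffs[n - i]
        # Multiply by (1 - t^i): update in decreasing degree
        for n in range(n_max, i - 1, -1):
            coeffs[n] -= coeffs[n - i]
    return coeffs


for q in (2, 3, 4, 5, 7):
    classes = k_gl_list(12, q)
    for n in range(1, 13):
        # Semisimple lower bound and the upper bound attributed to [MR]
        assert q**n - q**(n - 1) <= classes[n] < q**n
    print(q, classes[:5])
```

Factors with $i > n$ do not affect the coefficient of $t^n$, so truncating the product at $i = n_{\max}$ gives exact integers.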
Remark: In fact $k(GL(n, q))/q^n$ is even closer to 1 than one might suspect from Proposition 3.5. Indeed, is analytic for all $|t| < q^{1/2}$ (subtracting the $(1 - t)^{-1}$ removed the pole at $t = 1$). Thus Lemma 3.3 gives that for any $0 < \epsilon < 1/2$, $|k(G)/q^n - 1| \le C_{q,\epsilon}/q^{n(1/2-\epsilon)}$ where $C_{q,\epsilon}$ is a constant depending on $q$ and $\epsilon$ (which one could make explicit with more effort). This is consistent with the fact ([BFH], [MR]) that $k(GL(n, q))$ is a polynomial in $q$ with lead term $q^n$ and vanishing coefficients of $q^{n-1}, \dots, q^{\lfloor (n+1)/2 \rfloor}$.
Macdonald [M1] derived formulas for the number of conjugacy classes of $SL(n, q)$, $PGL(n, q)$ and $PSL(n, q)$ in terms of $k(GL(n, q))$. As these will be used below it is useful to recall them. Let
$$\phi_r(n) = n^r \prod_{p \mid n} \left(1 - \frac{1}{p^r}\right),$$
where the product is over primes dividing $n$. Thus $\phi_1(n)$ is Euler's φ function. Macdonald showed that
$$k(SL(n, q)) = \frac{1}{q-1} \sum_{d \mid n,\, d \mid q-1} \phi_2(d)\, k(GL(n/d, q)),$$
$$k(PGL(n, q)) = \frac{1}{q-1} \sum_{d \mid n,\, d \mid q-1} \phi_1(d)\, k(GL(n/d, q)),$$
and
$$k(PSL(n, q)) = \frac{1}{(q-1)\gcd(n, q-1)} \sum_{d_1, d_2} \phi_1(d_1)\phi_2(d_2)\, k(GL(n/(d_1d_2), q)),$$
where the sum is over all pairs of divisors $d_1, d_2$ of $q - 1$ such that $d_1d_2$ divides $n$.
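Macdonald's formulas are easy to evaluate for small $n$ and $q$. The Python sketch below does so under three stated assumptions: that $k(GL(n, q))$ is the coefficient of $t^n$ in $\prod_{i \ge 1}(1 - t^i)/(1 - qt^i)$, that $\phi_r$ is the Jordan totient $\phi_r(n) = n^r \prod_{p \mid n}(1 - p^{-r})$, and that the $PSL$ formula carries the normalizing factor $1/((q-1)\gcd(n, q-1))$. All function names are hypothetical.

```python
from math import gcd


def k_gl(n, q):
    """k(GL(n, q)) as the coefficient of t^n in prod_{i>=1} (1 - t^i)/(1 - q t^i)."""
    coeffs = [1] + [0] * n
    for i in range(1, n + 1):
        for m in range(i, n + 1):        # multiply by 1/(1 - q t^i)
            coeffs[m] += q * coeffs[m - i]
        for m in range(n, i - 1, -1):    # multiply by (1 - t^i)
            coeffs[m] -= coeffs[m - i]
    return coeffs[n]


def jordan_totient(r, n):
    """phi_r(n) = n^r * prod over primes p dividing n of (1 - p^{-r})."""
    result = n**r
    m, p = n, 2
    while p * p <= m:
        if m % p == 0:
            result = result // p**r * (p**r - 1)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result = result // m**r * (m**r - 1)
    return result


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def exact_div(total, denom):
    quotient, remainder = divmod(total, denom)
    assert remainder == 0, "formula should produce an integer"
    return quotient


def k_sl(n, q):
    ds = divisors(gcd(n, q - 1))
    return exact_div(sum(jordan_totient(2, d) * k_gl(n // d, q) for d in ds), q - 1)


def k_pgl(n, q):
    ds = divisors(gcd(n, q - 1))
    return exact_div(sum(jordan_totient(1, d) * k_gl(n // d, q) for d in ds), q - 1)


def k_psl(n, q):
    total = 0
    for d1 in divisors(q - 1):
        for d2 in divisors(q - 1):
            if n % (d1 * d2) == 0:
                total += (jordan_totient(1, d1) * jordan_totient(2, d2)
                          * k_gl(n // (d1 * d2), q))
    return exact_div(total, (q - 1) * gcd(n, q - 1))


for n, q in [(2, 3), (2, 5), (2, 7), (3, 4), (4, 5)]:
    print(n, q, k_sl(n, q), k_pgl(n, q), k_psl(n, q))
```

As a sanity check, the sketch reproduces the class numbers of the small groups $SL(2, 3)$, $PGL(2, 3) \cong S_4$ and $PSL(2, 5) \cong A_5$.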
For part 3, note from Macdonald's formula that k(SL(n, q)) q n−1 Since q is fixed, it is clear from Proposition 3.5 that only the d = 1 term contributes in the n → ∞ limit, yielding the result.
The following corollary concerns groups between SL(n, q) and GL(n, q) or between P SL(n, q) and P GL(n, q).
(2) Suppose that $PSL(n, q) \subseteq H \subseteq PGL(n, q)$, and let $j$ denote the index of $H$ in $PGL(n, q)$. Then
Proof. Let $H$ be as in part 1 of the corollary. Then $k(H) \ge k(GL(n, q))/j \ge q^{n-1}(q-1)/j$, where the first inequality is Lemma 2.1 and the second is the fact that $GL(n, q)$ has $q^{n-1}(q - 1)$ semisimple conjugacy classes. The inequality $k(H) \le \frac{q-1}{j} k(SL(n, q))$ comes from Lemma 2.1, and Proposition 3.6 yields the inequality $(q - 1)k(SL(n, q)) \le q^n + 3q^{n-1}$.
Let $H$ be as in part 2 of the corollary. Then $k(H) \ge k(PGL(n, q))/j \ge q^{n-1}/j$, where the first inequality is Lemma 2.1 and the second is the fact that $PGL(n, q)$ has at least $q^{n-1}$ conjugacy classes (clear from Macdonald's formula and the fact that $GL(n, q)$ has at least $q^{n-1}(q - 1)$ conjugacy classes). The inequality $k(H) \le \frac{\gcd(n, q-1)}{j} k(PSL(n, q))$ comes from Lemma 2.1, and the inequality $\gcd(n, q - 1)k(PSL(n, q)) \le q^{n-1} + 5q^{n-2}$ follows from Macdonald's formula for $k(PSL(n, q))$ and an analysis similar to that in Proposition 3.6.
We close this section with the following exact formula for the number of conjugacy classes of a group H between SL(n, q) and GL(n, q). It involves the quantity φ 2 defined earlier in this section.
Proposition 3.8. Suppose that SL(n, q) ⊆ H ⊆ GL(n, q) and let j denote the index of H in GL(n, q). Then Proof. As in [M1], to each conjugacy class of GL(n, q), there is associated a partition ν of n. To describe this recall that conjugacy classes of GL(n, q) are parametrized by associating to each monic irreducible polynomial p(x) over F q with non-zero constant term a partition; if the partition corresponding to p(x) has m i parts of size i, then it contributes deg(p)m i parts of size i to the partition ν. Throughout the proof we let c ν denote the number of conjugacy classes of GL(n, q) of type ν. We also let ν 1 , · · · , ν r denote the parts of ν. Given the partition ν, we determine the number of conjugacy classes of GL(n, q) of type ν in H, multiply it by the number of H classes into which each such class splits (this number depends only on ν) and then sum over all ν. Arguing as on pages 33-36 of [M1] shows that the number of conjugacy classes of GL(n, q) of type ν in H is gcd(j, ν 1 , · · · , ν r )c ν j and that each such class splits into gcd(j, ν 1 , · · · , ν r ) many H classes. Thus the total number of conjugacy classes of H is 1 j |ν|=n gcd(j, ν 1 , · · · , ν r ) 2 c ν .
Arguing as on pages 36-37 of [M1], this can be rewritten as 1 j
3.3. GU(n, q) and its relatives. The paper [MR] proves that $k(GU(n, q)) \le 8.26q^n$. Proposition 3.9 gives an asymptotic result.

for a universal constant $A$; one can take $A = 16$ for $q = 2$ and $A = 7$ for $q \ge 3$. Thus $\lim_{q \to \infty} k(GU(n, q))/q^n = 1$, and the convergence is uniform in $n$.
Proof. Wall [W] shows For the first assertion, use Lemma 3.2.
For the second assertion, the lower bound comes from the easily proved fact (essentially on page 35 of [W]) that $GU(n, q)$ has $q^n + q^{n-1}$ many semisimple conjugacy classes. For the upper bound, the assertion when $q = 2$ is immediate from the fact that $k(GU(n, q)) \le 8.26q^n$. For $q \ge 3$, recall that where the last inequality is an easy calculus exercise.
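The bounds quoted in this proof can be explored numerically. The Python sketch below assumes that Wall's generating function for the unitary groups has the form $\sum_{n \ge 0} k(GU(n, q))t^n = \prod_{i \ge 1} (1 + t^i)/(1 - qt^i)$; this form is an assumption of the sketch, as are the function names. It checks the semisimple lower bound $q^n + q^{n-1}$ and the bound $8.26q^n$, and compares $k(GU(n, 2))/2^n$ for large $n$ with the value $\prod_{i \ge 1}(1 + 2^{-i})/(1 - 2^{-i})$ obtained from the pole at $t = 1$ after rescaling $t$ by $q$.

```python
def k_gu_list(n_max, q):
    """Coefficients of t^0..t^n_max in prod_{i>=1} (1 + t^i)/(1 - q t^i).

    Under the assumed form of Wall's generating function, entry n is k(GU(n, q)).
    """
    coeffs = [1] + [0] * n_max
    for i in range(1, n_max + 1):
        for m in range(i, n_max + 1):        # multiply by 1/(1 - q t^i)
            coeffs[m] += q * coeffs[m - i]
        for m in range(n_max, i - 1, -1):    # multiply by (1 + t^i)
            coeffs[m] += coeffs[m - i]
    return coeffs


def limiting_ratio(q, terms=200):
    """prod_{i>=1} (1 + q^{-i})/(1 - q^{-i}), truncated after `terms` factors."""
    value = 1.0
    for i in range(1, terms + 1):
        x = float(q) ** (-i)
        value *= (1.0 + x) / (1.0 - x)
    return value


for q in (2, 3, 4, 5):
    classes = k_gu_list(30, q)
    for n in range(1, 31):
        assert q**n + q**(n - 1) <= classes[n]      # semisimple lower bound
        assert 100 * classes[n] <= 826 * q**n       # k(GU(n, q)) <= 8.26 q^n
    print(q, classes[:4])

ratio = k_gu_list(60, 2)[60] / 2**60
print(ratio, limiting_ratio(2))
```

The integer comparison `100 * k <= 826 * q**n` avoids floating point error when testing the bound $8.26q^n$.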

Remarks:
(1) The value of the limit in part 1 of Proposition 3.9 is 8.25... when q = 2.
(2) As in the remark after Proposition 3.5, the convergence of k(GU (n,q)) ) for any 0 < ǫ < 1/2. Indeed, subtracting off the simple pole at t = 1 from the generating function in Proposition 3.9 gives that is analytic for all |t| < q 1/2 , so the claim follows from Lemma 3.3. Macdonald [M1] derived useful formulas for k(SU (n, q)), k(P GU (n, q)) and k(P SU (n, q)). These involve the quantity where the product is over all primes dividing n. He showed that where the sum is over all pairs of divisors d 1 , d 2 of q + 1 such that d 1 d 2 divides n.
(2) $k(SU(n, q)) \le q^{n-1} + Aq^{n-2}$ for a universal constant $A$; one can take $A = 16$ for $q = 2$ and $A = 7$ for $q \ge 3$. Thus $\lim_{q \to \infty} k(SU(n, q))/q^{n-1} = 1$, and the convergence is uniform in $n$.
The lower bound in part 1 is immediate from Theorem 3.1. The upper bounds in parts 1 and 2 will be proved together. For $n \le 7$ the upper bounds are checked directly from Macdonald's formula for $k(SU(n, q))$. Next suppose that $q = 2$ and $n \ge 8$. Then Macdonald's formula for $k(SU(n, q))$ and the upper bound on $k(GU(n, q))$ give that $k(SU(n, 2)) \le \frac{8.26}{3}\left(2^n + 8 \cdot 2^{n/3}\right) \le 8.26 \cdot 2^{n-1}$.
Corollary 3.11 gives bounds on k(H) where H is a group between SU (n, q) and GU (n, q) or between P SU (n, q) and P GU (n, q).
(1) Suppose that SU (n, q) ⊆ H ⊆ GU (n, q) and that j is the index of H in GU (n, q).
where A is a universal constant. One can take A = 25 for q = 2 and A = 11 for q ≥ 3. (2) Suppose that P SU (n, q) ⊆ H ⊆ P GU (n, q) and that j is the index of H in P GU (n, q).
Proof. Let $H$ be as in part 1 of the corollary. Then $k(H) \ge k(GU(n, q))/j \ge q^{n-1}(q+1)/j$, where the first inequality is Lemma 2.1 and the second is the fact that $GU(n, q)$ has $q^{n-1}(q + 1)$ semisimple conjugacy classes. Lemma 2.1 gives that $k(H) \le \frac{q+1}{j} k(SU(n, q))$. The inequality $(q + 1)k(SU(n, q)) \le q^n + Aq^{n-1}$ with the stated $A$ values follows from part 2 of Proposition 3.10.
Let $H$ be as in part 2 of the corollary. Then $k(H) \ge k(PGU(n, q))/j \ge q^{n-1}/j$, where the first inequality is Lemma 2.1 and the second is the fact that $PGU(n, q)$ has at least $q^{n-1}$ conjugacy classes (clear from Macdonald's formula and the fact that $GU(n, q)$ has $q^{n-1}(q + 1)$ semisimple conjugacy classes). The inequality $k(H) \le \frac{\gcd(n, q+1)}{j} k(PSU(n, q))$ comes from Lemma 2.1, and the inequality $\gcd(n, q + 1)k(PSU(n, q)) \le q^{n-1} + 8q^{n-2}$ follows from Macdonald's formula for $k(PSU(n, q))$ and an analysis similar to that in Proposition 3.10.
3.4. Symplectic groups. We next consider symplectic groups. We treat the cases q odd and even separately.
Theorem 3.12. Let q be odd.
(2) k(Sp(2n, q)) ≤ q n + Aq n−1 for a universal constant A; one can take A = 30 for q = 3 and A = 12 for q ≥ 5. Thus lim q→∞ k(Sp(2n, q)) q n = 1, and the convergence is uniform in n.
(3) For q fixed, lim n→∞ Proof. The lower bound in part 1 is immediate from Theorem 3.1.
For q odd, Wall [W] shows that k(Sp(2n, q)) is the coefficient of t n in the generating function

Rewrite this generating function as
Since all coefficients of powers of t in the second infinite product are nonnegative, it follows that 1−qt i is the generating function for the number of conjugacy classes in GL(n, q). Hence the coefficient of t n−m in it is at most q n−m . It follows that For part 3, note that k(Sp(2n,q)) q n is the coefficient of t n in Then use Lemma 3.2.
Remark: The value of the limit in part 3 of Theorem 3.12 is 10.7... when q = 3.
Next we treat the symplectic group in even characteristic.
Theorem 3.13. Let q be even. (1) and the convergence is uniform in n.
For the first assertion, one combines work of Wall [W] and Andrews' solution of the L-M-W conjecture [A2] to obtain that An identity of Gauss (page 23 of [A1]) states that and the first assertion follows.
For the second assertion, combining part 1 with the same trick as in the odd characteristic case gives that The last step used Lemma 3.4. The lower bound in the second assertion is immediate from Theorem 3.1.
The proofs of parts 3 and 4 are analogous to the proofs of parts 2 and 3 in the odd characteristic case.
Remark: The value of the limit in part 4 of Theorem 3.13 is 15.1... when q = 2.
3.5. Orthogonal groups. This section gives the results for the orthogonal groups. We assume that the dimension of the underlying space is at least 3 (almost all of the results are valid for the two dimensional case as well, but the results are trivial in that case and the lower bounds do not always hold because the semisimple rank is 0).
First we treat the case of even dimension with q odd.
Theorem 3.14. Let q be odd. ( and the convergence is uniform in n. Proof. For the lower bound in part 1, Theorem 3.1 gives that SO ± (2n, q) has at least q n semisimple classes, and at most two of these can fuse into one class in O ± (2n, q). For the upper bound, clearly . By upper bounding each of these terms, we will upper bound k(O ± (2n, q)).
Wall [W] shows that Rewrite this generating function as Arguing as in the proofs for the symplectic cases and using Lemma 3.4, the coefficient of t 2n is at most Wall [W] shows that k Since this is analytic for t < q −1 + ǫ, Lemmas 3.3 and 3.4 imply an upper bound of Combining this with the previous paragraph gives that k(O ± (2n, q)) ≤ 9.5q n . For part 2, the q = 3 case is immediate from part 1. For q ≥ 5, the upper bound on k(O + (2n,q))+k(O − (2n,q)) q n in the proof of part 1 and the lower bound The result follows from Lemma 3.4 (as in the unitary case) and basic calculus.
For the third assertion, By Lemma 3.2, as n → ∞ this converges to Since this is analytic for |t| < q −1/2 , it follows from Lemma 3.3 that Remark: The value of the limit in part 3 of Theorem 3.14 is 8.14... when q = 3.
To treat even dimensional special orthogonal groups in odd characteristic, the following lemma will be helpful.
Lemma 3.15. Let q be odd and let G = SO ± (n, q).
The following are equivalent:
(1) $C = g^H$;
(2) $g$ leaves invariant an odd dimensional nondegenerate space $W$;
(3) some Jordan block of $g$ corresponding to either the polynomial $z + 1$ or the polynomial $z - 1$ has odd size.
If all Jordan blocks of $g$ corresponding to both the polynomials $z \pm 1$ have even size, then $g^H$ is the union of two conjugacy classes of $G$.
Proof. Since [H : G] = 2, the last statement follows from the equivalence of the first three conditions. Suppose that C = g H . It follows that g centralizes some element x ∈ H \ G. Raising x to an odd power, we may assume that the order of x is a power of 2 and in particular that x is semisimple. Since det x = −1, it follows that the −1 eigenspace of x is nondegenerate and odd dimensional. Thus (2) holds.
Conversely, assume (2). Taking x to be −1 on W and 1 on W ⊥ shows that C H (g) is not contained in G, whence (1) holds. Also, the subspace of W corresponding to either the z − 1 or z + 1 space is odd dimensional, whence some Jordan block has odd size. Thus (2) implies (3).
Finally assume (3). By induction, we may assume that g acts indecomposably (i.e. preserves no nontrivial orthogonal decomposition on the natural module). If n is odd, then clearly (2) holds. So we may assume that n is even. Replacing g by −g (if necessary), we may assume that g is unipotent.
By [LSe1,Theorem 2.12], it follows that g either is a single Jordan block of odd size or has two Jordan blocks of even size. Since (3) holds, the latter case cannot hold. Thus, g consists of a single Jordan block of odd size, whence (2) holds.
(4) k(SO ± (2n, q)) ≤ q n + Aq n−1 for a universal constant A; one can take A = 20 for q = 3 and A = 8 for q ≥ 5. Thus and the convergence is uniform in n.
Proof. Clearly k(SO + (2n, q)) + k(SO − (2n, q)) = 2A + B, where A is the sum over O + (2n, q) and O − (2n, q) of the number of classes which have determinant 1 and split into two SO classes, and B is the sum over O + (2n, q) and O − (2n, q) of the number of classes which have determinant 1 and do not split into two SO classes. Applying Lemma 3.15 and arguing as on pages 41-2 of [W] gives that A is the coefficient of .
(The factor of (1 − t 4i ) −2 comes from the fact that the z ± 1 partitions have only even parts which must occur with even multiplicity, and the other factor is precisely Wall's F + 0 (t)). To solve for B, note that A + B is the sum over O + (2n, q) and O − (2n, q) of the number of classes which have determinant 1. Such classes correspond to elements where the z + 1 piece has even size, so arguing as on pages 41-2 of [W] (using his notation) gives that A + B is the coefficient of t 2n in Calculating 2A + B completes the proof of the first part of the theorem.

For the second assertion, apply Lemma 3.15 and argue as on pages 41-2 of [W] (using his notation) to conclude that the O + (2n, q) number minus the O − (2n, q) number of conjugacy classes which have determinant 1 and split is the coefficient of Again applying Lemma 3.15 and arguing as on pages 41-2 of [W], one sees that the O + (2n, q) number minus the O − (2n, q) number of conjugacy classes which have determinant 1 and do not split is 0. The second assertion follows.

The lower bound in part 3 is immediate from Theorem 3.1. For the upper bound, it follows from part 1 and elementary manipulations that It is not difficult to see that the expression in square brackets in the previous equation has all coefficients non-negative when expanded as a power series in t (use the fact that the coefficient of t 4i−2 in (1 + t 2i−1 ) 4 is 6). Hence one can argue as in Theorem 3.14 to conclude that k(SO + (2n, q)) + k(SO − (2n, q)) is at most q n multiplied by evaluated at t = 3 −.5 . This is at most 9.3q n .

By part 2 and the fact that 2 i≥1 1 (1+t i )(1−qt 2i ) is analytic for |t| < q −1 + ε, it follows from Lemma 3.3 that k(SO + (2n, q)) − k(SO − (2n, q)) is at most This, together with the previous paragraph, completes the proof of the third assertion.
For part 4, the q = 3 case is immediate from part 1. For q ≥ 5, the proof of part 3 showed that k(SO ± (2n,q)) q n is at most Using Lemma 3.4 (as in the unitary case), the result follows from basic calculus.
The proof of part 5 is nearly identical to the proof of part 3 in Theorem 3.14.
Remark: The value of the limit in part 5 of Theorem 3.16 is 4.6... when q = 3. We next consider the odd dimensional case.
Theorem 3.17. Let q be odd.
Proof. Lusztig [Lus2] proves that k(SO(2n + 1, q)) is the coefficient of t n in the generating function  .
By a result of Gauss (page 23 of [A1]), this is equal to Using the same trick as in the unitary and symplectic cases one sees that This is maximized for q = 3 for which Lemma 3.4 yields an upper bound of 7.1q n .
The lower bound follows by Steinberg's result on the number of semisimple classes -see Theorem 3.1.
The second part follows from part 1 (use Lemma 3.4 as in the unitary case and basic calculus), and the third part is proved using the same method used for the symplectic groups.
Remark: The value of the limit in part 3 of Theorem 3.17 is 7.0.. when q = 3.
We now state similar results for the groups Ω ± (n, q). The proofs of these results are somewhat long and are in [FG5].
Theorem 3.18 treats the even dimensional groups, while Theorem 3.19 treats the odd dimensional case.
Note that for the even dimensional special orthogonal groups, one will be a direct product of its center and Ω, and so the answer for Ω is precisely 1/2 the answer for SO (these are precisely the cases not mentioned in the next result).
Theorem 3.18. Let q be odd.
The following result is for odd dimensional groups.
We now turn to orthogonal groups in characteristic 2. Since the odd dimensional orthogonal groups are isomorphic to symplectic groups, we need only consider the even dimensional case.
Theorem 3.21. Let q be even.
(3) k(O ± (2n, q)) ≤ q n /2 + Aq n−1 for a universal constant A; one can take A = 29 for q = 2 and A = 9 for q ≥ 4. Thus and the convergence is uniform in n. (4) For fixed q, lim n→∞

Proof. Combining [W] and [A2] shows that k(O + (2n, q)) + k(O − (2n, q)) is the coefficient of t n in the generating function The first assertion now follows from the following special case of Jacobi's triple product identity (page 21 of [A1]): Note that when the numerator of the generating function of part 1 is expanded as a series in t, all coefficients are positive. Arguing as for the unitary and symplectic groups gives that Since this is analytic for |t| < q −1 + ε, Lemma 3.3 gives that k (O + Combining this with the previous paragraph yields the upper bound in part 2. For the lower bound in part 2, SO ± (2n, q) is simply connected. Thus by Steinberg's theorem, the number of semisimple classes is exactly q n and so there are at least q n /2 in O ± (2n, q) (because the index is 2, at most two classes fuse into one).
The third and fourth parts are proved by the same method as in Theorem 3.14.
Remark: The value of the limit in part 4 of Theorem 3.21 is 12.7.. when q = 2.
Finally, we treat even characteristic special orthogonal groups.
Theorem 3.22. Let q be even.
(3) q n ≤ k(SO ± (2n, q)) ≤ 14q n . (4) k(SO ± (2n, q)) ≤ q n + Aq n−1 for a universal constant A; one can take A = 26 for q = 2 and A = 5 for q ≥ 4. Thus lim q→∞ k(SO ± (2n, q))/q n = 1, and the convergence is uniform in n. (5) For fixed q, lim n→∞ k(SO ± (2n,q))/q n is equal to

Proof. For part 1, it follows from [A2] and [Lus2] that if k 1 (SO ± (2n, q)) is the number of unipotent conjugacy classes of SO ± (2n, q), then 1 + n≥1 t n k 1 (SO + (2n, q)) + k 1 (SO − (2n, q))

We claim that a conjugacy class of O ± (2n, q) with empty z − 1 piece splits in SO ± (2n, q), and that a conjugacy class of O ± (2n, q) with nonempty z − 1 piece splits in SO ± (2n, q) if and only if a unipotent element with that z − 1 piece splits in the SO (possibly of lower dimension) which contains it. Let x ∈ O ± (2n, q). Write V = V 1 ⊥ V 2 where V 1 is the kernel of (x − 1) 2n . Let x i denote the element of O(V i ) that is the restriction of x to V i . Thus, the centralizer of x is the direct product of the centralizers of x i in O(V i ). Working over the algebraic closure we see that the centralizer of x 2 in O(V 2 ) is isomorphic to the centralizer of some element of GL(d) where 2d = dim V 2 . In particular, the centralizer of x 2 is connected and so is contained in SO(V 2 ). Thus, if V 1 = 0, the class of x splits. If V 1 ≠ 0, then the class of x splits if and only if the class of x 1 splits in O(V 1 ). This proves the claim.
where the term $\prod_{i \ge 1} \frac{1 - t^i}{1 - q t^i}$ is the even characteristic analog of F + 0 (t) from page 41 of Wall [W] (and is derived the same way). Plugging in the generating function for k ± 1 (SO(2n, q)) and using an identity of Gauss (page 21 of [A1]) that part 1 follows by elementary simplifications.
For part 2, arguing as in part 1 gives that where the term i≥1 (1−t i ) (1−qt 2i ) is the even characteristic analog of F − 0 (t) from page 42 of Wall [W] (and is derived the same way). Page 153 of [Lus2] gives that 2 + n≥1 t n k 1 (SO + (2n, q)) − k 1 (SO − (2n, q)) = 2 so part 2 follows. The lower bound in part 3 is immediate from Theorem 3.1. For the upper bound, first note from part 1 that k(SO + (2n, q)) + k(SO − (2n, q)) is the Arguing as in Theorem 3.12, shows that k(SO + (2n, q)) + k(SO − (2n, q)) is at most q n multiplied by evaluated at t = 1/q. This is maximized at q = 2, and is at most 15. Using the fact that k(SO ± (2n, q)) ≥ q n , it follows that k(SO ± (2n, q)) ≤ 14q n .
By part 2 and the fact that 2 i≥1 1 (1+t i )(1−qt 2i ) is analytic for |t| < q −1 +ǫ, it follows from Lemma 3.3 that k(SO + (2n, q)) − k(SO − (2n, q)) is at most This, together with the previous paragraph, completes the proof of the third assertion.
For part 4, the proof of part 3 yields that k(SO ± (2n,q)) q n is at most Using Lemma 3.4 (as in the unitary case), this upper bound is at most 2 + A q for a universal constant A. Since k(SO ± (2n, q)) ≥ q n by part 3, the result follows. The proof of part 5 is nearly identical to the proof of part 3 in Theorem 3.14.
Remark: The limit in part 5 of Theorem 3.22 is 7.4.. when q = 2.

Tables of Conjugacy Class Bounds
We tabulate some of the results in the previous section and summarize the corresponding results for exceptional groups. There are exact formulas for these class numbers and we refer the reader to [Lu2]. See also §8.18 of [Hu] and the references therein.
Here are the results for the exceptional groups. We give a polynomial upper bound for each type of exceptional group (this upper bound is valid for both the adjoint and simply connected forms of the group).

Conjugacy Classes in Almost Simple Groups
We use the results of the previous sections to obtain bounds on the class numbers for Chevalley groups. These bounds are close to best possible, but we improve them a bit in [FG5]. These bounds are used in [LOST] to finish the proof of the Ore conjecture and are more than sufficient for our proof of the Boston-Shalev conjecture on derangements.
We first prove Theorem 1.1.
Proof. Since the number of semisimple classes in G F is at least q r , (2) implies (3). Since any F -stable class intersects G F , the number of F -stable classes is at most k(G F ) and so (1) implies (4). Thus, it suffices to prove (1) and (2). First assume that G has type A. Apply Theorems 3.6 and 3.10 and Corollaries 3.7 and 3.11 to conclude that (1) and (2) hold in this case.
If G is exceptional, then the results follow by the Table in the previous section. Now assume that G has type B, C or D. Since the center has order at most 4 in all cases, to prove (1), it suffices to consider any form of the group (this may alter the constant but only by a bounded amount, and in fact it changes only very little). Moreover, in characteristic 2, the adjoint and simply connected groups are the same and so there is nothing to prove. So we may assume that the characteristic of the field is odd.
Suppose that G has type B. The results have been proved for the group of adjoint type with constants 7.1 and 19 respectively. Suppose that G is simply connected. Then k(G) ≤ 2k(Ω), whence the results hold by Theorem 3.19.
Next suppose that G has type C. We have already proved the result for the simply connected group. So assume that G is the adjoint form. Let H = Sp(2r, q). Then k(H/Z(H)) < k(H) ≤ 10.8q r , whence k(G) ≤ 21.6q r , and so (1) holds.
The number of semisimple classes in H/Z(H) is at most (q r + t)/2, where t is the number of H-semisimple classes invariant under multiplication by Z(H). These correspond to monic polynomials of degree r in x 2 with the set of roots invariant under inversion. The number of such is q (r−1)/2 if r is odd and q r/2 if r is even. Thus, k(H/Z(H)) ≤ (1/2)q r + 31q r−1 . Since G/H has order 2, this implies that k(G) ≤ q r + 62q r−1 , and so (2) holds.
Finally, consider the case that G has type D (and we may assume that r ≥ 4). Let H = P Ω ± (2r, q) be the simple group corresponding to G. Using Theorem 3.18, we see that k(P Ω ± (2r, q)) ≤ 6.8q r , whence a straightforward argument shows that k(G) ≤ 4k(H) ≤ 27.2q r .
Note that for simply connected groups or groups of adjoint type, we can do a bit better (and in particular for the simple groups). This follows by the proof above.
Corollary 5.1. Let G be a simply connected simple algebraic group of rank r over a field of positive characteristic. Let F be a Steinberg-Lang endomorphism of G with G F a finite Chevalley group over the field F q . Then (1) k(G F ) ≤ 15.2q r .
Similarly, the proof of Theorem 1.1 also shows: Corollary 5.2. Let G be a simple algebraic group of adjoint type of rank r over a field of positive characteristic. Let F be a Steinberg-Lang endomorphism of G with G F a finite Chevalley group over the field F q . Let S be the socle of G F and assume that S ≤ H ≤ G. Then (1) k(H) ≤ 27.2q r .
(3) The number of non-semisimple classes of H is at most 68q r−1 .
Corollary 5.2 also leads to the following result which was used in [GR]. Proof. For all cases other than the alternating and symmetric groups, this follows from Theorem 5.1 together with bounds on the size of the outer automorphism groups and a computer computation for small cases. For the symmetric groups, the result follows without difficulty from the two bounds $k(S_n) \le \frac{\pi}{\sqrt{6(n-1)}}\, e^{\pi \sqrt{2n/3}}$ ([VW], p. 140) and $n! \ge (2\pi)^{1/2}\, n^{n+1/2}\, e^{-n+1/(12n+1)}$ ([Fe], p. 52). By Corollary 2.7, k(A n ) ≤ k(S n ), and the two bounds also imply that k(S n ) ≤ n!
Typically, the .41 can be replaced by a much smaller number but the example of G = S 5 shows that one cannot do much better.
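The role of S 5 in this remark can be checked numerically. The following Python sketch is only an illustration: it assumes that the bound under discussion has the form k(G) ≤ |G|^.41 (which the remark suggests but which is not restated here), it uses the standard fact that k(S n ) is the number of partitions of n, and the helper names `partition_numbers`, `class_exponent` and `partition_upper_bound` are hypothetical.

```python
import math


def partition_numbers(limit):
    """Return [p(0), ..., p(limit)], where p(n) = k(S_n) counts partitions of n."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def class_exponent(n, p):
    """Return log k(S_n) / log |S_n|, the exponent c with k(S_n) = (n!)^c."""
    return math.log(p[n]) / math.log(math.factorial(n))


def partition_upper_bound(n):
    """Evaluate pi / sqrt(6(n-1)) * exp(pi * sqrt(2n/3)), the bound cited from [VW]."""
    return math.pi / math.sqrt(6 * (n - 1)) * math.exp(math.pi * math.sqrt(2 * n / 3))


p = partition_numbers(60)
# Exponents for the almost simple range n >= 5 (assumption: only these n matter here).
exponents = {n: class_exponent(n, p) for n in range(5, 61)}
worst_n = max(exponents, key=exponents.get)
print(worst_n, exponents[worst_n])
```

In this range the largest exponent occurs at n = 5, where k(S 5 ) = 7 and |S 5 | = 120, and it lies just below .41.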
We now consider general almost simple Chevalley groups G. Now we have to deal with all types of outer automorphisms. The proof we give below shows that in fact for large q, k(G) is close to q r /e where e = [Inndiag (S) : We first need a lemma. See [GLS] for basic results about automorphisms of Chevalley groups.
Lemma 5.4. Let S be a simple Chevalley group over the field of q elements of rank r ≥ 2. Let x ∈ Aut(S) with x not an inner diagonal automorphism of S. Let S ≤ H ≤ Inndiag (S). Then the number of x stable classes in H is at most Dq r−1 for some universal constant D.
Proof. We may assume that x has prime order p modulo the group of inner diagonal automorphisms.
By Corollary 5.2, it suffices to consider semisimple classes of S (since the number of nonsemisimple classes is at most Aq r−1 for a universal constant A). If x is in the coset of a Lang-Steinberg automorphism, then Shintani descent gives a much better bound. In any case, the stable classes will be in bijection with those in the centralizer and so there are at most Cq r/2 invariant classes, whence the bound holds in this case.
The remaining cases are where x induces a graph automorphism. We lift to the central cover T of S and let H 0 be the lift of H. It suffices to prove the result for H 0 . By considering irreducible representations of T , we see that the number of stable semisimple classes in T is q r ′ where r − 1 ≥ r ′ is the number of orbits of x on the Dynkin diagram of S. Similarly, for each x-invariant coset of H 0 /T , there are at most q r ′ invariant classes in each of those cosets. If S is not of type A, there are at most 4 cosets. If S is of type A, there are at most 2 invariant cosets. Thus, there are at most 4q r−1 invariant semisimple classes.
We can now prove: Theorem 5.5. Let G be almost simple with socle S that is a Chevalley group defined over the field of q elements and has rank r. There is an Proof. Let H be the subgroup of G consisting of inner diagonal automorphisms. Let X be the full group of inner diagonal automorphisms of S. Then X = Y F where Y is the corresponding simple algebraic group of adjoint type, and so by Corollary 2.5, the number of semisimple classes in H is precisely [X : H] −1 times the number of semisimple classes in X. The number of non-semisimple classes in H is at most [X : H] times the number of non-semisimple classes of X. If G is not of type A, then [X : H] ≤ 4 and so by Theorem 1.1, k(H) ≤ q r + Eq r−1 for some universal constant E. If G is of type A, we apply Corollaries 3.7 and 3.11 to conclude this as well.
First consider the case that S = P SL(2, q). So H = P SL(2, q) or P GL(2, q). If x ∈ G \ H, then x can be taken to be a field automorphism of order e ≥ 2.
If H = P GL(2, q), the number of stable semisimple classes is q 1/e + 1 and there is one (stable) unipotent class (the stable classes are precisely those of P GL(2, q 1/e )). If H = P SL(2, q), there are even fewer stable classes. Thus, the number of conjugacy classes in the coset xH is at most q 1/e + 2, whence the result holds.
So we may assume that r > 1. Let x ∈ G \ H. By the previous result, the number of x-stable classes in H is at most Eq r−1 for some universal constant E. Thus by Lemma 2.2, the number of conjugacy classes in xH is at most Eq r−1 . Since [G : H] ≤ 6 log q, it follows that k(G) ≤ q r + 6E(log q)q r−1 , and the result follows.
With a bit more effort, one can remove the log q factor in the previous result. Keeping track of the constants in the proof above gives Corollary 1.2.

Minimum Centralizer Sizes for the Finite Classical Groups
This section gives lower bounds on centralizer sizes in finite classical groups, and hence upper bounds on the size of the largest conjugacy class in a finite classical group. Formulas for the conjugacy class sizes go back to Wall [W], but since they are quite complicated polynomials in q, some effort is required to extract explicit bounds. The bounds presented here hold for all values of n and q and are also applied in [FG2] and [Sh].
The following standard notation about partitions will be used. Let λ be a partition of some non-negative integer |λ| into parts λ 1 ≥ λ 2 ≥ · · · . Let m i (λ) be the number of parts of λ of size i, and let λ ′ be the partition dual to λ in the sense that λ ′ i = m i (λ) + m i+1 (λ) + · · · . It is also useful to define the diagram associated to λ by placing λ i boxes in the ith row. We use the convention that the row index i increases as one goes downward. So the diagram of the partition (5441) is

□ □ □ □ □
□ □ □ □
□ □ □ □
□

and λ ′ i can be interpreted as the size of the ith column. The notation (u) m will denote (1 − u)(1 − u/q) · · · (1 − u/q m−1 ). This section freely uses Lemma 3.4 from Section 3.
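The partition notation can be made concrete with a short Python sketch. This is an illustration only: the function names `conjugate`, `multiplicity` and `q_pochhammer` are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def multiplicity(partition, i):
    """m_i(lambda): the number of parts of the partition equal to i."""
    return partition.count(i)


def conjugate(partition):
    """Dual partition: its i-th entry counts the parts of size at least i,
    which is the length of the i-th column of the diagram."""
    if not partition:
        return []
    return [sum(1 for part in partition if part >= i)
            for i in range(1, max(partition) + 1)]


def q_pochhammer(u, m, q):
    """(u)_m = (1 - u)(1 - u/q) ... (1 - u/q^(m-1)), computed exactly."""
    result = Fraction(1)
    for j in range(m):
        result *= 1 - Fraction(u) / q**j
    return result


lam = [5, 4, 4, 1]
dual = conjugate(lam)
print(dual)
```

For the partition (5441) the columns of the diagram have sizes 4, 3, 3, 3, 1, and each entry of the dual agrees with the sum m i (λ) + m i+1 (λ) + · · · from the definition.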
6.1. The general linear groups. The following result about partitions will be helpful.
Lemma 6.1. Let λ be any partition. Then for q ≥ 2, Proof. Define a function f on partitions by f (λ) = q . Let τ be a partition obtained from λ by moving a box from a row of length i to a row of length j ≥ i+1. The idea is to show that f (λ) ≥ f (τ ). The result then follows because a sequence of such moves transforms any partition into the one-row partition and f evaluated on this partition is q |λ| (1 − 1/q).

One checks that
as desired. For the i = 1 case, the only difference in the above argument is that the term (1 − 1/q m i−1 (λ)+1 ) does not appear.
Lemma 6.2 is well-known and is proved by counting the non-zero elements in a degree r extension of F q by the degrees of their minimal polynomials.
Lemma 6.2. Let N (q; d) be the number of monic degree d irreducible polynomials over the finite field F q , disregarding the polynomial z. Then $\sum_{d \mid r} d\, N(q; d) = q^r - 1$.

Lemma 6.3 will also be needed.
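Lemma 6.2 is easy to test numerically. The Python sketch below is an illustration under one assumption that is not stated in the text: it computes N (q; d) from the classical Möbius-inversion count of monic irreducible polynomials and then removes the polynomial z in degree 1. The function names are hypothetical.

```python
def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def num_irreducible(q, d):
    """N(q; d): monic irreducible polynomials of degree d over F_q, excluding z."""
    count = sum(mobius(e) * q ** (d // e) for e in divisors(d)) // d
    return count - 1 if d == 1 else count


def lemma_6_2_sum(q, r):
    """Sum of d * N(q; d) over the divisors d of r."""
    return sum(d * num_irreducible(q, d) for d in divisors(r))


for q in (2, 3, 4, 5):
    for r in range(1, 9):
        print(q, r, lemma_6_2_sum(q, r), q**r - 1)
```

Each printed row shows the divisor sum next to q r − 1, which is the number of non-zero elements of the degree r extension that the counting argument refers to.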
where the final inequality uses that s ≥ 2.
Proof. It is well known that conjugacy classes of GL(n, q) are parametrized by Jordan canonical form. That is, for each monic irreducible polynomial φ ≠ z, one picks a partition λ(φ) subject to the constraint $\sum_{\phi} \deg(\phi)\,|\lambda(\phi)| = n$.
The corresponding centralizer sizes are well known ([M2], page 181) and can be rewritten as To minimize this expression, suppose that for each polynomial φ one knows the size |λ(φ)| of λ(φ). Lemma 6.1 shows that λ(φ) should be taken to be a one-row partition which would contribute $q^{|\lambda(\phi)| \cdot \deg(\phi)} (1 - 1/q^{\deg(\phi)})$. Letting r be such that $\sum_{i=1}^{r} i\, N(q; i) \ge n$, it follows that the minimal centralizer size is at least $q^n \prod_{i=1}^{r} (1 - 1/q^i)^{N(q;i)}$. Observe that r can be taken to be the smallest integer such that $q^r - 1 \ge n$, because by Lemma 6.2 Since $N(q; i) \le q^i/i$, the minimum centralizer size is at least $q^n \prod_{i=1}^{r} (1 - 1/q^i)^{q^i/i}$. Lemma 6.3 gives that the minimum centralizer size is at least $q^n (1 - 1/q) / e^{(1 + 1/2 + \cdots + 1/r)}$. To finish the proof use the bounds 1 + 1/2 + · · · + 1/r ≤ 1 + log(r) and take r = 1 + log q (n + 1).
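The choice of r in this proof can be sanity-checked by machine. The following Python sketch is illustrative only; it assumes the classical Möbius-inversion count for N (q; i), and the names `smallest_r` and `degree_capacity` are hypothetical. It confirms that the smallest r with q r − 1 ≥ n leaves enough room among the irreducible polynomials of degree at most r, and that this r does not exceed 1 + log q (n + 1).

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def num_irreducible(q, d):
    """N(q; d): monic irreducible polynomials of degree d over F_q, excluding z."""
    count = sum(mobius(e) * q ** (d // e) for e in divisors(d)) // d
    return count - 1 if d == 1 else count


def smallest_r(q, n):
    """Smallest positive integer r with q^r - 1 >= n."""
    r = 1
    while q**r - 1 < n:
        r += 1
    return r


def degree_capacity(q, r):
    """Sum of i * N(q; i) for 1 <= i <= r."""
    return sum(i * num_irreducible(q, i) for i in range(1, r + 1))


for q in (2, 3, 5):
    for n in (1, 2, 10, 100, 1000):
        r = smallest_r(q, n)
        print(q, n, r, degree_capacity(q, r))
```

The inequality r ≤ 1 + log q (n + 1) is equivalent to q r−1 ≤ n + 1, which holds because r is minimal; the test below checks it in exact integer arithmetic.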
Remark: Since the number of conjugacy classes of GL(n, q) is less than q n , one might hope that the largest conjugacy class size is at most |GL(n,q)| cq n where c is a constant. The proof of Theorem 6.4 shows this to be untrue. Indeed, the infinite product i (1 − 1/q i ) N (q;i) vanishes. This can be seen by setting u = 1/q in the identity (which holds since the coefficient of u n on both sides counts the number of monic degree n polynomials with non-vanishing constant term).
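The counting statement behind this remark can be tested directly. The Python sketch below is an illustration under stated assumptions: it takes N (q; i) from the classical Möbius-inversion formula, expands the product of (1 − u^i)^(−N (q; i)) as a power series in u up to a fixed degree, and compares each coefficient with q^n − q^(n−1), which is the number of monic degree n polynomials over F q with non-vanishing constant term. All function names are hypothetical.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def num_irreducible(q, d):
    """N(q; d): monic irreducible polynomials of degree d over F_q, excluding z."""
    count = sum(mobius(e) * q ** (d // e) for e in divisors(d)) // d
    return count - 1 if d == 1 else count


def product_coefficients(q, max_degree):
    """Coefficients of prod_i (1 - u^i)^(-N(q; i)) up to u^max_degree."""
    series = [1] + [0] * max_degree
    for i in range(1, max_degree + 1):
        for _ in range(num_irreducible(q, i)):
            # In-place multiplication of the truncated series by 1 / (1 - u^i).
            for k in range(i, max_degree + 1):
                series[k] += series[k - i]
    return series


for q in (2, 3, 4):
    coefficients = product_coefficients(q, 6)
    expected = [1] + [q**n - q ** (n - 1) for n in range(1, 7)]
    print(q, coefficients, expected)
```

Since the coefficients grow like q^n, the series diverges at u = 1/q, which is consistent with the vanishing of the infinite product stated in the remark.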
6.2. The unitary groups. The method for the finite unitary groups is similar to that for GL(n, q). As usual, we view U (n, q) as a subgroup of GL(n, q 2 ).
Lemma 6.5. Let λ be any partition. Then for q ≥ 2, Proof. The argument is the same as for Lemma 6.1. Using the same notation as in that proof, except that now f (λ) = q Given a polynomial φ with coefficients in F q 2 and non-vanishing constant term, define a polynomial φ̃ by where φ q raises each coefficient of φ to the qth power. A polynomial φ is called self-conjugate if φ̃ = φ and an element in an extension field of F q 2 is called self-conjugate if its minimal polynomial over F q 2 is self-conjugate.
Lemma 6.6. Suppose that r is odd. Then the number of nonzero non-selfconjugate elements in F q 2r viewed as an extension of F q 2 is q 2r − q r − 2.
Proof. Theorem 9 of [F2] shows that the number of self-conjugate elements of degree i over F q 2 is 0 if i is even and is $\sum_{d \mid i} \mu(d)\,(q^{i/d} + 1)$ if i is odd, where µ is the Moebius function. Thus Moebius inversion implies that the total number of self-conjugate elements of F q 2r is $\sum_{i \mid r} \sum_{d \mid i} \mu(d)\,(q^{i/d} + 1) = q^r + 1$, which implies the result.
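The Moebius inversion step can be verified for small cases. The Python sketch below is an illustration with hypothetical function names. It evaluates the double sum over divisors for odd r and then subtracts the result from the number of non-zero elements of F q 2r to recover the count in Lemma 6.6.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result


def self_conjugate_count(q, r):
    """Double sum over i | r and d | i of mu(d) * (q^(i/d) + 1), for odd r."""
    return sum(mobius(d) * (q ** (i // d) + 1)
               for i in divisors(r)
               for d in divisors(i))


def non_self_conjugate_count(q, r):
    """Non-zero elements of F_{q^(2r)} minus the self-conjugate ones."""
    return (q ** (2 * r) - 1) - self_conjugate_count(q, r)


for q in (2, 3, 4, 5):
    for r in (1, 3, 5, 9, 15):
        print(q, r, self_conjugate_count(q, r), non_self_conjugate_count(q, r))
```

Only odd r are used, so every divisor i of r is odd and the case of even degree, which contributes 0, never arises.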
Theorem 6.7. The smallest centralizer size of an element of U (n, q) is at least $q^n \left( \frac{1 - 1/q^2}{e\,(2 + \log_q(n+1))} \right)^{1/2}$.

Proof. For n = 1 this is clear so suppose that n > 1. The conjugacy classes of U (n, q) and their sizes were determined in [W]. They are parametrized by the following analog of Jordan canonical form. For each monic irreducible polynomial φ ≠ z, one picks a partition λ(φ) subject to the two constraints that $\sum_{\phi} \deg(\phi)\,|\lambda(\phi)| = n$ and λ(φ) = λ(φ̃). The corresponding centralizer sizes are due to Wall and can be usefully rewritten as Here the q → q 2 means (in the second product over polynomials) to replace all occurrences of q by q 2 . Note that the second product is over unordered conjugate pairs of non self-conjugate monic irreducible polynomials. Note that the bound in Lemma 6.5 is greater than q |λ| whereas the bound in Lemma 6.1 is less than q |λ| . Hence the minimum size centralizer will correspond to a conjugacy class whose characteristic polynomial has only non self-conjugate irreducible polynomials as factors. Let M̃ (q; i) denote the number of unordered pairs {φ, φ̃} where φ is monic non self-conjugate and irreducible of degree i with coefficients in F q 2 . Then a lower bound for the smallest centralizer size is $q^n \prod_{i=1}^{r} (1 - 1/q^{2i})^{\tilde{M}(q;i)}$ where r is such that $\sum_{i=1}^{r} 2i\, \tilde{M}(q; i) \ge n$. Take r to be odd, and observe by Lemma 6.6 that and that q 2r − q r − 2 ≥ n if q r ≥ n + 1 (since n > 1). Since $\tilde{M}(q; i) \le \frac{q^{2i}}{2i}$, the smallest centralizer size is at least $q^n \prod_{i=1}^{r} (1 - 1/q^{2i})^{q^{2i}/(2i)}$. Arguing as in the general linear case and applying Lemma 6.3 proves the theorem.
6.3. Symplectic and orthogonal groups. To begin the study of minimum centralizer sizes in symplectic and orthogonal groups, we treat the case of elements whose characteristic polynomial is (z ± 1) n . This will be done by two different methods. The first approach uses algebraic group techniques and gives the best bounds. The second approach is combinatorial but of interest as it involves a new enumeration of unipotent elements in orthogonal groups. Note that there is no need to consider odd dimensional orthogonal groups in even characteristic, as these are isomorphic to symplectic groups.
(1) The minimum centralizer size of an element in the group Sp(2n, q) with characteristic polynomial (z±1) 2n is at least q n .
Proof. We first work in the ambient algebraic group G and connected component H. Let g ∈ G(q). Let B be a Borel subgroup of H normalized by g and U its unipotent radical. First suppose that g ∈ H. It suffices to show that C U (g) has dimension at least r, the rank of H. For then the rational points in C U (g) have order a multiple of q r as required (cf [FG1]). The centralizer in B of a regular unipotent element in B has dimension exactly r in B. Since these elements are dense in U , the same is true for any such element. Now suppose that g is not in H. This only occurs in even characteristic with G an orthogonal group. Now G embeds in a symplectic group L of the same dimension. Let A be a maximal unipotent subgroup of L containing U . Note that gU contains regular unipotent elements of L and so the subset of gU consisting of regular unipotent elements is dense in gU .
We claim that C U (g) has dimension at least r − 1. Once we have established that claim, it follows as above that the centralizer in H(q) of g is divisible by q r−1 as required. Since g is not in H, its centralizer in G(q) has order at least twice as large. Since the regular unipotent elements in gU are dense, it suffices to prove the claim for such an element.
Since g is a regular unipotent element of L, it follows that C L (g) = C A (g) has dimension r. On the other hand, we see also that the set of orthogonal groups containing g is a 1-dimensional variety (the orthogonal groups containing g are in bijection with the g-invariant hyperplanes of the orthogonal module for L that do not contain the L fixed space; since any such hyperplane must contain the image of g − 1 which has codimension 2, we see the set of hyperplanes is a 1-dimensional variety). Now C U (g) is precisely the stabilizer of the hyperplane corresponding to H and so has codimension at most 1 in C A (g). Thus, dim C U (g) ≥ r − 1 (in fact equality holds). This proves the claim and completes the proof.
We now give a more combinatorial approach to lower bounding centralizer sizes of elements whose characteristic polynomial is (z ± 1) n ; this is complementary and yields different information. A crucial step in this approach is counting the number of unipotent elements in symplectic and orthogonal groups. Steinberg (see [C] for a proof) showed that if G is a connected reductive group and F : G → G is a Frobenius map, then the number of unipotent elements of G F is the square of the order of a p-Sylow, where p is the characteristic. We remind the reader that
(1) $|Sp(2n, q)| = q^{n^2} \prod_{j=1}^{n} (q^{2j} - 1)$.
(2) $|O(2n+1, q)| = 2 q^{n^2} \prod_{j=1}^{n} (q^{2j} - 1)$ (in odd characteristic).
(3) $|O^{\pm}(2n, q)| = 2 q^{n^2 - n} (q^n \mp 1) \prod_{j=1}^{n-1} (q^{2j} - 1)$.
This implies that the number of unipotent elements in Sp(2n, q) is $q^{2n^2}$. However the orthogonal groups are not connected, so Steinberg's theorem is not directly applicable. Nevertheless, in odd characteristic, unipotent elements always live in Ω, so Steinberg's theorem does imply that the number of unipotent elements in O(2n + 1, q) in odd characteristic is $q^{2n^2}$, and that the number of unipotent elements of O ± (2n, q) in odd characteristic is $q^{2(n^2 - n)}$. Proposition 6.11 uses generating functions to treat orthogonal groups in even characteristic; along the way we obtain a formula for the number of unipotent elements (this turns out not to be a power of q and seems challenging from the algebraic approach). Two combinatorial lemmas are needed.
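Steinberg's count can be confirmed by brute force in the smallest symplectic case. The Python sketch below is an illustration only: it uses the standard identification Sp(2, q) = SL(2, q), restricts to prime q so that arithmetic modulo q models the field, and the function name `sl2_counts` is hypothetical. With n = 1 the group order should be q(q^2 − 1) and the number of unipotent elements should be q^2.

```python
from itertools import product


def sl2_counts(p):
    """Return (group order, number of unipotent elements) for SL(2, p), p prime."""
    order = 0
    unipotent = 0
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p != 1:
            continue  # keep only matrices of determinant 1
        order += 1
        # N = M - I; a 2 x 2 matrix M is unipotent exactly when N^2 = 0.
        n00, n01, n10, n11 = (a - 1) % p, b, c, (d - 1) % p
        square = (
            (n00 * n00 + n01 * n10) % p,
            (n00 * n01 + n01 * n11) % p,
            (n10 * n00 + n11 * n10) % p,
            (n10 * n01 + n11 * n11) % p,
        )
        if square == (0, 0, 0, 0):
            unipotent += 1
    return order, unipotent


for p in (2, 3, 5):
    print(p, sl2_counts(p), (p * (p * p - 1), p * p))
```

The enumeration is exhaustive, so it is only practical for very small primes; it is meant as a check of the formula rather than as a method for larger groups.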
To state the second lemma, we require some notation (which will be used elsewhere in this subsection as well). Given a polynomial φ(z) with coefficients in F q and non-vanishing constant term, define the "conjugate" polynomial φ * by One calls φ self-conjugate if φ * = φ. Note that the map φ → φ * is an involution. We let N * (q; d) denote the number of monic irreducible self-conjugate polynomials of degree d with coefficients in F q , and let M * (q; d) denote the number of conjugate pairs of monic irreducible non-self-conjugate polynomials of degree d with coefficients in F q .
Lemma 6.10. ( [FNP]) Let f = 1 if the characteristic is even and f = 2 if the characteristic is odd. Then Now we can enumerate unipotent elements in even characteristic orthogonal groups.
There is a notion of cycle index for the orthogonal groups (see [F1] or [F2] for background), and the cycle indices for the sum and difference of the orthogonal groups factor. Setting all variables equal to 1 in the cycle index for the sum of O + (n, q) and O − (n, q), it follows that Let us make some comments about this equation. Here the F + (t) corresponds to the part of the cycle index for the polynomial z − 1. The term Q j≥1 (1− t q 2j−1 ) 1−t corresponds to the remaining possible factors of the characteristic polynomial. This follows from the combinatorial identity d≥1 r≥1 which is a consequence of Lemma 6.10 after reversing the order of the products. Solving for F + (t), one finds that .
Taking the coefficient of t n and using Lemma 6.9, it follows that Next we solve for F − (t). Setting all variables equal to 1 in the cycle index for the difference of O + (n, q) and O − (n, q), it follows that Here the F − (t) corresponds to the part of the cycle index for the polynomial z − 1. The other term on the right hand side corresponds to the remaining possible factors of the characteristic polynomial. This follows from the combinatorial identity d≥1 r≥1 which is a consequence of Lemma 6.10 after reversing the order of the products. Thus $F^-(t) = \prod_{j \text{ even}} (1 - t/q^j)^{-1}$. Taking the coefficient of t n and using Lemma 6.9, it follows that $u(O^+(2n, q)) - u(O^-(2n, q)) = \frac{1}{q^{2n} (1 - 1/q^2) \cdots (1 - 1/q^{2n})}$.
Proposition 6.12 gives lower bounds on centralizer sizes for elements in symplectic and orthogonal groups whose characteristic polynomial is (z±1) n . Note that the bound of Proposition 6.8 was only slightly stronger.
Proof. For the first assertion, suppose without loss of generality that the element is unipotent. By Steinberg's theorem the total number of unipotent elements in Sp(2n, q) is $q^{2n^2}$. Hence the sum of the reciprocals of the centralizer sizes of unipotent elements is equal to $\frac{q^{2n^2}}{|Sp(2n, q)|}$, from which it follows that the centralizer size of any unipotent element is at least $\frac{|Sp(2n, q)|}{q^{2n^2}} = q^n (1 - 1/q^2) \cdots (1 - 1/q^{2n}) \ge q^n (1 - 1/q^2 - 1/q^4)$.
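The chain of equalities and inequalities at the end of this argument can be checked in exact rational arithmetic. The Python sketch below is illustrative, with hypothetical function names. It computes the order of Sp(2n, q) from the product formula recalled earlier in this subsection, divides by the Steinberg count of unipotent elements, and compares the result with the two closed forms.

```python
from fractions import Fraction


def sp_order(n, q):
    """|Sp(2n, q)| = q^(n^2) * prod_{j=1}^{n} (q^(2j) - 1)."""
    order = q ** (n * n)
    for j in range(1, n + 1):
        order *= q ** (2 * j) - 1
    return order


def unipotent_centralizer_bound(n, q):
    """|Sp(2n, q)| divided by the number q^(2 n^2) of unipotent elements."""
    return Fraction(sp_order(n, q), q ** (2 * n * n))


def product_form(n, q):
    """q^n (1 - 1/q^2)(1 - 1/q^4) ... (1 - 1/q^(2n))."""
    value = Fraction(q**n)
    for j in range(1, n + 1):
        value *= 1 - Fraction(1, q ** (2 * j))
    return value


def simple_lower_bound(n, q):
    """q^n (1 - 1/q^2 - 1/q^4)."""
    return q**n * (1 - Fraction(1, q**2) - Fraction(1, q**4))


for q in (2, 3, 4, 5):
    for n in (1, 2, 3, 6):
        print(q, n, float(unipotent_centralizer_bound(n, q)), float(simple_lower_bound(n, q)))
```

For q = 2 the factor 1 − 1/q^2 − 1/q^4 equals 11/16, so even in the worst case the bound stays within a constant multiple of q^n.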
(1) The centralizer size of an element of Sp(2n, q) is at least (2) The centralizer size of an element of O ± (2n, q) is at least (3) The centralizer size of an element of SO ± (2n, q) is at least (4) Suppose that q is odd. The centralizer size of an element of O(2n + 1, q) is at least Proof. First consider the case Sp(2n, q). Wall [W] parametrized the conjugacy classes of Sp(2n, q) and found their centralizer sizes. As in the general linear and unitary cases, the formula is multiplicative with terms coming from self-conjugate irreducible polynomials and also conjugate pairs of nonself conjugate irreducible polynomials. By Lemma 6.8 a size k partition corresponding to a polynomial z − 1 or z + 1 contributes at least a factor of q k/2 . As with the unitary groups, one sees that a partition λ from a selfconjugate irreducible polynomial φ contributes at least q deg(φ)·|λ|/2 and that λ associated with a pair {φ,φ} with φ monic non self-conjugate irreducible contributes at least q deg(φ)·|λ| (1 − 1/q deg(φ) ). Then it follows that a lower bound for the smallest centralizer size is q n 2 Note that if r ≥ 2, then q 2 r − 2q 2 r−1 ≥ q 2 r 2 . It follows that if r ≥ 2 and q 2 r ≥ 4n, then 2 r i=1 2iM * (q; i) ≥ 2n. Thus we need a 2 r which is at least max{4, log q (4n)}, and one can find such a 2 r which is at most 2(log q (4n)+4). Since M * (q; i) ≤ q i /2i, arguing as in the general linear case proves the first assertion of the theorem.
For the remaining assertions the contribution to the centralizer size coming from the part of the characteristic polynomial relatively prime to $z^2 - 1$ is the same for symplectic and orthogonal groups. Thus it is sufficient to focus on the part of the characteristic polynomial of the form $(z - 1)^a (z + 1)^b$, where $b = 0$ if the characteristic is even. For $O^{\pm}(2n, q)$, the contribution must be at least $q^{\frac{a+b}{2} - 1}$: either the characteristic is odd, $a$ and $b$ have the same parity, and part 2 of Proposition 6.8 applies, or else the characteristic is even and part 3 of Proposition 6.8 applies. Note that if the element is not in $SO$, then the centralizer size is doubled in $O$, giving (3). If the element is in $SO$, then $a$ and $b$ are even. Thus, arguing as above, the minimum centralizer size is $q^{\frac{a+b}{2}}$ since $a$ and $b$ are both even, and (3) follows. For $O^{\pm}(2n + 1, q)$ the contribution must be at least $q^{\frac{a+b-1}{2}}$, since $a$ and $b$ have unequal parity and part 2 of Proposition 6.8 applies.
For exceptional groups (or more generally for groups of bounded rank), we have:

Lemma 6.14. Let $G$ be a connected simple exceptional algebraic group with $F$ a Frobenius endomorphism associated to the field of $q$ elements. If $g \in G^F$, then $|C_{G^F}(g)| \ge q^r/26$.
Note that Theorem 1.4 is an immediate consequence of the results in this section. In applications, we will have to deal with simple groups as well, so we state this. In all cases except for type A, the index of the simple group in the group of inner diagonal automorphisms is at most 4. In type A, we use the results for $GL(n, q)$ or $U(n, q)$ and divide by $(q \mp 1) \gcd(q \mp 1, n)$: one factor to pass to $SL$ or $SU$ and the other factor for the center of these groups. Thus, we have:

Theorem 6.15. Let $S$ be a simple Chevalley group defined over the field of $q$ elements with $r$ the rank of the ambient algebraic group. There is a universal constant $A$ such that if $x$ is an inner diagonal automorphism of $S$, then .
The result also applies to the full orthogonal group. The result fails for other graph automorphisms and field automorphisms.
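The type A index $(q - 1)\gcd(q - 1, n)$ used above can be illustrated in the linear case by recovering the orders of small simple groups from $|GL(n,q)|$. This is a sketch with illustrative function names. It covers only $GL(n,q)$ (the unitary case, with $q + 1$ in place of $q - 1$, is not treated), and the order formula $|GL(n,q)| = \prod_{i=0}^{n-1}(q^n - q^i)$ is the standard one, assumed here.

```python
from math import gcd


def gl_order(n, q):
    """Order of GL(n, q): prod_{i=0..n-1} (q^n - q^i)."""
    order = 1
    for i in range(n):
        order *= q ** n - q ** i
    return order


def psl_order(n, q):
    """Order of PSL(n, q), obtained by dividing |GL(n, q)| by the index.

    One factor (q - 1) passes from GL to SL; the factor gcd(q - 1, n)
    is the order of the center of SL(n, q).
    """
    index = (q - 1) * gcd(q - 1, n)
    return gl_order(n, q) // index
```

For instance, `psl_order(2, 7)` divides $|GL(2,7)| = 2016$ by $6 \cdot 2 = 12$.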

Conjugacy Classes of Maximal Subgroups and Derangements
We want to obtain bounds on the number of conjugacy classes of maximal subgroups of finite simple groups. We will then combine these results with our results on class numbers to obtain very strong results on the proportion of derangements in actions of simple and almost simple groups. We define m(G) to be the number of conjugacy classes of maximal subgroups of G. Aschbacher and the second author [AG] conjectured that m(G) < k(G), and proved this for G solvable. Note that if G is an elementary abelian 2-group, m(G) + 1 = k(G) = |G|.
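The remark about elementary abelian 2-groups can be confirmed by brute force for small ranks. The sketch below uses an illustrative function name and models $(\mathbb{Z}/2)^n$ as the integers $0, \dots, 2^n - 1$ under bitwise XOR. It counts the subgroups of index 2, which are exactly the maximal subgroups here. Since the group is abelian, each subgroup is its own conjugacy class and $k(G) = |G|$.

```python
from itertools import combinations


def count_maximal_subgroups(n):
    """Count index-2 subgroups of (Z/2)^n, with elements encoded as ints under XOR."""
    nonzero = range(1, 2 ** n)
    half = 2 ** (n - 1)
    count = 0
    # A subgroup of index 2 has 2^(n-1) elements and contains the identity 0.
    for rest in combinations(nonzero, half - 1):
        subset = {0, *rest}
        if all((a ^ b) in subset for a in subset for b in subset):
            count += 1
    return count
```

The count equals $2^n - 1$, so $m(G) + 1 = 2^n = |G|$ in these cases.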
A related conjecture (of Wall) is that the number of maximal subgroups of a finite group G is less than |G|. Wall proved this for solvable groups. See [LPS] for more recent results.
First we note the following result, which we will not require; see [LS3] and combine this with [LMS].

It follows by [LMS] that:
Theorem 7.2. Let $G$ be an almost simple Chevalley group of rank $r$ defined over the field of $q$ elements. Then $m(G) \le c(r) + 2r \log \log q$.
The $\log \log q$ term comes from subfield groups. This immediately gives a generalization of the Boston-Shalev conjecture in the case of bounded rank.

Proof. It follows by the basic results about maximal subgroups of $G$ (cf. [FG1]) and Corollary 1.2 that any maximal subgroup $M$ of $G$ either contains a maximal torus of $S$ or satisfies $k(M) < Cq^{r-1}$ with $C$ a universal constant. Applying Theorem 1.4 (with $r$ fixed) gives that for any maximal subgroup $M$ of $G$ not containing a maximal torus, Thus, by Theorem 7.2 and the fact that $r$ is fixed, whence the result. Indeed, separating out the subfield case shows that $O(1/q)$ is an upper bound in the equation above.
As in [FG1], this gives:

Corollary 7.4. Let $G$ be an almost simple group with socle $S$ a Chevalley group of fixed rank $r$ defined over $\mathbb{F}_q$. Assume that $G$ is contained in the group of inner-diagonal automorphisms of $S$. Let $M$ be a maximal subgroup of $G$ not containing $S$ and set $\Omega = G/M$. Then there exists a universal constant $\delta > 0$ such that $\delta(G, \Omega) > \delta$.
Remark: An inspection of the proof shows that the result holds for the proportion of derangements in a given coset of S. This is no longer true if we allow field automorphisms.
We now want to consider what happens for increasing r. In particular, it suffices to consider r > 8 and so we restrict our attention to classical groups. The maximal subgroups have been classified by Aschbacher [As] and we consider the families individually.
We recall Aschbacher's theorem on maximal subgroups of classical Chevalley groups. We refer the reader to the description of the subgroups in [As]. See also [KL].
So let $G$ be a classical Chevalley group with natural module $V$ of dimension $d$. Then a subgroup $H$ of $G$ falls into the following nine families. In particular, a maximal subgroup is either in $\mathcal{S}$ or is maximal in one of the families $\mathcal{C}_i$. We will write $\mathcal{C}_i(G)$ to denote the maximal subgroups of $G$ that are in the family $\mathcal{C}_i$. We let $\mathcal{S}(G)$ denote the maximal subgroups of $G$ in $\mathcal{S}$.

Lemma 7.5. The number of conjugacy classes of maximal subgroups of $G$ in $\bigcup_i \mathcal{C}_i(G)$ is at most $8r \log r + r \log \log q$.
It is straightforward to see using Corollary 1.2 and the structure of the maximal subgroups in $\mathcal{C}_i$ that:

Lemma 7.6. Let $M \in \mathcal{C}_i(G)$ for $i > 3$. There is a universal constant $C$ such that $k(M) < Cq^{(r+1)/2}$.
The only subgroups that are close to that bound are those in $\mathcal{C}_8$. We will now prove:

Theorem 7.7. Let $G$ be a finite classical Chevalley group of rank $r$ over the field of $q$ elements. Let $X(G)$ denote the set of maximal subgroups of $G$ contained in $\mathcal{S}(G) \cup \bigcup_{i=4}^{8} \mathcal{C}_i(G)$. For $r$ sufficiently large,
$$\frac{\left| \bigcup_{H \in X(G)} H \right|}{|G|} < O(q^{-r/3}).$$
We will prove this for each of the families, and taking unions implies the result. We remark that the theorem applies to any subgroup $G$ between the socle and the full isometry group of $V$. The result fails if we consider almost simple groups with field automorphisms allowed (see [GMS] for examples of so-called exceptional permutation groups and a classification of primitive almost simple exceptional permutation actions).
Of course, a trivial corollary is Theorem 1.3. Note also that our estimate implies that the proportion of derangements in any coset of the simple group tends to 1 as well for the actions considered in Theorem 7.7.
The idea of the proof is quite simple. Let X(G) denote a set of subgroups of G closed under conjugation. We want to show that the number of conjugacy classes of G that intersect some element of X(G) is at most c(X).
Then using our results on a lower bound for centralizers (or equivalently an upper bound for sizes of conjugacy classes), we see that
$$\left| \bigcup_{H \in X(G)} H \right| \le \frac{c(X) A |G| (1 + \log_q r)}{q^{r-1}}, \quad \text{or} \quad \frac{\left| \bigcup_{H \in X(G)} H \right|}{|G|} \le \frac{c(X) A (1 + \log_q r)}{q^{r-1}},$$
where $A$ is a universal constant. So we only need to show that $c(X)$ is at most $O(q^{(2/3)r - 2})$ in each case.

We note that for a fixed simple group $S$, the number of embeddings of $S$ into $G$ is certainly bounded from above by $2r \, k(\hat{S})$, where $\hat{S}$ is the universal cover of $S$ (note that $k(\hat{S})$ is an upper bound for the number of representations, and the factor $2r$ comes from the fact that we may have representations which are inequivalent in the simple classical group but become conjugate in the full group of isometries). The arguments vary slightly depending upon the family we are considering, but the basic idea is the same in all cases.
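The exponent bookkeeping behind the target $c(X) = O(q^{(2/3)r - 2})$ can be made explicit. The sketch below tracks only the power of $q$, ignoring the constant $A$ and the factor $1 + \log_q r$. The function names are illustrative.

```python
from fractions import Fraction


def target_class_count_exponent(r):
    """Exponent of q in the target bound c(X) = O(q^((2/3) r - 2))."""
    return Fraction(2, 3) * r - 2


def proportion_exponent(r, class_count_exponent):
    """Exponent of q in c(X) / q^(r-1) when c(X) is q^class_count_exponent."""
    return class_count_exponent - (r - 1)
```

With the target exponent, `proportion_exponent` returns $-r/3 - 1$, so the extra factor $1/q$ is what remains to offset the term $1 + \log_q r$.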
Lemma 7.8. Let $G$ be a finite classical Chevalley group of rank $r$ over the field of $q$ elements. Let $X(G)$ denote the set of maximal subgroups of $G$ contained in $\mathcal{C}_i(G)$ for $i > 3$. Then for $r$ sufficiently large,
$$\frac{\left| \bigcup_{H \in X(G)} H \right|}{|G|} < O(q^{-r/3}).$$
Proof. By Lemmas 7.5 and 7.6, one knows that $\bigcup_{H \in X(G)} H$ is the union of at most $Cq^{(r+1)/2} (8r \log r + r \log \log q)$ conjugacy classes of $G$. By Theorem 6.15, each class has size at most $A|G|(1 + \log_q r)/q^{r-1}$, with $A$ a universal constant, whence the result.
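To see numerically why "r sufficiently large" is needed in Lemma 7.8, one can compare the product of the two bounds in the proof with $q^{-r/3}$. The sketch below makes several assumptions not in the text: the universal constants $C$ and $A$ are set to 1 as placeholders, the unsubscripted logarithms are taken to be natural logarithms, and the function names are illustrative.

```python
import math


def log_q_of_proportion_bound(q, r, C=1.0, A=1.0):
    """log_q of C q^((r+1)/2) (8 r log r + r log log q) * A (1 + log_q r) / q^(r-1).

    C and A are placeholder values for the universal constants (assumption).
    """
    class_count_factor = C * (8 * r * math.log(r) + r * math.log(math.log(q)))
    class_size_factor = A * (1 + math.log(r, q))
    power_of_q = (r + 1) / 2 - (r - 1)
    return math.log(class_count_factor * class_size_factor, q) + power_of_q


def within_target(q, r, C=1.0, A=1.0):
    """True if the bound above is at most q^(-r/3)."""
    return log_q_of_proportion_bound(q, r, C, A) <= -r / 3
```

With these placeholder constants the comparison fails for $q = 2$, $r = 10$ and holds for $r = 200$, in line with the exponent $(r+1)/2 - (r-1) = (3 - r)/2$ eventually dominating the polynomial factors.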
We now consider $\mathcal{S}(G)$. It is convenient to split $\mathcal{S}(G)$ into 4 subclasses defined as follows (we keep notation as above). First recall that if $S$ is a quasisimple Chevalley group in characteristic $p$ and $V$ is an absolutely irreducible module, then $V = V(\lambda)$ for some dominant weight $\lambda$ (in particular, the representation extends to the algebraic group). Write $\lambda = \sum a_i \lambda_i$, where the $a_i$ are nonnegative integers and the $\lambda_i$ are the fundamental weights. A restricted representation is one with $a_i < p$ for all $i$. By the Steinberg tensor product theorem, every module is a tensor product of Frobenius twists of restricted modules (over the algebraic closure). See [J, St2] for details of this theory.
$\mathcal{S}_1$: $S$ is alternating or sporadic;
$\mathcal{S}_2$: $S$ is a Chevalley group in characteristic not dividing $q$;
$\mathcal{S}_3$: $S$ is a Chevalley group in characteristic dividing $q$ and the representation is not restricted; and
$\mathcal{S}_4$: $S$ is a Chevalley group in characteristic dividing $q$, and the representation is restricted.
Lemma 7.9. For $r$ sufficiently large, we have: (1) The number of conjugacy classes of maximal subgroups in $\mathcal{S}_1(G)$ is at most $O(r^{1/2} e^{10r^{1/4}})$; and (2)

Proof. We may assume that $r$ is sufficiently large such that there are no sporadic groups in $\mathcal{S}_1(G)$ nor alternating groups of degree less than 17. Let $d$ be the dimension of the natural module for our classical group. So $d \le 2r + 1$.
It follows by [GT1, Lemma 6] that, for the alternating group $A_m$, either the module is the natural permutation module for the symmetric group or the dimension $d$ of the module satisfies $d \ge (m^2 - 5m + 2)/2$. Since $d \le 2r + 1$, this implies that, for $r$ sufficiently large, aside from the natural permutation module, $m \le 3r^{1/2}$. By the comments preceding Lemma 7.8, the number of embeddings of $A_m$ into $G$ is at most $2r \, k(\hat{A}_m) \le 4r \, k(A_m)$. Thus the number of conjugacy classes of elements of $\mathcal{S}_1(G)$ is at most
$$(4r) \sum_{m=5}^{3r^{1/2}} 2k(A_m) \le 12 r^{3/2} k(A_{\lfloor 3r^{1/2} \rfloor}).$$
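The step from $d \ge (m^2 - 5m + 2)/2$ and $d \le 2r + 1$ to $m \le 3r^{1/2}$ is elementary and can be checked directly. In the sketch below, with illustrative function names, the two inequalities are combined into $m^2 - 5m \le 4r$ and the largest such $m$ is compared with $3r^{1/2}$ using integer arithmetic. The threshold $r \ge 9$ used in the check comes from solving the corresponding quadratic inequality, not from the text.

```python
def max_degree(r):
    """Largest m >= 5 with (m^2 - 5m + 2)/2 <= 2r + 1, i.e. m^2 - 5m <= 4r."""
    m = 5  # m^2 - 5m = 0 at m = 5, and the expression increases for m >= 3
    while (m + 1) ** 2 - 5 * (m + 1) <= 4 * r:
        m += 1
    return m


def degree_bound_holds(r):
    """Check max_degree(r) <= 3 * sqrt(r) by comparing squares."""
    return max_degree(r) ** 2 <= 9 * r
```

For example, `max_degree(9)` is 9, which equals $3 \cdot 9^{1/2}$, so the bound is attained at $r = 9$.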
Recalling from Corollary 2.7 that $k(A_m) \le k(S_m)$, and using the bound $k(S_m) \le Cm^{-1} \exp[\pi (2m/3)^{1/2}]$, which follows from known asymptotic behavior of the partition function ([A1], p. 70), one concludes that the number of conjugacy classes of elements of $\mathcal{S}_1(G)$ is at most $O(r e^{5r^{1/4}})$.
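Since $k(S_m)$ is the number of partitions of $m$, the shape of the bound $Cm^{-1}\exp[\pi(2m/3)^{1/2}]$ can be examined numerically. The sketch below computes partition numbers by a standard dynamic program and reports the ratio $p(m) \cdot m / \exp[\pi(2m/3)^{1/2}]$. The function names are illustrative, and the check says nothing about the best possible constant $C$.

```python
import math


def partition_numbers(limit):
    """Return [p(0), p(1), ..., p(limit)], counting partitions by allowed part size."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p


def bound_ratio(m, p_m):
    """p(m) * m / exp(pi * sqrt(2m/3)); bounded if k(S_m) <= C m^-1 exp(...)."""
    return p_m * m / math.exp(math.pi * math.sqrt(2 * m / 3))
```

Substituting $m = 3r^{1/2}$ gives the exponent $\pi \sqrt{2} \, r^{1/4}$, and $\pi\sqrt{2} < 5$ accounts for the $e^{5r^{1/4}}$ in the text.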
Excluding the natural representations, the number of conjugacy classes of each $M \in \mathcal{S}_1(G)$ is at most $2k(A_{\lfloor 3r^{1/2} \rfloor}) \le O(r^{-1/2} e^{5r^{1/4}})$. Thus, excluding the embedding of $A_m$ into $G$ via the natural module, the total number of conjugacy classes of $G$ in the union of maximal subgroups in $\mathcal{S}_1(G)$ is at most $O(r^{1/2} e^{10r^{1/4}})$.
Finally, consider the natural embedding of $A_m$ or $S_m$ into $G$. Then $m = d + 1$ or $d + 2$ (depending upon the characteristic). Moreover, since the representation is self-dual, $G$ is either symplectic or orthogonal. Thus, there are at most 8 conjugacy classes of such maximal subgroups, each with at most $k(S_{2r+3})$ classes. Thus, these maximal subgroups contribute at most $O(r^{-1} \exp[\pi (4r/3)^{1/2}])$ conjugacy classes of $G$.
Thus, the total number of conjugacy classes of $G$ represented in $\bigcup_{M \in \mathcal{S}_1(G)} M$ is at most $O(r^{-1} \exp(4r^{1/2}))$. Applying Theorem 6.15, which gives an upper bound for the size of a conjugacy class, gives the result.
Proof. For part 1 we argue very much as in [LPS]. Let S be the socle. By the results of various authors on minimal dimensions of projective representations (see [T, (1) holds.
By (a) and (1), the number of conjugacy classes in the union of all maximal subgroups in $\mathcal{S}_2(G)$ is at most $O(r^4)$. Now (2) follows by this and Theorem 6.15.
Proof. Since the representation is not restricted and $M$ is maximal, it follows by Steinberg's tensor product theorem that the representation must be the tensor product of Frobenius twists of some restricted representation. By [GT2, Lemma 26], $N_G(M)$ also preserves this tensor product (over the algebraic closure). Since $M$ is maximal, this implies that $M$ is a classical group over a larger field and that $V$ is the tensor product of Frobenius twists of the natural module for the classical group (and so this module is defined over the smaller field). Thus, there will be at most $2r \log r$ choices for the class of $M$ (essentially depending upon writing the dimension as a power of a positive integer).
Indeed, let $d$ be the dimension of the natural module. Write $d = m^e$ with $e > 1$. Then the socle of $M$ is a classical group over the field of size $q^e$ and of rank less than $m$. Thus, by Corollary 1.2, $k(M) \le O(q^{em}) \le O(q^{3r^{1/2}})$.
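The exponent comparison $em \le 3r^{1/2}$ behind the last inequality can be verified by enumeration. The sketch below, with illustrative function names, lists every way of writing some $d \le 2r + 1$ as $m^e$ with $m, e \ge 2$ and checks $(em)^2 \le 9r$ in integer arithmetic. The assumption $d \le 2r + 1$ is the bound on the dimension of the natural module used earlier in this section.

```python
def tensor_decompositions(d_max):
    """All pairs (m, e) with m >= 2, e >= 2 and m**e <= d_max."""
    pairs = []
    m = 2
    while m * m <= d_max:
        e = 2
        while m ** e <= d_max:
            pairs.append((m, e))
            e += 1
        m += 1
    return pairs


def exponent_bound_holds(r):
    """Check e * m <= 3 * sqrt(r) for every m**e <= 2r + 1, by comparing squares."""
    return all((e * m) ** 2 <= 9 * r for m, e in tensor_decompositions(2 * r + 1))
```

For $r = 4$ the decompositions of $d \le 9$ are $2^2$, $2^3$ and $3^2$, and the last two give $em = 6 = 3 \cdot 4^{1/2}$, so the bound is attained there.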
It follows that $\bigcup_{M \in \mathcal{S}_3(G)} M$ contains at most $O(r \log r \cdot q^{3r^{1/2}})$ conjugacy classes of $G$. Now apply Theorem 6.15 to conclude that (2) holds.
Proof. The restricted representations of the groups of Ree type (i.e. one of ${}^2B_2$, ${}^2G_2$, ${}^2F_4$) have bounded dimension and so we may ignore these. If $S$ is an untwisted Chevalley group over the field of $s$ elements, then every restricted representation is defined over that field and so $s$ must be the field for the natural module for $G$. If $S$ is twisted, then any representation is defined over the field of $s$ elements or $s^d$ elements where $d \le 3$ (depending upon the twist).
It also follows by [Lu1] that the rank of $S$ is at most $3r^{1/2}$. Thus there are at most $Dr^{1/2}$ choices for $S$ for an absolute constant $D$, and the number of possible representations is at most $p^{3r^{1/2}}$. Thus, the number of possible conjugacy classes is at most $O(r^{3/2} p^{3r^{1/2}})$; the extra $r$ comes from the possibility of equivalent representations not conjugate in $G$. This proves (1).
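The count $p^{3r^{1/2}}$ reflects the fact that a restricted weight $\sum a_i \lambda_i$ is determined by a tuple of coefficients with $0 \le a_i < p$, one for each fundamental weight. The sketch below simply enumerates such tuples for a given rank. The function name is illustrative.

```python
from itertools import product


def restricted_weights(p, rank):
    """All coefficient tuples (a_1, ..., a_rank) with 0 <= a_i < p."""
    return list(product(range(p), repeat=rank))
```

There are $p^{\mathrm{rank}}$ such tuples, so a rank of at most $3r^{1/2}$ gives at most $p^{3r^{1/2}}$ restricted representations for each $S$.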
By Corollary 1.2 and the remarks above, $k(M) \le O(q^{3r^{1/2}})$ for each possible $M$.
Putting the previous results together completes the proof of Theorem 7.7. Theorem 1.5 also follows immediately from the previous results.