On the Numerical Evaluation of Fredholm Determinants

Some significant quantities in mathematics and physics are most naturally expressed as the Fredholm determinant of an integral operator, most notably many of the distribution functions in random matrix theory. Though their numerical values are of interest, there is no systematic numerical treatment of Fredholm determinants to be found in the literature. Instead, the few numerical evaluations that are available rely on eigenfunction expansions of the operator, if expressible in terms of special functions, or on alternative, numerically more straightforwardly accessible analytic expressions, e.g., in terms of Painleve transcendents, that have masterfully been derived in some cases. In this paper we close the gap in the literature by studying projection methods and, above all, a simple, easily implementable, general method for the numerical evaluation of Fredholm determinants that is derived from the classical Nystrom method for the solution of Fredholm equations of the second kind. Using Gauss-Legendre or Clenshaw-Curtis as the underlying quadrature rule, we prove that the approximation error essentially behaves like the quadrature error for the sections of the kernel. In particular, we get exponential convergence for analytic kernels, which are typical in random matrix theory. The application of the method to the distribution functions of the Gaussian unitary ensemble (GUE), in the bulk and the edge scaling limit, is discussed in detail. After extending the method to systems of integral operators, we evaluate the two-point correlation functions of the more recently studied Airy and Airy1 processes for the first time.

1. Introduction.Fredholm's (1903) landmark paper 1 on linear integral equations is generally considered to be the forefather of those mathematical concepts that finally led to modern functional analysis and operator theory-see the historical accounts in Kline (1972Kline ( , pp. 1058Kline ( -1075) ) and Dieudonné (1981, Chap. V) and explicit formulas for the solution thereof, for a right hand side f and a kernel K, both assumed to be continuous functions.He introduced his now famous determinant which is an entire function of z ∈ C, and succeeded in showing that the integral equation is uniquely solvable if and only if d(z) = 0.
Realizing the tremendous potential of Fredholm's theory, Hilbert started working on integral equations in a flurry and, in a series of six papers from 1904 to 1910, 2 2 F. BORNEMANN transformed the determinantal framing to the beginnings of what later, in the hands of Schmidt, Carleman, Riesz, and others, would become the theory of compact operators in Hilbert spaces.Consequently, over the years Fredholm determinants have faded from the core of general accounts on integral equations to the historical remarks section-if they are mentioned at all. 3o, given this state of affairs, why then study the numerical evaluation of the Fredholm determinant d(z)?The reason is, simply enough, that the Fredholm determinant and the more general notions, generalizing (1.1) and (1.2) to for certain classes of compact operators A on Hilbert spaces, have always remained important tools in operator theory and mathematical physics (Gohberg, Goldberg andKrupnik 2000, Simon 2005).In turn, they have found many significant applications: e.g., in atomic collision theory (Jost andPais 1951, Moiseiwitsch 1977), inverse scattering (Dyson 1976), in Floquet theory of periodic differential equations (Eastham 1973), in the infinite-dimensional method of stationary phase and Feynman path integrals (Albeverio andHøegh-Krohn 1977, Rezende 1994), as the twopoint correlation function of the two-dimensional Ising model (Wilkinson 1978), in renormalization in quantum field theory (Simon 2005), as distribution functions in random matrix theory (Mehta 2004, Deift 1999, Katz and Sarnak 1999) and combinatorial growth processes (Johansson 2000, Prähofer and Spohn 2002, Sasamoto 2005, Borodin, Ferrari, Prähofer and Sasamoto 2007).As Lax (2002, p. 260) puts it most aptly upon including Fredholm's theory as a chapter of its own in his recent textbook on functional analysis: "Since this determinant appears in some modern theories, it is time to resurrect it." In view of this renewed interest in operator determinants, what numerical methods are available for their evaluation?Interestingly, this question has apparently never-at least to our knowledge-been systematically addressed in the numerical analysis literature. 4 Even experts in the applications of Fredholm determinants commonly seem to have been thinking (Spohn 2008) that an evaluation is only possible if either the eigenvalues of the integral operator are, more or less, explicitly known or if an alternative analytic expression has been found that is numerically more accessible-in each specific case anew, lacking a general procedure.
The Nyström-type method advocated in this paper.In contrast, we study a simple general numerical method for Fredholm determinants which is exceptionally efficient for smooth kernels, yielding small absolute errors (i.e., errors that are small with respect to the scale det(I) = 1 inherently given by the operator determinant).To this end we follow the line of thought of Nyström's (1930) classical quadrature method for the numerical solution of the Fredholm equation (1.1).Namely, given a quadrature rule Nyström discretized (1.1) as the linear system which has to be solved for u i ≈ u(x i ) (i = 1, . . ., m).Nyström's method is extremely simple and, yet, extremely effective for smooth kernels.So much so that Delves and Mohamed (1985, p. 245), in a chapter comparing different numerical methods for Fredholm equations of the second kind, write: Despite the theoretical and practical care lavished on the more complicated algorithms, the clear winner of this contest has been the Nyström routine with the m-point Gauss-Legendre rule.This routine is extremely simple; it includes a call to a routine which provides the points and weights for the quadrature rule, about twelve lines of code to set up the Nyström equations and a call to the routine which solves these equations.Such results are enough to make a numerical analyst weep.
By keeping this conceptual and algorithmic simplicity, the method studied in this paper approximates the Fredholm determinant d(z) simply by the determinant of the m × m-matrix that is applied to the vector (u i ) in the Nyström equation (1.3): (1.4) If the weights w j of the quadrature rule are positive (which is always the better choice), we will use the equivalent symmetric variant .

0.9000272717982592
Strictly speaking we are not the first to suggest this simple method, though.In fact, it was Hilbert himself (1904, pp. 52-60) in his very first paper on integral equations, who motivated7 the expression (1.2) of the Fredholm determinant by essentially using this method with the rectangular rule for quadrature, proving locally uniform convergence; see also Hilbert (1912, pp. 4-10) and, for the motivational argument given just heuristically, without a proof of convergence, Whittaker andWatson (1927, Sect. 11.2), Tricomi (1957, pp. 66-68) (who speaks of a "poetic license" to be applied "without too many scruples"), Smithies (1958, pp. 65-68), Hochstadt (1973, pp. 243-239), and Fenyő and Stolle (1982-1984, Vol. II, pp. 82-84)-to name just a few but influential cases.Quite astonishingly, despite of all its presence as a motivational tool in the expositions of the classical theory, we have found just one example of the use of this method (with Gauss-Legendre quadrature) in an actual numerical calculation: a paper by the physicists Reinhardt and Szabo (1970) on low-energy elastic scattering of electrons from hydrogen atoms.However, the error estimates (Theorem 6.2) that we will give in this paper seem to be new at least; we will prove that the approximation error essentially behaves like the quadrature error for the sections x → K(x, y) and y → K(x, y) of the kernel.In particular, we will get exponential convergence rates for analytic kernels.
Examples.Perhaps the generality and efficiency offered by our direct numerical approach to Fredholm determinants, as compared to analytic methods if they are available at all, is best appreciated by an example.The probability E 2 (0; s) that an interval of length s does not contain, in the bulk scaling limit of level spacing 1, an eigenvalue of the Gaussian unitary ensemble (GUE) is given by the Fredholm determinant of the sine kernel, see Gaudin (1961) and Mehta (2004, Sect. 6.3).Gaudin has further shown that the eigenfunctions of this selfadjoint integral operator are exactly given by a particular family of special functions, namely the radial prolate spheroidal wave functions with certain parameters.Using tables (Stratton, Morse, Chu, Little and Corbató 1956) of these functions he was finally able to evaluate E 2 (0; s) numerically.On the other hand, in an admirably intricate analytic tour de force Jimbo, Miwa, Môri and Sato (1980) expressed the Fredholm determinant of the sine kernel as in terms of the sigma, or Hirota, representation of the fifth Painlevé equation, namely see also Mehta (2004, Chap. 21) and Tracy and Widom (2000, Sect. 4.1).With respect to the numerical evaluation, the latter two authors conclude in a footnote, most probably by comparing to Gaudin's method: "Without the Painlevé representations, the numerical evaluation of the Fredholm determinants is quite involved."However, one does not need to know more than the smooth kernel sin(π(x − y))/(π(x − y)) ifself to approximate E 2 (0; s) with the method of this paper.For instance, the Gauss-Legendre rule with just m = 5 quadrature points already gives, in 0.2 ms computing time, 15 accurate digits of the value that is, by calculating the determinant of a 5 × 5-matrix easily built from the kernel.Even though it is satisfying to have an alternative and simpler way of calculating already known quantities, it is far more exciting to be able to calculate quantities that otherwise have defeated numerical evaluations so far.For instance, the joint distribution functions of the Airy and the Airy 1 processes are given as determinants of systems of integral operators, see Prähofer and Spohn (2002), Johansson (2003), Sasamoto (2005) and Borodin et al. (2007).Even though a nonlinear partial differential equation of third order in three variables has been found by Adler and van Moerbeke (2005, Eq. (4.12)) for the logarithm of the joint distribution function of the Airy process at two different times, this masterful analytic result is probably of next to no numerical use.And in any case, no such analytic results are yet known for the Airy 1 process.However, the Nyström-type method studied in this paper can easily be extended to systems of integral operators.In this way, we have succeeded in evaluating the two-point correlation functions of both stochastic processes, see Section 8.
Outline of this paper.For the proper functional analytic setting, in Section 2 we review some basic facts about trace class and Hilbert-Schmidt operators.In Section 3 we review the concept of the determinant det(I + A) for trace class operators A and its relation to the Fredholm determinant.In Section 4 we study perturbation bounds implying that numerical calculations of determinants can only be expected to be accurate with respect to absolute errors in general.In Section 5 we use the functional analytic formulation of the problem to obtain convergence estimates for projection methods of Galerkin and Ritz-Galerkin type.The convergence rate is shown to depend on a proper balance between the decay of the singular values of the operator and the growth of bounds on the derivatives of the corresponding singular functions.This is in sharp contrast with Section 6, where we study the convergence of the Nyström-type method (1.5) by directly addressing the original definition of the Fredholm determinant.Here, only the smoothness properties of the kernel enter the convergence estimates.It turns out that, for kernels of low regularity, the order of convergence of the Nyström-type method can be even higher than that of a Ritz-Galerkin method.In Section 7 we give examples for the exponential convergence rates enjoyed by analytic kernels.To this end we discuss the details of the numerical evaluation of the determinants of the sine and Airy kernels, which express the probability distributions E 2 (0; s) and F 2 (s) (the Tracy-Widom distribution) of random matrix theory.Finally, after extending the Nyström-type method to systems of integral operators we report in Section 8 on the numerical evaluation of the two-point correlation functions of the Airy and Airy 1 processes.

2.
Trace Class and Hilbert-Schmidt Operators.We begin by recalling some basic material about the spectral theory of nonselfadjoint compact operators, which can be found, e.g., in Gohberg, Goldberg and Kaashoek (1990), Lax (2002) and Simon (2005).We consider a complex, separable Hilbert space H with an inner product •, • that is linear in the second factor and conjugate linear in the first.The set of bounded linear operators will be denoted by B(H), the compact operators by J ∞ (H).The spectrum of a compact operator A ∈ J ∞ (H) has no non-zero limit point; its nonzero points are eigenvalues of finite algebraic multiplicity.We list these eigenvalues as (λ n (A)) n=1 , counting multiplicity, where N(A) is either a finite non-negative integer or infinity, and order them by are called the singular values of A. Correspondingly, there is the Schmidt or singularvalue representation of A, that is, the norm convergent expansion where the u n and v n are certain (not necessarily complete) orthonormal sets in H.

Note that s n
(2.2) The Schatten-von Neumann classes of compact operators are defined as with the corresponding norm8 The operator norm on J ∞ (H) perfectly fits into this setting if we realize that There are the continuous embeddings J p (H) ⊂ J q (H) for 1 p q ∞ with A Jq A Jp .
The classes J p (H) are two-sided operator ideals in B(H), that is, for Of special interest to us are the trace class operators J 1 (H) and the Hilbert-Schmidt operators J 2 (H).The product of two Hilbert-Schmidt operators is of trace class: The trace of a trace class operator A is defined by Likewise, for a Hilbert-Schmidt operator A ∈ J 2 (H) we have (2.6) Integral operators with L 2 -kernel.In the Hilbert space H = L 2 (a, b) of squareintegrable functions on a finite interval (a, b) the Hilbert-Schmidt operators are exactly given by the integral operators with L 2 -kernel.That is, there is a one-to-one correspondence (Simon 2005, Thm. 2.11) Since the product is a natural candidate for the definition of det(I + A) it makes sense requiring A to be of trace class; for then, by (2.5), the absolute convergence of the sum can be guaranteed.
Integral operators with a continuous kernel.
) is certainly square-integrable.Therefore, the induced integral operator (2.7) defines a Hilbert-Schmidt operator A on the Hilbert space H = L 2 (a, b).Moreover, other than for L 2 kernels in general, the integral b a K(x, x) dx over the diagonal of (a, b) 2 is now well defined and constitutes, in analogy to the matrix case, a "natural" candidate for the trace of the integral operator.Indeed, if an integral operator A with continuous kernel is of trace class, one can prove (Gohberg et al. 2000, Thm. 8.1) (2.8) Unfortunately, however, just the continuity of the kernel K does not guarantee the induced integral operator A to be of trace class.9Yet, there is some encouraging positive experience stated by Simon (2005, p. 25): However, the counter-examples which prevent nice theorems from holding are generally rather contrived so that I have found the following to be true: If an integral operator with kernel K occurs in some 'natural' way and |K(x, x)| dx < ∞, then the operator can (almost always) be proven to be trace class (although sometimes only after some considerable effort).
Nevertheless, we will state some simple criteria that often work well: 1.If the continuous kernel K can be represented in the form , then the induced integral operator A is trace class on L 2 (a, b).This is, because A can then be written as the product of two Hilbert-Schmidt operators.2. If K(x, y) and ∂ y K(x, y) are continuous on [a, b] 2 , then the induced integral operator A is trace class on L 2 (a, b).This is, because we can write A by partial integration in the form as a sum of a rank one operator and an integral operator that is trace class by the first criterion.In particular, integral operators with smooth kernels are trace class (Lax 2002, p. 345).3. A continuous Hermitian10 kernel K(x, y) on [a, b] that satisfies a Hölder condition in the second argument with exponent α > 1/2, namely induces an integral operator A that is trace class on L 2 (a, b); see Gohberg et al. (2000, Thm.IV.8.2). 4. If the continuous kernel K induces a selfadjoint, positive-semidefinite integral operator A, then A is trace class (Gohberg et al. 2000, Thm. IV.8.3).The hypothesis on A is fulfilled for positive-semidefinite kernels K, that is, if z j z k K(x j , x k ) 0 (2.9) for any x 1 , . . ., x n ∈ (a, b), z ∈ C n and any n ∈ N (Simon 2005, p. 24).

Definition and Properties of Fredholm and Operator Determinants.
In this section we give a general operator theoretical definition of infinite dimensional determinants and study their relation to the Fredholm determinant.For a trace class operator A ∈ J 1 (H) there are several equivalent constructions that all define one and the same entire function in fact, each construction has been chosen at least once, in different places of the literature, as the basic definition of the operator determinant: 1. Gohberg and Kreȋn (1969, p. 157) define the determinant by the locally uniformly convergent (infinite) product which possesses zeros exactly at z n = −1/λ n (A), counting multiplicity.2. Gohberg et al. (1990, p. 115) which converges for |z| < 1/|λ 1 (A)| and can analytically be continued as an entire function to all z ∈ C. 4. Grothendieck (1956, p. 347) and Simon (1977, p. 254) define the determinant most elegantly with a little exterior algebra (Greub 1967).With n (A) ∈ J 1 ( n (H)) being the n th exterior product of A, the power series is just the n th symmetric function of the eigenvalues of A. Proofs of the equivalence can be found in (Gohberg et al. 2000, Chap.2) and (Simon 2005, Chap.3).We will make use of all of them in the course of this paper.We state two important properties (Simon 2005, Thm. 3.5) of the operator determinant: First its multiplication formula, then the characterization of invertibility: det(I + A) = 0 if and only if the inverse operator (I + A) −1 exists.
11 Gohberg et al. (2000) have later extended this idea to generally define traces and determinants on embedded algebras of compact operators by a continuous extension from the finite dimensional case.Even within this general theory the trace class operators enjoy a most unique position: it is only for them that the values of trace and determinant are independent of the algebra chosen for their definition.On the contrary, if A is Hilbert-Schmidt but not trace class, by varying the embedded algebra, the values of the trace tr(A) can be given any complex number and the values of the determinant det(I + A) are either always zero or can be made to take any value in the set C \ {0} (Gohberg et al. 2000, Chap. VII).
The matrix case.In Section 6 we will study the convergence of finite dimensional determinants to operator determinants in terms of the power series (3.4).Therefore, we give this series a more common look and feel for the case of a matrix A ∈ C m×m .By evaluating the traces with respect to a Schur basis of A one gets tr that is, the sum of all n × n principal minors of A. The yields the von Koch (1892) form of the matrix determinant (A ∈ C m×m ).
(3.6) (In fact, the series must terminate at n = m since det(I + zA) is a polynomial of degree m in z.)A more elementary proof of this classical formula, by a Taylor expansion of the polynomial det(I + zA), can be found, e.g., in Meyer (2000, p. 495).
The Fredholm determinant for integral operators with continuous kernel.Suppose that the continuous kernel K ∈ C([a, b] 2 ) induces an integral operator A that is trace class on the Hilbert space H = L 2 (a, b).Then, the traces of n (A) evaluate to (Simon 2005, Thm.3.10) The power series representation (3.4) of the operator determinant is therefore exactly Fredholm's expression (1.2), that is, The similarity with von Koch's formula (3.6) is striking and, in fact, it was just an analogy in form that had led Fredholm to conceive his expression for the determinant.It is important to note, however, that the right hand side of (3.7) perfectly makes sense for any continuous kernel, independently of whether the corresponding integral operator is trace class or not.
The regularized determinant for Hilbert-Schmidt operators.For a general Hilbert-Schmidt operator we only know the convergence of ∑ n λ(A) 2 but not of ∑ n λ n (A).Therefore, the product (3.1), which is meant to define det(I + zA), is not known to converge in general.Instead, Hilbert (1904) and Carleman (1921) introduced the entire function13 which corresponds to the eigenvalues λ n (A) = 1/n of a Hilbert-Schmidt operator that is not trace class.
which also has the property to possess zeros exactly at z n = −1/λ n (A), counting multiplicity.Plemelj's formula (3.3) becomes (Gohberg et al. 2000, p. 167) for |z| < 1/|λ 1 (A)|, which perfectly makes sense since A 2 , A 3 , . . .are trace class for A being Hilbert-Schmidt. 14Note that for trace class operators we have For integral operators A of the form (2.7) with a continuous kernel 2) is related to the Hilbert-Carleman determinant by the equation (Hochstadt 1973, p. 250) with such a kernel, we have b a K(x, x) dx = tr(A) simply because tr(A) is not well defined by (2.4) anymore then.

Perturbation Bounds.
In studying the conditioning of operator and matrix determinants we rely on the fundamental perturbation estimate for trace class operators, which can beautifully be proven by means of complex analysis (Simon 2005, p. 45).This estimate can be put to the form showing that the condition number κ abs of the operator determinant det(I + A), with respect to absolute errors measured in trace class norm, is bounded by This bound can considerably be improved for certain selfadjoint operators that will play an important role in Section 7.
14 Equivalently one can define (Simon 2005, p. 50) the regularized determinant by This is because one can then show (I + zA)e −zA − I ∈ J 1 (H).For integral operators A on L 2 (a, b) with an L 2 kernel K, Hilbert (1904, p. 82) had found the equivalent expression simply replacing the problematic "diagonal" entries K(t j , t j ) in Fredholm's determinant (3.7) by zero.Simon (2005, Thm. 9.4) gives an elegant proof of Hilbert's formula.
That is, the condition number κ abs of the determinant det(I − A), with respect to absolute errors measured in trace class norm, is bounded by κ abs 1.
Thus, for the operators that satisfy the assumptions of this lemma the determinant is a really well conditioned quantity-with respect to absolute errors, like the eigenvalues of a Hermitian matrix (Golub and Van Loan 1996, p. 396).
Implications on the accuracy of numerical methods.The Nyström-type method of Section 6 requires the calculation of the determinant det(I + A) of some matrix A ∈ C m×m .In the presence of roundoff errors, a backward stable method like Gaussian elimination with partial pivoting (Higham 2002, Sect. 14.6) gives a result that is exact for some matrix Ã = A + E with where is a small multiple of the unit roundoff error of the floating-point arithmetic used.We now use the perturbation bounds of this section to estimate the resulting error in the value of determinant.Since the trace class norm is not a monotone matrix norm (Higham 2002, Def. 6.1), we cannot make direct use of the componentwise estimate (4.4).Instead, we majorize the trace class norm of m × m matrices A by the Hilbert-Schmidt (Frobenius) norm, which is monotone, using Thus, the general perturbation bound (4.2) yields the following a priori estimate of the roundoff error affecting the value of the determinant: If the matrix A satisfies the assumptions of Lemma 4.1, the perturbation bound (4.3) gives the even sharper estimate Therefore, if det(I − A) A J 2 , we have to be prepared that we probably cannot compute the determinant to the full precision given by the computer arithmetic used.Some digits will be lost.A conservative estimate stemming from (4.5) predicts the loss of at most log 10 decimal places.For instance, this will affect the tails of the probability distributions to be calculated in Section 7.
5. Projection Methods.The general idea (3.2) of defining the infinite dimensional determinant det(I + A) for a trace class operator A by a continuous extension from the finite dimensional case immediately leads to the concept of a projection method of Galerkin type.We consider a sequence of m-dimensional subspaces V m ⊂ H together with their corresponding orthonormal projections The Galerkin projection P m AP m of the trace class operator A is of finite rank.Given an orthonormal basis φ 1 , . . ., φ m of V m , its determinant can be effectively calculated as the finite dimensional expression (5.1) if the matrix elements φ i , Aφ j are numerically accessible.
Because of P m 1, and thus P m AP m J 1 A J 1 1, the perturbation bound (4.1) gives the simple error estimate For the method to be convergent we therefore have to show that P m AP m → A in trace class norm.By a general result about the approximation of trace class operators (Gohberg et al. 1990, Thm. 4.3) all we need to know is that P m converges pointwise 15 to the identity operator I.This pointwise convergence is obviously equivalent to the consistency of the family of subspaces V m , that is, V m is a dense subspace of H. (5.3) In summary, we have proven the following theorem.
Theorem 5.1.Let A be a trace class operator.If the sequence of subspaces satisfies the consistency condition (5.3), the corresponding Galerkin approximation (5.1) of the operator determinant converges, uniformly for bounded z.
A quantitative estimate of the error, that is, in view of (5.2), of the projection error P m AP m − A J 1 in trace class norm, can be based on the singular value representation (2.1) of A and its finite-rank truncation A N : (We assume that A is nondegenerate, that is, N(|A|) = ∞; since otherwise we could simplify the following by putting We obtain, by using P m 1 once more, (5.4) There are two competing effects that contribute to making this error bound small: First, there is the convergence of the truncated series of singular values, which is independent of m.Second, there is, for fixed N, the collective approximation of the first N singular functions u n , v n (n = 1, . . ., N).For instance, given > 0, we can first choose N large enough to push the first error term in (5.4) below /2.After fixing such an N, the second error term can be pushed below /2 for m large enough.This way we have proven Theorem 5.1 once more.However, in general the convergence of the second term might considerably slow down for growing N. Therefore, a good quantitative bound requires a carefully balanced choice of N (depending on m), which in turn requires some detailed knowledge about the decay of the singular values on the one hand and of the growth of the derivatives of the singular functions on the other hand (see the example at the end of this section).While some general results are available in the literature for the singular values-e.g., for integral operators obtained by Smithies (1937, Thm.12)-the results are sparse for the singular functions (Fenyő andStolle 1982-1984, §8.10).Since the quadrature method in Section 6 does not require any such knowledge, we refrain from stating a general result and content ourselves with the case that there is no projection error in the singular functions; that is, we consider projection methods of Ritz-Galerkin type for selfadjoint operators.
Theorem 5.2.Let A be a selfadjoint integral operator that is induced by a continuous Hermitian kernel K and that is trace class on the Hilbert space H = L 2 (a, b).Assume that A is not of finite rank and let (u n ) be an orthonormal basis of eigenfunctions of A. We consider the associated Ritz projection P m , that is, the orthonormal projection Note that in this case 2 ), then there holds the error estimate (5.2) with If K is bounded analytic on E ρ × E ρ (with the ellipse E ρ defined in Theorem A.2), then the error estimate improves to for any fixed choice of > 0.
Proof.With the spectral decompositions at hand the bound (5.4) simplifies, for N = m, to Now, by some results of Hille and Tamarkin (1931, Thm.7.2 and 10.1), we have, for (which is just slightly stronger than Smithies' singular value bound (5.5)) and, for K bounded analytic on E ρ × E ρ , which proves both assertions.However, for kernels of low regularity, by taking into account the specifics of a particular example one often gets better results than stated in this theorem.(An example with an analytic kernel, enjoying the excellent convergence rates of the second part this theorem, can be found in Section 7.) An example: Poisson's equation.For a given f ∈ L 2 (0, 1) the Poisson equation with Dirichlet boundary conditions is solved (Hochstadt 1973, p. 5) by the application of the integral operator A, which is induced by the Green's kernel (5.8) Since K is Lipschitz continuous, Hermitian, and positive definite, we know from the results of Section 2 that A is a selfadjoint trace class operator on H = L 2 (0, 1).The eigenvalues and normalized eigenfunctions of A are those of the Poisson equation which are known to be Note that the actual decay of the eigenvalues is better than the general Hille-Tamarkin bound (5.6) which would, because of ).The trace formulas (2.4) and (2.8) reproduce a classical formula of Euler's, namely whereas, by (3.1) and the product representation of the sine function, the Fredholm determinant explicitly evaluates to the entire function (5.9) The sharper perturbation bound of Lemma 4.1 applies and we get, for each finite dimensional subspace V m ⊂ H and the corresponding orthonormal projection P m : H → V m , the error estimate (5.10) Now, we study two particular families of subspaces.Trigonometric polynomials.Here, we consider the subspaces which are exactly those spanned by the eigenfunctions of A. In this case, the projection method is of Ritz-Galerkin type; the estimates become particularly simple since we have the explicit spectral decomposition of the error operator.Hence, the error bound (5.10) directly evaluates to Figure 1 shows that this upper bound overestimates the error in the Fredholm determinant by just about 20%.
Algebraic polynomials.Here, we consider the subspaces of algebraic polynomials of order m, that is, We apply the bound given in (5.4) and obtain (keeping in mind that A is selfadjoint) with a truncation index N yet to be skilfully chosen.As before in (5.11), the first term of this error bound can be estimated by 2/π 2 N.For the second term we recall that the projection error u n − P m u n is, in fact, just the error of polynomial best approximation of the eigenfunction u n with respect to the L 2 -norm.A standard Jackson-type inequality (DeVore and Lorentz 1993, p. 219) from approximation theory teaches us that where c k denotes a constant that depends on the smoothness level k.A fixed eigenfunction u n (being an entire function in fact) is therefore approximated beyond every algebraic order in the dimension m; but with increasingly larger constants for higher "wave numbers" n.We thus get, with some further constant ck depending on k 2, We now balance the two error terms by minimizing this bound: the optimal truncation index N turns out to be exactly N = m, which finally yields the estimate Thus, in contrast to the approximation of a single eigenfunction, for the Fredholm determinant the order of the convergence estimate does finally not depend on the choice of k anymore; we obtain the same O(m −1 ) behavior as for the Ritz-Galerkin method.In fact, a concrete numerical calculation 16 shows that this error estimate really reflects the actual order of the error decay of the Galerkin method, see Figure 1.
Remark.The analysis of this section has shown that the error decay of the projection methods is essentially determined by the decay ∞ ∑ k=m+1 s k (A) → 0 of the singular values of A, which in turn is related to the smoothness of the kernel K of the integral operator A. In the next section, the error analysis of Nyström-type quadrature methods will relate in a much more direct fashion to the smoothness of 16 By (5.1) all we need to know for implementing the Galerkin method are the matrix elements φ i , Aφ j for the normalized orthogonal polynomials φ n (that is, properly rescaled Legendre polynomials) on the interval [0, 1].A somewhat lengthy but straightforward calculation shows that these elements are given by with the coefficients the kernel K, giving even much better error bounds, a priori and in actual numerical computations.For instance, the Green's kernel (5.7) of low regularity can be treated by an m-dimensional approximation of the determinant with an actual convergence rate of O(m −2 ) instead of O(m −1 ) as for the projection methods.Moreover, these methods are much simpler and straightforwardly implemented.
6. Quadrature Methods.In this section we directly approximate the Fredholm determinant (1.2) using the Nyström-type method (1.4) that we have motivated at length in Section 1.We assume throughout that the kernel K is a continuous function on [a, b] 2 .The notation simplifies considerably by introducing the n-dimensional functions K n defined by Their properties are given in Lemma A.4 of the appendix.We then write the Fredholm determinant shortly as For a given quadrature formula we define the associated Nyström-type approximation of d(z) by the expression The key to error estimates and a convergence proof is the observation that we can rewrite d Q (z) in a form that closely resembles the Fredholm determinant.Namely, by using the von Koch form (3.6) of matrix determinants, the multi-linearity of minors, and by introducing the n-dimensional product rule (A.3) associated with Q (see the appendix), we get Thus, alternatively to the motivation given in the introductory Section 1, we could have introduced the Nyström-type method by approximating each of the multidimensional integrals in the power series defining the Fredholm determinant with a product quadrature rule.Using this form, we observe that the error is given by that is, by the exponentially generating function of the quadrature errors for the functions K n .The following theorem generalizes a result that Hilbert (1904, p. 59) had proven for a specific class of quadrature formulae, namely, the rectangular rule.
Theorem 6.1.If a family Q m of quadrature rules converges for continuous functions, then the corresponding Nyström-type approximation of the Fredholm determinant converges, uniformly for bounded z.
Proof.Let z be bounded by M and choose any > 0. We split the series (6.2) at an index N yet to be chosen, getting Let Λ be the stability bound of the convergent family Q m of quadrature rules (see Theorem A.1 of the appendix) and put Λ 1 = max(Λ, b − a).Then, by Lemma A.4, the second part of the splitting can be bounded by The last series converges by Lemma A.5 and the bound can, therefore, be pushed below /2 by choosing N large enough.After fixing such an N, we can certainly also push the first part of the splitting, that is, below /2, now for m large enough, say m m 0 , using the convergence of the product rules Q n m induced by Q m (see Theorem A.3).In summary we get for all |z| M and m m 0 , which proves the asserted convergence of the Nyströmtype method.
F. BORNEMANN If the kernel K enjoys, additionally, some smoothness, we can prove error estimates that exhibit, essentially, the same rates of convergence as for the quadrature of the sections x → K(x, y) and y → K(x, y). Theorem 2 ), then for each quadrature rule Q of order ν k with positive weights there holds the error estimate where c k is the constant (depending only on k) from Theorem A.2, and K k and Φ are the norm and function defined in (A.6) and (A.10), respectively.
If K is bounded analytic on E ρ × E ρ (with the ellipse E ρ defined in Theorem A.2), then for each quadrature rule Q of order ν with positive weights there holds the error estimate Proof.By Theorem A.3 and Lemma A.4 we can estimate the error (6.2) in both cases in the form in the second case.This proves both assertions.
An example with an analytic kernel, enjoying the excellent convergence rates of the second part this theorem, can be found in Section 7.
Note that Theorem 6.2 is based on a general result (Theorem A.2) about quadrature errors that stems from the convergence rates of polynomial best approximation.There are cases (typically of low regularity), however, for which certain quadrature formulae enjoy convergence rates that are actually better than best approximation.The Nyström-type method inherits this behavior; one would just have to repeat the proof of Theorem 6.2 then.We refrain from stating a general theorem, since this would involve bounds on the highest derivatives involving weights 17 that take into account the boundary of the interval [a, b].Instead, we content ourselves with the detailed discussion of a particular example.
An example: Poisson's equation.We revisit the example of Section 5, that is the integral operator (5.7) belonging to the Green's kernel K (defined in (5.8)) of Poisson's equation.Recall from (5.9) that ).If we apply the Nyström-type method with the m-point Gauss-Legendre (order ν = 2m) or the Curtis-Clenshaw (order ν = m) formulae as the underlying quadrature rule Q m , Theorem 6.2 proves an error bound of the form which superficially indicates the same convergence rate as for the m-dimensional Galerkin methods of Section 5.However, the actual numerical computation shown in Figure 2 exhibits the far better convergence rate of O(m −2 ).This deviation can be understood in detail as follows: On the one hand, by inverse theorems of approximation theory (DeVore and Lorentz 1993, p. 220), valid for proper subintervals of [a, b], the polynomial best approximation (of degree m) of sections of the Green's kernel K cannot give a better rate than O(m −1 ); since otherwise those sections could not show jumps in the first derivative.Given the line of arguments leading from polynomial best approximation to Theorem 6.2, the error estimate of O(m −1 ) was therefore the best that could be established this way.
On the other hand, the sections of the Green's kernel look like piecewise linear hat functions.Therefore, the coefficients a m of their Chebyshev expansions decay as O(m −2 ) (Davis and Rabinowitz 1984, Eq. (4.8.1.3)).Given this decay rate, one can then prove-see, for Gauss-Legendre, Davis and Rabinowitz (1984, Eq. (4.8.1.7))and, for Clenshaw-Curtis, Riess and Johnson (1972, Eq. ( 9))-that the quadrature error is of rate O(m −2 ), too.Now, one can lift this estimate to the Nyström-like method essentially as in Theorem 6.2; thus proving in fact that as numerically observed.
Remark.This "superconvergence" property of certain quadrature rules, as opposed to best approximation, for kernels with jumps in a higher derivative is therefore also the deeper reason that the Nyström-type method then outperforms the projection methods of Section 5 (see Figure 2): Best approximation, by direct (Jackson) and inverse (Bernstein) theorems of approximation theory, is strongly tied with the regularity of K.And this, in turn, is tied to the decay of the singular values of the induced integral operator A, which governs the convergence rates of projection methods.Fig. 3.The probability E 2 (0; s) that an interval of length s does not contain, in the bulk scaling limit of level spacing 1, an eigenvalue of the Gaussian unitary ensemble (GUE).The result shown was calculated with the Nyström-like method based on Gauss-Legendre with m = 30; and cross-checked against the asymptotic expansion log E 2 (0; s) = −π 2 s 2 /8 − log(s)/4 + log(2)/3 − log(π)/4 + 3ζ (−1) + O(s −1 ) for s → ∞ (Deift et al. 1997).
A note on implementation.If the quadrature weights are positive (which in view of Theorem A.1 is anyway the better choice), as is the case for Gauss-Legendre and Clenshaw-Curtis, we recommend to implement the Nyström-type method (6.1) in the equivalent, symmetric form . (6.3) (Accordingly short Matlab and Mathematica code is given in the introductory Section 1.) The reason is that the m × m-matrix A Q inherits some important structural properties from the integral operator A: • If A is positive semidefinite, then, by (2.9),A Q is positive semidefinite, too.
This way, for instance, the computational cost for calculating the finite-dimensional determinant is cut to half, if by structural inheritance I + zA Q is Hermitian positive definite; the Cholesky decomposition can then be employed instead of Gaussian elimination with partial pivoting.

7.
Application to Some Entire Kernels of Random Matrix Theory.In this section we study two important examples, stemming from random matrix theory, with entire kernels.By Theorem 6.2, the Nyström-type method based on Gauss-Legendre or Curtis-Clenshaw quadrature has to exhibit exponential convergence.

The sine kernel.
The probability E 2 (0; s) (shown in Figure 3) that an interval of length s does not contain, in the bulk scaling limit of level spacing 1, an eigenvalue of the Gaussian unitary ensemble (GUE) is given (Mehta 2004, Sect. 6.3) by the Fredholm determinant of the integral operator A s on L 2 (0, s) that is induced by the sine kernel K: Note that K(x, y) is Hermitian and entire on C × C; thus A s is a selfadjoint operator of trace class on L 2 (0, s).(This is already much more than we would need to know for successfully applying and understanding the Nyström-type method.However, to facilitate a comparison with the Ritz-Galerkin method, we analyze the operator A s in some more detail.)The factorization e ixξ e −iyξ dξ (7.1) of the kernel implies that A s is positive definite with maximal eigenvalue λ 1 (A s ) < 1; since, for 0 = u ∈ L 2 (0, s), we obtain Here, in stating that the inequalities are strict, we have used the fact that the Fourier transform û of the function u ∈ L 2 (0, s), which has compact support, is an entire function.Therefore, the perturbation bound of Lemma 4.1 applies and we obtain, for Ritz-Galerkin as for any Galerkin method, like in the example of Section 5, the basic error estimate Now, Theorems 5.2 and 6.2 predict a rapid, exponentially fast convergence of the Ritz-Galerkin and the Nyström-type methods: In fact, an m-dimensional approximation will give an error that decays like O(e −cm ), for any fixed c > 0 since, for entire kernels, the parameter ρ > 1 can be chosen arbitrarily in these theorems.Details of the implementation of the Ritz-Galerkin method.There is certainly no general recipe on how to actually construct the Ritz-Galerkin method for a specific example, since one would have to know, more or less exactly, the eigenvalues of A. In the case of the sine kernel, however, Gaudin (1961) had succeeded in doing so.(See also Katz and Sarnak (1999, p. 411) and Mehta (2004, pp. 124-126).)He had observed that the integral operator Ãt on L 2 (−1, 1), defined by Ãt u(x) = 1 −1 e iπtxy u(y) dy (which is, by (7.1), basically a rescaled "square-root" of A 2t ), is commuting with the selfadjoint, second-order differential operator with boundary conditions Thus, both operators share the same set of eigenfunctions u n , namely the radial prolate spheroidal wave functions (using the notation of Mathematica 6.0) n,0 (πt, x) (n = 0, 1, 2 . ..).These special functions are even for n even, and odd for n odd.By plugging them into the integral operator Ãt Gaudin had obtained, after evaluating at x = 0, the eigenvalues Finally, we have (starting with the index n = 0 here) Hence, the m-dimensional Ritz-Galerkin approximation of det(I − A s ) is given by While Gaudin himself had to rely on tables of the spheroidal wave functions (Stratton et al. 1956), we can use the fairly recent implementation of these special functions by Falloon, Abbott and Wang (2003), which now comes with Mathematica 6.0.This way, we get the following implementation: u@t_, n_, x_D := SpheroidalS1@n, 0, π t, xD μ@t_, n_ ?EvenQD := NIntegrate@u@t, n, yD, 8y, −1, 1<D u@t, n, 0D μ@t_, n_ ?OddQD := π t NIntegrate@y u@t, n, yD, 8y, −1, 1<D u H0,0,1L @t, n, 0D λ@s_, n_D := λ@s, nD = s Given all this, one can understand that Jimbo et al.'s (1980) beautiful discovery of expressing E 2 (0; s) by formula (1.6) in terms of the fifth Painlevé transcendent was generally considered to be a major break-through even for its numerical evaluation Fig. 5.The Tracy-Widom distribution F 2 (s); that is, in the edge scaling limit, the probability of the maximal eigenvalue of the Gaussian unitary ensemble (GUE) being not larger than s.The result shown was calculated with the Nyström-like method based on Gauss-Legendre with m = 50; and cross-checked against the asymptotic expansion log F 2 (−s) = −s 3 /12 − log(s)/8 + log(2)/24 + ζ (−1) + O(s −3/2 ) for s → ∞ (Deift et al. 2008).(Tracy and Widom 2000, Footnote 10).However, note how much less knowledge suffices for the application of the far more general Nyström-type method: continuity of K makes it applicable, and K being entire guarantees rapid, exponentially fast convergence.That is all.
An actual numerical experiment.Figure 4 shows the convergence (in IEEE machine arithmetic) of an actual calculation of the numerical values E 2 (0; 1) and E 2 (0; 2).We observe that the Nyström-type method based on Gauss-Legendre has an exponentially fast convergence rate comparable to the Ritz-Galerkin method.Clenshaw-Curtis needs a dimension m that is about twice as large as for Gauss-Legendre to achieve the same accuracy.This matches the fact that Clenshaw-Curtis has the order ν = m, which is half the order ν = 2m of Gauss-Legendre, and shows that the bounds of Theorem 6.2 are rather sharp with respect to ν (there is no "kink" phenomenon here, cf.Trefethen (2008, p. 84)).The dashed line shows the amount, as estimated in (4.5), of roundoff error that stems from the numerical evaluation of the finite dimensional m × m-determinant itself.Note that this bound is essentially the same for all the three methods and can easily be calculated in course of the numerical evaluation.We observe that this bound is explaining the actual onset of numerical "noise" in all the three methods reasonably well.
Remark.Note that the Nyström-type method outperforms the Ritz-Galerkin method by far.First, the Nyström-type method is general, simple, and straightforwardly implemented (see the code given in Section 1); in contrast, the Ritz-Galerkin depends on some detailed knowledge about the eigenvalues and requires numerical access to the spheroidal wave functions.Second, there is no substantial gain, as compared to the Gauss-Legendre based method, in the convergence rate from knowing the eigenvalues exactly.Third, and most important, the computing time for the Ritz-Galerkin runs well into several minutes, whereas both versions of the Nyström-type method require just a few milliseconds.

The Airy kernel.
The Tracy-Widom distribution F 2 (s) (shown in Figure 5), that is, in the edge scaling limit, the probability of the maximal eigenvalue of the Gaussian unitary ensemble (GUE) being not larger than s, is given (Mehta 2004, F. BORNEMANN §24.2) by the Fredholm determinant of the integral operator A s on L 2 (s, ∞) that is induced by the Airy kernel K: Note that K is Hermitian and entire on C × C; thus A s is selfadjoint.(Again, this is already already about all we would need to know for successfully applying and understanding the Nyström-type method.However, we would like to show that, as for the sine kernel, the strong perturbation bound of Lemma 4.1 applies to the Airy kernel, too.)There is the factorization (Tracy and Widom 1994, Eq. (4.5)) which relates the Airy kernel with the Airy transform (Vallée and Soares 2004, §4.2) in a similar way as the sine kernel is related by (7.1) with the Fourier transform.This proves, because of the super-exponential decay of Ai(x) → 0 as x → 0, that A s is the product of two Hilbert-Schmidt operators on L 2 (s, ∞) and therefore of trace class.Moreover, A s is positive semi-definite with maximal eigenvalue λ 1 (A) 1; since by the Parseval-Plancherel equality (Vallée and Soares 2004, Eq. (4.27)) of the Airy transform we obtain, for u ∈ L 2 (s, ∞), More refined analytic arguments, or a carefully controlled numerical approximation, show the strict inequality λ 1 (A) < 1; the perturbation bound of Lemma 4.1 applies.Modification of the Nyström-type method for infinite intervals.The quadrature methods of Section 6 are not directly applicable here, since the integral operator A s is defined by an integral over the infinite interval (s, ∞).We have the following three options: 1. Using a Gauss-type quadrature formula on (s, ∞) that is tailor-made for the super-exponential decay of the Airy function.Such formulae have recently been constructed by Gautschi (2002).
2. Truncating the integral in ( 7.3) at some point T > s.That is, before using the Nyström-type method with a quadrature formula on the finite interval [s, T] (for which the second part of Theorem 6.2 is then applicable, showing exponential convergence), we approximate the Fredholm determinant (7.This way we commit an additional truncation error, which has, by passing through the perturbation bound of Lemma 4.1, the computable bound . (7.4) Figure 6 shows this bound as a function of the truncation point T. We observe that, for the purpose of calculating (within IEEE machine arithmetic) F 2 (s) for s ∈ [−8, 2]-as shown in Figure 5-, a truncation point at T = 16 would be more than sufficiently safe.

3.
Transforming the infinite intervals to finite ones.By using a monotone and smooth transformation φ s : (0, 1) → (s, ∞), defining the transformed integral operator Ãs on L 2 (0, 1) by Ãs u(ξ) = 1 0 Ks (ξ, η)u(η) dη, Ks (ξ, η) = φ s (ξ)φ s (η) K(φ s (ξ), φ s (η)), gives the identity For the super-exponentially decaying Airy kernel K we suggest the transformation φ s (ξ) = s + 10 tan(πξ/2) (ξ ∈ (0, 1)).(7.5) Note that though K(ξ, η) is a smooth function on [0, 1] it possesses, as a function on C × C, essential singularities on the lines ξ = 1 or η = 1.Hence, we can only apply the first part of Theorem 6.2 here, which then shows, for Gauss-Legendre and Clenshaw-Curtis, a super-algebraic convergence rate, that is, O(m −k ) for arbitrarily high algebraic order k.The actual numerical experiments reported in Figure 7 show, in fact, even exponential convergence.From the general-purpose point of view, we recommend the third option.It is straightforward and does not require any specific knowledge, or construction, as would the first and second option.Remarks on other numerical methods to evaluate F 2 (s).As for the sine kernel, there is a selfadjoint second-order ordinary differential operator commuting with A s (Tracy and Widom 1994, p. 166).Though this has been used to derive some asymptotic formulas, nothing is known in terms of special functions that would enable us to base a Ritz-Galerkin method on it.As Mehta (2004, p. 453) puts it: "In the case of the Airy kernel . . . the differential equation did not receive much attention and its solutions are not known." Prior to our work of calculating F 2 (s) directly from its determinantal expression, all the published numerical calculations started with Tracy and Widom's (1994) remarkable discovery of expressing F 2 (s) in terms of the second Painlevé transcendent; namely with q(z) being the Hastings-McLeod (1980) solution of Painlevé II, q (z) = 2q(z) 3 + z q(z), q(z) ∼ Ai(z) as z → ∞.
(7.6) Initial value methods for the numerical integration of (7.6) suffer from severe stability problems (Prähofer and Spohn 2004).Instead, the numerically stable way of solving (7.6) goes by considering q(z) as a connecting orbit, the other asymptotic state being q(z) ∼ −z 2 as z → −∞, and using numerical two-point boundary value solvers (Dieng 2005).

Extension to Systems of Integral Operators.
We now consider an N × N system of integrals operators that is induced by continuous kernels K ij ∈ C(I i × I j ) (i, j = 1, . . ., N), where the I i ⊂ R denote some finite intervals.The corresponding system of integral equations defines, with u = (u 1 , . . ., u N ) and f = ( f 1 , . . ., f N ), an operator equation 8.1.The Fredholm determinant for systems.Assuming A to be trace class, let us express det(I + zA) in terms of the system (K ij ) of kernels.To this end we show that the system (8.1) is equivalent to a single integral equation; an idea that, essentially, can already be found in the early work of Fredholm (1903, p. 388).To simplify notation, we assume that the I k are disjoint (a simple transformation of the system of integral equations by a set of translations will arrange for this).We then have 18 by means of the natural isometric isomorphism where χ k denotes the characteristic function of the interval I k .Given this picture, the operator A can be viewed being the integral operator on L 2 (I) that is induced by the kernel By (3.7) we finally get (cf.Gohberg et al. (2000, Thm 6.1)) 18 The general case could be dealt with by the topological sum, or coproduct, of the intervals I k , One would then use (Johansson 2003) the natural isometric isomorphism By eventually transforming back to the originally given, non-disjoint intervals I k , the last expression is the general formula that we have sought for: det(I + zA) = d(z) with This is a perfectly well defined entire function for any system K ij of continuous kernels, independently of whether A is a trace class operator or not.We call it the Fredholm determinant of the system.
The determinant of block matrices.In preparation of our discussion of Nyströmtype methods for approximating (8.2) we shortly discuss the determinant of N × Nblock matrices Starting with von Koch's formula (3.6), an argument 19 that is similar to the one that has led us to (8.2) yields (8.3) 8.2.Quadrature methods for systems.Given a quadrature formula for each of the intervals I i , namely (8.4) we aim at generalizing the Nyström-type method of Section 6.We restrict ourselves to the case of positive weights, w ij > 0, and generalize the method from the single operator case as given in (6.3) to the system case in the following form: with the sub-matrices A ij defined by the entries 19 Alternatively, we can use (3.4) and, recursively, the "binomial" formula (Greub 1967, p. 121) of exterior algebra, which is valid for general vector spaces V 0 and V 1 .
This can be as straightforwardly implemented as in the case of a single operator.Now, a convergence theory can be built on a representation of the error d Q (z) − d(z) that is analogous to (6.2).To this end we simplify the notation by introducing the following functions on , and by defining, for functions f on Thus, we can rewrite the Fredholm determinant (8.2) in the form Likewise, by observing the generalized von Koch formula (8.3), we put the definition (8.5) of d Q (z) to the form Thus, once again, the Nyström-type method amounts for approximating each multidimensional integral of the power series of the Fredholm determinant by using a product quadrature rule.Given this representation, Theorem 6.2 can straightforwardly be generalized to the system case: Theorem 8.1.If K ij ∈ C k−1,1 (I i × I j ), then for each set (8.4) of quadrature formulae of a common order ν k with positive weights there holds the error estimate uniformly for bounded z.
If the K ij are bounded analytic on E ρ (I i ) × E ρ (I j ) (with the ellipse E ρ (I i ) defined, with respect to I i , as in Theorem A.2), then for each set (8.4) of quadrature formulae of a common order ν with positive weights there holds the error estimate d uniformly for bounded z.

8.3.
Examples from random matrix theory.Here, we apply the Nyström-type method (8.5) to two 2 × 2-systems of integral operators that have recently been studied in random matrix theory.
Two-point correlation of the Airy process.The Airy process A 2 (t) describes, in a properly rescaled limit of infinite dimension, the maximum eigenvalue of Hermitian matrix ensemble whose entries develop according to the Ornstein-Uhlenbeck process.This stationary stochastic process was introduced by Prähofer and Spohn (2002) and further studied by Johansson (2003).These authors have shown that the joint probability function is given by a Fredholm determinant; namely (8.7) Of particular interest is the two-point correlation function where c 1 denotes the expectation value of the Tracy-Widom distribution (7.2).We have calculated this correlation function for 0 t 100 in steps of 0.1 to an absolute error of ±10 −10 , see Figure 8. 20 Here are some details about the numerical procedure: 20 A table can be obtained from the author upon request.Sasamoto (2005, Fig. 2) shows a plot (which differs by a scaling factor of two in both the function value and the time t) of the closely related function g 2 (t) = var(A 2 (t) − A 2 (0))/2 = var(A 2 (0)) − cov(A 2 (t), A 2 (0)) -without, however, commenting on either the numerical procedure used or on the accuracy obtained.
• Infinite intervals of integration, such as in the definition (8.7) of the kernels or for the domain of the integral operators (8.6) themselves, are handled by a transformation to the finite interval [0, 1] as in Section 7.2.
• The joint probability distribution (8.6) is then evaluated, after transformation, by the Nyström-type method of this section, based on Gauss-Legendre quadrature.
• To avoid numerical differentiation, the expectation values defining the twopoint correlation (8.8) are evaluated by truncation of the integrals, partial integration, and using a Gauss-Legendre quadrature once more.Because of analyticity, the convergence is always exponential.With parameters carefully (i.e., adaptively) adjusted to deliver an absolute error of ±10 −10 , the evaluation of the two-point correlation takes, for a single time t and using a 2 GHz PC, about 20 minutes on average.The results were cross-checked, for small t, with the asymptotic expansion (Prähofer andSpohn 2002, Hägg 2007)  2007) have introduced the Airy 1 process A 1 (t) for which, once again, the joint probability distribution can be given in terms of a Fredholm determinant; namely with integral operators A t that are now induced by the kernel functions Ai(x + y + t 2 )e t(x+y)+2t 3 /3 − exp(−(x − y) 2 /(4t)) √ 4πt , t > 0, Ai(x + y + t 2 )e t(x+y)+2t 3 /3 , otherwise.
21 Adler and van Moerbeke (2005) have derived this asymptotic expansion from the masterfully obtained result that G(t, x, y) = log P(A 2 (t) x, A 2 (0) y) satisfies the following nonlinear 3rd order PDE with certain (asymptotic) boundary conditions: The reader should contemplate a numerical calculation of the two-point correlation based on this PDE, rather than directly treating the Fredholm determinant as suggested by us.By basically employing the same numerical procedure as for the Airy process, we have succeeded in calculating the two-point correlation function cov(A 1 (t), A 1 (0)) for 0 t 2.5 in steps of 0.025 to an absolute error of ±10 −10 , see Figure 9. 22 For a single time t the evaluation takes about 5 minutes on average (using a 2 GHz PC).This numerical result has been used by Bornemann, Ferrari and Prähofer (2008) as a strong evidence that the Airy 1 process is, unlike previously conjectured, not the limit of the largest eigenvalue in GOE matrix diffusion.

A. Appendices.
A.1.Quadrature Rules.For the ease of reference, we collect in this appendix some classical facts about quadrature rules in one and more dimensions.
Quadrature rules in one dimension.We consider quadrature rules of the form which are meant to approximate b a f (x) dx for continuous functions f on some finite interval [a, b] ⊂ R. We define the norm of a quadrature rule by Convergence of a sequence of quadrature rules is characterized by the following theorem of Pólya (Davis and Rabinowitz 1984, p. 130).
Theorem A.1.A sequence Q n of quadrature rules converges for continuous functions, which is, by Hadamard's inequality (Meyer 2000, p. 469), 24 bounded by the expression (see also Lax (2002, p. 262)) Now, with This proves the asserted bounds (A.7) and (A.8) with k = 0 and k 1, respectively.The class C ρ bound (A.9) follows analogously to the case k = 0. we get the asserted enclosure.

Fig. 1 .
Fig. 1.Convergence of Ritz-Galerkin (circles) and Galerkin (dots) for approximating the Fredholm determinant of the integral operator induced by Green's kernel of Poisson's equation.The solid line shows the upper bound 1/π 2 m of Ritz-Galerkin as given in (5.11).

Fig. 2 .
Fig. 2. Convergence of the Nyström-type method for approximating the Fredholm determinant of the integral operator induced by Green's kernel (5.8) of Poisson's equation; the underlying quadrature rules Q m are the m-point Gauss-Legendre (dots) and Clenshaw-Curtis (circles) rules.Note that, in accordance with Trefethen (2008), both behave essentially the same.The solid line shows the function 1/25m 2 , just to indicate the rate of convergence.For comparison we have included the results of the Ritz-Galerkin method (stars) from Figure 1.

Fig. 6 .
Fig.6.Values of the expression P T A s P T − A s J 2 , which bounds, by (7.4), the error in F 2 (s) committed by truncating the integral in (7.3) at a point T > s.

A. 3 .
Properties of a certain function used in Theorem 6.2.The power series function on C (as the following lemma readily implies).Lemma A.5.Let Ψ be the entire function given by the expression If x > 0, then the series Φ(x) is enclosed by: the upper bound is obtained for n = 1 and the lower bound for n → ∞.
define the determinant as follows.Given any sequence of finite rank operators A n with A n → A converging in trace class norm, the sequence of finite dimensional determinants 11 det I + zA n ran(A n ) (3.2) (which are polynomials in z) converges locally uniform to det(I + zA), independently of the choice of the sequence A n .The existence of at least one such sequence follows from the singular value representation (2.1). 3. Dunford and Schwartz (1963, p. 1029) define the determinant by what is often called Plemelj's formula 12 1 , s n ) . . . . . .K(t n , s 1 ) • • • K(t n , s n ) s 1 =t 1 ,...,s n =t n