The Connes embedding problem: A guided tour

By Isaac Goldbring

Abstract

The Connes embedding problem (CEP) is a problem in the theory of tracial von Neumann algebras and asks whether or not every tracial von Neumann algebra embeds into an ultrapower of the hyperfinite II₁ factor. The CEP has had interactions with a wide variety of areas of mathematics, including C*-algebra theory, geometric group theory, free probability, and noncommutative real algebraic geometry, to name a few. After remaining open for over 40 years, a negative solution was recently obtained as a corollary of a landmark result in quantum complexity theory known as MIP* = RE. In these notes, we introduce all of the background material necessary to understand the proof of the negative solution of the CEP from MIP* = RE. In fact, we outline two such proofs, one following the “traditional” route that goes via Kirchberg’s QWEP problem in C*-algebra theory and Tsirelson’s problem in quantum information theory and a second that uses basic ideas from logic.

1. Introduction

1.1. What is this all about?

The story told in this tour is (in this author’s humble opinion) absolutely fascinating! It can also be completely confusing and terrifying to an outsider. It contains a seemingly infinite number of acronyms (CEP, WEP, QWEP, LLP, MIP*, RE, …), all sorts of tensor products, entangled particles, and even good friends Einstein and Gödel both make an appearance (the latter twice).

At one end of the story is the Connes embedding problem (CEP), a problem in the field of von Neumann algebras first posed by Alain Connes in his famous 1976 paper “Classification of Injective Factors” Reference 15 (the paper mainly responsible for his being awarded the Fields Medal in 1982). Roughly speaking, a von Neumann algebra is a collection of bounded operators on a Hilbert space containing the identity operator, closed under addition, composition, scalar multiplication, and adjoint, and which is closed in a certain topology known as the weak operator topology. The von Neumann algebras Connes was considering came equipped with a trace functional that shares many of the nice properties enjoyed by the (normalized) trace functional on matrices.

Here is the passage from Reference 15 which led to the establishment of the CEP:

We now construct an approximate imbedding of N in R. Apparently such an imbedding ought to exist for all II₁ factors because it does for the regular representation of free groups. However, the construction below relies on condition 6.

What is this quote trying to convey? R is the hyperfinite II₁ factor, arguably the most important tracial von Neumann algebra. For now, one should just think of R as an appropriate limit of matrix algebras of increasing sizes. We will have much to say about this algebra throughout this paper. A II₁ factor is just a particular kind of tracial von Neumann algebra, and the N appearing in the passage is a particular II₁ factor satisfying a certain list of properties. By an approximate imbedding of N in R, Connes means that any finite amount of information about elements of N (that is, the trace of finitely many *-polynomials with elements from N plugged in) can be simulated by appropriate elements of R. Connes later shows that such approximate imbeddings correspond to actual embeddings of N into a so-called ultrapower of R, denoted R^U. He comments that such an embedding “ought” to always exist since it does for a particular von Neumann algebra, namely the group von Neumann algebra associated to a free group (see Subsection 3.7). Why that embedding “ought to exist” is not quite clear. Nevertheless, Connes is only able to show that the N under consideration can be embedded in R^U using one of the conditions (namely the sixth one) he has assumed about this particular algebra.

Thus, the Connes embedding problem (CEP) states: every tracial von Neumann algebra embeds (in a trace-preserving way) in an ultrapower of R. We will say this slightly more precisely in Subsection 3.6. Many prefer to call this a “problem” rather than a “conjecture” since “ought to” is not a very strong sentiment.

The robustness of the CEP lies in its many reformulations and from the many areas of mathematics it has touched upon; see Section 2 for some examples.

At the other end of this story (and seemingly a world far, far away) is a landmark theorem in quantum complexity theory known as MIP* = RE Reference 39. Like most theorems in complexity theory, it compares two complexity classes. Roughly speaking, a complexity class consists of a collection of problems that all share some common level of difficulty with which one can solve or verify these problems. The class RE denotes those problems for which there is a computer program that, if left running long enough, would list all instances for which the problem has a positive answer (but you would never know about instances with a negative answer). Usually, complexity theorists are more interested in levels of efficiency, and the class RE is hardly ever discussed. The other complexity class in the above equation is MIP*, which denotes those problems for which a verifier interacting with multiple cooperating (but noncommunicating) provers who share a source of quantum entanglement can reliably verify a positive instance of a problem. The MIP* = RE theorem states that these two classes coincide! This is a monumental result, for it shows the power of quantum ideas in computational complexity. One particular instance of this result is that the (in)famous halting problem, which asks if a particular computer program will halt on, say, the empty input (which is known to be an undecidable problem), can actually be efficiently and reliably verified by two provers sharing some quantum entanglement. Here, “efficiently” means in polynomial time and “reliably” means that if the machine halts, then the verifier will accept the provers’ proof of that fact with probability 1, while if it does not halt, then only half the time will they accept a (necessarily incorrect) proof purporting to show that the machine halts.
(An execution of the protocol has a probabilistic outcome, whence here the condition is that there is acceptance with probability at most 1/2 over the verifier’s and provers’ random choices in the case of a Turing machine that does not halt. If making a mistake half of the time seems unacceptable, the reader will find comfort in knowing that the chance of mistake can be made as close to 0 as one wishes upon repetition of the protocol.) This is an astounding result!
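To get a feel for the numbers, note that if each run of the protocol wrongly accepts a non-halting machine with probability at most 1/2, and the runs are treated as independent (a simplifying assumption of ours; genuine soundness amplification for interactive proofs is more delicate), then the chance that k runs all err is (1/2)^k:

```python
def soundness_error(k, per_round=0.5):
    """Probability that k independent runs of the protocol all wrongly
    accept, when each run errs with probability at most per_round."""
    return per_round ** k

for k in (1, 10, 20):
    print(k, soundness_error(k))
```

Already at k = 20 the error is below one in a million.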

Even more amazing than the sheer statement of the result is that the equality MIP* = RE actually yields a negative solution to CEP!

1.2. Connecting the dots

But how could these seemingly unrelated topics be so tightly connected? The answer lies within a series of previously known connections. First, in a fundamental paper of Kirchberg Reference 44, it was shown that CEP is equivalent to an important problem (Kirchberg even used the word conjecture) in the theory of C*-algebras stemming from the complexity of C*-tensor products, known now as Kirchberg’s QWEP problem (see Subsection 3.8). Later, Fritz Reference 24 and, independently, Junge et al. Reference 41 demonstrated that a positive answer to the QWEP problem would yield a positive answer to a problem in quantum information theory known as Tsirelson’s problem which, roughly speaking, asks whether the usual quantum mechanical framework and that coming from quantum field theory yield the same set of quantum correlations corresponding to Bell experiments. While the jump from the QWEP problem to Tsirelson’s problem might seem like quite a leap, once one unravels the definitions, this is actually a fairly straightforward argument, which will be given in Subsection 6.1. Both sets of authors almost proved that the Kirchberg and Tsirelson problems were actually equivalent; Ozawa succeeded in connecting the last dots in Reference 47.

Now we are at least in the same arena: quantum information theory and quantum complexity theory (both areas at least have “quantum” in their names). The last step in the puzzle is to use a result of Fritz, Netzer, and Thom Reference 25 about the computability of the operator norm for universal group C*-algebras and the analysis leading to the equivalence of QWEP and Tsirelson to show that if Tsirelson’s problem has a positive answer, then every language in MIP* would actually be decidable, contradicting MIP* = RE. (See Subsection 6.1 for the complete argument.)

Okay, so that was a mouthful!

1.3. Why another treatment of CEP?

Numerous accounts of the CEP and its many equivalents can be found in the literature. In fact, Pisier Reference 52 recently wrote a fascinating account (coming in just shy of 500 pages) on the CEP and its equivalences with QWEP and Tsirelson (and so, so much more). Much trimmer accounts were given by Ozawa Reference 47, Reference 48 and Capraro and Lupini Reference 13.

So if there are so many accounts of the CEP, why write another? We have several good reasons:

First, all of the above accounts were written pre-MIP* = RE, so none of them actually explain how the story resolves itself.

Second, all of the above accounts go into an extreme amount of detail and assume a fair amount of background knowledge in operator algebras. We envision the reader in, say, quantum physics or complexity theory wanting to understand the main thread of the story and being overburdened by the overhead needed to enter the fray. In this survey, we try very hard to at least state all of the necessary definitions. On the other hand, we offer very few details or proofs in the interest of space and refer the reader to the above references if they are interested in the gritty details. Also, since we are focusing on the one-way implications (as opposed to the equivalences the other accounts present), we save ourselves some complications.

Third, the operator algebra community may know very little quantum theory or complexity theory, so we offer brief introductions to these areas to at least paint the picture for them.

Finally, and most certainly gratuitously, we offer an alternative and, in this author’s biased opinion, simpler path from MIP* = RE to the failure of CEP than that outlined above, using basic methods from mathematical logic. This path also offers some extra bells and whistles to the failure of CEP, including a Gödelian refutation of the CEP and a proof of the existence of many counterexamples to the CEP. While we have our logician hats on, we take advantage of the fact that we have the readers’ attention to describe a model-theoretic weakening of the CEP that is still open and quite fascinating (at least to us!).

1.4. A quick guide to this guide

In Section 2, we briefly describe some of the known equivalents of the CEP. The reader may benefit from coming back to this section after having read some of the definitions, but this is supposed to whet the reader’s appetite and convince them that the rest of the paper is worth reading.

Section 3 is a crash course in operator algebras, assuming some basic functional analysis that someone in quantum physics should probably be familiar with. We cover both the C*-algebra and von Neumann algebra background needed as well as topics such as states and traces, the ultrapower construction, operator algebras arising from groups, and finally, what is so darn complicated about C*-algebra tensor products, culminating in a discussion of why a positive solution to the CEP implies a positive solution to the QWEP problem.

Section 4 is a similar crash course but this time in complexity theory. We start from the definition of Turing machines, defining some of the basic complexity classes, and then work our way up to the class MIP of languages verifiable by a verifier interacting with multiple cooperating provers.

In Section 5, we make a quantum detour for those unfamiliar with the basic tenets of quantum mechanics and even take a digression on superdense coding just for fun (and to indicate the power of entanglement). This section culminates with the definition of the complexity class MIP*, the analogue of MIP where the provers are allowed to share quantum entanglement as a resource, and the precise statement of the result MIP* = RE.

Section 6 contains the details of the proof of the failure of the QWEP problem from MIP* = RE by first showing how the latter yields a negative solution to Tsirelson’s problem and then by establishing how a negative solution to Tsirelson’s problem yields a negative solution to the QWEP problem. Combined with our derivation of a positive solution of the QWEP problem from a positive solution to the CEP, this completes the proof of the negative solution to the CEP from MIP* = RE.

Section 7 offers the alternative proof alluded to above using basic ideas from logic. We present the appropriate logic for studying tracial von Neumann algebras and discuss the main contribution from logic, namely Gödel’s completeness theorem. We also describe the extra information about the CEP gleaned from the logical perspective mentioned above, including a completely operator-algebraic reformulation of our main model-theoretic contribution in terms of the undecidability of a certain moment approximation problem. We also offer an alternative proof of the failure of Tsirelson from MIP* = RE using the completeness theorem. Most of the material in this section represents joint work with Bradd Hart Reference 31, Reference 32.

Finally, in Section 8, we discuss the open problem around the existence of the so-called enforceable factor, which is the model-theoretic weakening of the CEP referred to above.

2. Equivalent reformulations of CEP

One of the aspects of the CEP that makes it such an interesting problem is its numerous equivalences spanning many seemingly different areas of mathematics. In the main text, the equivalences with Kirchberg’s QWEP conjecture in C*-algebra theory and Tsirelson’s problem in quantum information theory will be expounded on in more detail due to their relevance to the current story. In this section, we briefly mention some of the other well-known equivalences.

2.1. Free probability theory

In free probability theory, one considers noncommutative probability spaces, such as tracial von Neumann algebras (M, τ), where the elements of M act as noncommutative random variables and the trace τ is the analogue of the integral. Voiculescu demonstrated the robustness of this theory, establishing free analogues of many familiar facts from ordinary probability theory and giving applications to operator algebras and random matrices (to name a few). A nice introduction to free probability is Speicher’s lecture notes Reference 57.

In classical probability theory, the entropy of a random variable is an important numerical value measuring the amount of information obtained when measuring the random variable. One method of calculating the entropy of a discrete random variable with probability distribution p_1, …, p_n is to approximate the distribution using microstates, which are functions f : {1, …, N} → {1, …, n} for which the fraction of i with f(i) = j is within ε of p_j for all j = 1, …, n. By taking the logarithm of the number of such functions divided by N for a given pair of parameters (N, ε) and then letting N → ∞ and ε → 0, we obtain the entropy of the distribution. A more general version of this works for a wider class of random variables.
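For a concrete discrete distribution, this microstate count can be compared with the familiar formula −∑ p_j log p_j. The sketch below (our illustration; for simplicity it counts exact-frequency microstates via the multinomial coefficient rather than ε-approximate ones) shows the two quantities converging:

```python
import math

def entropy(p):
    """Shannon entropy -sum p_j log p_j (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def microstate_estimate(p, N):
    """log(number of exact microstates)/N for sample size N.

    For weights k_j/N, the functions {1,...,N} -> {1,...,n} whose empirical
    frequencies match p exactly are counted by the multinomial coefficient
    N!/(k_1! ... k_n!); assumes the rounded counts sum to N.
    """
    counts = [round(q * N) for q in p]
    log_multinomial = math.lgamma(N + 1) - sum(math.lgamma(k + 1) for k in counts)
    return log_multinomial / N

p = [0.5, 0.25, 0.25]
print(entropy(p))                      # ≈ 1.0397
print(microstate_estimate(p, 10_000))  # approaches the entropy as N grows
```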

When faced with the task of defining the free entropy of a tuple (a_1, …, a_n) of self-adjoint elements in a tracial von Neumann algebra (M, τ), Voiculescu proceeds analogously by considering those tuples (A_1, …, A_n) of self-adjoint matrices in some matrix algebra for which a certain finite number of moments approximate the corresponding moments in the tracial von Neumann algebra; that is, tr(p(A_1, …, A_n)) and τ(p(a_1, …, a_n)) differ by at most ε for finitely many noncommutative *-polynomials p in n variables. Now one has to calculate the volume of the set of those matrices and let the various parameters involved tend to infinity or 0. With this definition of free entropy, one can prove a number of results which are the free analogues of the corresponding result in the classical theory. For example, it is known that a tuple of classical random variables has maximal entropy if and only if they are independent and have Gaussian distribution. In the free theory, the free entropy of a tuple is maximal if and only if the elements of the tuple are freely independent and have semicircular distributions (which are known to be the free analogue of the Gaussian distribution). The paper Reference 63 is a survey of free entropy by Voiculescu himself.
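To see moment approximation in action, one can check numerically that large random self-adjoint matrices reproduce the even moments of the semicircular distribution, namely the Catalan numbers 1, 1, 2, 5, 14, …. This is our own illustration (a standard Wigner-matrix experiment), not part of the definition of free entropy:

```python
import numpy as np

rng = np.random.default_rng(0)

def gue_moment(k, n=300, trials=10):
    """Average normalized trace of X^(2k) for random self-adjoint matrices.

    X is an n x n Wigner-type matrix with entry variance 1/n, so its
    spectrum is approximately semicircular on [-2, 2] for large n.
    """
    total = 0.0
    for _ in range(trials):
        A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
        X = (A + A.conj().T) / (2 * np.sqrt(n))  # self-adjoint
        total += np.trace(np.linalg.matrix_power(X, 2 * k)).real / n
    return total / trials

catalan = [1, 1, 2, 5, 14]  # 2k-th moments of the standard semicircle law
for k in range(5):
    print(k, gue_moment(k), catalan[k])
```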

This definition of free entropy leads to an interesting feature: if there are no such tuples of matrices that simulate (a_1, …, a_n), then the free entropy of (a_1, …, a_n) equals −∞. It is well-known (see Subsection 3.6) that, for a given tracial von Neumann algebra M, the set of such matricial microstates is nonempty for all such tuples from M if and only if M embeds into R^U (in a trace-preserving way). Thus, CEP is equivalent to the statement that every tuple of self-adjoint elements in a tracial von Neumann algebra admits matricial microstates.

2.2. Hyperlinear groups

Okay, so this one really is not an equivalence but rather an equivalence with a special case of the CEP. An important notion in group theory is that of a sofic group. Roughly speaking, a countable discrete group G is sofic if, for every finite subset F of G, there is a symmetric group S_n and a function φ : G → S_n that is an approximately injective approximate homomorphism on F. For example, if g, h, gh ∈ F, then one would like to say that φ(gh) is close to φ(g)φ(h), where closeness is measured with respect to the normalized Hamming distance between permutations (which calculates what fraction of elements the permutations disagree on). The importance of this class of groups is that many important conjectures in group theory are known to hold when restricted to the class of sofic groups. Surprisingly, there is no known example of a nonsofic group! One can make a similar definition, replacing symmetric groups with unitary groups U(n), equipped with their normalized Hilbert–Schmidt metric; the resulting class of groups is called the class of hyperlinear groups. Every sofic group is hyperlinear, and since we do not know if every group is sofic, we do not know if this inclusion is proper. Moreover, there is no known example of a nonhyperlinear group. We refer the reader to Reference 13 for more information on sofic and hyperlinear groups.
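For concreteness, the normalized Hamming distance that measures this closeness is easy to state in code (a toy illustration of ours, with permutations of {0, …, 4} written as tuples):

```python
def hamming(p, q):
    """Normalized Hamming distance: the fraction of points where the
    permutations p and q disagree."""
    n = len(p)
    return sum(p[i] != q[i] for i in range(n)) / n

p = (1, 0, 2, 3, 4)   # swaps 0 and 1
q = (1, 0, 3, 2, 4)   # swaps 0,1 and also 2,3
print(hamming(p, q))  # 0.4: they disagree on points 2 and 3
```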

The connection with CEP comes via an observation of Radulescu Reference 53, who showed that a countable group G is hyperlinear if and only if the group von Neumann algebra L(G) of G (see Subsection 3.7) embeds into R^U. In other words, if CEP is true just for group von Neumann algebras, then every group is hyperlinear!

Interestingly enough, even though we now know that CEP is false, we still do not know if its special case for group von Neumann algebras holds; that is, we still do not know if every group is hyperlinear.

2.3. Embeddability of general von Neumann algebras

The CEP is about tracial von Neumann algebras. But there is a much wider class of von Neumann algebras out there. Is there a reformulation of the CEP that addresses them? The answer is yes, and it was established by Ando, Haagerup, and Winslow in Reference 1. There is a so-called type III (in the sense of Subsection 3.5) version of R, called the Araki–Woods factor R_∞, which is the unique hyperfinite type III₁ factor. Moreover, there is a generalization of the tracial ultraproduct construction, known as the Ocneanu ultraproduct, that covers the much larger class of σ-finite von Neumann algebras, of which R_∞ is one. The main result of Reference 1 states that CEP is equivalent to the assertion that every separably acting von Neumann algebra embeds with expectation into the Ocneanu ultrapower R_∞^U. The notion of an embedding with expectation is defined in Subsection 3.9. In the case of tracial von Neumann algebras, the embedding is automatically with expectation, but in the general case, it is a necessary nontriviality condition.

2.4. Existentially closed factors

The model-theoretic notion of an existentially closed (e.c.) structure is the generalization of the notion of algebraically closed field to an arbitrary structure (see Subsection 8.3 for a precise definition). In particular, it makes sense to study e.c. groups, e.c. graphs, and, yes, even e.c. tracial von Neumann algebras. One can prove many general facts about the class of e.c. tracial von Neumann algebras, such as the fact that they must be II₁ factors with McDuff’s property and with only approximately inner automorphisms. There are a plethora of e.c. tracial von Neumann algebras; in particular, every tracial von Neumann algebra embeds in an e.c. one. However, can one actually name a concrete e.c. tracial von Neumann algebra? It turns out that a positive solution to CEP is equivalent to the statement that R is an e.c. tracial von Neumann algebra. A proof of this fact will be given in Subsection 8.3.

2.5. Noncommutative real algebraic geometry

A Positivstellensatz is a theorem that declares that certain elements that are positive in some way are so for some good reason. Perhaps the best-known such result is the positive solution to Hilbert’s 17th Problem, due to Artin Reference 2 (although this author is unabashedly fond of Abraham Robinson’s model-theoretic solution Reference 54): if f is a positive semidefinite rational function over the reals (that is, a rational function such that f(a_1, …, a_n) ≥ 0 for all real a_1, …, a_n for which f is defined), then f is a sum of squares of rational functions, providing a good reason that f is positive semidefinite.

One can ponder noncommutative versions of Artin’s theorem. First, we set R⟨X_1, …, X_n⟩ to be the set of polynomials in noncommuting variables X_1, …, X_n. Consider the positivity statement that p(A_1, …, A_n) is positive semidefinite for all self-adjoint matrices A_1, …, A_n of operator norm at most 1, of any matrix size. Then a theorem of Helton and McCullough Reference 36 tells us that there is a good reason for this kind of positivity, namely that, for all ε > 0, p + ε belongs to the quadratic module generated by 1 − X_1², …, 1 − X_n². Here, a quadratic module is a subset M of R⟨X_1, …, X_n⟩ containing 1, closed under addition and closed under the function q ↦ h*qh, where q ∈ M and h ∈ R⟨X_1, …, X_n⟩ (and where h* is the result of reversing the orders of the variables in each monomial of h). Note indeed that all polynomials in the quadratic module generated by the 1 − X_j² must be positive in the above sense, and the Helton–McCullough result says that this is (approximately) the good reason that any such noncommutative polynomial might be positive.

Now suppose instead that we assume that p is merely trace positive, that is, tr(p(A_1, …, A_n)) ≥ 0 for all such A_1, …, A_n as in the previous paragraph. Clearly, the polynomials appearing in the Helton and McCullough result are trace positive. But now one can also add finite sums of commutators since the trace of a commutator vanishes. One can ask if this new class of noncommutative polynomials gives a necessary and sufficient condition to be trace positive; that is, if p is trace positive, must it be the case that, for every ε > 0, p + ε differs from an element of the quadratic module generated by the 1 − X_j² by a sum of commutators? It turns out that this tracial version of the Positivstellensatz from the previous paragraph is actually equivalent to the CEP, a result proven by Klep and Schweighofer in Reference 45.
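The fact that makes adding commutators harmless, namely that the trace of a commutator vanishes since tr(AB) = tr(BA), can be confirmed in one line (a quick numerical check of ours):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))
comm = A @ B - B @ A          # the commutator [A, B]
print(np.trace(comm))         # ~0 up to floating-point error
```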

3. A crash course in operator algebras

In this long section, we explain all of the background material in operator algebras one needs to know to understand the statements of both the CEP and the QWEP problem as well as to understand how a positive solution to the former implies a positive solution to the latter. Nearly everything discussed here can be found in Pisier’s book Reference 52. Brown and Ozawa’s book Reference 12 is another nice reference.

3.1. Introducing C*-algebras

A *-algebra is an algebra A over C equipped with a map a ↦ a* satisfying, for all a, b ∈ A and λ ∈ C,

(a + b)* = a* + b*,

(λa)* = λ̄a* (where λ̄ is the complex conjugate of λ),

(ab)* = b*a*,

(a*)* = a.

If A is actually a unital algebra over C with unit 1 for which 1* = 1, we say that A is a unital *-algebra. There are obvious notions of *-subalgebra of a *-algebra and unital *-subalgebra of a unital *-algebra.

A *-homomorphism between *-algebras is an algebra homomorphism that also preserves the *-operation. If φ : A → B is a *-homomorphism between unital *-algebras, then we implicitly assume that φ maps the unit of A to the unit of B.

In this paper, the most relevant (unital) *-algebras are B(H), the *-algebra of bounded operators on the Hilbert space H, and its (unital) *-subalgebras. Recall that for a Hilbert space H, a linear operator T : H → H is bounded if its operator norm ‖T‖ := sup{‖Tξ‖ : ξ ∈ H, ‖ξ‖ ≤ 1} is finite. B(H) is a *-algebra with the algebra operations being addition, composition, and scalar multiplication and with the *-operation being given by the adjoint, where, for T ∈ B(H), we have that T* is the unique operator for which ⟨Tξ, η⟩ = ⟨ξ, T*η⟩ for all ξ, η ∈ H. (In connection with this formula, we follow the convention that inner products are linear in the first argument and conjugate-linear in the second argument; this is the opposite of the convention used in the physics literature.) B(H) is a unital *-algebra with the identity operator I acting as the unit.
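In finite dimensions, B(C^n) is just the algebra of n × n complex matrices and the adjoint is the conjugate transpose. The defining identity ⟨Tξ, η⟩ = ⟨ξ, T*η⟩ can then be checked directly; in this small sanity check of ours we use the paper’s convention that the inner product is linear in the first argument:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
xi = rng.normal(size=n) + 1j * rng.normal(size=n)
eta = rng.normal(size=n) + 1j * rng.normal(size=n)

def inner(u, v):
    """Inner product linear in the first argument, conjugate-linear in
    the second (np.vdot conjugates its first argument, so we swap)."""
    return np.vdot(v, u)

T_adj = T.conj().T  # the adjoint is the conjugate transpose
print(abs(inner(T @ xi, eta) - inner(xi, T_adj @ eta)))  # ~0
```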

We now define our first class of operator algebras, namely the class of C*-algebras. For both classes of operator algebras, there are two approaches to their definition, namely the concrete and the abstract. A concrete C*-algebra is a *-subalgebra A of B(H) that is closed in the operator norm topology. If, moreover, A contains the identity I, then we say that A is a unital concrete C*-algebra.

We now present the abstract approach to C*-algebras. Suppose that A is a *-algebra. A C*-norm on A is a norm ‖·‖ on A satisfying the following identities for all a, b ∈ A:

‖ab‖ ≤ ‖a‖‖b‖,

‖a*‖ = ‖a‖,

‖a*a‖ = ‖a‖².

The first two identities are the usual axioms for defining a normed *-algebra; the last axiom, called the C*-identity, is what makes a norm a C*-norm. An abstract C*-algebra is a *-algebra equipped with a complete C*-norm. If A is a *-algebra equipped with a C*-norm, then the *-algebra operations extend naturally to the completion of A, which is then an abstract C*-algebra. An abstract unital C*-algebra is an abstract C*-algebra that is a unital *-algebra; in this case, we have ‖1‖ = 1.
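For matrices, the operator norm is the largest singular value, and the C*-identity ‖a*a‖ = ‖a‖² can be verified numerically (our illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

def op_norm(X):
    """Operator norm of a matrix: its largest singular value."""
    return np.linalg.norm(X, 2)

# The two printed values agree up to rounding: the C*-identity.
print(op_norm(A.conj().T @ A), op_norm(A) ** 2)
```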

It is an important fact that a *-homomorphism between abstract C*-algebras is necessarily contractive; it is an isometric embedding if and only if it is injective. In particular, given any *-algebra A, there is at most one norm on A which makes A into a C*-algebra.

It is an easy exercise to see that the operator norm on B(H) is a C*-norm, whence every concrete C*-algebra is an abstract C*-algebra. On the other hand, the Gelfand–Naimark theorem states that every abstract C*-algebra is isomorphic (as abstract C*-algebras) to a concrete C*-algebra. This result can be reformulated in terms of representations of C*-algebras. Given a C*-algebra A, a representation of A is a *-homomorphism π : A → B(H) for some Hilbert space H. Usually a nondegeneracy condition is assumed on a representation, namely that π(A)H is dense in H; if A is unital, this is equivalent to assuming that π(1) = I. The representation is faithful if π is moreover injective (equivalently isometric). Thus, the Gelfand–Naimark theorem states that every abstract C*-algebra admits a faithful representation.

From here on out, we no longer make a distinction between concrete and abstract C*-algebras and simply take either perspective whenever it is convenient.

Unless stated otherwise, in the rest of this paper, we restrict our attention to unital C*-algebras; we might often repeat this convention for emphasis.

A C*-algebra is commutative (or abelian) if its multiplication is commutative. Given a compact Hausdorff space X, the set C(X) of continuous, complex-valued functions on X is a unital commutative C*-algebra under the pointwise operations of addition, multiplication, and scalar multiplication, with the *-operation being given by f* := f̄ (complex conjugate), and with norm given by ‖f‖ := sup{|f(x)| : x ∈ X}. In fact, all unital commutative C*-algebras are of this form, and there is a dual equivalence of categories (known as Gelfand duality) between compact Hausdorff spaces with continuous maps and unital commutative C*-algebras with *-homomorphisms. For this reason, C*-algebra theory is often dubbed “noncommutative topology.”

There are special kinds of elements in C*-algebras that will be important throughout this paper. If A is a C*-algebra, a ∈ A is called

self-adjoint if a* = a,

positive if a = b*b for some b ∈ A,

a projection if a is self-adjoint and a² = a,

unitary if a*a = aa* = 1.

In the case that A = B(H), the self-adjoint (resp., positive) elements are those with spectrum contained in the reals (resp., the nonnegative reals) while the projections correspond to orthogonal projections onto closed subspaces. In the case that A = C(X) for X a compact space, the self-adjoint (resp., positive) elements correspond to the real-valued (resp., nonnegative real-valued) functions while the projections correspond to those functions which take only the values 0 or 1.
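For instance, in B(C^4), an element of the form b*b is self-adjoint and its spectrum is contained in the nonnegative reals, as a quick numerical check (ours) confirms:

```python
import numpy as np

rng = np.random.default_rng(4)
b = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
a = b.conj().T @ b                # a positive element of B(C^4)
eigs = np.linalg.eigvalsh(a)      # spectrum of the self-adjoint matrix a
print(eigs)                       # all entries are >= 0
```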

3.2. Let’s be positive

An important role in this story is played by maps between -algebras which are not necessarily -homomorphisms but which still preserve some remnants of the -algebra structure. It is hard to truly appreciate the importance of these maps without getting into the details of the results to follow, but we introduce the terminology in order to be able to follow the definitions and theorems.

First, we say that a linear map φ : A → B between C*-algebras is positive if it maps positive elements to positive elements. Note that a *-homomorphism π is positive: π(b*b) = π(b)*π(b).

For many purposes, a stronger version of positivity is needed. First, for any C*-algebra A and any n ≥ 1, we let M_n(A) denote the set of n × n matrices with entries from A. We can view M_n(B(H)) as a *-subalgebra of B(H^n), and it is readily verified that, under this identification, M_n(A) is closed in the operator norm topology, that is, M_n(A) is a C*-algebra once again. Note also that a linear map φ : A → B induces a linear map φ_n : M_n(A) → M_n(B) given by φ_n([a_ij]) := [φ(a_ij)]. We say that φ is completely positive (cp) if each φ_n is a positive map. If, in addition, φ(1) = 1, we say that φ is unital, completely positive (ucp). A *-homomorphism π is easily seen to induce *-homomorphisms π_n, whence *-homomorphisms are ucp. It can be shown that if A is commutative, then any positive map defined on A is automatically completely positive.
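The standard example separating positivity from complete positivity is the transpose map on the 2 × 2 matrices: it is positive, but its amplification to 2 × 2 matrices over the 2 × 2 matrices is not. By Choi’s theorem (a standard fact not discussed in the text), a linear map φ on M_n is completely positive exactly when its Choi matrix [φ(E_ij)]_ij is positive semidefinite, and for the transpose this matrix has a negative eigenvalue (our quick check):

```python
import numpy as np

# Choi matrix of the transpose map on M_2: sum_ij E_ij (x) T(E_ij),
# where T(E_ij) = E_ji; the result is the SWAP matrix.
choi = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        E = np.zeros((2, 2))
        E[i, j] = 1
        choi += np.kron(E, E.T)

print(np.linalg.eigvalsh(choi))  # ≈ [-1, 1, 1, 1]: not completely positive
```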

In a similar vein, one says that φ as above is completely bounded (resp., completely contractive) if the maps φ_n are bounded uniformly in n (resp., each φ_n is contractive).

A fundamental theorem of Stinespring says that ucp maps are not too far away from being *-homomorphisms. More precisely, consider the following situation: suppose that H and K are Hilbert spaces and V : H → K is an isometry, that is, a linear map for which V*V = I (or, in other words, ‖Vξ‖ = ‖ξ‖ for all ξ ∈ H). Then for any representation π : A → B(K) of a C*-algebra A, we have a map φ : A → B(H) given by φ(a) := V*π(a)V, which is readily verified to be ucp. The Stinespring dilation theorem says that all ucp maps arise in this way. A particular consequence of this theorem is that ucp maps are completely contractive. The relevance of Stinespring’s theorem to our story is that certain results that hold somewhat immediately for *-homomorphisms will also hold for ucp maps (see Subsection 3.8).

We mention in passing that cp maps play an important role in quantum information theory. Indeed, one perspective on a quantum state (say on a finite-dimensional state space) is that of a positive matrix of trace 1, corresponding to the density matrix of some mixed state. A quantum channel is a linear map that is to represent some allowable physical transformation on quantum states. In particular, if it is to map density matrices to density matrices, then it should be trace-preserving and positive. However, often one needs to add ancilla bits to a given state and then apply the corresponding quantum channel on the enlarged state. The desire to have the resulting matrix be a density matrix again is equivalent to the requirement that the quantum channel be completely positive instead of merely positive. The reader can find more details in, for example, Vern Paulsen’s lecture notes Reference 50.
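A simple concrete quantum channel is the depolarizing channel on a single qubit, which mixes a state with the maximally mixed state. It is completely positive and trace-preserving, so it sends density matrices to density matrices (a toy sketch of ours):

```python
import numpy as np

def depolarize(rho, p):
    """Depolarizing channel: rho -> (1-p) rho + p I/2.

    Completely positive and trace-preserving, so density matrices map
    to density matrices."""
    return (1 - p) * rho + p * np.eye(2) / 2

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # density matrix of |0><0|
out = depolarize(rho, 0.5)
print(np.trace(out).real)        # 1.0: the trace is preserved
print(np.linalg.eigvalsh(out))   # [0.25, 0.75]: still positive
```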

3.3. Introducing von Neumann algebras

We now turn our attention to the other kind of operator algebra, the von Neumann algebra. Once again, we have a choice between a concrete definition and an abstract definition. A concrete von Neumann algebra is a unital *-subalgebra M of B(H) that is closed in the weak operator topology (WOT), where a subbasic open neighborhood of T ∈ B(H) in the WOT has the form {S ∈ B(H) : |⟨(S − T)ξ, η⟩| < ε} for some ξ, η ∈ H and some ε > 0. Said differently, the WOT is the smallest topology on B(H) making the maps T ↦ ⟨Tξ, η⟩ continuous for all ξ, η ∈ H. It is easy to check that the WOT is a coarser topology on B(H) than the operator norm topology, whence WOT-closed sets are norm closed and so every concrete von Neumann algebra is a (unital) concrete C*-algebra. In general, von Neumann algebras are much bigger than general C*-algebras due to the, well, weakness of the WOT.

A theorem of Sakai allows for an abstract reformulation: a (unital) abstract -algebra is isomorphic (as an abstract -algebra) to a concrete von Neumann algebra if and only if is isometrically isomorphic to a dual Banach space, that is, if and only if there is a closed subspace such that isometrically. In this case, is unique and is called the predual of , denoted .

Von Neumann’s bicommutant theorem allows for a purely algebraic reformulation of being a von Neumann algebra that is incredibly important to the theory. First, given a subset , set , the so-called commutant of . Note that for any set , we have that is a von Neumann subalgebra of and that . Von Neumann’s bicommutant theorem states that for any unital -subalgebra of , coincides with the WOT-closure of in ; this common algebra is the von Neumann algebra generated by . In fact, the bicommutant theorem shows that both of these coincide with the closure of in the strong operator topology (SOT), where now a subbasic open neighborhood of in the SOT has the form for some and some . A consequence of the bicommutant theorem is that a unital -subalgebra of is a von Neumann algebra if and only if .

A von Neumann algebra is called separable if its predual is a separable Banach space. This is equivalent to having a concrete representation of on a separable Hilbert space, whence one sometimes calls such a von Neumann algebra separably acting.

Just as in the case of -algebras, we can completely characterize the commutative von Neumann algebras. Given a -finite measure space , we can view by identifying with given by . It is an exercise to check that , whence is a commutative von Neumann algebra. Moreover, all commutative von Neumann algebras have this form, whence von Neumann algebra theory is often dubbed “noncommutative measure theory.”

While -homomorphisms between von Neumann algebras are automatically contractive with respect to the operator norm (as they are -algebras), since the relevant topology for defining von Neumann algebras is the WOT, the appropriate continuity condition relates to this latter topology. More precisely, a positive linear map between von Neumann algebras is normal if the restriction of to the operator norm unit ball of is continuous when and are equipped with their WOT topologies. (The restriction to the operator norm unit ball may seem slightly unsightly; this is equivalent to saying that is continuous when both and are equipped with their weak*-topologies when viewed as dual Banach spaces.) One can reformulate normality in an intrinsic way that does not refer to the particular realization of and : is normal if and only if it is positive, linear, and for every bounded increasing net of positive elements in .

3.4. States and traces

Fix a compact Hausdorff space . Given a complex Borel measure on , we can consider the associated integration functional given by which satisfies , where denotes the total variation norm of . Setting to be the Banach space of complex Borel measures on , the Riesz representation theorem implies that this association yields an isomorphism . Moreover, the probability measures on correspond to those for which is a positive map satisfying .

More generally, given a unital -algebra , a state on is a positive linear functional on with . Thus, the states on a unital abelian -algebra correspond to the integration functionals associated to probability measures on , and one thinks of states on arbitrary -algebras as the abstract analogue of such an integral.

The states on form a convex, weak*-closed subset of . The extreme points of are referred to as pure states. By the Krein–Milman theorem, finite convex combinations of pure states are weak*-dense in the space of all states. The pure states on are the vector states, that is, the states of the form for some .

A consequence of the Hahn–Banach theorem is the fact that, for any -algebra and any self-adjoint element , we have . Another consequence of the Hahn–Banach theorem is that whenever is a subalgebra of , any state on can be extended to a state on .

When is finite dimensional, every state on is of the form for a unique positive operator of trace , where denotes the trace of an operator. The operator is often called the density matrix for the state. When is not necessarily finite dimensional, the same result holds true for states on that are continuous with respect to the weak*-topology on , except that is now stipulated to be a trace-class operator (see, for example, Reference 35, Theorem 19.9).
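A hedged NumPy sketch of this correspondence (the particular density matrix below is randomly generated and purely illustrative):

```python
import numpy as np

# Sketch: in finite dimensions every state has the form a -> Tr(rho a) for a
# density matrix rho (positive, trace one).  We build a random such rho and
# check the state axioms: unitality, phi(1) = 1, and positivity, phi(a^* a) >= 0.

rng = np.random.default_rng(0)

# Random density matrix: rho = b b^* normalized to trace one.
b = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
rho = b @ b.conj().T
rho /= np.trace(rho).real

phi = lambda a: np.trace(rho @ a)

assert np.isclose(phi(np.eye(3)).real, 1.0)    # unital: phi(1) = 1
a = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
assert phi(a.conj().T @ a).real >= -1e-12      # positivity on elements a^* a
assert abs(phi(a.conj().T @ a).imag) < 1e-12   # positive values are real
```

The positivity check reflects the identity Tr(rho a* a) = Tr(a rho a*) ≥ 0, valid since rho is positive.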

The term “state” comes from quantum mechanics. A first introduction to quantum mechanics will introduce a state of a physical system as simply a unit vector in the Hilbert space associated to the physical system. This usage of the word state corresponds to the vector states described above. Later, one encounters the notion of mixed state (to accommodate the fact that results of quantum measurements are merely probabilistic ensembles of pure states), which is often defined in terms of the density matrix as defined above. The role of a state in quantum mechanics is simply to assign expected values of observables (see Reference 35, Section 19).

An important construction associated to a state is the Gelfand–Naimark–Segal (GNS) construction, which associates a representation of the -algebra to the state. Suppose that is a state on . We define a sesquilinear form on by . It is straightforward to check that this is a so-called pre-inner product on in that it satisfies all of the properties of being an inner product except that need not necessarily imply that ; when is actually an inner product, we say that is faithful. Associated to is the seminorm on given by . We obtain a Hilbert space, denoted , by quotienting out by the closed subspace of vectors with and then taking the completion. Given , we let denote its equivalence class in . It follows that there is a representation uniquely determined by the condition for all . The representation is cyclic, meaning that there is a vector for which , namely . Moreover, the vector state on restricts to on the image of . Note that is a faithful representation precisely when is faithful.
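The GNS construction can be carried out by hand in finite dimensions. Below is an illustrative NumPy sketch for the algebra of 2-by-2 matrices with a faithful state, so that no quotient or completion is needed; all particular matrices are our own choices.

```python
import numpy as np

# GNS sketch for A = M_2(C) with the faithful state phi(a) = Tr(rho a), rho a
# full-rank density matrix.  The Hilbert space is A itself with inner product
# <a, b> = phi(b^* a), the representation pi(a) is left multiplication, and
# the cyclic vector Omega is (the class of) the identity.

rho = np.diag([0.6, 0.4])                  # full rank, so phi is faithful
phi = lambda a: np.trace(rho @ a)
inner = lambda a, b: phi(b.conj().T @ a)   # the (pre-)inner product <a, b>

Omega = np.eye(2)                          # cyclic vector

a = np.array([[1, 2j], [0, -1]], dtype=complex)
x = np.array([[0.5, 1], [1j, 3]], dtype=complex)
y = np.array([[2, 0], [1, 1j]], dtype=complex)

# (1) The vector state at Omega recovers phi:  <pi(a) Omega, Omega> = phi(a).
assert np.isclose(inner(a @ Omega, Omega), phi(a))
# (2) pi(a) is adjointable with pi(a)^* = pi(a^*):  <a x, y> = <x, a^* y>.
assert np.isclose(inner(a @ x, y), inner(x, a.conj().T @ y))
```

Multiplicativity of the representation is automatic here, since pi(ab)x = (ab)x = a(bx) = pi(a)pi(b)x.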

There is a converse to the above construction: if is a cyclic representation of with cyclic vector , then one obtains a state on by and the GNS representation associated to is unitarily equivalent to .

Define and set , which we call the universal representation of . Since for any self-adjoint , it follows that is a faithful representation of . Since any representation of is a direct sum of cyclic representations and since every cyclic representation of is, up to unitary equivalence, of the form for some , we see that every representation of is unitarily equivalent to a subrepresentation of , whence the name!

An important ingredient in this story is the von Neumann algebra generated by the image of in . It can be shown that this von Neumann algebra is isometrically isomorphic to the Banach space , whence it is this notation that is usually used.

A state $\tau$ on is called a tracial state if $\tau(ab)=\tau(ba)$ for all $a$ and $b$. For example, the normalized trace on $M_n(\mathbb C)$ given by $\mathrm{tr}_n(a):=\frac{1}{n}\mathrm{Tr}(a)$ is a tracial state on $M_n(\mathbb C)$.
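A quick numerical sanity check of these axioms (illustrative only; the matrices are random):

```python
import numpy as np

# The normalized trace tr_n(a) = (1/n) Tr(a) on M_n(C) is a tracial state:
# unital, positive, and satisfying the trace identity tr_n(ab) = tr_n(ba).

n = 4
tr_n = lambda a: np.trace(a) / n

rng = np.random.default_rng(1)
a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

assert np.isclose(tr_n(np.eye(n)), 1.0)        # state: tr_n(1) = 1
assert tr_n(a.conj().T @ a).real >= 0          # positivity
assert np.isclose(tr_n(a @ b), tr_n(b @ a))    # tracial property
```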

Since is a von Neumann algebra, it makes sense to speak of normal states on von Neumann algebras. A faithful normal tracial state on a von Neumann algebra is simply referred to as a trace. A von Neumann algebra is called finite if it admits a trace. (This terminology makes much more sense if you introduce Murray–von Neumann equivalence of projections.) A tracial von Neumann algebra is a pair , where is a von Neumann algebra and is a trace on . An embedding of tracial von Neumann algebras is a normal, injective -homomorphism that preserves the trace. The normalized trace on is a trace (in the von Neumann algebra sense) on . However, when is infinite dimensional, there is no trace on .

Suppose that is a trace on . The corresponding representation is normal and the image of is WOT-closed in . Also, is separable if and only if it is separable with respect to the metric stemming from the norm . When restricted to the unit ball of , the topology induced by coincides with the SOT on it inherits from .

We end this section by showing how traces can be used to define the hyperfinite II factor, the star of this paper!

Given , there is a natural embedding of tracial von Neumann algebras given by . In this way, we obtain a directed system of tracial von Neumann algebras whose union is a -algebra we denote by . The fact that the embeddings preserve the normalized traces on the individual matrix algebras implies that has a tracial state on it. We apply the GNS construction to (which still works even though the original algebra is not necessarily complete) and take the von Neumann algebra generated by inside of . This von Neumann algebra is called the hyperfinite II factor . We will see the reason for the “II factor” in the name in the next section, but the terminology “hyperfinite” can be explained now. A separable von Neumann algebra is called hyperfinite if it contains an increasing sequence of finite-dimensional subalgebras whose union is WOT-dense. Murray and von Neumann showed that there is a unique separable hyperfinite II factor. Consequently, if we started the above construction with any instead of , we would have arrived at the same II factor, namely .
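One standard choice of connecting embedding is block-diagonal doubling, a ↦ diag(a, a); the following NumPy sketch (our own illustration) checks that it is a unital, trace-preserving *-homomorphism, which is what lets the tracial state pass to the union.

```python
import numpy as np

# Sketch of the connecting map M_n -> M_{2n}, a -> diag(a, a), used in one
# standard construction of the hyperfinite factor.  We verify it is a unital
# *-homomorphism preserving the *normalized* traces.

def embed(a):
    n = a.shape[0]
    out = np.zeros((2 * n, 2 * n), dtype=complex)
    out[:n, :n] = a
    out[n:, n:] = a
    return out

tr = lambda a: np.trace(a) / a.shape[0]   # normalized trace

rng = np.random.default_rng(2)
a = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
b = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

assert np.allclose(embed(a @ b), embed(a) @ embed(b))       # multiplicative
assert np.allclose(embed(a).conj().T, embed(a.conj().T))    # *-preserving
assert np.allclose(embed(np.eye(2)), np.eye(4))             # unital
assert np.isclose(tr(embed(a)), tr(a))                      # trace-preserving
```

Note that the unnormalized trace doubles under this map; only the normalized traces are compatible along the directed system.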

3.5. More on von Neumann algebras

The center of a von Neumann algebra is . A von Neumann algebra is called a factor when its center is trivial, that is when . It is quite easy to see that is a factor; in particular, each is a factor. This makes it plausible that is also a factor, given that it is the completion of an increasing sequence of factors—one just needs to check that no elements snuck into the center at the completion stage.

The interest in factors comes from the fact that they are the building blocks of all von Neumann algebras in the sense that every von Neumann algebra can be written as a direct integral (a generalization of direct sum) of factors, and thus the study of arbitrary von Neumann algebras can usually be reduced to studying factors.

Murray and von Neumann divided the collection of factors into three types, (creatively) called types I, II, and III. They further split the first two types into subtypes as follows. First, for each , there is a unique factor of type I, namely . The unique factor of type I is for infinite dimensional. Next, a II factor is an infinite-dimensional finite factor, that is, an infinite-dimensional factor that admits a trace. Thus, the hyperfinite II factor is indeed a II factor. A II factor is one that can be written as a proper increasing union of type II factors. Equivalently, a II factor can be written in the form for some II factor (see Subsection 3.8 for the definition of tensor products of von Neumann algebras). There is also a division of type III factors into subtypes III for , but we will not need to get into that here.

It is important to note that, while an arbitrary finite von Neumann algebra may have many traces, the trace on a finite factor is unique. We also note the crucial fact that embeds into any II factor.

As alluded to above when defining finite von Neumann algebras, the above type classification makes more sense in the context of Murray–von Neumann equivalence. However, we can still see this division using traces. Given a von Neumann algebra , let denote the set of projections in . If is a trace on , set . Note that, for the unique type I factor , we have , one value for every dimension being projected onto. On the other hand, for a II factor , one can show that . Thus, we think of II factors being like matrix factors in that they admit traces, but now we have a continuous dimension for projections.

One can explain the cases I and II using traces if one considers the unnormalized trace on given by , where is any orthonormal basis for . Note that can take the value . We then have that the possible traces of projections for the I factor are while in the type II case they are .
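In the matrix case the possible traces of projections can be listed by machine. Here is a small illustrative sketch; since every projection in a matrix algebra is unitarily equivalent to a diagonal one, the diagonal projections below already realize all possible values.

```python
import numpy as np

# In M_n(C) the normalized trace of a projection is rank/n, so the set of
# trace values is the discrete set {0, 1/n, ..., 1}; a II_1 factor, by
# contrast, realizes every value in [0, 1].  The discrete case:

n = 5
tr = lambda p: np.trace(p).real / n

values = set()
for k in range(n + 1):
    p = np.diag([1.0] * k + [0.0] * (n - k))   # rank-k diagonal projection
    assert np.allclose(p @ p, p) and np.allclose(p, p.conj().T)
    values.add(round(tr(p), 10))

assert values == {round(k / n, 10) for k in range(n + 1)}
```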

3.6. The tracial ultrapower construction and the official statement of the CEP

In general, an ultraproduct of a collection of similar structures is a structure of the same type that represents some sort of limit of these structures. There are ways of making this precise using model theory, and we will discuss this in Subsection 7.2. In this section, we show how to carry this construction out in the case of tracial von Neumann algebras and see how this allows us to precisely state the CEP.

First, one needs to introduce the notion of an ultrafilter. (For an entire book devoted to ultrafilters and their applications, see the author’s Reference 27). Given an index set , an ultrafilter on is simply a -valued finitely additive probability measure on . One often identifies with its set of measure one sets and writes instead of . From this perspective, one can alternatively view an ultrafilter on as a division of the subsets of into two categories—-large and -small—satisfying the following requirements: is -large while is -small; the intersection of two -large sets is -large; a superset of a -large set is once again -large; for any subset of , exactly one of or is -large.

Following typical measure-theoretic terminology, given a property that may or may not hold of elements of , we may write “for -almost all , holds” when belongs to .

Given a bounded sequence of complex numbers, it is straightforward to show that there is a unique complex number such that, for every , we have for -almost all . This unique complex number is called the -ultralimit of the sequence , denoted or simply . The fact that every bounded sequence has a limit (in this generalized sense) is but one indication of the utility of ultrafilters.

Given , the unique ultrafilter on for which is called the principal ultrafilter generated by , and is denoted . An ultrafilter on is called nonprincipal if it is not principal. Equivalently, is nonprincipal if for all finite . It is easy to check that , whence ultralimits along principal ultrafilters do not really capture a genuine notion of limit. It is a basic fact that, for any infinite set , there is a nonprincipal ultrafilter on (in fact there are many!). However, any proof of the existence of such ultrafilters is necessarily nonconstructive, and thus it is impossible to explicitly describe a nonprincipal ultrafilter (see Reference 27, Chapter 5 for more on these foundational matters concerning ultrafilters). In what follows, the specific choice of nonprincipal ultrafilter will be irrelevant for our purposes.

We now come to the tracial ultraproduct construction. Fix a family $(M_i,\tau_i)_{i\in I}$ of tracial von Neumann algebras and an ultrafilter $\mathcal U$ on $I$. We first set
$$\textstyle\prod^{\ell^\infty} M_i:=\Big\{a\in\prod_{i\in I}M_i \,:\, \sup_{i\in I}\|a_i\|<\infty\Big\};$$

that is, collects all those sequences from the Cartesian product for which the operator norms of the coordinates are uniformly bounded. It is readily verified that is a -algebra under the supremum norm. It is tempting to try to define a tracial state on by declaring , which makes sense given that the sequence is a uniformly bounded sequence of complex numbers (a consequence of the uniform bound on the operator norms of the coordinates of ). Unfortunately, if each is a positive element of , whence is a positive element of , with the property that , then we have that even though may not be zero. In other words, this definition leads to a tracial state on that is not faithful.
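The failure of faithfulness can be seen concretely: take rank-one projections $p_n\in M_n$. A sketch, using the normalized trace on each matrix algebra (the specific sequence is our illustration):

```python
import numpy as np

# Why the naive ultraproduct trace fails to be faithful: rank-one projections
# p_n in M_n all have operator norm 1 (so the sequence is a legitimate element
# of the bounded product), yet tr_n(p_n^* p_n) = 1/n -> 0.  Along any
# nonprincipal ultrafilter the limit trace of this nonzero element is 0.

norms, traces = [], []
for n in range(1, 60):
    p = np.zeros((n, n))
    p[0, 0] = 1.0                        # rank-one projection in M_n
    norms.append(np.linalg.norm(p, 2))   # operator (spectral) norm
    traces.append(np.trace(p.conj().T @ p).real / n)

assert all(np.isclose(s, 1.0) for s in norms)   # uniformly bounded, nonzero
assert np.isclose(traces[-1], 1 / 59)           # tr_n(p^* p) = 1/n -> 0
```

Quotienting by exactly such sequences is what restores faithfulness below.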

We fix the above problem by defining $\mathcal I_{\mathcal U}$ to be the set of those uniformly bounded sequences $a$ with $\lim_{\mathcal U}\tau_i(a_i^*a_i)=0$. While there are a lot of things to check, we have that:

$\mathcal I_{\mathcal U}$ is a two-sided ideal in the $*$-algebra of uniformly bounded sequences,

the quotient by $\mathcal I_{\mathcal U}$ is a von Neumann algebra,

the induced tracial state on the quotient given by $\tau(a+\mathcal I_{\mathcal U}):=\lim_{\mathcal U}\tau_i(a_i)$ is a trace (that is, a faithful, normal tracial state), where $a+\mathcal I_{\mathcal U}$ denotes the coset of $a$ modulo $\mathcal I_{\mathcal U}$.

The resulting tracial von Neumann algebra is denoted and is called the tracial ultraproduct of the family with respect to . In a certain sense, one should think of this ultraproduct as some sort of limit of the constituent tracial von Neumann algebras, an analogy that will become clearer as we proceed. When each is a finite factor, then the trace on each factor is unique and we simplify the notation to . When each , we simply write and speak of the ultrapower of with respect to . Similarly, the ultrapower of a finite factor is denoted . We view any tracial von Neumann algebra as a subalgebra of via the diagonal embedding which maps an element to the coset of the diagonal sequence modulo .

When is principal, one can verify that , whence this is not a terribly interesting procedure. The true power of the ultraproduct construction comes when one uses a nonprincipal ultrafilter, for then the ultraproduct is sort of an average or limit of the constituent tracial von Neumann algebras.

If , then is also finite dimensional. Otherwise, is quite large, in fact, nonseparable, even if each is separable. It is quite common to hear expressions such as “every II factor in this paper (or talk) is separable unless it isn’t.” This tautology refers to the fact that often researchers are only interested in separable tracial von Neumann algebras, and the only nonseparable II factors that one might encounter are those obtained from a nonprincipal ultraproduct of a family of separable II factors. (We are being a bit sloppy—nonprincipality only guarantees nonseparability when the index set is countable; otherwise, one needs the mild assumption of countable incompleteness.)

We can now officially state the CEP.

Connes embedding problem.

Given any nonprincipal ultrafilter on , every separable tracial von Neumann algebra embeds into ; that is, it admits a trace-preserving injective -homomorphism into .

Let us make several remarks on variations of the statement of the CEP:

It can be shown using some basic model theory that the CEP is equivalent to the statement that every separable tracial von Neumann algebra embeds into for some nonprincipal ultrafilter on . This fact can also be witnessed by using a simple ultrafilter-free equivalent reformulation of the CEP known as the microstate conjecture, discussed below.

The restriction to separable tracial von Neumann algebras is not necessary. It can be shown that ultrapowers of with respect to certain kinds of ultrafilters on larger index sets, known as good ultrafilters, lead to larger ultrapowers that can embed tracial von Neumann algebras of larger density character. In other words, we can reformulate CEP by saying every tracial von Neumann algebra embeds into some ultrapower of .

The validity of the CEP does not change if we restrict ourselves to embedding II factors into an ultrapower of . The reason for this is due to the fact that every tracial von Neumann algebra embeds into a II factor, say . Here is the group von Neumann algebra of the group of integers (see Subsection 3.7) and denotes the free product of tracial von Neumann algebras.

One can replace with a nonprincipal ultraproduct of matrix algebras without changing the validity of the CEP. More precisely, CEP is equivalent to the statement that, for any nonprincipal ultrafilter on , every separable tracial von Neumann algebra embeds into . This follows from the fact that each embeds in , whence embeds in , while there are conditional expectations , and the ultralimit of these expectations yields an embedding . (See Subsection 3.9 for the definition of conditional expectation.)

The last alternate reformulation makes the equivalence with the so-called microstate conjecture more apparent. The microstate conjecture states that for any tracial von Neumann algebra $(M,\tau)$, any finite collection $p_1,\dots,p_m$ of $*$-polynomials in the noncommuting variables $X_1,\dots,X_n$, any $x_1,\dots,x_n$ in the operator norm unit ball of $M$, and any $\varepsilon>0$, there are $k\geq 1$ and $a_1,\dots,a_n$ in the operator norm unit ball of $M_k(\mathbb C)$ such that
$$|\tau(p_j(x_1,\dots,x_n))-\mathrm{tr}_k(p_j(a_1,\dots,a_n))|<\varepsilon \quad\text{for all } j=1,\dots,m.$$

In other words, any finite configuration that can be obtained in some tracial von Neumann algebra can be approximately obtained in some matrix algebra. It is this formulation of CEP that appeared in connection with free entropy as discussed in Subsection 2.1.

3.7. Operator algebras coming from groups

A large source of operator algebras arise from groups and these algebras play an important role in our story.

First, a unitary representation of a (discrete) group is a group homomorphism , where is a -algebra and denotes the group of unitary elements of .

Suppose that $G$ is a group. Let $\ell^2(G)$ be the Hilbert space formally generated by an orthonormal basis $(\delta_h)_{h\in G}$. For any $g\in G$, define $\lambda(g)$ to be the linear operator on $\ell^2(G)$ determined by $\lambda(g)\delta_h=\delta_{gh}$ for all $h\in G$. Notice that $\lambda(g)$ is unitary for all $g\in G$ (since $\lambda(g)^*=\lambda(g^{-1})$), and so $g\mapsto\lambda(g)$ is a unitary representation of $G$, called the left regular representation of $G$.

Recall that the group algebra consists of formal linear combinations with only finitely many nonzero coefficients. There is a natural -algebra structure on , the addition and multiplication being the obvious ones and the -operation being given by . is in fact a unital -algebra with unit , where denotes the identity of the group.

The left regular representation of extends by linearity to a unital -algebra homomorphism .

The reduced group -algebra of , denoted , is the closure of in the operator norm topology on . The group von Neumann algebra of , denoted , is the closure of in the WOT on . Moreover, the vector state on corresponding to yields a tracial state on and a trace on .
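For a finite group everything above can be written down explicitly; here is an illustrative sketch for the cyclic group of order 5 (the choice of group and size is ours):

```python
import numpy as np

# Left regular representation of Z/nZ: lambda(g) permutes the basis vectors
# delta_h -> delta_{g+h}.  We check unitarity, multiplicativity, and that the
# vector state at delta_0 gives the canonical trace tau(lambda(g)) = [g = e].

n = 5
def lam(g):
    u = np.zeros((n, n))
    for h in range(n):
        u[(g + h) % n, h] = 1.0          # delta_h -> delta_{g+h}
    return u

for g in range(n):
    assert np.allclose(lam(g) @ lam(g).T, np.eye(n))           # unitary
    for h in range(n):
        assert np.allclose(lam(g) @ lam(h), lam((g + h) % n))  # homomorphism

delta0 = np.eye(n)[:, 0]
tau = lambda g: delta0 @ lam(g) @ delta0    # vector state at delta_e
assert tau(0) == 1.0
assert all(tau(g) == 0.0 for g in range(1, n))
```

The last two assertions are the finite-group shadow of the trace on the group von Neumann algebra picking out the coefficient of the identity.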

When is finite, and is generally considered uninteresting (to operator algebraists). When is infinite, is a II factor precisely when is an infinite conjugacy class (ICC) group, that is, when all nontrivial conjugacy classes of are infinite.

The procedure of taking the group von Neumann algebra of a group can “forget” a lot of the algebraic structure of the group. For example, it follows from Connes’ fundamental work Reference 15 that all ICC amenable groups have group von Neumann algebra isomorphic to .

In what follows, the reduced group -algebra of a group is not quite as important as a second -algebra associated to a group, the so-called universal (or maximal) group -algebra. To define this, we define a norm on the group algebra by
$$\|x\|:=\sup\{\|\pi(x)\|\,:\,\pi\text{ a unitary representation of the group}\}.$$

It is readily verified that this is a well-defined (that is, finite) -norm on . The completion of with respect to this norm is thus a -algebra, called the universal -algebra associated to , denoted . Since the above norm is easily seen to be the maximal -norm on , it is sometimes called the maximal norm on and the completion the maximal group -algebra of . It follows immediately from the definition that any unitary representation of extends uniquely to a -homomorphism .

A particular corollary of this universal property of the universal group -algebra is that is surjectively universal, where is the free group on a countably infinite set of generators. More precisely, given any separable -algebra , there is a surjective -homomorphism . To see that this is the case, just note that there is a countable set of unitaries that generates (as a -algebra); now apply the universal property to the surjective unitary representation obtained by mapping the th basis element of onto .

Another consequence of the universal property is that if is a group homomorphism, then we get an induced -algebra homomorphism . A less obvious fact is that if is a subgroup of , then is naturally a -subalgebra of . This follows immediately from the definitions once one knows that any unitary representation of can be extended to a unitary representation of (via a technique known as induction; see Reference 22, Chapter 6).

By considering the left-regular representation of , we immediately see that there is a canonical surjective -homomorphism . In general, this map is not an isomorphism; that is, it often has nontrivial kernel. In fact, the canonical map is an isomorphism precisely when is amenable.

One final fact will prove useful later: for any two groups and , we have , where denotes the free product of groups and denotes the unital free product of -algebras, which is slightly annoying to define but whose properties can be guessed from the terminology.

3.8. The problem with -algebra tensor products

Before discussing the issues associated with defining tensor products of -algebras, we first recall the tensor product construction for vector spaces. Let $V$ and $W$ be vector spaces over the same field $K$. We let $F(V\times W)$ be the free $K$-vector space on the set $V\times W$, that is, all formal linear combinations of elements of $V\times W$ with only finitely many nonzero coefficients. $F(V\times W)$ carries an obvious $K$-vector space structure. The tensor product of $V$ and $W$, denoted $V\otimes W$, is the quotient of $F(V\times W)$ by the subspace generated by elements of the following form, for $v,v_1,v_2\in V$, $w,w_1,w_2\in W$, and $\lambda\in K$:

$(v_1+v_2,w)-(v_1,w)-(v_2,w)$,

$(v,w_1+w_2)-(v,w_1)-(v,w_2)$,

$(\lambda v,w)-\lambda(v,w)$,

$(v,\lambda w)-\lambda(v,w)$.

While it is more common to write instead of , we will reserve for analytic tensor products (to be defined shortly) and will use for the above algebraic tensor product.

The equivalence class of in is denoted . Thus, an arbitrary element of may be written as a formal linear combination , but not necessarily uniquely.

If and are both finite dimensional, then so is with ; if is a basis for and is a basis for , then is a basis for .

It is clear from the construction that if and are -linear maps, then there is a -linear map uniquely determined by .

If and are Hilbert spaces, then the algebraic tensor product comes naturally equipped with an inner product uniquely determined by
$$\langle\xi_1\otimes\eta_1,\ \xi_2\otimes\eta_2\rangle:=\langle\xi_1,\xi_2\rangle\,\langle\eta_1,\eta_2\rangle.$$

The completion of with respect to this inner product is then a Hilbert space, denoted and called the Hilbert space tensor product of and . If and are orthonormal bases for and , respectively, then is an orthonormal basis for . Moreover, if and are bounded linear maps, then the algebraic tensor product map extends uniquely to a bounded linear map .
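In coordinates, both the inner product and the operator tensor product are modeled by the Kronecker product; a short sketch (the vectors, matrices, and dimensions are our own choices):

```python
import numpy as np

# Finite-dimensional model of the Hilbert space tensor product via np.kron:
# the inner product is multiplicative on elementary tensors, and the tensor
# product of operators acts coordinatewise: (S (x) T)(xi (x) eta) = S xi (x) T eta.

rng = np.random.default_rng(3)
xi1, xi2 = rng.standard_normal(3), rng.standard_normal(3)
eta1, eta2 = rng.standard_normal(2), rng.standard_normal(2)

# <xi1 (x) eta1, xi2 (x) eta2> = <xi1, xi2> <eta1, eta2>
assert np.isclose(np.kron(xi1, eta1) @ np.kron(xi2, eta2),
                  (xi1 @ xi2) * (eta1 @ eta2))

S = rng.standard_normal((3, 3))
T = rng.standard_normal((2, 2))
# (S (x) T)(xi (x) eta) = (S xi) (x) (T eta)  -- the mixed-product property
assert np.allclose(np.kron(S, T) @ np.kron(xi1, eta1),
                   np.kron(S @ xi1, T @ eta1))
```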

We now come to the task of defining tensor products of operator algebras. We first note that if and are two -algebras, there is a natural -algebra operation on their algebraic tensor product determined by

$(a_1\otimes b_1)(a_2\otimes b_2)=a_1a_2\otimes b_1b_2$,

$(a\otimes b)^*=a^*\otimes b^*$.

If and are both unital, then so is with unit .

The tensor product of von Neumann algebras is fairly uncontroversial. Consider concretely represented von Neumann algebras and . It is straightforward to check that the algebraic tensor product is naturally a subset of (using the tensor product of linear maps construction above) and that the -algebra structure induced by this identification agrees with the one placed on it in the previous paragraph. The von Neumann algebra tensor product of and is then the WOT closure of in . One can verify that this construction is indeed independent of the choice of representations of and .

The story for -algebras, on the other hand, is far more complicated in general. Fix -algebras and . We seek -norms on , for then the completion of with respect to such a -norm will be a -algebra tensor product of and .

One natural choice is to proceed as in the case of von Neumann algebras, that is, fix concrete representations and , and to consider the operator norm on . One can verify that this norm on is a -norm and is independent of the choice of representations. This norm is called the minimal tensor norm, denoted . The justification for the name comes from a theorem of Takesaki showing that is indeed the minimal -norm on . The completion of with respect to is denoted and is called the minimal tensor product of and . One should be aware of the fact that some authors simply write instead of .
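For matrix algebras the minimal (spatial) norm is directly computable: it is the operator norm of the Kronecker product, and it is multiplicative on elementary tensors. A sketch (random matrices, for illustration only):

```python
import numpy as np

# On M_3 (x) M_4 the minimal tensor norm of an elementary tensor a (x) b is
# the spectral norm of kron(a, b), and ||a (x) b|| = ||a|| ||b||, since the
# singular values of kron(a, b) are the pairwise products of those of a and b.

rng = np.random.default_rng(4)
a = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
b = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

op = lambda x: np.linalg.norm(x, 2)   # operator (largest-singular-value) norm
assert np.isclose(op(np.kron(a, b)), op(a) * op(b))
```

Of course, the subtlety in the text concerns norms of general (non-elementary) elements of the algebraic tensor product, where the min and max norms can diverge.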

A useful property of the minimal tensor product is the following result, which follows from the independence of the choice of representation. If and are -homomorphisms, then the linear map extends uniquely to a -homomorphism . Using the Stinespring dilation theorem, one can generalize the conclusion of the previous sentence to ucp maps as follows. If and are ucp maps, then there is a unique ucp map determined by .

Another natural -norm to consider on is the so-called maximal norm defined by
$$\|x\|_{\max}:=\sup\{\|\pi(x)\|\,:\,\pi\text{ a }*\text{-homomorphism from the algebraic tensor product into some }\mathcal B(H)\}.$$

In connection with this formula, it is useful to observe that a -homomorphism restricts to -homomorphisms and with commuting ranges and, conversely, any two -homomorphisms and with commuting ranges yield a -homomorphism uniquely determined by . (The commutativity of the ranges of and ensures that this map is in fact a -homomorphism.) It is clear that is a -norm on ; the completion of with respect to is called the maximal tensor product of and , denoted . Any pair of -homomorphisms and with commuting ranges yields a -homomorphism that extends . Consequently, really is the largest -norm on . Using a more complicated Stinespring argument than the one mentioned above, one can show that any pair of ucp maps and with commuting ranges yields a ucp map uniquely determined by .

Before moving forward, we notice the following two facts, which are readily verified from the definitions. For any pair of groups $G_1$ and $G_2$, we have

$C^*(G_1\times G_2)\cong C^*(G_1)\otimes_{\max}C^*(G_2)$,

$C^*_r(G_1\times G_2)\cong C^*_r(G_1)\otimes_{\min}C^*_r(G_2)$.

Returning to the general discussion, we have defined two extreme -norms on . In general, they can be different. For example, it can be shown that the maximal and minimal norms on are distinct. The corresponding question for turns out to be equivalent to CEP, as we will soon see! Another somewhat surprising result is that the maximal and minimal norms on (for infinite dimensional) are also distinct, a result due to Junge and Pisier Reference 42. In fact, Ozawa and Pisier Reference 49 showed that there exist at least continuum many different -norms on when is infinite dimensional.

We say that form a nuclear pair if there is a unique -norm on , that is, if the minimal and maximal norms on coincide. We also say that is nuclear if is a nuclear pair for every -algebra . There are many interesting examples of nuclear -algebras. For example, is nuclear for all , the reason being that , which is already a -algebra with a unique -norm. A more interesting example coming from groups is that is nuclear if and only if is nuclear if and only if is amenable (in which case ).

The following theorem of Kirchberg Reference 44 will be central moving forward.

Theorem 3.1.

$(C^*(\mathbb F_\infty),\mathcal B(H))$ is a nuclear pair.

Note that neither of these algebras is nuclear. The importance of Kirchberg’s theorem stems from the fact that is surjectively universal while is injectively universal.

We also note that if is a subgroup of and is any -algebra for which is a nuclear pair, then so is . In particular, whether or not is a nuclear pair is independent of the choice of , a fact that will come up in our discussion of Kirchberg’s QWEP problem. We will also need the fact that if is a surjective group morphism for which the canonical surjection has a ucp lift (meaning a ucp map for which is the identity on ), then being a nuclear pair implies is a nuclear pair.

3.9. Kirchberg’s QWEP problem

It follows from the definition of the minimal tensor product that for any inclusion of -algebras, one has that (isometrically) for any other -algebra . On the other hand, with the same setup, while there will always be a -homomorphism , this homomorphism need not be injective, that is, isometric. One case, however, where this does hold is the inclusion . That is, it follows from the definitions that isometrically for all -algebras and .

Suppose again that and further suppose, for the sake of argument, that there is a ucp map with for all . By a fact pointed out in the previous subsection, we obtain a ucp (and thus contractive) map . By the observation made in the previous paragraph, it follows that the canonical map is an isometric inclusion.

If is a -subalgebra of and is a linear map for which for all , then a theorem of Tomiyama says that the following are equivalent:

is cp,

is contractive,

is a conditional expectation, that is, for all and .

When such a map exists, we say that is cp-complemented in . The nomenclature comes from Banach space theory, for a Banach space is complemented in a superspace if and only if there is a contractive linear map that is the identity on . In the previous paragraph, we merely had to posit the existence of a ucp map , whence we call a weak conditional expectation and say that is weakly cp-complemented in . Consequently, we proved that if is weakly cp-complemented in , then isometrically for any other -algebra . With more work, one can actually show that the converse of this observation holds as well. In fact, by the surjective universality of , we see that is weakly cp-complemented in if and only if isometrically.

There are two notable examples of cp-complemented inclusions worth pointing out now.

If M is a finite von Neumann algebra, then any von Neumann subalgebra N of M is cp-complemented. To see this, fix a trace τ on M and note that L²(N, τ) is a closed subspace of L²(M, τ). One shows that the orthogonal projection L²(M, τ) → L²(N, τ) actually restricts to a conditional expectation M → N.

If H is a subgroup of G, then C*(H) is cp-complemented in C*(G). To see this, one shows that the map defined by setting u_g ↦ u_g for all g ∈ H while u_g ↦ 0 for all g ∈ G ∖ H extends to a conditional expectation C*(G) → C*(H).
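The first example can be checked concretely in finite dimensions. In the Python sketch below (a minimal illustration, not from the text: we take M = M_2(ℂ) with its normalized trace and N ⊆ M the diagonal subalgebra), the conditional expectation is simply the map keeping the diagonal of a matrix, and we verify that it is trace preserving and satisfies the bimodule identity E(axb) = aE(x)b from Tomiyama’s theorem.

```python
import numpy as np

def E(x):
    """Orthogonal projection of M_2(C) onto the diagonal subalgebra N,
    taken with respect to the trace inner product <x, y> = tr(y* x)."""
    return np.diag(np.diag(x))

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
a = np.diag(rng.standard_normal(2))  # an element of N
b = np.diag(rng.standard_normal(2))  # another element of N

# E is the identity on N, preserves the trace, and satisfies the
# bimodule property E(a x b) = a E(x) b of a conditional expectation.
assert np.allclose(E(a), a)
assert np.isclose(np.trace(E(x)), np.trace(x))
assert np.allclose(E(a @ x @ b), a @ E(x) @ b)
```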

Returning to the general situation, if A is weakly cp-complemented in every superalgebra (or equivalently, in B(H) for any faithful representation A ⊆ B(H)), then we say that A has the weak expectation property (WEP). An insight of Kirchberg Reference 44 was to use his theorem proving that (B(H), C*(F_∞)) is a nuclear pair to provide an alternate test for having the WEP:

Theorem 3.2.

For a C*-algebra A ⊆ B(H), the following are equivalent:

(1)

A has the WEP.

(2)

For every C*-algebra C, A ⊗_max C ⊆ B(H) ⊗_max C isometrically.

(3)

(A, C*(F_∞)) is a nuclear pair.

Proof.

We already observed the equivalence of (1) and (2). Now suppose that A has the WEP. Using (2), Kirchberg’s theorem, and the fact that ⊗_min preserves inclusions, we then have

A ⊗_max C*(F_∞) ⊆ B(H) ⊗_max C*(F_∞) = B(H) ⊗_min C*(F_∞) ⊇ A ⊗_min C*(F_∞),

and since these identifications are the identity on elementary tensors, the maximal and minimal norms on the algebraic tensor product coincide; that is, (A, C*(F_∞)) is a nuclear pair. Conversely, suppose that (A, C*(F_∞)) is a nuclear pair. It suffices to show that A ⊗_max C*(F_∞) ⊆ B(H) ⊗_max C*(F_∞) isometrically. However, this follows immediately from the assumption, Kirchberg’s theorem, and the fact that ⊗_min preserves inclusions.

What are the ramifications of assuming that C*(F_∞) itself has the WEP, that is, that (C*(F_∞), C*(F_∞)) is a nuclear pair? First, as observed above, this is equivalent to (C*(F_κ), C*(F_κ)) being a nuclear pair for any fixed κ ≥ 2. Next, if we define the QWEP to be the property that a C*-algebra is a quotient of a C*-algebra with the WEP, then C*(F_∞) having the WEP implies that all separable C*-algebras have the QWEP, since every separable C*-algebra is a quotient of C*(F_∞). We note that it is common to see both phrases, “A has the QWEP” and “A is QWEP” (although the latter is of course grammatically incorrect).

Conversely, suppose that all separable C*-algebras have the QWEP. Then certainly C*(F_∞) has the QWEP. However, C*(F_∞) has another property, the so-called lifting property (LP), which, when combined with QWEP, actually implies the WEP. A C*-algebra A has the lifting property if, for any ucp map φ: A → B/J, where B is a C*-algebra and J is a closed, two-sided ideal in B, there is a ucp map ψ: A → B for which φ = π ∘ ψ, where π: B → B/J is the canonical quotient map. Said more casually, A has the LP if every ucp map into a quotient C*-algebra has a ucp lift. Now suppose that A has the LP and the QWEP, as witnessed by a quotient map π: B → A with B having the WEP. Let ψ: A → B be a ucp lift of the identity map A → A, that is, π ∘ ψ = id_A. Then, by applying the ucp lift ψ, we see that A is weakly cp-complemented in B, whence A also has the WEP.

We have thus arrived at the following.

Theorem 3.3.

The following assertions are equivalent:

(1)

C*(F_∞) has the WEP.

(2)

For some (equivalently, any) κ ≥ 2, (C*(F_κ), C*(F_κ)) is a nuclear pair.

(3)

Every separable C*-algebra has the QWEP.

Any of the above equivalent statements is known as Kirchberg’s QWEP problem. (We might be tempted to follow the CEP’s lead and call this the QWEPP, but that looks a bit silly.) As mentioned above, QWEP combined with LP implies WEP. It turns out that it suffices to consider a local version of LP, aptly called the local lifting property (LLP), and the same argument works; that is, QWEP together with LLP implies WEP. Thus, another equivalent formulation of the QWEP problem is the statement that LLP implies WEP.

We mention one other equivalent formulation of the QWEP problem that is not relevant for our particular story but is fascinating nonetheless: the QWEP problem is equivalent to the statement that C*(F_∞) ⊗_min C*(F_∞) has a faithful tracial state. What makes this interesting is that the corresponding statement is true for the reduced group C*-algebra C*_λ(F_∞) ⊗_min C*_λ(F_∞) (simply because it is true for any reduced group C*-algebra) and it is true for C*(F_∞) ⊗_max C*(F_∞) (a result due to Choi).

The classes of WEP and QWEP algebras enjoy a number of closure properties relevant to the proofs that follow. Rather than enumerate them all now, we will simply quote them when we need them later in the paper.

3.10. From CEP to QWEP

In this section, we show how a positive solution to the CEP implies a positive solution to the QWEP problem. While these statements are indeed equivalent, we focus on the direction that we need in order to give a negative solution to CEP.

So how does the CEP get involved in a story about C*-algebras? The first clue is that, for von Neumann algebras, the WEP had already been well studied and is referred to as injectivity. A not so trivial result is that any hyperfinite von Neumann algebra is injective, whence R is injective. The extremely deep work of Connes in Reference 15, where the CEP originally comes from, proved the converse, namely, any separable injective II₁ factor must be hyperfinite, and thus isomorphic to R. Thankfully, we do not need this result in our story, although the proof that R is injective (and thus has the WEP) is difficult enough.

Now that we know that R is injective, so is ℓ^∞(R), as the WEP is closed under the formation of direct sums. Since the ultrapower R^U is a C*-algebra quotient of ℓ^∞(R), we see that R^U is QWEP! Okay, it smells like we are getting closer.

Now suppose that M is a tracial von Neumann algebra that embeds into R^U in a trace-preserving manner. Without loss of generality, let us assume that M is simply a subalgebra of R^U. By a fact pointed out above, this means that M is cp-complemented in R^U. Since QWEP is preserved by (weakly) cp-complemented inclusions, we conclude that M is also QWEP.

We have thus arrived at the following statement: a positive solution to the CEP implies that all finite von Neumann algebras are QWEP!

But we are still talking about von Neumann algebras. How do we bridge the gap into talking about C*-algebras? Well, recall that every C*-algebra A has a canonically associated von Neumann algebra, namely its bidual A**. Since A is tautologically weakly cp-complemented in A**, in order to show that A has the QWEP, it suffices to show that A** has the QWEP (again using the closure of QWEP under weakly cp-complemented subalgebras).

While A** is a von Neumann algebra, it may not be finite. How do we get the CEP to help us with nonfinite von Neumann algebras?

Given any von Neumann algebra M, there is an important one-parameter group of automorphisms of M, known as the modular group. When M is finite, the modular automorphism group is trivial and thus plays no role. But in the general theory, it is an indispensable tool. (For all of the fancy type III material discussed in this paragraph, Takesaki’s book Reference 58 is the canonical reference.) Akin to the semidirect product construction in group theory, there is a crossed product construction that associates to any group G acting on a von Neumann algebra M a larger von Neumann algebra M ⋊ G where this action is implemented by unitaries. Thus, we are entitled to consider the crossed product algebra M ⋊ ℝ corresponding to the action of ℝ on M via the modular automorphism group. A serious theorem of Takesaki states that M ⋊ ℝ is semifinite. We came across semifinite factors above. For a general von Neumann algebra, we can take semifinite to mean that the algebra contains an increasing family of finite subalgebras whose union generates the von Neumann algebra. Since QWEP is preserved under increasing unions and a von Neumann algebra is QWEP if it contains a WOT-dense C*-subalgebra with QWEP, we see that M ⋊ ℝ is QWEP. An alternative approach is to use the fact that a von Neumann algebra is QWEP if and only if all of the factors involved in its direct integral decomposition are QWEP. Thus, to show that a semifinite von Neumann algebra is QWEP, it suffices to consider the case of factors. But then a semifinite factor is of the form N ⊗ B(H) for a finite factor N, and one can use the fact that the von Neumann algebra tensor product of QWEP von Neumann algebras is again QWEP. Either way, we now know that M ⋊ ℝ is QWEP.

Finally, it is a general fact that any von Neumann algebra M is always cp-complemented in any crossed product M ⋊ G; since M ⋊ ℝ is QWEP for any von Neumann algebra M, it follows that M itself is also QWEP. Applying this fact to M = A**, we see that A**, and thus A, are also QWEP for any C*-algebra A. This finishes the proof that a positive solution to the CEP implies a positive solution to the QWEP problem.

As mentioned before, a positive solution to the QWEP problem implies a positive solution to the CEP. The proof involves the theory of amenable traces, which we will not go into now, but which will be important in our alternate derivation of the failure of the CEP from MIP* = RE given in Subsection 7.5.

4. A crash course in complexity theory

In this section, we introduce the basic notions from (classical) complexity theory needed to understand the statement of the result MIP* = RE. Essentially all of this material (apart from the business about nonlocal games) was taken from the book Reference 3.

4.1. Turing machines

A Turing machine is one of the more popular mathematical formulations of an idealized computing device. Formally, a Turing machine is a pair (Q, δ), where Q is a finite set of states of the machine and δ is the transition function (described further below); here ▷ and □ are two special symbols, the start symbol and the blank symbol, whose significance will be seen shortly. We always assume that Q contains two special states, namely the start state q_start and the halting state q_halt.

Throughout, for any n, {0,1}ⁿ denotes the set of binary strings of length n while {0,1}* denotes the set of all finite binary strings. Given x ∈ {0,1}*, |x| denotes the length of the string x.

Here is how one should envision the computation performed by the Turing machine upon some input x ∈ {0,1}*. The machine contains three tapes, which are one-way infinite strips containing boxes which, at any given moment in the computation, contain exactly one symbol from {0, 1, ▷, □}. The first tape is the input tape, the second tape is the work tape, and the last tape is the output tape. At the beginning of the computation, the input tape has the start symbol ▷ in the first box, then the input string x in the next |x| boxes, and then the remainder of the boxes contain the blank symbol □. Both the work tape and the output tape contain the start symbol in the first box and then blank symbols in the remaining boxes. One envisions each tape having a “tape head” which is placed over exactly one box in the tape at any given moment during the computation; the tape head for the input tape can read the symbol in that box while the tape heads for the other two tapes can both read the symbol in that box and potentially change it to a new symbol.

The Turing machine begins the computation in the start state q_start with the tape head above the leftmost box (which contains the start symbol ▷) for each tape. In general, at any given moment during the computation, the Turing machine is in some state q with tape heads reading boxes (representing how far they are from the beginning of their respective tapes) and with symbols inside of each of the boxes being read. The Turing machine then applies δ to the current state and the three symbols being read, obtaining a tuple which should be interpreted as follows:

The box in the work tape (resp., output tape) that the tape head is reading should have its contents replaced by the first (resp., second) new symbol specified by the tuple.

The tape head for the input tape should move to the left if the corresponding instruction is L, to the right if it is R, and should stay in the same place if it is S. Similar actions should be taken corresponding to the instructions for the work tape and the output tape. If any tape head is at the leftmost box and the instruction is L, then the tape head should also stay in the same place.

After executing these acts, the Turing machine should now enter the new state specified by the tuple.

The machine continues running in this fashion. If the machine ever enters the state q_halt, then the machine stops running, that is, no further modification of the three tapes will take place. In this case, the output of the computation upon input x is the longest initial string on the output tape not containing any blank symbols. (If all the symbols are blank, then the output is considered the empty string.)

Every Turing machine M computes a partial function f_M: {0,1}* → {0,1}* whose domain consists of those strings x for which M halts upon input x; in this case, we define f_M(x) to be the corresponding output. We sometimes abuse notation and identify M with f_M itself, that is, we may write M(x) instead of f_M(x). We say a partial function f is computable if f = f_M for some Turing machine M.

Given a function T: ℕ → ℕ, we say that the Turing machine M runs in T-time if, for any input x, upon input x, M halts in at most T(|x|) steps. Note that if M runs in T-time for some function T, then f_M is a total function. We say that M is a polynomial time (resp., exponential time) Turing machine if M runs in T-time for T(n) = Cn^d (resp., T(n) = C·2^{n^d}) for some constants C and d.

A language is simply a subset L of {0,1}*. We identify a language L with its characteristic function χ_L: {0,1}* → {0,1}. Consequently, it makes sense to speak of L being computable by a Turing machine. Usually, a language is described in terms of some mathematical problem under consideration, e.g., the set of finite graphs that can be 3-colored. The implicit assumption is that there is some natural (and effective) way of coding the set of such graphs as a set of finite binary strings. In the following, for all languages introduced in this manner, we assume that the reader can figure out how such a coding might be performed.

Turing machines are one of several mathematically precise models for computation; other alternatives include register machines and the class of recursive functions. However, all known reasonable models of computation lead to precisely the same class of computable functions. This is evidence for the Church–Turing thesis, which states that this common class of functions coincides with our heuristic notion of what a computable function should be. (See Reference 18, Chapter 3 for more on this.) One can even formulate a stronger version of the thesis, which states that even when taking into account the efficiency of computations, that is, how fast one can compute a function, the choice of model is still irrelevant. (It is plausible that quantum computers could pose a serious threat to the strong Church–Turing thesis.) The import of the strong Church–Turing thesis for us in these notes is that, in what follows, when claiming that a certain problem can be solved in a certain efficient manner, we never need to actually write down the Turing machine that witnesses this fact. Instead, one can write down an argument using pseudo-code and the reader can (if they choose to) convert the pseudo-code into an actual Turing machine program.
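To make the formalism concrete, here is a minimal simulator in Python for a simplified one-tape variant of the machines described above (the three-tape model of this section, with its separate input and output tapes, involves more bookkeeping but no extra power). The particular machine `flip` and the blank symbol "_" are illustrative choices, not anything from the text.

```python
def run_tm(delta, tape, q_start="start", q_halt="halt", max_steps=10_000):
    """Simulate a one-tape Turing machine.  delta maps (state, symbol) to
    (new_state, written_symbol, move) with move in {-1, 0, +1}."""
    tape = dict(enumerate(tape))  # sparse tape; unwritten boxes are blank
    state, head = q_start, 0
    for _ in range(max_steps):
        if state == q_halt:
            break
        state, tape[head], move = delta[(state, tape.get(head, "_"))]
        head = max(0, head + move)  # the head cannot move past the left end
    # The output is the longest initial blank-free string on the tape.
    out, i = [], 0
    while tape.get(i, "_") != "_":
        out.append(tape[i])
        i += 1
    return "".join(out)

# A machine that flips every bit of its input and halts at the first blank.
flip = {("start", "0"): ("start", "1", +1),
        ("start", "1"): ("start", "0", +1),
        ("start", "_"): ("halt", "_", 0)}

print(run_tm(flip, "0110"))  # -> 1001
```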

4.2. Some basic complexity classes

A complexity class is simply a collection of languages. The most interesting complexity classes are those defined by some sort of condition saying that the languages in the class represent efficiently computable (or verifiable, as we shall shortly see) problems.

The complexity class P is defined to be the class of languages L such that membership in L can be decided by a Turing machine in polynomial time, that is, χ_L can be computed by a polynomial time Turing machine. For example, the set of connected graphs is a language that belongs to P (as witnessed by, say, the breadth-first search algorithm).
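The connectivity example can be sketched in a few lines of Python (the adjacency-list encoding of the input graph is an illustrative choice):

```python
from collections import deque

def is_connected(n, edges):
    """Decide in polynomial time whether the graph on vertices 0..n-1
    with the given edge list is connected, via breadth-first search."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, queue = {0}, deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n  # connected iff every vertex was reached

print(is_connected(4, [(0, 1), (1, 2), (2, 3)]))  # True
print(is_connected(4, [(0, 1), (2, 3)]))          # False
```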

The complexity class EXP is defined in the same manner as P, replacing polynomial time by exponential time. The time hierarchy theorem implies that P ⊊ EXP.

Sometimes it is too difficult to come up with an algorithm that efficiently decides membership in a particular language, while it is the case that if someone were to hand you a proof that a certain string belonged to the language, then you could efficiently verify that the proof was indeed correct. The complexity class NP captures this idea. More precisely, the complexity class NP consists of those languages L for which there is a polynomial time Turing machine M and a polynomial q such that:

for all x ∈ L, there is y ∈ {0,1}^{q(|x|)} for which M(x, y) = 1.

for all x ∉ L and for all y ∈ {0,1}^{q(|x|)}, M(x, y) = 0.

In the above definition, one thinks of y as the proof that x ∈ L; other commonly used terms for y are “witness” and “certificate.” One often envisions this situation using two fictitious players, a verifier and a prover. If x ∈ L, the prover hands the verifier a proof that x indeed belongs to L; the prover has unlimited computational power in this regard. In order for the verifier to be able to efficiently check that the proof indeed works, the proof cannot be too long (or else the verifier will not even be able to read the entire proof), hence the polynomial length requirement. Moreover, if x ∉ L, then there should be no proof that x belongs to L, whence the second condition.

It is clear that P ⊆ NP. While intuitively it seems clear that this inclusion should be proper (there ought to be problems that are impossible to efficiently decide, yet for which there are always proofs that are efficiently verifiable), this fact has yet to be established and remains one of the more famous open problems in mathematics.

We also note that NP ⊆ EXP, as one can check all of the exponentially many possible certificates for a given string in exponential time.

An example of a language in NP is the set of codes for pairs (G, k), where G is a finite graph that contains an independent set of size k; a certificate for a given pair (G, k) is simply an independent set of size k. This language is unlikely to be in P. Indeed, this is an example of a so-called NP-complete problem, meaning that it is as difficult as any other problem in NP (in a precise sense), whence if it belongs to P, then so do all languages in NP, and P = NP. Another example of a language in NP is the set of codes for pairs of finite graphs that are isomorphic; the certificate here is the isomorphism between the graphs. This problem, however, is unlikely to be NP-complete (see Reference 3, Section 8.4).
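The verifier’s side of the independent set example is easy to make concrete: given the pair (G, k) and a candidate certificate, the check runs in time polynomial in the input size. A Python sketch (the edge-list encoding and the 4-cycle are illustrative choices):

```python
def verify_independent_set(edges, k, certificate):
    """Polynomial-time verifier for the NP language of pairs (G, k):
    accept iff the certificate is an independent set of size k in G."""
    s = set(certificate)
    if len(s) != k:
        return False
    # Independence: no edge of the graph has both endpoints in s.
    return all(not (u in s and v in s) for u, v in edges)

# A 4-cycle contains an independent set of size 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_independent_set(edges, 2, [0, 2]))  # True: a valid certificate
print(verify_independent_set(edges, 2, [0, 1]))  # False: 0 and 1 are adjacent
```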

An alternative way of defining the class NP is to use nondeterministic Turing machines, which is actually the original definition and explains the terminology (NP stands for nondeterministic polynomial time). A nondeterministic Turing machine is defined exactly like a deterministic one except that it has two transition functions rather than one. Consequently, rather than there being a single (deterministic) sequence of steps during a computation upon a given input, there is an entire binary tree of such computations, for at every step during a computation, one can apply either of the two transition functions. One additional difference is that instead of a single halting state q_halt, we now have two halting states called q_accept and q_reject. We say that the nondeterministic Turing machine N outputs 1 on input x if there is some sequence of steps which causes the machine to reach q_accept. If every sequence of steps causes the machine to reach q_reject, then we say that N outputs 0. If every sequence of computations results in either q_accept or q_reject in time T(|x|), then we say that N runs in time T. It thus makes sense to speak of polynomial time (resp., exponential time) nondeterministic Turing machines.

It can easily be verified that a language L belongs to NP if and only if there is a polynomial time nondeterministic Turing machine N such that N outputs 1 on input x if and only if x ∈ L. Moreover, using exponential time nondeterministic Turing machines, we can also define the complexity class NEXP. Of course, using nondeterministic Turing machines that run in doubly exponential time, one can also define NEEXP (this will come up later). A nondeterministic version of the time hierarchy theorem guarantees NP ⊊ NEXP.

At this point, we have P ⊆ NP ⊆ EXP ⊆ NEXP, with P ⊊ EXP and NP ⊊ NEXP. One can also use a padding argument to show that if P = NP, then EXP = NEXP.

So far we have only been concerned with time efficiency. One can instead consider space efficiency. We will only consider the class PSPACE, which consists of all languages L for which there is a Turing machine that, upon input x, decides whether or not x ∈ L using only a polynomial amount of work space. It is clear that P ⊆ PSPACE. It is also fairly easy to see that NP ⊆ PSPACE, for one can simply check all possible certificates, erasing one’s work after each individual check, thus using only a polynomial amount of space. A slightly less obvious inclusion is PSPACE ⊆ EXP; the proof uses the notion of a configuration graph for a computation. So, to update our state of knowledge, we have P ⊆ NP ⊆ PSPACE ⊆ EXP ⊆ NEXP. It is not known if either inclusion P ⊆ PSPACE or PSPACE ⊆ EXP is proper (although one of them must be, as P ⊊ EXP); as with P versus NP, it is believed that both of these inclusions are proper.

We end this section with the definition of the class BPP. Although it will not play a direct role in the story to follow, it will make a later pill easier to swallow. We return to the setting of nondeterministic Turing machines, but this time we count the proportion of computations that output 1. We say that the language L belongs to the class BPP if there is a polynomial time nondeterministic Turing machine N such that, upon input x, the probability that a random nondeterministic computation agrees with χ_L(x) is at least 2/3. Just as in the case of NP, there is a formulation using deterministic Turing machines: L belongs to BPP if and only if there is a polynomial time Turing machine M and a polynomial q such that, for every string x, the probability that a random y ∈ {0,1}^{q(|x|)} is such that M(x, y) = χ_L(x) is at least 2/3. We remark that the choice of 2/3 is fairly arbitrary; by repeating the computation several (but still a reasonable number of) times and taking the majority result of the computations, we can replace 2/3 with a probability as close to 1 as one desires. The class BPP is contained in EXP, as one can check all random bits and compute the probability that a random choice yields 1 or 0.
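The amplification remark can be made quantitative: the probability that the majority vote of n independent runs is correct is an exact binomial sum, which a short Python computation illustrates (the success probability 2/3 and the repetition counts are illustrative choices):

```python
from math import comb

def majority_correct_prob(p, n):
    """Probability that the majority of n independent runs, each correct
    with probability p, gives the correct answer (n odd)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(n // 2 + 1, n + 1))

# Repeating a BPP-style computation and taking the majority vote drives
# the success probability from 2/3 toward 1.
for n in (1, 11, 101):
    print(n, round(majority_correct_prob(2/3, n), 6))
```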

If L is a language in BPP and one repeats the computation a sufficient number of times to achieve a probability of, say, 1 − 2^{−100}, then one can be fairly certain that the result of the probabilistic computation is the truth, and thus BPP seems like a fairly good substitute for P. In fact, there are complexity-theoretic reasons for believing that BPP might coincide with P (see Reference 3, Chapter 16).

4.3. Interactive proofs

We now imagine the situation where rather than the prover just handing the verifier a proof, the prover and the verifier are allowed to interact. Given an input, the verifier can ask the prover a question, the prover can answer that question, then (based on that answer) the verifier can ask the prover another question to which the prover can reply, and so on, for a certain number of rounds. Each time, the verifier’s question and the prover’s answer depend on the entire sequence of questions and answers obtained up to that point (as well as the input). After this discussion, the verifier can decide whether or not to accept. Once again, the verifier uses a polynomial-time Turing machine to choose which questions to ask and whether or not to accept at the end of the conversation while the prover has no computational limitations.

It is not too difficult to verify that, with this description of interactive proof, the corresponding complexity class would simply be NP in disguise. Indeed, one can just use the conversation, or transcript as it is usually called, as the certificate. However, combining this idea with a randomized process as discussed at the end of the last subsection does lead to a class with more computational power (although, interestingly enough, it is a class we have already seen before).

In order to define this class, we fix a number of rounds m (although one could actually work with a polynomial-time computable number of rounds instead) and a polynomial q. Assume that we also have a Turing machine V (now that we are really viewing the machine as a verifier, we have replaced M with V) such that, for all inputs x, all r ∈ {0,1}^{q(|x|)}, and all strings a₁, …, a_m, V halts upon input (x, r, a₁, …, a_i) in time polynomial in |x| for all i ≤ m. We then imagine a prover P interacting with V as follows. First, the verifier randomly selects r ∈ {0,1}^{q(|x|)} and computes V(x, r); this is V’s first question to P. P then responds with the answer a₁. (Note that P does not have access to the random string r; one says that V is using private coins. It turns out that for what we are going to define below, one can also use public coins that the prover is aware of.) This constitutes the first round of their interaction. The verifier then asks their second question V(x, r, a₁) and P responds with a₂, completing the second round of interaction. This is repeated for a total of m rounds. V then returns their decision V(x, r, a₁, …, a_m) ∈ {0, 1}, indicating whether or not they accept the prover’s answers as constituting evidence that x indeed belongs to L.

The complexity class IP is defined to be the collection of those languages L for which there is a Turing machine V as above such that:

If x ∈ L, then there is a prover P such that the probability that a random string r causes V to accept is at least 2/3.

If x ∉ L, then no prover can cause V to accept more than 1/3 of the time.

The probabilities 2/3 and 1/3 above are called the completeness and soundness parameters, respectively. As in the case of BPP, they are somewhat arbitrary; any choice of completeness parameter strictly larger than one’s choice of soundness parameter will define the same class. It turns out that one can even replace the completeness parameter by 1 without changing the class; this property of IP is called perfect completeness.

A nice example of a language in IP is the collection of pairs (G₀, G₁) of finite graphs that are not isomorphic. Note that this class is not obviously in NP, for there are too many possible functions that could serve as an isomorphism. There is, however, a simple interactive proof for this class. Indeed, the verifier randomly selects i ∈ {0, 1} and then randomly selects a permutation of the vertices of G_i, obtaining a graph H isomorphic to G_i. The verifier then sends the graph H to the prover as its question. The prover then responds with a bit j, which represents its guess as to which of the two graphs G₀ or G₁ the verifier selected randomly. The verifier accepts if and only if the prover guessed the chosen graph correctly, that is, if and only if i = j. Note that if G₀ and G₁ are not isomorphic (that is, if the pair belongs to the class), then the prover can always respond correctly, for the prover can just figure out whether H is isomorphic to G₀ or to G₁. (Do not forget that the prover is all-powerful!) However, if G₀ and G₁ are isomorphic (that is, the pair does not belong to the class), then H is isomorphic to both G₀ and G₁, and thus the prover (regardless of its unlimited power) can do no better than simply guessing which graph was chosen by the verifier, thus only convincing the verifier at most half of the time.
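This protocol is small enough to simulate directly. In the Python sketch below (the particular graphs, seeds, and number of rounds are illustrative choices), the prover’s unlimited power is played by a brute-force isomorphism test; the verifier accepts every round when the graphs are not isomorphic, and only about half the rounds when they are.

```python
import random
from itertools import permutations

def canon(edges):
    """Order-independent representation of an undirected edge set."""
    return frozenset(frozenset(e) for e in edges)

def isomorphic(n, E0, E1):
    """Brute-force isomorphism test -- fine for the all-powerful prover."""
    target = canon(E1)
    return any(canon([(p[u], p[v]) for u, v in E0]) == target
               for p in permutations(range(n)))

def one_round(n, E0, E1, rng):
    i = rng.randrange(2)                   # verifier's secret coin
    p = list(range(n)); rng.shuffle(p)     # random relabeling of G_i
    H = [(p[u], p[v]) for u, v in (E0, E1)[i]]
    j = 0 if isomorphic(n, H, E0) else 1   # prover's best possible guess
    return i == j                          # accept iff the guess is right

rng = random.Random(0)
path = [(0, 1), (1, 2), (2, 3)]            # path on 4 vertices
star = [(0, 1), (0, 2), (0, 3)]            # star on 4 vertices
relabeled_path = [(3, 2), (2, 1), (1, 0)]  # isomorphic to the path

# Not isomorphic: the prover always answers correctly.
print(sum(one_round(4, path, star, rng) for _ in range(300)) / 300)
# Isomorphic: the prover can only guess, succeeding about half the time.
print(sum(one_round(4, path, relabeled_path, rng) for _ in range(300)) / 300)
```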

Given a verifier V as in the definition of IP, one can compute in polynomial space the optimal prover strategy. This shows that IP ⊆ PSPACE. A landmark theorem in the subject shows that in fact we have equality:

Theorem 4.1 (Lund, Fortnow, Karloff, Nisan Reference 23; Shamir Reference 55).

IP = PSPACE.

Recall that randomization alone likely does not achieve anything new (earlier we remarked that BPP = P is likely), and, similarly, interaction alone does not achieve anything too new (as we just recover NP). However, combining randomization with interaction bumps us up to PSPACE (which is likely bigger than NP).

But why stop at one prover? One can consider interactions as above but allowing for multiple provers to interact with the verifier. It should be emphasized that the provers are not allowed to interact with each other during the interaction, but only with the verifier. They can, however, have a meeting before the interaction starts and decide upon a strategy that they will use while interacting with the verifier. In other words, the provers are cooperating but noncommunicating. If the provers use deterministic strategies as above, we arrive at the complexity class MIP. By allowing different kinds of strategies (in particular, those that employ quantum methods), we arrive at variations of MIP, such as the famous MIP* appearing in the equation MIP* = RE.

It turns out that the complexity class MIP is unchanged if one restricts to just two provers and one round of interaction. We thus make that default assumption from now on. By ignoring one of the provers, we clearly have that IP ⊆ MIP. As with IP, one can also achieve perfect completeness.

With two provers, one can now utilize “police-style” interrogation tactics. This makes it possible for the verifier to read polynomially many random portions of an exponentially long proof and come to a conclusion that with high probability agrees with the truth. A formalization of this idea yields another major theorem in the subject:

Theorem 4.2 (Babai, Fortnow, Lund Reference 5).

MIP = NEXP.

As mentioned before, it is believed that PSPACE ⊊ NEXP. If this inclusion is indeed proper, then the jump from one to more than one prover does in fact lead to a larger collection of languages.

4.4. Nonlocal games

It will be useful to recast our description of the class in terms of so-called nonlocal games, a certain collection of two-person games. (The terminology “nonlocal” comes from the connection with Bell’s theorem on quantum nonlocality, as we discuss later.)

Consider a language L in MIP as witnessed by the polynomial-time verifier V. Given input x and a sequence of random bits r, by computing V(x, r), we are really computing the two questions (sequences of bits of length polynomial in |x|) that are being sent to the two provers, whom we will call Alice and Bob, following typical quantum information nomenclature. Alice and Bob, employing their deterministic strategies, then respond with their answers, say a and b, and then the verifier calculates V(x, r, a, b) to decide whether or not to accept their answers. Whether or not x belongs to L then corresponds to the expected value, over a randomly chosen r, of the event that the verifier returns 1. Note that the polynomial time requirement on V allows us to assume that the possible answers are strings of length at most some fixed polynomial in |x|.

This reformulation leads us to the following notion: A nonlocal game with n questions and k answers is a pair G = (π, D), where π is a probability distribution on [n] × [n] and D: [n] × [n] × [k] × [k] → {0, 1} is the decision predicate for the game. Here, [n] = {1, …, n}, and similarly for [k]. A strategy for the players consists of a conditional probability p(a, b | x, y), expressing the probability that Alice and Bob respond with answers a and b if they are asked questions x and y, respectively. We view such a strategy as an element of [0, 1]^{n²k²}. Above, we only considered deterministic strategies, namely those for which there are functions f, g: [n] → [k] such that p(a, b | x, y) = 1 if a = f(x) and b = g(y), and p(a, b | x, y) = 0 otherwise. We let C_d denote the set of such deterministic strategies. Later, we will consider several other sets of strategies.

Given a strategy p, the value of the game G with respect to the strategy p is the expected probability that the players win the game if they play according to p, that is,

val(G, p) = Σ_{x,y} π(x, y) Σ_{a,b} D(x, y, a, b) p(a, b | x, y).

We set val(G) := max_{p ∈ C_d} val(G, p) and refer to this as the classical value of the game G.
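For a concrete feel for classical values, one can compute the classical value of the well-known CHSH game (an illustrative example chosen here; it is not one discussed in the text) by brute force over the finitely many deterministic strategies:

```python
from itertools import product

# The CHSH game: questions x, y in {0,1} are asked uniformly at random,
# and the players win when a XOR b = x AND y.
def D(x, y, a, b):
    return int(a ^ b == x & y)

best = 0.0
for f in product((0, 1), repeat=2):       # Alice's deterministic strategy: a = f[x]
    for g in product((0, 1), repeat=2):   # Bob's deterministic strategy:   b = g[y]
        val = sum(D(x, y, f[x], g[y]) for x, y in product((0, 1), repeat=2)) / 4
        best = max(best, val)

print(best)  # classical value of CHSH: 0.75
```

No deterministic strategy wins all four question pairs, which is exactly why this game separates classical from quantum strategies later in the story.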

We can now rephrase the definition of MIP in terms of nonlocal games: a language L belongs to MIP if and only if there is an efficient mapping x ↦ G_x (in the precise sense described earlier in this subsection) so that:

If x ∈ L, then val(G_x) ≥ 2/3.

If x ∉ L, then val(G_x) ≤ 1/3.

The class MIP* appearing in the result MIP* = RE is defined in the analogous way except that we replace classical strategies by quantum strategies. But first, an interlude to explain all things quantum.

5. A quantum detour

In this section, we introduce the quantum prerequisites necessary to understand the definition of the complexity class MIP*. Our presentation of quantum mechanics is fairly standard and can be found in any good textbook on quantum mechanics. As mentioned above, we also found Paulsen’s lecture notes Reference 50 very helpful.

5.1. Quantum measurements

In quantum mechanics, one associates to each physical system a corresponding Hilbert space H. The state of the system at any given time is given by a unit vector ψ ∈ H. The state of the system evolves linearly according to a certain partial differential equation (the Schrödinger equation) until it is measured. A measurement should be thought of as an experiment on the system which has a finite number, say k, of possible outcomes. (There are also experiments that can have a countably infinite set of outcomes, say the infinite discrete set of energies of some particle, or even a continuum of outcomes, say when measuring the position or momentum of a particle. For the purposes of this paper, it suffices to focus on the case of finitely many outcomes.) Formally, a measurement with k outcomes consists of bounded operators M_1, …, M_k on H. The Born rule states that, if the state of the system is ψ upon measurement, then the probability that the ith outcome happens is given by ‖M_i ψ‖². Furthermore, in case the ith outcome is measured, the collapse dynamics tells us that the state of the system instantaneously (and discontinuously) changes to M_i ψ/‖M_i ψ‖. Since the sum of the outcome probabilities must be 1, we see that

Σ_{i=1}^{k} ‖M_i ψ‖² = 1.

Since this equality must hold true for all unit vectors ψ, it follows that Σ_{i=1}^{k} M_i* M_i = I. Consequently, any sequence M_1, …, M_k satisfying this latter property constitutes a measurement of the system.

If one is only interested in the probabilities of the outcomes rather than the outcomes themselves (as we will be when we return to our discussion of nonlocal games), then it simplifies matters to replace a measurement as above by a sequence A_1, …, A_k consisting of positive operators which sum up to I (namely, A_i = M_i* M_i) and to interpret the probability that the ith outcome is obtained when the system is in state ψ to be given by ⟨A_i ψ, ψ⟩. Such a collection of positive operators is called a positive operator-valued measure (POVM) on H (this terminology comes from spectral theory). If one specializes even further to the case that each A_i is not only a positive operator but in fact a projection, then one speaks of projection-valued measures (PVMs) on H. Note then that the projections are automatically pairwise orthogonal, so a PVM on H with k outcomes corresponds to a decomposition of H into k orthogonal subspaces.
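These notions are easy to verify numerically. The numpy sketch below builds a standard three-outcome POVM on ℂ² (the so-called trine POVM, an illustrative choice not taken from the text) and checks that the resulting Born probabilities are nonnegative and sum to 1.

```python
import numpy as np

# A three-outcome POVM on C^2: A_i = (2/3)|phi_i><phi_i| for three unit
# vectors spaced 120 degrees apart (the "trine" POVM).
phis = [np.array([np.cos(t), np.sin(t)]) for t in (0, 2*np.pi/3, 4*np.pi/3)]
povm = [(2/3) * np.outer(v, v) for v in phis]

# The elements are positive and sum to the identity ...
assert np.allclose(sum(povm), np.eye(2))

# ... so for any unit vector psi the Born probabilities <A_i psi, psi>
# are nonnegative and sum to 1.
psi = np.array([0.6, 0.8])
probs = [float(psi @ A @ psi) for A in povm]
assert all(p >= 0 for p in probs) and np.isclose(sum(probs), 1.0)
print([round(p, 4) for p in probs])
```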

Many introductions to quantum mechanics discuss the measurements of observables. An observable for the physical system is a self-adjoint operator on . Supposing for simplicity that is finite dimensional, the spectral theorem implies that we can find a PVM , …, on such that the ’s correspond to the projections onto the various eigenspaces of corresponding to . The self-adjointness assumption on further implies that the corresponding eigenvalues are real numbers, whence we can interpret them as corresponding to actual possible physical measurements. Conversely, given any PVM , …, on and real numbers , …, , one has an observable .
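The correspondence between observables and PVMs described above can be checked numerically. The following is a minimal sketch (the particular observable and state are illustrative choices, not taken from the paper): we diagonalize a self-adjoint matrix, verify that the eigenspace projections form a PVM, compute Born-rule probabilities, and reconstruct the observable from its spectral data.

```python
import numpy as np

# Illustrative self-adjoint observable on a 2-dimensional Hilbert space.
O = np.array([[1.0, 1.0],
              [1.0, -1.0]])
eigvals, eigvecs = np.linalg.eigh(O)  # spectral decomposition

# Projections onto the eigenspaces form a PVM: each P_i is a projection
# and together they sum to the identity.
P = [np.outer(v, v.conj()) for v in eigvecs.T]
assert np.allclose(sum(P), np.eye(2))
for Pi in P:
    assert np.allclose(Pi @ Pi, Pi)   # idempotent: genuine projections

# Born rule: outcome probabilities for the unit state psi = (1, 0).
psi = np.array([1.0, 0.0])
probs = [float(psi.conj() @ Pi @ psi) for Pi in P]
assert abs(sum(probs) - 1.0) < 1e-12  # probabilities sum to 1

# Conversely, the observable is recovered from the PVM and eigenvalues.
O_rebuilt = sum(lam * Pi for lam, Pi in zip(eigvals, P))
assert np.allclose(O_rebuilt, O)
```

The final assertion is exactly the converse direction stated above: a PVM together with a list of real numbers determines an observable.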

A simple example of the content of the previous paragraph is given by the spin of an electron. The spin of an electron along any choice of axis comes in one of two flavors: up or down. (By the way, this is what is “quantum” about quantum mechanics: many attributes of a physical system come in a discrete set of possibilities.) For the sake of completeness, let us say that we are measuring spin along the vertical axis. The state of the electron is given by a unit vector in the Hilbert space . We view the usual orthonormal basis for  as representing the two possible spin values: so  corresponds to up while  corresponds to down. Now a general unit vector in  can be written in the form  for unique complex numbers  for which . What is strange and new about quantum mechanics is that a given electron can be in a state that is neither up nor down. More specifically, when neither  nor  is , the electron is considered to be in a superposition of the two states and will only reveal one of these two states upon a measurement of the spin, that is, using the PVM on  consisting of the projections onto the coordinate axes. The state of the electron merely gives us probabilistic information as to which of the two outcomes will happen upon such a measurement. Moreover, once the measurement has been made, the new state of the electron instantaneously and discontinuously jumps to the unit vector  or  corresponding to the outcome of the measurement just made. This reflects the fact that if another measurement is made directly following the first measurement, the same outcome will occur. It is important to distinguish a genuine superposition from a definite but unknown measurement outcome, where the probabilities would merely measure our ignorance of the actual value.

The above description of quantum mechanics is the standard or Copenhagen interpretation, and it is a mighty big pill to swallow upon a first reading. (Technically speaking, this is really the von Neumann–Dirac formulation of the theory; however, it has become common parlance to refer to this interpretation as the Copenhagen interpretation, even though Niels Bohr himself explicitly disagreed with this formulation.) Perhaps the biggest point of contention is the question “What constitutes a measurement?”, together with the follow-up question “Why did the state of the electron collapse to one of the two basis states?” This is the so-called measurement problem, a very popular topic of debate amongst philosophers and theoretical physicists. It has led to a plethora of alternative interpretations of quantum mechanics (often yielding mathematically equivalent predictions); a good introduction to these foundational issues is Barrett’s recent book Reference 6.

To keep the strangeness coming, suppose that we want to measure spin in the horizontal direction instead of the vertical direction. It turns out that the appropriate basis to consider is now , where and . In other words, the PVM consisting of the orthogonal projections onto the lines spanned by and , respectively, constitutes a measurement of the spin of the electron in the horizontal direction. Suppose that an electron has a definite spin, say up, in the vertical direction, whence its state is . In the eigenbasis for the observable of spin in the horizontal direction, the state becomes . Consequently, a measurement of an electron with an up spin in the vertical direction will yield a spin of either left or right in the horizontal direction with equal probability. Even more strangely, suppose that the electron that had a definite vertical spin that was up was then measured in the horizontal direction and the outcome was spin left; that is, the measurement led to an outcome state of . Suppose further that a subsequent measurement of the electron in the vertical direction was performed. Since , we see that the outcome of the measurement now yields up or down with equal probability. Thus, the measurement in the horizontal direction destroyed the definite spin the electron had in the vertical direction!
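The probabilities in the preceding paragraph can be verified directly. Below is a short sketch using the standard conventions (the basis labels and the choice of which horizontal state is “left” are illustrative): a definite vertical-up state yields each horizontal outcome with probability 1/2, and after collapsing to a horizontal state, a subsequent vertical measurement is again maximally uncertain.

```python
import numpy as np

# Vertical spin basis (up, down) and horizontal spin basis (left, right),
# using the standard superposition conventions described above.
up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
left = (up + down) / np.sqrt(2)
right = (up - down) / np.sqrt(2)

# Measuring a definite vertical-up state in the horizontal basis:
p_left = abs(np.dot(left, up)) ** 2
p_right = abs(np.dot(right, up)) ** 2
assert abs(p_left - 0.5) < 1e-12 and abs(p_right - 0.5) < 1e-12

# Suppose the outcome was "left": the state collapses to `left`.
# A subsequent vertical measurement now yields up or down with equal
# probability -- the definite vertical spin has been destroyed.
p_up_after = abs(np.dot(up, left)) ** 2
assert abs(p_up_after - 0.5) < 1e-12
```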

5.2. The spookiness of entanglement

The postulates of quantum mechanics tell us that if and are the Hilbert spaces representing two physical systems, then the appropriate Hilbert space for studying the composite system is the tensor product space . The fact that elements of the tensor product need not be merely simple tensors leads to the fascinating concept of entanglement, which, in some sense, is the essence of this entire story!

In order to get an idea of the utility of entanglement as a resource in, say, quantum information theory, we present the example of superdense coding. We set . This quantum state is known as the EPR state, named after Einstein, Podolsky, and Rosen. Shortly, we will have more to say about this state and why Einstein, Podolsky, and Rosen were considering it. Let us imagine that Alice and Bob each possess an electron and the joint state of the vertical spins of the two electrons is , that is, the electrons are in an equal superposition of both spins being up or both spins being down. Furthermore, imagine that Alice and Bob are really (really) far away from each other. We show how Alice and Bob can utilize the fact that their electrons are in this entangled state in order for Alice to send two classical bits of information to Bob by just sending one qubit of information, that is, by Alice sending Bob her electron (after she has first done some work on it).

Depending on what two bits of information Alice wishes to send to Bob, she performs one of the following actions to her electron:

,

,

,

.

Here, , the so-called bit-flip operator, and , the so-called phase-flip operator.

One can check that the four vectors form an orthonormal basis for known as the Bell basis. Consequently, any observable on with distinct eigenvalues and with the Bell basis vectors as eigenvectors can be used to distinguish these vectors, that is, when the state of the system is , a measurement of will yield with probability , whence Bob knows which of the four actions above Alice took and thus knows which pair of bits she wished to send to Bob. (One can be explicit about the observable , namely , where is the so-called Hadamard operator and is the so-called controlled not operator.)

Notice something peculiar about the EPR state? If the state of two electrons is given by , then they are in a superposition of either both electrons having spin up or both electrons having spin down (with equal probability). However, if Alice performs a measurement of the spin of her electron and sees a result of spin up, she knows, with absolute certainty, that a subsequent measurement of the spin of Bob’s electron will also be spin up. Thus, while Bob’s electron did not have a determinate spin before Alice’s measurement, the result of Alice’s measurement instantaneously gave a determinate value to the spin of Bob’s electron.

Einstein was worried by this phenomenon, which he called “spooky action at a distance.” Together with Podolsky and Rosen Reference 17, he used the EPR state to present an argument for the incompleteness of quantum mechanics. The gist of the argument is as follows: Suppose that Alice and Bob share a pair of electrons in the EPR state , and that Alice and Bob are again really (really) far apart. Suppose that Alice measures her electron and sees the result spin up. Then Alice knows with 100% certainty that if Bob were to measure his electron, then it must also have a determinately up spin. The same holds for a measurement result of spin down. Since Alice can predict with certainty the outcome of Bob’s measurement, and since her measurement could not possibly have altered the spin of Bob’s electron, Bob’s spin must have a definite value, independent of whether or not Alice were to measure it. This definite value must represent some element of physical reality, and if quantum mechanics were complete, there would have to be some counterpart of this physical reality in the theory. Since there is nothing in the description of the EPR state which specifies a determinate value for Bob’s spin, quantum mechanics must be incomplete.

It gets even worse, for if Alice were to decide to measure her spin along a different axis, say the horizontal axis, then once again the result of her measurement would allow her to definitively conclude the value of Bob’s electron’s spin along the horizontal axis. In this case, both the vertical and horizontal spins would have definite, predetermined values, contradicting the fact that knowing, say, the vertical spin of an electron forces us to be maximally uncertain about its horizontal spin. So in some sense, the EPR argument even posits that quantum mechanics is inconsistent!

The underlying philosophy that Einstein, Podolsky, and Rosen have in their argument is usually dubbed local realism: the term “local” refers to the assumption that Alice’s measurement could not have affected Bob’s electron since they are so far away and communication cannot travel faster than the speed of light, while the term “realism” refers to the statement that the fact that one can determine the spin of Bob’s electron with certainty implies that there must be some real, predetermined value to the spin. Einstein, Podolsky, and Rosen believed that there should be some hidden variable explaining this predetermined spin, allowing them to preserve their classical, locally real intuitions.

John Bell Reference 7 set up a thought experiment to determine whether there could indeed be a formulation of quantum mechanics that was complete and adhered to the local realist philosophy. He showed that this is in fact impossible: a small set of local realist assumptions leads to an inequality on the expected outcomes of a certain experiment, an inequality that a particular quantum measurement can violate. Moreover, whether or not this inequality holds in nature is actually experimentally testable. Spoiler alert: the inequality is violated by nature, whence quantum mechanics comes out victorious! Thus, however strange it may seem, quantum mechanics stands in contradistinction to the local realist assumptions.

Besides making for an intellectually fascinating story, there turns out to be a direct link between these Bell inequalities and the phenomenon of quantum strategies for nonlocal games exceeding all possible classical values, which we now explain. (The idea of treating violations of Bell-type inequalities as quantum strategies for nonlocal games that exceed the classical value of the game seems to have first been seriously studied by Cleve, Hoyer, Toner, and Watrous Reference 14.)

We have already discussed deterministic strategies for nonlocal games. One may imagine incorporating a probabilistic component into these strategies by considering a probability space  and deterministic strategies  and , one for each . The players can then randomly (according to ) select an  and play deterministically according to  and . In terms of the EPR experiment, one may think of  as the hidden variable of which we do not have perfect knowledge but which, were we to know it, would make things behave deterministically. The probability space  represents our epistemic (lack of) knowledge of the hidden variable. Consequently, we now have probabilistic strategies

which are called local strategies, the term “local” referring to the fact that each player’s output still only depends on their local environment. The set of such local strategies is denoted . It is straightforward to see that is a compact, convex subset of whose extreme points are the elements in . Moreover, it is clear that every element of is a convex combination of elements of , whence for any nonlocal game with questions and answers.

On the other hand, we can consider quantum strategies for nonlocal games as follows. We let consist of those strategies for which

where, for each ,  and  are POVMs with  outcomes on finite-dimensional Hilbert spaces  and , respectively. We call such a strategy a quantum strategy. These strategies correspond to Alice and Bob sharing a (possibly entangled) state  of their composite system and performing measurements  and  on their portions of the state upon receiving questions  and , respectively. Using a technique known as Naimark dilation (a special case of the Stinespring dilation theorem from above), one can replace POVMs with the more convenient PVMs without altering the definition of . It is straightforward to show that  is a convex subset of .

We have that . Indeed, since every element of  is a convex combination of deterministic strategies and  is convex, it suffices to show that every deterministic strategy is contained in . However, this is quite easy: if  is the function determining Alice’s strategy, let  be the POVM on  for which  and  for all . Bob’s POVM is defined in the analogous way. It follows that, for any state , we have that .

Given a nonlocal game , we define its entangled value to be

By the previous paragraph, we have that for any nonlocal game . The idea behind Bell’s theorem, recast in the setting of nonlocal games, is that there are nonlocal games for which .

For example, consider the following game, known as the CHSH game. (The acronym CHSH stands for Clauser, Horne, Shimony, and Holt, the researchers responsible for the CHSH inequality, a Bell-type inequality that was among the first to be experimentally testable.) The CHSH game is a game with . The question distribution  is the uniform distribution on , and the decision predicate is given by the following conditions.

If or , then Alice and Bob win if and only if their answers agree.

If , then Alice and Bob win if and only if their answers disagree.

By inspecting all deterministic strategies, one finds that . However, the entangled value of the game satisfies . The interested reader can find the details of this calculation in Reference 14, Section 3.1. We merely point out that a quantum strategy achieving  uses the EPR state .
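Both values of the CHSH game can be computed directly. The sketch below brute-forces the classical value 3/4 over all 16 deterministic strategies, then evaluates the standard quantum strategy on the EPR state; the particular ±1-valued observables used are the usual textbook choice, not dictated by the text above.

```python
import numpy as np
from itertools import product

def win(x, y, a, b):
    # CHSH decision predicate: answers must agree unless (x, y) = (1, 1).
    return (a == b) if (x, y) != (1, 1) else (a != b)

# Classical value: maximize over all deterministic strategies, each given
# by a pair (a(0), a(1)) for Alice and (b(0), b(1)) for Bob.
funcs = list(product([0, 1], repeat=2))
classical = max(
    sum(win(x, y, A[x], B[y]) for x in (0, 1) for y in (0, 1)) / 4
    for A in funcs for B in funcs
)
assert abs(classical - 0.75) < 1e-12

# Quantum value via the standard strategy: on the EPR state
# (|00> + |11>)/sqrt(2), the correlation <A tensor B> equals tr(AB)/2
# for real symmetric 2x2 observables A, B with eigenvalues +-1.
Zm = np.array([[1.0, 0.0], [0.0, -1.0]])
Xm = np.array([[0.0, 1.0], [1.0, 0.0]])
A_obs = [Zm, Xm]
B_obs = [(Zm + Xm) / np.sqrt(2), (Zm - Xm) / np.sqrt(2)]

quantum = 0.0
for x, y in product((0, 1), repeat=2):
    corr = np.trace(A_obs[x] @ B_obs[y]) / 2  # correlation of answers
    sign = -1 if (x, y) == (1, 1) else 1      # win iff (dis)agreement
    quantum += (1 + sign * corr) / 8          # average over 4 questions
assert abs(quantum - np.cos(np.pi / 8) ** 2) < 1e-12  # about 0.8536
```

The computed quantum value is cos²(π/8) ≈ 0.854, strictly exceeding the classical value 3/4, which is exactly the separation Bell's theorem predicts.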

5.3. MIP*

Based on the nonlocal game definition of the complexity class and our recent discussion of quantum strategies for nonlocal games, it should be clear how to define the complexity class : the language belongs to if there is an efficient mapping (in the precise sense from Subsection 4.4) from strings to nonlocal games such that:

If , then .

If , then .

We remark that the definition of first appeared in the aforementioned paper Reference 14.

To be fair, we are really defining the complexity class , which allows only two provers and one round of interaction. There are ways to define similar classes allowing more provers and rounds, but the eventual result shows that they yield the same class anyway, so we will not bother.

So how do the classes and relate? The lesson from the previous section was that provers that share entanglement can win some nonlocal games more often than they rightfully should. In other words, it seems that it might be the case that for a language that belongs to and for a string that does not belong to , the provers might have a strategy for the corresponding game whose value exceeds .

Nevertheless (and perhaps somewhat surprisingly), Ito and Vidick Reference 38 showed that . The rough idea behind this inclusion is that it suffices to show that , and the games involved in the proof that are such that their classical and quantum values are approximately the same.

Later, Natarajan and Wright Reference 46 showed that . Recalling that , this shows that , whence adding entanglement does indeed strictly increase the computational power of the verifier.

So exactly how much extra power does entanglement give us? Besides the result mentioned in the last paragraph, there was only an a priori seemingly silly upper bound on , namely , where (which is short for recursively enumerable) is the complexity class which consists of those languages for which there is a Turing machine (with absolutely no efficiency requirements whatsoever) whose domain is ; that is, consists of the set of inputs for which halts. An alternative formulation of is helpful: belongs to if there is a total computable function whose range is (this is why modern computability theorists refer to this as being computably enumerable or CE). To see the inclusion , note first that, given any dimension , one can effectively enumerate a countable set of quantum strategies of dimension that is dense in the set of such strategies and for which one can effectively compute for any such quantum strategy . By letting tend to , if one ever finds such a strategy for which , one knows that (and one is guaranteed that this will happen for some such if ). Note that if , this procedure will never convince us that because maybe we did not wait long enough, and a higher-dimensional strategy would indeed have convinced us if we were just a bit more patient.
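The search procedure described above can be abstracted into a generic semi-decision routine. The following is a hedged toy sketch of that structure only: the "strategies" and "value" function below are stand-ins chosen for illustration, not the actual enumeration of quantum strategies or the actual game-value computation.

```python
from fractions import Fraction

def semidecide(value_of_strategy, strategies, threshold):
    """Return the index of the first enumerated strategy whose value
    exceeds the threshold; if the (infinite) enumeration contains no
    such witness, the search never halts -- a semi-decision procedure."""
    for n, s in enumerate(strategies):
        if value_of_strategy(s) > threshold:
            return n

# Toy instance: "strategies" are dyadic rationals in (0, 1), enumerated
# by increasing denominator (playing the role of increasing dimension),
# and the "value" of a strategy is the number itself.
dyadics = (Fraction(k, 2 ** m) for m in range(1, 50) for k in range(1, 2 ** m))
idx = semidecide(lambda s: s, dyadics, Fraction(7, 10))
assert idx == 3  # the fourth dyadic enumerated, 3/4, exceeds 7/10
```

As in the text: when a witness exists the search finds it, but a fruitless search so far never certifies that no witness exists, which is precisely why this only places the problem in RE.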

The amazing fact proven in Reference 39 is that this upper bound is actually tight! That is, holds! More specifically, the authors prove that there is an effective mapping from (codes for) Turing machines to nonlocal games such that:

If halts on the empty tape, then .

If does not halt on the empty tape, then .

The language consisting of codes for Turing machines that halt on the empty tape is known as the halting problem . Since the halting problem is complete for the class , the inclusion holds.

Regardless of your interest in CEP, the equality is an amazing fact. The halting problem is an undecidable problem (this follows from a simple diagonalization argument together with the fact that there is a so-called universal Turing machine). Nevertheless, if two cooperating but noncommunicating provers share some quantum entanglement, they can reliably convince a verifier whether or not a given Turing machine halts! This is a landmark intellectual achievement.

The proof of is very complicated, and we will not discuss it here. The introduction to Reference 39 does a great job outlining the essence of the proof.

The story of  is about allowing the provers quantum resources while keeping the computational model classical. It is natural to ask what happens if we also replace the computational model we are using (i.e., the Turing machine) with a quantum computational model (e.g., quantum circuits). It turns out that there is nothing to be gained here: prefixing the corresponding classical complexity class with a “Q” to denote its counterpart defined using a quantum computational model, we have , , and ; see Reference 62 for the details.

6. From to the failure of CEP: the traditional route

The derivation of the negative solution to the CEP from  now proceeds in two steps. In the first subsection, we show how  leads to a negative solution to Tsirelson’s problem in quantum information theory. In the second subsection, we then show how a negative solution to Tsirelson’s problem naturally leads to a negative solution to Kirchberg’s QWEP problem. As we already observed in Subsection 3.10, this leads to a negative solution to the CEP.

6.1. A negative solution to Tsirelson’s problem

In order to explain Tsirelson’s problem, we need to introduce some more collections of strategies. First, we define exactly as in the definition of , except that we remove the finite-dimensionality assumptions on Alice’s and Bob’s state spaces and ; a strategy in this larger class is called a quantum spatial strategy. Quantum spatial strategies still correspond to the idea that Alice and Bob each have their own physical system and the state of their composite system is given by the tensor product. It can be checked that there is no loss of generality in restricting our attention to separable Hilbert spaces in the definition of . Moreover, by considering projections onto larger and larger finite-dimensional subspaces, we see that , the closure of in the usual topology it inherits from being a subset of . Like , one can check that is convex.

There is another model that is natural to consider which arises in quantum field theory. In quantum field theory, one usually considers a large quantum system (maybe the system describing the whole universe!) and then it may be difficult to separate Alice and Bob’s systems as isolated subsystems of the larger system. The state of the large system is now given by some unit vector in a single Hilbert space , and Alice’s and Bob’s measurements are now given by families of POVMs and acting on this single Hilbert space . Since we are still assuming that Alice and Bob are far away and so they cannot interact with each other, it is natural to assume that either of them can measure first without affecting the value of the other’s measurements (or even that they can perform their measurements simultaneously). According to von Neumann, the mathematical way of modeling this situation is to assume that Alice’s and Bob’s measurements commute with one another, that is, for all and all . The corresponding strategy is given by and is called a quantum commuting strategy. (Commutativity ensures that this a priori complex value lies in .) The set of quantum commuting strategies is denoted . Note that there is no requirement that be finite dimensional (although one can take it to be separable) and, in fact, requiring to be finite dimensional yields another description of the set (see Reference 16). Later, we will see that is a closed convex subset of and that, like , one can use PVMs instead of POVMs without changing the definition.
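One concrete reason the commuting model subsumes the spatial model is that operators of the form P ⊗ I and I ⊗ Q on a single joint Hilbert space always commute, and the resulting probabilities are automatically real and lie in [0, 1]. The following minimal sketch checks this (the particular projections and state are randomly generated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(d, rank):
    # Orthogonal projection onto a random rank-dimensional subspace of C^d.
    M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    Q, _ = np.linalg.qr(M)
    V = Q[:, :rank]
    return V @ V.conj().T

P = random_projection(2, 1)
Qp = random_projection(3, 2)
A = np.kron(P, np.eye(3))    # "Alice's" operator on C^2 tensor C^3
B = np.kron(np.eye(2), Qp)   # "Bob's" operator on the same space

assert np.allclose(A @ B, B @ A)  # tensor factors commute, as in qc

psi = rng.standard_normal(6) + 1j * rng.standard_normal(6)
psi /= np.linalg.norm(psi)
p = np.vdot(psi, A @ B @ psi)    # the a priori complex value
assert abs(p.imag) < 1e-12 and -1e-12 <= p.real <= 1 + 1e-12
```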

It is clear that . In Reference 59, Boris Tsirelson claimed (without proof) that equality holds for all . After he was questioned about this, he realized that he could not prove this claim. In fact, upon further reflection, he could not even establish whether or not was closed nor whether or not coincided with (see his note Reference 60). The question of whether or not for all is known as Tsirelson’s problem. (Incidentally, in Reference 56 Slofstra showed that Tsirelson’s original claim was false, that is, there is a pair such that , and he even strengthened this result to show that need not be closed, that is, there is for which .)

Fix a nonlocal game with questions and answers. It is clear that

However, we can also use elements of to define values of games; namely, we define the commuting value of to be . It is clear that and that equality holds for all nonlocal games if Tsirelson’s problem has an affirmative answer. In fact, it can be shown that an affirmative answer to Tsirelson’s problem is equivalent to the statement for all nonlocal games .

Recall that in our discussion of the inclusion , we discussed how can be effectively approximated from below. On the other hand, it turns out that can be effectively approximated from above. This result follows from two facts:

There is a finitely presented group (which in fact only depends on the number of questions and answers in ) and an element such that (see Corollary 6.3).

For any finitely presented group , one can always find effective upper bounds on the operator norm of (a result due to Fritz, Netzer, and Thom Reference 25, Corollary 2.2).

In Subsection 7.8, we offer a simple model-theoretic proof of the fact that can be approximated from above, although, to be fair, we really establish a slightly different version of this fact sufficient to derive the failure of Tsirelson’s problem from . In any event, if , that is, if Tsirelson’s problem has an affirmative answer, then we can effectively approximate both from below and above, which would then imply that all languages in are decidable, contradicting !

By the way, the argument in the preceding paragraph shows that , where is defined exactly like but using the commuting value of games instead of the entangled value , and denotes the class of languages whose complement lies in . It is currently unknown if this upper bound is sharp.

6.2. A negative solution to Kirchberg’s QWEP problem

In this subsection, we show how a negative solution to Tsirelson’s problem yields a negative solution to Kirchberg’s QWEP problem. We follow Fritz’s presentation Reference 24 closely.

We begin by considering the abelian -algebra . For each , …, , we let denote the th standard basis element of . (We are using as the index since we are using the notation from nonlocal games.) For any , we also consider the -fold free product and denote by the version of in the th copy of .

Proposition 6.1.
(1)

There is a - correspondence between -outcome POVMs in and ucp maps given by .

(2)

There is a - correspondence between -tuples of -outcome POVMs in and ucp maps given by .

Proof.

Part (1) is easy to check, using the fact that a positive map with commutative domain is automatically completely positive. Part (2) follows from (1) and a theorem of Florin Boca Reference 11, which implies that the individual ucp maps given by  can be jointly extended to a single ucp map .

We now bring group -algebras into the picture, getting us closer to the QWEP problem. We first note that , where denotes the additive group of integers modulo . Indeed, let be a generator of and consider the element . It is readily verified that is an element of of order , whence the assignment yields a unitary representation , extending to a -homomorphism that can be checked to be an isomorphism. (This identification usually goes under the name discrete Fourier transform.)
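The identification in the previous paragraph can be illustrated numerically. Below is a hedged sketch (the notation is ours): in the abelian C*-algebra of k-tuples with pointwise operations, the tuple u of k-th roots of unity is a unitary of order exactly k, and its powers form a basis, which is the content of the discrete Fourier transform.

```python
import numpy as np

k = 5
# u(j) = e^{2 pi i j / k}: the tuple of k-th roots of unity, viewed as a
# single element of the abelian C*-algebra C^k with pointwise operations.
u = np.exp(2j * np.pi * np.arange(k) / k)

assert np.allclose(np.abs(u), 1)        # u is unitary (pointwise modulus 1)
assert np.allclose(u ** k, np.ones(k))  # u^k = 1, so the order divides k
for m in range(1, k):
    assert not np.allclose(u ** m, np.ones(k))  # order is exactly k

# The powers u^0, ..., u^{k-1} form a basis of C^k: stacked as rows they
# give the (invertible) DFT matrix.
F = np.array([u ** m for m in range(k)])
assert abs(np.linalg.det(F)) > 1e-6
```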

Let denote the group freely generated by elements of order . We then have

We abuse notation slightly and let denote the element of corresponding to . (Another viewpoint is that denote the spectral projections corresponding to the th unitary element of .)

Here is the main result connecting the QWEP problem and Tsirelson’s problem:

Theorem 6.2.

Fix and a strategy . We then have:

(1)

if and only if there is a state on for which .

(2)

if and only if there is a state on for which .

Proof.

For the forward direction of (1), we may assume, without loss of generality, that , say , where the POVMs and act on the Hilbert spaces and , respectively. By Proposition 6.1 and the above identification , we have ucp maps and corresponding to these POVMs. These two ucp maps combine to yield a ucp map . Consequently, we can define a state on by setting . It is clear that this state implements as in the statement of (1).

Conversely, suppose that  is a state on  for which . Concretely represent  so that . Extend  to a state on , which we will continue to denote . Since convex combinations of vector states are dense in the state space of , given  there are vectors , …,  for which  for all  and . This shows that  can be approximated by convex combinations of elements of . Since  is closed and convex, we have that .

The proof of the forward direction of (2) is identical to the proof of the forward direction of (1), using the fact that one can combine ucp maps with commuting ranges into a ucp map on the maximal tensor product. For the converse, suppose that is a state on for which . Let be the GNS representation corresponding to the state with cyclic vector . Set and . It is clear that and commute for all and that , whence .

We note that the proof above fulfills a few promises made earlier, namely that elements of  can always be taken to arise from PVMs (instead of just POVMs) and that  is closed and convex (being the continuous image of the compact convex set of states on ).

Given a nonlocal game with questions and answers, set

Corollary 6.3.

For any nonlocal game , we have and .

We remind the reader that Corollary 6.3 is responsible for the negative solution to Tsirelson’s problem from . Indeed, corresponds to the operator norm of