# Algebraic Theory of Phase Retrieval

Communicated by *Notices* Associate Editor Steven Sam

The phase retrieval problem first arose in the early 20th century from work on X-ray crystallography. In the last century, X-ray crystallography has developed into the leading method for elucidating the atomic structure of molecules, leading to enormous scientific breakthroughs: at least 25 Nobel Prizes have been awarded for work directly or indirectly involving crystallography. The phase retrieval problem also occurs in numerous other scientific and engineering applications, such as diffraction imaging, ptychography, ultra-short pulse characterization, speech processing, radar, and astronomy.

In its most general form, the phase retrieval problem can be written as:

where

Without model error or noise, namely, when the measurement model 1 is accurate, phase retrieval is a problem of solving real quadratic equations. As such, it is naturally amenable to algebraic techniques, including commutative algebra, algebraic geometry, and invariant theory. In particular, algebraic methods are powerful techniques for proving that phase retrieval is theoretically possible; i.e., that a particular phase retrieval problem has a unique solution up to the action of its intrinsic symmetry group. Although these methods are typically not algorithmic, they can be used to provide theoretical validations of existing algorithms, and to unveil the fundamental limitations of different methods. The purpose of this article is to discuss recent advances in this growing field of research, and to publicize open problems that we believe will be of interest to mathematicians in general, and algebraists in particular.

In the rest of the article, we succinctly introduce five specific phase retrieval setups that we find important and intriguing, present known results, and delineate open mathematical questions. We conclude by discussing the limitations of the algebraic point of view, and how other mathematical fields, such as information theory, statistics, combinatorics, and optimization, can provide indispensable insights into the phase retrieval problem.

### Phase retrieval with general linear measurements

The revival of interest in the mathematics of the phase retrieval problem in the last 15 years emerged from the study of phase retrieval models with a general sensing matrix. Specifically, in this setup ^{1} with

For a general real matrix

For real matrices, there is a remarkably elegant characterization of when every vector

Note that a necessary condition for an

For complex matrices, the situation is more nuanced.

Here, we view the space of

Due to the lack of characterization, there are relatively few explicit examples of complex

One step towards solving this problem was made by Heinosaari, Mazarella and Wolf HMW13, who used results from topology on minimal embeddings of projective spaces to prove that

### X-ray crystallography

X-ray crystallography—a prevalent technology for determining the three-dimensional atomic structure of molecules—is by far the largest phase retrieval application. In X-ray crystallography, the signal is the electron density function of the crystal — a periodic arrangement of a repeating, compactly supported unit

where

where ^{2}

As the size of the set

This signal is supported only at the sparsely spread positions of atoms. Thus, the crystallographic phase retrieval entails finding a

While this article does not focus on algorithmic questions, we mention that a set of benchmark problems for evaluating crystallographic phase retrieval algorithms was designed in ELB18.

In most models, the signal is assumed to be real, so the auto-correlation function is a real quadratic function. In this case,

We say that two subsets

where each difference is counted with multiplicity. For example, if

The phase retrieval problem for binary sets is equivalent to the combinatorial question of whether two subsets with the same cyclic difference multi-sets are dihedrally equivalent. This question does not have an affirmative answer. For example, the subsets of

At the other extreme, we can consider a model where the non-zero entries of the sparse vector are assumed to be arbitrary. In this case, the goal is to prove that a suitably *generic* sparse vector is determined up to the action of

In BE20, we conjectured that if

The first step is support recovery; that is, verifying that the auto-correlation determines the support of a generic signal up to dihedral equivalence. Note that for a generic signal

The reason that Conjecture 1 implies support recovery follows from the fact that the image of

The second step is signal recovery for a signal with known support. Let

To see that Conjecture 2 implies signal recovery, note that

Conjectures 1 and 2 are similar to questions about the dimensions and degrees of secant varieties that arise in the literature on tensor decompositions. Indeed, if

Although we do not yet know how to prove either Conjecture 1 or Conjecture 2, they can be easily verified for small values of

Interestingly, using tools from harmonic analysis and information theory, it was recently proven that

**Future directions.** Thus far we have discussed binary and generic models for X-ray crystallography. In practice, however, the model should account for sparse signals whose non-zero entries are taken from a finite (small) alphabet; this alphabet models the relevant type of atoms, such as hydrogen, oxygen, carbon, nitrogen, and so on. Any analysis of this model would be probabilistic, but we expect that in the case of fixed finite alphabet, the probability that a sparse signal can be recovered from its auto-correlation is asymptotic to one as the signal length

### Fourier phase retrieval

In this section, we consider the problem of recovering a signal from its aperiodic auto-correlation:

where