Notices of the American Mathematical Society
Volume 72 | Number 5 | May 2025
Machine Learning and Invariant Theory

1. Introduction
Modern machine learning has not only surpassed the state of the art in many engineering and scientific problems, but it also has had an impact on society at large, and will likely continue to do so. This includes deep learning, large language models, diffusion models, etc. In this article, we give an account of certain mathematical principles that are used in the definition of some of these machine learning models, and we explain how classical invariant theory plays a role in them. Due to space constraints we leave out many relevant references. A version of this manuscript with a longer set of references is available on arXiv [1].
In supervised machine learning, we typically have a training set $\{(x_i, y_i)\}_{i=1}^N$, where the $x_i \in \mathcal{X}$ are the data points and the $y_i \in \mathcal{Y}$ are the labels. A typical example is image recognition, where the $x_i$ are images and the $y_i$ are image labels (say, “cat” or “dog”), encoded as vectors. The goal is to find a function $h$ in a hypothesis space $\mathcal{H}$ that not only approximately interpolates the training data ($h(x_i) \approx y_i$) but also performs well on unseen (or held-out) data. The function $h$ is called the trained model, predictor, or estimator. In practice, one parametrizes the class of functions with some parameters $\theta$ varying over a space of parameter values $\Theta$ sitting inside some $\mathbb{R}^D$; in other words, $\mathcal{H} = \{h_\theta : \theta \in \Theta\}$. Then one uses local optimization (in $\Theta$) to find a function in $\mathcal{H}$ that locally and approximately minimizes a prespecified empirical loss function $\mathcal{L}$, which compares a candidate function’s values on the $x_i$’s with the “true” target values $y_i$. In other words, one approximately solves $\hat\theta \in \arg\min_{\theta \in \Theta} \mathcal{L}(\theta)$ and then takes $h = h_{\hat\theta}$.
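For concreteness, here is a minimal Python sketch of this workflow (a toy linear model fit by gradient descent on a squared loss; all names and sizes are illustrative, not from the article).

```python
import numpy as np

# A toy instance of the setup above: a linear hypothesis class h_theta(x) = <theta, x>,
# fit by gradient descent on the empirical squared loss. All names are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                               # data points x_i as rows
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)     # noisy labels y_i

theta = np.zeros(5)                                         # parameters in Theta
lr = 0.1
for _ in range(500):
    residual = X @ theta - y                                # h_theta(x_i) - y_i
    theta -= lr * (X.T @ residual) / len(y)                 # local optimization step

print("empirical loss:", np.mean((X @ theta - y) ** 2))
```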
Modern machine learning performs regressions on classes of functions that are typically overparameterized (the dimension of the space of parameters is much larger than the number of training samples), and in many cases, several functions in the hypothesis class can interpolate the data perfectly. Deep learning models can even interpolate plain noise, or fit images to random labels. Moreover, the optimization problem is typically nonconvex. Therefore the model performance is highly dependent on how the class of functions is parameterized and the optimization algorithms employed.
The parameterization of the hypothesis class of functions is what in deep learning is typically referred to as the architecture. In recent years, the most successful architectures have been ones that use properties or heuristics regarding the structure of the data (and the problem) to design the class of functions: convolutional neural networks for images, recurrent neural networks for time series, graph neural networks for graph-structured data, transformers, etc. Many of these design choices are related to the symmetries of the problem: for instance, convolutional neural networks can be translation equivariant, and transformers can be permutation invariant.
When the learning problem comes from the physical sciences, there are concrete sets of rules that the function being modeled must obey, and these rules often entail symmetries. The rules (and symmetries) typically come from coordinate freedoms and conservation laws [19]. One classical example of these coordinate freedoms is the scaling symmetry that comes from dimensional analysis (for instance, if the input data to the model is rescaled to change everything that has units of kilograms to pounds, the predictions should scale accordingly). In order to do machine learning on physical systems, researchers have designed models that are consistent with physical law; this is the case for physics-informed machine learning, neural ODEs and PDEs, and equivariant machine learning.
Given data spaces $\mathcal{X}$ and $\mathcal{Y}$ and a group $G$ acting on both of them, a function $f: \mathcal{X} \to \mathcal{Y}$ is equivariant if $f(g \cdot x) = g \cdot f(x)$ for all $g \in G$ and all $x \in \mathcal{X}$. Many physical problems are equivariant with respect to rotations, permutations, or scalings. For instance, consider a problem where one uses data to predict the dynamics of a folding protein or uses simulated data to emulate the dynamics of a turbulent fluid. Equivariant machine learning restricts the hypothesis space to a class of equivariant functions. The philosophy is that every function that the machine learning model can express is equivariant, and therefore consistent with physical law.
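As a concrete illustration of the definition, the following short Python sketch (ours; the function and names are illustrative) checks numerically that the centroid of a point set is O(3)-equivariant.

```python
import numpy as np

# Numerical check of f(g . x) = g . f(x) for G = O(3) acting on a set of points
# by rotating/reflecting each one. The equivariant function here (the centroid)
# and all names are illustrative.
rng = np.random.default_rng(1)

def f(points):                      # points: (n, 3) array of vectors
    return points.mean(axis=0)      # the centroid is O(3)-equivariant

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # a random orthogonal matrix
points = rng.normal(size=(5, 3))

lhs = f(points @ Q.T)               # f(g . x): transform inputs, then apply f
rhs = Q @ f(points)                 # g . f(x): apply f, then transform the output
print(np.allclose(lhs, rhs))        # True, up to floating-point error
```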
Symmetries were used for machine learning (and in particular neural networks) in early works [16], and more recently they have been revisited in the context of deep learning. There are three main ways to implement symmetries. The simplest one parameterizes the invariant and equivariant functions with respect to discrete groups by averaging arbitrary functions over the group orbit [3]. The second approach, explained in the next section, uses classical representation theory to parameterize the space of equivariant functions (see for instance [8]). The third approach, the main point of this article, uses invariant theory.
As an example, we briefly discuss graph neural networks (GNNs), which have been a very popular area of research in the past couple of years. GNNs can be seen as equivariant functions that take a graph represented by its adjacency matrix $A \in \mathbb{R}^{n \times n}$ and possible node features $X \in \mathbb{R}^{n \times k}$ and output an embedding $f(A, X)$, so that $f(\Pi A \Pi^\top, \Pi X) = \Pi\, f(A, X)$ for all $n \times n$ permutation matrices $\Pi$. Graph neural networks are typically implemented as variants of graph convolutions or message passing, which are equivariant by definition. However, many equivariant functions cannot be expressed with these architectures. Several recent works analyze the expressive power of different GNN architectures in connection to the graph isomorphism problem.
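The permutation-equivariance condition is easy to verify numerically. The sketch below (a generic one-layer message-passing map of our own choosing, not a specific published architecture) checks it for random inputs.

```python
import numpy as np

# Check that a basic message-passing layer satisfies
# layer(P A P^T, P X) == P layer(A, X) for a permutation matrix P.
# The layer is a generic sketch, not a specific published architecture.
rng = np.random.default_rng(2)
n, d_in, d_out = 6, 4, 3

A = rng.integers(0, 2, size=(n, n))
A = np.triu(A, 1); A = A + A.T                     # symmetric adjacency matrix
X = rng.normal(size=(n, d_in))                     # node features
W = rng.normal(size=(d_in, d_out))                 # learned weights

def layer(A, X):
    return np.maximum(A @ X @ W, 0.0)              # aggregate neighbors, then ReLU

P = np.eye(n)[rng.permutation(n)]                  # random permutation matrix
print(np.allclose(layer(P @ A @ P.T, P @ X), P @ layer(A, X)))   # True
```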
Beyond graphs, equivariant machine learning models have been extremely successful at predicting molecular structures and dynamics, protein folding, protein binding, and simulating turbulence and climate effects, to name a few applications. Theoretical developments have shown the universality of certain equivariant models, as well as generalization improvements of equivariant machine learning models over nonequivariant baselines. There has been some recent work studying the inductive bias of equivariant machine learning, and its relationship with data augmentation. See [1] for a list of references on these topics.
2. Equivariant Convolutions and Multilayer Perceptrons
Modern deep learning models have evolved from the classical artificial neural network known as the perceptron. The multilayer perceptron model takes an input $x \in \mathbb{R}^{d_0}$ and outputs $h(x) \in \mathbb{R}^{d_L}$, defined to be the composition of affine linear maps and nonlinear entrywise functions. Namely,

$$h(x) = \sigma\Big(A_L\big(\sigma\big(A_{L-1}(\cdots \sigma(A_1(x))\cdots)\big)\big)\Big), \tag{1}$$
where $\sigma$ is the (fixed) entrywise nonlinear function and $A_1, \ldots, A_L$ are affine linear maps to be learned from the data. The linear maps can be expressed as $A_i(z) = W_i z + b_i$, where $W_i \in \mathbb{R}^{d_i \times d_{i-1}}$ and $b_i \in \mathbb{R}^{d_i}$. In this example each function $h$ is defined by the parameters $\theta = (W_1, b_1, \ldots, W_L, b_L)$.
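A direct transcription of (1) in code may help fix the notation; the layer sizes and the choice of ReLU below are illustrative.

```python
import numpy as np

# A minimal multilayer perceptron in the form of (1); layer sizes are illustrative.
rng = np.random.default_rng(3)
sizes = [4, 16, 16, 2]                                  # d_0, d_1, d_2, d_3
params = [(0.1 * rng.normal(size=(m, k)), np.zeros(m))  # (W_i, b_i) with A_i(z) = W_i z + b_i
          for k, m in zip(sizes[:-1], sizes[1:])]

def sigma(z):
    return np.maximum(z, 0.0)                           # fixed entrywise nonlinearity

def h(x):
    for W, b in params:
        x = sigma(W @ x + b)                            # sigma(A_i(x))
    return x

print(h(rng.normal(size=4)))
```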
The first neural network that was explicitly equivariant with respect to a group action is the convolutional neural network. The observation is that if the input images are seen as signals on the (discrete) torus, the linear translation-equivariant maps are cross-correlations (which in ML are referred to as convolutions) with fixed filters. The idea of restricting the linear maps to satisfy symmetry constraints was generalized to equivariance with respect to discrete rotations and translations, and to general homogeneous spaces. Note that when working with arbitrary groups there are restrictions on the nonlinearities $\sigma$ for the model to be equivariant.
Classical results in neural networks show that certain multilayer perceptrons can universally approximate any continuous function in the limit where the number of neurons goes to infinity. However, that is not true in general in the equivariant case. Namely, functions expressed as (1) where the $A_i$ are linear and equivariant may not universally approximate all continuous equivariant functions. In some cases, there may not even exist nontrivial linear equivariant maps.
One popular idea to address this issue is to extend this model to use equivariant linear maps on tensors. Now the $A_i: (\mathbb{R}^d)^{\otimes k_{i-1}} \to (\mathbb{R}^d)^{\otimes k_i}$ are linear equivariant maps between tensor powers (where the action on the tensor product is defined as the tensor product of the action in each component and extended linearly). Now the question is: how can we parameterize the space of such maps in order to do machine learning? The answer is via Schur’s lemma.
A representation of a group $G$ is a map $\rho: G \to \mathrm{GL}(V)$ that satisfies $\rho(g_1 g_2) = \rho(g_1)\rho(g_2)$ (where $V$ is a vector space and, as usual, $\mathrm{GL}(V)$ denotes the automorphisms of $V$, that is, invertible linear maps $V \to V$). A group action of $G$ on $V$ (written as $g \cdot v$) is equivalent to a group representation $\rho$ such that $g \cdot v = \rho(g)\, v$. We extend the action to the tensor product $V^{\otimes k}$ so that the group acts independently in every tensor factor (i.e., in every dimension or mode), namely $g \cdot (v_1 \otimes \cdots \otimes v_k) = (g \cdot v_1) \otimes \cdots \otimes (g \cdot v_k)$.
The first step is to note that a linear equivariant map between tensor representations is exactly a homomorphism of group representations: a linear map $T: V_1 \to V_2$ between representations $(\rho_1, V_1)$ and $(\rho_2, V_2)$ such that $T \rho_1(g) = \rho_2(g)\, T$ for all $g \in G$. Homomorphisms between group representations are easily parametrizable if we decompose the representations in terms of irreducible representations (aka irreps).
In particular, Schur’s Lemma says that a map between two irreps over $\mathbb{C}$ is zero (if they are not isomorphic) or a multiple of the identity (if they are).
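Before turning to the irrep-based parametrization, here is a brute-force numerical illustration of the constraint $T\rho_1(g) = \rho_2(g)\,T$: one can compute the space of equivariant linear maps directly as the null space of a linear system. The example (the cyclic group acting on $\mathbb{R}^n$ by cyclic shifts, for which the equivariant maps are the circulant matrices) is our own and is not how the methods discussed next are implemented.

```python
import numpy as np

# Brute-force computation of the equivariant linear maps for a toy action:
# the cyclic group C_n acting on R^n by cyclic shifts. We solve T S = S T
# (S = shift matrix, the generator) as a linear system in the entries of T;
# the solution space is the n-dimensional space of circulant matrices.
n = 5
S = np.roll(np.eye(n), 1, axis=0)                  # cyclic shift permutation matrix

# vec(T S - S T) = (S^T kron I - I kron S) vec(T), with column-major vec
M = np.kron(S.T, np.eye(n)) - np.kron(np.eye(n), S)
_, sing, Vt = np.linalg.svd(M)
basis = Vt[sing < 1e-10]                           # rows spanning the null space
print("dimension of the space of equivariant maps:", len(basis))   # 5 = n

T = basis[0].reshape(n, n, order="F")              # one equivariant map
print(np.allclose(T @ S, S @ T))                   # True
```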
The equivariant neural-network approach consists in decomposing the group representations in terms of irreps and explicitly parameterizing the maps [9]. In general, it is not obvious how to decompose an arbitrary group representation into irreps. However, in the case of $\mathrm{SO}(3)$, the decomposition of a tensor representation as a sum of irreps is given by the Clebsch–Gordan decomposition:

$$V_{\ell_1} \otimes V_{\ell_2} \cong \bigoplus_{\ell = |\ell_1 - \ell_2|}^{\ell_1 + \ell_2} V_\ell, \tag{3}$$

where $V_\ell$ denotes the irrep of $\mathrm{SO}(3)$ of dimension $2\ell + 1$.
The Clebsch–Gordan decomposition not only gives the decomposition on the right side of (3), but it also gives the explicit change of coordinates. This decomposition is fundamental for implementing the equivariant 3D point-cloud methods defined in [8] and other works (see references in [1]). Moreover, recent work [6] shows that the classes of functions used in practice are universal, meaning that every continuous SO(3)-equivariant function can be approximated uniformly on compact sets by those neural networks. However, there exists a clear limitation to this approach: even though decompositions into irreps are broadly studied in mathematics (plethysm), finding the explicit transformation that realizes the decomposition of tensor representations into irreps is a hard problem in general. It is called the Clebsch–Gordan problem.
3. Invariant Theory for Machine Learning
An alternative but related approach to the linear equivariant layers described above is the approach based on invariant theory, the focus of this article. In particular, the authors of this note and collaborators [18] explain that for some physically relevant groups—the orthogonal group, the special orthogonal group, and the Lorentz group—one can use classical invariant theory to design universally expressive equivariant machine learning models that are expressed in terms of the generators of the algebra of invariant polynomials. Following an idea attributed to B. Malgrange (that we learned from G. Schwarz), it is shown how to use the generators of the algebra of invariant polynomials to produce a parameterization of equivariant functions for a specific set of groups and actions.
To illustrate, let us focus on $\mathrm{O}(d)$-equivariant functions, namely functions $f: (\mathbb{R}^d)^n \to \mathbb{R}^d$ such that $f(Q v_1, \ldots, Q v_n) = Q\, f(v_1, \ldots, v_n)$ for all $Q \in \mathrm{O}(d)$ and all $(v_1, \ldots, v_n) \in (\mathbb{R}^d)^n$ (for instance, the prediction of the position and velocity of the center of mass of a particle system). The method of B. Malgrange (explicated below) leads to the conclusion that all such functions can be expressed as

$$f(v_1, \ldots, v_n) = \sum_{i=1}^{n} f_i(v_1, \ldots, v_n)\, v_i, \tag{4}$$
where the $f_i: (\mathbb{R}^d)^n \to \mathbb{R}$ are $\mathrm{O}(d)$-invariant functions. Classical invariant theory shows that a function is $\mathrm{O}(d)$-invariant if and only if it is a function of the pairwise inner products $\langle v_j, v_k \rangle$. So, in actuality, (4) can be rewritten

$$f(v_1, \ldots, v_n) = \sum_{i=1}^{n} g_i\big(\{\langle v_j, v_k \rangle\}_{j,k=1}^{n}\big)\, v_i, \tag{5}$$
where the $g_i$ are arbitrary functions. In other words, the pairwise inner products generate the algebra of invariant polynomials for this action, and every equivariant map is a linear combination of the vectors themselves with coefficients in this algebra.
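In code, the parametrization (5) amounts to computing the Gram matrix of the inputs, feeding it to arbitrary (e.g., learned) scalar functions, and using the outputs as coefficients on the input vectors. The sketch below (with stand-in functions in place of learned ones) checks the resulting equivariance numerically.

```python
import numpy as np

# The parametrization (5): coefficients are arbitrary functions of the Gram
# matrix of pairwise inner products, applied to the input vectors themselves.
# The particular g below is an arbitrary stand-in for learned functions.
rng = np.random.default_rng(4)
n, d = 4, 3

def g(gram):                          # one scalar coefficient per input vector
    return np.tanh(gram).sum(axis=1)  # any function of the inner products works

def f(V):                             # V: (n, d) array with rows v_1, ..., v_n
    gram = V @ V.T                    # pairwise inner products <v_i, v_j>
    return g(gram) @ V                # sum_i g_i(gram) * v_i

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))     # random orthogonal matrix
V = rng.normal(size=(n, d))
print(np.allclose(f(V @ Q.T), Q @ f(V)))         # equivariance holds: True
```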
In this article, we explicate the method of B. Malgrange in full generality, showing how to convert knowledge of the algebra of invariant polynomials into a characterization of the equivariant polynomial (or smooth) maps. In Section 4 we explain the general philosophy of the method, and in Section 5 we give the precise algebraic development, formulated as an algorithm to produce parametrizations of equivariant maps given adequate knowledge of the underlying invariant theory. In Section 6, we work through several examples.
We note that, for machine learning purposes, it is not critical that the functions $g_i$ are defined on invariant polynomials, nor that they themselves are polynomials. In the following, we focus on polynomials because the ideas were developed in the context of invariant theory; the arguments explicated below are set in this classical context. However, in [18], the idea is to preprocess the data by converting the tuple $(v_1, \ldots, v_n)$ to the tuple of dot products $(\langle v_j, v_k \rangle)_{j,k}$ and then treat the latter as input to the $g_i$’s, which are then learned using a machine learning architecture of one’s choice. Therefore, the $g_i$’s are not polynomials but belong to whatever function class is output by the chosen architecture. Meanwhile, some recent works [2, 5, 11] have proposed alternative classes of separating invariants that can be used in place of the classical algebra generators as input to the $g_i$’s, and may have better numerical stability properties. This is a promising research direction.
4. Big Picture
We are given a group $G$ and finite-dimensional linear $G$-representations $V$ and $W$ over a field $k$. (We can take $k = \mathbb{R}$ or $\mathbb{C}$.) We want to understand the equivariant polynomial maps $V \to W$. We assume we have a way to understand $G$-invariant polynomials on spaces related to $V$ and $W$, and the goal is to leverage that knowledge to understand the equivariant maps.
The following is a philosophical discussion, essentially to answer the question: why should it be possible to do this? It is not precise; its purpose is just to guide thinking. Below in Section 5 we show how to actually compute the equivariant polynomials given adequate knowledge of the invariants. That section is rigorous.
The first observation is that any reasonable family of maps $V \to W$ (for example linear, polynomial, smooth, continuous, etc.) has a natural $G$-action induced from the actions on $V$ and $W$, and that the $G$-equivariant maps in such a family are precisely the fixed points of this action, as we now explain. This observation is a standard insight in representation and invariant theory.
Let $\mathcal{F}$ be the set of maps $V \to W$ of whatever kind, and let $\mathrm{GL}(V)$ (respectively $\mathrm{GL}(W)$) be the group of linear invertible maps from $V$ (respectively $W$) to itself. Given $g \in G$ and $f \in \mathcal{F}$, we define the map $g \cdot f$ by

$$(g \cdot f)(v) = \rho_W(g)\, f\big(\rho_V(g)^{-1} v\big), \tag{6}$$
where $\rho_V: G \to \mathrm{GL}(V)$ and $\rho_W: G \to \mathrm{GL}(W)$ are the group homomorphisms defining the representations $V$ and $W$. The algebraic manipulation to verify that this is really a group action is routine and not that illuminating. A perhaps more transparent way to understand this definition of the action as “the right one” is that it is precisely the formula needed to make this square commute:

$$\begin{array}{ccc} V & \xrightarrow{\;f\;} & W \\ \rho_V(g)\downarrow\; & & \;\downarrow\rho_W(g) \\ V & \xrightarrow{\;g \cdot f\;} & W \end{array}$$
It follows from the definition of this action that the condition $g \cdot f = f$ for all $g \in G$ is equivalent to the statement that $f$ is $G$-equivariant. The square above automatically commutes, so $g \cdot f = f$ is the same as saying that the square below commutes—

$$\begin{array}{ccc} V & \xrightarrow{\;f\;} & W \\ \rho_V(g)\downarrow\; & & \;\downarrow\rho_W(g) \\ V & \xrightarrow{\;f\;} & W \end{array}$$
—and this is what it means to be equivariant.
An important special case of (6) is the action of $G$ on the linear dual $V^*$ of $V$. This is the case $W = k$ with trivial action, and for $\lambda \in V^*$, (6) reduces to $(g \cdot \lambda)(v) = \lambda\big(\rho_V(g)^{-1} v\big)$. This is known as the contragredient action. We will utilize it momentarily with $W$ in the place of $V$.
The second observation is that $\mathcal{F}$ can be identified with a set of functions from a bigger space to the underlying field by “currying,” and this change in point of view preserves the group action. Again, this is a standard maneuver in algebra. Specifically, given any map $f: V \to W$, we obtain a function $\tilde f: V \times W^* \to k$, defined by the formula

$$\tilde f(v, w) = w\big(f(v)\big).$$
Note that the function $\tilde f$ is linear homogeneous in $w$. Conversely, given any function $F: V \times W^* \to k$ that is linear homogeneous in the second coordinate, we can recover a map $f: V \to W$ such that $F = \tilde f$ by taking $f(v)$ to be the element of $W$ identified, along the canonical isomorphism $W \cong (W^*)^*$, with the functional on $W^*$ that sends $w$ to $F(v, w)$; this functional is guaranteed to exist by the fact that $F$ is linear homogeneous in the second coordinate. An observation we will exploit in the next section is that the desired functional is actually the gradient of $F$ with respect to $w$.
This construction gives an identification of $\mathcal{F}$ with a subset of the functions $V \times W^* \to k$. Furthermore, there is a natural action of $G$ on such functions, defined precisely by the above formula (6) with $V \times W^*$ in place of $V$, $k$ in place of $W$, and trivial action on $k$ (Footnote 1); and the identification described here preserves this action. Therefore, the fixed points for the $G$-action on $\mathcal{F}$ correspond with fixed points for the $G$-action on functions $V \times W^* \to k$, which are invariant functions (since the action of $G$ on $k$ is trivial).
Footnote 1. The action of $G$ on $V \times W^*$ is defined by acting separately on each factor; the action on $W^*$ is the contragredient representation defined above.
What has been achieved is the reinterpretation of equivariant maps $V \to W$ first as fixed points of a $G$-action, and then as invariant functions on $V \times W^*$. Thus, knowledge of invariant functions can be parlayed into knowledge of equivariant maps.
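To make this reinterpretation concrete, the following toy Python check (our own example) curries an SO(3)-equivariant map, the cross product, into a scalar function and verifies numerically that the curried function is invariant under the simultaneous action on the inputs and the dual variable; since rotations are orthogonal, the contragredient action here is again rotation.

```python
import numpy as np

# Toy check of the currying step: for the SO(3)-equivariant map f(v1, v2) = v1 x v2,
# the curried scalar function ftilde(v1, v2, w) = <w, f(v1, v2)> is invariant when
# v1, v2, and w are all rotated by the same R (rotations are orthogonal, so the
# contragredient action on w is again rotation by R).
rng = np.random.default_rng(5)

def f(v1, v2):
    return np.cross(v1, v2)            # R v1 x R v2 = R (v1 x v2) for det(R) = 1

def ftilde(v1, v2, w):
    return w @ f(v1, v2)               # curried, scalar-valued version of f

Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q * np.sign(np.linalg.det(Q))      # flip the sign if needed so det(R) = +1
v1, v2, w = rng.normal(size=(3, 3))

print(np.isclose(ftilde(R @ v1, R @ v2, R @ w), ftilde(v1, v2, w)))   # True
```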
5. Equivariance from Invariants
With the above imprecise philosophical discussion as a guide, Algorithm 1 shows how in practice to get from a description of invariant polynomials on $V \oplus W^*$ to equivariant polynomial (or smooth) maps $V \to W$. The technique given here is attributed to B. Malgrange; see [13, Proposition 2.2], where it is used to obtain the smooth equivariant maps, and [15, Proposition 6.8], where it is used to obtain holomorphic equivariant maps. Variants on this method are used to compute equivariant maps in [7, Sections 2.1–2.2], [12, Section 3.12], [4, Section 4.2.3], and [20, Section 4].
The goal of the algorithm is to provide a parametrization of equivariant maps. That said, the proof of correctness is constructive: as an ancillary benefit, it furnishes a method for taking an arbitrary equivariant map given by explicit polynomial expressions for the coordinates and expressing it in terms of this parametrization.
Algorithm 1: Malgrange’s method for getting equivariant functions
- (1)
Order the bihomogeneous generators $q_1, q_2, \ldots$ of the algebra of invariant polynomials on $V \oplus W^*$ so that $q_1, \ldots, q_r$ are of degree $0$ in $w$ and $q_{r+1}, \ldots, q_{r+s}$ are of degree $1$ in $w$. Discard the remaining generators (of higher degree in $w$).
- (2)
Choose a basis $e_1, \ldots, e_\ell$ for $W$, and let $\varepsilon_1, \ldots, \varepsilon_\ell$ be the dual basis for $W^*$, so an arbitrary element $w \in W^*$ can be written

$$w = w_1 \varepsilon_1 + \cdots + w_\ell \varepsilon_\ell$$

with $w_j = w(e_j)$.
- (3)
For $t = 1, \ldots, s$ and for $v \in V$, let $\phi_t(v)$ be the gradient of $q_{r+t}(v, w)$ with respect to $w$, identified with an element of $W$ along the canonical isomorphism $(W^*)^* \cong W$; explicitly,

$$\phi_t(v) = \nabla_w\, q_{r+t}(v, w) = \sum_{j=1}^{\ell} \frac{\partial q_{r+t}}{\partial w_j}(v, w)\, e_j.$$
Then each $\phi_t$ is a function $V \to W$, and every equivariant map $f: V \to W$ (in the polynomial or smooth setting) can be written as $f = \sum_{t=1}^{s} p_t(q_1, \ldots, q_r)\, \phi_t$ for suitable coefficient functions $p_t$.
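As a small symbolic sanity check of Steps 1–3 (using an example of our own choosing, not one worked out below), consider $\mathrm{SO}(2)$ acting on a single vector $v \in \mathbb{R}^2$: the invariants of $(v, w)$ are generated by $\langle v, v \rangle$, $\langle v, w \rangle$, $\det(v, w)$, and $\langle w, w \rangle$, and the algorithm recovers the familiar fact that every $\mathrm{SO}(2)$-equivariant polynomial map is a combination of $v$ and its $90°$ rotation with coefficients depending on $\langle v, v \rangle$.

```python
import sympy as sp

# Symbolic run of Steps 1-3 for SO(2) acting on a single vector v in R^2
# (identifying W* with R^2). Generators of the invariants of (v, w):
#   degree 0 in w:  <v, v>
#   degree 1 in w:  <v, w> and det(v, w)
#   degree 2 in w:  <w, w>   (discarded in Step 1)
v1, v2, w1, w2 = sp.symbols("v1 v2 w1 w2")

q_dot = v1 * w1 + v2 * w2                 # <v, w>
q_det = v1 * w2 - v2 * w1                 # det(v, w)

# Step 3: gradients with respect to w give the maps phi_t
phi_1 = sp.Matrix([sp.diff(q_dot, w1), sp.diff(q_dot, w2)])   # = v
phi_2 = sp.Matrix([sp.diff(q_det, w1), sp.diff(q_det, w2)])   # = v rotated by 90 degrees
print(phi_1.T, phi_2.T)
# Conclusion: every SO(2)-equivariant polynomial map R^2 -> R^2 has the form
#   f(v) = p_1(<v, v>) * v + p_2(<v, v>) * (v rotated by 90 degrees).
```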
We now exposit in detail Algorithm 1 and its proof of correctness, in the case where $f$ is a polynomial map; for simplicity we take $k = \mathbb{R}$. The argument is similar for smooth or holomorphic maps, except that one needs an additional theorem to arrive at the expression (7) below. If $G$ is a compact Lie group, the needed theorem is proven in [14] for smooth maps, and in [10] for holomorphic maps over $\mathbb{C}$.
We begin with linear representations $V$ and $W$ of a group $G$ over $\mathbb{R}$. We take $W^*$ to be the contragredient representation to $W$, defined above. (If $G$ is compact, we can work in a coordinate system in which the action of $G$ on $W$ is orthogonal, and then we may ignore the distinction between $W$ and $W^*$, as discussed above.) We suppose we have an explicit set $q_1, \ldots, q_m$ of polynomials that generate the algebra of invariant polynomials on the vector space $V \oplus W^*$ (denoted $\mathbb{R}[V \oplus W^*]^G$); in other words, they have the property that any invariant polynomial can be written as a polynomial in these. We also assume they are bihomogeneous, i.e., independently homogeneous in $v$ and in $w$. To reduce notational clutter we suppress the maps specifying the actions of $G$ on $V$ and $W^*$ (which were called $\rho_V$ and $\rho_W$ in the previous section), writing the image of $v$ (respectively $w$) under the action of an element $g$ as $g \cdot v$ (respectively $g \cdot w$). We suppose $q_1, \ldots, q_r$ are degree $0$ in $w$ (so they are functions of $v$ alone), $q_{r+1}, \ldots, q_{r+s}$ are degree $1$ in $w$, and $q_{r+s+1}, \ldots, q_m$ are degree $\geq 2$ in $w$.
Now we consider an arbitrary $G$-equivariant polynomial function $f: V \to W$.
We let $w \in W^*$ be arbitrary, and, as in the previous section, we construct the function

$$\tilde f(v, w) = w\big(f(v)\big).$$
Equivariance of $f$ implies that $\tilde f$ is invariant:

$$\tilde f(g \cdot v, g \cdot w) = (g \cdot w)\big(f(g \cdot v)\big) = w\big(g^{-1} \cdot f(g \cdot v)\big) = w\big(f(v)\big) = \tilde f(v, w).$$
From the invariance of $\tilde f$ and the fact that $q_1, \ldots, q_m$ generate the algebra of invariant polynomials on $V \oplus W^*$, we have an equality of the form

$$\tilde f(v, w) = p\big(q_1(v, w), \ldots, q_m(v, w)\big), \tag{7}$$
where $p$ is a polynomial. Note that $q_1, \ldots, q_r$ do not depend on $w$, while $q_{r+1}, \ldots, q_m$ do.
We now fix $v$ and take the gradient of both sides of (7) with respect to $w$, viewed as an element of $W$ (Footnote 2). Choosing dual bases $e_1, \ldots, e_\ell$ for $W$ and $\varepsilon_1, \ldots, \varepsilon_\ell$ for $W^*$, and writing $w = w_1 \varepsilon_1 + \cdots + w_\ell \varepsilon_\ell$, we can express the operator $\nabla_w$, acting on a smooth function $F$ of $v$ and $w$, explicitly by the formula

$$\nabla_w F = \sum_{j=1}^{\ell} \frac{\partial F}{\partial w_j}\, e_j.$$
Footnote 2. In the background, we are using canonical isomorphisms to identify $W^*$ with all its tangent spaces, and $(W^*)^*$ with $W$.
Applying $\nabla_w$ to the left side of (7), we get

$$\nabla_w \tilde f(v, w) = f(v),$$
so $\nabla_w$ recovers $f$ from $\tilde f$. (Indeed, this was the point.) Meanwhile, applying $\nabla_w$ to the right side of (7), writing $\partial_j p$ for the partial derivative of $p$ with respect to its $j$th argument, and using the chain rule, we get

$$\sum_{j=1}^{m} \partial_j p\big(q_1(v, w), \ldots, q_m(v, w)\big)\, \nabla_w\, q_j(v, w).$$
Combining these, we conclude

$$f(v) = \sum_{j=1}^{m} \partial_j p\big(q_1(v, w), \ldots, q_m(v, w)\big)\, \nabla_w\, q_j(v, w). \tag{8}$$
Now we observe that $\nabla_w\, q_j = 0$ if $j \leq r$, because in those cases $q_j$ is constant with respect to $w$. But meanwhile, the left side of (8) does not depend on $w$, and it follows the right side does not either; thus we can evaluate it at our favorite choice of $w$; we take $w = 0$. Upon doing this, $\nabla_w\, q_j$ also becomes $0$ for $j > r + s$, because in these cases $q_j$ is homogeneous of degree at least $2$ in $w$, so its partial derivatives with respect to the $w_j$ remain homogeneous of degree at least $1$ in $w$; thus they vanish at $w = 0$. Meanwhile, $q_j$ itself vanishes at $w = 0$ for $j > r$, so that the $(r+1)$st to $m$th arguments of each $\partial_j p$ vanish. Abbreviating
$$\partial_{r+t}\, p\big(q_1(v), \ldots, q_r(v), 0, \ldots, 0\big)$$

as $p_t\big(q_1(v), \ldots, q_r(v)\big)$, we may thus rewrite (8) as

$$f(v) = \sum_{t=1}^{s} p_t\big(q_1(v), \ldots, q_r(v)\big)\, \nabla_w\, q_{r+t}(v, w).$$
Finally, we observe that, $q_{r+t}$ being linear homogeneous in $w$ for $t = 1, \ldots, s$, the gradient $\nabla_w\, q_{r+t}$ is degree $0$ in $w$, i.e., it does not depend on $w$. So we may call it $\phi_t$ as in the algorithm, and we have finally expressed $f$ as the sum $\sum_{t=1}^{s} p_t(q_1, \ldots, q_r)\, \phi_t$, as promised.
6. Examples
In this section we apply Malgrange’s method to parametrize equivariant functions in various examples. In all cases, for positive integers $n$ and $d$, we take a group $G$ of $d \times d$ matrices, equipped with its canonical action on $\mathbb{R}^d$, and we are looking for equivariant maps

$$f: (\mathbb{R}^d)^n \to \mathbb{R}^d$$
from an $n$-tuple of vectors to a single vector. The underlying invariant theory is provided by Weyl’s The Classical Groups in each case.
The orthogonal group. We parametrize maps $f: (\mathbb{R}^d)^n \to \mathbb{R}^d$ that are equivariant for $G = \mathrm{O}(d)$. By the Riesz representation theorem, we can identify $\mathbb{R}^d$ with $(\mathbb{R}^d)^*$ along the map $u \mapsto \langle u, \cdot \rangle$, where $\langle \cdot, \cdot \rangle$ is the standard dot product. Since $\mathrm{O}(d)$ preserves this product (by definition), this identification is equivariant with respect to the $\mathrm{O}(d)$-action; thus $(\mathbb{R}^d)^*$ is isomorphic with $\mathbb{R}^d$ as a representation of $\mathrm{O}(d)$. We may therefore ignore the difference between $W = \mathbb{R}^d$ and $W^*$ in applying the algorithm.
Thus, we consider the ring of $\mathrm{O}(d)$-invariant polynomials on tuples

$$(v_1, \ldots, v_n, w) \in (\mathbb{R}^d)^{n+1}.$$
We begin with bihomogeneous generators for this ring. By a classical theorem of Weyl known as the first fundamental theorem for $\mathrm{O}(d)$, they are the dot products $\langle v_i, v_j \rangle$ for $1 \leq i \leq j \leq n$; $\langle v_i, w \rangle$ for $1 \leq i \leq n$; and $\langle w, w \rangle$. These are ordered by their degree in $w$, as in Step 1 of the algorithm; we discard $\langle w, w \rangle$ as it is degree $2$ in $w$.
We can work in the standard basis for $\mathbb{R}^d$, and we have identified it with its dual, so Step 2 is done as well.
Applying Step 3, we take the generators of degree $1$ in $w$, which are

$$\langle v_1, w \rangle, \ldots, \langle v_n, w \rangle.$$
Taking the gradients, we get

$$\nabla_w \langle v_i, w \rangle = \big(v_i^{(1)}, \ldots, v_i^{(d)}\big)^{\top} = v_i,$$
where $v_i^{(j)}$ denotes the $j$th coordinate of $v_i$. Thus the $\phi_i$ yielded by the algorithm is nothing but projection to the $i$th input vector. Meanwhile, the $q_j$’s of the algorithm are the algebra generators of degree zero in $w$, namely the pairwise dot products $\langle v_i, v_j \rangle$; thus the output of the algorithm is precisely the representation described in (5) and the paragraph following.
The Lorentz and symplectic groups. If we replace $\mathrm{O}(d)$ with the Lorentz group $\mathrm{O}(1, d-1)$ or (in case $d$ is even) the symplectic group $\mathrm{Sp}(d, \mathbb{R})$, the entire discussion above can be copied verbatim, except with the standard dot product being replaced everywhere by the Minkowski product $\langle u, v \rangle = u^\top \operatorname{diag}(1, -1, \ldots, -1)\, v$ in the former case, or the standard skew-symmetric bilinear form $\omega(u, v) = u^\top J v$ (where $J$ is block diagonal with $2 \times 2$ matrices $\left(\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right)$ as blocks) in the latter. We also need to use these respective products in place of the standard dot product to identify $\mathbb{R}^d$ equivariantly with its dual representation. The key point is that the invariant theory works the same way (see [12, Sec. 9.3] for a concise modern treatment, noting that $\mathrm{O}(d)$ and the Lorentz group have the same complexification).
The special orthogonal group. Now we consider $G = \mathrm{SO}(d)$. We can once again identify $\mathbb{R}^d$ with its dual. However, this time, in Step 1, the list of bihomogeneous generators is longer: in addition to the dot products $\langle v_i, v_j \rangle$ and $\langle v_i, w \rangle$ (and $\langle w, w \rangle$, which will be discarded), we have the $d \times d$ determinants $\det(v_{i_1}, \ldots, v_{i_d})$ for $i_1 < \cdots < i_d$, and $\det(v_{i_1}, \ldots, v_{i_{d-1}}, w)$ for $i_1 < \cdots < i_{d-1}$. The former are of degree $0$ in $w$, while the latter are of degree $1$. Thus, the latter contribute to our list of $\phi_t$’s in Step 3, while the former figure in the arguments of the $p_t$’s. Carrying out Step 3 in this case, we find that

$$\nabla_w \det(v_{i_1}, \ldots, v_{i_{d-1}}, w)$$
is exactly the generalized cross product of the vectors $v_{i_1}, \ldots, v_{i_{d-1}}$. Thus we must add to the $\phi_t$’s a generalized cross product for each $(d-1)$-subset of our input vectors; in the end the parametrization of equivariant maps looks like

$$f(v_1, \ldots, v_n) = \sum_{i=1}^{n} p_i(q)\, v_i + \sum_{S \in \binom{[n]}{d-1}} p_S(q)\, v_{\times S},$$
where $\binom{[n]}{d-1}$ represents the set of $(d-1)$-subsets $S$ of $[n] = \{1, \ldots, n\}$, $q$ is shorthand for the degree-$0$ generators $\langle v_i, v_j \rangle$ and $\det(v_T)$, where $T$ is a $d$-subset of $[n]$ and $\det(v_T)$ is the determinant whose columns are the $v_i$ indexed by $T$, and $v_{\times S}$ is shorthand for the generalized cross product of the vectors $v_i$ where $i \in S$.
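The generalized cross product appearing here is easy to compute in practice as a cofactor expansion; the sketch below (our own, with illustrative names) implements it as the gradient with respect to $w$ of $\det(v_{i_1}, \ldots, v_{i_{d-1}}, w)$ and checks $\mathrm{SO}(d)$-equivariance numerically.

```python
import numpy as np

# The generalized cross product as the gradient with respect to w of
# det(v_{i_1}, ..., v_{i_{d-1}}, w), computed by a cofactor expansion.
# Names and the equivariance check are illustrative.
rng = np.random.default_rng(6)
d = 4

def generalized_cross(U):
    # U: (d, d-1) array with columns v_{i_1}, ..., v_{i_{d-1}}; returns the
    # unique c with <c, w> = det([U | w]) for all w.
    dd = U.shape[0]
    return np.array([np.linalg.det(np.column_stack([U, np.eye(dd)[:, k]]))
                     for k in range(dd)])

U = rng.normal(size=(d, d - 1))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
if np.linalg.det(Q) < 0:
    Q[:, 0] = -Q[:, 0]                 # make det(Q) = +1, so Q is in SO(d)

print(np.allclose(generalized_cross(Q @ U), Q @ generalized_cross(U)))   # True
```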
The special linear group. We include an example where we cannot identify $\mathbb{R}^d$ with its dual representation. Namely, we take $G = \mathrm{SL}(d)$, the group of $d \times d$ matrices of determinant $1$. As $\mathrm{SL}(d)$ does not preserve any bilinear form on $\mathbb{R}^d$, we must regard $(\mathbb{R}^d)^*$ as a distinct representation. Thus in Step 1 we must consider the polynomial invariants on

$$(\mathbb{R}^d)^n \oplus (\mathbb{R}^d)^*.$$
A generating set of homogeneous invariants is given by the canonical pairings $w(v_i)$ for $1 \leq i \leq n$, and the $d \times d$ determinants $\det(v_S)$, where we have used the same shorthand as above for a $d$-subset $S$ of $\{1, \ldots, n\}$ and the determinant whose columns are the $v_i$’s indexed by $S$. The former are degree $1$ in $w$, while the latter are degree $0$. So, writing $w = w_1 \varepsilon_1 + \cdots + w_d \varepsilon_d$ as in Step 2, we have

$$w(v_i) = w_1 v_i^{(1)} + \cdots + w_d v_i^{(d)},$$
and again in Step 3 we get

$$\nabla_w\, w(v_i) = v_i,$$
with the computation identical to the one above for the orthogonal group. Thus the algorithm outputs that an arbitrary $\mathrm{SL}(d)$-equivariant polynomial map has the form

$$f(v_1, \ldots, v_n) = \sum_{i=1}^{n} p_i\big(\{\det(v_S)\}_S\big)\, v_i,$$

where the coefficients $p_i$ are polynomials in the $d \times d$ determinants $\det(v_S)$.
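As a quick numerical check of this form (with arbitrary stand-in coefficient functions of our own choosing), the following sketch verifies that a map built this way is indeed $\mathrm{SL}(d)$-equivariant, since the $d \times d$ minors are unchanged by matrices of determinant $1$.

```python
import numpy as np
from itertools import combinations

# Numerical check of the SL(d)-equivariant form above: the coefficients depend
# only on d x d minors det(v_S), which are unchanged when det(A) = 1, so
# f(A v_1, ..., A v_n) = A f(v_1, ..., v_n). The coefficient functions are
# arbitrary stand-ins. Here d = 3 (odd), which simplifies the rescaling of A.
rng = np.random.default_rng(8)
n, d = 5, 3

def f(V):                                        # V: (n, d), rows are v_1, ..., v_n
    minors = np.array([np.linalg.det(V[list(S)])
                       for S in combinations(range(n), d)])
    coeffs = np.sin(np.arange(1, n + 1) * minors.sum())   # arbitrary p_i(minors)
    return coeffs @ V

A = rng.normal(size=(d, d))
det = np.linalg.det(A)
A /= np.sign(det) * np.abs(det) ** (1 / d)       # rescale so det(A) = +1
V = rng.normal(size=(n, d))
print(np.allclose(f(V @ A.T), A @ f(V)))         # True
```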
The symmetric group. We give an example where the algebra of invariants is generated by something other than determinants and bilinear forms, so the $\phi_t$’s output by the algorithm are not just the $v_i$’s themselves and generalized cross products. Take $G = S_d$, the symmetric group on $d$ letters, acting on $\mathbb{R}^d$ by permutations of the coordinates. As $S_d$ realized in this way is a subgroup of $\mathrm{O}(d)$, we can once again identify $\mathbb{R}^d$ with its dual.
By the fundamental theorem on symmetric polynomials, the algebra of $S_d$-invariant polynomials on a single vector $x \in \mathbb{R}^d$ is given by the elementary symmetric polynomials in the coordinates, $e_1(x) = \sum_a x_a$, $e_2(x) = \sum_{a < b} x_a x_b$, etc., where $a, b$ index the coordinates of the vector. Weyl showed that for any $n$, the algebra of invariant polynomials on an $n$-tuple of vectors $x, y, \ldots$ is generated by the polarized elementary symmetric polynomials

$$e_1(x) = \sum_a x_a, \qquad e_2(x, y) = \sum_{a < b} \big(x_a y_b + x_b y_a\big),$$
etc., where $x, y, \ldots$ can be any of the $n$ vectors, distinct or not. (Up to a scalar multiple, one recovers the original, unpolarized elementary symmetric polynomials of a single vector by setting $x = y = \cdots$.) Running the algorithm to parametrize equivariant functions $(\mathbb{R}^d)^n \to \mathbb{R}^d$, we write down the algebra of invariants on $(\mathbb{R}^d)^n \oplus \mathbb{R}^d$ (the last summand playing the role of $W^* \cong W$, with coordinate vector $w$), and see that the algebra generators of degree $1$ in $w$ have the form $e_1(w)$, $e_2(x, w)$, $e_3(x, y, w)$, etc., where, again, $x, y, \ldots$ can be distinct or not. The gradients of these become the $\phi_t$’s of Step 3. For the sake of explicitness, we fix $d = 3$ and $n = 2$, calling the two input vectors $x$ and $y$, and write out the results as column vectors. We get, up to scalar multiples,

$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} x_2 + x_3 \\ x_1 + x_3 \\ x_1 + x_2 \end{pmatrix}, \quad \begin{pmatrix} y_2 + y_3 \\ y_1 + y_3 \\ y_1 + y_2 \end{pmatrix}, \quad \begin{pmatrix} x_2 x_3 \\ x_1 x_3 \\ x_1 x_2 \end{pmatrix}, \quad \begin{pmatrix} x_2 y_3 + x_3 y_2 \\ x_1 y_3 + x_3 y_1 \\ x_1 y_2 + x_2 y_1 \end{pmatrix}, \quad \begin{pmatrix} y_2 y_3 \\ y_1 y_3 \\ y_1 y_2 \end{pmatrix}.$$
Thus any $S_3$-equivariant polynomial map $(\mathbb{R}^3)^2 \to \mathbb{R}^3$ is a linear combination of these six maps, with coefficients that are polynomials in the polarized elementary symmetric polynomials in $x$ and $y$ alone (those of degree $0$ in $w$).
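The six maps are easy to verify numerically; the sketch below (ours, with the maps written up to scalar multiples) checks the equivariance $\phi(Px, Py) = P\,\phi(x, y)$ for a random permutation matrix $P$.

```python
import numpy as np

# Numerical check of the six S_3-equivariant maps above (written up to scalar
# multiples): phi(P x, P y) = P phi(x, y) for any 3 x 3 permutation matrix P.
rng = np.random.default_rng(7)

def phis(x, y):
    ones = np.ones(3)
    e1 = lambda z: z.sum()
    pol3 = lambda u, v: np.array([u[1] * v[2] + u[2] * v[1],
                                  u[0] * v[2] + u[2] * v[0],
                                  u[0] * v[1] + u[1] * v[0]])  # grad_w e_3(u, v, w)
    return [ones,                   # grad_w e_1(w)
            e1(x) * ones - x,       # grad_w e_2(x, w)
            e1(y) * ones - y,       # grad_w e_2(y, w)
            pol3(x, x),             # grad_w e_3(x, x, w), up to a factor
            pol3(x, y),             # grad_w e_3(x, y, w)
            pol3(y, y)]             # grad_w e_3(y, y, w), up to a factor

x, y = rng.normal(size=(2, 3))
P = np.eye(3)[rng.permutation(3)]
print(all(np.allclose(a, P @ b) for a, b in zip(phis(P @ x, P @ y), phis(x, y))))  # True
```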
7. Discussion
This article gives a gentle introduction to equivariant machine learning, and it explains how to parameterize equivariant maps in two different ways. One requires knowledge of the irreducible representations and Clebsch–Gordan coefficients, and the other requires knowledge of the generators of the algebra of invariant polynomials on $V \oplus W^*$. The main focus is on the latter, which is useful for designing equivariant machine learning models with respect to groups and actions where the invariants are known and are computationally tractable. This is not a panacea: sometimes the algebra of invariant polynomials is too large, or outright not known. Both these issues come up, for example, in the action of permutations on $n \times n$ symmetric matrices by conjugation, the relevant one for graph neural networks. The invariant ring has not been fully described as of this writing, except for small $n$, where the number of generators increases very rapidly with $n$; see [17].
From a practitioner’s point of view, it is not yet clear which of these approaches will behave better in practice. We believe that Malgrange’s construction applied to nonpolynomial invariants such as those of [2, 5, 11], as described at the end of Section 3, is a promising direction for some applications, especially because some of these invariants exhibit desirable stability properties.
Acknowledgments
The authors thank Gerald Schwarz (Brandeis University) for introducing us to Malgrange’s method. The authors also thank Jaume de Dios Pont (UCLA), David W. Hogg (NYU), Teresa Huang (JHU), and Peter Olver (UMN) for useful discussions. SV and BBS are partially supported by ONR N00014-22-1-2126. SV was also partially supported by the NSF-Simons Research Collaboration on the Mathematical and Scientific Foundations of Deep Learning (MoDL) (NSF DMS 2031985), NSF CISE 2212457, and an AI2AI Amazon research award.
References
- [1]
- B. Blum-Smith and S. Villar, Machine learning and invariant theory, arXiv preprint arXiv:2209.14991, 2023.
- [2]
- J. Cahill, J. W. Iverson, D. G. Mixon, and D. Packer, Group-invariant max filtering, arXiv preprint arXiv:2205.14039, 2022.
- [3]
- T. Cohen and M. Welling, Group equivariant convolutional networks, Proceedings of the 33rd International Conference on Machine Learning, PMLR, 2990–2999, 2016.
- [4]
- Harm Derksen and Gregor Kemper, Computational invariant theory, Second enlarged edition, Encyclopaedia of Mathematical Sciences, vol. 130, Springer, Heidelberg, 2015. With two appendices by Vladimir L. Popov, and an addendum by Norbert A'Campo and Popov; Invariant Theory and Algebraic Transformation Groups, VIII, DOI 10.1007/978-3-662-48422-7. MR3445218.
- [5]
- N. Dym and S. J. Gortler, Low dimensional invariant embeddings for universal geometric learning, arXiv preprint arXiv:2205.02956, 2022.
- [6]
- N. Dym and H. Maron, On the universality of rotation equivariant point cloud networks, International Conference on Learning Representations, 2021.
- [7]
- Karin Gatermann, Computer algebra methods for equivariant dynamical systems, Lecture Notes in Mathematics, vol. 1728, Springer-Verlag, Berlin, 2000, DOI 10.1007/BFb0104059. MR1755001.
- [8]
- M. Geiger and T. Smidt, e3nn: Euclidean neural networks, arXiv preprint arXiv:2207.09453, 2022.
- [9]
- R. Kondor, Z. Lin, and S. Trivedi, Clebsch–Gordan nets: a fully Fourier space spherical convolutional neural network, Advances in Neural Information Processing Systems 31 (2018), 10117–10126.
- [10]
- Domingo Luna, Fonctions différentiables invariantes sous l'opération d'un groupe réductif (French, with English summary), Ann. Inst. Fourier (Grenoble) 26 (1976), no. 1, ix, 33–49. MR423398.
- [11]
- Peter J. Olver, Invariants of finite and discrete group actions via moving frames, Bull. Iranian Math. Soc. 49 (2023), no. 2, Paper No. 11, 12 pp., DOI 10.1007/s41980-023-00744-0. MR4549776.
- [12]
- V. L. Popov and E. B. Vinberg, Invariant theory, Algebraic geometry IV, Springer, 1994, pp. 123–278.
- [13]
- F. Ronga, Stabilité locale des applications équivariantes, Differential topology and geometry (Proc. Colloq., Dijon, 1974), Lecture Notes in Math., Vol. 484, Springer, Berlin, 1975, pp. 23–35. MR0445526.
- [14]
- Gerald W. Schwarz, Smooth functions invariant under the action of a compact Lie group, Topology 14 (1975), 63–68, DOI 10.1016/0040-9383(75)90036-1. MR370643.
- [15]
- Gerald W. Schwarz, Lifting smooth homotopies of orbit spaces, Inst. Hautes Études Sci. Publ. Math. 51 (1980), 37–135. MR573821.
- [16]
- J. Shawe-Taylor, Building symmetries into feedforward networks, 1989 First IEE International Conference on Artificial Neural Networks (Conf. Publ. No. 313), 158–162, IET, 1989.
- [17]
- N. M. Thiéry, Algebraic invariants of graphs; a study based on computer exploration, ACM SIGSAM Bulletin 34 (2000), no. 3, 9–20.
- [18]
- S. Villar, D. W. Hogg, K. Storey-Fisher, W. Yao, and B. Blum-Smith, Scalars are universal: Equivariant machine learning, structured like classical physics, Advances in Neural Information Processing Systems 34 (2021), 28848–28863.
- [19]
- S. Villar, D. W. Hogg, W. Yao, G. A. Kevrekidis, and B. Schölkopf, The passive symmetries of machine learning, arXiv preprint arXiv:2301.13724, 2023.
- [20]
- Patrick A. Worfolk, Zeros of equivariant vector fields: algorithms for an invariant approach, J. Symbolic Comput. 17 (1994), no. 6, 487–511, DOI 10.1006/jsco.1994.1031. MR1300350.
Credits
Opening image is courtesy of filo via Getty.
Photo of Ben Blum-Smith is courtesy of Ryan Lash/TED.
Photo of Soledad Villar is courtesy of Erwin List/Soledad Villar.