
# Uncovering Data Across Continua: An Introduction to Functional Data Analysis

Communicated by *Notices* Associate Editor Richard Levine

## 1. Introduction

Advances in data collection technologies such as sensors, computer vision, medical imaging, IoT, and wearables have generated vast volumes of high-frequency data across many fields. These data are not just a collection of numbers and tables but a rich, dynamic record of change over a continuum. Functional Data Analysis (FDA) is a statistical framework designed to analyze such data.

Unlike traditional statistics dealing with discrete data points, FDA focuses on entire functions, curves, or shapes, providing insights into continuous changes. Whether analyzing time series, spatial data, growth curves, or any structured dataset, FDA excels at capturing ongoing change. FDA’s applications span various fields like medicine, biology, chemistry, economics, and environmental science, offering insights beyond isolated measurements. It aids in patient health tracking, economic trend analysis, and chemical or environmental management by modeling and understanding complex systems. In manufacturing, FDA can be applied to monitor continuous processes, such as chemical reactions, quality control measurements, and equipment performance. It helps detect deviations from the desired process behavior.

In economics, FDA is employed to analyze longitudinal data, such as stock prices, gross domestic product trends, and inflation rates. It helps identify long-term patterns, cyclic behavior, and structural changes.

In essence, FDA transcends the limitations of traditional data analysis by exploiting the functional nature of the data, providing valuable statistical tools for researchers and professionals seeking deeper insights and solutions to complex problems. In this article, we explore FDA’s significance, mathematical foundations, practical applications, and future prospects.

## 2. The Significance of Functional Data Analysis

In various fields, FDA provides a powerful set of methods to model, analyze, and interpret data that exhibit continuous variation, allowing researchers and professionals to gain deeper insights and to make more accurate predictions and informed decisions based on the inherent functional nature of the data. This versatility makes FDA a valuable approach in a wide range of scientific and practical applications.

Drawing on linear algebra, functional analysis, probability, and statistics, FDA manipulates and analyzes functions by representing data as observations of random variables in a function space. This allows operations like differentiation, integration, and smoothing, facilitating exploration of data structure and variations.

By treating data as functions, FDA helps uncover hidden patterns, relationships, and trends that would be challenging to discern using traditional statistical methods, leading to more informed decision-making and a deeper understanding of complex phenomena.

### FDA versus multivariate statistics

Is it worthwhile to employ continuous representations, or are we unnecessarily adding complexity to our tasks? Given that discrete data is often needed for computational purposes, what are the benefits of utilizing continuous representations in our analyses?

While discrete data may offer computational convenience, the advantages of working with continuous representations are numerous. By viewing objects as functions, curves, or surfaces, scientists can unlock more powerful analysis techniques, yielding better practical results and more natural solutions. Grenander’s principle of discretizing as late as possible underscores the importance of retaining continuous representations for as long as feasible, highlighting their inherent value in data analysis workflows.

With this in mind, let us consider how continuous representations enhance our understanding and analysis of data.

If data are sampled from an underlying function (e.g., Figure 1 (a)), and time points are synchronized across observations, focusing solely on heights, then analysis can be conducted using the vector of heights $(x_1, \dots, x_m)$. If the time points hold significance as well, then it is necessary to retain them alongside the height data: $\{(t_j, x_j)\}_{j=1}^{m}$. With continuous functions one can interpolate and resample at arbitrary points (e.g., Figure 1 (b)) and easily compare observations with different time points, as elements of a function space.
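The resampling idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the two curves are made up, piecewise-linear interpolation (`np.interp`) stands in for whatever reconstruction one prefers, and the helper names `resample` and `l2_distance` are hypothetical.

```python
import numpy as np

def resample(t_obs, x_obs, grid):
    """Piecewise-linear interpolation of one observed curve onto a common grid."""
    return np.interp(grid, t_obs, x_obs)

def l2_distance(f, g, grid):
    """Trapezoidal approximation of the L2 distance between two curves on `grid`."""
    sq = (f - g) ** 2
    return np.sqrt(np.sum((sq[1:] + sq[:-1]) * np.diff(grid)) / 2.0)

# Two hypothetical curves observed at different, unsynchronized time points on [0, 1].
t_a = np.linspace(0.0, 1.0, 15)
x_a = np.sin(2 * np.pi * t_a)
t_b = np.array([0.0, 0.05, 0.2, 0.35, 0.5, 0.6, 0.8, 0.9, 1.0])
x_b = np.sin(2 * np.pi * t_b) + 0.5

# Once both curves live on the same grid, they can be compared directly.
grid = np.linspace(0.0, 1.0, 101)
f_a = resample(t_a, x_a, grid)
f_b = resample(t_b, x_b, grid)
dist_ab = l2_distance(f_a, f_b, grid)
```

The raw vectors `x_a` and `x_b` have different lengths and cannot be subtracted, whereas their functional versions can be compared at any resolution.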

In traditional data analysis, one might work with data points in a table where each row represents an observation (e.g., an individual) and each column represents a variable. In FDA, the data are treated as functions, where each observation is considered as a function (e.g., a curve in Figure 2 (a)) that maps a continuous variable (often time, frequency, wavelength, or a spatial dimension) to a measured value. These functions represent how the data change over the continuum.

Understanding functions necessitates a profound grasp of the structures lying beneath them. Analyzing these structures requires a solid foundation of mathematical representations. The FDA approach empowers researchers to investigate various models extensively, thus expanding the comprehension of data characterized by functional structures across diverse fields of science and engineering. Examples of functional data are illustrated in Figure 2.

## 3. Mathematical Foundations of FDA

FDA involves a variety of specialized statistical techniques for handling functional data.

Let $(\Omega, \mathcal{A}, P)$ be a probability space and $\mathcal{E}$ a function space (e.g., a separable Banach space or a Hilbert space). A functional random variable is a variable

$$X : (\Omega, \mathcal{A}, P) \to \mathcal{E}$$

taking values in $\mathcal{E}$ (of possibly infinite dimension). A functional datum is then an observation of the functional random variable $X$.

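If $\mathcal{E}$ is a space of real-valued functions on $\mathcal{T} \subset \mathbb{R}$, then $X$ is a curve, while an image may be considered as a functional datum in the case where $\mathcal{T} \subset \mathbb{R}^2$. If $\mathcal{E}$ is a space of functions from $\mathcal{T} \subset \mathbb{R}^d$ to $\mathbb{R}^p$ (with $d > 1$ or $p > 1$), $X$ has a more complex structure.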

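Let us consider in the following the commonly used functional Hilbert space $L^2(\mathcal{T}, \mathbb{R}^p)$, the space of $p$-dimensional vector-valued square-integrable functions on a $d$-dimensional domain $\mathcal{T} \subset \mathbb{R}^d$, and give the main background to analyze functional data. First, consider the inner product on that Hilbert space: for $f, g \in L^2(\mathcal{T}, \mathbb{R}^p)$,

$$\langle f, g \rangle = \int_{\mathcal{T}} f(t)^\top g(t)\, dt.$$

The mean and covariance functions of the random variable $X$, assumed to be smooth functions, are respectively

$$\mu(t) = \mathbb{E}[X(t)], \quad t \in \mathcal{T},$$

and

$$C(s, t) = \mathbb{E}\big[(X(s) - \mu(s))(X(t) - \mu(t))^\top\big], \quad s, t \in \mathcal{T}.$$

The latter is viewed as the kernel of the linear Hilbert-Schmidt operator $\Gamma$ on $L^2(\mathcal{T}, \mathbb{R}^p)$:

$$(\Gamma f)(s) = \int_{\mathcal{T}} C(s, t) f(t)\, dt.$$

Note that $\Gamma$ admits the spectral decomposition

$$\Gamma f = \sum_{k \geq 1} \lambda_k \langle f, \phi_k \rangle \phi_k,$$

where $\phi_k$, $k = 1, 2, \dots$, is a complete orthonormal system in $L^2(\mathcal{T}, \mathbb{R}^p)$ and $(\lambda_k)_{k \geq 1}$ is a decreasing sequence of positive real numbers such that $\sum_{k \geq 1} \lambda_k < \infty$.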

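Let $X_1, \dots, X_n$ be an independent and identically distributed (i.i.d.) sample of $X$.

The usual estimator of $\mu$ is the method of moments estimator given by $\hat{\mu}_n(t) = \frac{1}{n} \sum_{i=1}^{n} X_i(t)$.

In this i.i.d. framework, there are several theoretical guarantees regarding the convergence of $\hat{\mu}_n$ to $\mu$ (such as the law of large numbers, the central limit theorem, and concentration inequalities of the Bernstein type; see Chapter 2 of Bosq [1]).

Classic empirical estimators of the covariance operator and covariance function are

$$\hat{\Gamma}_n f = \frac{1}{n} \sum_{i=1}^{n} \langle X_i - \hat{\mu}_n, f \rangle\, (X_i - \hat{\mu}_n)$$

and

$$\hat{C}_n(s, t) = \frac{1}{n} \sum_{i=1}^{n} (X_i(s) - \hat{\mu}_n(s))(X_i(t) - \hat{\mu}_n(t))^\top.$$

Several asymptotic results on $\hat{\Gamma}_n$ are given in Chapter 4 of Bosq [1].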
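On curves evaluated on a common grid, the empirical mean and covariance functions reduce to ordinary array operations. The following is a minimal sketch on simulated univariate curves; the data-generating model (a fixed mean curve plus a random multiple of one sine shape) is an assumption chosen only so that the result is easy to check.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
n = 200

# Hypothetical sample: X_i(t) = 1 + t + z_i * sin(pi * t), with z_i standard normal.
shape = np.sin(np.pi * grid)
scores = rng.normal(size=(n, 1))
X = 1.0 + grid[None, :] + scores * shape[None, :]   # array of shape (n, 50)

# Method-of-moments estimator of the mean function, evaluated on the grid.
mu_hat = X.mean(axis=0)

# Empirical covariance function: C_hat[j, l] estimates C(t_j, t_l).
Xc = X - mu_hat
C_hat = Xc.T @ Xc / n
```

Because every simulated curve varies along a single shape, `C_hat` is here a rank-one surface proportional to `np.outer(shape, shape)`.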

**From raw data to functional data:** Note that, in practice, we observe raw data (e.g., the average daily temperature in 35 spatial locations described by the first panel of Figure 4: the temperature in each location is measured every day from 1960 to 1994) of the form

$$\{(t_{i,j}, Y_{i,j}) : j = 1, \dots, m_i\}, \quad i = 1, \dots, n,$$

where $Y_{i,j}$ is the value recorded for individual $i$ at time $t_{i,j}$.

It should be noted that the observation times $t_{i,j}$ can vary in number and value depending on the individual $i$.

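Following Zhang and Wang [20], functional data can be categorized as sparse or dense according to the number of observations available per curve.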

It is common in FDA to assume that the observations are noisy observations of the smooth latent curve $X_i$. Namely, writing $Y_{i,j}$ for the value observed for individual $i$ at time $t_{i,j}$, we have $Y_{i,j} = X_i(t_{i,j}) + \varepsilon_{i,j}$, where the error terms $\varepsilon_{i,j}$ are zero mean and i.i.d. In the early stages of FDA, smoothing is typically conducted as an initial step by kernel smoothing, local polynomial smoothing, Fourier, spline, or penalized spline approaches.

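The classic smoothing approach is basis expansion, which assumes that $X_i$ can be expressed as a finite combination of the first $K$ functions of a basis $(\psi_k)_{k \geq 1}$ of $L^2(\mathcal{T}, \mathbb{R})$:

$$X_i(t) = \sum_{k=1}^{K} a_{i,k} \psi_k(t).$$

This is equivalent to writing $X_i(t) = a_i^\top \psi(t)$, where $a_i = (a_{i,1}, \dots, a_{i,K})^\top$ and $\psi(t) = (\psi_1(t), \dots, \psi_K(t))^\top$. Then the estimators of the mean and covariance functions of $X$ can be defined respectively on $\mathcal{T}$ by

$$\hat{\mu}(t) = \bar{a}^\top \psi(t), \qquad \hat{C}(s, t) = \frac{1}{n} \psi(s)^\top A^\top A\, \psi(t),$$

where $\bar{a} = \frac{1}{n} \sum_{i=1}^{n} a_i$ and $A$ is the $n \times K$ matrix whose $i$th row is equal to $(a_i - \bar{a})^\top$. Depending on the nature of the data, various choices for the $\psi_k$ are possible. In the case of periodic data, a Fourier basis is appropriate, whereas for nonperiodic data, possible choices are polynomial or spline bases. Figure 3 shows a transition from raw data to functional data using a cubic $B$-splines basis. This process is not applicable to sparse functional data due to the very limited quantity of information available for each curve. Sparse data require more sophisticated approaches not discussed here. For more details concerning basis options, please see Ramsay and Silverman [13].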
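The passage from noisy raw data to basis coefficients can be sketched as follows. Several choices here are assumptions made for illustration and are not taken from the text: all curves share the same observation times, the basis is a Fourier basis on $[0, 1]$, the coefficients are fitted by ordinary least squares, and the data are simulated. The function name `fourier_basis` is hypothetical.

```python
import numpy as np

def fourier_basis(t, K):
    """Evaluate the first K (odd) Fourier basis functions on [0, 1] at points t."""
    cols = [np.ones_like(t)]
    for j in range(1, (K - 1) // 2 + 1):
        cols.append(np.sqrt(2.0) * np.sin(2 * np.pi * j * t))
        cols.append(np.sqrt(2.0) * np.cos(2 * np.pi * j * t))
    return np.column_stack(cols)  # shape (len(t), K)

rng = np.random.default_rng(1)
n, m, K = 30, 60, 5
t = np.linspace(0.0, 1.0, m, endpoint=False)   # common observation times (assumption)

# Simulated raw data: smooth curves in the span of the basis plus i.i.d. noise.
true_coefs = rng.normal(size=(n, K))
Psi = fourier_basis(t, K)
Y = true_coefs @ Psi.T + 0.1 * rng.normal(size=(n, m))

# Least-squares fit: one coefficient vector per curve (rows of `coefs`).
sol, *_ = np.linalg.lstsq(Psi, Y.T, rcond=None)
coefs = sol.T                                   # shape (n, K)

# The fitted curves and the estimated mean can now be evaluated anywhere.
fine_grid = np.linspace(0.0, 1.0, 200)
Psi_fine = fourier_basis(fine_grid, K)
X_hat = coefs @ Psi_fine.T                      # smoothed curves, shape (n, 200)
mu_hat = coefs.mean(axis=0) @ Psi_fine.T        # mean function on the fine grid
```

With a spline basis only the construction of the design matrix `Psi` would change; the least-squares step stays the same.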

With the growing popularity of functional data analysis, numerous statistical methods have been extended and adapted to this context. In the following discussion, we explore some of the most valuable and widely used ones (principal component analysis, clustering, linear and non-linear regressions). It should be noted that we only consider the case of univariate functional data, i.e., $X(t) \in \mathbb{R}$ for all $t \in \mathcal{T}$. The structure of multivariate functional data (functions with values in $\mathbb{R}^p$, $p \geq 2$) is more complex and is not covered here.

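**Functional principal component analysis:** Principal components of random variables with values in a separable Hilbert space were studied by Kleffe [8].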

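The above spectral decomposition of $\Gamma$ is linked to the PCA on the $X_i$. In fact, functional PCA aims to represent the i.i.d. curves $X_1, \dots, X_n$ using a few ($q$) principal orthogonal eigenfunctions $\phi_1, \dots, \phi_q$ so that

$$X_i(t) \approx \mu(t) + \sum_{k=1}^{q} \xi_{i,k} \phi_k(t).$$

This is an approximation of the Karhunen-Loève (KL) expansion that states that

$$X_i(t) = \mu(t) + \sum_{k=1}^{\infty} \xi_{i,k} \phi_k(t),$$

where $\phi_k$ and $\lambda_k$ are the eigenfunctions and eigenvalues of $\Gamma$. The $\xi_{i,k} = \langle X_i - \mu, \phi_k \rangle$ are called the scores; they extract the main features of $X_i$ and are centered, pairwise uncorrelated random variables.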
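The uncorrelatedness of the scores follows from the eigen-equation of the covariance operator. A short derivation for the univariate case, under the assumption that the scores are the projections $\xi_k = \langle X - \mu, \phi_k \rangle$, that $C$ denotes the covariance function, and that $\Gamma \phi_k = \lambda_k \phi_k$ with orthonormal $\phi_k$:

$$
\begin{aligned}
\mathbb{E}[\xi_k] &= \langle \mathbb{E}[X - \mu], \phi_k \rangle = 0, \\
\mathrm{Cov}(\xi_j, \xi_k) &= \mathbb{E}\big[\langle X - \mu, \phi_j \rangle \langle X - \mu, \phi_k \rangle\big]
= \int_{\mathcal{T}} \int_{\mathcal{T}} \phi_j(s)\, C(s, t)\, \phi_k(t)\, dt\, ds \\
&= \langle \phi_j, \Gamma \phi_k \rangle = \lambda_k \langle \phi_j, \phi_k \rangle = \lambda_k \delta_{jk}.
\end{aligned}
$$

In particular, each score $\xi_k$ has variance $\lambda_k$, so the eigenvalues measure how much variability each component captures.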

Note that the KL expansion is related to Mercer’s theorem, which states that $C(s, t) = \sum_{k \geq 1} \lambda_k \phi_k(s) \phi_k(t)$.

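Since the mean and covariance functions are unknown, in the early stage of FDA, applying a functional PCA is in practice equivalent to finding estimated orthogonal eigenfunctions $\hat{\phi}_1, \dots, \hat{\phi}_q$ so that

$$\int_{\mathcal{T}} \hat{C}(s, t)\, \hat{\phi}_k(s)\, ds = \hat{\lambda}_k \hat{\phi}_k(t).$$

Hence, assuming that $\hat{\phi}_k$ can be expressed as $\hat{\phi}_k(t) = b_k^\top \psi(t)$ and writing $W = \int_{\mathcal{T}} \psi(s) \psi(s)^\top ds$, the task is to find $b_k$ and $\hat{\lambda}_k$ so that

$$\frac{1}{n} \psi(t)^\top A^\top A\, W b_k = \hat{\lambda}_k\, \psi(t)^\top b_k \quad \text{for all } t \in \mathcal{T}.$$

Since the basis functions are linearly independent, this reduces to $\frac{1}{n} A^\top A\, W b_k = \hat{\lambda}_k b_k$. By defining $u_k = W^{1/2} b_k$ and then multiplying the previous equation on the left by $W^{1/2}$, we arrive at the classic PCA formula (where $\psi$ is the vector of the $K$ basis functions, and $A$ is the $n \times K$ matrix of centered basis coefficients):

$$\frac{1}{n} W^{1/2} A^\top A\, W^{1/2} u_k = \hat{\lambda}_k u_k, \qquad u_k^\top u_l = \delta_{kl}.$$

Subsequently, it becomes straightforward to obtain the values of $\hat{\lambda}_k$ and $u_k$. Additionally, $b_k$ (and consequently $\hat{\phi}_k$) can be inferred through the following relationships:

$$b_k = W^{-1/2} u_k, \qquad \hat{\phi}_k(t) = b_k^\top \psi(t).$$

Ultimately, the estimated scores are given by $\hat{\xi}_{i,k} = \langle X_i - \hat{\mu}, \hat{\phi}_k \rangle = (a_i - \bar{a})^\top W b_k$, where $a_i$ is the vector of basis coefficients of $X_i$ and $\bar{a}$ is the average of the $a_i$. It should be noted that, although not discussed here, other approaches for functional PCA have been proposed in the literature. A more complete review can be found in Shang [15].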
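A compact numerical sketch of functional PCA carried out on basis coefficients is given below. It rests on assumptions that are not in the text: a deliberately non-orthogonal monomial basis $1, t, t^2$ on $[0, 1]$, simulated coefficients, and a trapezoidal rule for the Gram matrix `W` of the basis. The eigenproblem solved is the symmetric one obtained after the change of variable through `W` to the power one half.

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 100, 3
grid = np.linspace(0.0, 1.0, 201)

# Basis functions evaluated on the grid (monomials: an illustrative, non-orthogonal choice).
Psi = np.column_stack([grid ** k for k in range(K)])        # shape (201, K)
coefs = rng.normal(size=(n, K)) * np.array([1.0, 0.5, 0.25])  # simulated coefficients

# Trapezoidal quadrature weights and Gram matrix W of the basis.
w = np.full(grid.size, grid[1] - grid[0])
w[0] /= 2.0
w[-1] /= 2.0
W = Psi.T @ (w[:, None] * Psi)

# Centered coefficient matrix A and symmetric square roots of W.
A = coefs - coefs.mean(axis=0)
vals, vecs = np.linalg.eigh(W)
W_half = (vecs * np.sqrt(vals)) @ vecs.T
W_half_inv = (vecs / np.sqrt(vals)) @ vecs.T

# Classic PCA on the transformed coefficients.
M = W_half @ A.T @ A @ W_half / n
lam, U = np.linalg.eigh(M)
order = np.argsort(lam)[::-1]            # eigenvalues in decreasing order
lam, U = lam[order], U[:, order]

B = W_half_inv @ U                       # columns: coefficients of the eigenfunctions
phi = Psi @ B                            # estimated eigenfunctions on the grid
fpca_scores = A @ W @ B                  # estimated scores, shape (n, K)
```

The estimated eigenfunctions are orthonormal with respect to the quadrature rule, and the empirical covariance matrix of `fpca_scores` is diagonal with entries `lam`.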

**Functional linear regression:** Numerous studies have explored regression modeling within the context of functional data.

In this framework, we consider a real-valued response variable $Y$ and a functional covariate $X$. Throughout the section, we assume that $X$ has been centered and that we have a sample $(X_1, Y_1), \dots, (X_n, Y_n)$ of i.i.d. replications of $(X, Y)$.

Generalized functional linear regression posits that the relationship between the response variable and the functional covariate is defined as follows:

$$\mathbb{E}[Y_i \mid X_i] = g^{-1}(\eta_i), \qquad \mathrm{Var}(Y_i \mid X_i) = V\big(\mathbb{E}[Y_i \mid X_i]\big),$$

where $g$ is a monotonic “link function”, $V$ is a positive “variance function”, and $\eta_i$ is a linear predictor defined by $\eta_i = \alpha + \int_{\mathcal{T}} \beta(t) X_i(t)\, dt$.

The model finds practical applications in various scenarios, such as establishing associations between the incidence of respiratory diseases (e.g., asthma, lung cancer) and air pollution levels in the months or years leading up to the study. In such cases, a generalized functional linear Poisson regression model is often employed, where the link function $g$ is the logarithm, and $V$ is the identity function.

The simplest and most popular model is the so-called functional linear model, where $g$ is the identity function, and $V$ is a constant function:

$$Y_i = \alpha + \int_{\mathcal{T}} \beta(t) X_i(t)\, dt + \varepsilon_i.$$

The random variables $\varepsilon_i$ are assumed to be i.i.d. scalar variables with mean zero and constant variance. Sometimes, an additional assumption of normality is made.

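More generally, this linear model may encompass both functional ($X_i$) and nonfunctional ($Z_{i,1}, \dots, Z_{i,r}$) covariates, so that:

$$Y_i = \alpha + \int_{\mathcal{T}} \beta(t) X_i(t)\, dt + \sum_{l=1}^{r} \gamma_l Z_{i,l} + \varepsilon_i,$$

with $\gamma_1, \dots, \gamma_r \in \mathbb{R}$.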

The primary challenge in FDA lies in dealing with the infinite dimension of the functional variable. A frequently adopted solution is to approximate $X_i$ as a finite combination of orthogonal basis functions (as mentioned earlier), as well as $\beta$. However, in practice, finding such a basis is not always straightforward. An orthonormal basis can be derived through functional PCA:

$$X_i(t) \approx \sum_{k=1}^{q} \xi_{i,k} \phi_k(t),$$

where the $\phi_k$ are the orthonormal eigenfunctions. By assuming that $\beta$ can also be written as $\beta(t) = \sum_{k=1}^{q} \beta_k \phi_k(t)$, we then obtain the following truncated linear regression model:

$$Y_i = \alpha + \sum_{k=1}^{q} \beta_k \xi_{i,k} + \varepsilon_i.$$

Beyond the linear model, this truncation procedure leads to a classic generalized linear model with covariates $\xi_{i,1}, \dots, \xi_{i,q}$ and linear predictor $\eta_i = \alpha + \sum_{k=1}^{q} \beta_k \xi_{i,k}$.
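The truncation idea can be tried on simulated data. The sketch below makes several assumptions of its own: curves are observed on a regular grid of $[0, 1)$, functional PCA is done by discretizing the covariance function with a Riemann sum rather than through a basis expansion, and the true coefficient function lies in the span of the first two eigenfunctions. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, G, q = 300, 100, 2
grid = np.arange(G) / G          # regular grid on [0, 1)
dt = 1.0 / G

# Simulated functional covariate built from two orthonormal components.
phi1 = np.sqrt(2.0) * np.sin(2 * np.pi * grid)
phi2 = np.sqrt(2.0) * np.cos(2 * np.pi * grid)
xi = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
X = xi @ np.vstack([phi1, phi2])                     # shape (n, G)

# Simulated scalar response from a functional linear model.
beta_true = 1.0 * phi1 - 2.0 * phi2
y = 0.5 + X @ beta_true * dt + 0.05 * rng.normal(size=n)

# Discretized functional PCA of the centered curves.
Xc = X - X.mean(axis=0)
C_hat = Xc.T @ Xc / n
evals, evecs = np.linalg.eigh(C_hat * dt)
top = np.argsort(evals)[::-1][:q]
phi_hat = evecs[:, top] / np.sqrt(dt)                # unit L2 norm on the grid
pc_scores = Xc @ phi_hat * dt                        # shape (n, q)

# Ordinary least squares of y on the first q scores (with intercept).
design = np.column_stack([np.ones(n), pc_scores])
theta, *_ = np.linalg.lstsq(design, y, rcond=None)
b_hat = theta[1:]

# Reconstructed coefficient function on the grid.
beta_hat = phi_hat @ b_hat
```

The sign of each estimated eigenfunction is arbitrary, but `beta_hat` is unaffected because the corresponding regression coefficient flips sign as well. The fitted intercept `theta[0]` is the intercept relative to the centered curves, so it need not equal 0.5.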

**Functional data clustering:** Clustering is the process of organizing observations into clusters, where observations within each cluster share similar characteristics, while the characteristics of each cluster are distinct from those of others. Clustering methods can be broadly categorized into hierarchical, partitional, and model-based approaches. Researchers have explored adaptations of these categories to the functional data framework. In the case of hierarchical methods, a significant challenge arises in devising an appropriate similarity measure for functional observations. One approach to addressing this challenge was presented by Hitchcock et al. [5].

Among partitional methods, the $K$-means algorithm relies on measuring the distance between observations, typically using the Euclidean distance for nonfunctional data. However, when dealing with functional data, this distance needs to be adapted, as in the $K$-means algorithms for functional data of García et al. [4].
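One simple way to adapt the distance is to replace the Euclidean distance with an approximation of the $L^2$ distance between curves. The sketch below is a plain Lloyd-type iteration under assumptions of our own (curves on a common grid, trapezoidal quadrature, farthest-point initialization, simulated data); it is not the algorithm of any of the cited papers, and the function names are hypothetical.

```python
import numpy as np

def trapezoid_weights(grid):
    """Quadrature weights of the trapezoidal rule on a possibly irregular grid."""
    w = np.zeros_like(grid)
    h = np.diff(grid)
    w[:-1] += h / 2.0
    w[1:] += h / 2.0
    return w

def functional_kmeans(X, grid, k, n_iter=100):
    """K-means on curves (rows of X) using an approximate squared L2 distance."""
    w = trapezoid_weights(grid)

    def sq_dist(centers):
        diff = X[:, None, :] - centers[None, :, :]
        return (diff ** 2 * w).sum(axis=2)           # shape (n, number of centers)

    # Deterministic farthest-point initialization (an illustrative choice).
    centers = X[[0]]
    while len(centers) < k:
        nearest = sq_dist(centers).min(axis=1)
        centers = np.vstack([centers, X[nearest.argmax()]])

    for _ in range(n_iter):
        labels = sq_dist(centers).argmin(axis=1)
        new_centers = np.vstack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Simulated example: two groups of noisy curves separated by a vertical shift.
rng = np.random.default_rng(4)
grid = np.linspace(0.0, 1.0, 50)
base = np.sin(2 * np.pi * grid)
low = base + 0.2 * rng.normal(size=(20, grid.size))
high = base + 3.0 + 0.2 * rng.normal(size=(20, grid.size))
X = np.vstack([low, high])

labels, centers = functional_kmeans(X, grid, k=2)
```

Each cluster center is itself a curve (the pointwise mean of its members), which is what makes the output directly interpretable as a typical profile.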

Model-based clustering based on mixtures of distributions has also been proposed. The interested reader may refer to the review of clustering methods for functional data by Zhang and Parnell [19].

For more comprehensive details, methodologies, and applications, please refer, for example, to Horváth and Kokoszka [6].

## 4. Applications and Future Directions

A multitude of methods for handling functional data have been introduced, with many others yet to be discovered. In this section, we highlight the potential of functional data through an illustration of clustering using the well-known Canadian Weather dataset from the R package *fda*. We focus on the average daily temperature recorded every day from 1960 to 1994 in 35 spatial locations in Canada.

As noted above, in practical applications, functional data are typically observed at discrete points, such as the 365 days of the year. However, it is possible to reconstruct the underlying functions by representing them in a basis of functions. The initial step of this process is depicted in Figure 4, where a $B$-splines basis has been employed.

To distinguish groups of Canadian cities based on their temperature patterns, we applied the $k$-mean alignment algorithm proposed by Sangalli et al. [14], using the R package *fdacluster*. The number of groups was chosen by maximizing the average silhouette index, and the resulting clusters are depicted in Figure 5. Two distinct groups of Canadian cities emerge from the analysis: the first group (in green) corresponds to cities with consistently lower temperatures throughout the year, while the second group (in red) represents cities with consistently higher temperatures.
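For readers unfamiliar with the silhouette index, here is a minimal sketch of how its average can be computed from a matrix of pairwise distances between curves. It is not the implementation used by *fdacluster*; the function name and the toy data (four constant curves, for which the $L^2$ distance on $[0, 1]$ is the absolute difference of their levels) are assumptions for illustration. At least two clusters are required.

```python
import numpy as np

def average_silhouette(D, labels):
    """Average silhouette index from a pairwise distance matrix D and cluster labels."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    s = np.zeros(labels.size)
    for i in range(labels.size):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # convention: an observation alone in its cluster gets 0
        # a: mean distance to the other members of the same cluster
        a = D[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of another cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

# Toy example: four constant curves on [0, 1] with the levels below.
levels = np.array([0.0, 0.1, 5.0, 5.1])
D = np.abs(levels[:, None] - levels[None, :])

good = average_silhouette(D, [0, 0, 1, 1])   # groups the two low and the two high curves
bad = average_silhouette(D, [0, 1, 0, 1])    # mixes low and high curves
```

Computing this quantity for several candidate numbers of groups and keeping the partition with the largest value is the selection rule referred to in the text.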

In addition to the functional aspect of the data, the spatial dimension is becoming increasingly relevant, particularly in the context of environmental data. Thus, the literature has seen the emergence of numerous methods specifically tailored to the analysis of spatial functional data. Recently, several spatial cluster detection methods have been introduced in this context. These methods can be used, for instance, to identify environmental hotspots characterized by elevated levels of certain pollutants.

In the following example, we demonstrate a cluster detection approach that incorporates both the spatial and the functional nature of the data. This approach is the distribution-free functional spatial scan statistic (DFFSS) proposed by Frévent et al. [3] and available in the R package *HDSpatialScan*. The data used in this example are sourced from the National Air Quality Forecasting Platform (www.prevair.org) and are available within the package. They comprise the daily average concentration of a pollutant recorded from May 1 to June 25, 2020, in northern France. Figures 6 and 7 present the raw data, their functional reconstruction using a $B$-splines basis, and their spatial distribution, respectively.

Figure 8 (left panel) displays the statistically significant cluster detected by the DFFSS (highlighted in red) in this air pollution dataset. The right panel compares the concentration over time within this cluster (in red) with that outside the cluster (in gray), revealing higher pollutant concentrations within the identified cluster. This information can be valuable for authorities in conducting local investigations and implementing policies to mitigate pollution.

FDA’s importance has grown considerably owing to its relevance across diverse domains and to advances in data collection technology. Promising directions include the following:

- *Healthcare and Personalized Medicine*: FDA can analyze patient data as continuous functions, allowing for personalized treatment plans based on individual health profiles. Real-time monitoring through wearables and FDA can aid in disease prediction and the optimization of treatment strategies.
- *Artificial Intelligence (AI) and Machine Learning (ML)*: FDA provides a nuanced representation of data, improving AI and ML models’ performance in various applications. In speech recognition, it enhances accuracy by capturing the continuous nature of speech signals.
- *Climate Science*: In high-resolution climate data analysis, FDA identifies subtle patterns and trends, aiding in modeling, prediction, and mitigation strategies. Continuous data analysis contributes to precise climate projections and monitoring environmental changes.
- *Digital Marketing and User Behavior*: In the digital realm, FDA uncovers intricate user behavior patterns, optimizing marketing, user experience, and product recommendations. It analyzes continuous data streams from digital platforms for deeper insights.
- *Brain-Computer Interfaces (BCIs)*: FDA enhances BCIs by interpreting continuous brain activity data for prosthetics, neurorehabilitation, and cognitive augmentation. It enables precise control of assistive devices and cognitive enhancements.
- *Smart Cities*: In smart cities, FDA optimizes urban planning, transportation systems, and energy consumption by analyzing continuous IoT and sensor data. It helps design sustainable and efficient cities through traffic analysis and energy usage trends.
- *Biotechnology and Synthetic Biology*: In biotechnology, FDA models complex biological systems, facilitating the design of custom organisms and pharmaceuticals. It analyzes longitudinal data to engineer organisms for specific tasks.

**Software:** The Task View Functional Data Analysis on CRAN (https://cran.r-project.org/web/views/FunctionalData.html) lists the available R packages in the field of FDA, covering general functional data analysis, unsupervised learning (PCA, clustering, …), supervised learning (regression, classification), visualization and exploratory data analysis, registration, and alignment. Python and MATLAB also offer a few alternatives such as *fdasrsf* and *scikit-fda* (for Python), and *fda* or *fdasrvf* (for MATLAB).

## 5. Conclusion

Functional Data Analysis (FDA) provides tools to understand and extract meaningful insights from data that vary over a continuum. While FDA may be particularly intriguing for those with a mathematical inclination, it invites everyone to explore the process of transforming numbers into valuable insights and offers a statistical approach that allows us to gain a deeper understanding of the world and actively contribute to shaping the future.

Indeed, over time, the methods and frequency of data collection will evolve, and computing and storage capacities will increase. The development of functional analysis methods is therefore essential, and their applications will improve decision-making in a variety of fields, providing biologists, economists, and policymakers with accurate information to make informed choices.

FDA is therefore a powerful field for understanding, analyzing, and using complex datasets. Thus, next time you see a graph, don’t just see points and lines, but look for the continuous story it tells, the hidden patterns it holds, and the insights it offers. Remember what you just read: this is the realm of functional data analysis, where numbers transform into narratives waiting to be discovered.

*A longer version of this paper is available on arXiv (https://arxiv.org/abs/2404.16598) to provide additional references and details.*

## References

- [1] D. Bosq, *Linear processes in function spaces: Theory and applications*, Lecture Notes in Statistics, vol. 149, Springer-Verlag, New York, 2000, DOI 10.1007/978-1-4612-1154-9. MR1783138.

- [2] Frédéric Ferraty and Philippe Vieu, *Nonparametric functional data analysis: Theory and practice*, Springer Series in Statistics, Springer, New York, 2006. MR2229687.

- [3] Camille Frévent, Mohamed-Salem Ahmed, Matthieu Marbac, and Michaël Genin, *Detecting spatial clusters in functional data: new scan statistic approaches*, Spat. Stat. **46** (2021), Paper No. 100550, 21, DOI 10.1016/j.spasta.2021.100550. MR4336436.

- [4] María Luz López García, Ricardo García-Ródenas, and Antonia González Gómez, *K-means algorithms for functional data*, Neurocomputing **151** (2015), 231–245.

- [5] David B. Hitchcock, James G. Booth, and George Casella, *The effect of pre-smoothing functional data on cluster analysis*, J. Stat. Comput. Simul. **77** (2007), no. 11-12, 1089–1101, DOI 10.1080/10629360600880684. MR2416483.

- [6] Lajos Horváth and Piotr Kokoszka, *Inference for functional data with applications*, Vol. 200, Springer Science & Business Media, 2012.

- [7] Gareth M. James, *Generalized linear models with functional predictors*, J. R. Stat. Soc. Ser. B Stat. Methodol. **64** (2002), no. 3, 411–432, DOI 10.1111/1467-9868.00342. MR1924298.

- [8] Jürgen Kleffe, *Principal components of random variables with values in a separable Hilbert space* (English, with French and German summaries), Math. Operationsforsch. Statist. **4** (1973), no. 5, 391–406, DOI 10.1080/02331937308842161. MR391402.

- [9] Salil Koner and Ana-Maria Staicu, *Second-generation functional data*, Annu. Rev. Stat. Appl. **10** (2023), 547–572, DOI 10.1146/annurev-statistics-032921-033726. MR4567805.

- [10] Xiaoyan Leng and Hans-Georg Müller, *Classification using functional data analysis for temporal gene expression data*, Bioinformatics **22** (2006), no. 1, 68–76.

- [11] Jeffrey S. Morris, *Functional regression*, Annual Review of Statistics and Its Application **2** (2015), 321–359.

- [12] Biagio Palumbo, Fabio Centofanti, and Francesco Del Re, *Function-on-function regression for assessing production quality in industrial manufacturing*, Quality and Reliability Engineering International **36** (2020), no. 8, 2738–2753, DOI 10.1002/qre.2786.

- [13] J. O. Ramsay and B. W. Silverman, *Functional data analysis*, 2nd ed., Springer Series in Statistics, Springer, New York, 2005. MR2168993.

- [14] Laura M. Sangalli, Piercesare Secchi, Simone Vantini, and Valeria Vitelli, *$k$-mean alignment for curve clustering*, Comput. Statist. Data Anal. **54** (2010), no. 5, 1219–1233, DOI 10.1016/j.csda.2009.12.008. MR2600827.

- [15] Han Lin Shang, *A survey of functional principal component analysis*, AStA Adv. Stat. Anal. **98** (2014), no. 2, 121–142, DOI 10.1007/s10182-013-0213-1. MR3254025.

- [16] J. O. Ramsay and B. W. Silverman, *Applied functional data analysis: Methods and case studies*, Springer Series in Statistics, Springer-Verlag, New York, 2002, DOI 10.1007/b98886. MR1910407.

- [17] Jane-Ling Wang, Jeng-Min Chiou, and Hans-Georg Müller, *Functional data analysis*, Annual Review of Statistics and Its Application **3** (2016), 257–295.

- [18] Fang Yao, Hans-Georg Müller, and Jane-Ling Wang, *Functional data analysis for sparse longitudinal data*, J. Amer. Statist. Assoc. **100** (2005), no. 470, 577–590, DOI 10.1198/016214504000001745. MR2160561.

- [19] Mimi Zhang and Andrew Parnell, *Review of clustering methods for functional data*, ACM Transactions on Knowledge Discovery from Data **17** (2023), no. 7, 1–34.

- [20] Xiaoke Zhang and Jane-Ling Wang, *From sparse to dense functional data and beyond*, Ann. Statist. **44** (2016), no. 5, 2281–2321, DOI 10.1214/16-AOS1446. MR3546451.

## Credits

Figures 1–8 are courtesy of Sophie Dabo-Niang and Camille Frévent.

Photo of Sophie Dabo-Niang is courtesy of Sophie Dabo-Niang.

Photo of Camille Frévent is courtesy of Camille Frévent.