Modified FFTs for fused multiply-add architectures

Linzer, Elliot; Feig, Ephraim

doi:10.1090/S0025-5718-1993-1159169-0

Math. Comp. 60 (1993), 347–361

Abstract:

We introduce fast Fourier transform algorithms (FFTs) designed for fused multiply-add architectures. We show how to compute a complex discrete Fourier transform (DFT) of length $n = 2^m$ with $\frac{8}{3}nm - \frac{16}{9}n + 2 - \frac{2}{9}(-1)^m$ real multiply-adds. For real input, this algorithm uses $\frac{4}{3}nm - \frac{17}{9}n + 3 - \frac{1}{9}(-1)^m$ real multiply-adds. We also describe efficient multidimensional FFTs. These algorithms can be used to compute the DFT of an $n \times n$ array of complex data using $\frac{14}{3}n^2 m - \frac{4}{3}n^2 - \frac{4}{9}n(-1)^m + \frac{16}{9}$ real multiply-adds. For each problem studied, the number of multiply-adds that our algorithms use is a record upper bound for the number required.
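
To give a feel for the sizes involved, the three operation counts quoted in the abstract can be evaluated directly. The sketch below is only an illustration: the function names are hypothetical and not taken from the paper, and the row-column figure is a simple comparison assumed here (applying the one-dimensional count to each of the $2n$ rows and columns of the array), not a claim made in the abstract.

```python
from fractions import Fraction


def _as_int(value: Fraction) -> int:
    """Return an exact integer, failing loudly if the formula gave a fraction."""
    assert value.denominator == 1, f"non-integer count: {value}"
    return value.numerator


def complex_fft_madds(m: int) -> int:
    """Multiply-adds for a complex DFT of length n = 2^m (formula from the abstract)."""
    n = 2 ** m
    total = Fraction(8, 3) * n * m - Fraction(16, 9) * n + 2 - Fraction(2, 9) * (-1) ** m
    return _as_int(total)


def real_fft_madds(m: int) -> int:
    """Multiply-adds for a real-input DFT of length n = 2^m (formula from the abstract)."""
    n = 2 ** m
    total = Fraction(4, 3) * n * m - Fraction(17, 9) * n + 3 - Fraction(1, 9) * (-1) ** m
    return _as_int(total)


def complex_fft2d_madds(m: int) -> int:
    """Multiply-adds for an n x n complex DFT, n = 2^m (formula from the abstract)."""
    n = 2 ** m
    total = (Fraction(14, 3) * n * n * m - Fraction(4, 3) * n * n
             - Fraction(4, 9) * n * (-1) ** m + Fraction(16, 9))
    return _as_int(total)


def row_column_madds(m: int) -> int:
    """Assumed baseline: 2n one-dimensional complex transforms of length n."""
    n = 2 ** m
    return 2 * n * complex_fft_madds(m)


if __name__ == "__main__":
    print(f"{'m':>2} {'n':>5} {'complex':>8} {'real':>8} {'n x n':>10} {'row-col':>10}")
    for m in range(1, 11):
        print(f"{m:>2} {2 ** m:>5} {complex_fft_madds(m):>8} {real_fft_madds(m):>8} "
              f"{complex_fft2d_madds(m):>10} {row_column_madds(m):>10}")
```

For $m = 3$ ($n = 8$) the formulas give 52 multiply-adds for complex input, 20 for real input, and 816 for the $8 \times 8$ array, compared with $16 \cdot 52 = 832$ for the assumed row-column baseline. Exact rational arithmetic is used so that the $(-1)^m$ correction terms cancel the fractional parts without rounding.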
