Mathematics of Computation

ISSN 1088-6842(online) ISSN 0025-5718(print)

Error bounds on complex floating-point multiplication with an FMA


Authors: Claude-Pierre Jeannerod, Peter Kornerup, Nicolas Louvet and Jean-Michel Muller
Journal: Math. Comp. 86 (2017), 881-898
MSC (2010): Primary 65G50
DOI: https://doi.org/10.1090/mcom/3123
Published electronically: July 15, 2016
MathSciNet review: 3584553

Abstract: The accuracy analysis of complex floating-point multiplication done by Brent, Percival, and Zimmermann [Math. Comp., 76:1469-1481, 2007] is extended to the case where a fused multiply-add (FMA) operation is available. Considering floating-point arithmetic with rounding to nearest and unit roundoff $u$, we show that their bound $\sqrt{5}\,u$ on the normwise relative error $\vert\hat z/z-1\vert$ of a complex product $z$ can be decreased further to $2u$ when using the FMA in the most naive way. Furthermore, we prove that the term $2u$ is asymptotically optimal not only for this naive FMA-based algorithm but also for two other algorithms, which use the FMA operation as an efficient way of implementing rounding error compensation. Thus, although highly accurate in the componentwise sense, these two compensated algorithms bring no improvement to the normwise accuracy $2u$ already achieved using the FMA naively. Asymptotic optimality is established for each algorithm thanks to the explicit construction of floating-point inputs for which we prove that the normwise relative error then generated satisfies $\vert\hat z/z-1\vert \to 2u$ as $u\to 0$. All our results hold for IEEE floating-point arithmetic, with radix $\beta$, precision $p$, and rounding to nearest; it is only assumed that underflows and overflows do not occur and that $\beta^{p-1} \ge 24$.
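As an illustration of the naive FMA-based product mentioned in the abstract, here is a minimal Python sketch. It rests on several assumptions that are not taken from the abstract: the function names (`fma`, `cmul_fma`, `rel_err_sq`) are hypothetical, the arithmetic is binary64 (so $u = 2^{-53}$), the FMA is emulated by exact rational arithmetic followed by a single rounding, and the choice of which product is rounded first ($bd$ for the real part, $bc$ for the imaginary part) is one plausible reading of "naive", not necessarily the authors' exact formulation. The helper `rel_err_sq` evaluates $\vert\hat z/z-1\vert^2$ exactly so that it can be compared with $(2u)^2$.

```python
from fractions import Fraction
import random

# Unit roundoff of binary64 (radix 2, precision 53); an assumption of this sketch.
U = Fraction(1, 2**53)

def fma(x, y, t):
    """Emulate RN(x*y + t): exact rational arithmetic, then a single rounding.

    Converting a Fraction to float rounds to nearest in CPython.
    Signed zeros, overflow and underflow are not handled here.
    """
    return float(Fraction(x) * Fraction(y) + Fraction(t))

def cmul_fma(a, b, c, d):
    """Sketch of a naive FMA-based product (a + ib)(c + id)."""
    re = fma(a, c, -(b * d))  # RN(ac - RN(bd))
    im = fma(a, d, b * c)     # RN(ad + RN(bc))
    return re, im

def rel_err_sq(a, b, c, d, re, im):
    """Exact value of |zhat/z - 1|^2, returned as a Fraction (requires z != 0)."""
    A, B, C, D = map(Fraction, (a, b, c, d))
    zr, zi = A * C - B * D, A * D + B * C          # exact product z
    er, ei = Fraction(re) - zr, Fraction(im) - zi  # exact error zhat - z
    return (er * er + ei * ei) / (zr * zr + zi * zi)

if __name__ == "__main__":
    random.seed(1)
    worst = Fraction(0)
    for _ in range(1000):
        a, b, c, d = (random.uniform(0.5, 2.0) for _ in range(4))
        re, im = cmul_fma(a, b, c, d)
        worst = max(worst, rel_err_sq(a, b, c, d, re, im))
    # Ratio of the largest observed squared error to (2u)^2.
    print(float(worst / (2 * U) ** 2))
```

Random sampling of this kind only checks consistency with the bound $2u$; it says nothing about optimality, which the abstract establishes through explicitly constructed inputs.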


References

  • [1] M. Baudin, Error bounds of complex arithmetic, June 2011, available at http://forge.scilab.org/upload/compdiv/files/complexerrorbounds_v0.2.pdf.
  • [2] Sylvie Boldo, Pitfalls of a full floating-point proof: example on the formal proof of the Veltkamp/Dekker algorithms, Automated Reasoning, Lecture Notes in Comput. Sci., vol. 4130, Springer, Berlin, 2006, pp. 52-66. MR 2354672, https://doi.org/10.1007/11814771_6
  • [3] Richard Brent, Colin Percival, and Paul Zimmermann, Error bounds on complex floating-point multiplication, Math. Comp. 76 (2007), no. 259, 1469-1481 (electronic). MR 2299783 (2008b:65062), https://doi.org/10.1090/S0025-5718-07-01931-X
  • [4] M. Cornea, J. Harrison, and P. T. P. Tang, Scientific Computing on Itanium®-based Systems, Intel Press, Hillsboro, OR, USA, 2002.
  • [5] T. J. Dekker, A floating-point technique for extending the available precision, Numer. Math. 18 (1971/72), 224-242. MR 0299007 (45 #8056)
  • [6] Nicholas J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd ed., Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2002. MR 1927606 (2003g:65064)
  • [7] IEEE Computer Society, IEEE Standard for Floating-Point Arithmetic, IEEE Standard 754-2008, August 2008, available at http://ieeexplore.ieee.org/servlet/opac?punumber=4610933.
  • [8] C.-P. Jeannerod, A radix-independent error analysis of the Cornea-Harrison-Tang method, ACM Trans. Math. Software 42 (2016), no. 3, Art. 19, 20 pp.
  • [9] Claude-Pierre Jeannerod, Nicolas Louvet, and Jean-Michel Muller, Further analysis of Kahan's algorithm for the accurate computation of $ 2\times 2$ determinants, Math. Comp. 82 (2013), no. 284, 2245-2264. MR 3073198, https://doi.org/10.1090/S0025-5718-2013-02679-8
  • [10] W. Kahan, Further remarks on reducing truncation errors, Communications of the ACM 8 (1965), no. 1, 40.
  • [11] Seppo Linnainmaa, Analysis of some known methods of improving the accuracy of floating-point sums, Nordisk Tidskr. Informationsbehandling (BIT) 14 (1974), 167-202. MR 0483373 (58 #3381)
  • [12] Seppo Linnainmaa, Software for doubled-precision floating-point computations, ACM Trans. Math. Software 7 (1981), no. 3, 272-283. MR 630437 (82h:68041), https://doi.org/10.1145/355958.355960
  • [13] Ole Møller, Quasi double-precision in floating point addition, Nordisk Tidskr. Informationsbehandling (BIT) 5 (1965), 37-50. MR 0181130 (31 #5359)
  • [14] O. Møller, Note on quasi double-precision, Nordisk Tidskr. Informationsbehandling (BIT) 5 (1965), 251-255.
  • [15] Jean-Michel Muller, On the error of computing $ ab+cd$ using Cornea, Harrison and Tang's method, ACM Trans. Math. Software 41 (2015), no. 2, Art. 7, 8. MR 3318079, https://doi.org/10.1145/2629615
  • [16] Jean-Michel Muller, Nicolas Brisebarre, Florent de Dinechin, Claude-Pierre Jeannerod, Vincent Lefèvre, Guillaume Melquiond, Nathalie Revol, Damien Stehlé, and Serge Torres, Handbook of Floating-Point Arithmetic, Birkhäuser Boston, Inc., Boston, MA, 2010. MR 2568265
  • [17] M. Pichat, Correction d'une somme en arithmétique à virgule flottante, Numer. Math. 19 (1972), 400-406 (French, with English summary). MR 0324892 (48 #3241)
  • [18] M. Pichat, Contributions à l'étude des erreurs d'arrondi en arithmétique à virgule flottante, Ph.D. thesis, Université Scientifique et Médicale de Grenoble, Grenoble, France, 1976.

Additional Information

Claude-Pierre Jeannerod
Affiliation: Inria, Laboratoire LIP (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, 46, allée d’Italie, 69364 Lyon cedex 07, France
Email: claude-pierre.jeannerod@inria.fr

Peter Kornerup
Affiliation: Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark
Email: kornerup@imada.sdu.dk

Nicolas Louvet
Affiliation: UCBL, Laboratoire LIP (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, 46, allée d’Italie, 69364 Lyon cedex 07, France
Email: nicolas.louvet@ens-lyon.fr

Jean-Michel Muller
Affiliation: CNRS, Laboratoire LIP (CNRS, ENS de Lyon, Inria, UCBL), Université de Lyon, 46, allée d’Italie, 69364 Lyon cedex 07, France
Email: jean-michel.muller@ens-lyon.fr

Received by editor(s): September 26, 2013
Received by editor(s) in revised form: July 25, 2014, May 15, 2015, and September 28, 2015
Article copyright: © Copyright 2016 American Mathematical Society
