Uniquely Identifying Mathematical Authors in the Mathematical Reviews® Database
Since 1940 Mathematical Reviews® has made an attempt to identify authors of papers listed in its publications. This was done in the beginning entirely by hand, with data for each individual (published name variants, MR numbers, etc.) kept on 3x5 cards and filed alphabetically. Beginning in 1985, with the advent of the electronic author database, the process of author identification has been automated to an astonishing degree. The results of this effort, performed by the staff of MR, are visible in the bracketed full names published in printed Mathematical Reviews® headings, in the printed author indexes, and in the Search the Author Database application of MathSciNet.
For each author a separate "author-individual" record is maintained in the MR Database, containing each published name variant associated with an author, institutional affiliations, mathematical subject classifications assigned to the author-individual's papers, coauthors, and references to papers indexed in the MR Database. Each record is headed by a "preferred name", which usually represents the fullest published form of an author's name that will distinguish the author-individual from others who publish using similar names; on occasion an unpublished full name is used as a preferred name if necessary to identify the author-individual uniquely.
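As a rough illustration of the fields listed above, an author-individual record might be modeled as follows. This is only a sketch: the class name, field names, and the `add_paper` helper are hypothetical and are not taken from the MR Database itself, and the MR number and names in the usage note are made-up placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorIndividual:
    """Hypothetical model of one author-individual record."""

    # Heads the record; usually the fullest published form of the name.
    preferred_name: str
    # Every name variant under which the author has been published.
    name_variants: set[str] = field(default_factory=set)
    institutional_affiliations: set[str] = field(default_factory=set)
    # Subject classifications assigned to this author's papers.
    classifications: set[str] = field(default_factory=set)
    coauthors: set[str] = field(default_factory=set)
    # References to indexed papers, stored here as MR number strings.
    papers: list[str] = field(default_factory=list)

    def add_paper(self, mr_number, published_name, affiliation,
                  classification, coauthors=()):
        """Fold the data from one newly attributed paper into the record."""
        self.papers.append(mr_number)
        self.name_variants.add(published_name)
        self.institutional_affiliations.add(affiliation)
        self.classifications.add(classification)
        self.coauthors.update(coauthors)
```

For example, `AuthorIndividual("Smith, John M.")` followed by `add_paper("MR0000001", "Smith, J. M.", "Example University", "11A41", ["Doe, Jane"])` yields a record whose preferred name is the full form while the published variant "Smith, J. M." is kept alongside it.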
MR's author identification employs a number of machine algorithms that compare three elements of each paper (the name string as it appears on the paper, the institutional affiliation listed for the author, and the classification assigned to the paper by the MR editors) against the author-individuals already in the MR Database, looking for the best possible (ideally exact) match on all three elements. The programs succeed roughly eighty percent of the time. For the remaining twenty percent the programs make a "best guess" at a potential match to an author-individual in the database, using preset algorithms to rank the possible matches. It is on this remaining twenty percent that the MR staff spends most of its time: the keyboarded name string is checked for typos or a mistaken name break; the intent of the journal in name presentation is carefully checked (journals do make errors in the presentation of first name/family name); alternate spellings are examined; bibliographies are checked for self-citations; coauthors are checked for a possible match. When all possibilities available via the paper at hand are exhausted, staff use internet and web-based tools to search for authors, e.g. searching for full names at university or department websites or for CVs with lists of publications. Authors are contacted by e-mail or paper mail if necessary.
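The three-element comparison and the ranking of possible matches can be sketched as below. The actual MR algorithms and their ranking rules are not described here, so everything in this block is an assumption made for illustration: the `Candidate` class, the scoring rule (one point per exactly matching element), and the rule that anything short of a unique three-way match is passed to staff.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical, minimal view of an author-individual record."""

    preferred_name: str
    name_variants: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    classifications: set = field(default_factory=set)


def match_score(candidate, name, affiliation, classification):
    """Count how many of the three elements agree exactly (0 to 3)."""
    return (
        (name in candidate.name_variants)
        + (affiliation in candidate.affiliations)
        + (classification in candidate.classifications)
    )


def rank_candidates(candidates, name, affiliation, classification):
    """Return (score, candidate) pairs, best score first.

    sorted() is stable, so candidates with equal scores keep input order.
    """
    scored = [
        (match_score(c, name, affiliation, classification), c)
        for c in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def identify(candidates, name, affiliation, classification):
    """Return (best_guess, needs_staff_review).

    Assumed rule: only a unique match on all three elements is accepted
    automatically; every other outcome is flagged for manual checking.
    """
    ranked = rank_candidates(candidates, name, affiliation, classification)
    if not ranked:
        return None, True
    best_score, best = ranked[0]
    exact = best_score == 3
    unique = len(ranked) == 1 or ranked[1][0] < best_score
    return best, not (exact and unique)
```

Under this simplified rule, a paper by "Smith, J. M." whose affiliation and classification both agree with one existing record is matched automatically, while the same name with an unfamiliar affiliation still produces a best guess but is marked for the staff checks described above.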
The introduction of the MathSciNet Web interface to the Mathematical Reviews® Database provided an impetus to create additional author-individual database records from papers indexed or reviewed in MR prior to 1985. A matching algorithm was employed to assign attribution of older papers to author-individuals already in the MR Database, based on exact string matches with known variations of an author's name. In cases where an author used one form of her or his name on papers written prior to 1985 (say "Smith, J. M."), but a different form of the name on later papers ("Smith, John M."), a new author-individual record for "Smith, J. M." was created in the MR Database. Tools are available to examine and combine author-individual records and to make these changes available on MathSciNet virtually overnight. MathSciNet users have also proved to be of invaluable assistance in this endeavor, and users are encouraged to notify Mathematical Reviews® of any anomalies they notice in author information presented as a result of searches using Search the Author Database or clicking on an author name in a headline.
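The retrospective attribution and the later combining of records can be sketched in the same spirit. This is a simplified illustration, not MR's tooling: the dictionary layout and the function names `new_record`, `attribute_older_paper`, and `combine` are hypothetical, the MR numbers in the usage note are placeholders, and the sketch simply takes the first record holding the name string, since the handling of several records sharing one variant is not described here.

```python
def new_record(preferred_name):
    """Create a hypothetical author-individual record as a plain dict."""
    return {
        "preferred_name": preferred_name,
        "variants": {preferred_name},
        "papers": [],
    }


def attribute_older_paper(records, published_name, mr_number):
    """Attribute a pre-1985 paper by exact string match on known variants.

    If no existing record lists the published name string, a new record
    headed by that string is created, as happens with "Smith, J. M."
    when only "Smith, John M." is already known.
    """
    for record in records:
        if published_name in record["variants"]:  # exact match only
            record["papers"].append(mr_number)
            return record
    record = new_record(published_name)
    record["papers"].append(mr_number)
    records.append(record)
    return record


def combine(records, keep, absorb):
    """Merge `absorb` into `keep` once both are known to be one person."""
    keep["variants"] |= absorb["variants"]
    keep["papers"].extend(absorb["papers"])
    # Remove by identity so that an equal-looking record is never dropped.
    records[:] = [r for r in records if r is not absorb]
    return keep
```

Starting from a single record for "Smith, John M.", attributing a paper published as "Smith, J. M." creates a second record; once staff or a MathSciNet user confirms that the two are the same person, `combine` folds the older record's variants and papers into the newer one.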
A helpful discussion about author identification, as it is presented in the MathSciNet interface to the Mathematical Reviews® Database, can be found in the booklet MathSciNet--Mathematical Reviews® on the Web: Guiding you through the literature of mathematics. The booklet is available for download in PDF.