PageRank for networks, graphs, and Markov chains

Authors:
C. Engström and S. Silvestrov

Original publication:
Teoriya Imovirnostei ta Matematichna Statistika, tom **96** (2017).

Journal:
Theor. Probability and Math. Statist. **96** (2018), 59-82

MSC (2010):
Primary 60K20; Secondary 05C82, 90B15, 90B18

DOI:
https://doi.org/10.1090/tpms/1034

Published electronically:
October 5, 2018

Full-text PDF

Abstract | References | Similar Articles | Additional Information

In this work it is described how a partitioning of a graph into components can be used to calculate PageRank in a large network and how such a partitioning can be used to re-calculate PageRank as the network changes. Although calculating PageRank is considered a problem, it is worth noing that the same partitioning method could be used when working with Markov chains in general or solving linear systems as long as the method used for solving a single component is chosen appropriately. An algorithm for calculating PageRank using a modified partitioning of the graph into strongly connected components is described.

Moreover, the paper also focuses on the calculation of PageRank in a changing graph from two different perspectives, by considering specific types of changes in the graph and calculating the difference in rank before and after certain types of edge additions or removals between components. Moreover, some common specific types of graphs for which it is possible to find analytic expressions for PageRank are considered, and in particular the complete bipartite graph and how PageRank can be calculated for such a graph. Finally, several open directions and problems are described.

- [1]
J. Leskovec,
*Google web graph. Google programming contest*,`http://snap.stanford.edu/data/web-Google.html`. - [2]
Fredrik
K. Andersson and Sergei
D. Silvestrov,
*The mathematics of internet search engines*, Acta Appl. Math.**104**(2008), no. 2, 211–242. MR**2443279**, https://doi.org/10.1007/s10440-008-9254-y - [3]
Konstantin
E. Avrachenkov, Jerzy
A. Filar, and Phil
G. Howlett,
*Analytic perturbation theory and its applications*, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2013. MR**3137641** - [4]
A. Arasu, J. Novak, A. Tomkins, and J. Tomlin,
*Pagerank computation and the structure of the web: Experiments and algorithms*, Proceedings of the Eleventh International Conference on World Wide Web, Alternate Poster Tracks, 2002. - [5]
L. Becchetti and C. Castillo,
*The distribution of pagerank follows a power-law only for particular values of the damping factor*, Proceedings of the 15th international conference on World Wide Web, ACM Press, 2006. - [6]
Richard
Bellman,
*Introduction to matrix analysis*, Classics in Applied Mathematics, vol. 12, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1995. Reprint of the 1960 original. MR**1312360** - [7]
Abraham
Berman and Robert
J. Plemmons,
*Nonnegative matrices in the mathematical sciences*, Classics in Applied Mathematics, vol. 9, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1994. Revised reprint of the 1979 original. MR**1298430** - [8]
S. Brin and L. Page,
*The anatomy of a large-scale hypertextual web search engine*, Proceedings of the Seventh International Conference on World Wide Web, Amsterdam, The Netherlands, 1998, Elsevier Science Publishers B.V., pp. 107-117. - [9]
D. Dhyani, S. S. Bhowmick, and W.-K. Ng,
*Deriving and verifying statistical distribution of a hyperlink-based web page quality metric*, Data Knowl. Eng.**46**(2003), no. 3, 291-315. - [10]
C. Engström and S. Silvestrov,
*A componentwise pagerank algorithm*, Applied Stochastic Models and Data Analysis (ASMDA 2015), The 16th Conference of the ASMDA International Society, 2015, pp. 186-199. - [11]
C. Engström and S. Silvestrov,
*Non-normalized pagerank and random walks on n-partite graphs*, 3rd Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2014), 2015, pp. 193-202. - [12]
C. Engström and S. Silvestrov,
*Using graph partitioning to calculate pagerank in a changing network*, 4th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2016), 2016, pp. 155-164. - [13]
C. Engström and S. Silvestrov,
*Calculating pagerank in a changing network with added or removed edges*, International Conference in Nonlinear Problems in Aviation and Aerospace (ICNPAA), AIP Conference Proceedings, vol. 1798, 2017. - [14]
Christopher
Engström and Sergei
Silvestrov,
*Generalisation of the damping factor in PageRank for weighted networks*, Modern problems in insurance mathematics, EAA Ser., Springer, Cham, 2014, pp. 313–333. MR**3330692** - [15]
Christopher
Engström and Sergei
Silvestrov,
*PageRank, connecting a line of nodes with a complete graph*, Engineering mathematics. II, Springer Proc. Math. Stat., vol. 179, Springer, Cham, 2016, pp. 249–274. MR**3630582** - [16]
Christopher
Engström and Sergei
Silvestrov,
*PageRank, connecting a line of nodes with a complete graph*, Engineering mathematics. II, Springer Proc. Math. Stat., vol. 179, Springer, Cham, 2016, pp. 249–274. MR**3630582** - [17]
F.
R. Gantmacher,
*The theory of matrices. Vols. 1, 2*, Translated by K. A. Hirsch, Chelsea Publishing Co., New York, 1959. MR**0107649** - [18]
Mats
Gyllenberg and Dmitrii
S. Silvestrov,
*Quasi-stationary phenomena in nonlinearly perturbed stochastic systems*, De Gruyter Expositions in Mathematics, vol. 44, Walter de Gruyter GmbH & Co. KG, Berlin, 2008. MR**2456816** - [19]
T. Haveliwala and S. Kamvar,
*The second eigenvalue of the google matrix*, Technical Report 2003-20, Stanford InfoLab, 2003. - [20]
H. Ishii, R. Tempo, E.-W. Bai, and F. Dabbene,
*Distributed randomized pagerank computation based on web aggregation*, Decision and Control, 2009, Held jointly with the 2009 28th Chinese Control Conference, CDC/CCC 2009, Proceedings of the 48th IEEE Conference, 3026-3031. - [21]
S. Kamvar and T. Haveliwala,
*The condition number of the pagerank problem*, Technical Report 2003-36, Stanford InfoLab, June 2003. - [22]
Sepandar
Kamvar, Taher
Haveliwala, and Gene
Golub,
*Adaptive methods for the computation of PageRank*, Linear Algebra Appl.**386**(2004), 51–65. MR**2066607**, https://doi.org/10.1016/j.laa.2003.12.008 - [23]
Vladimir
S. Koroliuk and Nikolaos
Limnios,
*Stochastic systems in merging phase space*, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2005. MR**2205562** - [24]
Peter
Lancaster,
*Theory of matrices*, Academic Press, New York-London, 1969. MR**0245579** - [25]
Amy
N. Langville and Carl
D. Meyer,
*A reordering for the PageRank problem*, SIAM J. Sci. Comput.**27**(2006), no. 6, 2112–2120. MR**2211442**, https://doi.org/10.1137/040607551 - [26]
Chris
P. Lee, Gene
H. Golub, and Stefanos
A. Zenios,
*A two-stage algorithm for computing PageRank and multistage generalizations*, Internet Math.**4**(2007), no. 4, 299–327. MR**2522947** - [27]
J. Leskovec and A. Krevl,
*SNAP Datasets: Stanford large network dataset collection*,`http://snap.stanford.edu/data`, June 2014. - [28]
Dmitrii
Silvestrov and Sergei
Silvestrov,
*Asymptotic expansions for stationary distributions of perturbed semi-Markov processes*, Engineering mathematics. II, Springer Proc. Math. Stat., vol. 179, Springer, Cham, 2016, pp. 151–222. MR**3630580** - [29]
D. Silvestrov and S. Silvestrov,
*Asymptotic expansions for stationary distributions of nonlinearly perturbed semi-Markov processes. I, II*(2016). Part I: arXiv:1603.04734, Part II: arXiv:1603.04743. - [30]
Dmitrii
Silvestrov and Sergei
Silvestrov,
*Nonlinearly perturbed semi-Markov processes*, SpringerBriefs in Probability and Mathematical Statistics, Springer, Cham, 2017. MR**3700359** - [31]
Robert
Tarjan,
*Depth-first search and linear graph algorithms*, SIAM J. Comput.**1**(1972), no. 2, 146–160. MR**0304178**, https://doi.org/10.1137/0201010 - [32]
Q. Yu, Z. Miao, G. Wu, and Y. Wei,
*Lumping algorithms for computing google's pagerank and its derivative, with attention to unreferenced nodes*, Information Retrieval**15**(2012), no. 6, 503-526.

Retrieve articles in *Theory of Probability and Mathematical Statistics*
with MSC (2010):
60K20,
05C82,
90B15,
90B18

Retrieve articles in all journals with MSC (2010): 60K20, 05C82, 90B15, 90B18

Additional Information

**C. Engström**

Affiliation:
Division of Applied Mathematics, The School of Education, Culture and Communication, Mälardalen University, Box 883, 721 23 Västerås, Sweden

Email:
christopher.engstrom@mdh.se

**S. Silvestrov**

Affiliation:
Division of Applied Mathematics, The School of Education, Culture and Communication, Mälardalen University, Box 883, 721 23 Västerås, Sweden

Email:
sergei.silvestrov@mdh.se

DOI:
https://doi.org/10.1090/tpms/1034

Keywords:
PageRank,
random walk,
Markov chain,
graph,
strongly connected component

Received by editor(s):
March 13, 2017

Published electronically:
October 5, 2018

Dedicated:
Dedicated to Professor Dmitrii Silvestrov on his 70th birthday

Article copyright:
© Copyright 2018
American Mathematical Society