
MASTER (1 YEAR) THESIS IN MATHEMATICS / APPLIED MATHEMATICS

A comparison of a Lazy PageRank and variants for common graph structures

by

Barkat Aziz Ali

Master's thesis in mathematics / applied mathematics

DIVISION OF APPLIED MATHEMATICS
MÄLARDALEN UNIVERSITY


Date: 2018-01-25

Project name: A comparison of a Lazy PageRank and variants for common graph structures

Author: Barkat Aziz Ali

Supervisor: Christopher Engström

Reviewer: Milica Rančić

Examiner: Sergei Silvestrov

Comprising: 15 ECTS credits


Abstract

Barkat Aziz Ali

A comparison of a Lazy PageRank and variants for common graph structures

The thesis first reviews the mathematics behind Google's PageRank, the state-of-the-art webpage ranking algorithm. The main focus of the thesis is on exploring a lazy PageRank and its variants, each related to a different kind of random walk, and on showing that they can all be computed using the very same algorithm. Building on this, we find lazy PageRank and variant expressions for some common graph structures, for example a line-graph, a complete-graph, and a complete-bipartite graph including a star graph, and try to gain some understanding of the behavior of the PageRank when a network evolves, for example by a contraction or an expansion of the graph's nodes or links.


Acknowledgements

First of all, I thank Almighty God, the Most Gracious and the Most Merciful.

I wish to thank my thesis supervisor, Christopher Engström, for his supervision as well as patience towards the progress and completion of the thesis work.

I also wish to thank the program’s study co-coordinator, Karl Lundengård, and the study advisor, Malin Lundin, for their academic and administrative support.

I wish to thank Milica Rančić, the programme coordinator of the Master's programme in Financial Engineering, for her valuable time to review the thesis.

I also wish to thank the thesis examiner, Professor Sergei Silvestrov, the Research Scientific Leader for Mathematics and Applied Mathematics at Mälardalen University.

My special thanks to my wife for her unconditional love and support.

My sincere thanks to my parents-in-law and my younger sister for their financial assistance.

I also wish to thank all my brothers and sisters, relatives and friends for their well-wishes.


Contents

Abstract

Acknowledgements

List of Figures

List of Abbreviations

List of Symbols

Introduction

1 Mathematical anatomy of Google's PageRank
 1.1 PageRank proposition
 1.2 Primitive PageRank Model
  1.2.1 Definition
  1.2.2 Iterative procedure
  1.2.3 Matrix representation
   Hyperlink matrix
   PageRank vector
 1.3 Iterative matrix equation
  1.3.1 Convergence problems
   Rank sink
   Zero PR
   Cycle
 1.4 Markov chain, matrix and properties
 1.5 Initial adjustments to the model
  1.5.1 Stochasticity adjustment
  1.5.2 Primitivity adjustment
 1.6 Computation of PR vector
  1.6.1 Eigenvector system
  1.6.2 Computation
  1.6.3 Power method
   Reasons using the method
 1.7 Ways of modifying the model
  1.7.1 Scaling parameter
  1.7.2 Hyperlink matrix
  1.7.3 Teleportation matrix
 1.8 Accelerating the computation
  1.8.1 Adaptive power method
  1.8.2 Extrapolation method
  1.8.3 Aggregation method
 1.9 Updating a PR vector
   PR Updating: an open challenge

2 PageRank variants
 2.1 Variants
  2.1.1 Traditional PR
  2.1.2 Lazy PR
  2.1.3 Generalized-Lazy PR
  2.1.4 Backstep PR
 2.2 Theoretical results
  2.2.1 Lazy vs Traditional PR
  2.2.2 Generalized-lazy vs Traditional PR
  2.2.3 Backstep vs Traditional PR
   BPR-single backstep
   BPR-multiple backsteps
  2.2.4 Conclusion

3 PageRank expressions
 3.1 Line-graph
 3.2 Modified-line-graph
  3.2.1 Adding a link
  3.2.2 Removing a link
  3.2.3 Linking an external node
 3.3 Complete-graph
  3.3.1 Modifications
 3.4 Bipartite graphs
  3.4.1 Bipartite graph (Bi-graph)
  3.4.2 Complete-bipartite graph (Bi-clique)
  3.4.3 Star graph
 3.5 Expressions of the PR variants
 3.6 LPR expressions
  3.6.1 LPR of a line-graph linking an external node
  3.6.2 LPR of a complete-graph
  3.6.3 LPR of a complete-bipartite graph
  3.6.4 LPR of a star graph
 3.7 GLPR expressions
  3.7.1 GLPR of a line-graph linking an external node
  3.7.2 GLPR of a complete-graph
  3.7.3 GLPR of a complete-bipartite graph
  3.7.4 GLPR of a star graph
 3.8 BPR-s expressions
  3.8.1 BPR-s of a line-graph linking an external node
  3.8.2 BPR-s of a complete-graph
  3.8.3 BPR-s of a complete-bipartite graph
  3.8.4 BPR-s of a star graph
 3.9 BPR-m expressions
  3.9.1 BPR-m of a line-graph linking an external node
  3.9.2 BPR-m of a complete-graph
  3.9.3 BPR-m of a complete-bipartite graph
  3.9.4 BPR-m of a star graph

4 Summary
 4.1 Chapters' summary


List of Figures

1.1 A graph with a rank sink
1.2 A graph with a cycle
3.1 A simple line-graph with 5 nodes
3.2 A line-graph with 5 nodes, where node 1 links to node 2
3.3 A line-graph where the link between node 3 and node 2 is removed
3.4 A line-graph where external node 6 is linked to internal node 3
3.5 A complete-graph with 5 nodes
3.6 A bipartite graph
3.7 A complete-bipartite graph K5,3
3.8 A star graph S7


List of Abbreviations

PR PageRank

BP Sergey Brin and Larry Page, Google’s inventors

SIAM Society for Industrial and Applied Mathematics

RW Random Walker

NN-PR Non Normalized PageRank

TPR Traditional PageRank

LPR Lazy PageRank

GLPR Generalized Lazy PageRank

BPR Backstep PageRank

BPR-s Backstep PageRank-single backstep

BPR-m Backstep PageRank-multiple backsteps


List of Symbols

r_k(P_i)    PageRank score of page P_i at the k-th iteration
B_{P_i}     Set of pages pointing into P_i
|P_i|       Number of out-links from P_i
H           Hyperlink matrix
π^(k)T      PR vector at the k-th iteration
e           A vector of all 1s
P           Markov matrix
S           Stochastic matrix
a           Binary dangling node vector
E           Teleportation matrix
G           Google matrix
α           Scaling parameter / boring factor
λ_1         Dominant eigenvalue
V           Personalization/teleportation vector
O(n)        Computational complexity
γ           Laziness degree
π′          Non-normalized PR vector
‖π′‖        Norm of π′
b           Some fraction of α
I           Identity matrix
K_n         Complete-graph with n nodes
K_{m,n}     Complete-bipartite graph with subsets of size m and n
S_n         Star-graph with n leaves
π^(t)       Traditional PR vector
π^(l)       Lazy PR vector
π^(g)       Generalized-lazy PR vector
π^(b1)      Backstep PR (single backstep) vector
π^(b)       Backstep PR (multiple backsteps) vector


I dedicate this thesis to my beloved late mother, late father and late brother.


Introduction

PageRank is a mathematical link analysis algorithm and a state-of-the-art web ranking algorithm, which provides search engines, Google for example, with impressively accurate results for search queries.

In Chapter 1, we will review the mathematics behind Google's PageRank. The review includes the basic PageRank model and its iterative matrix equation, the Markov chain with its matrix and properties, the initial adjustments and their implications for the model, the computation of the PageRank vector with special reference to the power method, and the ways of modifying the model and speeding up the computation of the PageRank.

In Chapter 2, we will explore the lazy PageRank and its variants, each related to a different kind of random walk with a distinct semantic connotation, and later deduce that, though all of them differ semantically, they can still be computed using the very same algorithm.

In Chapter 3, we will find and compare expressions of a Lazy PageRank and its variants for some common graph structures, for example a line-graph, a complete-graph, and a complete-bipartite graph including a star graph.


Chapter 1

Mathematical anatomy of Google's PageRank

In this chapter we will present the mathematical anatomy of Google's PageRank. One of the most exciting applications of linear algebra today is perhaps the use of link analysis by web search engines for webpage ranking. PageRank (PR) is a mathematical link analysis algorithm and a state-of-the-art web ranking algorithm. PR may be regarded as the grand application of Markov chains — its scores are actually the stationary values of an enormous Markov chain — and as the world's largest matrix computation, of the order of billions. PR was patented as US Patent 6,285,999, filed in 1998 by one of its founders, Larry Page, and granted in 2001; the name PageRank thus has a double reference to both Larry Page and a web page [6].

1.1 PageRank proposition

Stanford University, USA, is the place of origin of the PR algorithm. The PR algorithm grew out of the thesis work of two of its computer science doctoral students, Sergey Brin and Larry Page, who later left their PhD studies to develop the Google search engine. We refer to them as BP from here onwards.

PageRank's thesis is that a webpage is important if it is pointed to by other important pages. One can notice the circular definition, but it can be formalized in a simple mathematical formula.

Webpage "importance" refers to the webpage popularity score which [6] claims to be a crucial complement to the content score which provides search engines, Google for example, with impressively accurate results for search queries.

Mathematical formulation of the statement of their thesis will reveal that the PR importance scores are actually the stationary values of an enormous Markov chain.

1.2 Primitive PageRank Model

The author in [6] provides an in-depth analysis of the primitive PR model which we review:


1.2.1 Definition

Recall that BP’s starting proposition about the PR is that a webpage is important if it is pointed to by other important pages. So the starting definition of the PR is as follows:

Definition 1.2.1. The PR of page P_i is the sum of the PRs of all pages pointing into P_i.

Mathematically:

$$r(P_i) = \sum_{P_j \in B_{P_i}} \frac{r(P_j)}{|P_j|}, \qquad (1.1)$$

where:

r(P_i) denotes the PR of page P_i,

|P_j| denotes the number of out-links from page P_j,

B_{P_i} denotes the set of pages pointing into P_i (back-linking to P_i).

Due to the cyclic definition of eq. 1.1, there is a problem that the values of r(P_j), the PRs of the pages in-linking to the page P_i, are unknown. To overcome this problem, BP used an iterative procedure, by assuming that in the beginning all the pages have equal PR, 1/n, where n is the number of webpages in Google's index of webpages. To proceed further, we need to introduce the iterative procedure for eq. 1.1.

1.2.2 Iterative procedure

We define the iterative procedure for eq. 1.1 by letting r_{k+1}(P_i) be the PR of page P_i at iteration k+1. That is:

$$r_{k+1}(P_i) = \sum_{P_j \in B_{P_i}} \frac{r_k(P_j)}{|P_j|} \qquad (1.2)$$

The process is initiated with r_0(P_i) = 1/n for all pages, as stated earlier, and then repeated for a number of iterations until the PR scores eventually converge to some final stable values. Note that eq. 1.1 and eq. 1.2 compute the PR one page at a time.
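To make the procedure concrete, here is a minimal Python sketch of eq. 1.2 on a hypothetical four-page web (the link structure is invented purely for illustration):

```python
# Summation-form PR iteration (eq. 1.2) on a tiny, made-up web.
# links[j] lists the pages that page j points to, so |P_j| = len(links[j]).
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = len(links)

r = [1.0 / n] * n                    # r_0(P_i) = 1/n for all pages
for _ in range(50):
    new_r = [0.0] * n
    for j, outs in links.items():
        for i in outs:               # page j passes r_k(P_j)/|P_j| to each page it links to
            new_r[i] += r[j] / len(outs)
    r = new_r

print(r)                             # the scores settle to stable values
```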

We can, however, represent these summation equations by matrix equations with added benefits.

1.2.3 Matrix representation

By using matrices we can replace the tedious summation symbol and, at each iteration, compute a PR vector: a single 1×n vector that holds the PR values for all the pages in the index.

We need to introduce here the hyperlink matrix and the PR vector.

Hyperlink matrix

The hyperlink matrix H is the n×n matrix with

$$H_{ij} = \frac{1}{|P_i|}$$

if there is a link from node i to node j, and 0 otherwise.

Although H has the same nonzero structure as the binary adjacency matrix of the graph, its nonzero elements actually represent probabilities. The nonzero elements of row i correspond to the out-linking pages of page i, whereas the nonzero elements of column i correspond to the in-linking pages of page i.

One can observe that H is a very sparse matrix, because a large proportion of its elements are zero, as most webpages link to only a handful of other pages.

One can also observe that H appears to be a stochastic transition probability matrix for a Markov chain. Nodes with no out-links (dangling nodes) create zero rows, but all other rows, corresponding to the non-dangling nodes, are stochastic (each row sums to 1). Thus H is sub-stochastic.

PageRank vector

The PR vector is a single 1×n vector that holds the PR values for all webpages in the index. A row vector π^(k)T denotes the PR vector at the k-th iteration.

1.3 Iterative matrix equation

The above discussion leads us to the matrix representation of the summation equations of the basic PR model as:

$$\pi^{(k+1)T} = \pi^{(k)T} H. \qquad (1.3)$$

We observe that the iterative process of eq. 1.3 is a simple linear stationary process. Each iteration of eq. 1.3 involves one vector-matrix multiplication, which generally requires O(n²) computations, where n is the size of the square matrix H. Since H is a sparse matrix, this vector-sparse matrix multiplication requires only O(nnz(H)) effort, much less than the O(n²) dense computation, where nnz(H) is the number of non-zeros in H. The author in [6] estimates that, on average, a webpage has about 10 out-links, which means that H has about 10n non-zeros, and hence the vector-sparse matrix multiplication of eq. 1.3 reduces to O(n) computations!
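The saving is easy to see in code; a sketch using scipy.sparse (the same invented four-page web, with H stored in compressed sparse row format):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Row-stochastic hyperlink matrix H for the toy web 0->{1,2}, 1->{2}, 2->{0}, 3->{2}.
H = csr_matrix(np.array([[0.0, 0.5, 0.5, 0.0],
                         [0.0, 0.0, 1.0, 0.0],
                         [1.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 1.0, 0.0]]))
n = H.shape[0]

pi = np.full(n, 1.0 / n)             # pi^(0)T = (1/n)e^T
for _ in range(50):
    pi = pi @ H                      # eq. 1.3: costs O(nnz(H)), not O(n^2)

print(pi)
```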

1.3.1 Convergence problems

BP initially encountered the problems of rank sinks, zero PR, and cycles when using eq. 1.3, which we discuss here:


Rank sink

A rank sink is a webpage or set of webpages that continue to suck in PR during the iterative PR computation. [6]

BP encountered the problem of a rank sink when they originally started the iterative process of eq. 1.3 with the initial vector π^(0)T = (1/n)e^T, where e^T is the row vector of all 1s.

At each iteration, these pages accumulate more and more PR and thus refuse to share their score with other pages. A simple example of a rank sink is shown in Fig. 1.1, where the dangling node 3 is a rank sink.

FIGURE 1.1: A graph with a rank sink.

Another problem is that once a RW enters this set of pages, there is no escape route.

Zero PR

As rank sink nodes hoard scores, they may cause a problem of zero PR to some other nodes. Thus ranking nodes by their PR values is tough when most of the nodes have zero PR.

Cycle

A cycle is a path in the web-graph that always returns to its origin. [6]

For example, in Fig. 1.2, a trivial cycle occurs when page 1 points only to page 2 and page 2 points only back to page 1. The problem is that a RW of the PR model gets stuck in the cycle and circles indefinitely among the pages on the path, which causes convergence problems for PR.

FIGURE 1.2: A graph with a cycle.

When the iterative process of eq. 1.3 is run with the initial vector π^(0)T = (1, 0), the iterates π^(k)T flip-flop indefinitely between (1, 0) when k is even and (0, 1) when k is odd, and hence never converge.
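A few lines of Python make the flip-flop concrete (a sketch of the two-page cycle of Fig. 1.2):

```python
import numpy as np

H = np.array([[0.0, 1.0],            # page 1 points only to page 2
              [1.0, 0.0]])           # page 2 points only back to page 1

pi = np.array([1.0, 0.0])            # pi^(0)T = (1, 0)
for k in range(6):
    print(k, pi)                     # alternates forever between (1, 0) and (0, 1)
    pi = pi @ H
```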

Now the question is how to overcome these PR convergence problems caused by sinks and cycles. The answer perhaps lies in the Markov matrix.

1.4 Markov chain, matrix and properties

The iterative process of eq. 1.3 can be viewed as describing a Markov chain with the transition probability matrix H. So the theory of Markov chains is applicable to the PR problem, and with the help of this theory, adjustments to the iterative matrix equation can be made to overcome the PR convergence problems. The result of the Markov properties on the PR problem will be that a unique positive vector exists when the Google matrix is stochastic and irreducible. With the added property of aperiodicity, the power method will converge to this PR vector regardless of the starting vector for the iterative process. Note that aperiodicity plus irreducibility imply primitivity.

This means that, for any starting vector, the power method applied to a Markov matrix P converges to a unique positive vector called the stationary vector, provided P is stochastic, irreducible and aperiodic.

Hence, if H is modified to a matrix P with these Markov properties, the PR convergence problems of sinks and cycles in the basic PR model can be overcome. This leads us to the needed initial adjustments to the model.

1.5 Initial adjustments to the model

Recall that adjustments are required in the basic PR model to achieve the Markov properties. With the notion of a RW model, BP made two adjustments: the stochasticity adjustment and the primitivity adjustment.

1.5.1 Stochasticity adjustment

It is the adjustment to the BP’s original basic PR model that artificially forces the PR matrix to be stochastic, and allows a RW to teleport to a new page immediately after entering a dangling node. [6]

We know that on the Web there are plenty of dangling nodes, and whenever a RW enters a dangling node it gets caught, as there is no out-link from it; this corresponds to a 0^T row of H, which makes the matrix sub-stochastic. In order to make it a stochastic matrix, the 0^T rows of H are replaced with (1/n)e^T, producing a stochastic matrix S. In this way the stochasticity adjustment is achieved, and as a result the RW, after entering a dangling node, can now hyperlink to any page at random.

Mathematically, the stochastic fix reveals that S is created from a rank-one update to H. That is, S is a combination of the original raw H and a rank-one matrix, (1/n)ae^T:

$$S = H + a\left(\frac{1}{n}e^T\right), \qquad (1.4)$$

where a is a binary vector, called the dangling node vector, with a_i = 1 if page i is a dangling node and a_i = 0 otherwise.

The consequence of the stochasticity adjustment is that it guarantees S to be stochastic, and hence S is the transition probability matrix of a Markov chain.
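A minimal sketch of the stochasticity fix of eq. 1.4 (the matrix is a toy example; page 3 is the dangling node):

```python
import numpy as np

# Raw hyperlink matrix with a dangling node (row 3 is all zeros).
H = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])
n = H.shape[0]

a = (H.sum(axis=1) == 0).astype(float)   # dangling node vector: a_i = 1 iff row i is 0^T
S = H + np.outer(a, np.ones(n) / n)      # eq. 1.4: S = H + a(1/n)e^T

print(S.sum(axis=1))                     # every row of S now sums to 1
```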

1.5.2 Primitivity adjustment

The stochasticity adjustment alone cannot guarantee the desired convergence results. For a unique positive PR vector π^T to exist, and for eq. 1.3 of the model to converge to this π^T quickly, a primitivity adjustment is needed.

A primitivity adjustment is the adjustment to BP's original basic PR model that artificially adds direct, small-weight connections between every page on the Web; it guarantees the existence and the uniqueness of the PR vector and the convergence of the power method to that vector. [6]

The stochasticity adjustment makes H stochastic, and the primitivity fix now makes the matrix both stochastic and primitive. We know that a primitive matrix is both irreducible and aperiodic, which are the desired Markov properties for the basic PR model. This implies that the PR vector, which is the stationary vector of a Markov chain, exists, is unique, and can be found by simple power iteration.

We here need to mention the notion of a RW for the primitivity fix.

A RW following the hyperlink structure of the web may sometimes get bored, abandon the hyperlink method of surfing, and instead teleport to a new page, for example by entering a new web address in the browser's URL line.

After teleporting, the RW begins hyperlink surfing again, until the next teleportation, and so on.

This activity can be modeled mathematically by introducing the teleportation matrix E, the result of which is the Google matrix G:

$$G = \alpha S + (1-\alpha)E, \qquad (1.5)$$

where:

E is the teleportation matrix, a completely dense, rank-one matrix;

S is the stochastic matrix, a sparse, stochastic, but most likely reducible matrix;

G is the Google matrix, a completely dense, stochastic, and primitive matrix;

α is a scaling parameter between 0 and 1, 0 < α < 1. This parameter controls the proportion of the time a RW follows the hyperlinks as opposed to teleporting. If we let α = 0.8, then 80% of the time a RW follows the hyperlink structure of the web, while the remaining 20% of the time the RW teleports to a random new page.

Note that the teleportation matrix E = (1/n)ee^T is uniform, which means that, on teleporting, a RW is equally likely to jump to any page. Eq. 1.5 now becomes:

$$G = \alpha S + (1-\alpha)\frac{1}{n}ee^T. \qquad (1.6)$$

The Google matrix G now has the desired properties, although some of them are in some sense artificial, which we briefly summarize:

• G is stochastic. It is the convex combination of the two stochastic matrices S and E.

• G is irreducible. Irreducibility is trivially enforced by connecting every page directly to every other page.

• G is aperiodic. G_ii > 0 for all i implies self-loops, which create aperiodicity.

• G is primitive. G^k > 0 for some k implies that a unique positive vector π^T exists, and the power method applied to G is guaranteed to converge to π^T.

• G is completely dense. As E is completely dense, so is G. But G can be written as a rank-one update to the very sparse matrix H; eq. 1.6 can be written as:

$$G = \alpha H + (\alpha a + (1-\alpha)e)\frac{1}{n}e^T. \qquad (1.7)$$

• G is artificial. Recall that the raw H was twice modified to get the required convergence properties, because a stationary vector does not exist for H; in this sense G is artificial, and with these two modifications a unique π^T exists for G.

In summary, when the power method is applied to G, Google’s adjusted PR method becomes:

$$\pi^{(k+1)T} = \pi^{(k)T} G.$$
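A compact sketch of this adjusted power method, written via the rank-one form of eq. 1.7 so that only the sparse H is ever multiplied (the four-page toy web with a dangling node is again invented for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy web: 0->{1,2}, 1->{2}, 2->{0}, and page 3 is dangling.
H = csr_matrix(np.array([[0.0, 0.5, 0.5, 0.0],
                         [0.0, 0.0, 1.0, 0.0],
                         [1.0, 0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 0.0]]))
n = H.shape[0]
a = np.array([0.0, 0.0, 0.0, 1.0])       # dangling node vector
alpha = 0.85

# pi G = alpha*(pi H) + (alpha*(pi.a) + (1-alpha))*(1/n)e^T, using pi e = 1 (eq. 1.7).
pi = np.full(n, 1.0 / n)
for _ in range(50):
    pi = alpha * (pi @ H) + (alpha * (pi @ a) + (1.0 - alpha)) / n

print(pi, pi.sum())                      # a positive vector summing to 1
```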

1.6 Computation of PR vector

In this section, we will analyze the computation of the PR vector.

The PR problem can be stated either as an eigenvector problem or a linear system problem.

1.6.1 Eigenvector system

The author in [6] argues that much attention has been given to the eigenvector problem, because BP originally conceived the PR problem as an eigenvector problem.

The problem statement is to solve the eigenvector problem for the PR vector:

$$\pi^T = \pi^T G,$$

subject to the normalization equation π^T e = 1.

Recall that π^T is the stationary vector of a Markov chain with transition matrix G. Since G is a stochastic matrix, its dominant eigenvalue is λ_1 = 1. So the goal in this eigenvector system is to find the normalized left-hand eigenvector of the Google matrix corresponding to the dominant eigenvalue λ_1. The normalization equation π^T e = 1 ensures that π^T is a probability vector.

1.6.2 Computation

Recall that the PR problem is the world's largest matrix computation, of the order of billions, and therefore advanced and computationally efficient methods must be used. Some of the available numerical iterative methods are the power method, Jacobi's method, the Gauss-Seidel method, etc. Specific features of the Google PR matrix make the power method one of the best choices.⁴

1.6.3 Power method

The power method is one of the oldest and simplest iterative methods. The author in [6] states that it goes back at least to 1913, and that in the 1960s it became the standard method for finding the eigenvalues and eigenvectors of a matrix with a digital computer. The required stationary vector of a Markov chain, which is the dominant left-hand eigenvector of the chain's transition matrix, can therefore be found by the power method.

Reasons using the method

There are several good reasons to use the power method. It is one of the simplest methods to implement and program, and it is storage-friendly as well. Like many other iterative methods, it is a matrix-free method, and the power method applied to the completely dense matrix G can be expressed in terms of the very sparse matrix H, as shown below, which is computationally a good thing.

Putting the value of S from eq. 1.4 into eq. 1.6, we get:

$$G = \alpha H + (\alpha a + (1-\alpha)e)\frac{1}{n}e^T.$$

Another reason is the small number of iterations it requires. We further examine these reasons here:

Method simplicity

As stated earlier, implementation and computer programming of the power method are elementary, and the power method applied to the completely dense matrix G can be expressed in terms of the very sparse matrix H.

⁴It is equally valid for other iterative methods, the Jacobi method in particular, and other methods.

Moreover, since H is very sparse, each vector-sparse matrix multiplication reduces to O(n) computations, as opposed to O(n²) if it were a vector-dense matrix multiplication.

Matrix-free method

Direct methods manipulate elements of the matrix during each step of the vector-matrix multiplication. Even though H is very sparse, its enormous size (of the order of 8.1 billion pages as of 2006 [6]) and lack of structure prohibit the use of direct methods. But like many other iterative methods, the power method is a matrix-free method. A matrix-free method for solving a linear system of equations or an eigenvalue problem does not store the coefficient matrix explicitly, but accesses the matrix by evaluating matrix-vector products.

Storage-friendly method

Recall that H is very sparse, but due to its enormous size, storage memory is a serious concern. In the power method, the sparse matrix H, the dangling node vector a, and the current iterate π^(k)T (which is completely dense) need to be stored, but this storage requirement is still much lower than that of other methods.

Number of iterations

The author in [6] states that BP confirmed in their 1998 research paper that about 50 iterations are needed for convergence to a satisfactory approximation of the exact PR vector.

Recall that H is very sparse, so each iteration of the power method requires O(n) effort; 50 iterations then require 50·O(n) work in total, which is a linear computational effort, and it is perhaps rare to find other algorithms with such a low computational effort.

The author in [6] attributes the roughly 50 iterations needed for convergence to the asymptotic rate of convergence.

Theorem 1.6.1 (Asymptotic rate of convergence, [6]). The asymptotic rate of convergence of the PR power method is the rate at which α^k → 0.

For the proof, we refer the readers to [6]. The conclusion is that α controls the asymptotic rate of convergence of the PR power method.

BP used the value α = 0.85, and it is the value still used (as of 2006) by Google; α⁵⁰ = 0.85⁵⁰ ≈ 0.000296 implies that at the 50th iteration one can expect roughly 2-3 decimal places of accuracy in the approximate PR vector. [6]

1.7 Ways of modifying the model

In this section, we will review various ways of modifying a basic PR model, analyzed in [6], and consider the implications of changes to the parameters: α, H and E.

TABLE 1.1: Values of α vs the number of iterations required

α      Number of iterations
0.5    34
0.75   81
0.8    104
0.85   142
0.9    219
0.95   449
0.99   2292
0.999  23015
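The entries in Table 1.1 follow directly from the asymptotic rate: iterating until α^k falls below the tolerance 10⁻¹⁰ requires k = ⌈log(10⁻¹⁰)/log α⌉ iterations. A short sketch reproducing the table:

```python
import math

tol = 1e-10
for alpha in (0.5, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.999):
    k = math.ceil(math.log(tol) / math.log(alpha))   # smallest k with alpha**k < tol
    print(alpha, k)                                   # 34, 81, 104, 142, 219, 449, 2292, 23015
```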

1.7.1 Scaling parameter

Recall that 0 < α < 1. α = 1 means a RW follows the hyperlink structure of the web exclusively, while α = 0 means a RW abandons the hyperlink structure of the web and only teleports. α = 0.85 means that 85% of the time a RW follows the hyperlink structure of the web and the remaining 15% of the time the RW teleports to a random new page; α thus clearly controls the priority given to the Web's natural hyperlink structure as opposed to the artificial teleportation. The author in [6] reports that, for the selected dataset, for the power method to converge to a tolerance of 10⁻¹⁰, which may be needed to distinguish between elements of the PR vector, only about 34 iterations are expected when α = 0.5. As α → 1, the expected number of iterations required increases dramatically, as shown in Table 1.1 for the selected dataset [6].

When α = 0.85, which Google uses, about 142 iterations are expected, but this requires several days of computation, due to the scale of the matrices and vectors involved. This means that a delicate balancing act is needed: as α → 1 the artificiality of teleportation reduces, yet the computation increases dramatically. So setting α = 0.85 is a workable compromise between efficiency and effectiveness.

The dynamic nature of the Web makes sensitivity an important issue. The value of α also affects the sensitivity of the resulting PR vector. As α → 1 the PageRankings become much more volatile, and fluctuate noticeably for even small changes in the structure of the Web. But we need to produce PRs that are stable despite such small changes.

1.7.2 Hyperlink matrix

By changing H, we can also modify the basic PR model. Recall that in the basic PR model a uniform weighting scheme is used for filling in the elements of H, which means that all out-links from a page are given equal weights in terms of a RW's hyperlinking probabilities. The author in [6] argues that while fair, democratic, and easy to implement, equality may not be best for webpage ranking. For example, a walker might prefer to select a new page by choosing out-linking pages with a lot of valuable content. In this case, an intelligent walker description would be more suitable than a RW description.

An intelligent walker can, for example, have preferred pages, and the out-linking probabilities in the rows of H can be adjusted accordingly.

1.7.3 Teleportation matrix

We can also modify the basic PR model by changing E. Recall that E = (1/n)ee^T, which means that it is a uniform matrix. If we let V^T = (1/n)e^T, then E becomes E = eV^T. Since V^T > 0, every node is still directly connected to every other node, and G thus remains primitive, which means that a unique stationary vector for the Markov chain exists, which is the PR vector. Using a general V^T in place of (1/n)e^T means that the teleportation probabilities are no longer uniformly distributed; on teleporting, a RW now follows the non-uniform probability distribution given in V^T to jump to the next page. This small modification retains all the advantageous properties of the power method, except that the PR vector itself is changed, because different personalization vectors produce different PageRankings; that is, the PR vector π^T(V^T) is a function of V^T.

The benefit of this slight modification is to introduce a personalized PR vector reflecting personal preferences regarding pages and topics on the web, and [6] speculates that Google can use this personalization vector to control spamming done by link farms. The conclusion is that G depends on three specific parameters, α, H and V^T, and that these parameters can be used to modify the PR model.
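A small sketch of the effect of the personalization vector (the toy S and the biased v below are invented for illustration):

```python
import numpy as np

# A stochastic matrix S for a four-page toy web (dangling row already fixed).
S = np.array([[0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.25, 0.25, 0.25, 0.25]])

def pagerank(S, v, alpha=0.85, iters=100):
    # With E = e v^T, the iteration is pi^(k+1)T = alpha*pi^(k)T S + (1-alpha) v^T.
    pi = np.full(len(v), 1.0 / len(v))
    for _ in range(iters):
        pi = alpha * (pi @ S) + (1 - alpha) * v
    return pi

print(pagerank(S, np.full(4, 0.25)))                # uniform teleportation
print(pagerank(S, np.array([0.7, 0.1, 0.1, 0.1])))  # personalization favoring page 0
```

As the section notes, different choices of v shift the resulting rankings.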

1.8 Accelerating the computation

In this section, we will review the analysis in [6] about speeding up the computation of a PR.

Recall that the standard power method takes days to converge, so it is essential to find ways to speed up the computation of PR; otherwise it will take weeks, because the Web is growing rapidly.

The classical power method is known for its slow convergence. As each iteration of the power method on a Web-sized matrix is so expensive, reducing the number of iterations by even a handful can save hours of computation. The author in [6] argues that there are just two ways to reduce the work involved in any iterative method: one is to reduce the work per iteration, while the other is to reduce the total number of iterations. But these two goals are often in tension with one another: reducing the number of iterations usually slightly increases the work per iteration, and vice versa. However, [6] also suggests that as long as this overhead is minimal, the proposed acceleration is considered beneficial.

Some of the successful methods for reducing the work are the adaptive power method, extrapolation, and aggregation.


1.8.1 Adaptive power method

The author in [6] maintains that the adaptive power method makes a practical contribution to PR acceleration by attempting to reduce the work per iteration required by the power method, providing a modest speedup in the computation of the PR vector. The author in [6], however, argues that there are some open theoretical issues with the algorithm. For instance, there is no proof regarding convergence of the algorithm; it may or may not converge. And even if it does converge, it is not clear whether the algorithm converges to the true PR values or to some gross approximation of them.

1.8.2 Extrapolation method

The author in [6] maintains that the extrapolation method aims to reduce the number of iterations. Since the extrapolation step requires additional computation, it should only be applied periodically. The basic method, referred to as Aitken extrapolation, gives only modest speedups, so it is suggested in [6] that improved extrapolation is required. One improvement is quadratic extrapolation, which, while more complicated, is based on the same idea as Aitken extrapolation. The author in [6] claims that, on the (selected) tested datasets, quadratic extrapolation reduces PR computation time quite considerably, with minimal overhead. Even so, quadratic extrapolation is expensive and can be done only periodically, suggests [6].
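For concreteness, here is a hedged sketch of a componentwise Aitken Δ² step applied to three successive PR iterates; it illustrates the classical Aitken idea only, not the exact scheme analyzed in [6]:

```python
import numpy as np

def aitken_step(x0, x1, x2, eps=1e-12):
    # Classical componentwise Aitken delta-squared extrapolation:
    #   x* ~ x0 - (x1 - x0)^2 / (x2 - 2*x1 + x0)
    denom = x2 - 2.0 * x1 + x0
    out = x2.copy()                       # fall back to the newest iterate
    ok = np.abs(denom) > eps              # guard against division by ~0
    out[ok] = x0[ok] - (x1[ok] - x0[ok]) ** 2 / denom[ok]
    return out / out.sum()                # renormalize to a probability vector
```

The extrapolated vector would then be used as the next starting iterate, with the extrapolation applied only occasionally because of its extra cost.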

1.8.3 Aggregation method

The author in [6] maintains that the aggregation method often reduces both the number of iterations required and the work per iteration, thereby accelerating the computation of the PR. One very promising method, called BlockRank, is an aggregation method that lumps sections of the Web together by host. BlockRank is actually just classic aggregation applied to the PR problem. This method gives an approximation to the true PR vector that the power method computes.

1.9 Updating a PR vector

In this section, we review the analysis in [6] regarding the issues of updating a PR vector.

"Updating" a PR refers to the process of computing a new PR vector after changes have been made to the Web's graph structure. The author in [6] reports that Google updates its PR vector on a monthly basis, and that this process is called the Google Dance, because webpages dance up and down the rankings during the three days of updating computations.

PR Updating: an open challenge

The author in [6] argues that updating PR is a very challenging task, both mathematically and computationally.

There are two kinds of updating problems. In a link-updating problem, only the links among existing webpages change, so the size of the matrix does not. In a page-updating problem, webpages themselves may be added to or removed from the Web, which means that states are added to or removed from the Google Markov chain and the size of the Google matrix changes. Hence the page-updating problem is more difficult, deduces [6].

The theoretical answers to updating PR vectors are both exact and approximate. The exact link-updating formulas are useful only when a row or two is changed and no pages are added or deleted; they are not computationally practical for making more general updates, and thus, because of the dynamics of the Web, they are virtually useless for updating PR. The author in [6] claims that no theoretical or practical solutions for the page-updating problem for a Markov chain exist. In light of the dynamics of the Web, updating PR is an open challenge!

So it appears that starting the power method from scratch is perhaps the only alternative for the PR updating problem, and [6] confirms that a Google spokesperson at the annual SIAM meeting in 2002 reported that restarting this month's power method with last month's vector seemed to provide no improvement.

However, if instead of aiming for the exact value of the updated stationary distribution we settle for an approximation, then the door opens wider. With an approximation approach based on state aggregation, Google's PR can be estimated. The author in [6] states that state aggregation is a class of approximate aggregation techniques used to estimate the stationary distributions of nearly uncoupled chains.


Chapter 2

PageRank variants

In this chapter, we will review the analysis in [4] of the PR variants. We will explore some versions of PR, related to diverse kinds of random walks with distinct semantic connotations. We will later deduce that, though all of them differ semantically, they can still be computed using the very same algorithm, which shows the equivalence of these algorithms.

2.1 Variants

2.1.1 Traditional PR

Recall that one of the many interpretations of the random walk concept of a traditional PR (TPR) is the probability that a knowledgeable¹ but mindless² random walker (RW) will enter a given webpage. Upon entering a page, if the page has no outgoing links, the RW jumps to any page with uniform probability. If there are outgoing links, the RW selects one of them with uniform probability and reads the selected webpage. The author in [4] argues that there is a fixed boring probability α, on any page, that the RW gets bored reading that page; if the RW gets bored on the given page, the RW jumps to any other webpage with uniform probability.

In other words, a TPR represents a RW travelling at a uniform pace through the network.

2.1.2 Lazy PR

The author in [4] argues that a Lazy-random-walk PR, or simply a Lazy PR (LPR), describes a RW that, before choosing the next page to visit, first tosses a coin: upon heads the RW visits the next page, and upon tails the RW stays on the current webpage. So there is a 50% chance of visiting the next webpage and a 50% chance of staying on the current webpage.

In other words, an LPR represents the distribution of RWs that may stay longer at a given page than just a single unit of time.

¹Knowledgeable means that a RW knows the addresses of all the webpages.

²Mindless means that the RW chooses where to go at random, without regard to page content.


2.1.3 Generalized-Lazy PR

The author in [4] argues that, with a Generalized-Lazy-random-walk PR, or simply a Generalized-lazy PR (GLPR), we can generalize the behavior of an LPR by introducing a laziness degree γ: a RW, before choosing the next page to visit, first tosses an unfair coin; with the tail probability γ the RW stays on the current webpage, while with the head probability 1−γ the RW visits the next page.

In other words, a GLPR allows for the simulation of leaning towards either jumping or reading of a webpage.

2.1.4 Backstep PR

The author in [4] argues that a Random-walk-with-Backstep PR, or simply a Backstep PR (BPR), describes a RW which, instead of visiting the next page, chooses with some probability β to go back to the previous webpage.

In other words, a BPR refers to a RW that may withdraw from a step forward if the RW finds the page boring.

2.2 Theoretical results

In this section we will deduce the theoretical results for the variants LPR, GLPR and BPR, and establish their equivalence with TPR.

The behavior of a TPR, π^(t), can be expressed by the equation:

$$\pi^{(t)} = (1-\alpha)S\pi^{(t)} + \alpha V. \qquad (2.1)$$

The solution to eq. 2.1 is its principal eigenvector, denoted π^(t)(S, V, α). Let us now make a comparison between the behaviors of an LPR and a TPR.

2.2.1 Lazy vs Traditional PR

The behavior of an LPR, π^(l), can be expressed as:

$$\pi^{(l)} = (1-\alpha)(0.5I + 0.5S)\pi^{(l)} + \alpha V, \qquad (2.2)$$

where I denotes the identity matrix. The solution to eq. 2.2 is its principal eigenvector, denoted π^(l)(S, V, α). Eq. 2.2 can be transformed into:

$$\pi^{(l)} = \frac{1-\alpha}{1+\alpha}S\pi^{(l)} + \frac{2\alpha}{1+\alpha}V. \qquad (2.3)$$

By comparing the transformed eq. 2.3 with the TPR eq. 2.1, we see that π^(l) for α is the same as π^(t) for 2α/(1+α), where 0 < 2α/(1+α) < 1, which leads us to the following theorem:

Theorem 2.2.1 (Lazy versus Traditional PR, [4]). The relationship between a TPR and an LPR is described by:

$$\pi^{(l)}(S, V, \alpha) = \pi^{(t)}\left(S, V, \frac{2\alpha}{1+\alpha}\right),$$

where 0 < 2α/(1+α) < 1.

Let us now make a comparison between the behaviors of a GLPR and a TPR.

2.2.2 Generalized-lazy vs Traditional PR

The behavior of a GLPR, π^(g), can be expressed as:

$$\pi^{(g)} = (1-\alpha)(\gamma I + (1-\gamma)S)\pi^{(g)} + \alpha V, \qquad (2.4)$$

where γ denotes the laziness degree. The solution to eq. 2.4 is its principal eigenvector, denoted π^(g)(S, V, α, γ).

The GLPR eq. 2.4 can be transformed into:

$$\pi^{(g)} = \frac{(1-\alpha)(1-\gamma)}{1-\gamma+\gamma\alpha}S\pi^{(g)} + \frac{\alpha}{1-\gamma+\gamma\alpha}V. \qquad (2.5)$$

By comparing eq. 2.5 with eq. 2.1, we observe that π^(g) for α is the same as π^(t) for α/(1−γ+γα), where 0 < α/(1−γ+γα) < 1, which leads us to the following theorem:

Theorem 2.2.2 (Generalized-lazy versus Traditional PR, [4]). The relationship between a TPR and a GLPR is described by:

$$\pi^{(g)}(S, V, \alpha, \gamma) = \pi^{(t)}\left(S, V, \frac{\alpha}{1-\gamma+\gamma\alpha}\right),$$

where 0 < α/(1−γ+γα) < 1.

2.2.3 Backstep vs Traditional PR

This section is based on [5].

Recall that a BPR refers to a RW that may withdraw from a forward step if the RW finds the page boring. In BPR, a RW proceeds as follows: the RW either chooses to go backwards with some probability β, or jumps to any page with probability α, or visits one of the child pages with the remaining probability 1−β−α. We will consider two cases of BPR here: one deals with going backwards only one step before going forward again (BPR-single backstep), while the other deals with multiple backsteps (BPR-multiple backsteps).

BPR-single backstep

Let us first denote the stationary distribution under this behavior as π^(b1)(S, V, α, β). If we assume that after going back one step the RW does not step back further, then momentarily β becomes zero.

In BPR-single backstep (BPR-s), the PR π^(b1)_j of page j consists of the authority p_j from its parents and the authority c_j from its children. That is:

$$\pi^{(b1)}_j = p_j + c_j. \qquad (2.6)$$

In the next backstep, if page j gives away βp_j to its parents, then the children get (1−β)p_j + c_j. The children then give back β((1−β)p_j + c_j), and for a stationary distribution one must have c_j = β((1−β)p_j + c_j), or:

$$c_j = \beta p_j. \qquad (2.7)$$

So eq. 2.6 becomes π^(b1)_j = p_j + βp_j, or:

$$p_j = \frac{\pi^{(b1)}_j}{1+\beta}. \qquad (2.8)$$

Putting the value of p_j from eq. 2.8 into eq. 2.7, we get:

$$c_j = \frac{\beta}{1+\beta}\pi^{(b1)}_j. \qquad (2.9)$$

In other words, the children get (1−β)p_j + c_j = (1−β)p_j + βp_j = p_j = π^(b1)_j/(1+β). Out of the amount π^(b1)_j/(1+β), the quantity απ^(b1)_j is distributed over the network by the boring jump, while the remaining authority assigned to the real children in a walk is:

$$\frac{\pi^{(b1)}_j}{1+\beta} - \alpha\pi^{(b1)}_j = \left(\frac{1}{1+\beta} - \alpha\right)\pi^{(b1)}_j.$$

Concluding:

$$p = S\left(\frac{1}{1+\beta} - \alpha\right)\pi^{(b1)} + \alpha V \quad \text{and} \quad c = \frac{\beta}{1+\beta}\pi^{(b1)}.$$

Then:

$$\pi^{(b1)} = p + c = \left(\frac{1}{1+\beta} - \alpha\right)S\pi^{(b1)} + \alpha V + \frac{\beta}{1+\beta}\pi^{(b1)}. \qquad (2.10)$$

Eq. 2.10 can be transformed into:

$$\pi^{(b1)} = (1-\alpha(1+\beta))S\pi^{(b1)} + \alpha(1+\beta)V. \qquad (2.11)$$

Comparing eq. 2.11 and eq. 2.1, we observe that a BPR-s with a boring factor α corresponds to a TPR with a boring factor α(1+β), where 0 < α(1+β) < 1.

Theorem 2.2.3 (BPR-single backstep versus Traditional PR, [5]). The relationship between a BPR-s and a TPR is described by:

$$\pi^{(b1)}(S, V, \alpha, \beta) = \pi^{(t)}(S, V, \alpha(1+\beta)),$$

where 0 < α(1+β) < 1.

Eq. 2.10 can also be transformed into:

$$\pi^{(b1)} = (1-\alpha)\left(\frac{\beta}{(1+\beta)(1-\alpha)}I + \frac{1-(1+\beta)\alpha}{(1+\beta)(1-\alpha)}S\right)\pi^{(b1)} + \alpha V. \qquad (2.12)$$

Comparing the coefficients of eq. 2.12 and eq. 2.4, we get γ = β/((1+β)(1−α)). This means that a BPR-s also corresponds to a GLPR:

Theorem 2.2.4 (BPR-single backstep versus Generalized-lazy PR, [5]). The relationship between a BPR-s and a GLPR is described by:

$$\pi^{(b1)}(S, V, \alpha, \beta) = \pi^{(g)}\left(S, V, \alpha, \frac{\beta}{(1+\beta)(1-\alpha)}\right),$$

where 0 < β/((1+β)(1−α)) < 1.

BPR-multiple backsteps

Let us denote the stationary distribution under a BPR-multiple backsteps (BPR-m) as π^(b)(S, V, α, β). We require β < 1−β, i.e. β < 0.5, meaning that a RW cannot go backwards more often than it goes forward. In the case of multiple backsteps in a row, both p_j and c_j are subject to backstep. In the next step, if page j gives away the authority βπ^(b)_j to its parents, then its children get (1−β)π^(b)_j, where π^(b)_j = p_j + c_j, or:

$$\pi^{(b)} = c + p. \qquad (2.13)$$

The children then give back β(1−β)π^(b)_j. The author in [5] argues that while the authority passes down to the children, an eventual backstep may occur many steps away, providing probability masses of β²(1−β)²π^(b)_j, β³(1−β)³π^(b)_j, and so on. For a stable state, the probability mass lost by the parents has to be provided by the children, so upon stationary distribution one must have c_j = βπ^(b)_j. Hence, concluding:

$$p = (1-\alpha-\beta)S\pi^{(b)} + \alpha V, \qquad (2.14)$$

and

$$c = \beta\pi^{(b)}. \qquad (2.15)$$

Putting the values of p and c from eq. 2.14 and eq. 2.15 into eq. 2.13, we get:

$$\pi^{(b)} = (1-\alpha-\beta)S\pi^{(b)} + \alpha V + \beta\pi^{(b)}. \qquad (2.16)$$

Eq. 2.16 can be transformed into:

$$\pi^{(b)} = \left(1 - \frac{\alpha}{1-\beta}\right)S\pi^{(b)} + \frac{\alpha}{1-\beta}V. \qquad (2.17)$$

Comparing eq. 2.17 and eq. 2.1, we observe that a BPR-m with a boring factor α corresponds to a TPR with a boring factor α/(1−β), where 0 < α/(1−β) < 1.

Theorem 2.2.5 (BPR-multiple backsteps versus Traditional PR, [5]). The relationship between a BPR-m and a TPR is described by:

$$\pi^{(b)}(S, V, \alpha, \beta) = \pi^{(t)}\left(S, V, \frac{\alpha}{1-\beta}\right),$$

where 0 < α/(1−β) < 1.


2.2.4 Conclusion

Based on the above theoretical results, we can now conclude that the behaviors of the PR variants are semantically quite distinct, but mathematically they can all be reduced to a single form by adjusting the boring factor. This means that we need only one version of the algorithm for the computation in each case.
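These equivalences are easy to check numerically. A sketch that solves each variant's fixed-point equation directly and compares it with the corresponding TPR (the 3-state S and the parameter values are invented test data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
S = rng.random((n, n)); S /= S.sum(axis=0)    # column-stochastic S, as in pi = (1-a)S pi + aV
V = np.full(n, 1.0 / n)
I = np.eye(n)

def tpr(a):
    return np.linalg.solve(I - (1 - a) * S, a * V)          # eq. 2.1

alpha, gamma, beta = 0.15, 0.3, 0.2

lpr  = np.linalg.solve(I - (1 - alpha) * (0.5 * I + 0.5 * S), alpha * V)             # eq. 2.2
glpr = np.linalg.solve(I - (1 - alpha) * (gamma * I + (1 - gamma) * S), alpha * V)   # eq. 2.4
bpr1 = np.linalg.solve((1 - beta / (1 + beta)) * I - (1 / (1 + beta) - alpha) * S,
                       alpha * V)                                                    # eq. 2.10
bprm = np.linalg.solve((1 - beta) * I - (1 - alpha - beta) * S, alpha * V)           # eq. 2.16

assert np.allclose(lpr,  tpr(2 * alpha / (1 + alpha)))               # Theorem 2.2.1
assert np.allclose(glpr, tpr(alpha / (1 - gamma + gamma * alpha)))   # Theorem 2.2.2
assert np.allclose(bpr1, tpr(alpha * (1 + beta)))                    # Theorem 2.2.3
assert np.allclose(bprm, tpr(alpha / (1 - beta)))                    # Theorem 2.2.5
print("all variant-TPR equivalences hold numerically")
```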


Chapter 3

PageRank expressions

In this chapter, we will find PR expressions for some common graph structures, with the help of the analysis in [1] and [2]. Note that these PR expressions represent a non-normalized (NN) PR, π′; to get a normalized PR, π, we need to normalize the result to sum to one. That is:

$$\pi = \frac{\pi'}{\|\pi'\|}.$$

3.1 Line-graph

A simple-line-graph link structure will simply be referred to as a line-graph.

Definition 3.1.1. A line-graph is a graph with n nodes, where node n links to node n−1, which in turn links to node n−2, all the way until node 2 links to node 1. [1]

As an example, a simple-line graph with 5 nodes is shown in Fig. 3.1.

FIGURE 3.1: A simple line-graph with 5 nodes.

Refer to Fig. 3.1: starting from node 5, the probability of getting to node 4, which it links to, is α, and then α² for node 3, and so on. The PR (NN) for a simple line-graph with 5 nodes is found to be:

$$\pi' = \begin{pmatrix} 1+\alpha+\alpha^2+\alpha^3+\alpha^4 & 1+\alpha+\alpha^2+\alpha^3 & 1+\alpha+\alpha^2 & 1+\alpha & 1 \end{pmatrix}^T.$$
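This closed form can be checked numerically. A sketch, assuming the non-normalized PR solves π′ = e + αHᵀπ′ (each node starts with weight 1 and receives α-weighted shares from its in-links, which is consistent with the values above):

```python
import numpy as np

alpha, n = 0.85, 5

# Line-graph: node k links to node k-1 (0-indexed); node 0 is dangling.
H = np.zeros((n, n))
for k in range(1, n):
    H[k, k - 1] = 1.0          # row-stochastic: each non-dangling node has one out-link

# Non-normalized PR: pi' = e + alpha * H^T pi'  =>  (I - alpha H^T) pi' = e
pi = np.linalg.solve(np.eye(n) - alpha * H.T, np.ones(n))

expected = [sum(alpha ** j for j in range(n - i)) for i in range(n)]
print(pi)                      # matches (1+a+a^2+a^3+a^4, ..., 1+a, 1)
assert np.allclose(pi, expected)
```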

Let us now modify a line-graph to see the changes in its PR.

3.2 Modified-line-graph

We can modify a simple line-graph in various ways, some of which are as follows:


3.2.1 Adding a link

We can, for example, add a link in the line-graph with 5 nodes in such a way that node 1 links to node 2, as shown in Fig. 3.2.

FIGURE 3.2: A line-graph with 5 nodes, where node 1 links to node 2.

The PR (NN) of this modified line-graph is found to be:

$$\pi' = \begin{pmatrix} c+c\alpha+c\alpha^2+c\alpha^3+c\alpha^4 & c+2c\alpha+c\alpha^2+c\alpha^3 & 1+\alpha+\alpha^2 & 1+\alpha & 1 \end{pmatrix}^T.$$

The author in [1] argues that here

$$c = \sum_{k=0}^{\infty}\alpha^{2k} = \frac{1}{1-\alpha^2}$$

represents the sum of ALL the probabilities of getting from node 1 or node 2 back to itself.

3.2.2 Removing a link

We can, for example, remove the link between node 3 and node 2 in the line-graph with 5 nodes, ending up with two disjoint line-graphs, as shown in Fig. 3.3.

FIGURE 3.3: A line-graph where the link between node 3 and node 2 is removed.

The PR (NN) for this modified line-graph is found to be:

$$\pi' = \begin{pmatrix} 1+\alpha & 1 & 1+\alpha+\alpha^2 & 1+\alpha & 1 \end{pmatrix}^T.$$

3.2.3 Linking an external node

We can, for example, link an external node to the line-graph with 5 nodes, in such a way that external node 6 from the outside is linked to internal node 3, as shown in Figure 3.4.

FIGURE 3.4: A line-graph where external node 6 is linked to internal node 3.

Note that the value of the PR (NN) of the external node 6 equals 1, for the obvious reason that it is alone on the outside. The author in [1] generalizes the expression for the PR of such a modified line-graph by formulating the following theorem:

Theorem 3.2.1 (PR of a line-graph linking an external node, [1]). Assuming a uniform weight vector, in a line-graph where an external node links to one of its internal nodes, say node j, the PR of an internal node i is given by:

$$\pi'_i = \sum_{k=0}^{n-i}\alpha^k + b_{ij} = \frac{1-\alpha^{n-i+1}}{1-\alpha} + b_{ij},$$

where

$$b_{ij} = \begin{cases} \alpha^{j-i+1} & j \ge i \\ 0 & j < i \end{cases}$$

The value of the PR of the external node is 1.

For the proof we refer the readers to [1].

3.3 Complete-graph

By a complete-graph-link-structure, we mean a complete-digraph.

A complete-digraph is a directed graph in which every pair of distinct vertices is connected by a pair of unique edges, one in each direction.

For simplicity, we will refer to a complete-digraph simply as a complete-graph.

Definition 3.3.1. A graph in which every pair of vertices is joined by an edge is called complete, denoted by K_n, where n is the number of vertices. [8]

FIGURE 3.5: A complete-graph with 5 nodes.

We observe that there is no dangling node in a complete-graph. Hence, for the complete-graph with 5 nodes in Fig. 3.5, the normalized PR of each node will obviously be π = 1/5. That is, every node influences the PR of a complete-graph equally. But when a complete-graph is linked to from the outside, its nodes will not share their PRs with the outside, because they only point to each other; in this way, according to [1], a complete-graph is similar to a dangling node when linked to/from the outside.

The PR (NN) for each node of a complete-graph with 5 nodes is found to be:

$$\pi' = \frac{1}{1-\alpha}.$$

The author in [1] generalizes the expression for the PR of a complete-graph by formulating the following theorem:

Theorem 3.3.1 (PR of a complete-graph, [1]). Assuming a uniform weight vector, the PR of a node of a complete-graph is given by:

$$\pi' = \frac{1}{1-\alpha}.$$

For the proof, we refer the readers to [1]. We note in the theorem that all the probability from a node in a complete-graph is distributed within the graph, and that the size of the graph, n, is irrelevant for the PR (NN) of each node, as the formula is independent of n.
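A quick standalone check of Theorem 3.3.1 for K₅, under the same assumed non-normalized convention π′ = e + αHᵀπ′:

```python
import numpy as np

alpha, n = 0.85, 5

# Complete-digraph K_5: each node links to the other n-1 nodes with weight 1/(n-1).
H = (np.ones((n, n)) - np.eye(n)) / (n - 1)

pi = np.linalg.solve(np.eye(n) - alpha * H.T, np.ones(n))
print(pi)                                   # every entry equals 1/(1-alpha)
assert np.allclose(pi, 1.0 / (1.0 - alpha))
```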

3.3.1 Modifications

The authors in [1] and [2] extensively analyze and formulate PRs for modifications to a complete-graph, for example: adding a link from an external node to one of the nodes in a complete-graph; adding a link to an external node from one of the nodes in a complete-graph; adding a link from one of the nodes of a line-graph to one of the nodes in a complete-graph; and merging a line-graph and a complete-graph through a common node. We refer the readers to these references for further reading.

3.4 Bipartite graphs

We first describe a bipartite graph and a complete-bipartite graph and then find their PRs.

3.4.1 Bipartite graph (Bi-graph)

Definition 3.4.1. A graph G = (X, E) is called bipartite if its vertex set X can be partitioned into two disjoint sets X1 and X2, called parts, in such a way that every edge connects vertices from different sets. [8]

If G = (X, E) is a bipartite graph, it is convenient to write it as G = (X1, X2; E), where X1 is its left part and X2 its right part, as shown in Fig. 3.6.

FIGURE 3.6: A bipartite graph.

When modelling relations between two different classes of objects, bipartite graphs arise very naturally, for example in network analysis.

3.4.2 Complete-bipartite graph (Bi-clique)

A complete-bipartite graph is a special kind of bipartite graph where every vertex of the first set is connected to every vertex of the second set.

Definition 3.4.2. A complete-bipartite graph is a bipartite graph in which every vertex from part X1 is adjacent to every vertex from part X2. [8]

In other words, it is a graph whose vertices can be partitioned into two subsets X1 and X2 such that no edge has both endpoints in the same subset, and every possible edge that could connect vertices in different subsets is part of the graph. That is, it is a bipartite graph (X1, X2; E) such that for every two vertices x1 ∈ X1 and x2 ∈ X2, x1x2 is an edge in E.

A complete-bipartite graph with partitions of size |X1| = m and |X2| = n is denoted K_{m,n}. Fig. 3.7 shows the complete-bipartite graph K_{5,3}.

FIGURE 3.7: A complete-bipartite graph K5,3.

We deduce the PR of a complete-bipartite graph formulated in [2]:

Theorem 3.4.1 (PR of a complete-bipartite graph, [2]). Assuming a uniform weight vector, for a complete-bipartite graph (n, m; E), the PR of a vertex in the part of size n is given by:

$$\pi' = \frac{n+\alpha m}{n(1-\alpha^2)}. \qquad (3.1)$$

Similarly, the PR of a vertex in the part of size m is given by:

$$\pi' = \frac{m+\alpha n}{m(1-\alpha^2)}. \qquad (3.2)$$

For the proof, we refer the readers to [2], which also formulates the PRs of a complete-bipartite graph with a non-uniform weight vector, and of an N-partite graph with both uniform and non-uniform weight vectors.
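Another standalone check, this time of eqs. 3.1 and 3.2 for K₅,₃, again under the assumed convention π′ = e + αHᵀπ′:

```python
import numpy as np

alpha, n, m = 0.85, 5, 3

# K_{n,m}: the first n nodes each link to all m nodes of the other part, and vice versa.
H = np.zeros((n + m, n + m))
H[:n, n:] = 1.0 / m            # an n-part vertex spreads its weight over m out-links
H[n:, :n] = 1.0 / n            # an m-part vertex spreads its weight over n out-links

pi = np.linalg.solve(np.eye(n + m) - alpha * H.T, np.ones(n + m))
print(pi[0], (n + alpha * m) / (n * (1 - alpha ** 2)))   # eq. 3.1
print(pi[n], (m + alpha * n) / (m * (1 - alpha ** 2)))   # eq. 3.2
```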

3.4.3 Star graph

A star S_n is the complete-bipartite graph K_{1,n}, i.e. a tree with one internal node and n leaves.

The star S₇ is shown in Fig. 3.8.

FIGURE 3.8: A star graph S7.

As an example of its use, the star topology of a computer network, modeled after a star graph, is important in distributed computing.

Since a star graph is a complete-bipartite graph and we already know the PR of a complete-bipartite graph, we can deduce the PR of a star graph by putting m = 1 in eq. 3.1 and eq. 3.2.

Theorem 3.4.2 (PR of a star graph). Assuming a uniform weight vector, for a star graph S_n, the PR of the internal node is given by:

$$\pi' = \frac{1+\alpha n}{1-\alpha^2},$$

and the PR of each of its leaves is given by:

$$\pi' = \frac{n+\alpha}{n(1-\alpha^2)}.$$

3.5 Expressions of the PR variants

In the previous sections (3.1-3.4), we found the standard or traditional PR expressions for some common graph structures. With the help of these TPR expressions and the theoretical results obtained in Section 2.2, we can find the PR expressions of the PR variants: LPR, GLPR and BPR.

3.6 LPR expressions

To get the LPR expressions, replace α in the TPR expressions found earlier by 2α/(1+α), where 0 < 2α/(1+α) < 1, to get the following results:
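The substitutions in Sections 3.6-3.9 can also be carried out mechanically with a computer algebra system. A sketch using sympy, reproducing the LPR of a complete-graph (Theorem 3.6.2 below) from the TPR expression 1/(1−α):

```python
import sympy as sp

a = sp.symbols('alpha', positive=True)

tpr_complete = 1 / (1 - a)                           # Theorem 3.3.1
lpr_complete = sp.simplify(tpr_complete.subs(a, 2 * a / (1 + a)))
print(lpr_complete)                                  # equivalent to (1+alpha)/(1-alpha)
```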

3.6.1 LPR of a line-graph linking an external node

Theorem 3.6.1 (LPR of a line-graph linking an external node). Assuming a uniform weight vector, in a line-graph where an external node links to internal node j, the LPR of an internal node i is given by:

$$\pi'_i = \sum_{k=0}^{n-i}\left(\frac{2\alpha}{1+\alpha}\right)^k + b_{ij} = \frac{(1+\alpha)^{n-i+1}-(2\alpha)^{n-i+1}}{(1+\alpha)^{n-i}(1-\alpha)} + b_{ij},$$

where

$$b_{ij} = \begin{cases} \left(\frac{2\alpha}{1+\alpha}\right)^{j-i+1} & j \ge i \\ 0 & j < i \end{cases}$$


3.6.2 LPR of a complete-graph

Theorem 3.6.2 (LPR of a complete-graph). Assuming a uniform weight vector, the LPR of a node of a complete-graph is given by:

$$\pi' = \frac{1+\alpha}{1-\alpha}.$$

3.6.3 LPR of a complete-bipartite graph

Theorem 3.6.3 (LPR of a complete-bipartite graph). Assuming a uniform weight vector, for a complete-bipartite graph (n, m; E), the LPR of a vertex in the part of size n is given by:

$$\pi' = \frac{(1+\alpha)((1+\alpha)n+2\alpha m)}{n(1-\alpha)(1+3\alpha)}.$$

Similarly, the LPR of a vertex in the part of size m is given by:

$$\pi' = \frac{(1+\alpha)((1+\alpha)m+2\alpha n)}{m(1-\alpha)(1+3\alpha)}.$$

3.6.4 LPR of a star graph

Theorem 3.6.4 (LPR of a star graph). Assuming a uniform weight vector, for a star graph S_n, the LPR of the internal node is given by:

$$\pi' = \frac{(1+\alpha)((1+\alpha)+2\alpha n)}{(1-\alpha)(1+3\alpha)},$$

and the LPR of its leaves is given by:

$$\pi' = \frac{(1+\alpha)((1+\alpha)n+2\alpha)}{n(1-\alpha)(1+3\alpha)}.$$

3.7 GLPR expressions

To get the GLPR expressions, replace α in the TPR expressions found earlier by α/(1−γ+γα), where 0 < α/(1−γ+γα) < 1, to get the following results:

3.7.1 GLPR of a line-graph linking an external node

Theorem 3.7.1 (GLPR of a line-graph linking an external node). Assuming a uniform weight vector, in a line-graph where an external node links to internal node j, the GLPR of an internal node i is given by:

$$\pi'_i = \sum_{k=0}^{n-i}\left(\frac{\alpha}{1-\gamma+\gamma\alpha}\right)^k + b_{ij} = \frac{(1-\gamma+\gamma\alpha)-\alpha^{n-i+1}(1-\gamma+\gamma\alpha)^{i-n}}{(1-\gamma)(1-\alpha)} + b_{ij},$$

where

$$b_{ij} = \begin{cases} \left(\frac{\alpha}{1-\gamma+\gamma\alpha}\right)^{j-i+1} & j \ge i \\ 0 & j < i \end{cases}$$

3.7.2 GLPR of a complete-graph

Theorem 3.7.2 (GLPR of a complete-graph). Assuming a uniform weight vector, the GLPR of a node of a complete-graph is given by:

$$\pi' = \frac{1-\gamma+\gamma\alpha}{(1-\gamma)(1-\alpha)}.$$

3.7.3 GLPR of a complete-bipartite graph

Theorem 3.7.3 (GLPR of a complete-bipartite graph). Assuming a uniform weight vector, for a complete-bipartite graph (n, m; E), the GLPR of a vertex in the part of size n is given by:

$$\pi' = \frac{(1-\gamma+\gamma\alpha)((1-\gamma+\gamma\alpha)n+\alpha m)}{n((1-\gamma+\gamma\alpha)^2-\alpha^2)}.$$

Similarly, the GLPR of a vertex in the part of size m is given by:

$$\pi' = \frac{(1-\gamma+\gamma\alpha)((1-\gamma+\gamma\alpha)m+\alpha n)}{m((1-\gamma+\gamma\alpha)^2-\alpha^2)}.$$

3.7.4 GLPR of a star graph

Theorem 3.7.4 (GLPR of a star graph). Assuming a uniform weight vector, for a star graph S_n, the GLPR of the internal node is given by:

$$\pi' = \frac{(1-\gamma+\gamma\alpha)((1-\gamma+\gamma\alpha)+\alpha n)}{(1-\gamma+\gamma\alpha)^2-\alpha^2},$$

and the GLPR of its leaves is given by:

$$\pi' = \frac{(1-\gamma+\gamma\alpha)((1-\gamma+\gamma\alpha)n+\alpha)}{n((1-\gamma+\gamma\alpha)^2-\alpha^2)}.$$

3.8 BPR-s expressions

To get the BPR-s expressions, replace α in the TPR expressions found earlier by α(1+β), where 0 < α(1+β) < 1, to get the following results:

3.8.1 BPR-s of a line-graph linking an external node

Theorem 3.8.1 (BPR-s of a line-graph linking an external node). Assuming a uniform weight vector, in a line-graph where an external node links to internal node j, the BPR-s of an internal node i is given by:

$$\pi'_i = \sum_{k=0}^{n-i}(\alpha(1+\beta))^k + b_{ij} = \frac{1-(\alpha(1+\beta))^{n-i+1}}{1-\alpha(1+\beta)} + b_{ij},$$

where

$$b_{ij} = \begin{cases} (\alpha(1+\beta))^{j-i+1} & j \ge i \\ 0 & j < i \end{cases}$$

3.8.2 BPR-s of a complete-graph

Theorem 3.8.2 (BPR-s of a complete-graph). Assuming a uniform weight vector, the BPR-s of a node of a complete-graph is given by:

$$\pi' = \frac{1}{1-\alpha(1+\beta)}.$$

3.8.3 BPR-s of a complete-bipartite graph

Theorem 3.8.3 (BPR-s of a complete-bipartite graph). Assuming a uniform weight vector, for a complete-bipartite graph (n, m; E), the BPR-s of a vertex in the part of size n is given by:

$$\pi' = \frac{n+\alpha m(1+\beta)}{n(1-\alpha^2(1+\beta)^2)}.$$

Similarly, the BPR-s of a vertex in the part of size m is given by:

$$\pi' = \frac{m+\alpha n(1+\beta)}{m(1-\alpha^2(1+\beta)^2)}.$$

3.8.4 BPR-s of a star graph

Theorem 3.8.4 (BPR-s of a star graph). Assuming a uniform weight vector, for a star graph S_n, the BPR-s of the internal node is given by:

$$\pi' = \frac{1+\alpha n(1+\beta)}{1-\alpha^2(1+\beta)^2},$$

and the BPR-s of its leaves is given by:

$$\pi' = \frac{n+\alpha(1+\beta)}{n(1-\alpha^2(1+\beta)^2)}.$$


3.9 BPR-m expressions

To get the BPR-m expressions, replace α in the TPR expressions found earlier by α/(1−β), where 0 < α/(1−β) < 1, to get the following results:

3.9.1 BPR-m of a line-graph linking an external node

Theorem 3.9.1 (BPR-m of a line-graph linking an external node). Assuming a uniform weight vector, in a line-graph where an external node links to internal node j, the BPR-m of an internal node i is given by:

$$\pi'_i = \sum_{k=0}^{n-i}\left(\frac{\alpha}{1-\beta}\right)^k + b_{ij} = \frac{(1-\beta)-\alpha^{n-i+1}(1-\beta)^{i-n}}{1-(\alpha+\beta)} + b_{ij},$$

where

$$b_{ij} = \begin{cases} \left(\frac{\alpha}{1-\beta}\right)^{j-i+1} & j \ge i \\ 0 & j < i \end{cases}$$

3.9.2 BPR-m of a complete-graph

Theorem 3.9.2 (BPR-m of a complete-graph). Assuming a uniform weight vector, the BPR-m of a node of a complete-graph is given by:

$$\pi' = \frac{1-\beta}{1-(\alpha+\beta)}.$$

3.9.3 BPR-m of a complete-bipartite graph

Theorem 3.9.3 (BPR-m of a complete-bipartite graph). Assuming a uniform weight vector, for a complete-bipartite graph (n, m; E), the BPR-m of a vertex in the part of size n is given by:

$$\pi' = \frac{(1-\beta)((1-\beta)n+\alpha m)}{n((1-\beta)^2-\alpha^2)}.$$

Similarly, the BPR-m of a vertex in the part of size m is given by:

$$\pi' = \frac{(1-\beta)((1-\beta)m+\alpha n)}{m((1-\beta)^2-\alpha^2)}.$$

3.9.4 BPR-m of a star graph

Theorem 3.9.4 (BPR-m of a star graph). Assuming a uniform weight vector, for a star graph S_n, the BPR-m of the internal node is given by:

$$\pi' = \frac{(1-\beta)((1-\beta)+\alpha n)}{(1-\beta)^2-\alpha^2},$$

and the BPR-m of its leaves is given by:

$$\pi' = \frac{(1-\beta)((1-\beta)n+\alpha)}{n((1-\beta)^2-\alpha^2)}.$$


Chapter 4

Summary

4.1 Chapters' summary

In Chapter 1, we explored the linear algebra behind Google's PageRank. The PR scores are the stationary values of an enormous Markov chain. The iterative procedure overcomes the problem with the primitive PR model's circular definition, and the matrix equations represent the summation equations of the iterative procedure.

H appears to be a stochastic transition probability matrix for a Markov chain, but it is only sub-stochastic. The iterative matrix equation is a simple linear stationary process. There are, however, convergence problems from rank sinks and cycles when using these equations, which are overcome by modifying H to a Markov matrix P; the iteration then converges to a stationary vector π^T provided P is stochastic, irreducible and aperiodic. Modifying H to P requires the stochasticity and primitivity adjustments. The stochastic fix converts H to the stochastic matrix S. The primitivity fix makes the resulting matrix both stochastic and primitive, by introducing E, the result of which is the Google matrix G, which guarantees that a unique positive π^T exists and that the iteration converges to it quickly. The consequences of the adjustments are that G is stochastic, irreducible, aperiodic, primitive, dense, and in some sense artificial. The PR problem, stated as an eigenvector system, is to solve π^T = π^T G subject to π^T e = 1; the goal is thus to find the normalized left-hand eigenvector of G corresponding to λ_1.

We find the required eigenvector with the power method due to its simplicity, its matrix-free and storage-friendly nature, and the reasonable number of iterations it needs to converge to a satisfactory approximation of the exact PR vector.

We discussed the ways of modifying the model by considering the implications of changes to the parameters α, H and E. α controls the asymptotic rate of convergence of the PR power method. As α → 1 the artificiality of teleportation reduces, yet the computation increases dramatically, so setting α = 0.85 is a workable compromise between efficiency and effectiveness. Different personalization vectors produce different PRs.

It is essential to find ways to speed up the computation of PR, otherwise it will take weeks to compute, because the Web is growing rapidly. There are just two ways to reduce the work involved in any iterative method: reduce the work per iteration, or reduce the total number of iterations. Reducing the one tends to increase the other, and vice versa; however, as long as this overhead is minimal, the proposed acceleration is considered beneficial. Some of the successful methods for reducing the work are the adaptive power method, extrapolation, and aggregation.
