Transcript
Page 1: Investigating Google’s PageRank Algorithm

Researchers: Erik [email protected]

Per-Anders Ekströ[email protected]

Advisors: Lars Eldén

[email protected]

Maya G. [email protected]

Iterations for Convergence

The Power method requires far more iterations to converge than the Arnoldi method.

Execution Time

The restarted Arnoldi method converges in less time than the other tested methods.

As alpha increases, this difference becomes more pronounced.

PageRank Explained

A page is important if other important pages link to it.

This is an eigenvector problem:

The matrix Q describes the link structure. To ensure a well-defined answer, Q must be modified.

Here d marks the pages that lack outlinks, and alpha determines the probability of “teleporting” to a random Web page.
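A minimal sketch of the Power method for this problem, assuming the common formulation r = A r with A = alpha (Q + (1/n) e d^T) + (1 - alpha) (1/n) e e^T, where e is the all-ones vector, n is the number of pages, and 1 - alpha is the teleportation probability. This formulation, the function name pagerank_power, the tolerance, and the 4-page example graph are illustrative assumptions and are not taken from the poster.

```python
def pagerank_power(outlinks, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on the (assumed) Google matrix.

    outlinks[j] is the list of pages that page j links to.
    Returns the rank vector and the number of iterations used.
    """
    n = len(outlinks)
    r = [1.0 / n] * n  # start from the uniform distribution
    for iteration in range(1, max_iter + 1):
        # Rank held by pages without outlinks (d_j = 1) is spread uniformly.
        dangling = sum(r[j] for j in range(n) if not outlinks[j])
        # Uniform part: dangling mass plus teleportation.
        base = (alpha * dangling + (1.0 - alpha)) / n
        new_r = [base] * n
        # Link part: page j passes alpha * r[j] equally to its targets.
        for j, targets in enumerate(outlinks):
            if targets:
                share = alpha * r[j] / len(targets)
                for i in targets:
                    new_r[i] += share
        diff = sum(abs(a - b) for a, b in zip(new_r, r))
        r = new_r
        if diff < tol:
            return r, iteration
    return r, max_iter


# Hypothetical 4-page graph; page 3 has no outlinks.
example_outlinks = [[1, 2], [2], [0], []]
ranks, iterations = pagerank_power(example_outlinks, alpha=0.85)
```

Only the current and the next rank vector are kept in memory. Calling the function with a larger alpha on the same graph can be used to observe how the iteration count grows.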

Example

The following small 6-page link structure would give us the following Q.
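As an illustration of how such a Q can be assembled, here is a small sketch. The link list below is hypothetical and is not the poster's example; it also assumes the convention that Q[i][j] = 1/N_j when page j links to page i, where N_j is the number of outlinks of page j. The names link_matrix and example_links are made up for this sketch.

```python
def link_matrix(n, links):
    """Build Q and d from (source, target) pairs, pages numbered 0..n-1.

    Assumed convention: Q[i][j] = 1/N_j if page j links to page i,
    and d[j] = 1 if page j has no outlinks.
    """
    out = [set() for _ in range(n)]  # sets ignore duplicate links
    for src, dst in links:
        out[src].add(dst)
    Q = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in out[j]:
            Q[i][j] = 1.0 / len(out[j])
    d = [0 if out[j] else 1 for j in range(n)]
    return Q, d


# Hypothetical 6-page link structure; page 5 has no outlinks.
example_links = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2), (3, 4), (4, 5)]
Q, d = link_matrix(6, example_links)
```

Each column of Q sums to 1 except the columns of pages without outlinks, which are zero and are flagged in d.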

Eigenvector methods used

Power method
+ low memory demands
– slow for large alpha-values

Arnoldi method
+ few iterations for convergence
– high memory demands
– increasing work for each iteration

Restarted Arnoldi
+ fast for all alpha-values
± much less memory needed than for the normal Arnoldi method, but more than for the Power method
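The trade-offs listed above can be seen in a sketch of an explicitly restarted Arnoldi iteration. This is one simple variant and not necessarily the one used in the project: it assumes NumPy, a column-stochastic Google matrix with dominant eigenvalue 1, and it restarts from the Ritz vector whose Ritz value is closest to 1. The names arnoldi, restarted_arnoldi, google_matrix, the basis size m, and the example graph are illustrative assumptions.

```python
import numpy as np


def arnoldi(matvec, v0, m):
    """Build an orthonormal Krylov basis V and Hessenberg matrix H."""
    n = v0.size
    V = np.zeros((n, m + 1))  # m + 1 basis vectors: the memory cost
    H = np.zeros((m + 1, m))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for k in range(m):
        w = matvec(V[:, k])
        # Modified Gram-Schmidt: the work grows with k.
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-14:  # invariant subspace found, stop early
            return V[:, :k + 1], H[:k + 1, :k + 1]
        V[:, k + 1] = w / H[k + 1, k]
    return V[:, :m], H[:m, :m]


def restarted_arnoldi(matvec, n, m=3, tol=1e-10, max_restarts=200):
    """Restart Arnoldi every m steps to cap memory at m + 1 vectors."""
    v = np.full(n, 1.0 / n)
    for restart in range(1, max_restarts + 1):
        V, H = arnoldi(matvec, v, m)
        eigvals, eigvecs = np.linalg.eig(H)
        k = np.argmin(np.abs(eigvals - 1.0))  # Ritz value closest to 1
        v = np.real(V @ eigvecs[:, k])
        v = v / v.sum()  # scale so the ranks sum to 1
        if np.linalg.norm(matvec(v) - v, 1) < tol:
            return v, restart
    return v, max_restarts


def google_matrix(outlinks, alpha):
    """Dense Google matrix, for small demonstrations only."""
    n = len(outlinks)
    A = np.full((n, n), (1.0 - alpha) / n)
    for j, targets in enumerate(outlinks):
        if targets:
            for i in targets:
                A[i, j] += alpha / len(targets)
        else:
            A[:, j] += alpha / n  # page without outlinks
    return A


# Hypothetical 6-page graph; page 5 has no outlinks.
A = google_matrix([[1, 2], [2], [0], [2, 4], [5], []], alpha=0.85)
r, restarts = restarted_arnoldi(lambda x: A @ x, n=6, m=3)
```

The basis V is what makes the unrestarted method memory-hungry; restarting keeps its width fixed at m + 1 columns, compared with the one or two vectors the Power method needs.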

Web-Crawler

Written in Perl and used to retrieve the link structure of a specified domain.

This is the link structure of it.uu.se.
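The project's crawler was written in Perl and is not shown here. As a rough illustration of the core step, the following Python sketch extracts the links on one page that stay inside a given domain, using only the standard library. The class and function names, the sample HTML, and the sample URL are hypothetical, and fetching pages, politeness rules, and error handling are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_within_domain(page_url, html, domain):
    """Return the sorted, absolute, fragment-free links inside domain."""
    parser = LinkCollector()
    parser.feed(html)
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        host = urlparse(absolute).hostname or ""
        if host == domain or host.endswith("." + domain):
            found.add(absolute)
    return sorted(found)


# Hypothetical page content for illustration.
sample_html = (
    '<a href="/a.html">A</a>'
    '<a href="http://example.org/x">X</a>'
    '<a href="b.html#top">B</a>'
)
sample_links = links_within_domain(
    "http://www.it.uu.se/index.html", sample_html, "it.uu.se"
)
```

Repeating this step for every newly found page, and recording each (source, target) pair, would yield the kind of link structure from which Q is built.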

Project in the course “Scientific Computing 10p.” at the Division of Scientific Computing, Department of Information Technology, Uppsala University

Contact: Lina von [email protected]
