Page Rank

Static Badge Static Badge

In the vast world of the Internet, how can we determine a webpage's importance based on its links? The PageRank algorithm, developed by Google's co-founders, offers a solution. It views a hyperlink from page A to page B as a vote of confidence by A for B. But, not all votes are equal: a vote from a highly-ranked page means more than one from a lesser-ranked page. Additionally, PageRank uses a damping factor to simulate a user's likelihood to continue clicking links versus jumping to a random page. A good resource for this assignment is the wiki https://en.wikipedia.org/wiki/PageRank.

Implementation guidelines

  1. Running the program:

    • The program can be ran in different modes (sequential, parallel and distributed) by specifying a parameter.

    • User can specify the parameter damping factor, maximum number of iterations, \(|V|\) and \(|E|\) (graph size).

    • The program measures run-time needed to compute the whole PageRank as well as a single iteration.

  2. Implementation details

    • Graph libraries are off-limits

    • Use an adjacency list to represent the graph.

    • Given program parameters \(|V|\) and \(|E|\) generate a random graph.

    • The code should implement functions for storing the graph in the edge list format as a csv file, and for storing the page rank result as a csv file with two columns (vertex,score), results must be ordered by the vertex column in ascending order.

    • The implementation must adapt automatically to the hardware it is being ran on (Physical CPU's, Cores, Memory, etc..);

    • Wherever randomness is used a seed must be provided in order to compare solutions.