PageRank-Hadoop

Implementation of Improved PageRank Algorithm on Hadoop

The PageRank algorithm is one of the most widely discussed approaches to processing large volumes of internet data. Its primary purpose is to rank Web pages by assigning each page a weight based on the links pointing to it, as a measure of that page's importance. To overcome the computational cost of running the algorithm at scale, the paper proposes an improved PageRank algorithm implemented in a distributed environment using the Hadoop MapReduce architecture. The improved algorithm is subdivided into six processes, most of which are implemented as Map and Reduce tasks. The final PageRank is computed based on the convergence property of the power iteration algorithm.
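
To make the Map/Reduce split and the power-iteration stopping rule concrete, here is a minimal in-memory sketch in Python. It shows standard PageRank only, not the six processes of the improved algorithm, which this README does not spell out. The damping factor of 0.85, the tolerance, and the names `map_phase`, `reduce_phase`, and `pagerank` are illustrative assumptions, and the sketch assumes every page has at least one out-link and that every link target is itself a key in the graph.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor; the README does not state one


def map_phase(graph, ranks):
    """Emit (page, contribution) pairs: each page splits its rank among its out-links."""
    for page, links in graph.items():
        yield page, 0.0  # keeps pages with no in-links in the reducer output
        for target in links:
            yield target, ranks[page] / len(links)


def reduce_phase(pairs, n):
    """Sum the contributions per page and apply the damping formula."""
    sums = defaultdict(float)
    for page, contribution in pairs:
        sums[page] += contribution
    return {page: (1 - DAMPING) / n + DAMPING * total for page, total in sums.items()}


def pagerank(graph, tol=1e-8, max_iter=200):
    """Repeat map + reduce until the total change in rank drops below tol."""
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(max_iter):
        new_ranks = reduce_phase(map_phase(graph, ranks), n)
        delta = sum(abs(new_ranks[p] - ranks[p]) for p in graph)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


if __name__ == "__main__":
    # Hypothetical three-page graph: page -> list of pages it links to.
    graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
    for page, rank in sorted(pagerank(graph).items()):
        print(page, round(rank, 4))
```

On Hadoop, each loop iteration would correspond to one MapReduce job whose output feeds the next job, and the convergence check would typically run in the driver between jobs.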