The PageRank algorithm is one of the most widely discussed approaches to processing large volumes of internet data. Its primary purpose is to rank Web pages by assigning each page a weight based on the links pointing to it, which serves as a measure of the page's importance. To overcome the computational difficulty of running the algorithm at this scale, the paper proposes an improved PageRank algorithm implemented in a distributed environment using the Hadoop MapReduce architecture. The improved algorithm is subdivided into six processes, most of which are implemented as Map and Reduce tasks. The final PageRank is computed using the convergence property of the power iteration algorithm.
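To make the map/reduce split and the power-iteration stopping rule concrete, here is a minimal single-machine sketch in Python. It is not the six-process improved algorithm from the paper and does not use Hadoop. It shows textbook PageRank only, and the function names (`map_phase`, `reduce_phase`, `pagerank`), the damping factor of 0.85, the tolerance, and the toy graph are all assumptions chosen for illustration.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Map step: each page emits (target, share) pairs for its out-links."""
    for page, out_links in graph.items():
        # A page with no out-links spreads its rank over every page (assumed policy).
        targets = out_links if out_links else list(graph)
        share = ranks[page] / len(targets)
        for target in targets:
            yield target, share


def reduce_phase(graph, contributions, damping):
    """Reduce step: sum the shares per page, then apply the damping factor."""
    totals = defaultdict(float)
    for target, share in contributions:
        totals[target] += share
    n = len(graph)
    return {page: (1.0 - damping) / n + damping * totals[page] for page in graph}


def pagerank(graph, damping=0.85, tol=1e-9, max_iter=100):
    """Power iteration: repeat map + reduce until the ranks stop changing.

    Assumes every link target also appears as a key in `graph`.
    """
    n = len(graph)
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(max_iter):
        new_ranks = reduce_phase(graph, map_phase(graph, ranks), damping)
        # L1 distance between successive rank vectors is the convergence test.
        delta = sum(abs(new_ranks[p] - ranks[p]) for p in graph)
        ranks = new_ranks
        if delta < tol:
            break
    return ranks


if __name__ == "__main__":
    # Hypothetical four-page graph: page -> list of pages it links to.
    toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(toy_graph).items()):
        print(page, round(score, 4))
```

In a Hadoop job, each pass of the loop would presumably correspond to one MapReduce round, with the convergence check run by the driver between rounds. How the repository actually organizes this is not stated here.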
simonsimanta/PageRank-Hadoop
About
Implementation of Improved PageRank Algorithm on Hadoop