This is Project 1.2 for CMU 15-619 Cloud Computing.
Wikipedia mining with Hadoop Streaming on AWS EMR (Elastic MapReduce).
Create a mapper and a reducer to analyze the complete Wikipedia page-view data for July 2014, then extract the most popular article (the one with the most page views).
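
A minimal sketch of what such a mapper and reducer could look like in Python is shown below. It is not the project's actual code. It assumes each input record has the form `<project> <title> <views> <bytes>` separated by whitespace, that only the `en` project is of interest, and that the reducer receives tab-separated `title<TAB>views` lines sorted by title, which is how Hadoop Streaming delivers keys. The function names `map_lines`, `reduce_lines`, and `most_popular` are hypothetical.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Yield 'title<TAB>views' for each well-formed record of the 'en' project."""
    for line in lines:
        parts = line.split()
        # Assumed record layout: <project> <title> <views> <bytes>
        if len(parts) != 4 or parts[0] != "en" or not parts[2].isdigit():
            continue  # skip malformed lines and other projects
        yield f"{parts[1]}\t{parts[2]}"


def reduce_lines(lines):
    """Sum views per title. Input must already be sorted by title."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for title, group in groupby(pairs, key=lambda kv: kv[0]):
        yield title, sum(int(views) for _, views in group)


def most_popular(totals):
    """Return the (title, views) pair with the highest view count."""
    return max(totals, key=lambda tv: tv[1])


if __name__ == "__main__":
    # Hypothetical usage: 'script.py map' as the mapper, 'script.py reduce' as the reducer.
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    if role == "map":
        for record in map_lines(sys.stdin):
            print(record)
    elif role == "reduce":
        for title, total in reduce_lines(sys.stdin):
            print(f"{title}\t{total}")
```

With more than one reducer, each output file holds totals for only part of the titles, so finding the single most popular article needs a final pass over all reducer output, for example by applying `most_popular` to the combined totals.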