As more data is generated online, efficient data scraping, warehousing, and aggregation are becoming increasingly necessary to meet the ever-increasing demand for that data.
Open datasets are available at (but not limited to) http://www.vgchartz.com/gamedb/. The data to be scraped includes (but is not limited to):
1) Ranking of overall video/e-game sales
2) Platform(s) on which the game was released (e.g., PC, PS4, phone/app, Nintendo)
3) Year of the game's release (e.g., 2010) and contextual time period (e.g., during the Recession)
4) Genre of the game: first-person shooter (FPS), strategy (League of Legends, StarCraft, etc.), role-playing game (RPG), fictional life simulator (Sims, Pokemon Go, etc.)
5) Publisher of the game and sales in different regions (e.g., NA, EU, Japan)
6) User characteristics/demographics (age, ethnicity)
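To make the fields listed above concrete, here is a minimal sketch of how one scraped row might be parsed into a typed record. The column names (`Rank`, `Platform`, `NA_Sales`, and so on), the `GameRecord` class, and the sample rows are all assumptions made for illustration; the actual scraped table may be labeled differently, and this is not the project's own loader.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional

# Made-up sample rows with hypothetical column names (not real sales data).
SAMPLE_CSV = """Rank,Name,Platform,Year,Genre,Publisher,NA_Sales,EU_Sales,JP_Sales
1,Example Game A,PC,2010,Strategy,Example Publisher,1.50,2.00,0.25
2,Example Game B,PS4,N/A,Shooter,Example Publisher,3.00,1.00,0.50
"""


@dataclass
class GameRecord:
    rank: int
    name: str
    platform: str
    year: Optional[int]  # None when the release year is missing or malformed
    genre: str
    publisher: str
    regional_sales: Dict[str, float]  # region code -> sales figure as scraped


def parse_year(raw: str) -> Optional[int]:
    """Return the year as an int, or None if the field is not a plain number."""
    raw = raw.strip()
    return int(raw) if raw.isdigit() else None


def load_records(text: str) -> List[GameRecord]:
    """Parse CSV text into GameRecord objects, one per data row."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append(GameRecord(
            rank=int(row["Rank"]),
            name=row["Name"].strip(),
            platform=row["Platform"].strip(),
            year=parse_year(row["Year"]),
            genre=row["Genre"].strip(),
            publisher=row["Publisher"].strip(),
            regional_sales={
                "NA": float(row["NA_Sales"]),
                "EU": float(row["EU_Sales"]),
                "JP": float(row["JP_Sales"]),
            },
        ))
    return records


records = load_records(SAMPLE_CSV)
```

Keeping a missing year as `None` rather than dropping the row leaves that decision to the later cleaning step.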
The goal is to extract useful analytics from various data sets by inferring user play time, interactions between players and the CPU, quitting points (when players exit), peak server times and lag/ping rates, and the profiles associated with player accounts. These insights are, in turn, intended to improve the gameplay experience, benefiting both company revenue and the user interface. Our analysis will focus on intelligent (and automated) data cleansing using a wide range of Python modules, and will eventually outline predictive models for gaming companies' sales rank.
See also: https://mef-bda503.github.io/pj-AhmetTuncel/files/Assignment_2.html
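As one illustration of the automated cleansing and sales ranking described above, the following sketch normalizes a few rows and ranks publishers by total sales. The field names (`publisher`, `genre`, `global_sales`), the helper functions, and the rows themselves are hypothetical. Note that this is a descriptive ranking only, not a predictive model.

```python
from collections import defaultdict

# Made-up rows standing in for scraped data; field names are hypothetical.
RAW_ROWS = [
    {"publisher": " Publisher A ", "genre": "strategy", "global_sales": "2.5"},
    {"publisher": "Publisher B", "genre": "RPG", "global_sales": "5.0"},
    {"publisher": "Publisher A", "genre": "Strategy", "global_sales": "1.5"},
    {"publisher": "Publisher B", "genre": "Shooter", "global_sales": "n/a"},
]


def clean_row(row):
    """Return a normalized copy of the row, or None if sales cannot be parsed."""
    try:
        sales = float(row["global_sales"])
    except ValueError:
        return None
    return {
        "publisher": row["publisher"].strip(),
        "genre": row["genre"].strip().lower(),
        "global_sales": sales,
    }


def rank_publishers(rows):
    """Rank publishers by total sales, highest first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["publisher"]] += row["global_sales"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Drop rows whose sales figure is unusable, then rank what remains.
cleaned = [r for r in map(clean_row, RAW_ROWS) if r is not None]
ranking = rank_publishers(cleaned)
```

Stripping whitespace before aggregating matters here: without it, " Publisher A " and "Publisher A" would be counted as two different publishers.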
Timeline (step: estimated completion time; persons in charge, among the group of 4):
- Extracting and cleaning up data: one week (Binglin Zhao, Jiawei Song, Yichen Zhang, Ester Park)
- Analysis by features: two weeks (Binglin Zhao, Jiawei Song, Yichen Zhang, Ester Park)
- Data visualization: one week (Binglin Zhao, Jiawei Song, Yichen Zhang, Ester Park)
- Presentation slides: one week (Binglin Zhao, Jiawei Song, Yichen Zhang, Ester Park)
Data visualizations are in the following notebooks:
EsterPark.ipynb
BinglinZhao.ipynb
Yichen.ipynb
jiawei.ipynb
All functions are in .py files.