curiosity-data-analysis/distro_watch_api

DistroWatch API

What is DistroWatch?

DistroWatch is a website dedicated to talking about, reviewing and keeping up to date with open source operating systems. This site particularly focuses on Linux distributions and flavours of BSD, though other open source operating systems are sometimes discussed. There is a lot of information out there on Linux distributions and this site tries to collect and present that information in a consistent manner to make it easier to locate.

The Project

The project consists of building a tool that collects the distributions listed on the site, together with the ranking of the most accessed distributions over the last 12, 6, 3 and 1 months.

Once the data is collected, it will be analysed to see what conclusions can be drawn from the distribution rankings. A future goal is a mobile application that presents this data.

Tools Used

Installation

BeautifulSoup

$ pip install beautifulsoup4
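As a rough illustration of how BeautifulSoup could be used to pull a ranking out of a page, here is a minimal sketch. The HTML in `SAMPLE_HTML`, the `ranking` class name and the `parse_ranking` helper are all hypothetical and made up for this example; the real DistroWatch markup is different and would need its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the real DistroWatch page.
SAMPLE_HTML = """
<table class="ranking">
  <tr><th>Rank</th><th>Distribution</th><th>Hits</th></tr>
  <tr><td>1</td><td>Distro A</td><td>2500</td></tr>
  <tr><td>2</td><td>Distro B</td><td>1800</td></tr>
</table>
"""


def parse_ranking(html):
    """Return a list of {'rank', 'name', 'hits'} dicts from a ranking table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="ranking")
    if table is None:
        return []

    rows = []
    for tr in table.find_all("tr"):
        # Header rows use <th>, so they yield no <td> cells and are skipped.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) == 3:
            rank, name, hits = cells
            rows.append({"rank": int(rank), "name": name, "hits": int(hits)})
    return rows


if __name__ == "__main__":
    for row in parse_ranking(SAMPLE_HTML):
        print(row)
```

Returning plain dictionaries keeps the parsing step independent of how the rows are stored afterwards (CSV file or database).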

Requests

Making a request with Requests is very simple.

Begin by importing the Requests module:

>>> import requests
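Building on the import above, a page download might look like the sketch below. The `fetch_html` helper and its `session` parameter are assumptions made for this example, not code from the repository; only `requests.get`, `raise_for_status` and `Response.text` are part of the Requests API.

```python
import requests


def fetch_html(url, timeout=10, session=None):
    """Download a page and return its HTML as text.

    `session` is optional: pass a requests.Session to reuse connections,
    or any object with a compatible .get() method when testing offline.
    """
    http = session if session is not None else requests
    response = http.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of silently parsing an error page.
    response.raise_for_status()
    return response.text


if __name__ == "__main__":
    html = fetch_html("https://distrowatch.com/")
    print(len(html), "characters downloaded")
```

Setting an explicit `timeout` matters for a crawler, because Requests waits indefinitely by default if the server never responds.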

MySQLdb

$ pip install mysqlclient

Alternatively, on Debian/Ubuntu the system package can be installed: python-mysqldb for Python 2, or python3-mysqldb for Python 3:

$ sudo apt-get install python3-mysqldb

CSV

csv is part of Python's standard library, so you don't need to install it with pip. Just import it:

import csv
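For example, ranking rows could be written to and read back from a .csv file roughly as follows. The column names in `FIELDNAMES` and the two helper functions are hypothetical; the actual .csv files in this repository may use a different layout.

```python
import csv
import io

# Hypothetical column layout, chosen for this example.
FIELDNAMES = ["period_months", "rank", "name", "hits"]


def write_ranking_csv(rows, fileobj):
    """Write ranking rows (dicts keyed by FIELDNAMES) with a header line."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def read_ranking_csv(fileobj):
    """Read rows back, converting the numeric columns from str to int."""
    return [
        {
            "period_months": int(row["period_months"]),
            "rank": int(row["rank"]),
            "name": row["name"],
            "hits": int(row["hits"]),
        }
        for row in csv.DictReader(fileobj)
    ]


if __name__ == "__main__":
    sample = [
        {"period_months": 12, "rank": 1, "name": "Distro A", "hits": 2500},
        {"period_months": 12, "rank": 2, "name": "Distro B", "hits": 1800},
    ]
    buffer = io.StringIO()  # stands in for a real file in this sketch
    write_ranking_csv(sample, buffer)
    buffer.seek(0)
    print(read_ranking_csv(buffer))
```

When working with real files, open them with `newline=""`, as the csv module documentation recommends, so that line endings are handled consistently across platforms.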

Flask

$ pip install -U Flask

Directories

The crawler_db directory contains the code that collects data from the DistroWatch site and inserts it into the database for the mobile application to use later. This folder also contains the .csv files used for data analysis.

The api directory contains the code for the web service that feeds the mobile application.
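To give an idea of what such a web service could look like with Flask, here is a minimal sketch. The `/rankings/<months>` route, the response shape and the in-memory `RANKINGS` data are assumptions made for illustration; the real service in the api directory reads from the database and may expose different endpoints.

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)

# Hypothetical in-memory data standing in for the database.
# Keys are the ranking periods in months (12, 6, 3 and 1).
RANKINGS = {
    12: [{"rank": 1, "name": "Distro A", "hits": 2500}],
    6: [{"rank": 1, "name": "Distro B", "hits": 2100}],
    3: [{"rank": 1, "name": "Distro B", "hits": 2300}],
    1: [{"rank": 1, "name": "Distro C", "hits": 2700}],
}


@app.route("/rankings/<int:months>")
def rankings(months):
    """Return the ranking for one period as JSON, or 404 if unknown."""
    if months not in RANKINGS:
        abort(404)
    return jsonify(period_months=months, ranking=RANKINGS[months])


if __name__ == "__main__":
    app.run(debug=True)
```

With the development server running, a request to `/rankings/12` returns the 12-month ranking as JSON, which a mobile client can consume directly.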

To Contribute

To contribute, fork the repository and open a pull request (PR) with your changes.

For more information, join the project's Gitter chat.

About

A project for collecting data from the DistroWatch site and exposing it through an API
