Scraping content from various online news sources, (mostly) with R's rvest library; creating and comparing bimodal networks of posts and tags; calculating centrality measures.

shootinputin007/Online-News-Scraping

News web scraping

Code for scraping and visualizing data from multiple online news sources, mostly through R's rvest package. I am especially interested in scraping and comparing different news sources in Serbia, which are known for being extremely polarized and often serve as instruments of government misuse.
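As a rough illustration of the scraping step, here is a minimal sketch that extracts posts and their tags with rvest. The HTML snippet and the CSS selectors (`article.post`, `h2.title`, `a.tag`) are made up for the example and do not correspond to the markup of any of the sites mentioned here; each real site needs its own selectors.

```r
library(rvest)

# Hypothetical markup standing in for a downloaded news listing page.
# For a live page one would call read_html() on the URL instead.
page <- minimal_html('
  <article class="post"><h2 class="title">Post A</h2>
    <a class="tag">politics</a><a class="tag">economy</a></article>
  <article class="post"><h2 class="title">Post B</h2>
    <a class="tag">politics</a></article>
')

# One node per post (assumed selector).
posts <- html_elements(page, "article.post")

# Build a post-tag edge list: one row per (post, tag) pair.
edges <- do.call(rbind, lapply(posts, function(p) {
  title <- html_text2(html_element(p, "h2.title"))
  tags  <- html_text2(html_elements(p, "a.tag"))
  data.frame(post = rep(title, length(tags)), tag = tags,
             stringsAsFactors = FALSE)
}))

print(edges)
```

The resulting `edges` data frame has one row per post-tag pair, which is a convenient input for building a network later.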

For these purposes, aside from scraping RTV Slovenia (this was done just to learn the ropes), I have scraped the Serbian outlets N1 TV (a generally pro-Western, corporate-owned media house) and Kurir (a pro-Russian, government-controlled 'yellow press' tabloid that is nevertheless extremely popular). So far, I have compared news topics by scraping posts and their tags and plotting networks to find the tags with the greatest centrality. In the comparisons made so far, N1's post-and-tag bimodal networks have more equally sized components and are much less centralized than Kurir's, which indicates more diverse media content. When I have collected a substantial number of these networks, I will upload them here for comparison.
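The network comparison described above could be sketched roughly as follows with the igraph package. This is an assumption about tooling, not necessarily the code in this repository, and the edge list is invented toy data rather than scraped results. Degree is used as the centrality measure here; other measures (betweenness, eigenvector) could be swapped in.

```r
library(igraph)

# Toy post-tag edge list (invented; not real scraped data).
# Post names must not collide with tag names.
edges <- data.frame(
  post = c("p1", "p1", "p2", "p2", "p3", "p4"),
  tag  = c("politics", "economy", "politics", "sport", "politics", "culture"),
  stringsAsFactors = FALSE
)

# Undirected graph; mark the two modes: TRUE = tag, FALSE = post.
g <- graph_from_data_frame(edges, directed = FALSE)
V(g)$type <- V(g)$name %in% edges$tag

# Degree centrality of tags: how many posts carry each tag.
deg <- degree(g)
tag_degree <- sort(deg[V(g)$type], decreasing = TRUE)

# Connected components and their sizes (more equal sizes suggest
# topics are spread out rather than dominated by one cluster).
comp <- components(g)

# Graph-level degree centralization, normalized by the theoretical maximum.
centralization <- centr_degree(g, normalized = TRUE)$centralization

print(tag_degree)
print(comp$csize)
print(centralization)
```

Computing the same three quantities for each outlet's network gives numbers that can be compared directly: the top tags by degree, the distribution of component sizes, and the overall centralization score.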

The next step is sentiment analysis, as soon as I find a good Serbian-language sentiment lexicon.
