News web scraping

Developing code to scrape and visualize data from multiple online news sources, mostly through R's rvest package. I am especially interested in scraping and comparing different news sources in Serbia, which are known for being extremely polarized and are often misused as instruments of the government.
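As a rough illustration of the scraping step, here is a minimal sketch with rvest that collects posts and their tags into a post-tag table. The HTML and the CSS selectors (`article.post`, `h2.title`, `a.tag`) are made up for the example; real pages will need their own selectors, and the inline page would be replaced with `read_html()` on an actual URL.

```r
library(rvest)

# Hypothetical markup standing in for a news listing page.
# Real sites use different structure and selectors.
page <- minimal_html('
  <article class="post">
    <h2 class="title">Post A</h2>
    <a class="tag">politics</a>
    <a class="tag">economy</a>
  </article>
  <article class="post">
    <h2 class="title">Post B</h2>
    <a class="tag">politics</a>
  </article>
')

# Each <article> is treated as one post (assumed structure)
posts <- html_elements(page, "article.post")

# Build one row per post-tag pair
edges <- do.call(rbind, lapply(posts, function(p) {
  title <- html_text2(html_element(p, "h2.title"))
  tags  <- html_text2(html_elements(p, "a.tag"))
  data.frame(post = rep(title, length(tags)), tag = tags,
             stringsAsFactors = FALSE)
}))

print(edges)
```

Keeping one row per post-tag pair makes the result usable directly as an edge list for a network.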

For these purposes, aside from scraping RTV Slovenia (done just to learn the ropes), I have scraped the Serbian N1 TV (a generally pro-Western, corporate-owned media house) and Kurir (a pro-Russian, government-controlled 'yellow press' tabloid that is nevertheless extremely popular). To date, I have been comparing news topics by scraping posts and their tags and plotting networks to find the tags with the greatest centrality. So far, the comparisons show that N1 post-and-tag bimodal networks have more equally sized components and are much less centralized than Kurir's, indicating more diverse media content. When I have collected a substantial number of these networks, I will upload them here for comparison.
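The network comparison described above could be sketched roughly as follows with the igraph package. This is only an illustration under assumptions: the edge list is toy data, igraph and degree centrality are choices made for the example, and post titles are assumed never to coincide with tag names.

```r
library(igraph)

# Toy post-tag edge list (hypothetical data, not scraped results)
edges <- data.frame(
  post = c("Post A", "Post A", "Post B", "Post C"),
  tag  = c("politics", "economy", "politics", "sport"),
  stringsAsFactors = FALSE
)

# Undirected graph with posts and tags as the two node sets
g <- graph_from_data_frame(edges, directed = FALSE)

# Mark the bimodal structure: FALSE = post, TRUE = tag
V(g)$type <- V(g)$name %in% edges$tag

# Degree centrality of tags: how many posts carry each tag
tag_degree <- degree(g)[V(g)$type]
top_tags <- sort(tag_degree, decreasing = TRUE)

# Connected components and their sizes
comp <- components(g)

# Graph-level degree centralization
centralization <- centr_degree(g)$centralization

print(top_tags)
print(comp$csize)
print(centralization)
```

Comparing `comp$csize` and `centralization` across outlets is one way to put numbers on how evenly sized the components are and how centralized each network is.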

The next step is sentiment analysis, as soon as I find a good Serbian language sentiment lexicon.

shootinputin007/Online-News-Scraping

Folders and files

Latest commit

History

Repository files navigation

News web scraping

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages