Developing code to scrape and visualize data from multiple online news sources, mostly through R's rvest package. I am especially interested in scraping and comparing different news sources in Serbia, which are known for being extremely polarized and are often misused as instruments of the government.
For these purposes, aside from scraping RTV Slovenia (done just to learn the ropes), I have scraped the Serbian N1 TV (a generally pro-Western, corporate-owned media house) and Kurir (a pro-Russian, government-controlled 'yellow press' tabloid that is nevertheless extremely popular). So far, I have been comparing news topics by scraping posts and their tags, then plotting the resulting networks to find the tags with the greatest centrality. In the comparisons I have run, the N1 post-and-tag bimodal networks have more equally sized components and are much less centralized than the Kurir ones, which suggests more diverse media content. Once I have collected a substantial number of these networks, I will upload them here for comparison.
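To illustrate the kind of comparison described above, here is a minimal sketch in base R. It assumes the scraped data has already been reduced to a table of post-tag pairs. The posts and tags below are made up for illustration (they are not scraped results), and `component_sizes` is a hypothetical helper, not a function from this repository. The real analysis may well rely on a graph package such as igraph instead.

```r
# Made-up post-tag pairs standing in for scraped data (assumption, not real results).
edges <- data.frame(
  post = c("p1", "p1", "p2", "p2", "p3", "p4", "p4"),
  tag  = c("politics", "economy", "politics", "sport",
           "politics", "culture", "weather"),
  stringsAsFactors = FALSE
)

# Post-by-tag incidence matrix of the bimodal network (TRUE = post carries tag).
incidence <- unclass(table(edges$post, edges$tag)) > 0

# Degree centrality of each tag: the number of posts that use it.
tag_degree <- sort(colSums(incidence), decreasing = TRUE)

# Hypothetical helper: sizes of the connected components of the bimodal network,
# found by repeatedly passing the smallest node label along every edge.
component_sizes <- function(incidence) {
  n_posts <- nrow(incidence)
  n_tags  <- ncol(incidence)
  n       <- n_posts + n_tags

  # Square adjacency matrix with posts first, then tags.
  adj <- matrix(FALSE, n, n)
  adj[1:n_posts, n_posts + 1:n_tags] <- incidence
  adj <- adj | t(adj)

  label <- seq_len(n)
  repeat {
    new_label <- vapply(
      seq_len(n),
      function(i) min(label[c(i, which(adj[i, ]))]),
      integer(1)
    )
    if (all(new_label == label)) break
    label <- new_label
  }
  sort(as.integer(table(label)), decreasing = TRUE)
}

sizes <- component_sizes(incidence)

print(tag_degree)
print(sizes)
```

A network dominated by one large component and one high-degree tag would point to concentrated coverage, while several components of similar size and flatter tag degrees would point to more varied content.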
The next step is sentiment analysis, as soon as I find a good Serbian-language sentiment lexicon.