IDSTA22-Team-Heigit

Proposal links

Link to the LaTeX proposal on Overleaf: https://de.overleaf.com/7975813297sqrrgdrscysd

Introduction

"[E]verything is related to everything else, but near things are more related than distant things". Tobler’s first law of geography was abruptly brought to the attention of the people of Europe, and Germany in particular, by Russia’s attack on Ukraine in February 2022. In addition to the psychological burden of a war not far from their own borders, the population is particularly affected by the strong dependence on Russian resources that results from geographical proximity.

In order to pre-empt an escalation of political tensions within the population, it is therefore useful to create an overview of the spatial distribution of residents who support or oppose the government’s current direction, and of the events that have had an impact on public perception. As one of the best-known social media platforms that offers the possibility to export both the content and the geolocation of a message, Twitter represents the best opportunity to analyze the influence of socio-economic factors and geopolitical events on social attitudes towards the war in Ukraine.

Tweets

To retrieve tweets sent from Germany that address Russia’s attack on Ukraine, we relied on the R package academictwitteR. We were able to retrieve 106,000 tweets.

Most of the tweets were written in German. Nevertheless, we also identified tweets in Russian, Polish, Turkish and Ukrainian.

The sentiment score of the tweets is depicted below. We calculated the daily mean sentiment score.

We carried out our analysis with this dataset.
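
The daily mean sentiment score mentioned above can be illustrated with a minimal sketch. It assumes each tweet is available as a pair of ISO timestamp and numeric score; the function name `daily_mean_sentiment` and the sample values are hypothetical and are not taken from the project code or data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_mean_sentiment(tweets):
    """Group (ISO timestamp, score) pairs by calendar day and average the scores."""
    scores_by_day = defaultdict(list)
    for created_at, score in tweets:
        day = datetime.fromisoformat(created_at).date().isoformat()
        scores_by_day[day].append(score)
    # Sort by day so the result is ordered chronologically.
    return {day: mean(scores) for day, scores in sorted(scores_by_day.items())}


# Made-up values for illustration only, not project data.
sample = [
    ("2022-02-24T08:15:00", -0.5),
    ("2022-02-24T19:40:00", -0.25),
    ("2022-02-25T10:05:00", 0.25),
]
print(daily_mean_sentiment(sample))
```

Grouping by the date part of the timestamp keeps the sketch independent of any particular data frame library.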

Workflow

Documentation

Requirements

To run this project, the following requirements must be met.

Conda

Conda must be installed on the machine, and an environment must be created from the provided ENV.yml. To do so, use the following command:

conda env create -n ENVNAME --file code/ENV.yml

Npm

Running the front end requires npm and Node.js.

To start the front end, run the following commands:

cd frontend/webapp
npm install
npm update
npm run dev

Elasticsearch

An Elasticsearch connection must be provided. Please note that the security setting (xpack.security.enabled) must currently be set to false in your elasticsearch.yml. The authors used Elasticsearch version 8.6.2.

Usage

The following command triggers the different functionalities:

python ./code/Python/main.py ["preprocess","process","api","bulk","full","full+bulk"] -c ./code/default.config

Note that a valid config file needs to be provided. For this project the code/default.config is used.
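
To make the command-line interface more concrete, here is a minimal sketch of how such a mode argument and config flag could be parsed with argparse. It is not the project's actual main.py: the names `build_parser` and `steps_for` and the long option `--config` are assumptions, and the mapping of modes to steps only follows the descriptions in this README.

```python
import argparse

# Modes listed in the usage command of this README.
MODES = ["preprocess", "process", "api", "bulk", "full", "full+bulk"]


def build_parser():
    """Build a parser for one positional mode and a required config path."""
    parser = argparse.ArgumentParser(description="Run one stage of the pipeline.")
    parser.add_argument("mode", choices=MODES, help="functionality to trigger")
    # "--config" is a hypothetical long form of the documented "-c" flag.
    parser.add_argument("-c", "--config", required=True, help="path to the config file")
    return parser


def steps_for(mode):
    """Map a mode to its ordered steps, as described in this README."""
    if mode == "full":
        return ["preprocess", "process", "api"]
    if mode == "full+bulk":
        return ["bulk", "api"]
    return [mode]


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["full", "-c", "./code/default.config"])
print(args.config, steps_for(args.mode))
```

Restricting the positional argument with `choices` makes argparse reject unknown modes before any work starts.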

Preprocess

Starts the preprocessing, which cleans up the individual tweets and translates them into English.
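
As an illustration of what cleaning a tweet can involve, the sketch below removes URLs, user mentions and hash signs with regular expressions. These rules are assumptions chosen for the example; the cleaning steps actually used in this project may differ, and the translation step is not shown. The function name `clean_tweet` is hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Remove URLs and mentions, drop hash signs, and collapse whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    # Keep the hashtag word itself, since it often carries meaning.
    text = text.replace("#", "")
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_tweet("@user Solidarität mit der #Ukraine! https://t.co/abc123"))
```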

Process

Applies the models for NER and sentiment classification and writes the results directly to the Elasticsearch index. Note that if the index is missing, it will be created.

Api

Launches the API using uvicorn. Alternatively, the API can be started as follows:

cd code/Python
uvicorn api:app --reload

Full

Executes the three steps above.

Bulk

As a shortcut, the results of the preprocessing and processing steps have been stored as JSON. This command loads them directly into the defined index, without the need for processing.
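
For readers unfamiliar with bulk loading, the sketch below shows the newline-delimited JSON body that the Elasticsearch Bulk API expects: one action line followed by one document line per record. It only builds the payload and does not contact a server. The function name `to_bulk_ndjson`, the index name and the document fields are hypothetical; the layout of the JSON stored by this project is not shown here.

```python
import json


def to_bulk_ndjson(documents, index_name):
    """Build a Bulk API body: an 'index' action line before each document line."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical document for illustration only.
docs = [{"text": "example tweet", "sentiment": "negative"}]
payload = to_bulk_ndjson(docs, "tweets")
print(payload)
```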

Full+Bulk

Triggers the bulk command and launches the API.

API Endpoint

The API offers one POST endpoint called /plot. The request requires a body with the following schema:

{
    "name":[
       "name"
    ],
    "layer":[
        "layer"
    ],
    "interval": "monthly" or "weekly" or "daily"
}
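
A request following this schema could be built and sent as sketched below, using only the Python standard library. The helper names `build_plot_request` and `post_plot` are hypothetical, and the values "name" and "layer" are the placeholders from the schema, not real entity or layer names.

```python
import json
import urllib.request

ALLOWED_INTERVALS = {"monthly", "weekly", "daily"}


def build_plot_request(names, layers, interval):
    """Return a request body for POST /plot, rejecting unknown intervals."""
    if interval not in ALLOWED_INTERVALS:
        raise ValueError(f"interval must be one of {sorted(ALLOWED_INTERVALS)}")
    return {"name": list(names), "layer": list(layers), "interval": interval}


def post_plot(base_url, body):
    """Send the body as JSON to <base_url>/plot and decode the JSON response."""
    request = urllib.request.Request(
        f"{base_url}/plot",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_plot_request(["name"], ["layer"], "weekly")
print(json.dumps(body))
```

With the API running locally, `post_plot("http://127.0.0.1:8000", body)` would send the request, assuming uvicorn's default host and port are used.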

The response consists of the following elements:

{
    "timestamps": [str] (a list of ISO-formatted date strings, YYYY-MM-DD)
    "sentiment": [[float,float]] (a list of two-element lists; the first value is the share of tweets labeled positive, the second the share of tweets labeled negative)
    "enteties": [{str:{"n":int,"sentiment":[float]}}] (a list of dictionaries with the entities used and some metadata)
    "plot":"str" (JSON object used to plot the data in the frontend)
}
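
A client could unpack such a response as sketched below. The helper names `sentiment_rows` and `entity_counts` are hypothetical, and the sample response contains made-up values that only mirror the documented shape. The key "enteties" is spelled as in the schema.

```python
def sentiment_rows(response):
    """Pair each timestamp with its positive and negative share."""
    return [
        {"date": day, "positive": positive, "negative": negative}
        for day, (positive, negative) in zip(response["timestamps"], response["sentiment"])
    ]


def entity_counts(response):
    """Collect the count 'n' for every entity in the response."""
    counts = {}
    for entry in response["enteties"]:
        for name, meta in entry.items():
            counts[name] = meta["n"]
    return counts


# Made-up response for illustration only.
sample_response = {
    "timestamps": ["2022-02-24", "2022-02-25"],
    "sentiment": [[0.25, 0.5], [0.5, 0.25]],
    "enteties": [{"Ukraine": {"n": 3, "sentiment": [0.25, 0.5]}}],
    "plot": "{}",
}
print(sentiment_rows(sample_response))
print(entity_counts(sample_response))
```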

Models used in this Project

For the sentiment analysis, the cardiffnlp/twitter-roberta-base-sentiment-latest model was used with the Hugging Face sentiment pipeline. For the NER analysis, dslim/bert-base-NER was used.
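
The sketch below shows one way these two models can be loaded with the Hugging Face transformers `pipeline` function, and how predicted labels could be turned into positive and negative shares. It is an illustration, not the project's code: the function names, the `aggregation_strategy="simple"` setting and the lowercase label names "positive" and "negative" are assumptions. Loading the pipelines requires the transformers package and downloads the models, so the sketch only calls the pure helper.

```python
from collections import Counter

SENTIMENT_MODEL = "cardiffnlp/twitter-roberta-base-sentiment-latest"
NER_MODEL = "dslim/bert-base-NER"


def load_pipelines():
    """Create the sentiment and NER pipelines (downloads the models on first use)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import pipeline

    sentiment = pipeline("sentiment-analysis", model=SENTIMENT_MODEL)
    # aggregation_strategy="simple" merges word pieces into whole entities (assumed setting).
    ner = pipeline("ner", model=NER_MODEL, aggregation_strategy="simple")
    return sentiment, ner


def label_shares(labels):
    """Return (share positive, share negative) for a list of predicted labels."""
    if not labels:
        return (0.0, 0.0)
    counts = Counter(labels)
    return (counts["positive"] / len(labels), counts["negative"] / len(labels))


# Made-up labels for illustration only.
print(label_shares(["positive", "negative", "negative", "neutral"]))
```

After `sentiment, ner = load_pipelines()`, calling `sentiment(text)` or `ner(text)` on an English string returns a list of dictionaries with the predicted labels and scores.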

We originally started building our front end based on the one provided in the lecture.
