twitter_scraping_labelling

Pulls tweets based on location keywords and creates a labelling application to label them as tourism-relevant or not!

Current status:

Pull tweets using the API, based on timeframe and keywords
Prefilter tweets: keep only Japanese-language tweets or English tweets that explicitly mention tourist activities, and remove emojis and mentions
Load into database (SQLite or PostgreSQL)
Stratified time-based sampling for labelling
Working on: the labelling process, then the relevance model (in a different repo)
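
The prefilter, load, and sampling steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this repo: the table layout, the `TOURISM_KEYWORDS` list, the emoji ranges, the `lang` codes, and the choice of calendar month as the sampling stratum are all assumptions made for the example.

```python
import random
import re
import sqlite3
from collections import defaultdict

# Matches @mentions; handles are assumed to be ASCII letters, digits, underscore.
MENTION_RE = re.compile(r"@[A-Za-z0-9_]+")
# Rough emoji ranges (an approximation, not a complete emoji definition).
EMOJI_RE = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF\uFE0F\u200D]"
)
# Hypothetical keyword list standing in for "tourist activities".
TOURISM_KEYWORDS = ("sightseeing", "temple", "shrine", "tour", "travel")


def clean_text(text):
    """Remove mentions and emojis, then collapse whitespace."""
    text = MENTION_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    return " ".join(text.split())


def keep_tweet(lang, text):
    """Keep Japanese tweets, or English tweets containing a tourism keyword."""
    if lang == "ja":
        return True
    if lang == "en":
        lowered = text.lower()
        return any(word in lowered for word in TOURISM_KEYWORDS)
    return False


def load_tweets(conn, tweets):
    """Insert prefiltered, cleaned tweets into a (hypothetical) tweets table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, created_at TEXT, lang TEXT, text TEXT)"
    )
    rows = [
        (t["id"], t["created_at"], t["lang"], clean_text(t["text"]))
        for t in tweets
        if keep_tweet(t["lang"], t["text"])
    ]
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def stratified_sample(conn, per_month, seed=0):
    """Draw up to per_month tweet ids from each calendar month (assumed stratum)."""
    by_month = defaultdict(list)
    query = "SELECT id, created_at FROM tweets ORDER BY id"
    for tweet_id, created_at in conn.execute(query):
        by_month[created_at[:7]].append(tweet_id)  # "YYYY-MM" prefix of an ISO date
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    return {
        month: sorted(rng.sample(ids, min(per_month, len(ids))))
        for month, ids in sorted(by_month.items())
    }
```

Sampling per month rather than over the whole table keeps quiet periods represented in the labelling set even when tweet volume is uneven over time. Swapping `sqlite3.connect(...)` for a PostgreSQL connection would need a different placeholder style and conflict clause.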

Repository files:

.gitignore
README.md
analysis_from_database.ipynb
main.py
main_jupyter.ipynb
postgres_connection.py
scrapedtweets.db
sqlite3_connection.py
subsample.py
temp_queries.txt
tweet_analysis_functions.py
tweet_input_output_functions.py
tweet_queries.txt
