
Natural Language Processing (NLP) and analysis on reviews about delivery companies in the UK based on reviews extracted from the Trustpilot website


Mohsanaliac/Text_Analysis_of_Consumer_Reviews

 
 


Description

The purpose of this project is to collect reviews of major delivery companies operating in the UK and apply NLP techniques to analyze different aspects of those reviews, such as sentiment, the most common words, probability distributions across word sequences, and more.
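As a rough illustration of two of the analyses named above (most common words and probability distributions across word sequences), here is a minimal sketch using only the Python standard library. The function names, the tokenizer, and the sample reviews are hypothetical and are not taken from this repository. The bigram model is a plain maximum-likelihood estimate chosen for brevity.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a review and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def most_common_words(reviews, n=3):
    """Return the n most frequent tokens across all reviews."""
    counts = Counter(tok for review in reviews for tok in tokenize(review))
    return counts.most_common(n)


def bigram_probabilities(reviews):
    """Estimate P(next | prev) from bigram counts (maximum likelihood)."""
    bigrams = Counter()
    prev_counts = Counter()
    for review in reviews:
        toks = tokenize(review)
        for prev, nxt in zip(toks, toks[1:]):
            bigrams[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return {pair: c / prev_counts[pair[0]] for pair, c in bigrams.items()}


# Made-up sample reviews, for illustration only.
reviews = [
    "Parcel arrived late",
    "Parcel arrived on time",
    "Parcel never arrived",
]
top_words = most_common_words(reviews, n=2)
probs = bigram_probabilities(reviews)
print(top_words)
print(probs[("parcel", "arrived")])
```

A real pipeline would likely add stop-word removal and smoothing for unseen word pairs. Both are left out here to keep the sketch short.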

Project Roadmap

graph LR
    A[Build a tool to connect to web sources APIs] -->|Get reviews from web| B[Clean reviews]
    B --> D[Knowledge Graphs]
    B --> F[Unsupervised Clustering]
    B --> C(Sentiment Analysis)
    B -->|Identify topic of review| E[Topic Extraction]
    E -->|Train Model| I[Assign Topic to new instances]
    C -->|Train Model| J[Sentiment Classifier]
    I --> K[Build UI]
    J --> K[Build UI]

Data Retrieval API

To get reviews from the Trustpilot website, we use a custom-made web scraping tool. The tool iterates over the pages of the website and collects the reviews along with any other relevant information, storing the output in CSV files.
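The iterate-and-store pattern described above can be sketched as follows. This is not the repository's data_retriever.py: the function names, the integer `first_page` parameter, and the `page` and `review` columns are assumptions made for illustration. The page fetcher and parser are passed in as arguments, so the sketch runs without network access.

```python
import csv
import io


def collect_reviews(fetch_page, parse_reviews, first_page, steps):
    """Visit `steps` consecutive pages and gather the parsed review rows."""
    rows = []
    for page_number in range(first_page, first_page + steps):
        html = fetch_page(page_number)      # hypothetical: returns page content
        rows.extend(parse_reviews(html))    # hypothetical: returns list of dicts
    return rows


def write_reviews_csv(rows, fieldnames, out_file):
    """Write the collected rows to a CSV file object with a header line."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Stand-ins for a real HTTP request and HTML parser (illustration only).
def fake_fetch(page_number):
    return f"page {page_number}"


def fake_parse(html):
    return [{"page": html, "review": f"sample review from {html}"}]


rows = collect_reviews(fake_fetch, fake_parse, first_page=1, steps=3)
buffer = io.StringIO()
write_reviews_csv(rows, ["page", "review"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="", encoding="utf-8")` as `out_file`.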

Running Guide

  1. Set up the configuration in config.json. The config must be populated with the following fields:
    - source_url: Main domain URL
    - starting_page: Domain subpath to a specific reviews page
    - steps: Number of pages to iterate over
    - company: Company or service of interest

  2. Run the Python retriever script:
    python data_retriever.py
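For reference, a config with the four fields listed in step 1 might look like the JSON embedded in the sketch below. All values are placeholders, and the value types (for example, `steps` as an integer) are assumptions. The `load_config` helper is hypothetical and only shows one way to check that the required fields are present before running the retriever.

```python
import json

# Field names come from the running guide; everything else is illustrative.
REQUIRED_KEYS = ("source_url", "starting_page", "steps", "company")


def load_config(text):
    """Parse config JSON and fail early if a required field is missing."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing keys: {', '.join(missing)}")
    return config


# Placeholder values only; replace them with the real site and company.
example_config = """
{
  "source_url": "https://www.example.com",
  "starting_page": "/review/example-courier",
  "steps": 5,
  "company": "Example Courier"
}
"""

config = load_config(example_config)
print(config["company"], config["steps"])
```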
