The purpose of this project was to analyze Amazon reviews of software products written by members of the paid Amazon Vine program. The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. The software review data was extracted from AWS, transformed in Google Colab with PySpark, and loaded into a PostgreSQL database managed with pgAdmin.
- PySpark
- PostgreSQL
- AWS
- Google Colab
In total, there were 248 vine (paid) reviews and 17,514 non-vine (unpaid) reviews, or 17,762 reviews overall. Of the 5,256 reviews with 5-star ratings, 102 (1.94%) were vine reviews and 5,154 (98.06%) were non-vine reviews. As a share of all reviews, 5-star paid reviews made up 0.6% and 5-star unpaid reviews made up 29%.
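The percentages above can be reproduced from the four reported counts. The snippet below is a minimal sketch in plain Python rather than the PySpark code used in the project; the variable names and the `pct` helper are illustrative assumptions, not taken from the original notebook.

```python
# Counts reported in the results above.
vine_total = 248
non_vine_total = 17_514
vine_five_star = 102
non_vine_five_star = 5_154


def pct(part, whole):
    """Return part / whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


five_star_total = vine_five_star + non_vine_five_star  # 5,256
all_reviews = vine_total + non_vine_total              # 17,762

# Share of the 5-star reviews contributed by each group.
vine_share_of_five_star = pct(vine_five_star, five_star_total)
non_vine_share_of_five_star = pct(non_vine_five_star, five_star_total)

# 5-star reviews of each group as a share of all reviews.
vine_five_star_of_all = pct(vine_five_star, all_reviews)
non_vine_five_star_of_all = pct(non_vine_five_star, all_reviews)

print(vine_share_of_five_star, non_vine_share_of_five_star)  # 1.94 98.06
print(vine_five_star_of_all, non_vine_five_star_of_all)      # 0.57 29.02
```

Rounded to fewer digits, the last two values correspond to the 0.6% and 29% quoted above.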
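With 98.06% of the 5-star reviews coming from unpaid customers, the results suggest that there was no positivity bias in the 5-star reviews: in other words, being paid did not appear to bias the reviews. The overwhelming majority of 5-star reviews came from unpaid customers, which indicates that many people were satisfied enough with their software products to leave a 5-star review without payment. One caveat is that non-vine reviews outnumber vine reviews roughly 70 to 1 (17,514 to 248), so this share partly reflects group size; comparing the 5-star rate within each group (102 of 248 vine reviews versus 5,154 of 17,514 non-vine reviews) would be a stricter test.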
Since the 5-star unpaid reviews made up 29% of all reviews, it would be interesting to perform the same percentage calculations for each star rating to see the full distribution of the reviews.
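One way this follow-up could be approached is sketched below in plain Python. It assumes each review is reduced to a `(star_rating, vine)` pair, with `vine` being `"Y"` or `"N"`; those field names, the `rating_distribution` helper, and the sample rows are all hypothetical placeholders and not results from the real dataset.

```python
from collections import Counter


def rating_distribution(rows):
    """Return {vine_flag: {star_rating: percent of that group's reviews}}.

    `rows` is a list of (star_rating, vine_flag) tuples.
    """
    group_totals = Counter(vine for _, vine in rows)  # reviews per group
    pair_counts = Counter(rows)                       # reviews per (stars, group)
    dist = {}
    for (stars, vine), n in pair_counts.items():
        dist.setdefault(vine, {})[stars] = round(100 * n / group_totals[vine], 2)
    return dist


# Placeholder rows for illustration only, NOT taken from the review data.
sample_rows = [(5, "Y"), (4, "Y"), (5, "N"), (5, "N"), (3, "N"), (1, "N")]

distribution = rating_distribution(sample_rows)
for vine, by_star in sorted(distribution.items()):
    for stars in sorted(by_star, reverse=True):
        print(f"vine={vine} {stars}-star: {by_star[stars]}%")
```

Because each percentage is taken within its own group, the paid and unpaid distributions can be compared directly even though the groups differ greatly in size.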