Predicting the box office revenue of unreleased movies by applying sentiment analysis to tweets, combined with regression on other features chosen through exploratory analysis of previously released movies. The model can be applied to similar datasets to predict revenue for business (marketing) and other analytics (advertising) use cases.
You will need to download movie-review data for the sentiment-analysis experiments. While http://www.cs.cornell.edu/people/pabo/movie-review-data/ is one possible dataset, I used the one from https://pythonprogramming.net/static/downloads/short_reviews/. The files are labeled 1, 2, 3 and 4 in the order in which they should be executed.
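As a rough sketch of how the downloaded reviews might be read into labeled examples: the file names `positive.txt` and `negative.txt`, the one-review-per-line layout, and the helper `load_labeled_reviews` are assumptions made for illustration, not names taken from this project's notebooks.

```python
import random


def load_labeled_reviews(pos_path, neg_path, seed=0):
    """Return a shuffled list of (text, label) pairs.

    Assumes each file holds one review per line (hypothetical layout).
    """
    labeled = []
    for path, label in ((pos_path, "pos"), (neg_path, "neg")):
        # latin-1 can decode any byte, so stray characters do not raise errors
        with open(path, encoding="latin-1") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    labeled.append((line, label))
    # A fixed seed keeps the shuffled order reproducible between runs
    random.Random(seed).shuffle(labeled)
    return labeled


# Example call (paths are assumptions):
# reviews = load_labeled_reviews("positive.txt", "negative.txt")
```

Shuffling before splitting into training and test sets keeps both labels represented in each split.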
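You will need Jupyter Notebook with the following third-party libraries installed: nltk, sklearn (installed as scikit-learn), pandas, numpy and matplotlib. The random, pickle and statistics modules are also used, but they ship with Python's standard library and need no installation. You can install the third-party libraries from the Anaconda command prompt, as in the example below:
pip install nltk scikit-learn pandas numpy matplotlib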
Below are the files and the functionality each one performs:
- Training sentiment models
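
To illustrate what training a sentiment model can look like, here is a minimal sketch using scikit-learn. The bag-of-words features, the Naive Bayes classifier, and the tiny inline sample are all assumptions chosen for brevity; the notebooks in this repository may use different features, classifiers, and data.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical sample; real training would use the downloaded reviews
train_texts = [
    "a wonderful and moving film",
    "great acting and a clever script",
    "I loved every minute",
    "a dull and boring mess",
    "terrible pacing and weak acting",
    "I hated the ending",
]
train_labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

# Word counts feed a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

# Pickle the fitted pipeline so later steps can reuse it without retraining
blob = pickle.dumps(model)
restored = pickle.loads(blob)

predictions = restored.predict(["a wonderful and moving story", "a dull and boring plot"])
print(list(predictions))
```

Pickling the whole pipeline keeps the vocabulary and the classifier together, so the same preprocessing is applied when new tweets are scored.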
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
We use SemVer for versioning. For the versions available, see the tags on this repository.
- Billie Thompson - Initial work - PurpleBooth
See also the list of contributors who participated in this project.
This project is licensed under the MIT License. See the LICENSE.md file for details.
- Hat tip to anyone whose code was used