Predicting-Boxoffice-Revenue-using-Tweets

Predicting box office revenue of unreleased movies by applying sentiment analysis to tweets, combined with regression on other features chosen through exploratory analysis of previously released movies. The same approach can be applied to similar datasets to predict revenue for business (marketing) and other analytics (advertising) use cases.
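As a rough illustration of the regression step, the sketch below fits an ordinary least-squares model with NumPy. The feature names (positive_ratio, tweet_volume_k), the helpers fit_linear_regression and predict, and all numbers are hypothetical and synthetic; none of them come from the project's notebooks or data.

```python
import numpy as np


def fit_linear_regression(features, revenue):
    """Fit revenue ~ intercept + features @ weights by ordinary least squares."""
    features = np.asarray(features, dtype=float)
    revenue = np.asarray(revenue, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(features)), features])
    coefficients, *_ = np.linalg.lstsq(design, revenue, rcond=None)
    return coefficients


def predict(coefficients, features):
    """Apply fitted coefficients (intercept first) to new feature rows."""
    features = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coefficients


# Synthetic example: columns are (positive_ratio, tweet_volume_k).
X = np.array([[0.2, 10.0], [0.5, 40.0], [0.7, 25.0], [0.9, 60.0], [0.4, 5.0]])
# Revenue generated from a known linear rule so the fit can be checked.
y = 10.0 + 50.0 * X[:, 0] + 2.0 * X[:, 1]

coef = fit_linear_regression(X, y)
new_movie = np.array([[0.6, 30.0]])
estimate = predict(coef, new_movie)
```

Because the synthetic revenue is exactly linear in the features, the fit recovers the generating rule; real tweet and revenue data would be noisy and would need a held-out set to judge the model.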

Getting Started

You will need to download movie-review data for the sentiment-analysis experiments. While http://www.cs.cornell.edu/people/pabo/movie-review-data/ is one possible dataset, I used the dataset from https://pythonprogramming.net/static/downloads/short_reviews/. The files are numbered 1, 2, 3 and 4 in the order in which they should be executed.
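A minimal sketch of loading the two review files into labeled examples. It assumes one review per line and Latin-1 encoding, which is an assumption about the download rather than something the notebooks confirm. The function name load_labeled_reviews is hypothetical; positive.txt and negative.txt are the file names used in this repository.

```python
from pathlib import Path


def load_labeled_reviews(positive_path, negative_path, encoding="latin-1"):
    """Return a list of (review_text, label) pairs with labels 'pos' and 'neg'."""
    labeled = []
    for path, label in ((positive_path, "pos"), (negative_path, "neg")):
        text = Path(path).read_text(encoding=encoding)
        for line in text.splitlines():
            line = line.strip()
            if line:  # skip blank lines
                labeled.append((line, label))
    return labeled
```

With the files in the working directory, load_labeled_reviews("positive.txt", "negative.txt") would return the combined labeled list, ready to be shuffled and split for training.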

Prerequisites

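You will need Jupyter Notebook with the following libraries: nltk, random, pickle, sklearn, statistics, pandas, numpy and matplotlib. The random, pickle and statistics modules ship with Python's standard library and need no installation; sklearn is installed under the package name scikit-learn. You can install the remaining libraries from the Anaconda command prompt, for example:

pip install nltk scikit-learn pandas numpy matplotlib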
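To confirm the environment before opening the notebooks, a small check like the following can report which third-party packages are missing. It is only a sketch: the helper name missing_packages is hypothetical, and the list mirrors the libraries named above using their import names (for example sklearn, which is installed from the scikit-learn package).

```python
import importlib.util

# Import names of the third-party libraries the notebooks rely on.
REQUIRED = ["nltk", "sklearn", "pandas", "numpy", "matplotlib"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing packages:", missing_packages() or "none")
```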

Installing

Below are the files and the functionality they provide:

Training sentiment models
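One way the training step could look, sketched with NLTK's Naive Bayes classifier and pickle. This is an assumption about the approach, not a copy of the notebook: the word-presence features, the helper names word_features and train_sentiment_classifier, and the tiny inline training set are illustrative only.

```python
import pickle

from nltk import NaiveBayesClassifier


def word_features(text):
    """Map a text to a bag-of-words feature dict: {word: True}."""
    return {word: True for word in text.lower().split()}


def train_sentiment_classifier(labeled_texts):
    """Train a Naive Bayes classifier on (text, label) pairs."""
    training_set = [(word_features(text), label) for text, label in labeled_texts]
    return NaiveBayesClassifier.train(training_set)


# Tiny illustrative training set; real training would use the review files.
examples = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a dull and terrible mess", "neg"),
]
classifier = train_sentiment_classifier(examples)

# Serialize the trained model so later steps can reuse it without retraining.
blob = pickle.dumps(classifier)
restored = pickle.loads(blob)
label = restored.classify(word_features("great story"))
```

The sketch keeps serialization in memory with pickle.dumps; writing the same bytes to a file would let the later notebooks load the model instead of retraining it.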

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use SemVer for versioning. For the versions available, see the tags on this repository.

Authors

Billie Thompson - Initial work - PurpleBooth

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

Hat tip to anyone whose code was used
Inspiration
etc

Folders and files

DB_CSV
Regression features/input
pickledfiles
.gitattributes
1. Training sentiment models .ipynb
2. Pulling new movies data from twitter and adding sentiment to it.ipynb
3. Old Tweets Regression Pulling data from twitter adding sentiment.ipynb
4. Final Linear Regression.ipynb
LICENSE
README.md
Report.pdf
SQL_queries v2.sql
SQL_queries.sql
negative.txt
positive.txt
sentimentmodule.py

souradeepta/Predicting-Boxoffice-Revenue-using-Tweets
