MariannBea/Movie-Studio-Analysis

Movie Studio Analysis

Is it possible to predict whether a movie will earn in the top 25% of profits? If so, which features are useful in making that prediction? That is what this project aims to find out.

Project Description

My goal was to see if I could predict if a film will be in the top 25% of profitability for the year, using factors that movie studios could consider when planning a movie project.
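As a rough illustration of the target described above, the following sketch labels each film as being in the top 25% of profit for its year. The column names (`year`, `budget`, `gross`, `profit`), the helper `label_top_quartile`, and the toy numbers are all assumptions made for this example; they are not taken from the project notebooks.

```python
import pandas as pd


def label_top_quartile(df, year_col="year", profit_col="profit"):
    """Return True where a film's profit is at or above the 75th
    percentile of profits among films from the same year."""
    # Compute the per-year 75th percentile and broadcast it to every row.
    threshold = df.groupby(year_col)[profit_col].transform(
        lambda s: s.quantile(0.75)
    )
    return df[profit_col] >= threshold


# Toy data (hypothetical values, for illustration only).
films = pd.DataFrame({
    "title": list("ABCDEFGH"),
    "year": [2010] * 4 + [2011] * 4,
    "budget": [10, 20, 30, 40, 5, 5, 5, 5],
    "gross": [15, 20, 90, 45, 6, 30, 4, 10],
})
films["profit"] = films["gross"] - films["budget"]
films["top_quartile"] = label_top_quartile(films)
print(films[["title", "year", "profit", "top_quartile"]])
```

Computing the threshold within each year, rather than over the whole data set, keeps films from different decades from being compared directly.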

These are the specific research questions I aimed to answer:

  1. Is there a correlation between genre and profits?
  2. Is there a correlation between content rating and profits?
  3. Are some genres more profitable for specific content ratings?
  4. Does an increase in the number of Facebook likes for directors or actors correlate with increased profits?
  5. Is being in or directing a greater number of movies correlated with more profits?
  6. Are some topics correlated with more profits?
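Questions like 3 and 4 above could be explored with a few lines of pandas. This is only a sketch: the column names (`genre`, `content_rating`, `director_likes`, `profit`) and the values are hypothetical, and the actual analysis in the notebooks may use different methods.

```python
import pandas as pd

# Toy data (hypothetical values, for illustration only).
films = pd.DataFrame({
    "genre": ["Action", "Action", "Comedy", "Comedy", "Drama", "Drama"],
    "content_rating": ["PG-13", "R", "PG-13", "R", "PG-13", "R"],
    "director_likes": [900, 400, 600, 100, 50, 300],
    "profit": [50.0, 20.0, 30.0, 10.0, 5.0, 15.0],
})

# Question 3: median profit for each genre / content rating combination.
median_profit = films.pivot_table(
    index="genre",
    columns="content_rating",
    values="profit",
    aggfunc="median",
)
print(median_profit)

# Question 4: Pearson correlation between director likes and profit.
likes_profit_corr = films["director_likes"].corr(films["profit"])
print(f"Correlation between director likes and profit: {likes_profit_corr:.2f}")
```

A correlation coefficient only describes a linear association; it does not show that more likes cause higher profits.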

Methods Used

  • Data Cleaning
  • Exploratory Data Analysis
  • Inferential Statistics
  • Machine Learning:
    • Linear Regression
    • Decision Trees
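To give a sense of how a decision tree might be applied to the top 25% question, here is a minimal sketch using scikit-learn on synthetic data. It assumes a classification framing; the feature names, the synthetic profit score, and the choice of `class_weight="balanced"` are assumptions for illustration, not the models or settings used in the notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic features (hypothetical stand-ins for planning-stage factors).
X = pd.DataFrame({
    "budget": rng.uniform(1, 200, n),
    "director_likes": rng.integers(0, 20000, n),
    "is_pg13": rng.integers(0, 2, n),
})

# Synthetic profit score; the target marks the top 25% of scores.
score = 0.02 * X["budget"] + 0.0002 * X["director_likes"] + rng.normal(0, 1, n)
y = (score >= score.quantile(0.75)).astype(int)

# Stratify so both splits keep roughly the same share of positives.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A shallow tree; class_weight="balanced" compensates for the 25/75 split.
clf = DecisionTreeClassifier(max_depth=3, class_weight="balanced", random_state=0)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
importances = dict(zip(X.columns, clf.feature_importances_))
print(f"Test accuracy: {accuracy:.2f}")
print(importances)
```

Because only a quarter of films fall in the positive class, accuracy alone can be misleading, so precision and recall are worth checking as well.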

Technologies

  • Python:
    • pandas
    • Jupyter Notebook
    • scikit-learn (sklearn)
    • NLTK

Getting Started

  1. Clone this repo.
  2. Raw data is kept here within this repo.
  • You can also access the data yourself by following the links below:
    • Movie data set of a selection of 5000 movies from 1916 to 2016. It can be obtained from Kaggle.
    • Worldwide inflation data from the World Bank covering the years 1960 to present.
    • Information about movie budget and revenue obtained using The Movie Database API.
  3. Notebooks are stored here.
  4. You will need to install the following packages:

Featured Deliverables

References and Resources

General Research

M. T. Lash and K. Zhao, “Early Predictions of Movie Success: The Who, What, and When of Profitability,” Journal of Management Information Systems, vol. 33, no. 3, pp. 874–903, Jul. 2016, doi: 10.1080/07421222.2016.1243969.

Q. I. Mahmud, N. Z. Shuchi, F. M. Tawsif, A. Mohaimen, and A. Tasnim, “A machine learning approach to predict movie revenue based on pre-released movie metadata,” Journal of Computer Science, vol. 16, no. 6, pp. 749–767, 2020, doi: 10.3844/JCSSP.2020.749.767.

Hollywood Movies Make a Profit by Stephen Follows

Why Do All Hollywood Movies Lose Money? by Alex Mayyasi, Priceonomics.com

Cleaning and Wrangling

Coding for Entrepreneurs, 30 Days of Code, YouTube Video

Coding for Entrepreneurs, 30 Days of Code, Source Code

World Bank Inflation Indicators

Dealing with List Values in Pandas Dataframes by Max Hillsdorf on Medium

Text Analysis & Feature Engineering with NLP by Mauro Di Pietro on Medium

Exploratory Analysis

Accelerate Your Exploratory Data Analysis With Pandas-Profiling by Sukanta Roy

Tutorial: Exploratory Data Analysis (EDA) with Categorical Variables by Erin Hoffman on Medium

Machine Learning

How to Combine Oversampling and Undersampling for Imbalanced Classification by Jason Brownlee on Machine Learning Mastery

Contact

Mariann Beagrie

[email protected]

https://mariannbea.github.io/

About

A project to practice data analysis skills.
