GitHub - rjsandeepkumawat/CODSOFT-TASK-2

Movie Rating Prediction

Author: Sandeep Kumawat

Batch: April 2024 (A48)

Domain: Data Science

Aim

The primary objective of this project is to develop a model that can predict movie ratings based on provided features.

Datasets

This project utilized the following datasets:

Movie_data: Contains preprocessed movie information, including MovieName, Genre, and MovieIDs.
Ratings_data: Contains preprocessed ratings information, including Ratings and Timestamp.
Users_data: Contains preprocessed user information, including Gender, Age, Occupation, and Zip-code.

Libraries Used

The project made use of several essential libraries:

numpy
pandas
matplotlib.pyplot
seaborn
sklearn.preprocessing.LabelEncoder
sklearn.preprocessing.MinMaxScaler
sklearn.model_selection.train_test_split
sklearn.linear_model.LogisticRegression

Data Exploration and Preprocessing

Loaded Movie_data, Ratings_data, and Users_data as DataFrames from separate CSV files.
Eliminated missing values from each DataFrame using dropna(inplace=True).
Displayed the shape and descriptive statistics for each DataFrame using df.shape and df.describe().
Converted the 'Gender' column in Users_data from categorical to numerical values using LabelEncoder.
Horizontally concatenated the DataFrames using pd.concat to create a final dataset df_data.
Dropped unnecessary columns like 'Occupation', 'Zip-code', and 'Timestamp' from the final dataset to create df2.
Removed any remaining missing values from the final dataset df2 using dropna().
Conducted data visualization using count plots and histograms to analyze the distribution of ratings, genders, and age.

Model Training

Created the feature matrix input and target vector target using relevant columns from the final dataset df_final.
Split the data into training and testing sets using train_test_split.
Scaled the input data using MinMaxScaler to normalize the values between 0 and 1.

Model Prediction

Initialized a logistic regression model and trained it on the training data using LogisticRegression.
Employed the trained model to predict movie ratings for the test set using model.predict(X_test).

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
CODSOFT_TASK2.ipynb		CODSOFT_TASK2.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Movie Rating Prediction

Author: Sandeep Kumawat

Batch: April 2024 (A48)

Domain: Data Science

Aim

Datasets

Libraries Used

Data Exploration and Preprocessing

Model Training

Model Prediction

About

Releases

Packages

Languages

rjsandeepkumawat/CODSOFT-TASK-2

Folders and files

Latest commit

History

Repository files navigation

Movie Rating Prediction

Author: Sandeep Kumawat

Batch: April 2024 (A48)

Domain: Data Science

Aim

Datasets

Libraries Used

Data Exploration and Preprocessing

Model Training

Model Prediction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages