zsilver1/Genre-Classifier


NOTE: See Project Report for theory and implementation details.



This repository contains scripts and data for a machine learning project by Sindhuula Selvaraju and Zachary Silver.
It contains two directories:
    1. Data : Raw and parsed data files
    2. Code : Scripts to parse and classify data and compute prediction accuracy

Usage:
    parse_data.py : python parse_data.py <input_file> <output_file> <stopword_file>
    assign_feature_weights.py : python assign_feature_weights.py <input_file> <output_file> <words_file>    
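
The scripts themselves are not reproduced here. As a rough sketch of the kind of stop-word filtering and keyword weighting that the <stopword_file> and <words_file> arguments suggest, the following may help. Everything in it is an assumption made for illustration: the function names, the one-word-per-line stop-word format, whitespace tokenization, and the keyword multiplier of 2.0 are hypothetical and are not taken from parse_data.py or assign_feature_weights.py.

```python
from collections import Counter


def load_stopwords(lines):
    """Build a stop-word set from an iterable of lines.

    Assumes one word per line (hypothetical format); blank lines are skipped.
    """
    return {line.strip().lower() for line in lines if line.strip()}


def weighted_features(text, stopwords, keywords, keyword_weight=2.0):
    """Map each remaining word (a string feature) to a weight.

    Stop words are dropped, the other tokens are counted, and the count of
    any word in `keywords` is multiplied by `keyword_weight` (assumed value).
    """
    tokens = [t for t in text.lower().split() if t not in stopwords]
    counts = Counter(tokens)
    return {
        word: count * (keyword_weight if word in keywords else 1.0)
        for word, count in counts.items()
    }


# Hypothetical inputs, for illustration only.
stopwords = load_stopwords(["the", "a", "of", ""])
features = weighted_features("The ghost of a ghost town", stopwords, {"ghost"})
```

With a real stop-word file, `load_stopwords` could be passed the open file object directly, since iterating over a file yields its lines.
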
Points to note:
    1. The preprocessing/parsing of our raw data involves removing frequently occurring words and phrases and ensuring that keywords necessary for classification get a higher weight.
    2. The feature vector is formed much as it was in our homework assignments, but our features are strings instead of integers.
    3. We're still using the model and predict files to keep track of possible points of failure.

    4. Our neural network is built from stacked "perceptrons." Each
    perceptron takes a vector as input and produces a single number as
    output. The number of perceptrons we create depends on the number of
    nodes in our hidden layers. For example, if a hidden layer has 10
    nodes, each node is the output of one of 10 perceptrons whose input
    is the previous layer. These outputs are then the inputs to a final
    perceptron that gives the final output value.
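
The stacked-perceptron structure described above can be sketched as a forward pass. This is a minimal illustration and not the project's implementation: the sigmoid activation, the function names, the layer sizes, and the fixed weights are all assumptions, and training is omitted entirely.

```python
import math


def perceptron(inputs, weights, bias):
    """One unit: vector in, single number out.

    Computes the weighted sum of the inputs plus a bias and applies a
    sigmoid (the activation is an assumption; the source does not name one).
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """One perceptron per node: each row of weights produces one output."""
    return [perceptron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs, hidden_weights, hidden_biases, out_weights, out_bias):
    """Hidden-layer outputs become the inputs of a final perceptron."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return perceptron(hidden, out_weights, out_bias)


# Hypothetical sizes: 3 inputs, 10 hidden nodes, 1 output.
n_inputs, n_hidden = 3, 10
hidden_weights = [[0.1] * n_inputs for _ in range(n_hidden)]
hidden_biases = [0.0] * n_hidden
out_weights = [0.1] * n_hidden

hidden = layer([1.0, 2.0, 3.0], hidden_weights, hidden_biases)
output = forward([1.0, 2.0, 3.0], hidden_weights, hidden_biases, out_weights, 0.0)
```

Here `hidden` holds 10 numbers, one per hidden-layer perceptron, and `output` is the single value produced by the final perceptron.
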



   
