photomath_junior

Photomath_junior is a simple web app that lets a user submit a photo of a math expression and get the result.

The service runs as a Flask application from the package src/backend. The user is presented with a simple UI that asks them to send an image of a math expression to the server. The server backend is accessed through an image processor interface that exposes the method process_image(image). process_image(image) takes a grayscale image of a mathematical expression and passes it through an internal pipeline with three stages: object detection, which finds the digits and operators in the image; object classification, which determines which mathematical symbol each detected object corresponds to; and finally a math solver, which solves the expression formed by these symbols.
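To make the three stages concrete, here is a minimal sketch of how such a pipeline could be wired together. It is not the code from src/ml: the detect and classify callables, the tokenize and solve helpers, and the symbol set (digits, + - * / and parentheses) are assumptions made for illustration, and the solver is a small recursive-descent evaluator chosen for this example.

```python
from typing import Callable, List, Sequence, Union

Token = Union[int, str]


def tokenize(symbols: Sequence[str]) -> List[Token]:
    """Merge runs of digit symbols into integers; keep other symbols as-is."""
    tokens: List[Token] = []
    number = ""
    for s in symbols:
        if s.isdigit():
            number += s
        else:
            if number:
                tokens.append(int(number))
                number = ""
            tokens.append(s)
    if number:
        tokens.append(int(number))
    return tokens


def solve(symbols: Sequence[str]) -> float:
    """Evaluate classified symbols with a recursive-descent parser (assumed grammar)."""
    tokens = tokenize(symbols)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # expr := term (("+" | "-") term)*
        value = term()
        while peek() in ("+", "-"):
            value = value + term() if take() == "+" else value - term()
        return value

    def term():  # term := factor (("*" | "/") factor)*
        value = factor()
        while peek() in ("*", "/"):
            value = value * factor() if take() == "*" else value / factor()
        return value

    def factor():  # factor := number | "(" expr ")" | "-" factor
        if peek() is None:
            raise ValueError("unexpected end of expression")
        tok = take()
        if isinstance(tok, int):
            return tok
        if tok == "-":
            return -factor()
        if tok == "(":
            value = expr()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return value
        raise ValueError(f"unexpected symbol: {tok!r}")

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected symbol: {tokens[pos]!r}")
    return result


def process_image(image, detect: Callable, classify: Callable) -> float:
    """Hypothetical wiring: detect objects, classify each one, solve the result."""
    regions = detect(image)  # assumed to return objects in left-to-right order
    symbols = [classify(region) for region in regions]
    return solve(symbols)
```

With stand-in stages that treat a string as the "image", process_image("1+2*3", detect=list, classify=str) evaluates to 7, which shows the data flow without needing a trained model.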

If you wish to run the program as a Flask application, first set the FLASK_APP and FLASK_ENV environment variables, then start the server with flask run.

If you wish to run the image processor locally, run the following script (in this example the working directory is the photomath root directory):

python -m src.ml.image_processor {classifier_name} {path_to_image}

where {path_to_image} should be an absolute path to the image you want to process. {classifier_name} is a path relative to the models/ folder, which should be inside the root directory of the project. Both ways of running Photomath_junior therefore depend on the photomath directory containing a models/ directory, with the saved models you want to use for classification in its subdirectories.
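The lookup described above can be sketched as follows. The helper names model_path_for and require_model are hypothetical and do not come from the project; the sketch only assumes that a saved model lives at models/{classifier_name} under the project root.

```python
from pathlib import Path


def model_path_for(project_root, classifier_name):
    """Build the expected model location: <project_root>/models/<classifier_name>."""
    return Path(project_root) / "models" / classifier_name


def require_model(project_root, classifier_name):
    """Return the model path, or fail early with a hint if it has not been trained yet."""
    path = model_path_for(project_root, classifier_name)
    if not path.exists():
        raise FileNotFoundError(
            f"No saved model at {path}; train it first with "
            f"python -m src.ml.classifier {classifier_name}"
        )
    return path
```

Checking for the model before starting the pipeline gives a clearer error than a failure deep inside model loading.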

Because both entry points load saved models, the models must be trained first. Training is done by invoking the classifier script (in this example the working directory is the photomath root directory):

python -m src.ml.classifier {classifier_name}

Invoking this program starts the training. It uses data from the data/processed directory plus the MNIST dataset from TensorFlow to train a digit and operator classifier, then saves the model weights to models/{classifier_name} and the training metrics to metrics/{classifier_name}.png.

The data/processed directory should therefore contain images of operators and parentheses, because the rest of the data (digits) comes from MNIST.

As far as problems are concerned, the biggest one is the similarity between 1 and /, which confuses the model the most. Possible improvements are to increase the dataset size, use data augmentation, and perhaps introduce another model or some heuristics to decide whether an object should be classified as 1 or /.
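One possible heuristic, sketched below, uses the surrounding symbols: a division sign needs an operand on both sides, so a low-confidence / in a position where division is impossible can be flipped to 1. This is not implemented in the project. The function names, the per-symbol probability dictionaries, and the margin threshold are all assumptions made for illustration.

```python
def slash_allowed(prev, nxt):
    """A division sign needs an operand on its left and on its right."""
    left_ok = prev is not None and (prev.isdigit() or prev == ")")
    right_ok = nxt is not None and (nxt.isdigit() or nxt == "(")
    return left_ok and right_ok


def disambiguate(symbols, probs, margin=0.2):
    """Re-check uncertain 1-vs-/ predictions using their neighbours.

    symbols: predicted label per detected object, left to right.
    probs:   one dict per object mapping label -> probability (assumed classifier output).
    margin:  hypothetical threshold; larger gaps between the two classes are trusted as-is.
    """
    resolved = list(symbols)
    for i, p in enumerate(probs):
        if resolved[i] not in ("1", "/"):
            continue
        if abs(p.get("1", 0.0) - p.get("/", 0.0)) > margin:
            continue  # the model is confident enough, keep its prediction
        prev = resolved[i - 1] if i > 0 else None
        nxt = resolved[i + 1] if i + 1 < len(resolved) else None
        if not slash_allowed(prev, nxt):
            resolved[i] = "1"  # division cannot occur here, so it must be a digit
    return resolved
```

The check only ever rules out /; where division is syntactically possible the model's own prediction is kept, so the heuristic cannot fix a 1 that was misread between two operands.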

Metrics comment

The accuracy plot does not show signs of overfitting, and the loss plot backs that up, so the training seems to have gone well. We could also have computed a confusion matrix to see which classes are misclassified most often and which are classified most reliably; that is, we could have calculated precision and recall for each class.
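As a sketch of what that analysis could look like, the snippet below builds a confusion matrix from true and predicted labels and derives per-class precision and recall from it. The function names are illustrative and are not part of the project.

```python
def confusion_matrix(y_true, y_pred, labels):
    """matrix[i][j] = number of samples with true class labels[i] predicted as labels[j]."""
    index = {label: k for k, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_precision_recall(matrix, labels):
    """Return {label: (precision, recall)} computed from a confusion matrix."""
    results = {}
    for k, label in enumerate(labels):
        true_positives = matrix[k][k]
        predicted_as_k = sum(row[k] for row in matrix)  # column sum
        actually_k = sum(matrix[k])                     # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        results[label] = (precision, recall)
    return results
```

A low precision for 1 together with a low recall for / (or the reverse) would directly quantify the 1-versus-/ confusion described earlier.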