by Gabriel and Francisco - developed as an assignment for the Algorithms II course at UFMG
This classifier is capable of creating an approximate linear model to separate classes in a binary fashion, a point belongs or not to a class.
A dataset is projected to 2D via PCA and every class is checked for a viable separation, if one is found a model is created and then evaluated. This is done in several steps:
- Calculate convex hulls for every set of points
- Perform a linear sweeping checking for hull intersections or hulls within hulls, this dictates whether a dataset is separable or not
- If separable, calculate the perpendicular line equation which intersects the middle point of the shortest segment between hulls
- A database can be "made" separable by manipulating points, if manipulated, a point is added to the test database
- Given the line equation, run tests and evaluate the Precision, Recall and F1-Score of the model
As part of the assignment, several algorithms are implemented from scratch:
- Simple line geometry (orientation, turn direction, intersections)
- Graham scan for convex hulls (does not use any trig functions)
- Linear sweeping for intersection tests
- Several other helper algorithms and functions
10 datasets were used to test the model, detailed discussion can be found in the "relatorio_tp1.ipynb" file
Some results can be seen below:
Dataset 3: Dermatology
Classes view
Resulting model
Dataset 1: Iris
Classes view
Resulting model
Dataset 10: E. coli
Classes view
Resulting model (unable to generate)