-
Semantic gap: from pixels to a cat
-
viewpoint variation
-
Illumination
-
deformation of objects
-
occlusion
-
background clutter
-
intraclass variation: different cats
-
train, validation, test
-
cross-validation (not much used in deep learning since training is expensive and data is quite large)
Linear classifiers can be seen from a point of template matching. The most matching sample scores the highest in the corresponding label. A linear classifiers has only one template per category. It doesn't work with multimodal data, in which data of one category is distributed among feature space, not aggregated together. Linearity also limits its use.