diff --git a/README.md b/README.md
index 15960dd..72bc81f 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,31 @@
 # PROCOM: Instance-wise features is all you need 🔥
+## Introduction
+
+The goal of this project is to disambiguate potentially misleading images in order to improve classification performance. When an image contains several objects, the classification task becomes trickier, because the model has to choose between the different objects detected. Imagine an image of a cat sitting next to a dog: it can be labeled either "dog" or "cat", but the model does not know which part of the image refers to the cat and which refers to the dog.
+
+To remove this uncertainty, we remove the other objects from a training image and keep only the part that actually contains the object of the desired class. To do so, we take a batch of images belonging to the same class and use a class-agnostic object detector to identify all the possible objects within each image. Then, among all the objects detected in an image, we have to find the one that best represents the class. To do so, we extract features from the image, such as attention masks. A similarity measure is then computed between those features and all the features extracted from the other images of the batch.
+
+The first underlying hypothesis is that the batch is big enough to contain enough unambiguous images to correctly describe the class. The second hypothesis is that the similarity between the features of those easy images and the corresponding object of the multi-object image is maximal, and is discriminative enough compared to the similarity obtained with unrelated objects.
+
+At the end of the process, we obtain a denoised dataset that precisely describes a class.
+
+### Problem statement:
+
+
+
+### Approach:
+
+Two approaches have been proposed; they differ in the class-agnostic object detector they rely on. One uses LOST, which directly detects objects of "interest" in an image. The other uses SAM, which outputs masks of all the elements present in an image: it "segments everything".
+
+## Diagrams of the two approaches
+
+The first approach uses [LOST](https://arxiv.org/pdf/2109.14279.pdf)
+
+The second approach uses [SAM](https://arxiv.org/pdf/2304.02643.pdf)
+
+
 ## Gantt Chart 🗓️
@@ -37,14 +62,11 @@
 Subject reformulation : active, des11, 2023-09-29, 21d
 Project planning : active, des12, 2023-09-29, 7d
 Risk evaluation : des13, 2023-10-12, 7d
-section Brouillon
-Config :done, des7, 2023-09-29, 7d
-Dataset :active, des8, 2023-09-29, 21d
-NCM :done, des9, 2023-10-05, 7d
-Model :done, des10, 2023-10-05, 7d
 ```
+## Installation
+
 ## KIKIMETER 📈
diff --git a/images/LOST_pipeline.png b/images/LOST_pipeline.png
new file mode 100644
index 0000000..61fc87f
Binary files /dev/null and b/images/LOST_pipeline.png differ
diff --git a/images/SAM_pipeline.png b/images/SAM_pipeline.png
new file mode 100644
index 0000000..7ead247
Binary files /dev/null and b/images/SAM_pipeline.png differ
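
Note on the README text added in this patch: the selection step it describes (choose, among the objects detected in an image, the one most similar to the rest of the same-class batch) can be illustrated with a small sketch. This is not the project's implementation. The README does not name the similarity measure or the scoring rule, so cosine similarity and the "mean of best matches" score are assumptions, and the names `cosine` and `select_class_objects` and the toy feature vectors are hypothetical. Feature extraction and detection (LOST or SAM) are assumed to have happened beforehand.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_class_objects(batch):
    """Return, for each image, the index of the candidate that best matches the batch.

    `batch` is a list of images; each image is a list of candidate feature vectors,
    one per detected object. Assumed scoring rule (not taken from the README):
    a candidate's score is the mean, over the other non-empty images, of its best
    cosine similarity with any candidate of that image.
    Images without candidates yield None.
    """
    selected = []
    for i, candidates in enumerate(batch):
        others = [img for j, img in enumerate(batch) if j != i and img]
        best_idx, best_score = None, -math.inf
        for k, feat in enumerate(candidates):
            if others:
                score = sum(max(cosine(feat, o) for o in img) for img in others) / len(others)
            else:
                score = 0.0  # nothing to compare against
            if score > best_score:
                best_idx, best_score = k, score
        selected.append(best_idx)
    return selected


# Toy batch for the class "cat": two easy single-object images and one
# ambiguous image with a dog-like candidate followed by a cat-like candidate.
batch = [
    [[1.0, 0.1]],
    [[0.9, 0.2]],
    [[0.1, 1.0], [0.95, 0.15]],
]
print(select_class_objects(batch))  # prints [0, 0, 1]
```

In the toy batch, the cat-like candidate of the ambiguous image wins because it is close to the features of both easy images, which mirrors the second hypothesis stated in the README. If the batch contained too few unambiguous images, the scores would no longer separate the two candidates reliably.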