I've been playing around with the code to try to improve book recognition.
These are issues I've noticed:
Sometimes two or more books are grouped together. There seem to be several causes, such as the edge-detection settings, lighting issues, and the fact that grayscale is used.
Slivers are often captured as books when they clearly are not. I think some could be removed simply by checking the width of the sliver and discarding it if it is only a few pixels wide.
Sometimes a spine is split in two. I've noticed this when the spine has a horizontal line in it or when there is a strong specular highlight; basically, anything that produces a horizontal line.
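The sliver and split-spine cases above could be handled as a post-processing pass over the detected boxes. This is only a rough sketch: the `(x, y, width, height)` box format, the function names, and the thresholds (`min_width`, `min_overlap`, `max_gap`) are my assumptions, not anything from the current code, and it assumes spines stand upright in the image.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels; assumed format


def drop_slivers(boxes: List[Box], min_width: int = 8) -> List[Box]:
    """Keep only boxes that are at least min_width pixels wide."""
    return [b for b in boxes if b[2] >= min_width]


def horizontal_overlap(a: Box, b: Box) -> float:
    """Fraction of the narrower box's width that is covered by the other box."""
    left = max(a[0], b[0])
    right = min(a[0] + a[2], b[0] + b[2])
    narrower = min(a[2], b[2])
    return max(0, right - left) / narrower if narrower > 0 else 0.0


def merge_split_spines(
    boxes: List[Box], min_overlap: float = 0.8, max_gap: int = 10
) -> List[Box]:
    """Merge boxes that sit in the same column and are vertically adjacent.

    This targets a spine cut in two by a horizontal line or highlight.
    It is a greedy single pass, so it is a heuristic, not an exact clustering.
    """
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: (b[0], b[1])):
        for i, kept in enumerate(merged):
            # Vertical gap between the two boxes (negative if they overlap).
            gap = max(box[1], kept[1]) - min(box[1] + box[3], kept[1] + kept[3])
            if horizontal_overlap(box, kept) >= min_overlap and gap <= max_gap:
                x0 = min(box[0], kept[0])
                y0 = min(box[1], kept[1])
                x1 = max(box[0] + box[2], kept[0] + kept[2])
                y1 = max(box[1] + box[3], kept[1] + kept[3])
                merged[i] = (x0, y0, x1 - x0, y1 - y0)
                break
        else:
            merged.append(box)
    return merged
```

Running `merge_split_spines(drop_slivers(boxes))` would remove the thin boxes first and then rejoin halves of the same spine. This does nothing for the grouped-books case, which probably needs better segmentation rather than box cleanup.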
I'm wondering whether there is a much better way to detect the spines, and maybe even books in general, such as using AI. If a neural network were set up, it would just be a matter of training it, and the training data might be auto-generated by combining spine images into "shelves".
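To make the auto-generation idea a bit more concrete, here is a rough sketch of the labelling side: given the pixel sizes of individual spine images, lay them out side by side as a fake shelf and emit bounding-box labels in the normalized `class cx cy w h` text format that YOLO-style trainers use. The function names, the bottom alignment, and the `gap` parameter are assumptions of mine; actually pasting the images together (e.g. with Pillow or OpenCV) is left out.

```python
import random
from typing import List, Optional, Tuple

Size = Tuple[int, int]           # (width, height) of one spine image in pixels
Box = Tuple[int, int, int, int]  # (x, y, width, height) on the synthetic shelf


def layout_shelf(
    spine_sizes: List[Size], gap: int = 2, seed: Optional[int] = None
) -> Tuple[int, int, List[Box]]:
    """Place spines left to right in random order, aligned to the shelf bottom.

    Returns (shelf_width, shelf_height, boxes), one box per spine.
    """
    rng = random.Random(seed)
    order = list(spine_sizes)
    rng.shuffle(order)
    shelf_height = max((h for _, h in order), default=0)
    boxes: List[Box] = []
    x = 0
    for w, h in order:
        boxes.append((x, shelf_height - h, w, h))
        x += w + gap
    shelf_width = max(x - gap, 0)  # no trailing gap after the last spine
    return shelf_width, shelf_height, boxes


def to_yolo_labels(
    boxes: List[Box], shelf_width: int, shelf_height: int, class_id: int = 0
) -> List[str]:
    """Convert pixel boxes to 'class cx cy w h' lines normalized to [0, 1]."""
    lines = []
    for x, y, w, h in boxes:
        cx = (x + w / 2) / shelf_width
        cy = (y + h / 2) / shelf_height
        lines.append(
            f"{class_id} {cx:.6f} {cy:.6f} "
            f"{w / shelf_width:.6f} {h / shelf_height:.6f}"
        )
    return lines
```

Each generated shelf would then get one image file and one label file with these lines. Real shelves also have leaning books, shadows, and varied lighting, so synthetic data like this would probably need augmentation to transfer well.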
I'm not sure how we can make the current algorithm more robust. I get a lot of false hits and strange results, which makes it a bit unusable. I'm not saying it can't get the data, but it's too inaccurate to be fully automated.
Some AI models can already detect things like books, so I wonder whether one could be used plug-and-play: the model finds the books, they are clipped out of the image, and the crops are fed into the OCR to get info about them.
E.g., https://www.freecodecamp.org/news/how-to-detect-objects-in-images-using-yolov8/
https://cocodataset.org/#explore
You can see it is pretty good at recognizing books.
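A rough sketch of what that plug-and-play pipeline might look like, assuming the `ultralytics` package with COCO-pretrained `yolov8n.pt` weights for detection and `pytesseract` plus Pillow for OCR. None of this is in the current code, the function names and the `min_conf` threshold are made up for illustration, and I have not tried it on shelf photos. The heavy imports are kept inside the functions so the filtering helper can be used on its own.

```python
from typing import List, Tuple

Detection = Tuple[str, float, Tuple[float, float, float, float]]  # label, conf, xyxy
CropBox = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def select_book_boxes(
    detections: List[Detection], min_conf: float = 0.4
) -> List[CropBox]:
    """Keep confident 'book' detections as integer crop boxes, left to right."""
    boxes = [
        tuple(int(round(v)) for v in xyxy)
        for label, conf, xyxy in detections
        if label == "book" and conf >= min_conf
    ]
    return sorted(boxes)


def detect_books(
    image_path: str, weights: str = "yolov8n.pt", min_conf: float = 0.4
) -> List[CropBox]:
    """Run a pretrained YOLOv8 model and return crop boxes for detected books."""
    from ultralytics import YOLO  # assumed dependency, imported lazily

    result = YOLO(weights)(image_path)[0]
    detections = [
        (result.names[int(b.cls[0])], float(b.conf[0]), tuple(b.xyxy[0].tolist()))
        for b in result.boxes
    ]
    return select_book_boxes(detections, min_conf)


def ocr_books(image_path: str, boxes: List[CropBox]) -> List[str]:
    """Crop each box out of the image and run OCR on the crop."""
    from PIL import Image  # assumed dependencies, imported lazily
    import pytesseract

    image = Image.open(image_path)
    return [pytesseract.image_to_string(image.crop(box)).strip() for box in boxes]
```

Usage would be something like `ocr_books(path, detect_books(path))`. Spine text is usually rotated, so the crops may need rotating before OCR, and a generic COCO model may still merge tightly packed spines, so fine-tuning on shelf images might be needed.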