Submitted by Rodny_ t3_yyenpp in MachineLearning
Rodny_ OP t1_iwwd1ta wrote
Reply to comment by Jean-Porte in [P]Modern open-source OCR capabilities and which model to choose by Rodny_
Yeah, on one hand it seems like a problem that is quite easy to solve, but the more you dig, the more problems and obstacles you find. And then it makes you wonder why such a basic task is so hard to solve with easy-to-use tools, while text-to-image models get so much attention and have such accessible tools.
visarga t1_iwwdkai wrote
Because it's a lucrative AI API for all the big players. Selling OCR for documents.
AtomKanister t1_iwwx4hs wrote
Might also be the data. The open internet is full of images with related text that can be crawled, but you won't find a lot of document scans with annotated boxes out there.
However, it's definitely doable. The paid services from cloud providers are all very, very high quality. It's more likely an open source availability issue.