flapflip9 t1_iwumecm wrote

Look into open-mmlab's MMOCR. It does both text detection and recognition, with support for English and Chinese. Absolutely wicked recognition quality: it pulls text off logos, flyers, blurred text, etc. It's not suitable for real-time use, though.

Until a few years ago, I was quite happy with Tesseract, but it has fallen behind since then. It's still good for scanning printed text or similar, and it supports a lot of languages.

5

robertknight2 t1_iwveveb wrote

To add to this: Tesseract recognizes text in already-identified lines using a modern approach based on LSTM neural networks. The text detection step that comes before it, however, uses classical/heuristic (i.e. non-ML) approaches. These work well on clean-ish document images, but can struggle with photos of documents that have uneven lighting, and with spotting text in a general photo (e.g. number plates in a city scene).
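
For anyone wondering why uneven lighting trips up a classical pipeline, here is a toy sketch in plain Python. To be clear, this is not Tesseract's actual detection code. The synthetic image, the function names (`make_image`, `global_threshold`, `local_mean_threshold`, `count_errors`), the window radius and the offset are all made up for illustration. It just compares a single global brightness cutoff against a cutoff computed from each pixel's neighbourhood, on an image whose background gets brighter from left to right.

```python
def make_image(width=40, height=8):
    """Synthetic grayscale 'photo' (hypothetical test data).

    The background brightens from left to right (uneven lighting), and
    every 4th column holds 'ink' that is 60 levels darker than the
    background at that position.
    """
    img, truth = [], []
    for _ in range(height):
        row, truth_row = [], []
        for x in range(width):
            background = 60 + 4 * x
            is_ink = (x % 4 == 0)
            row.append(background - 60 if is_ink else background)
            truth_row.append(is_ink)
        img.append(row)
        truth.append(truth_row)
    return img, truth


def global_threshold(img, cutoff=128):
    """Mark a pixel as ink if it is darker than one fixed cutoff."""
    return [[px < cutoff for px in row] for row in img]


def local_mean_threshold(img, radius=4, offset=10):
    """Mark a pixel as ink if it is clearly darker than its neighbourhood.

    The cutoff for each pixel is the mean of the surrounding window
    (clipped at the image borders) minus a small offset.
    """
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            row.append(img[y][x] < mean - offset)
        out.append(row)
    return out


def count_errors(mask, truth):
    """Number of pixels where the predicted mask disagrees with the truth."""
    return sum(
        m != t
        for mask_row, truth_row in zip(mask, truth)
        for m, t in zip(mask_row, truth_row)
    )


img, truth = make_image()
print("global cutoff errors:", count_errors(global_threshold(img), truth))
print("local cutoff errors: ", count_errors(local_mean_threshold(img), truth))
```

On this synthetic image the fixed cutoff mistakes the dim left-hand background for ink and misses the ink on the bright right-hand side, while the neighbourhood-based cutoff follows the lighting gradient. Real photos are much messier than this, so treat it as intuition rather than a fix.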

I maintain a JavaScript build of Tesseract with an online demo that you can try with different images: https://robertknight.github.io/tesseract-wasm/

6