best foss app for OCR?

(⬤ᴥ⬤)@lemmy.blahaj.zone · 5 months ago

best foss app for OCR?

eco_game@discuss.tchncs.de · 5 months ago

https://github.com/Akylas/OSS-DocumentScanner

I use this app for scanning documents. I just tried sharing a picture to the app and running OCR on it, which also worked fine, so it should fit your use case too.

(⬤ᴥ⬤)@lemmy.blahaj.zone · 5 months ago

seems promising! thanks

Majestix@lemmy.world · 5 months ago

What’s the use-case?

(⬤ᴥ⬤)@lemmy.blahaj.zone · 5 months ago

mostly i just use it to extract text from screenshots to use in image descriptions

Audalin@lemmy.world · 5 months ago

Like Firefox ScreenshotGo? (I think it only supports English though)

Vinny_93@lemmy.world · 5 months ago

Not particularly Android related, but I’m fairly certain you can do some OCR with Python. The question is whether you want to analyse an image file or read straight from the camera. The latter might be a challenge.

(⬤ᴥ⬤)@lemmy.blahaj.zone · 5 months ago

i have posted this in an android community because i want to do it with an android app :/
unless there’s a way to easily run a python program with all the necessary dependencies on android this does not help me

chrash0@lemmy.world · 5 months ago

no need for Python. there’s a Google SDK, ML Kit, that will do the heavy lifting on this. if that’s not acceptable, TensorFlow, PyTorch, and ONNX support Android, albeit not as nicely integrated.

your image processing pipeline will be imageSource -> RGB encoding -> OCR -> profit. your OCR just needs an RGB encoded image. doesn’t matter if that’s a JPEG or YUV video feed at the source.
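to make the "RGB encoding" step concrete, here's a rough sketch of turning a YUV camera frame into RGB pixels in plain Java. everything in it is an assumption for illustration: the class and method names are made up, the frame is assumed to be NV21 (full Y plane followed by interleaved V/U at half resolution) with an even width, and the coefficients are the full-range BT.601 ones. a real app would more likely let the platform or the OCR SDK handle this conversion.

```java
// Hypothetical helper, not taken from any of the apps or SDKs mentioned in this thread.
final class Nv21ToRgb {

    // Keep a channel value inside the valid 8-bit range.
    static int clamp(int value) {
        return Math.max(0, Math.min(255, value));
    }

    // Convert one YUV sample to a packed ARGB int.
    // Assumes full-range BT.601 coefficients (an assumption, cameras may differ).
    static int yuvToArgb(int y, int u, int v) {
        int d = u - 128;
        int e = v - 128;
        int r = clamp((int) Math.round(y + 1.402 * e));
        int g = clamp((int) Math.round(y - 0.344136 * d - 0.714136 * e));
        int b = clamp((int) Math.round(y + 1.772 * d));
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }

    // Convert a whole NV21 frame to ARGB pixels, row by row.
    // Each 2x2 block of pixels shares one V/U pair stored after the Y plane.
    static int[] nv21ToArgb(byte[] nv21, int width, int height) {
        int frameSize = width * height;
        int[] argb = new int[frameSize];
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                int y = nv21[row * width + col] & 0xFF;
                int uvIndex = frameSize + (row / 2) * width + (col / 2) * 2;
                int v = nv21[uvIndex] & 0xFF;      // NV21 stores V first
                int u = nv21[uvIndex + 1] & 0xFF;  // then U
                argb[row * width + col] = yuvToArgb(y, u, v);
            }
        }
        return argb;
    }
}
```

the resulting int array is the kind of thing you could wrap in a bitmap and pass to whatever OCR engine you end up using. a JPEG source skips all of this, since decoding it already gives you RGB.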

as for whether there’s an app that fits OP’s exact use case, dunno.

filister@lemmy.world · 5 months ago

Man, you overcomplicated this task. OP, on F-Droid there are three apps that are based on a popular OCR Python library called Tesseract. Just search for that name on F-Droid and give those apps a try.

chrash0@lemmy.world · 5 months ago

i mean, you’re right. i’m just saying it’s a little silly to ship a Python interpreter when there are easier, better supported ways to do the same thing.

looks like tesseract provides C bindings which are probably being utilized in those apps.