Not Total Recall (1990)
ordellrb@lemmy.world to linuxmemes@lemmy.world · 29 days ago · image post · 57 comments
R00bot@lemmy.blahaj.zone · 28 days ago
I can’t imagine it’d be that hard to write some code that does that using an existing AI model.
MacN'Cheezus@lemmy.today · 28 days ago
> I can’t imagine it’d be that hard to write some code that does that using an existing AI model.
You’re probably right.
Llava and Bakllava are two Ollama models that can not only extract text but also describe what’s happening on screen. Using tesseract-ocr, as the other guy suggested, is probably simpler and less resource-intensive, though.
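For anyone who wants to try both routes, here is a rough Python sketch, not a finished tool. It assumes the `tesseract` command-line program is installed and that an Ollama server is running locally on its default port with the `llava` model already pulled. The function names and the default prompt are made up for illustration; only the `tesseract` invocation and Ollama's `/api/generate` request shape come from those tools.

```python
import base64
import json
import subprocess
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tesseract_command(image_path, lang="eng"):
    """Build the tesseract CLI call; 'stdout' prints the text instead of writing a file."""
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_screenshot(image_path, lang="eng"):
    """Run tesseract on a screenshot and return the recognized text."""
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout.strip()


def llava_payload(image_bytes, prompt="Describe what is happening on this screen.",
                  model="llava"):
    """Build the JSON body for /api/generate; images are sent base64-encoded."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def describe_screenshot(image_path, model="llava"):
    """Send a screenshot to the local Ollama server and return the model's description."""
    with open(image_path, "rb") as f:
        payload = llava_payload(f.read(), model=model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `ocr_screenshot("shot.png")` covers the cheap text-only case, while `describe_screenshot("shot.png", model="bakllava")` would try the heavier description route with the other model mentioned above. Grabbing the screenshots on a timer and storing the results is left out here.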