Comment by Grom_PE
3 months ago
On my Linux system, I hooked up xfce4-screenshooter's "custom action" to a shell script (ocr.sh %f) with tesseract like so:
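#!/bin/bash
# OCR the image given as $1 and copy the recognized text to the clipboard.
# pipefail is not supported by every /bin/sh (e.g. older dash), hence bash.
set -o pipefail
lang=${2:-eng}  # optional second argument: tesseract language code
if tesseract "$1" - -l "$lang" | xclip -selection clipboard; then
    notify-send "Text copied"
else
    notify-send "Could not copy text"
fi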
It works great most of the time, combined with xfce4-screenshooter's ability to select a rectangle.
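For anyone wondering what the `set -o pipefail` line buys you: without it, the `if` only sees the exit status of the last command in the pipeline (xclip), so a tesseract failure would still report "Text copied". A minimal sketch of the difference, using `false | true` as a stand-in for "tesseract fails, xclip succeeds". The two helper names are made up for illustration, and the pipelines are run through `bash -c` because not every /bin/sh supports the option.

```shell
# Illustrative helpers (hypothetical names): each prints the exit status
# of a pipeline whose first command fails and whose last command succeeds.
status_without_pipefail() { bash -c 'false | true; echo $?'; }
status_with_pipefail()    { bash -c 'set -o pipefail; false | true; echo $?'; }

echo "without pipefail: $(status_without_pipefail)"  # prints 0, failure is hidden
echo "with pipefail: $(status_with_pipefail)"        # prints 1, failure is reported
```

With pipefail on, the pipeline's status is that of the last command that failed, so the `else` branch fires when tesseract cannot read the file.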
When the text is especially difficult for tesseract, I can use Gemma3-4B via llama.cpp's llama-mtmd-cli, but that takes a minute.
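The comment does not show the llama-mtmd-cli invocation, so the following is only a sketch of how the two tools could be chained: try tesseract first and fall back to the slower model when tesseract returns nothing but whitespace. The function name `ocr_image`, the model and projector paths, the prompt, and the "empty output means failure" rule are all assumptions, not the author's setup.

```shell
# Hypothetical paths; point these at your own GGUF model and projector files.
MODEL=${MODEL:-$HOME/models/gemma-3-4b-it.gguf}
MMPROJ=${MMPROJ:-$HOME/models/mmproj-gemma-3-4b-it.gguf}

# ocr_image IMAGE [LANG]: print recognized text on stdout.
# Assumption: blank tesseract output is treated as "too difficult".
ocr_image() {
    text=$(tesseract "$1" - -l "${2:-eng}" 2>/dev/null)
    if [ -z "$(printf '%s' "$text" | tr -d '[:space:]')" ]; then
        # Slow path: ask the multimodal model to transcribe the image.
        text=$(llama-mtmd-cli -m "$MODEL" --mmproj "$MMPROJ" --image "$1" \
            -p "Transcribe the text in this image exactly." 2>/dev/null)
    fi
    printf '%s\n' "$text"
}
```

Usage would be something like `ocr_image shot.png | xclip -selection clipboard`. In practice tesseract often returns garbage rather than nothing on hard images, so a manual trigger for the slow path may still be the more reliable choice.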
https://0x0.st/K9hq.png