Gentoo Archives: gentoo-user

From: R0b0t1 <r030t1@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] extracting text, numbers from screencasts
Date: Sat, 09 Apr 2016 05:38:06
Message-Id: CAAD4mYjgF+_TzFWDdmjPewHnKpZzAjxeG0AgKNBBVEk0HqQd5g@mail.gmail.com
In Reply to: Re: [gentoo-user] extracting text, numbers from screencasts by "Urs Schütz"
Reading GUIs is a lot easier than most things tesseract was designed for.
You may still need a little preprocessing; I suggest OpenCV.

Basically: enlarge the image, filter it to remove noise (sharpening, blob
detection, and perhaps one more pass), then threshold it to get a
black-and-white image.

Most of that is unneeded for a GUI. OpenCV helps greatly in the situations
Schütz describes. Modifying the character data gives better results but is
more time-intensive.
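
To make the enlarge / filter / threshold steps above concrete, here is a minimal sketch in plain Python on a grayscale image stored as a list of rows of 0-255 integers. It is an illustration, not a tested pipeline: the function names are made up for this example, and a 3x3 median filter stands in for the noise-removal step (the sharpening and blob detection mentioned above are not implemented here).

```python
from statistics import median


def enlarge(img, factor):
    """Nearest-neighbour upscale: repeat every pixel `factor` times per axis."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def median_filter(img):
    """3x3 median filter; coordinates are clamped at the image border."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        new_row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            new_row.append(int(median(window)))
        out.append(new_row)
    return out


def threshold(img, cutoff=128):
    """Binarise: pixels at or above `cutoff` become 255, the rest become 0."""
    return [[255 if p >= cutoff else 0 for p in row] for row in img]


def preprocess(img, factor=2, cutoff=128):
    """Enlarge, denoise, then threshold (the order suggested above)."""
    return threshold(median_filter(enlarge(img, factor)), cutoff)
```

With OpenCV the same three steps would roughly map to cv2.resize, cv2.medianBlur and cv2.threshold, and the result would then be handed to tesseract. The scale factor and cutoff here are arbitrary defaults and would need tuning per screencast.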