Gentoo Archives: gentoo-user

From:	R0b0t1 <r030t1@×××××.com>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] extracting text, numbers from screencasts
Date:	Sat, 09 Apr 2016 05:38:06
Message-Id:	`CAAD4mYjgF+_TzFWDdmjPewHnKpZzAjxeG0AgKNBBVEk0HqQd5g@mail.gmail.com`
In Reply to:	Re: [gentoo-user] extracting text, numbers from screencasts by "Urs Schütz"

Reading GUIs is a lot easier than most of what tesseract was designed
for. You may still need a little preprocessing; I suggest OpenCV.

Basically: enlarge the image, filter it to remove noise (sharpening,
blob detection, and perhaps another filter), then threshold it to get a
black-and-white image.

Most of that is unneeded for GUI captures. OpenCV helps greatly in the
situations Schütz describes. Modifying the character data is better,
but more time-intensive.
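
The preprocessing steps above (enlarge, filter, threshold) can be
sketched roughly as follows. This is only an illustration in plain
Python on a grayscale image stored as a list of rows, not OpenCV code.
The function names, the 3x3 median filter used as the noise filter, and
the cutoff of 128 are all assumptions chosen for the example, not
something from the thread.

```python
from statistics import median


def enlarge(img, factor):
    """Nearest-neighbour upscale: repeat every pixel `factor` times per axis."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def median_filter(img):
    """3x3 median filter (assumed noise filter); borders use what neighbours exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        new_row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            new_row.append(int(median(window)))
        out.append(new_row)
    return out


def threshold(img, cutoff=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if p >= cutoff else 0 for p in row] for row in img]


def preprocess(img, factor=2, cutoff=128):
    """Enlarge, denoise, then binarise, in the order described above."""
    return threshold(median_filter(enlarge(img, factor)), cutoff)
```

With OpenCV the same steps would presumably map onto cv2.resize,
cv2.medianBlur (or whichever filter suits the footage) and
cv2.threshold, with the result handed to tesseract. For clean GUI
captures the filtering step can likely be skipped.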