From: | R0b0t1 <r030t1@×××××.com> | ||
---|---|---|---|
To: | gentoo-user@l.g.o | ||
Subject: | Re: [gentoo-user] extracting text, numbers from screencasts | ||
Date: | Sat, 09 Apr 2016 05:38:06 | ||
Message-Id: | CAAD4mYjgF+_TzFWDdmjPewHnKpZzAjxeG0AgKNBBVEk0HqQd5g@mail.gmail.com | ||
In Reply to: | Re: [gentoo-user] extracting text, numbers from screencasts by "Urs Schütz" |

Reading GUIs is a lot easier than most of what tesseract was designed for.
You may still need a little preprocessing; I suggest OpenCV.

Basically: enlarge the image, filter it to remove noise (sharpening, blob
detection, and perhaps one more pass), then threshold it to get a
black-and-white image.

Most of that is unnecessary for GUI text. OpenCV helps greatly in the
situations Schütz describes. Modifying the character data is better but
more time-intensive.