Gentoo Archives: gentoo-user

From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] multi-region OCR
Date: Wed, 30 Nov 2016 09:48:00
Message-Id: 3315342.vUnCsu5vlo@eve
In Reply to: [gentoo-user] multi-region OCR by Michael Mol
On Tuesday, November 29, 2016 01:33:48 PM Michael Mol wrote:
> So, I've got scans of a half dozen new hard drives, and I've got scans of
> their labels. One image has two drives, the other has four.
>
> Rather than manually transcribing the label contents into my intake ticket,
> I'd like to select a region of each image and OCR it. (Darn, it'd be handy
> if they put all this metadata into a QR code...)
>
> What tools exist to let me do this? Keep in mind, I've got multiple regions
> I need to OCR, and the regions aren't going to be consistent across images.
>
> xsane would have let me do it during the scan process if I'd thought of it
> then, but the scans are done, drives aren't there any more. Something
> reasonably similar would be nice. Okular is reputed to have some OCR
> capability, but I can't find it. Dolphin is supposed to be able to do it if
> you have tesseract installed (I do), but I can't find the service to
> enable. I could use some pointers...

Quick search:

https://help.ubuntu.com/community/OCR

This page contains example scripts for several OCR tools.

--
Joost

PS. I once used a similar approach to make a PDF from an HR department
searchable. They typed a document in MS Word, printed it, then scanned it into
a PDF... Merging the PDF with the OCR results worked quite nicely as well.
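
For the multi-region case in the original question, a rough sketch of one way
to script it: crop each region with ImageMagick, then hand the crop to
tesseract. This is only an illustration, not something taken from the page
linked above. It assumes the `convert` (ImageMagick) and `tesseract` binaries
are on the PATH; the `Region` type, the file names and the pixel coordinates
are made up for the example, and you would read the real coordinates off the
scan in an image viewer.

```python
import subprocess
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    """A rectangle on the scan, in pixels from the top-left corner (hypothetical helper)."""
    name: str
    x: int
    y: int
    width: int
    height: int


def crop_command(image: str, region: Region, out_png: str) -> List[str]:
    # ImageMagick geometry is WIDTHxHEIGHT+X+Y; +repage drops the old canvas offset.
    geometry = f"{region.width}x{region.height}+{region.x}+{region.y}"
    return ["convert", image, "-crop", geometry, "+repage", out_png]


def ocr_command(png: str, lang: str = "eng") -> List[str]:
    # Using "stdout" as the output base makes tesseract print the text
    # instead of writing a .txt file.
    return ["tesseract", png, "stdout", "-l", lang]


def ocr_regions(image: str, regions: List[Region]) -> Dict[str, str]:
    """Crop every region out of one scan and return the OCR text per region name."""
    results = {}
    for region in regions:
        cropped = f"{region.name}.png"
        subprocess.run(crop_command(image, region, cropped), check=True)
        done = subprocess.run(
            ocr_command(cropped), check=True, capture_output=True, text=True
        )
        results[region.name] = done.stdout.strip()
    return results


# Example call (file name and coordinates are placeholders):
# labels = ocr_regions("drives-scan-1.png", [
#     Region("drive_a", 120, 340, 900, 400),
#     Region("drive_b", 1400, 340, 900, 400),
# ])
```

Because each region is just a name plus a rectangle, the regions can differ
freely from one image to the next. OCR quality on small label print tends to
depend a lot on scan resolution, so expect to proofread the output.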