On March 1, 2021 12:50:35 PM GMT+01:00, Wols Lists <antlists@××××××××××××.uk> wrote:
>I've got a bunch of scans, let's assume they're text documents. And
>they're rather big ... I want to email them.
>
>How on earth do I convert them to TRUE b&w documents? At the moment they
>are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>to store all the colour, luminance, whatever, per pixel. But actually,
>there's only ONE BIT of information there - whether that pixel is black
>or white.
>
>I'm using imagemagick, but so far all my attempts to strip out the
>surplus information have resulted in INcreasing the file size ???
>
>So basically, how do I save an image as "one bit per pixel" like you'd
>think you'd send to a B&W printer?
>
>Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>uncompressed info for a page of A4, not 3MB.
>
>Cheers,
>Wol
>
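
As a rough check of the size estimate quoted above, here is a minimal Python sketch. It assumes A4 is 210 x 297 mm, one bit per pixel, no compression, and each row padded to a whole byte; the function name is made up for illustration.

```python
import math

MM_PER_INCH = 25.4


def bilevel_page_bytes(width_mm, height_mm, dpi):
    """Uncompressed size in bytes of a page stored at 1 bit per pixel."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    # Eight pixels fit in one byte; pad each row up to a whole byte.
    bytes_per_row = math.ceil(width_px / 8)
    return bytes_per_row * height_px


# One square inch at 300 dpi, then a full A4 page at 300 dpi.
per_square_inch = 300 * 300 // 8
a4_bytes = bilevel_page_bytes(210, 297, 300)
print(f"{per_square_inch} bytes per square inch")
print(f"{a4_bytes} bytes for A4 at 300 dpi")
```

By this count an uncompressed A4 page comes out a little over 1 MB, somewhat more than 800KB but still well under 3MB.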
|
Have you tried optical character recognition software such as Tesseract[1]?
|
1. https://github.com/tesseract-ocr/tesseract
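
In case it helps, a minimal sketch of driving the Tesseract command line from Python. It assumes the tesseract binary is installed and on PATH; the file names and both function names are hypothetical. Tesseract takes an input image and an output base name, and an optional trailing config name such as "pdf" selects the output format.

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, output_format="txt"):
    """Build the argument list: tesseract IMAGE OUTPUTBASE [CONFIG]."""
    command = ["tesseract", image_path, output_base]
    # Plain text is the default output, so only add a config for other formats.
    if output_format != "txt":
        command.append(output_format)
    return command


def ocr_scan(image_path, output_base, output_format="txt"):
    """Run Tesseract on one scan and return the expected output file name."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(
        tesseract_command(image_path, output_base, output_format),
        check=True,
    )
    return f"{output_base}.{output_format}"


# Hypothetical usage: ocr_scan("scan.jpg", "scan") would write scan.txt
```

Plain text output is tiny compared with the scans, at the cost of losing the page layout.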
|
--
Hund