On 01/03/21 12:11, (Nuno Silva) wrote:
> On 2021-03-01, Wols Lists wrote:
>
>> I've got a bunch of scans, let's assume they're text documents. And
>> they're rather big ... I want to email them.
>>
>> How on earth do I convert them to TRUE b&w documents? At the moment they
>> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>> to store all the colour, luminance, whatever, per pixel. But actually,
>> there's only ONE BIT of information there - whether that pixel is black
>> or white.
>>
>> I'm using imagemagick, but so far all my attempts to strip out the
>> surplus information have resulted in INcreasing the file size ???
>>
>> So basically, how do I save an image as "one bit per pixel" like you'd
>> think you'd send to a B&W printer?
>>
>> Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>> uncompressed info for a page of A4, not 3MB.
>>
>> Cheers,
>> Wol
>
> Somebody else might have a better suggestion, or perhaps a better
> understanding of the JPEG format and of what needs to be tuned, but, for
> example:
>
> convert origin.jpg -threshold 70% -monochrome result.jpg
>
> (And adjust the "-threshold percent" if needed. It might be that you
> don't need thresholding at all, but if you do, it apparently must go
> before "-monochrome".)
>
> (Depending on the receiving end, you could also explore other
> formats. Here, if the scanned document can be stored in monochrome, I
> usually use djvu.)
>
Thanks but no, I've already tried that. It makes matters worse!

I've messed about with the scanner, so it is now creating 800KB images,
but I don't want to rescan everything I've done.
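
For reference, a quick check of the back-of-envelope figure quoted above, as a minimal sketch. It assumes an A4 page (210 x 297 mm) comes out at 2480 x 3508 pixels at 300dpi; the variable names are only for illustration.

```shell
# Uncompressed size of a 1-bit-per-pixel A4 page at 300dpi.
# Assumption: A4 (210 x 297 mm) rounds to 2480 x 3508 pixels at 300dpi.
width_px=2480
height_px=3508

per_sq_inch=$((300 * 300 / 8))     # bytes per square inch at 1 bit/pixel
bits=$((width_px * height_px))     # one bit per pixel
bytes=$((bits / 8))                # whole page, no compression

echo "bytes per square inch: $per_sq_inch"
echo "bytes per A4 page:     $bytes"
```

That works out to roughly 11KB per square inch and a little over 1MB per page before any compression, so the same ballpark as the estimate, and still well under 3MB.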
|
The problem is that it is clearly saving the images as greyscale, not as
black&white. And when I search for help, what I want is swamped by all
the false positives for greyscale.
|
Oh - and for Nuno - sorry, tesseract is no use, they are NOT text. That's
why I used the word "assume" - to make it clear that I want a
1-bit/pixel palette, not a 5-byte/pixel greyscale.
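
To make the goal concrete, here is a rough sketch of the kind of conversion being asked for. It is not a tested recipe for these particular scans. It assumes the output may be something other than JPEG: JPEG has no 1-bit mode, so writing result.jpg always stores at least 8 bits per sample and the hard black/white edges compress badly. The helper name to_bilevel and the 50% default threshold are made up for illustration; -threshold, -type bilevel and -compress Group4 are standard ImageMagick options.

```shell
# Hypothetical helper: convert one scan to a 1-bit (bilevel) image.
# Usage: to_bilevel INPUT OUTPUT [THRESHOLD]
#   e.g. to_bilevel page01.jpg page01.png
#        to_bilevel page01.jpg page01.tif 70%
to_bilevel() {
    src=$1
    dst=$2
    thr=${3:-50%}    # assumed default; adjust per scan

    # -threshold forces every pixel to pure black or pure white;
    # -type bilevel asks ImageMagick to store the result as 1 bit/pixel.
    case $dst in
        *.tif|*.tiff)
            # CCITT Group 4 is the fax-style compression for bilevel TIFF
            convert "$src" -threshold "$thr" -type bilevel \
                -compress Group4 "$dst" ;;
        *)
            convert "$src" -threshold "$thr" -type bilevel "$dst" ;;
    esac
}
```

Whether PNG or Group 4 TIFF comes out smaller will depend on the scans, and the receiving end has to be able to open whichever format is chosen.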
|
Cheers,
Wol