On 2021-03-01, Wols Lists wrote:

> On 01/03/21 12:11, (Nuno Silva) wrote:
>> On 2021-03-01, Wols Lists wrote:
>>
>>> I've got a bunch of scans, let's assume they're text documents. And
>>> they're rather big ... I want to email them.
>>>
>>> How on earth do I convert them to TRUE b&w documents? At the moment they
>>> are jpegs that weigh in at 3MB, and I guess they're using about 5 bytes
>>> to store all the colour, luminance, whatever, per pixel. But actually,
>>> there's only ONE BIT of information there - whether that pixel is black
>>> or white.
>>>
>>> I'm using imagemagick, but so far all my attempts to strip out the
>>> surplus information have resulted in INcreasing the file size ???
>>>
>>> So basically, how do I save an image as "one bit per pixel" like you'd
>>> think you'd send to a B&W printer?
>>>
>>> Even at 300dpi, I make that 300*300/8 ~= 10KB/in^2 or 800KB of
>>> uncompressed info for a page of A4, not 3MB.
>>>
>>> Cheers,
>>> Wol
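
A quick check of that estimate, as a rough sketch in plain POSIX shell
arithmetic. The function name onebit_page_bytes is made up for
illustration, and it assumes an uncompressed bitmap with each row padded
to a whole number of bytes and no file headers:

```shell
#!/bin/sh
# Uncompressed size, in bytes, of a 1-bit-per-pixel page.
# Arguments: width in mm, height in mm, resolution in dpi (integers).
# onebit_page_bytes is a hypothetical helper, not part of any tool.
onebit_page_bytes() {
    width_mm=$1 height_mm=$2 dpi=$3
    # mm -> pixels, rounded to nearest (1 inch = 25.4 mm).
    width_px=$(( (width_mm * dpi * 10 + 127) / 254 ))
    height_px=$(( (height_mm * dpi * 10 + 127) / 254 ))
    # Assumption: each row is padded up to a whole number of bytes.
    row_bytes=$(( (width_px + 7) / 8 ))
    echo $(( row_bytes * height_px ))
}

# A4 is 210 x 297 mm.
onebit_page_bytes 210 297 300
```

That prints 1087480, so roughly 1.1MB uncompressed for A4 at 300dpi: a
bit above the 800KB guess, but still well under the 3MB JPEGs.
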
>>
>> Somebody else might have a better suggestion, or perhaps a better
>> understanding of the JPEG format and of what needs to be tuned, but, for
>> example:
>>
>> convert origin.jpg -threshold 70% -monochrome result.jpg
>>
>> (And adjust the "-threshold percent" if needed. It might be that you
>> don't need thresholding at all, but if you do, it apparently must go
>> before "-monochrome".)
>>
>> (Depending on the receiving end, you could also explore other
>> formats. Here, if the scanned document can be stored in monochrome, I
>> usually use djvu.)
>>
> Thanks but no, I've already tried that. It makes matters worse!
>
> I've messed about with the scanner, so it is now creating 800KB images,
> but I don't want to rescan everything I've done.
>
> The problem is that it is clearly saving the images as greyscale, not as
> black&white. And when I search for help, what I want is swamped by all
> the false positives for greyscale.
>
> Oh - and for Nuno - sorry tesseract is no use, they are NOT text. That's
> why I used the word "assume" - to make it clear that I want a
> 1-bit/pixel palette, not a 5-byte/pixel greyscale.
>
> Cheers,
> Wol

Sorry, my bad - I was checking the file sizes, but I didn't notice the
larger one was the new, "monochrome" version. More coffee needed, it
seems.
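
As far as I know, JPEG has no 1-bit mode, and hard black/white edges
tend to compress poorly with it, so writing the thresholded result to a
different format may be worth a try. A minimal sketch, assuming
ImageMagick's "convert" is on the PATH; the helper name to_bilevel and
the 50% default are my own placeholders, not anything from the scans in
question:

```shell
#!/bin/sh
# Convert an image to a 1-bit (bilevel) file with ImageMagick.
# Usage: to_bilevel SOURCE DESTINATION [THRESHOLD_PERCENT]
# to_bilevel is a hypothetical helper; 50 is an arbitrary starting point.
to_bilevel() {
    src=$1 dst=$2 threshold=${3:-50}
    # -threshold forces every pixel to pure black or pure white;
    # -type bilevel and -depth 1 request 1-bit storage where the
    # output format (chosen by the file extension) supports it.
    convert "$src" -threshold "${threshold}%" -type bilevel -depth 1 "$dst"
}

# Example calls (file names are placeholders):
#   to_bilevel origin.jpg result.png
#   to_bilevel origin.jpg result.png 70
```

If the receiving end can read TIFF, adding "-compress Group4" and
writing to a .tif file is another option to try, since that compression
was designed for bilevel fax pages.
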

-- 
Nuno Silva |