Gentoo Archives: gentoo-user

From: Philip Webb <purslow@××××××××.net>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] PDF puzzle
Date: Sat, 14 Jan 2012 12:16:52
Message-Id: 20120114121531.GF3088@ca.inter.net
In Reply to: Re: [gentoo-user] PDF puzzle : files available by Florian Philipp
120114 Florian Philipp wrote:
> On 14.01.2012 04:21, Philip Webb wrote:
>> 120113 Florian Philipp wrote:
>>> Try the pdfdebugger provided by dev-java/pdfbox to inspect both files.
>> That needs Java, which I am definitely not going to re-install (smile).
> Well, I'll resist the temptation to start a flamewar over this ;)

Yes, definitely (grin).

>> I've now uploaded the files above & everyone can inspect their structure:
>> http://www.chass.utoronto.ca/~purslow/test/
>> I've renamed the PDFs to show their origin, ie LibreOffice + Ghostscript.
> I can reproduce the behavior with my LO.
> I've inspected the files with pdfdebugger. The LO-version really
> contains more, but nothing which seems to justify the difference.
> The content streams of each page seem to be better compressed by LO.
> Cups-PDF creates a smaller PDF than ps2pdf, probably because it outputs
> PDF-1.5. Otherwise it is identical to the other ghostscript outputs.
> Out of curiosity, I removed all pictures from an old report (25 pages)
> and tested that. There, Cups-PDF creates larger files than LO
> although the internal structure is similar to what you've provided.
> So I guess, all we can say is that their performance is inconsistent.

I suspect the difference in font resources (see other msg)
causes the difference in file sizes.

I discovered Pdf2ps among the Ghostscript binaries.
It offers a very simple means of reducing PDF size :

570: lit> pdf2ps boox.pdf boox-test.ps
572: lit> ps2pdf boox-test.ps boox-test.pdf
573: lit> ls -l
-rw-r--r-- 1 purslow purslow 67184 Jan 13 04:07 boox-gs.pdf
-rw-r--r-- 1 purslow users 366711 Jan 13 04:05 boox.pdf
-rw-r--r-- 1 purslow purslow 65695 Jan 14 06:52 boox-test.pdf
-rw-r--r-- 1 purslow purslow 407541 Jan 14 06:52 boox-test.ps

Ie take the LO PDF, convert it to PS with Pdf2ps,
then back to PDF with Ps2pdf & you get a file of about the same size
(65695 vs 67184 bytes) as the version
produced by creating a PS in LO & then converting that to PDF.
Perhaps this is not surprising, but it may help with PDFs from elsewhere.
This doesn't work if the PDF contains images, but for purely text files
it makes sense, when using LO, to create the PDF via PS rather than directly.
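
The two-step round trip can be wrapped in a small shell function. This is only a sketch: the name shrink_pdf is made up for illustration and is not part of Ghostscript, and it assumes pdf2ps and ps2pdf are on the PATH and that the input is a text-only PDF.

```shell
# Sketch: round-trip a text-only PDF through PostScript to shrink it.
# shrink_pdf is a hypothetical helper name; pdf2ps and ps2pdf come with Ghostscript.
shrink_pdf() {
    in=$1
    out=$2
    # Temporary file for the intermediate PostScript
    tmp=$(mktemp "${TMPDIR:-/tmp}/shrink.XXXXXX") || return 1
    # PDF -> PS -> PDF, stopping at the first failure
    if pdf2ps "$in" "$tmp" && ps2pdf "$tmp" "$out"; then
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
    # Report both sizes in bytes so the result can be compared
    old=$(( $(wc -c < "$in") ))
    new=$(( $(wc -c < "$out") ))
    echo "$in: $old bytes -> $out: $new bytes"
}
```

It would be called as `shrink_pdf boox.pdf boox-test.pdf`. The original file is left untouched, so the two sizes can be compared before deciding which copy to keep.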

PS I'm also heartened to discover that Pdftk still works without Java.

--
========================,,============================================
SUPPORT ___________//___, Philip Webb
ELECTRIC /] [] [] [] [] []| Cities Centre, University of Toronto
TRANSIT `-O----------O---' purslowatchassdotutorontodotca