On 14.04.2011 23:30, Florian Philipp wrote:
> On 14.04.2011 23:08, Liviu Andronic wrote:
>> Dear all
>> What is your experience with corrupted PDF files? Do you know any tool
>> that can attempt to repair damaged PDF files? Does it make any sense
>> to edit a PDF file in hex mode?
>>
>> I have a damaged PDF that cannot be opened with any of the roughly 10
>> tools that I've just tried.
>> liv@liv-laptop:/tmp$ pdf2ps Class\ 1.pdf
>> **** Warning: File has a corrupted %%EOF marker, or garbage after %%EOF.
>> **** Warning: An error occurred while reading an XREF table.
>> **** The file has been damaged. This may have been caused
>> **** by a problem while converting or transfering the file.
>> **** Ghostscript will attempt to recover the data.
>> Error: /typecheck in --run--
>> Operand stack:
>> --nostringval-- --nostringval-- 1
>> Execution stack:
>> %interp_exit .runexec2 --nostringval-- --nostringval--
>> --nostringval-- 2 %stopped_push --nostringval--
>> --nostringval-- --nostringval-- false 1 %stopped_push 1878
>> 1 3 %oparray_pop 1877 1 3 %oparray_pop 1861 1 3
>> %oparray_pop --nostringval-- --nostringval-- --nostringval--
>> --nostringval-- --nostringval-- --nostringval--
>> Dictionary stack:
>> --dict:1155/1684(ro)(G)-- --dict:1/20(G)-- --dict:75/200(L)--
>> --dict:75/200(L)-- --dict:108/127(ro)(G)-- --dict:288/300(ro)(G)--
>> --dict:20/25(L)-- --dict:1/10(L)--
>> Current allocation mode is local
>> GPL Ghostscript 8.71: Unrecoverable error, exit code 1
>>
>> liv@liv-laptop:/tmp$ pdftops Class\ 1.pdf
>> Error: PDF file is damaged - attempting to reconstruct xref table...
>> Error: Top-level pages object is wrong type (null)
>> Error: Couldn't read page catalog
>>
>> Any ideas how I could try to repair it? (It's not sensitive and it's
>> small, so I could post it.) I tried pdftk, but it also fails.
>> liv@liv-laptop:/tmp$ pdftk Class\ 1.pdf output Class\ 11.pdf
>> java.lang.NullPointerException
>> at com.lowagie.text.pdf.PdfReader$PageRefs.iteratePages(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader$PageRefs.readPages(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader$PageRefs.<init>(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader$PageRefs.<init>(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader.readPages(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader.readPdf(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader.<init>(itext-2.1.7.jar.so)
>> at com.lowagie.text.pdf.PdfReader.<init>(itext-2.1.7.jar.so)
>> Error: Unexpected Exception in open_reader()
>> Error: Failed to open PDF file:
>> Class 1.pdf
>> Errors encountered. No output created.
>> Done. Input errors, so no output created.
>>
>> Regards
>> Liviu
>>
>>
>
> Well, you could try app-text/qpdf from the benf overlay. It has an
> option to suppress recovery of damaged files, so it looks like it at
> least tries to repair them by default.
>
> If you don't want to install layman and overlays, you can send me the
> file off-list and I'll take a look.
>
> One good thing about PDF is that its structure is stored uncompressed
> (AFAIK it only compresses text and binary data with zlib since version
> 1.2). This means that it might be at least partially recoverable.
>
> Regards,
> Florian Philipp
>

Okay, qpdf cannot handle this. From a quick inspection of the file you
sent me off-list, it looks like the file is cut off partway through: it
simply ends in the middle of object 19 ("19 0 obj"). I guess this file is
the product of an aborted download?
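
For anyone who wants to repeat that kind of quick inspection without a
hex editor, here is a minimal Python sketch. It is only a heuristic under
stated assumptions: it scans the raw bytes for object headers, "endobj"
keywords and a trailing %%EOF, so binary stream data that happens to
contain those byte sequences can skew the counts. The function name
truncation_report is made up for illustration and is not part of any tool
mentioned in this thread.

```python
import re


def truncation_report(data: bytes) -> dict:
    """Collect rough signs that a PDF was cut off mid-file (heuristic)."""
    # Object headers look like "19 0 obj"; each should be closed by "endobj".
    opened = re.findall(rb"(\d+)\s+(\d+)\s+obj\b", data)
    closed = len(re.findall(rb"\bendobj\b", data))
    # A complete file ends with %%EOF, possibly followed by whitespace.
    has_eof = data.rstrip().endswith(b"%%EOF")
    last = opened[-1] if opened else None
    return {
        "objects_opened": len(opened),
        "objects_closed": closed,
        "ends_with_eof": has_eof,
        # (object number, generation) of the last header seen, if any
        "last_object": (int(last[0]), int(last[1])) if last else None,
    }


if __name__ == "__main__":
    import sys

    # Usage: python truncation_report.py "Class 1.pdf"
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            print(truncation_report(fh.read()))
```

More opened than closed objects together with a missing %%EOF is what one
would expect from a download that stopped early; "last_object" then points
at the object where the file ends.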

I guess you could fire up your favorite hex editor and try to add the
necessary stream and object end markers, adjust the length indicator for
the last object and hope that some tool can then do the rest. However,
that's a lot of work for a 36k PDF that is incomplete anyway.
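
The marker part of that manual repair could be sketched in Python as
follows. This is a rough illustration, not a tested recovery procedure:
it assumes the file really ends inside its last object, finds that object
by a plain byte search (which binary stream data could fool), appends
"endstream" and "endobj" where they are missing, and reports how many
stream bytes are actually present so the /Length entry can be corrected
by hand. The function name close_truncated_object is hypothetical.

```python
from typing import Optional, Tuple


def close_truncated_object(data: bytes) -> Tuple[bytes, Optional[int]]:
    """Append missing end markers to a PDF that stops inside its last object.

    Returns the patched bytes and, if an open stream was found, the number
    of stream bytes that are actually present (a candidate /Length value).
    """
    start = data.rfind(b" obj")
    if start == -1 or b"endobj" in data[start:]:
        return data, None  # nothing that looks like an unclosed object
    rest = data[start:]
    stream_len = None
    tail = b""
    kw = rest.find(b"stream")
    if kw != -1 and b"endstream" not in rest:
        # Stream bytes begin after the end-of-line that follows "stream".
        body = kw + len(b"stream")
        if rest[body:body + 2] == b"\r\n":
            body += 2
        elif rest[body:body + 1] == b"\n":
            body += 1
        stream_len = len(rest) - body
        tail += b"\nendstream"
    tail += b"\nendobj\n"
    return data + tail, stream_len


if __name__ == "__main__":
    import sys

    # Usage: python close_truncated_object.py broken.pdf patched.pdf
    if len(sys.argv) == 3:
        with open(sys.argv[1], "rb") as fh:
            patched, length = close_truncated_object(fh.read())
        with open(sys.argv[2], "wb") as fh:
            fh.write(patched)
        print("stream bytes present:", length)
```

The patched file still has no cross-reference table or trailer, so a tool
with its own recovery mode would have to rebuild those, and anything that
was stored after the cut-off point stays lost. If the page tree was in the
missing part, this will not be enough.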

Regards,
Florian Philipp