Gentoo Archives: gentoo-user

From: Mick <michaelkintzios@×××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] OT: extract an image from a .doc file?
Date: Sun, 13 Dec 2009 12:51:25
Message-Id: 200912131250.23511.michaelkintzios@gmail.com
In Reply to: Re: [gentoo-user] OT: extract an image from a .doc file? by Stroller
1 On Sunday 13 December 2009 12:12:46 Stroller wrote:
2 > On 13 Dec 2009, at 10:50, Mick wrote:
3 > > On Sunday 13 December 2009 08:46:05 Stroller wrote:
4
5 > If I open the file(s) I have the interest in, the first 4 entries in the
6 > context-menu are the same, but after the first separator I get instead
7 > "Object" (which did not appear previously) and "Caption". There is then
8 > another separator and instead of Cut, Copy, Paste, I see only Cut & Copy.
9
10 This indicates that the graphic in question is an embedded MSWindows file. If
11 you were able to double click on it in MSWindows it would read its metadata
12 and launch the respective MSWindows application for editing it; e.g. MSPaint,
13 PPt, Excel and what not. With OOo this API linkage is not there I guess, so
14 all you can do is cut/copy it.
15
16 > This file was created by the software that a lettings agency uses to manage
17 > their properties. It runs on Windows and automatically generates letters
18 > (for overdue rent, inspections &c) in .doc format. One image in question
19 > is the boss' signature, so the letters appear like he actually signed
20 > them, but I think they also use company logos in other letters.
21
22 I guess that whoever created this image did not save it as a 'conventional'
23 image, e.g. jpeg, png, etc, and therefore OOo cannot deal with it as it would
24 with a normal image.
25
26 > Apart from that, I don't see why this image is treated differently by
27 > OpenOffice.
28
29 Because it is not an 'image' but an embedded MSWindows file in the MSWord
30 document with loads of its own proprietary metadata.
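
One rough way to test that guess is to scan the raw bytes of the .doc for the signatures of conventional image formats. The following is a minimal sketch in Python, not a proper parser: the function name and the choice of signatures are my own assumptions, and because the .doc container can split a stream across non-adjacent sectors, a hit only suggests that a PNG or JPEG is in there somewhere. No hits would be consistent with the object being stored in some other form.

```python
# Byte signatures of two conventional image formats (illustrative, not exhaustive).
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def find_image_signatures(data):
    """Return a sorted list of (offset, kind) for every signature found in data."""
    hits = []
    for kind, magic in SIGNATURES.items():
        start = 0
        while True:
            pos = data.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, kind))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)
```

To use it, read the file in binary mode, e.g. `find_image_signatures(open("letter.doc", "rb").read())`, where `letter.doc` is a placeholder name.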
31
32 > Isn't there a program (command line?) for converting .doc into HTML? Maybe
33 > that would extract the image.
34
35 I think that MSWord has either a SaveAs or an export function which will
36 convert the file into HTML. Also OOo has File/Preview as HTML, which will
37 convert the document into html and open it in a browser - if the graphics look
38 correct then you could save it from within the browser.
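
For thousands of files a command-line route would be less tedious than the GUI. The sketch below assumes an office suite whose `soffice` binary accepts `--headless --convert-to` (recent LibreOffice builds do; whether the OOo version installed here does is something to check first). The helper names are hypothetical, and the paths are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir, target="html", binary="soffice"):
    """Return the argument list for a headless conversion of one document."""
    return [
        binary,
        "--headless",
        "--convert-to", target,
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def convert(doc_path, out_dir, target="html"):
    """Run the conversion and return the path where the output is expected."""
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_convert_command(doc_path, out_dir, target), check=True)
    return Path(out_dir) / (Path(doc_path).stem + "." + target)
```

With an HTML target, any graphics the converter can handle should land as separate files next to the .html output, which is where to look for the signature image.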
39
40 > The reason I'd like to see this is because some of the .doc files are 2 meg
41 > in size (some others exactly 1meg, so cluster size may affect this) and
42 > there are thousands of them taking up space on the server. If the image is
43 > to blame then we would benefit many times from the size saving. I haven't
44 > yet spoken to the site about this, only discovering it yesterday, so I
45 > don't know if I can find the file by accessing the property management
46 > software.
47
48 Have you looked at what size you get with pdf'ing them?
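
To put numbers on that comparison, something like the following could total the bytes per file extension under a directory after converting a sample of the letters. It is only a sketch with a made-up function name, and it measures file length rather than allocated disk blocks, so cluster-size effects will not show up.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(root):
    """Return {extension: total bytes} for all regular files under root."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            totals[path.suffix.lower()] += path.stat().st_size
    return dict(totals)
```

Comparing the `.doc` and `.pdf` entries of the result gives a first idea of the saving.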
49 --
50 Regards,
51 Mick

Attachments

File name MIME type
signature.asc application/pgp-signature