PDF file way too big

This is the place to discuss Help & Manual 4 issues. Please don't post questions on any other versions here!

Moderators: Alexander Halser, Tim Green

Roy Lasris
Posts: 55
Joined: Fri Jul 06, 2007 9:31 pm
Location: Yorktown, Virginia, USA

I took a 'Master Help File' that I had created, with the goal of cutting out all but the bare startup essentials, and deleted almost all of the substance except for the single 'startup' topic. Fully 90% of the original file is gone.

The original file was 20.5 MB. This new file is 14 MB, well over half the size of the original, when it should be only about one-tenth the size.

The fonts are not embedded. The images are all gifs and small.

What else should I look for to shrink this file (or what tricks can I apply)?

My goal was a quick transfer to interested users of a very small file, but that didn't happen. :{

Roy
Roy Lasris
Pathagoras
www.pathagoras.com
Tim Green
Site Admin
Posts: 23154
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Roy,

Are your images resized in Help & Manual? If they are, the full-sized original images are still being exported to the PDF. Another thing that can inflate PDFs, apart from embedding fonts, is switching on the glyph font export option in the font embedding section, which will effectively export all your text as graphics.
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
Roy Lasris
Posts: 55
Joined: Fri Jul 06, 2007 9:31 pm
Location: Yorktown, Virginia, USA

The sum total of all of the images in the new file is 280 KB. That's a pretty insignificant number compared to the total file size of 14 MB. (Double it for the 'toggle' big and small versions and the number is still small.)

The total number of pages after the PDF export is 34. (As I mentioned, this is one tenth the size of the original file, 345 pages. The original file had tons of images, but 20 MB seemed an appropriate size for that file.)
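Put on a per-page basis, the figures quoted so far make the mismatch easier to see. The following is only a rough arithmetic sketch; treating 1 MB as 1024 KB is an assumption, and the variable names are illustrative.

```python
# Sizes and page counts as quoted in the thread.
original_mb, original_pages = 20.5, 345
trimmed_mb, trimmed_pages = 14.0, 34


def kb_per_page(size_mb: float, pages: int) -> float:
    """Average PDF size per page in kilobytes (assumes 1 MB = 1024 KB)."""
    return size_mb * 1024 / pages


original_rate = kb_per_page(original_mb, original_pages)
trimmed_rate = kb_per_page(trimmed_mb, trimmed_pages)
ratio = trimmed_rate / original_rate

print(f"original: {original_rate:.0f} KB/page")  # original: 61 KB/page
print(f"trimmed:  {trimmed_rate:.0f} KB/page")   # trimmed:  422 KB/page
print(f"ratio:    {ratio:.1f}x")                 # ratio:    6.9x
```

A trimmed file that costs roughly seven times more per page than the image-heavy original suggests something other than the visible topics and images is being written to the PDF.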

I didn't change any settings re: font embedding from the 'big' document to the 'small.'

Font embedding is 'on' but I have excluded Arial, TNR and Courier. And I have only used those fonts in the document.

My CID font mode is 'off.'

Optimize embedded fonts for cross platform compatibility is 'on' (checked).

Glyph font export option is 'off' (i.e., not checked)

Any other place for me to look?

Thanks,

Roy
Dean Whitlock
Posts: 577
Joined: Thu Sep 01, 2005 5:59 pm
Location: Thetford Center, Vermont USA

Hi Roy,

This is a complete shot in the dark, but try running Repair and Recover on the new file. Since you started with an existing file and then made massive cuts, it's possible that there is something broken in the .hmx file that is causing the PDF compile to embed all of your fonts. (When in doubt, try everything.)

Good luck,
Dean
Tim Green
Site Admin
Posts: 23154
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Roy,

I would second Dean's suggestion. There is no way that the project you describe could produce a 14 MB PDF; that's physically impossible. There must be something else in it causing the inflation, although I doubt that running the repair will help, because a PDF only contains topics actually listed in the TOC. If this doesn't help, please mail the project to support and we'll have a look at it. Please pack just the .HMX file and the graphics in a ZIP or RAR archive (necessary to prevent files from being stripped by some mail providers) and mail it to support AT ec-software.com (replace the AT with @).
Regards,
Tim (EC Software Documentation & User Support)

Dean Whitlock
Posts: 577
Joined: Thu Sep 01, 2005 5:59 pm
Location: Thetford Center, Vermont USA

I don't know if this applies here, but it's related. I have a project with many screenshots in it (173) and about 280 pages of text. It was compiling a PDF that was 10 mbytes. I decided to trim the screenshots as much as possible and also removed a few lingering PNG files and 256-color BMPs. The gain was about 500 kbytes. I tried using the screen driver instead of the generic HP LaserJet 5 driver, but that made the file about 200 mbytes larger. I was not embedding any TrueType fonts at all. My final recourse was to try different compression options in Project Properties > PDF Options.

I set Text Compression to be Deflate (max, recommended). My Image Quality setting had been JPEG 100%, which was giving me a size of 9585 kbytes and so-so print quality (liveable, but only just).

I changed Image Quality to JPEG 75%. The difference in print quality was not detectable, and the file size dropped to 6876 kbytes.

Just to see what the best possible print quality would look like, I changed Image Quality to No Compression (best for screen shots). This gave me very good print quality (much, much better than the two jpeg settings). And the file size was now... 2408 kbytes! :shock: :? 8)

I don't know what's going on, but I'll take it. Is it possible there is a bug in the PDF compiler or in the HM4 interface to it? Is it possible that this will also happen in HM5? My next project is to convert this manual to version 5, so I guess I'll find out one way or the other.

Dean
Winsteps
Posts: 443
Joined: Wed Aug 29, 2007 7:34 am

Dean, are you telling us that an optimized PDF file was 10 MB, but the unoptimized PDF was 2 MB? Then that is paradoxical...
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Dean Whitlock wrote: Just to see what the best possible print quality would look like, I changed Image Quality to No Compression (best for screen shots). This gave me very good print quality (much, much better than the two jpeg settings). And the file size was now... 2408 kbytes! :shock: :? 8)

I don't know what's going on, but I'll take it.

Hi Dean,

It's been said many times on here, so I'm a bit surprised you are using JPG format for screenshots. JPG is not just unsuitable for screenshots -- it's a total disaster. File size is often vastly increased over a simple run-length encoding native in the PDF format.

The P in JPG stands for photographic, and that's what the format is for -- real world images created with a camera or scanner. It makes a complete pig's ear of encoding solid-tone screen graphics.
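To illustrate why run-length encoding suits solid-tone screen graphics, here is a minimal sketch. The scanline values are hypothetical, the random jitter is only a stand-in for lossy-compression noise (not real JPEG output), and nothing here reflects how the Help & Manual PDF engine is actually implemented.

```python
import random
from itertools import groupby


def rle_runs(row):
    """Count the (value, run_length) pairs needed to run-length encode a row."""
    return sum(1 for _ in groupby(row))


# Hypothetical 300-pixel scanline: grey toolbar, white panel, dark button.
clean = [200] * 100 + [255] * 150 + [30] * 50

# Stand-in for lossy artefacts: jitter every pixel by a few levels.
rng = random.Random(0)
noisy = [min(255, max(0, p + rng.randint(-3, 3))) for p in clean]

clean_runs = rle_runs(clean)
noisy_runs = rle_runs(noisy)
print(clean_runs)  # 3
print(noisy_runs)  # far more runs than the clean row
```

The clean row collapses to three runs, while the jittered row breaks into many short runs, so a run-length encoder has much more to store for the same picture.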

regards,

Martin.
Tim Green
Site Admin
Posts: 23154
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Dean,

As Martin says, JPEGs are the worst possible format for screenshots and this is exacerbated in PDF because the JPEGs need to be converted back to BMP (because PDF can only handle BMP as the base format), and additional JPEG compression is then applied after that. In addition to this, JPEGs are always True Color (16 million colors), so this is also inflating the size in the conversion to BMP.

What is happening in your case is this: First your screenshots, which originally consist of nice clean areas of single colors with sharp borders, are saved as JPEGs with lossy compression, which introduces "fuzziness" in both the single-color areas and the borders. These are then converted to BMP with all the fuzzy JPEG "artefacts", which make lossless compression less efficient by several orders of magnitude. Inside the PDF the images are then compressed and additional JPEG compression is added (if you turn it on), further exacerbating the situation. The 16 million colors do further damage in terms of file size.
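The effect of "fuzziness" on lossless compression can be reproduced in miniature. This is a hedged sketch, not a model of the PDF compiler: it builds a hypothetical 24-bit bitmap of flat color areas, adds small random noise as a stand-in for JPEG artefacts, and compares Deflate output sizes using Python's standard `zlib` module.

```python
import random
import zlib

WIDTH, HEIGHT = 200, 100  # hypothetical image dimensions


def make_clean_row() -> bytes:
    """One 24-bit RGB scanline: left half light grey, right half white."""
    half = WIDTH // 2
    return bytes([192, 192, 192]) * half + bytes([255, 255, 255]) * (WIDTH - half)


# Clean "screenshot": every scanline is identical flat color.
clean = make_clean_row() * HEIGHT

# Stand-in for lossy artefacts: nudge every color byte by up to two levels.
rng = random.Random(0)
noisy = bytes(min(255, max(0, b + rng.randint(-2, 2))) for b in clean)

clean_size = len(zlib.compress(clean, 9))
noisy_size = len(zlib.compress(noisy, 9))

print(f"raw bytes:       {len(clean)}")
print(f"clean, deflated: {clean_size}")
print(f"noisy, deflated: {noisy_size}")
```

Both buffers hold the same number of raw bytes, but the noisy one deflates to a far larger result because the repeated patterns that Deflate relies on are gone.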

If you use clean screenshots saved as BMP with 256 colors you will get much better results and much smaller files...
Regards,
Tim (EC Software Documentation & User Support)
