Merge HTML into PDF

Please post all questions and comments regarding Help & Manual 7 here.

Tobias Escher
Posts: 202
Joined: Mon Dec 28, 2015 7:32 pm

I have a section in a project that is published as WebHelp and extensively uses the MERGE command to put external HTML files into topics. These files are created externally, they have to be HTML, and I have no way to influence that.
Now customers also want the content of these files as part of our PDF documentation. H+M strips the merged HTML from the PDF output. Is there any way around that? For example, could H+M render the HTML inside the editor when the topic is opened, so that it becomes "regular" text?
The problem is that these HTML files need to be loaded dynamically upon publishing and change. So it is not an option to "rebuild" the content of these HTML files inside H+M. They change too frequently...
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

No, this isn't possible because PDF isn't HTML. It isn't really much of anything, actually. In reality, a PDF is just the raw data that would normally be sent to a printer, stored in a file and displayed with a printer driver that puts the output on the screen instead of on paper. There is almost nothing recognizable inside a PDF apart from plain text, because everything else consists of commands for drawing on the screen. :?
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

But Tim, you can Load Topic From File... and load an HTML file which will be imported into the editor as normal RVF editor content, and can be output to PDF in the normal way. So the HTML import functions already exist.

It should be possible to have a function which does the same with an HTML Code Object at PDF compile time, including any #MERGE file it contains. Probably it would be necessary to create it as a single-cell table in order to have it in a separate RVF document. Or perhaps the HTML could be opened in THTMLView and transferred into the RVF as a metafile.

Possibly it would slow down compilation if there were lots of them in the project, but it still seems doable.

regards,

Martin.
Tobias Escher
Posts: 202
Joined: Mon Dec 28, 2015 7:32 pm

wow - that would be AWESOME!!!!
Please, please, please, please, please build this in!
Load topic from file basically does exactly what I want. I just need it to do the loading every time the project compiles.
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Tobias Escher wrote:wow - that would be AWESOME!!!!
Please, please, please, please, please build this in!
Load topic from file basically does exactly what I want. I just need it to do the loading every time the project compiles.
Hi Tobias,

I agree it would be a very useful option. It's important that it applies only when compiling to non-HTML formats such as PDF, PRINT, WORD, otherwise it defeats the whole purpose of having the HTML Code Objects.

On reflection it might be implemented by converting the HTML Code Object to a linked Snippet so that the imported HTML file can be in a separate XML file. This would avoid creating a single-cell table.

regards,

Martin.
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Tobias, Martin,

The problem here is that the #MERGE command is not working at edit time but at publish time, and the HTML never comes anywhere near the Help+Manual editor. It is merged into the HTML file. The function you describe might be possible, but it would be completely different and new, with no relation at all to what #MERGE is doing now. :?
Regards,
Tim (EC Software Documentation & User Support)

Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Tim Green wrote:The function you describe might be possible, but it would be completely different and new, with no relation at all to what #MERGE is doing now. :?
Yes Tim, that's the whole point.

When compiling to HTML-based outputs it would work exactly as now.

When compiling to non-HTML outputs (PDF, WORD, etc.), HTML Code Objects would be imported into the editor in the same way as Load Topic From File does now (which imports HTML files).

Given that the HTML import function clearly exists, it doesn't seem too much of a jump to apply it at compile time when creating non-HTML outputs. It might very significantly affect compile speed, which is why it needs to be an option setting.

regards,

Martin.
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Martin Wynne wrote: Given that the HTML import function clearly exists, it doesn't seem too much of a jump to apply it at compile time when creating non-HTML outputs. It might very significantly affect compile speed, which is why it needs to be an option setting.
I'm aware of that, but I'm always cautious about jumping to conclusions with things like this, because the Devil is in the details and many things that seem simple and straightforward at first turn out to be anything but. I've passed it on to the dev team and we'll see what comes of it. 8)
Regards,
Tim (EC Software Documentation & User Support)

Tobias Escher
Posts: 202
Joined: Mon Dec 28, 2015 7:32 pm

This feature would be a dream come true.
In my special case, I need to merge HTML files at publish time, and I need to make PDFs out of them soon. There is no way around that. I can't use any other format; I need to merge the HTML files.
In a perfect world I'd be a billionaire, and then I could decide these things. As it stands, I am not, so if my bosses say HTML, then HTML it is! :)
So this feature would save me hours every week.
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Tobias, Martin,

We have implemented a trial of HTML snippets in the current beta of 7.3, which you can download here:

http://download.ec-software.com/help-an ... 0-beta.exe

Please don't get your hopes up too much, however. There are serious restrictions that you need to consider: The HTML must be converted to HM XML and then back again, and there are problems with images and links that may or may not be solvable for you, depending on how your HTML files are and can be formatted.

How to use it:
  1. Use the Snippet tool with the From File option, navigate to the folder containing the HTML snippet file and type the file name manually into the Open field, but with the extension .xml. For example, if the file is called myfile.html, type in myfile.xml and then press ENTER.
  2. You will get an error message. This is important, because it means you have passed through a parsing step. Now manually change the extension in the Open field back to the file's real extension (.html in this example) and press ENTER again. The file will now be read in without any errors.
CSS will be parsed as far as possible, if it is included in the file.

IMPORTANT: After inserting the snippet, you can't edit the insertion by double-clicking it at the moment.

Now let's look at all the problems involved:

Graphics:

Only the file names of the graphics can be imported. NO path information can or will be taken into account. If your images are not stored in folders located in the project's search path, they won't be found and won't be used.
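
Since only bare file names survive the import, it can help to check beforehand which images an HTML file references and whether they exist in the folders you have on the search path. The following is only a rough Python sketch under stated assumptions: it is not part of Help+Manual, the function names are made up for illustration, the list of folders has to be supplied by you, and it finds images with a simple regular expression rather than a full HTML parser.

```python
import re
from pathlib import Path, PurePosixPath

# Matches the src attribute of an <img> tag, quoted with " or '.
# A regex is a simplification; unusual markup may need a real HTML parser.
IMG_SRC = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1',
                     re.IGNORECASE | re.DOTALL)


def image_names(html):
    """Return the bare file names of all images referenced in the HTML.

    Path information is dropped, mirroring the restriction that only
    the names of graphics are imported. Duplicates are listed once.
    """
    names = []
    for match in IMG_SRC.finditer(html):
        src = match.group(2).replace("\\", "/")  # tolerate Windows paths
        name = PurePosixPath(src).name
        if name and name not in names:
            names.append(name)
    return names


def missing_images(html, search_dirs):
    """Return image names that exist in none of the given folders.

    search_dirs is a hypothetical list of folders that you have also
    configured as image folders in your project.
    """
    return [
        name for name in image_names(html)
        if not any((Path(folder) / name).is_file() for folder in search_dirs)
    ]
```

You would call `missing_images(html_text, ["C:/docs/images"])` (the folder is just an example) before publishing and copy or rename any files it reports.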

Hyperlinks:

Normal web links to external sites and files should continue to work but topic links won't unless you format them as topic links in your HTML. This means your HTML would need to contain links to topic IDs that would be invalid in normal HTML, like this:

<a href="mytopic">Link Text</a>
It must be entered like that, without a file extension. In a snippet it's not possible to do things like check whether the link might be to a topic or an external file.
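
If the HTML files are generated by an external tool, links of this kind could in principle be produced by a small preprocessing step before the files are inserted as snippets. The Python sketch below is only an illustration and makes assumptions that may not hold for your files: that links to other topics are bare file names such as mytopic.html, that the file name without its extension equals the topic ID, and that a regular expression is good enough to find the links. The function name is hypothetical.

```python
import re

# Matches the href attribute of an <a> tag, quoted with " or '.
A_HREF = re.compile(r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2',
                    re.IGNORECASE | re.DOTALL)
# Anything starting with a scheme (http:, https:, mailto:, ...) is external.
SCHEME = re.compile(r'^[a-z][a-z0-9+.-]*:', re.IGNORECASE)


def to_topic_links(html, extensions=(".html", ".htm")):
    """Strip the file extension from local links so they look like topic IDs.

    Assumption: a link such as href="mytopic.html" refers to the topic
    whose ID is "mytopic". External links, in-page anchors and links
    containing a path are left untouched.
    """
    def rewrite(match):
        prefix, quote, href = match.group(1), match.group(2), match.group(3)
        if SCHEME.match(href) or href.startswith("#"):
            return match.group(0)
        if "/" in href or "\\" in href:
            return match.group(0)
        for ext in extensions:
            if href.lower().endswith(ext):
                href = href[:-len(ext)]
                break
        return f"{prefix}{quote}{href}{quote}"

    return A_HREF.sub(rewrite, html)
```

Links that carry an anchor after the file name (for example mytopic.html#section) are deliberately left alone in this sketch, because it is unclear how they should map to topic links.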

HTML:
Only HTML that can be meaningfully processed in the Help+Manual editor will be imported. If your HTML doesn't import properly, this means you will need to change your HTML. Changing Help+Manual to cater to different kinds of HTML in snippets is categorically not possible. This is an absolute restriction and not up for discussion.
Regards,
Tim (EC Software Documentation & User Support)

Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

That's brilliant. A Christmas present from EC Software! :)

Many thanks.

Martin.
Tobias Escher
Posts: 202
Joined: Mon Dec 28, 2015 7:32 pm

WOW - this is almost perfect. No worries, I don't need links or graphics :)

There is one little glitch where the result is rendered differently. For some reason alternating line colours in a table are only rendered if the table cell has something in it, not in empty cells.
If it would be possible to look into that, I would be the happiest guy on earth...
Should I attach screenshots plus example HTML here or send them to support?
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Tobias Escher wrote: For some reason alternating line colours in a table are only rendered if the table cell has something in it, not in empty cells.
That may fall in the category of things that can't be changed because of the way the HTML must be imported in snippet mode. I'm having it checked. 8)
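
Until that is clarified, one thing that could be tried on the HTML side is to make sure no cell is really empty, for example by putting a non-breaking space into empty cells before the file is inserted. This is purely a guess at a workaround and has not been confirmed to change how the snippet import renders the cells. The Python sketch below assumes that simple pattern matching is sufficient for your tables, and the function name is made up.

```python
import re

# Matches a <td> or <th> cell that contains nothing but whitespace.
EMPTY_CELL = re.compile(r'(<t([dh])\b[^>]*>)\s*(</t\2\s*>)', re.IGNORECASE)


def fill_empty_cells(html, filler="&nbsp;"):
    """Put a filler (a non-breaking space by default) into empty table cells.

    Cells that already have content are left unchanged.
    """
    return EMPTY_CELL.sub(
        lambda m: f"{m.group(1)}{filler}{m.group(3)}", html
    )
```

Whether the filled cells then pick up the alternating colours would have to be tested with your own files.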
Regards,
Tim (EC Software Documentation & User Support)

Tobias Escher
Posts: 202
Joined: Mon Dec 28, 2015 7:32 pm

I will send you an example HTML. Can't attach here, it accepts neither html nor htm nor rar :(
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Update: I need to correct the method for selecting HTML snippets slightly:
  1. Open the Snippet tool, navigate to the folder containing the HTML snippet file and type the file name manually into the Open field, but with the extension .xml. For example, if the file is called myfile.html, type in myfile.xml and then press ENTER.
  2. You will get an error message. This is important, because it means you have passed through a parsing step. Now manually change the extension in the Open field back to the file's real extension (.html in this example) and press ENTER again. The file will now be read in without any errors.
I'm going to correct the original description above as well.

IMPORTANT: After inserting the snippet, you can't edit the insertion by double-clicking it at the moment.
Regards,
Tim (EC Software Documentation & User Support)
