Post processing of Word output

Please post all questions and comments regarding Help & Manual 7 here.

Moderators: Alexander Halser, Tim Green

John Johann
Posts: 305
Joined: Mon Aug 21, 2017 7:35 pm

Post processing of Word output

Hi all,
This isn't directly HnM related.

I've been experimenting with output to docx and was wondering if anyone else does this.

I'm aware that this is not the point of using H&M and that it is a one-way street (Tim has been very clear about that). Nonetheless, does anyone have experience in cleaning up the output to better match their CI, e.g. by replacing the names of styles?

In general, does anyone output their projects to Word and if so, what do you use the output for?

Thanks
julio
Posts: 118
Joined: Wed May 28, 2008 12:06 am
Location: Porto Alegre, RS - Brasil

Re: Post processing of Word output

I usually compile projects to docx when I need someone to review all changes, because then I can switch track changes on. This way, I just say "do whatever you want, delete whatever you want, add whatever you want, and then send the document back to me." This is because most people are not comfortable with the PDF options for adding comments to documents. Plus, there are a lot of PDF viewers on the market, each with different ways to edit a PDF, whereas Word versions mostly work the same way. But that is only my perception based on past experience.
Regarding cleaning up the output, that would require a lot of study of the DOCX standard, which is particularly complex, even though a DOCX file is a simple zip file with an internal structure that can be changed.
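To make the "simple zip file" point concrete, here is a minimal Python sketch (standard library only) that opens a DOCX package and lists the style names stored in word/styles.xml. The package built by make_toy_docx is a hypothetical stand-in that contains only a styles part with invented style IDs; it is not real H&M output and Word would not open it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/styles.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_style_names(docx_bytes):
    """Return the display names of all styles found in word/styles.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("word/styles.xml"))
    names = []
    for style in root.iter(f"{{{W}}}style"):
        name = style.find(f"{{{W}}}name")
        if name is not None:
            names.append(name.get(f"{{{W}}}val"))
    return names


def make_toy_docx():
    """Build a stand-in package holding only a styles part (assumption, not H&M output)."""
    styles_xml = (
        f'<w:styles xmlns:w="{W}">'
        '<w:style w:type="paragraph" w:styleId="HMBodyText">'
        '<w:name w:val="H&amp;M Body Text HowTo"/></w:style>'
        '<w:style w:type="paragraph" w:styleId="HMHeading3">'
        '<w:name w:val="H&amp;M Heading 3 HowTo"/></w:style>'
        '</w:styles>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/styles.xml", styles_xml)
    return buffer.getvalue()


style_names = list_style_names(make_toy_docx())
print(style_names)
```

For a real file you would pass the bytes of the published document instead, e.g. the result of reading it with open(path, "rb").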
Tim Green
Site Admin
Posts: 23143
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Re: Post processing of Word output

julio wrote:Regarding output cleaning up, that would require a lot of study on DOCX standard, which is particularly complex, although it is a simple zip file with an internal structure that can be changed.
The simplicity ends when you open the zip. The Word specifications are over 7000 pages, I believe. 8)

Tip: You can use Word output for change tracking between project versions. If you make and store Word versions at specific waypoints in your project development and updating, you can then compare those versions in Word to get the content-level changes. :)
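The tip above refers to Word's own document comparison. Purely as an illustration of what a content-level comparison between two stored versions looks like, here is a small Python sketch that extracts paragraph text from two document parts and reports added and removed paragraphs with difflib. This is a swapped-in technique, not Word's compare feature, and the two toy documents are invented for the example.

```python
import difflib
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(document_xml):
    """Return the plain text of each <w:p> paragraph in a document part."""
    root = ET.fromstring(document_xml)
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        for p in root.iter(f"{{{W}}}p")
    ]


def toy_document(paragraphs):
    """Build a minimal document part from plain strings (illustration only)."""
    body = "".join(f"<w:p><w:r><w:t>{text}</w:t></w:r></w:p>" for text in paragraphs)
    return f'<w:document xmlns:w="{W}"><w:body>{body}</w:body></w:document>'


def content_changes(old, new):
    """List (kind, paragraph) tuples for paragraphs removed from or added to `old`."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changes += [("removed", p) for p in old[i1:i2]]
        if tag in ("replace", "insert"):
            changes += [("added", p) for p in new[j1:j2]]
    return changes


old = paragraph_texts(toy_document(["Install the tool.", "Click Start."]))
new = paragraph_texts(toy_document(["Install the tool.", "Click Run.", "Close the dialog."]))
changes = content_changes(old, new)
for kind, text in changes:
    print(kind, text)
```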
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
Simon_Dismore
Posts: 204
Joined: Thu Jul 13, 2017 2:57 pm

Re: Post processing of Word output

John Johann wrote:has anyone any experience in cleaning up the output to better match their CI, e.g. replacing names of styles?
If you need your H&M documentation to match your corporate identity (CI), you've got several choices depending on how often your docs change vs your corporate identity:
  • Create styles in H&M that match your CI, install the right fonts on your authoring systems and you're good to go. This approach will work until your CI standards change, at which point you'd have to redefine styles in H&M and republish. If your CI standards change rarely, this is the easiest route.
  • Create a special Word style-sheet that maps H&M styles to your CI (i.e. have each H&M style name inherit without change from one of your CI styles). Import this style-sheet after publishing from H&M. If your CI standards change frequently, you'll have to re-define and re-import this special style-sheet and apply it to your Word documentation, but you won't have to re-publish from H&M. If you have few H&M editors, or you need to publish using different CI for different customers, it might work best.
  • Rename the styles in your H&M output so they match the style names in your CI. Do this every time you publish from H&M. Then import your CI style-sheet to over-ride the style definitions. This is what you asked for and it's the longest route.
Assuming you want the third route, i.e. to rename the styles in your output, there are lots of ways to organize styles in MS Word. If Word is your corporate standard, I expect you have a colleague who lives and breathes this stuff, probably the person who put together all your CI styles. If you aren't familiar with using Word, you'll need to find this person and get them to help.

If you're in a hurry and want to try things out, here are some steps to rename styles:
  • Use Word to open the document that was generated by H&M, e.g. HELPMAN_manual.docx.
  • Open Word's styles panel by clicking on the down-right arrow in the bottom left-hand corner of the styles drop-down list-box in Word's ribbon.
  • In the panel, click the 'Manage Styles' button.
  • On the Manage Styles dialog, click 'Import/Export'.
  • You'll see a list of styles. Select the one you wish to rename.
  • Click the 'Rename' button.
  • Enter the new name and click OK.
[Attachment: Steps to rename styles.png]
Having renamed the styles you can overwrite their definitions by importing your corporate style-sheet.
Once you have this manual process working successfully you can automate it with macros. For example, here's one that renames H&M's HowTo styles to How2:

Sub Rename_styles()
    ' Renames the H&M "HowTo" styles to "How2" in the published DOCX.
    ' Change DOC_PATH to point at your own H&M output.
    Const DOC_PATH As String = _
        "C:\Users\Public\Documents\My HelpAndManual Projects\Examples\Get Me Started Example\Manual\HELPMAN_manual.docx"
    Dim renames As Variant
    Dim i As Long

    ' Pairs of old style name, new style name.
    renames = Array( _
        "H&M Body Text HowTo", "H&M Body Text How2", _
        "H&M Heading 3 HowTo", "H&M Heading 3 How2", _
        "H&M Heading HowTo Toggle_atoc_", "H&M Heading How2 Toggle_atoc_", _
        "H&M Heading HowTo_atoc_", "H&M Heading How2_atoc_")

    ' Rename each style through Word's Organizer, one pair at a time.
    For i = LBound(renames) To UBound(renames) Step 2
        Application.OrganizerRename Source:=DOC_PATH, _
            Name:=CStr(renames(i)), NewName:=CStr(renames(i + 1)), _
            Object:=wdOrganizerObjectStyles
    Next i
End Sub
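For anyone who would rather script this outside Word, the same kind of rename can be sketched in Python by editing word/styles.xml inside the DOCX package. This is a rough sketch under several assumptions, not a tested replacement for the macro: it changes only the display name (the w:name value) and leaves the internal w:styleId untouched, the toy package and its style ID are invented, and ElementTree re-serialises the XML in a way that can drop namespace declarations a real styles.xml relies on, so try it on a copy only.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the usual "w:" prefix when re-serialising


def rename_styles(docx_bytes, renames):
    """Return a copy of the package with style display names changed per `renames`."""
    output = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as source, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for item in source.infolist():
            data = source.read(item.filename)
            if item.filename == "word/styles.xml":
                root = ET.fromstring(data)
                # The <w:name> child of each <w:style> holds the display name.
                for style in root.iter(f"{{{W}}}style"):
                    name = style.find(f"{{{W}}}name")
                    if name is None:
                        continue
                    old = name.get(f"{{{W}}}val")
                    if old in renames:
                        name.set(f"{{{W}}}val", renames[old])
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # All other parts are copied through unchanged.
            target.writestr(item.filename, data)
    return output.getvalue()


def make_toy_docx():
    """Stand-in package with a single style (hypothetical, not H&M output)."""
    styles_xml = (
        f'<w:styles xmlns:w="{W}">'
        '<w:style w:type="paragraph" w:styleId="HMBodyTextHowTo">'
        '<w:name w:val="H&amp;M Body Text HowTo"/></w:style>'
        '</w:styles>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/styles.xml", styles_xml)
    return buffer.getvalue()


renamed = rename_styles(make_toy_docx(), {"H&M Body Text HowTo": "H&M Body Text How2"})
with zipfile.ZipFile(io.BytesIO(renamed)) as package:
    result_xml = package.read("word/styles.xml").decode("utf-8")
print(result_xml)
```

Because paragraphs refer to styles by ID rather than by display name, leaving the ID alone means existing content keeps pointing at the renamed style.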
Caveat. In order to use Word for technical documentation you need to invest time in understanding how it works, or find a colleague who's already an expert.
John Johann
Posts: 305
Joined: Mon Aug 21, 2017 7:35 pm

Re: Post processing of Word output

Thanks for the replies.
@Julio -- I'm hoping more to output a final document to Word, which colleagues in far away places can translate to their native language if that is really necessary.
@Tim -- thanks for the tip on exporting Word files at certain stages for comparison
@Simon -- the styles in our Word docs are inherited but currently maintained by yours truly. The styles in our HnM projects are based on said Word styles and maintained by me. Whether or not our styles change often, a user should ideally see no difference between a "native" Word document and one output by HnM, i.e. the same layout, colours and styles.
Layout and colours are easy enough. Style(name)s need attention.

Renaming the H&M styles and importing the stylesheet is one possibility if the CI style from the imported stylesheet gets applied automatically to the relevant content. I'm looking for a fairly simple, resilient solution.
Simon_Dismore
Posts: 204
Joined: Thu Jul 13, 2017 2:57 pm

Re: Post processing of Word output

John Johann wrote:styles in our Word docs are inherited but currently maintained by yours truly. The styles in our HnM projects are based on said Word styles and maintained by me... Style(name)s need attention... I'm looking for a fairly simple, resilient solution.
You obviously understand Word styles, so I won't go into so much detail again. If you've already created styles in Help & Manual with names identical to those in your corporate DOT template, the problem may be that the compiler is prepending 'H&M_' to each style when publishing to DOCX. So when you apply your corporate template to your published output, it doesn't have any effect because the style names don't match.

If that's the case, the most resilient solution would be to ask ec-Software to add an option "not to prepend style names with 'H&M_' when publishing to DOCX". That's far simpler than maintaining VBA code that iterates through all the DOCX style names after publishing, just to reverse undocumented behaviour you didn't want.
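To give a feel for what such post-processing has to handle, here is a small Python sketch that plans a prefix removal over a list of style names and flags names that cannot simply be stripped because the target name already exists. The prefix is a parameter because the exact text the compiler prepends is an assumption here, and the sample names are invented.

```python
def plan_prefix_removal(style_names, prefix):
    """Return (renames, conflicts) for stripping `prefix` from style names.

    renames maps old name -> new name; conflicts lists names whose stripped
    form is empty or would collide with a name that already exists.
    """
    existing = set(style_names)
    renames = {}
    conflicts = []
    for name in style_names:
        if not name.startswith(prefix):
            continue
        stripped = name[len(prefix):]
        if not stripped or stripped in existing or stripped in renames.values():
            conflicts.append(name)
        else:
            renames[name] = stripped
    return renames, conflicts


# Invented sample: one style collides with an existing "Heading 1".
names = ["H&M_Body Text", "H&M_Heading 1", "Heading 1", "Normal"]
renames, conflicts = plan_prefix_removal(names, "H&M_")
print(renames)
print(conflicts)
```

The conflict list is the part that makes a do-it-yourself approach fragile: each collision needs a decision about which definition should win.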

By the way, if your corporate identity also applies to HTML/CSS you might also like an option "not to prepend style names with 'p_' or 'f_' when publishing to WebHelp". This would allow you to have one set of style names for all your publishing formats, but optionally have different definitions for HTML vs Word, that you maintain outside Help and Manual. So when your corporate identity changes, you can update all your Web and Word material without needing to re-publish from H&M.
Tim Green
Site Admin
Posts: 23143
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Re: Post processing of Word output

Simon_Dismore wrote:By the way, if your corporate identity also applies to HTML/CSS you might also like an option "not to prepend style names with 'p_' or 'f_' when publishing to WebHelp".
If it doesn't prepend that it would have to prepend something else. The entire functioning of the style system in HM, also in the HTML-exported version, is based on the separation of font and paragraph attributes into two separate classes. This makes it possible to mirror the <text> and <para> tags from the XML source as close to 1:1 as possible. The font attributes go in <span> tags around the text with the f_ version of the style, the paragraph attributes go in the paragraph tag with the p_ version of the style.
Simon_Dismore
Posts: 204
Joined: Thu Jul 13, 2017 2:57 pm

Re: Post processing of Word output

Tim Green wrote:If it doesn't prepend that it would have to prepend something else. The entire functioning of the style system in HM, also in the HTML-exported version, is based on the separation of font and paragraph attributes into two separate classes. This makes it possible to mirror the <text> and <para> tags from the XML source as close to 1:1 as possible. The font attributes go in <span> tags around the text with the f_ version of the style, the paragraph attributes go in the paragraph tag with the p_ version of the style.
I know that's how the output is generated at present, but it doesn't need to be done that way. The CSS already defines the style for span separately:

span.f_Heading2_atoc_ /* Heading2_atoc_ */
{
 font-size: 19px;
 font-family: 'Segoe UI Semibold',Tahoma,Verdana,Arial,sans-serif;
 color: #000080;
}
.p_Heading2_atoc_ /* Heading2_atoc_ */
{
 font-size: 100%;
 /* etc etc */
}
I don't see why it's necessary to have two different classes for that purpose. All you'd need to do is define the paragraph style without specificity (so it maps to anything not more specifically defined), then the span specifically, using a single un-prefixed class name that could be the same as that used in the corporate identity:

.corporateHeading2 /* general definition for all elements */
{
 font-size: 100%;
 /* etc etc */
}
span.corporateHeading2 /* redefined for span element */
{
 font-size: 19px;
 font-family: 'Segoe UI Semibold',Tahoma,Verdana,Arial,sans-serif;
 color: #000080;
}
Those p_ and f_ prefixes aren't used in H&M's own files. The hmxp uses a type value ("text" or "para") internally:

<styleclass name="Heading2_atoc_" parentclass="Normal" nextclass="Body Text HowTo" shortcut="116">
  <style-set media="screen" type="para">margin-top:15px; padding:1px; etc etc ;</style-set>
  <style-set media="screen" type="text">font-family:'Segoe UI Semibold'; font-size:14pt; color:#000080;</style-set>
</styleclass>
This means the source files can stay the same. Only the WebHelp compilation changes, and not much.
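As a rough illustration of that idea, here is a Python sketch that reads one styleclass element and emits the two un-prefixed CSS rules proposed above: a general rule from the "para" style-set and a span-specific rule from the "text" style-set. It says nothing about how the real compiler works. The sample element is a trimmed version of the snippet above, and the sketch assumes the style name is already usable as a CSS class name (a name containing spaces would need extra handling).

```python
import xml.etree.ElementTree as ET


def styleclass_to_css(styleclass_xml, media="screen"):
    """Emit '.Name { para }' and 'span.Name { text }' rules for one styleclass."""
    node = ET.fromstring(styleclass_xml)
    name = node.get("name")
    # Collect the declarations for the requested media, keyed by type.
    rules = {}
    for style_set in node.findall("style-set"):
        if style_set.get("media") == media:
            rules[style_set.get("type")] = (style_set.text or "").strip()
    blocks = []
    if "para" in rules:
        blocks.append(f".{name} {{ {rules['para']} }}")
    if "text" in rules:
        blocks.append(f"span.{name} {{ {rules['text']} }}")
    return "\n".join(blocks)


# Trimmed sample based on the styleclass shown above.
SAMPLE = """<styleclass name="Heading2_atoc_" parentclass="Normal">
  <style-set media="screen" type="para">margin-top:15px; padding:1px;</style-set>
  <style-set media="screen" type="text">font-family:'Segoe UI Semibold'; font-size:14pt; color:#000080;</style-set>
</styleclass>"""

css = styleclass_to_css(SAMPLE)
print(css)
```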
John Johann
Posts: 305
Joined: Mon Aug 21, 2017 7:35 pm

Re: Post processing of Word output

Just going back to my original question so this doesn't get lost -- my current approach involves exporting to *.docx, deleting unused H&M styles, renaming H&M styles, importing *.docx stylesheet, replacing H&M styles, deleting unused styles, table formatting.

When I export H&M projects to *.docx, all styles in the repository are exported, whether they are used in the document or not, so I cycle through them in the *.docx via macro and delete those that appear to be unused as a first step.
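For readers who want to see what "appear to be unused" involves, here is a Python sketch of one way to find candidates. It is not the macro mentioned above. It treats a style as used if the main document part references it, if it is a default style, or if a used style depends on it through basedOn, next or link. It is an approximation: headers, footers, footnotes and numbering definitions can reference styles too and are not checked here, and the style IDs in the toy data are invented.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def unused_style_ids(styles_xml, document_xml):
    """Return sorted style IDs that nothing in the main document appears to need."""
    styles = ET.fromstring(styles_xml)
    document = ET.fromstring(document_xml)

    depends_on = {}
    keep = set()
    for style in styles.iter(w("style")):
        style_id = style.get(w("styleId"))
        depends_on[style_id] = [
            el.get(w("val"))
            for tag in ("basedOn", "next", "link")
            for el in style.findall(w(tag))
        ]
        if style.get(w("default")) == "1":
            keep.add(style_id)

    # Styles referenced directly by paragraphs, runs and tables.
    for tag in ("pStyle", "rStyle", "tblStyle"):
        for ref in document.iter(w(tag)):
            keep.add(ref.get(w("val")))

    # Follow dependencies so that parents of used styles are kept as well.
    pending = list(keep)
    while pending:
        for dep in depends_on.get(pending.pop(), []):
            if dep not in keep:
                keep.add(dep)
                pending.append(dep)

    return sorted(set(depends_on) - keep)


STYLES = f"""<w:styles xmlns:w="{W}">
<w:style w:type="paragraph" w:default="1" w:styleId="Normal"><w:name w:val="Normal"/></w:style>
<w:style w:type="paragraph" w:styleId="HMBody"><w:name w:val="H&amp;M Body"/><w:basedOn w:val="Normal"/></w:style>
<w:style w:type="paragraph" w:styleId="HMHeading3"><w:name w:val="H&amp;M Heading 3"/><w:basedOn w:val="Normal"/></w:style>
<w:style w:type="paragraph" w:styleId="HMNote"><w:name w:val="H&amp;M Note"/><w:basedOn w:val="HMBody"/></w:style>
</w:styles>"""

DOCUMENT = f"""<w:document xmlns:w="{W}"><w:body>
<w:p><w:pPr><w:pStyle w:val="HMNote"/></w:pPr><w:r><w:t>A note.</w:t></w:r></w:p>
</w:body></w:document>"""

unused = unused_style_ids(STYLES, DOCUMENT)
print(unused)
```

In the toy data HMBody is never applied directly, but it is kept because HMNote is based on it; deleting it would change how HMNote looks.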

The styles I inherited in Word were sometimes renamed built-in styles. When I try to rename an H&M style to the name of a built-in style, that doesn't always work.

Thus, I have a mixture of styles which I rename and styles where I try and cycle through the output and replace them with the 'correct' imported styles. The latter can take a while depending on the document.

Paragraph or character styles seem to work as expected. However, some styles in the *.docx get automatically redefined to linked styles (auto-update is off). That means that when I import the stylesheet, I see the style name with the formatting as I want/expect when I highlight the relevant text and look in the style library, but it is not applied to the text as expected (e.g. bold/italics may be missing).
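One way to check which styles have ended up linked is to look for the w:link element in word/styles.xml, which pairs a paragraph style with its character counterpart. The Python sketch below lists those pairs. It only helps with diagnosis and does not explain why the styles were converted, and the style IDs in the toy data are invented.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def linked_style_pairs(styles_xml):
    """Return (paragraph style ID, linked character style ID) pairs."""
    root = ET.fromstring(styles_xml)
    pairs = []
    for style in root.iter(w("style")):
        link = style.find(w("link"))
        # Report each pair once, from the paragraph style's side.
        if link is not None and style.get(w("type")) == "paragraph":
            pairs.append((style.get(w("styleId")), link.get(w("val"))))
    return pairs


STYLES = f"""<w:styles xmlns:w="{W}">
<w:style w:type="paragraph" w:styleId="CIHeading1"><w:name w:val="CI Heading 1"/><w:link w:val="CIHeading1Char"/></w:style>
<w:style w:type="character" w:styleId="CIHeading1Char"><w:name w:val="CI Heading 1 Char"/><w:link w:val="CIHeading1"/></w:style>
<w:style w:type="paragraph" w:styleId="CIBody"><w:name w:val="CI Body"/></w:style>
</w:styles>"""

pairs = linked_style_pairs(STYLES)
print(pairs)
```

Running this on the styles part before and after importing the stylesheet would show whether the link was present in the H&M output or introduced later.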

As for tables, they are styled in the output but also have to be re-styled, with the same sort of issues: for example, header rows are sometimes bold as they should be and sometimes not, for no clear reason.

I also have tables which are indented and contain images which then like to overshoot the margin in the output.

So it's all fun for anyone interested in this (I'm using the latest H&M and the latest Word).