Techdoc showstopper: $ is a breaking character

helebor
Posts: 15
Joined: Mon Feb 12, 2007 2:20 pm

In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.

Is there something I can configure in H&M 4 to avoid this behaviour? It is a total showstopper when writing tech doc for programming languages and databases, which is what I use H&M for. So much so that I'm faced with moving the manuscript for two books into FrameMaker or something just as horrible.

Thanks
Helen
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

helebor wrote:In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.

Is there something I can configure in H&M 4 to avoid this behaviour?
Hi Helen,

This is one of the niggles in the RVF editor used in H&M. There is nothing at present you can do about it. However, it occurs only in the H&M editor where you are writing the text. There won't be a line break in the compiled output which your users see.

regards,

Martin.
helebor
Posts: 15
Joined: Mon Feb 12, 2007 2:20 pm

Martin,

OK, thanks, that's good to know. I can't test that for a few days yet because I have a whole lot of imported stuff sitting at the end of the book. -- Helen
Alexander Halser
EC-Software Support
Posts: 4098
Joined: Mon Jun 24, 2002 7:24 pm
Location: Salzburg, Austria

Helen,

The $ sign is not a word break character. Only if there is a formatting change within the word would it be considered two words. I suspect this is the case here.

Select the word SYS$TABLENAME, right-click in the editor and select "Convert to plain text" from the context menu. If the word contained mixed formatting before, it will be a single word afterwards.
Alexander Halser
Senior Software Architect, EC Software GmbH
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Alexander Halser wrote:The $ sign is not a word break character.
Hi Alexander,

Yes it is, and "convert to plain text" doesn't fix it. See attached.

Martin.
Alexander Halser
EC-Software Support
Posts: 4098
Joined: Mon Jun 24, 2002 7:24 pm
Location: Salzburg, Austria

Duh! You are right... what did I look at? :?

It seems to be a bug in the editor, but I won't be 100% sure until I have feedback from the author of the editor component. My internal fix shows correct behaviour, but I am not certain whether this is language dependent (Asian languages may differ here; word break rules are a science in themselves).
Alexander Halser
Senior Software Architect, EC Software GmbH
Sergey Tkachenko
Posts: 13
Joined: Sat Apr 23, 2005 6:21 pm

In the editor, '$' is treated as a numeric prefix (it is in the same group as characters like '±'). Line breaks are allowed before prefixes.
But really, '$' is often used in programming not as a currency prefix, so it may be a good idea to move it to another character group.
We are discussing this problem with Alexander.
http://www.trichview.com
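To make the grouping idea concrete, here is a minimal sketch in Python. It is not the editor's code: the group names, the `classify` helper and the two toy rules are assumptions made purely for illustration. It models only what the post above describes, namely that a break is allowed before a "prefix" character, and shows how moving '$' out of that group removes the break opportunity inside SYS$TABLENAME.

```python
# Toy character groups (hypothetical; the editor's real tables are not shown in this thread).
PREFIX = "prefix"
ALPHA = "alpha"
SPACE = "space"


def classify(ch, prefix_chars="$±"):
    """Assign a character to one of three toy groups."""
    if ch.isspace():
        return SPACE
    if ch in prefix_chars:
        return PREFIX
    return ALPHA


def break_opportunities(text, prefix_chars="$±"):
    """Return every index i where a line break may occur before text[i]."""
    result = []
    for i in range(1, len(text)):
        prev = classify(text[i - 1], prefix_chars)
        cur = classify(text[i], prefix_chars)
        if prev == SPACE and cur != SPACE:
            result.append(i)  # ordinary break after white space
        elif cur == PREFIX and prev != SPACE:
            result.append(i)  # break allowed before a prefix character
    return result


# '$' in the prefix group: one break opportunity inside the identifier.
print(break_opportunities("SYS$TABLENAME"))
# '$' moved out of the prefix group: no break opportunity inside the identifier.
print(break_opportunities("SYS$TABLENAME", prefix_chars="±"))
```

With the default groups the identifier can be split before the '$'; once '$' is treated like a letter, the only remaining break points are after white space.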
helebor
Posts: 15
Joined: Mon Feb 12, 2007 2:20 pm

In the editor, '$' is treated as a numeric prefix (it is in the same group as characters like '±'). Line breaks are allowed before prefixes.
Sure, but no character should *force* a line break, other than a linefeed sequence, a tab character or a hyphen. And certainly not in an editor designed for creating published documents of all and any types!

I frequently use a very handy source code syntax editor called ConText which I suspect possibly uses the same Delphi RichView editor component. It has a similar problem with certain characters (although not the $ character in my experience with it...yet...) But it makes / a breaking character which is really nasty when you're using it for XML editing, especially as, in that implementation, ConText actually saves the line-breaks inserted by its wrapping routine. (Apart from that, and its ambiguity about Unicode, ConText is tres cool!)

I mention this, because that experience made me very concerned about the behaviour of the H&M editor with my manuscript material. An author does NOT want a generic editor that enforces its own rules about spelling the names of programming objects.

Helen
xlation
Posts: 21
Joined: Fri Jan 04, 2008 2:17 pm
Location: Minsk, Belarus

Hi Helen,
You might consider using a text variable, say <%DOLLARSIGN%>. Text variables are inserted at compile time, which means no more word breaks in the editor and in the output.
To say this is a makeshift solution is an understatement - but it's better than nothing.
-=Think!=-
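The text variable workaround can be pictured as a simple substitution pass at compile time. The sketch below is not Help & Manual's implementation; the `expand_variables` function and its regular expression are assumptions for illustration. It only shows the idea that the manuscript stores `<%DOLLARSIGN%>` and the output receives a literal '$'.

```python
import re


def expand_variables(text, variables):
    """Replace each <%NAME%> with its value; unknown names are left untouched."""

    def substitute(match):
        name = match.group(1)
        return variables.get(name, match.group(0))

    return re.sub(r"<%([A-Z0-9_]+)%>", substitute, text)


# The manuscript contains the variable; the compiled output contains '$'.
source = "Query SYS<%DOLLARSIGN%>TABLENAME to list the tables."
print(expand_variables(source, {"DOLLARSIGN": "$"}))
```

The cost of this approach is readability in the editor, since every such identifier carries the placeholder instead of the real character.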
Alexander Halser
EC-Software Support
Posts: 4098
Joined: Mon Jun 24, 2002 7:24 pm
Location: Salzburg, Austria

but no character should *force* a line break
It's not forcing it. The character (pairs) mark possible word breaks if the word doesn't fit on the line, like a soft hyphen.
Alexander Halser
Senior Software Architect, EC Software GmbH
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Alexander Halser wrote:
but no character should *force* a line break
It's not forcing it. The character (pairs) mark possible word breaks if the word doesn't fit on the line, like a soft hyphen.
Hi Alexander,

Surely a word should only break internally in the rare event that a single very long word is longer than the full line length? And in that case a hyphen should be automatically added?

Otherwise the only place to break to a new line is the white space between words?

regards,

Martin.
Tim Green
Site Admin
Posts: 23155
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Martin Wynne wrote:Otherwise the only place to break to a new line is the white space between words?
I've always thought this too but apparently Russian logic works differently... 8)
Regards,
Tim (EC Software Documentation & User Support)

Sergey Tkachenko
Posts: 13
Joined: Sat Apr 23, 2005 6:21 pm

Line breaking rules are much more complicated than simple breaking on spaces. You can look here: http://unicode.org/reports/tr14/ , the editor's Unicode line breaking algorithm is based on these rules.
If the '$' character is used as a currency symbol, there is nothing wrong with allowing a line break before it. And, for example, it is perfectly OK to allow a line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.

The problem with '$' arises when it is not used as a currency symbol, i.e. only in text about programming (words like SYS$TABLENAME are unlikely to occur in text unrelated to programming).

PS: these complex line breaking rules are applied only to Unicode text in the editor.
http://www.trichview.com
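The parenthesis rules described in the post above can be sketched as a small function. This is a toy model in Python, not the editor's algorithm and not a full implementation of the Unicode line breaking rules; the function name `allowed_breaks` and the decision to handle only spaces and round brackets are assumptions for illustration.

```python
def allowed_breaks(text):
    """Return every index i where a break may occur before text[i].

    Toy rules taken from the post above:
      - a break is allowed after a space, before '(' and after ')';
      - a break is never allowed after '(' or before ')', even across spaces.
    """
    breaks = []
    for i in range(1, len(text)):
        if text[i] == " ":
            continue  # a line never starts with the space itself
        # Look back past any spaces to the last visible character.
        j = i - 1
        while j >= 0 and text[j] == " ":
            j -= 1
        before = text[j] if j >= 0 else ""
        after_space = text[i - 1] == " "
        if before == "(" or text[i] == ")":
            continue  # undesirable, even on spaces
        if after_space or text[i] == "(" or before == ")":
            breaks.append(i)
    return breaks


print(allowed_breaks("f(x) y"))
print(allowed_breaks("a ( b ) c"))
```

In the second call the spaces just inside the brackets do not produce break opportunities, which is the "even on spaces" part of the rule.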
Tim Green
Site Admin
Posts: 23155
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Hi Sergey,
If the '$' character is used as a currency symbol, there is nothing wrong with allowing a line break before it. And, for example, it is perfectly OK to allow a line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.
This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.

It is not OK to allow line break before '(' or after ')' if there are no spaces before or after them. If there are no spaces it is an error in the text but there should be no break, because no spaces means there is no word boundary and thus no break. The definition of a word boundary is whitespace or a dash/hyphen. It might be a good idea to allow the user to define additional breaking characters if they want but that should be an option only. If these options are not explicitly provided then breaks should only happen on whitespace, nowhere else. Anything else is an error. :)
Martin Wynne
Posts: 2656
Joined: Mon May 12, 2003 3:21 pm
Location: West of the Severn, UK

Tim Green wrote:This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.
Hi Tim,

Normally yes. But I think two different things have got muddled up here. You need some rules for how to break in the middle of a word in the special case that a single word is longer than the available line length. Typically this occurs for example in tables with narrow fixed-width columns. For example, try shrinking your browser window on this:
Pneumonoultramicroscopicsilicovolcanoconiosis is a lung disease contracted from the inhalation of very fine silica particles, specifically from a volcano. There have been no recorded cases of Pneumonoultramicroscopicsilicovolcanoconiosis in the village of Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch so far this year.
However, these rules should be invoked only in the case that there is no white space in the line. If there is any white space, the line break should always occur there and not in the middle of a word.

regards,

Martin.
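For comparison, Python's standard `textwrap` module follows roughly the policy argued for in this thread: it breaks on white space (and, by default, after hyphens in hyphenated words) and splits a word internally only when that word is longer than the whole line width. The sketch below is only an illustration of that policy, not of the H&M editor; the `wrap_text` helper is a hypothetical wrapper name. Note that `textwrap` does not insert a hyphen when it splits an over-long word.

```python
import textwrap


def wrap_text(text, width):
    """Wrap on white space; split a word only if it is longer than `width`."""
    return textwrap.wrap(text, width=width, break_long_words=True)


sample = "The SYS$TABLENAME table lists every system table."

# Wide enough for the identifier: '$' never causes a break.
for line in wrap_text(sample, 20):
    print(line)

# Narrower than the identifier: only now is the word split mid-way.
for line in wrap_text("SYS$TABLENAME is a system table", 10):
    print(line)
```

Passing `break_long_words=False` instead keeps over-long words intact and lets them overflow the line.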