Techdoc showstopper: $ is a breaking character
Moderators: Alexander Halser, Tim Green
In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.
Is there something I can configure in H&M 4 to avoid this behaviour? It is a total showstopper when writing tech doc for programming languages and databases, which is what I use H&M for. So much so that I'm faced with moving the manuscript for two books into Framemaker or something just as horrible.
Thanks
Helen
- Martin Wynne
- Posts: 2656
- Joined: Mon May 12, 2003 3:21 pm
- Location: West of the Severn, UK
Re: Techdoc showstopper: $ is a breaking character
helebor wrote: In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.
Is there something I can configure in H&M 4 to avoid this behaviour?
Hi Helen,
This is one of the niggles in the RVF editor used in H&M. There is nothing at present you can do about it. However, it occurs only in the H&M editor where you are writing the text. There won't be a line break in the compiled output which your users see.
regards,
Martin.
- Alexander Halser
- EC-Software Support
- Posts: 4106
- Joined: Mon Jun 24, 2002 7:24 pm
- Location: Salzburg, Austria
- Contact:
Helen,
The $ sign is not a word break character. A word is treated as two words only if the formatting changes partway through it. I suspect this is the case here.
Select the word SYS$TABLENAME, right-click in the editor and select "Convert to plain text" from the context menu. If the word contained different formatting before, it will be one word afterwards.
Alexander Halser
Senior Software Architect, EC Software GmbH
- Martin Wynne
- Posts: 2656
- Joined: Mon May 12, 2003 3:21 pm
- Location: West of the Severn, UK
- Alexander Halser
- EC-Software Support
- Posts: 4106
- Joined: Mon Jun 24, 2002 7:24 pm
- Location: Salzburg, Austria
- Contact:
Duh! You are right... what did I look at?
It seems to be a bug in the editor, but I can't be 100% sure until I have feedback from the author of the editor component. My internal fix shows correct behaviour, but I am not certain whether this is language dependent (Asian languages may differ here - word break rules are a science in themselves).
Alexander Halser
Senior Software Architect, EC Software GmbH
- Sergey Tkachenko
- Posts: 13
- Joined: Sat Apr 23, 2005 6:21 pm
- Contact:
In the editor, '$' is treated as a numeric prefix (it is in the same group as characters like '±'). Line breaks are allowed before prefixes.
But really, '$' is often used in programming not as a currency prefix, so it may be a good idea to move it to another character group.
We are discussing this problem with Alexander.
http://www.trichview.com
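The character-group behaviour described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group name `NUMERIC_PREFIXES` and the function `break_opportunities` are hypothetical, and the editor component's real tables and rules are not shown in this thread.

```python
# Hypothetical character group, modelled on the description in the post.
NUMERIC_PREFIXES = frozenset({"$", "±"})

def break_opportunities(text, prefixes=NUMERIC_PREFIXES):
    """Return each index i where a new line may start at text[i]."""
    positions = []
    for i in range(1, len(text)):
        # Ordinary opportunity: the first non-space character after a space.
        after_space = text[i - 1] == " " and text[i] != " "
        # Class-based opportunity: a break is allowed before a prefix character.
        before_prefix = text[i] in prefixes
        if after_space or before_prefix:
            positions.append(i)
    return positions

print(break_opportunities("SYS$TABLENAME"))                             # [3]
print(break_opportunities("SYS$TABLENAME", prefixes=frozenset({"±"})))  # []
```

With '$' in the prefix group the identifier gains a break opportunity in the middle; once '$' is moved out of that group, the identifier can only wrap as a whole.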
Sergey Tkachenko wrote: In the editor, '$' is considered as a numeric prefix (it is in the same group as the characters like '±'). Line breaks are allowed before prefixes.
Sure, but no character should *force* a line break, other than a linefeed sequence, a tab character or a hyphen. And certainly not an editor designed for creating published documents of all and any types!
I frequently use a very handy source code syntax editor called ConText which I suspect possibly uses the same Delphi RichView editor component. It has a similar problem with certain characters (although not the $ character in my experience with it...yet...) But it makes / a breaking character which is really nasty when you're using it for XML editing, especially as, in that implementation, ConText actually saves the line-breaks inserted by its wrapping routine. (Apart from that, and its ambiguity about Unicode, ConText is tres cool!)
I mention this, because that experience made me very concerned about the behaviour of the H&M editor with my manuscript material. An author does NOT want a generic editor that enforces its own rules about spelling the names of programming objects.
Helen
- Alexander Halser
- EC-Software Support
- Posts: 4106
- Joined: Mon Jun 24, 2002 7:24 pm
- Location: Salzburg, Austria
- Contact:
- Martin Wynne
- Posts: 2656
- Joined: Mon May 12, 2003 3:21 pm
- Location: West of the Severn, UK
Alexander Halser wrote:
"but no character should *force* a line break"
It's not forcing it. The character (pairs) mark possible word breaks if the word doesn't fit on the line, like a soft hyphen.
Hi Alexander,
Surely a word should only break internally in the rare event that a single very long word is longer than the full line length? And in that case a hyphen should be automatically added?
Otherwise the only place to break to a new line is the white space between words?
regards,
Martin.
- Tim Green
- Site Admin
- Posts: 23187
- Joined: Mon Jun 24, 2002 9:11 am
- Location: Bruehl, Germany
- Contact:
Martin Wynne wrote: Otherwise the only place to break to a new line is the white space between words?
I've always thought this too but apparently Russian logic works differently...
Regards,
Tim (EC Software Documentation & User Support)
Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
- Sergey Tkachenko
- Posts: 13
- Joined: Sat Apr 23, 2005 6:21 pm
- Contact:
Line breaking rules are much more complicated than simple breaking on spaces. You can look here: http://unicode.org/reports/tr14/ , the editor's Unicode line breaking algorithm is based on these rules.
If the '$' character is used as a currency symbol, there is nothing wrong with allowing a line break before it. And, for example, it is perfectly OK to allow a line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.
The problem with '$' arises when it is not used as a currency symbol, i.e. only in text about programming (words like SYS$TABLENAME are unlikely to occur in text unrelated to programming).
PS: these complex line breaking rules are applied only to Unicode text in the editor.
http://www.trichview.com
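The bracket rules in this post can be illustrated with a small Python sketch. It is not a conformant implementation of the Unicode line breaking algorithm, and it is not the editor's code; the function names `may_break_before` and `all_breaks` are hypothetical and only the four bracket cases mentioned above are modelled.

```python
def may_break_before(text, i):
    """True if a line may be broken so that text[i] starts the next line."""
    if text[i] == " ":
        return False                      # a new line never starts with a space
    j = i - 1
    while j >= 0 and text[j] == " ":      # look back past any run of spaces
        j -= 1
    if j < 0:
        return False                      # nothing but spaces before text[i]
    crossed_space = j < i - 1
    prev, cur = text[j], text[i]
    if prev == "(" or cur == ")":
        return False                      # undesirable, even across spaces
    if crossed_space:
        return True                       # ordinary break on white space
    return cur == "(" or prev == ")"      # allowed even without spaces

def all_breaks(text):
    return [i for i in range(1, len(text)) if may_break_before(text, i)]

print(all_breaks("foo(bar)baz"))  # [3, 8]
print(all_breaks("a ( b ) c"))    # [2, 8]
```

In the first example there are no spaces at all, yet the text may wrap before '(' and after ')'. In the second, the spaces after '(' and before ')' are not used as break points.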
- Tim Green
- Site Admin
- Posts: 23187
- Joined: Mon Jun 24, 2002 9:11 am
- Location: Bruehl, Germany
- Contact:
Hi Sergey,
Sergey Tkachenko wrote: If the '$' character is used as a currency symbol, there is nothing wrong with allowing a line break before it. And, for example, it is perfectly OK to allow a line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.
This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.
It is not OK to allow a line break before '(' or after ')' if there are no spaces before or after them. If there are no spaces it is an error in the text but there should be no break, because no spaces means there is no word boundary and thus no break. The definition of a word boundary is whitespace or a dash/hyphen. It might be a good idea to allow the user to define additional breaking characters if they want but that should be an option only. If these options are not explicitly provided then breaks should only happen on whitespace, nowhere else. Anything else is an error.
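The opt-in behaviour proposed here might look something like the following Python sketch. The function `break_points` and its `extra_break_chars` parameter are hypothetical names used for illustration; no such option is known to exist in the editor.

```python
def break_points(text, extra_break_chars=""):
    """Indexes where a new line may start: after white space, after a
    hyphen, or after a character the user has explicitly opted in."""
    breaking = "-" + extra_break_chars
    points = []
    for i in range(1, len(text)):
        if text[i].isspace():
            continue                      # a line never starts on white space
        prev = text[i - 1]
        if prev.isspace() or prev in breaking:
            points.append(i)
    return points

sample = "SYS$TABLENAME well-known"
print(break_points(sample))                         # [14, 19]
print(break_points(sample, extra_break_chars="$"))  # [4, 14, 19]
```

By default only the space and the hyphen create break points, so SYS$TABLENAME stays in one piece. The '$' becomes a break point only when the caller asks for it.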
- Martin Wynne
- Posts: 2656
- Joined: Mon May 12, 2003 3:21 pm
- Location: West of the Severn, UK
Tim Green wrote: This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.
Hi Tim,
Normally yes. But I think two different things have got muddled up here. You need some rules for how to break in the middle of a word in the special case that a single word is longer than the available line length. Typically this occurs for example in tables with narrow fixed-width columns. For example, try shrinking your browser window on this:
Pneumonoultramicroscopicsilicovolcanoconiosis is a lung disease contracted from the inhalation of very fine silica particles, specifically from a volcano. There have been no recorded cases of Pneumonoultramicroscopicsilicovolcanoconiosis in the village of Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch so far this year.
However, these rules should be invoked only in the case that there is no white space in the line. If there is any white space, the line break should always occur there and not in the middle of a word.
regards,
Martin.
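The two-stage rule described in this post (break at white space whenever possible, and split a word only when it is longer than a whole line) can be sketched in Python as follows. The function `wrap` is a hypothetical illustration, not the editor's algorithm, and it chops over-long words at the line width without inserting a hyphen.

```python
def wrap(text, width):
    """Greedy wrap: break only at white space, except that a single word
    longer than the whole line is chopped into width-sized pieces."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            current = candidate           # the word still fits on this line
            continue
        if current:
            lines.append(current)         # close the line at the white space
        # The word now starts a fresh line; split it only if it cannot fit alone.
        while len(word) > width:
            lines.append(word[:width])
            word = word[width:]
        current = word
    if current:
        lines.append(current)
    return lines

print(wrap("SYS$TABLENAME is a table", 14))  # ['SYS$TABLENAME', 'is a table']
print(wrap("abcdefghij kl", 4))              # ['abcd', 'efgh', 'ij', 'kl']
```

In the first call SYS$TABLENAME fits on a line of its own, so it is never split. In the second, the ten-letter word cannot fit in four columns and is the only thing that gets broken internally.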