Techdoc showstopper: $ is a breaking character

Post by helebor » Tue Apr 29, 2008 3:17 pm

In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.

Is there something I can configure in H&M 4 to avoid this behaviour? It is a total showstopper when writing tech doc for programming languages and databases, which is what I use H&M for. So much so that I'm faced with moving the manuscript for two books into Framemaker or something just as horrible.

Thanks
Helen

Post by Martin Wynne » Tue Apr 29, 2008 3:30 pm

helebor wrote:In the H&M4 editor the $ character behaves like a breaking character. To explain what I mean, in wrapped text if you have a word like SYS$TABLENAME and the editor runs out of room at the end of a line, it considers '$' to be the last character of a word 'SYS$' and inserts a line break after it.

Is there something I can configure in H&M 4 to avoid this behaviour?

Hi Helen,

This is one of the niggles in the RVF editor used in H&M. There is nothing at present you can do about it. However, it occurs only in the H&M editor where you are writing the text. There won't be a line break in the compiled output which your users see.

regards,

Martin.

Post by helebor » Tue Apr 29, 2008 4:06 pm

Martin,

OK, thanks, that's good to know. I can't test that for a few days yet because I have a whole lot of imported stuff sitting at the end of the book. -- Helen

Post by Alexander Halser » Wed Apr 30, 2008 4:07 pm

Helen,

The $ sign is not a word break character. The word would be treated as two words only if there is a formatting change inside it. I suspect this is the case here.

Select the word SYS$TABLENAME, right-click in the editor and select "Convert to plain text" from the context menu. If the word contained different formatting before, it will be a single word afterwards.
Alexander Halser
Senior Software Architect, EC Software GmbH

Post by Martin Wynne » Wed Apr 30, 2008 5:08 pm

Alexander Halser wrote:The $ sign is not a word break character.

Hi Alexander,

Yes it is, and "convert to plain text" doesn't fix it. See attached.

Martin.

Post by Alexander Halser » Thu May 01, 2008 12:36 am

Duh! You are right... what did I look at? :?

It seems to be a bug in the editor, but I can't be 100% sure until I have feedback from the author of the editor component. My internal fix shows correct behaviour, but I am not certain whether this is language dependent (Asian languages may differ here; word break rules are a science in themselves).

Post by Sergey Tkachenko » Thu May 01, 2008 10:24 am

In the editor, '$' is treated as a numeric prefix (it is in the same group as characters like '±'). Line breaks are allowed before prefixes.
But in programming '$' is often used for things other than a currency prefix, so it may be a good idea to move it to another character group.
We are discussing this problem with Alexander.
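
As an illustration of the grouping described above, here is a minimal Python sketch. It is not the editor's actual code: the class names, the `char_class` and `break_positions` helpers and the two-rule break logic are all assumptions made for the example. It models only the rule as stated (a break is allowed before a prefix character) and shows that moving '$' out of the prefix group removes the break opportunity inside SYS$TABLENAME.

```python
# Hypothetical, heavily simplified character groups (not the editor's real tables).
PREFIX = "prefix"      # numeric prefixes such as '$' and '±'
ALPHA = "alphabetic"   # everything else that is not whitespace
SPACE = "space"


def char_class(ch, prefix_chars):
    """Assign a character to one of the three illustrative groups."""
    if ch.isspace():
        return SPACE
    if ch in prefix_chars:
        return PREFIX
    return ALPHA


def break_positions(text, prefix_chars="$±"):
    """Return every index i where a line break may occur before text[i]."""
    positions = []
    for i in range(1, len(text)):
        prev = char_class(text[i - 1], prefix_chars)
        cur = char_class(text[i], prefix_chars)
        if prev == SPACE and cur != SPACE:
            positions.append(i)   # ordinary break after whitespace
        elif cur == PREFIX and prev != SPACE:
            positions.append(i)   # break allowed before a prefix character
    return positions


# '$' in the prefix group: a break opportunity appears inside the identifier.
print(break_positions("SYS$TABLENAME"))
# '$' moved to the alphabetic group: no break opportunity inside the identifier.
print(break_positions("SYS$TABLENAME", prefix_chars="±"))
```

The `prefix_chars` parameter stands in for the "move it to another character group" idea: the same text yields different break opportunities depending only on which group '$' belongs to.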

Post by helebor » Thu May 01, 2008 11:56 am

Sergey Tkachenko wrote:In the editor, '$' is considered as a numeric prefix (it is in the same group as the characters like '±'). Line breaks are allowed before prefixes.

Sure, but no character should *force* a line break, other than a linefeed sequence, a tab character or a hyphen. And certainly not in an editor designed for creating published documents of every type!

I frequently use a very handy source code syntax editor called ConText which I suspect possibly uses the same Delphi RichView editor component. It has a similar problem with certain characters (although not the $ character in my experience with it...yet...) But it makes / a breaking character which is really nasty when you're using it for XML editing, especially as, in that implementation, ConText actually saves the line-breaks inserted by its wrapping routine. (Apart from that, and its ambiguity about Unicode, ConText is tres cool!)

I mention this, because that experience made me very concerned about the behaviour of the H&M editor with my manuscript material. An author does NOT want a generic editor that enforces its own rules about spelling the names of programming objects.

Helen

Post by xlation » Thu May 01, 2008 2:19 pm

Hi Helen,
You might consider using a text variable, say <%DOLLARSIGN%>. Text variables are inserted at compile time, which means no more word breaks in the editor and in the output.
To say this is a makeshift solution is an understatement - but it's better than nothing.

Post by Alexander Halser » Fri May 02, 2008 5:00 pm

helebor wrote:but no character should *force* a line break

It's not forcing it. The character (pairs) mark possible word breaks if the word doesn't fit on the line, like a soft hyphen.

Post by Martin Wynne » Fri May 02, 2008 5:16 pm

Alexander Halser wrote:
but no character should *force* a line break

It's not forcing it. The character (pairs) mark possible word breaks if the word doesn't fit on the line, like a soft hyphen.

Hi Alexander,

Surely a word should only break internally in the rare event that a single very long word is longer than the full line length? And in that case a hyphen should be automatically added?

Otherwise the only place to break to a new line is the white space between words?

regards,

Martin.

Post by Tim Green » Sat May 03, 2008 7:16 am

Martin Wynne wrote:Otherwise the only place to break to a new line is the white space between words?

I've always thought this too but apparently Russian logic works differently... 8)
Regards,
Tim (EC Software Documentation & User Support)


Post by Sergey Tkachenko » Sat May 03, 2008 10:54 am

Line breaking rules are much more complicated than simple breaking on spaces. You can look here: http://unicode.org/reports/tr14/ , the editor's Unicode line breaking algorithm is based on these rules.
If the '$' character is used as a currency symbol, there is nothing wrong with allowing a line break before it. And, for example, it is perfectly OK to allow a line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.

The problem with '$' happens when it is not used as a currency symbol, i.e. only in text about programming (words like SYS$TABLENAME are unlikely to occur in text unrelated to programming).

PS: these complex line breaking rules are applied only to Unicode text in the editor.
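
The bracket examples above can be sketched as a small Python function. This is only an illustration of those four statements, not an implementation of UAX #14 and not the editor's code; the name `may_break_before` and the exact handling of spaces are assumptions.

```python
def may_break_before(text, i):
    """Return True if a line break is permitted just before text[i].

    Illustrative rules only, taken from the bracket examples in the post:
      - break allowed after a space
      - break allowed before '(' or after ')' even without spaces
      - no break after '(' or before ')', even across spaces
    """
    if i <= 0 or i >= len(text):
        return False
    cur = text[i]
    if cur == " ":
        return False          # spaces stay at the end of the line
    if cur == ")":
        return False          # never break before ')', even after spaces
    # Find the nearest non-space character to the left of position i.
    j = i - 1
    while j >= 0 and text[j] == " ":
        j -= 1
    if j >= 0 and text[j] == "(":
        return False          # never break after '(', even across spaces
    if text[i - 1] == " ":
        return True           # ordinary break after whitespace
    if cur == "(" or text[i - 1] == ")":
        return True           # break before '(' or after ')' without spaces
    return False


def break_points(text):
    """List all indexes where a break is permitted under the rules above."""
    return [i for i in range(len(text)) if may_break_before(text, i)]


print(break_points("f(x)y"))   # positions before '(' and after ')'
print(break_points("( a )"))   # spaces next to the brackets give no break
```

Note that whether a break is permitted depends on the pair of neighbouring characters, not on whitespace alone, which is the point being made about rule-based line breaking.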

Post by Tim Green » Sun May 04, 2008 8:37 am

Hi Sergey,

Sergey Tkachenko wrote:If '$' character is used as a currency symbol, there is nothing wrong with allowing line break before it. And, for example, it is perfectly ok to allow line break before '(' or after ')', even if there are no spaces there. On the other hand, line breaks after '(' or before ')' are undesirable, even on spaces.

This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.

It is not OK to allow a line break before '(' or after ')' if there are no spaces before or after them. If there are no spaces, that is an error in the text, but there should still be no break, because no spaces means there is no word boundary and thus no break. The definition of a word boundary is whitespace or a dash/hyphen. It might be a good idea to allow the user to define additional breaking characters if they want, but that should be an option only. If these options are not explicitly provided, then breaks should only happen on whitespace, nowhere else. Anything else is an error. :)

Post by Martin Wynne » Sun May 04, 2008 11:14 am

Tim Green wrote:This is not correct, at least not in English. There is absolutely no need whatsoever to consider anything except whitespace and hyphens for word breaks. Everything else should never break.

Hi Tim,

Normally yes. But I think two different things have got muddled up here. You need some rules for how to break in the middle of a word in the special case that a single word is longer than the available line length. Typically this occurs for example in tables with narrow fixed-width columns. For example, try shrinking your browser window on this:

Pneumonoultramicroscopicsilicovolcanoconiosis is a lung disease contracted from the inhalation of very fine silica particles, specifically from a volcano. There have been no recorded cases of Pneumonoultramicroscopicsilicovolcanoconiosis in the village of Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch so far this year.

However, these rules should be invoked only in the case that there is no white space in the line. If there is any white space, the line break should always occur there and not in the middle of a word.
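
The policy described here (wrap at white space whenever possible, and split inside a word only when that single word is longer than the whole line) can be sketched in Python as follows. The `wrap` function is a hypothetical helper written for this illustration; it does not add a hyphen at the split point and has nothing to do with how the H&M editor is actually implemented.

```python
def wrap(text, width):
    """Greedy word wrap: break at white space; split a word only if it
    cannot fit on a line by itself."""
    lines = []
    current = ""
    for word in text.split():
        # Special case: a single word longer than the full line width.
        while len(word) > width:
            if current:
                lines.append(current)   # finish the line in progress first
                current = ""
            lines.append(word[:width])  # hard split, no hyphen inserted
            word = word[width:]
        if not current:
            current = word
        elif len(current) + 1 + len(word) <= width:
            current += " " + word       # word fits after a single space
        else:
            lines.append(current)       # move the whole word to a new line
            current = word
    if current:
        lines.append(current)
    return lines


# The identifier fits on a line of its own, so it is never split at '$'.
print(wrap("see SYS$TABLENAME here", 16))
# Only when the line is narrower than the word itself is it split internally.
print(wrap("see SYS$TABLENAME here", 8))
```

With a width of 16 the identifier moves to the next line intact; with a width of 8 it is split purely by length, not at the '$'.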

regards,

Martin.
