Unicode symbols cannot be selected in PDF output

Please post all questions and comments regarding Help & Manual 7 here.

Moderators: Alexander Halser, Tim Green

MartinAP
Posts: 13
Joined: Wed Sep 26, 2018 3:52 pm

Hello,

I'm struggling to get symbols into my PDF output in such a way that they can be selected. In particular, I want to use the right-arrow symbol, Unicode U+2192. What I have tried so far:
  • Made sure to use Arial throughout, because I would assume that if any font should not cause problems, it is this one.
  • Made sure there are no other symbols that might be missing, i.e., I temporarily replaced the right-arrow Unicode symbol with "->" and published the PDF. In that case I was able to select the text, so there are definitely no other symbols causing a problem.
  • In the Font Embedding options for PDF publishing, Font Mode is set to "Embed TrueType fonts", no fonts are excluded, and I have unticked both "Export Symbol text as glyphs" and "Export all text as glyphs".
  • Inserted the Unicode symbol both directly through the Symbol menu in Help & Manual and by copy and paste from the Windows Character Map.
Now, during testing, I found the following:
  • After having replaced all the unicode symbol arrows by "->", I can select the text in the PDF
  • Then, after just inserting the right-arrow unicode symbol right before the "->", I can still select the text, including the unicode symbol
  • Then, after just deleting the "->", I can NOT select the right-arrow unicode symbol in the PDF anymore (though I can select the directly preceding and subsequent text)
It appears that it somehow makes a difference whether or not the symbol is enclosed by spaces.

I really don't understand this behavior and would appreciate it if anyone could enlighten me. Surely it must be possible to include Unicode symbols that are present in the font being used in such a way that they can be selected in the PDF.
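When a symbol is pasted from the Character Map or inserted from a symbol menu, it can help to confirm which code point actually ended up in the text, since several arrow-like characters look alike. The following is a minimal Python sketch using only the standard library. The function name `describe` is chosen for illustration and has nothing to do with Help & Manual itself.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) for every non-ASCII character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
        if ord(ch) > 0x7F
    ]

# Paste the text copied from the topic editor in place of this sample string.
print(describe("File \u2192 Open"))
```

If the pasted character reports anything other than U+2192 (for example a private-use code point from a symbol font), the font check for Arial would not apply to it.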
Last edited by MartinAP on Wed Oct 28, 2020 12:33 pm, edited 1 time in total.
Tim Green
Site Admin
Posts: 23156
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Hi Martin,

Look in your PDF font embedding options to see whether you have activated the setting to convert symbol font characters to glyphs. If you have, they will be exported as SVG glyphs and those will not be selectable.
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
MartinAP

Hi Tim,

as I've mentioned in my post already, I have unticked "Export Symbol text as glyphs" as well as unticked "Export all text as glyphs" - are there any related settings I should be aware of?
Tim Green

MartinAP wrote: Wed Oct 28, 2020 12:34 pm
as I've mentioned in my post already, I have unticked "Export Symbol text as glyphs" as well as unticked "Export all text as glyphs" - are there any related settings I should be aware of?
In that case what is probably happening is that the font you are using does not include the symbols you are inserting. You will probably need to specifically format them with a font that does include those symbols.
Regards,
Tim (EC Software Documentation & User Support)

MartinAP

Unfortunately, that cannot be the issue either because, as I also mentioned, I have already changed the formatting to Arial, and I'm 100% sure that Arial includes the character I've been testing with (Unicode 2192). Therefore, it must be something else.

As I've stated, I can apparently get the text including this symbol to be selectable in the PDF if the symbol is directly followed by text. However, as soon as I separate the symbol and the subsequent text by a space, I cannot select the symbol in the PDF output anymore. I find this rather puzzling.

Another interesting observation I just made: starting from the situation above (only the Unicode symbol cannot be selected in the PDF, but the preceding and subsequent text can), if I select the whole text in H&M, re-select the current style in the drop-down menu, and then publish, not only the symbol but also the preceding and subsequent text can no longer be selected.

I have also noticed that when I type something, insert the symbol, and then continue typing, the normal text has the style "Normal", but the symbol has the style "Normal+". I think that points to where the issue might be. It's still the same font (Arial), though.
Tim Green

Hi Martin,

I just checked with our developers on this. The problem here is that PDF cannot handle normal Unicode, so these symbols will always be converted into glyphs, which cannot be selected. They suggest that you try the CID font embedding option. That may make them selectable. Otherwise they will be glyphs and there isn't anything that can be done about it.
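As general background to the selectability question: in a PDF, a text run stores font-specific character codes, and a viewer typically turns a selection back into Unicode through the font's ToUnicode CMap, when one is embedded. Whether Help & Manual writes such a map for a given symbol is not stated in this thread, so the following is only a sketch of the general mechanism. The CMap fragment, the character codes, and the function name `parse_bfchar` are made up for illustration.

```python
import re

# Hypothetical fragment of a ToUnicode CMap as it might appear in a PDF font
# dictionary: each bfchar line maps a character code to UTF-16BE code units.
SAMPLE_CMAP = """
2 beginbfchar
<0003> <0020>
<0111> <2192>
endbfchar
"""

def parse_bfchar(cmap_text):
    """Map character codes to Unicode strings, reading bfchar sections only."""
    mapping = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for code, uni in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section):
            mapping[int(code, 16)] = bytes.fromhex(uni).decode("utf-16-be")
    return mapping

mapping = parse_bfchar(SAMPLE_CMAP)
print(mapping.get(0x0111))  # the right arrow, because this code is mapped
print(mapping.get(0x0222))  # None: no Unicode value is recorded for this code
```

If a symbol is instead drawn as vector outlines rather than as a text run, there is no character code to map at all, which would be consistent with it not being selectable.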
Regards,
Tim (EC Software Documentation & User Support)

MartinAP

Huh, but then how do you explain that in some circumstances, as explained earlier, I can select them? I'm quoting myself:
  • After having replaced all the unicode symbol arrows by "->", I can select the text in the PDF
  • Then, after just inserting the right-arrow unicode symbol right before the "->", I can still select the text, including the unicode symbol
  • Then, after just deleting the "->", I can NOT select the right-arrow unicode symbol in the PDF anymore (though I can select the directly preceding and subsequent text)
Tim Green

Hi Martin,
Huh, but then how do you explain that in some circumstances, as explained earlier, I can select them?
I haven't looked into this in exhaustive detail (this is information from the HM developers) but my guess would be that not all symbols need to be converted. 8)
Regards,
Tim (EC Software Documentation & User Support)
