Carl Witthoft - 2 months ago
LaTeX Question

How to search for any Unicode symbol in a character string?

I've got an existing DOORS module that has some rich text entries containing symbols such as 'curly' quotes. I'm trying to upgrade a DXL macro which exports a LaTeX source file, and the problem is that these high-numbered symbols are not considered "standard UTF-8" by TexMaker's import function (and in any case probably won't be processed by XeLaTeX or other converters). I can't simply use the UnicodeString functions in DXL because those break the rest of the rich text, and apparently the character function charOf(decimal_number_code) only works over the basic set of characters, i.e. below some numeric code value. For example, charOf(8217) should create a right-curly single quote, but when I tried code along the lines of

if (charOf(8217) == one_char)


I never get a match. I did copy the curly quote from the DOORS module and verified via an online Unicode analyzer that it was definitely Unicode decimal value 8217.

So, what am I missing here? I just want to be able to detect any symbol character, identify it correctly, and then replace it with, e.g., \textquoteright in the output stream.

My overall setup works for lower-numbered chars, since this works (c is a single character pulled from a string):

thedeg = charOf(176)
if (thedeg == c)
{
    temp += "$\\degree$"
}
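The per-character test above extends naturally to a lookup table, so each new symbol needs one table entry instead of another if block. The following is only a sketch of that logic in Python, not DXL; the table name, the function name, and every entry other than the degree sign and \textquoteright are assumptions added for illustration.

```python
# Hypothetical lookup table: Unicode code point -> LaTeX replacement.
# Only the 176 and 8217 entries come from the question; the rest are
# illustrative additions using standard LaTeX text-symbol commands.
LATEX_FOR_CODEPOINT = {
    176: r"$\degree$",             # degree sign
    8216: r"\textquoteleft{}",     # left single curly quote
    8217: r"\textquoteright{}",    # right single curly quote
    8220: r"\textquotedblleft{}",  # left double curly quote
    8221: r"\textquotedblright{}", # right double curly quote
}


def to_latex(text):
    """Replace each mapped symbol; pass every other character through."""
    out = []
    for ch in text:
        # ord() gives the full code point, so values above 255 match too.
        out.append(LATEX_FOR_CODEPOINT.get(ord(ch), ch))
    return "".join(out)
```

The trailing {} after each command is a design choice: it terminates the command name so that a following letter is not read as part of it.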

Answer

Got some help from DXL coding experts over at IBM forums.

Quoting the important stuff (there are some useful code snippets there as well):

Hey, you are right it seems intOf(char) and charOf(int) both do some modulo 256 and therefore cut anything above that off. Try:

int i = 8217;
char c = addr_(i);
print c;

Which then allows comparison of c with any input char.
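To see why the original comparison never matched, it may help to work through the wrap-around the quoted answer describes. This is a Python model of that reported behaviour, not DXL and not a verified description of DXL internals; the function name is made up for illustration.

```python
def truncate_to_byte(code):
    """Model the reported behaviour: only the value modulo 256 survives."""
    return code % 256


# 8217 = 32 * 256 + 25, so the curly quote would collapse to code 25.
print(truncate_to_byte(8217))  # 25

# Codes below 256 pass through unchanged, which is why the degree
# sign (176) worked in the question.
print(truncate_to_byte(176))   # 176
```

Under this model, charOf(8217) would yield character 25, a control character, so it could never equal the curly quote read from the module.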
