Carl Witthoft - 24 days ago
LaTeX Question

How to search for any unicode symbol in a character string?

I've got an existing DOORS module which happens to have some rich text entries; these entries contain symbols such as 'curly' quotes. I'm trying to upgrade a DXL macro which exports a LaTeX source file, and the problem is that these high-numbered symbols are not considered "standard UTF-8" by TexMaker's import function (and in any case probably won't be processed by XeLaTeX or other converters). I can't simply use the

functions in DXL because those break the rest of the rich text, and apparently the character identifier charOf
only works over the basic set of characters, i.e. those below some numeric code value. For example, charOf(8217)
should create a right curly single quote, but when I tried code along the lines of

if (charOf(8217) == one_char)

I never get a match. I did copy the curly quote from the DOORS module and verified via an online Unicode analyzer that it was definitely Unicode decimal value 8217.

So, what am I missing here? I just want to be able to detect any symbol character, identify it correctly, and then replace it with, e.g.,
in the output stream.

My overall setup works for lower-numbered characters, since this works
(c is a single character pulled from a string):

thedeg = charOf(176)
if (thedeg == c)
    temp += "$\\degree$"


Got some help from DXL coding experts over at IBM forums.

Quoting the important stuff (there are some useful code snippets there as well):

Hey, you are right it seems intOf(char) and charOf(int) both do some modulo 256 and therefore cut anything above that off. Try:

int i = 8217;
char c = addr_(i);
print c;

Which then allows comparison of c with any input char.
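
To tie this back to the export macro, here is a minimal sketch of how the addr_ trick might be used in a character-by-character replacement loop. The function name texify and the choice of a plain apostrophe as the LaTeX replacement are my own assumptions for illustration, not part of the forum answer. It also assumes that indexing a string with s[i] yields whole Unicode characters in your DOORS version, as the comparison described above implies.

```dxl
// Hypothetical helper: copies s into a Buffer, swapping selected symbols
// for LaTeX-friendly text. Only addr_ and charOf come from the discussion above.
string texify(string s) {
    int quoteCode = 8217              // right single curly quote (U+2019)
    char rquote = addr_(quoteCode)    // avoids the modulo-256 cut-off of charOf
    char degree = charOf(176)         // below 256, so charOf is fine here

    Buffer out = create
    int i
    for (i = 0; i < length(s); i++) {
        char c = s[i]
        if (c == rquote) {
            out += "'"                // assumed replacement; pick what your preamble supports
        } else if (c == degree) {
            out += "$\\degree$"       // same replacement as in the question
        } else {
            out += c                  // everything else passes through unchanged
        }
    }

    string result = stringOf(out)
    delete out                        // Buffers are not freed automatically
    return result
}
```

Additional symbols can be handled by adding one addr_-built char and one branch per code point; for more than a handful, a lookup keyed on the integer code would probably be tidier.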