Tomer Something, 4 years ago
C# Question

Encode and decode a multilingual string in C#

I want to encode and then decode a string that contains multilingual characters, where the languages, the length, and the character positions (for example, Chinese characters at indexes 8-10) are unknown.

Is it even possible to have a "universal" encoder? Or some algorithm that knows how to decode this?

Searching the web turned up only solutions that require knowing where the special characters are and which language they belong to, and I can't even know the language itself.

Any ideas?

EDIT:
Example: a string that consists of several languages, such as:


"Hello {CHINESE} my {LATIN} is rusted"


which consists of English, Chinese, and Latin.

But when I do

var test = ASCIIEncoding.ASCII.GetBytes(someStr);


and then

ASCIIEncoding.ASCII.GetString(test)


the "special characters" (i.e., the non-English characters) are converted to question marks.

Answer Source

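Don't use ASCII encoding: it covers only 128 characters (code points 0-127), so it cannot represent Chinese or other non-English characters, and .NET substitutes a question mark for each character it cannot encode.

Use a Unicode encoding instead. Unicode covers the characters of all these languages, so you don't need to know which languages the string contains or where their characters are. For example, with UTF-16 (Encoding.Unicode in .NET):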

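// Encoding.Unicode is UTF-16 (little-endian); requires "using System.Text;"
var test = Encoding.Unicode.GetBytes(someStr);
// Decode with the same encoding that produced the bytes
var test1 = Encoding.Unicode.GetString(test);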
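
To make the difference concrete, here is a minimal sketch of a round trip through three encodings. The sample string (English, Chinese, and Hebrew, written with \u escapes so the source file's own encoding doesn't matter) and the class name are made up for illustration; only the standard System.Text.Encoding members are used.

```csharp
using System;
using System.Text;

class RoundTripDemo
{
    static void Main()
    {
        // Hypothetical sample: "Hello", two Chinese characters, four Hebrew letters.
        string original = "Hello \u4F60\u597D \u05E9\u05DC\u05D5\u05DD";

        // ASCII cannot represent the non-English characters; by default
        // each one is replaced with '?' during encoding, so data is lost.
        byte[] asciiBytes = Encoding.ASCII.GetBytes(original);
        string viaAscii = Encoding.ASCII.GetString(asciiBytes);

        // UTF-16 (Encoding.Unicode) and UTF-8 can both represent every
        // character in the sample, so decoding returns the original string.
        string viaUtf16 = Encoding.Unicode.GetString(Encoding.Unicode.GetBytes(original));
        string viaUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(original));

        Console.WriteLine(viaAscii == original);  // False
        Console.WriteLine(viaUtf16 == original);  // True
        Console.WriteLine(viaUtf8 == original);   // True
    }
}
```

Either Unicode encoding works as long as the same one is used for both GetBytes and GetString. UTF-8 is usually the more compact choice when most of the text is English, while UTF-16 matches how .NET strings are stored in memory.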