Martin AJ - 2 months ago
JavaScript Question

How can I detect Persian characters?

Let me explain my question with some examples:

// expected result: ("true" means "rtl" and "false" means "ltr")
var test = "..!"; // true
var test = "te"; // false
var test = "!te"; // false
var test = "..ق"; // true
var test = "مب"; // true
var test = "eس"; // false
var test = "سe"; // true

Here is my current code:

// set the direction of the comment in the textarea
var x = new RegExp("[A-Za-z]"); // matches an ASCII letter
// only the very first character is tested here
var isAscii = x.test($("#textarea-edit-"+post_id_for_edit).val().substring(0, 1));
if (isAscii) {
  $("#textarea-edit-"+post_id_for_edit).css("direction", "ltr");
} else {
  $("#textarea-edit-"+post_id_for_edit).css("direction", "rtl");
}

I want the direction to be based on the first character that is a letter (either Persian or English). But my code looks only at the very first character, which can be anything, even a punctuation mark.

How can I do that?


I suggest using a regex that alternates between a Persian letter and an ASCII letter, and captures only one of them (say, the ASCII letter). Because the regex engine returns the leftmost match, the match is always the first letter in the string. If there is a match and Group 1 participated, the first letter is ASCII. If there is no match, or there is a match but Group 1 did not participate, the text is treated as Persian.

See the code below:

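function check(s) {
  var PersianOrASCII = /[آ-ی]|([a-zA-Z])/;
  var m = s.match(PersianOrASCII);
  if (m !== null) {
    if (m[1]) {
      return false; // first letter is ASCII: ltr
    } else {
      return true;  // first letter is Persian: rtl
    }
  } else {
    return true;    // no letter at all: default to rtl
  }
}
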
console.log(check("..!"));  // true
console.log(check("te"));   // false
console.log(check("!te"));  // false
console.log(check("..ق"));  // true
console.log(check("مب"));   // true 
console.log(check("eس"));   // false
console.log(check("سe"));   // true
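
To tie this back to the textarea in the question, the same regex can be mapped straight to a CSS direction value. This is only a sketch: directionOf is a hypothetical helper name, not part of the original answer, and the jQuery lines are left as comments because they assume the page and the post_id_for_edit variable from the question.

```javascript
// Hypothetical helper: returns "rtl" or "ltr" based on the first letter found.
function directionOf(text) {
  // Group 1 is set only when the first letter matched is an ASCII letter.
  var m = text.match(/[آ-ی]|([a-zA-Z])/);
  return (m !== null && m[1]) ? "ltr" : "rtl";
}

// Possible use with jQuery, as in the question (assumes the page and jQuery are present):
// var $ta = $("#textarea-edit-" + post_id_for_edit);
// $ta.css("direction", directionOf($ta.val()));
```

Strings with no letter at all fall back to "rtl", matching the expected results listed in the question.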

NOTE: You may fine-tune the Persian letter class, for example with [\u0600-\u06FF] or [\u0600-\u06FF\uFB8A\u067E\u0686\u06AF]. A more selective option is [\u06A9\u06AF\u06C0\u06CC\u060C\u062A\u062B\u062C\u062D\u062E\u062F\u063A\u064A\u064B\u064C\u064D\u064E\u064F\u067E\u0670\u0686\u0698\u200C\u0621-\u0629\u0630-\u0639\u0641-\u0654] (from persianRex).
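
As an illustration of the first variant, here is a sketch of the same function with the [\u0600-\u06FF] class swapped in. The name checkWithBlockRange is hypothetical. One trade-off to be aware of: that range covers the whole Unicode Arabic block, which also contains punctuation and digits (for example the Arabic comma, \u060C), so such characters would count as a "Persian letter" here.

```javascript
// Variant of check() that uses the \u0600-\u06FF range instead of a literal letter range.
function checkWithBlockRange(s) {
  // Group 1 is set only when the first match is an ASCII letter.
  var m = s.match(/[\u0600-\u06FF]|([a-zA-Z])/);
  // true means rtl: either the first match is in the Arabic block, or nothing matched.
  return !(m !== null && m[1]);
}

console.log(checkWithBlockRange("..\u0642"));  // true
console.log(checkWithBlockRange("e\u0633"));   // false
console.log(checkWithBlockRange("\u060Cte"));  // true (Arabic comma is inside the range)
```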