Desmond Hume - 6 months ago
Objective-C Question

Is there a way to check if a string contains a Unicode letter?

In Cocoa, regular expressions presumably follow the ICU rules for character matching, and ICU supports Unicode character properties such as

\p{L}
for matching all kinds of Unicode letters. However,

NSString* str = @"A";
NSPredicate* pred = [NSPredicate predicateWithFormat:@"SELF MATCHES '\\p{L}'"];
NSLog(@"%d", [pred evaluateWithObject:str]);


doesn't seem to work. The pattern fails to compile at runtime:

Can't do regex matching, reason: Can't open pattern U_REGEX_BAD_INTERVAL (string A, pattern p{L}, case 0, canon 0)


If character properties are not supported (are they?), how else could I check if a string contains a Unicode letter in my iOS app?

Answer

There are two points here: MATCHES requires the pattern to match the entire string, and the backslash that reaches the regex engine must be a literal backslash.

The regex can thus be

(?s).*\p{L}.*

Which means:

  • (?s) - enable DOTALL mode, so that . also matches line breaks
  • .* - match zero or more characters of any kind
  • \p{L} - match a Unicode letter
  • .* - match zero or more characters of any kind.

In an Objective-C string literal, double the backslash:

NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\p{L}.*'"];


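If backslashes inside the NSPredicate format string are treated specially (the error message in the question shows the pattern arriving as p{L}, which suggests the format parser consumes one level of escaping), use four backslashes: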

NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\\\p{L}.*'"];
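
A way to sidestep the escaping question is to keep the pattern out of the format string. The following is a minimal sketch, not part of the original answer: the helper names ContainsLetterUsingPredicate and ContainsLetterUsingSearch are hypothetical. The first passes the pattern as a %@ argument, so only the Objective-C literal escaping should apply. The second uses rangeOfString:options: with NSRegularExpressionSearch, which searches for a substring and so needs no surrounding .*

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: full-string match through NSPredicate.
// The pattern is supplied as a %@ argument, so the predicate format
// parser does not process its backslashes.
static BOOL ContainsLetterUsingPredicate(NSString *string) {
    NSString *pattern = @"(?s).*\\p{L}.*";
    NSPredicate *predicate =
        [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern];
    return [predicate evaluateWithObject:string];
}

// Hypothetical helper: regex substring search on NSString.
// A search succeeds as soon as one letter is found anywhere in the string.
static BOOL ContainsLetterUsingSearch(NSString *string) {
    NSRange range = [string rangeOfString:@"\\p{L}"
                                  options:NSRegularExpressionSearch];
    return range.location != NSNotFound;
}
```

Under these assumptions, both helpers should return YES for @"A" and NO for @"1234".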