Dorad Dorad - 3 years ago 180
Swift Question

Swift Regex doesn't work

I am using the following extension method to get NSRange array of a substring:

extension String {
    func nsRangesOfString(findStr:String) -> [NSRange] {
        let ranges: [NSRange]
        do {
            // Create the regular expression.
            let regex = try NSRegularExpression(pattern: findStr, options: [])

            // Use the regular expression to get an array of NSTextCheckingResult.
            // Use map to extract the range from each result.
            ranges = regex.matches(in: self, options: [], range: NSMakeRange(0, self.characters.count)).map {$0.range}
        }
        catch {
            // There was a problem creating the regular expression
            ranges = []
        }
        return ranges
    }
}


However, I can't figure out why it sometimes fails. Here are two similar cases; one works and the other doesn't:

This one works:

self(String):


"וצפן (קרי: יִצְפֹּ֣ן) לַ֭יְשָׁרִים תּוּשִׁיָּ֑ה מָ֝גֵ֗ן לְהֹ֣לְכֵי תֹֽם׃"


findStr:


"קרי:"


And this one doesn't:

self(String):


"לִ֭נְצֹר אָרְח֣וֹת מִשְׁפָּ֑ט וְדֶ֖רֶךְ חסידו (קרי: חֲסִידָ֣יו) יִשְׁמֹֽר׃"


findStr:


"קרי:"


(An alternative, more reliable method would also be an acceptable answer.)

Answer

NSRange values are specified in terms of UTF-16 code units (which is what NSString uses internally), so the length must be self.utf16.count, not self.characters.count:

        ranges = regex.matches(in: self, options: [],
                               range: NSRange(location: 0, length: self.utf16.count))
            .map {$0.range}

In the case of your second string we have

let s2 = "לִ֭נְצֹר אָרְח֣וֹת מִשְׁפָּ֑ט וְדֶ֖רֶךְ חסידו (קרי: חֲסִידָ֣יו) יִשְׁמֹֽר׃"
print(s2.characters.count) // 46
print(s2.utf16.count)      // 74

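and that's why the pattern is not found with your code: the search range covers only the first 46 UTF-16 code units, and the match lies beyond that point. In your first string the match sits near the start, inside the too-short range, so it is still found.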

Starting with Swift 4, you can also compute an NSRange for the entire string as

NSRange(self.startIndex..., in: self)
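
Putting this together, here is a minimal sketch of the corrected extension, assuming Swift 4 or later. It keeps the method name from the question; the `substrings` conversion at the end is only an illustration of how to turn the NSRange results back into Swift ranges with Range(_:in:), and is not part of the original code.

```swift
import Foundation

extension String {
    /// Returns the NSRange of every match of the regular expression `findStr`.
    func nsRangesOfString(findStr: String) -> [NSRange] {
        // An invalid pattern yields an empty result, as in the question's code.
        guard let regex = try? NSRegularExpression(pattern: findStr, options: []) else {
            return []
        }
        // Swift 4+: the full string as an NSRange, measured in UTF-16 code units.
        let fullRange = NSRange(self.startIndex..., in: self)
        return regex.matches(in: self, options: [], range: fullRange).map { $0.range }
    }
}

let s2 = "לִ֭נְצֹר אָרְח֣וֹת מִשְׁפָּ֑ט וְדֶ֖רֶךְ חסידו (קרי: חֲסִידָ֣יו) יִשְׁמֹֽר׃"
let found = s2.nsRangesOfString(findStr: "קרי:")

// Illustration only: convert each NSRange back to a Range<String.Index>
// and extract the matched text.
let substrings = found.compactMap { Range($0, in: s2) }.map { String(s2[$0]) }
print(found.count, substrings)
```

Note that findStr is still interpreted as a regular expression here. If it is meant as literal text, passing it through NSRegularExpression.escapedPattern(for:) first should avoid surprises with characters such as "(" or ".".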