Winston - 2 months ago
Objective-C Question

Using NSRegularExpression to extract URLs on the iPhone

I'm using the following code in my iPhone app, taken from http://tinyurl.com/remarkablepixels, to extract all URLs from stripped .html code.

I can only extract the first URL, but I need an array containing all URLs. My NSArray doesn't contain an NSString for each URL, only the objects' descriptions.

How do I make my arrayOfAllMatches return all URLs, as NSStrings?

-(NSArray *)stripOutHttp:(NSString *)httpLine {

    // Setup an NSError object to catch any failures
    NSError *error = NULL;

    // create the NSRegularExpression object and initialize it with a pattern
    // the pattern will match any http or https url, with option case insensitive
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)?" options:NSRegularExpressionCaseInsensitive error:&error];

    // create an NSRange object using our regex object for the first match in the string httpline
    NSRange rangeOfFirstMatch = [regex rangeOfFirstMatchInString:httpLine options:0 range:NSMakeRange(0, [httpLine length])];

    NSArray *arrayOfAllMatches = [regex matchesInString:httpLine options:0 range:NSMakeRange(0, [httpLine length])];

    // check that our NSRange object is not equal to range of NSNotFound
    if (!NSEqualRanges(rangeOfFirstMatch, NSMakeRange(NSNotFound, 0))) {
        // Since we know that we found a match, get the substring from the parent string by using our NSRange object
        NSString *substringForFirstMatch = [httpLine substringWithRange:rangeOfFirstMatch];

        NSLog(@"Extracted URL: %@",substringForFirstMatch);
        NSLog(@"All Extracted URLs: %@",arrayOfAllMatches);

        // return all matching url strings
        return arrayOfAllMatches;
    }

    return NULL;
}

Here is my NSLog output:

Extracted URL: http://mydomain.com/myplayer
All Extracted URLs: (
"<NSExtendedRegularExpressionCheckingResult: 0x106ddb0>{728, 53}{<NSRegularExpression: 0x106bc30> http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)? 0x1}",
"<NSExtendedRegularExpressionCheckingResult: 0x106ddf0>{956, 66}{<NSRegularExpression: 0x106bc30> http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)? 0x1}",
"<NSExtendedRegularExpressionCheckingResult: 0x106de30>{1046, 63}{<NSRegularExpression: 0x106bc30> http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)? 0x1}",
"<NSExtendedRegularExpressionCheckingResult: 0x106de70>{1129, 67}{<NSRegularExpression: 0x106bc30> http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)? 0x1}"
)

Answer

The method matchesInString:options:range: returns an array of NSTextCheckingResult objects. You can use fast enumeration to iterate through the array, pull out the substring of each match from your original string, and add the substring to a new array.

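// error receives the failure details if the pattern cannot be compiled
NSError *error = nil;

// https? matches both http and https; the question's http? only makes the "p" optional, so https URLs would be skipped
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"https?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)?" options:NSRegularExpressionCaseInsensitive error:&error];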

NSArray *arrayOfAllMatches = [regex matchesInString:httpLine options:0 range:NSMakeRange(0, [httpLine length])];

NSMutableArray *arrayOfURLs = [[NSMutableArray alloc] init];

for (NSTextCheckingResult *match in arrayOfAllMatches) {    
    NSString* substringForMatch = [httpLine substringWithRange:match.range];
    NSLog(@"Extracted URL: %@",substringForMatch);

    [arrayOfURLs addObject:substringForMatch];
}

// return non-mutable version of the array
return [NSArray arrayWithArray:arrayOfURLs];
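
For reference, here is a minimal self-contained sketch of the same idea, packaged as a plain C function so it can be tried outside the app. The function name ExtractURLStrings is hypothetical and not from the original post. It also assumes the pattern should start with https? (so that both http and https URLs match, as the question's comment intends) rather than the original http?.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption, not from the original post).
// Returns the text of every URL match in `text` as an NSString.
static NSArray *ExtractURLStrings(NSString *text) {
    NSError *error = nil;

    // Same pattern as above, except "https?" so that the "s" is the optional part.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"https?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)?"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // The pattern failed to compile; report it and return an empty array.
        NSLog(@"Could not create regex: %@", error);
        return [NSArray array];
    }

    // matchesInString:options:range: returns NSTextCheckingResult objects,
    // not strings, so each match is converted via its range.
    NSArray *matches = [regex matchesInString:text
                                      options:0
                                        range:NSMakeRange(0, [text length])];

    NSMutableArray *urls = [NSMutableArray arrayWithCapacity:[matches count]];
    for (NSTextCheckingResult *match in matches) {
        [urls addObject:[text substringWithRange:match.range]];
    }

    // Return an immutable copy of the collected strings.
    return [NSArray arrayWithArray:urls];
}
```

Calling ExtractURLStrings(httpLine) from stripOutHttp: would then hand back NSStrings that can be logged or used directly. An empty array (rather than NULL) signals "no matches" in this sketch, which is a design choice of the example, not something the original code does.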