kairos kairos - 1 month ago 8
AppleScript Question

Extract substring from path. Make sure the extracted substring isn't followed by a character

Let's say I have this string:

/folder1/folder2/match/folder3/match.app


I want to extract a substring that ends with a matching condition ("match") but that isn't followed by a specific character (".").

The result should be:

/folder1/folder2/match





Case 1



When a "match." occurs before a "match" that is followed by any other character, or by nothing:

/folder1/folder2/match.app/folder3/match/folder4


The result should be:

/folder1/folder2/match.app/folder3/match


This is because the first "match" is followed by a ".".




Case 2



In case there are two matches:

/folder1/folder2/match/folder3/match/folder4


The result should be:

/folder1/folder2/match


I want to keep just the first substring.




Case 3



When there is no "match" that is not followed by a ".":

/folder1/folder2/match.app


The result should be:

False


The regex should output "False" when there are no occurrences.




Case 4



When a "match" is followed by anything other than "/" or the end of the string, it should not count:

/folder1/folder2/matcha/match/folder3


or

/folder1/folder2/matcha/match


The result should be:

/folder1/folder2/matcha/match


Any ideas? Thanks!




NOTE: I want to use this regex in Applescript:

set strRegEx to ???
set strResult to find text strRegEx in strTextToSearch with regexp and string result


UPDATE: Added Case 4

Answer

Here is one solution:

.*?match(?!\.)

Demo (Note: The ^ in this demo is only added to show multiple examples together; you shouldn't need it.)

Explanation:

. - matches any character

*? - repeats the previous pattern as many times as necessary, but as few times as possible ("non-greedy").

match - literal text for the word "match"

(?!...) - a negative lookahead; the overall match succeeds only if the contained pattern does not match at this position. The lookahead consumes no text, so nothing inside it is included in the result.

\. - a literal "." (the \ prevents it from being treated as "any character", like above)
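
As a quick sanity check, here is a minimal sketch that runs this pattern against the example and Cases 1 to 3 from the question. It uses Python's re module rather than AppleScript, on the assumption that lazy quantifiers and negative lookahead behave the same way in the engine behind find text; the helper name extract is made up for this illustration. Note that inside an AppleScript string literal the backslash itself has to be escaped, so the pattern would be written with "\\." there.

```python
import re

# Pattern from the answer: lazily consume text up to a "match"
# that is not immediately followed by a literal ".".
PATTERN = re.compile(r".*?match(?!\.)")

def extract(path):
    """Return the matched prefix, or False when nothing qualifies.

    `extract` is a hypothetical helper, not part of any AppleScript API.
    """
    found = PATTERN.search(path)
    return found.group(0) if found else False

# Original example: the first "match" is followed by "/", so it qualifies.
print(extract("/folder1/folder2/match/folder3/match.app"))
# Case 1: the first "match" is followed by ".", so the second one is used.
print(extract("/folder1/folder2/match.app/folder3/match/folder4"))
# Case 2: two qualifying occurrences, only the first is kept.
print(extract("/folder1/folder2/match/folder3/match/folder4"))
# Case 3: the only "match" is followed by ".", so there is no result.
print(extract("/folder1/folder2/match.app"))
```

Returning False for Case 3 is done by the helper, not by the regex itself: a regex can only match or fail to match, so the calling code has to turn "no match" into whatever value it needs.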


Edit:

Taking into consideration the "case 4" that you've now added, you could perhaps change the regex to:

.*?match(?=\/|$)

Demo

Explanation:

(?=...) is a positive lookahead.

\/ matches a literal "/" character.

$ matches the end of the line.

\/|$ matches either of the above.
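
To see why the lookahead had to change for Case 4, here is a rough sketch comparing both patterns on the Case 4 paths, then running the revised one over the remaining cases. It is written in Python's re module purely to test the patterns, assuming the engine used from AppleScript handles lookaheads the same way; the names FIRST, REVISED and first_hit are invented for this example.

```python
import re

FIRST = re.compile(r".*?match(?!\.)")       # pattern before the edit
REVISED = re.compile(r".*?match(?=\/|$)")   # pattern after the edit

def first_hit(pattern, path):
    """Return the text matched by `pattern` in `path`, or False if none."""
    found = pattern.search(path)
    return found.group(0) if found else False

case4 = "/folder1/folder2/matcha/match/folder3"

# The negative lookahead only rules out ".", so it stops inside "matcha".
print(first_hit(FIRST, case4))    # /folder1/folder2/match
# The positive lookahead demands "/" or end of string, so "matcha" is skipped.
print(first_hit(REVISED, case4))  # /folder1/folder2/matcha/match

# The revised pattern on the other inputs from the question.
for path in (
    "/folder1/folder2/matcha/match",                     # Case 4, "match" at the end
    "/folder1/folder2/match/folder3/match.app",          # original example
    "/folder1/folder2/match.app/folder3/match/folder4",  # Case 1
    "/folder1/folder2/match/folder3/match/folder4",      # Case 2
    "/folder1/folder2/match.app",                        # Case 3, no result
):
    print(first_hit(REVISED, path))
```

The "/" does not strictly need a backslash in most engines, which may be convenient in AppleScript, where every backslash in a string literal has to be doubled.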
