Radivarig Radivarig - 7 months ago 22
Perl Question

regex for capturing path from a string with optional character ~ (perl|awk|sed|..)

I want to match everything between the first and last slash (/), including an optional ~ before the first slash.

I used this for the first part:

echo ~~a~/dir1/di r2/b.c \
| perl -pe 's/[^\/]*(\/.*\/).*/\1/'


which produces /dir1/di r2/.

This match includes the tilde:

perl -pe 's/.*(~\/.*\/).*/\1/'


but adding ? to make the character optional doesn't seem to work, as in these cases:

perl -pe 's/.*(~?\/.*\/).*/\1/'
-> /di r2/


perl -pe 's/.*((?:~)\/.*\/).*/\1/'
-> ~~a/dir1/di r2/b.c


What am I doing wrong?

Answer

If I understand the desired output correctly, this works for me with or without a tilde:

echo "path /d1/d2/43a/" | perl -nE '$_ =~ m{ ( ~? (?: /.*/ | /) ) }x; say "$1"'

Prints

/d1/d2/43a/

Same Perl code, with a tilde before the first slash in the input

echo "path ~/d1/d2/43a/" | perl -nE '$_ =~ m{ ( ~? (?: /.*/ | /) ) }x; say "$1"'

prints

~/d1/d2/43a/

Notes   Using \1 in the replacement part of a substitution is discouraged; use $1 instead. Using {} as delimiters only saves you from escaping /, which makes the pattern more readable. Otherwise the same regex works with / as the delimiter, as long as each / inside the pattern is escaped.


Update

To also catch a lone ~/ (or /), the simplest change was to add that case explicitly: /.*/ | /. A non-capturing group around this alternation lets the optional ~ be captured in both cases. I removed the -w flag so that no warnings are issued when the input string has no slashes at all; in that case only an empty line is printed.