Kismet Agbasi - 4 months ago
Linux Question

How Can I Match the Last String Using this REGEX

I'm working on a bash script and need to use sed with a regex to match this line in a text file:

database.system = "pgsql://hostaddr=127.0.0.1 port=5432 dbname=mydb user=myuser password=mypassword options='' application_name='myappname'";


This is the regex I've come up with:

database.system\s=\s((?=")(.*)(?=;))


So far, my regex matches everything except the final semicolon. How do I modify the regex to catch the semicolon as well?

Answer

You're using look-ahead assertions ((?=...)) in your regular expression. These are a Perl-style feature that sed doesn't support; sed understands only POSIX basic and extended regular expressions.

However, you don't need them if all you're trying to do is extract the string inside the double quotes (using GNU sed syntax):

line=$'database.system = "pgsql://hostaddr=127.0.0.1 port=5432 dbname=mydb user=myuser password=mypassword options=\'\' application_name=\'myappname\'";'

sed -rn 's/database\.system\s*=\s*"(.*)";/\1/p' <<<"$line"

will extract

pgsql://hostaddr=127.0.0.1 port=5432 dbname=mydb user=myuser password=mypassword options='' application_name='myappname'

  • -r activates support for extended regular expressions, which function (more) like regular expressions in other languages.

  • -n suppresses printing of each input line by default, so that an explicit output command is needed to produce output.

  • s/<regex>/<replacement>/p matches each input line against <regex>, replaces the matched part with <replacement>, and prints the resulting line (p), but only if a match was found.
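To see how -n and the p flag interact, here is a small sketch that uses made-up input lines (not the asker's data). The variable names with_n and without_n are purely illustrative.

```shell
# Two sample input lines; only the first one matches the regex.
# With -n and the p flag: only lines where the substitution succeeded are printed.
with_n=$(printf '%s\n' 'keep this' 'skip that' | sed -n 's/keep/KEEP/p')

# Without -n: sed prints every line, whether or not it was changed.
without_n=$(printf '%s\n' 'keep this' 'skip that' | sed 's/keep/KEEP/')

printf 'with -n:\n%s\n\n' "$with_n"
printf 'without -n:\n%s\n' "$without_n"
```

The first command outputs only the modified line, while the second also passes the non-matching line through unchanged. That is why the combination of -n and p works as a filter.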

The basic approach is to match the entire line, yet limit the (one and only) capture group to the substring of interest, and then replace the line with only the capture group, which effectively outputs only the substring of interest for each matching line.
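Since the question mentions reading the line from a text file, the same approach could be sketched as follows. This is only an illustration under some assumptions: the temporary file, its extra lines, and the variable names conf and dsn are invented for the example. It also swaps -r for -E and \s for the POSIX class [[:space:]], which both GNU and BSD sed should accept.

```shell
# Create a throwaway config file (hypothetical contents, for demonstration only).
conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample settings
database.system = "pgsql://hostaddr=127.0.0.1 port=5432 dbname=mydb user=myuser password=mypassword options='' application_name='myappname'";
other.setting = "unrelated";
EOF

# Match the whole line, capture only the text between the double quotes,
# and replace the line with that capture group.
# -E: extended regular expressions; -n plus p: print matching lines only.
dsn=$(sed -En 's/^database\.system[[:space:]]*=[[:space:]]*"(.*)";[[:space:]]*$/\1/p' "$conf")

printf '%s\n' "$dsn"

# Clean up the temporary file.
rm -f "$conf"
```

Anchoring the regex with ^ and $ is an extra precaution, not part of the original command: it keeps lines that merely contain database.system somewhere in the middle from matching.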