prabodhprakash - 1 month ago
Bash Question

extract substring using regex in shell script

The strings can take any of these forms:


  1. com.company.$(PRODUCT_NAME:rfc1034identifier)

  2. $(PRODUCT_BUNDLE_IDENTIFIER)

  3. com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)



I need help writing a regex that extracts every string inside $(..)

I came up with this regex:
([(])\w+([)])
but when I try to run it in a shell script, it fails with an "unmatched parenthesis" error.

This is what I executed:

echo "com.io.$(sdfsdfdsf)"|grep -P '([(])\w+([)])' -o


I need to get all matching substrings.

Answer

Your question specifies "shell", but not "bash". So I'll start with a common shell-based tool (awk) rather than assuming you can use any particular set of non-POSIX built-ins.

$ cat inp.txt

com.company.$(PRODUCT_NAME:rfc1034identifier)
$(PRODUCT_BUNDLE_IDENTIFIER)
com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)

$ awk -F'[()]' '{for(i=2;i<=NF;i+=2){print $i}}' inp.txt

PRODUCT_NAME:rfc1034identifier
PRODUCT_BUNDLE_IDENTIFIER
PRODUCT_NAME:rfc1034identifier
someRandomVariable

This awk one-liner defines a field separator that matches either an opening or a closing parenthesis. With that separator, every even-numbered field is the content you're looking for, assuming every line of input is well formed and no parentheses are nested inside other parentheses.
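To see why the even-numbered fields are the interesting ones, it can help to print every field next to its index. This is only an illustrative sketch: show_fields is a made-up helper name, and the sample string is taken from the question.

```shell
#!/bin/sh

# Hypothetical helper (not part of the answer above): split one string on
# "(" or ")" and print each resulting awk field together with its index.
show_fields() {
  printf '%s\n' "$1" |
    awk -F'[()]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Single quotes keep the shell from treating $(...) as command substitution.
show_fields 'com.company.$(PRODUCT_NAME:rfc1034identifier).$(someRandomVariable)'
# 1=[com.company.$]
# 2=[PRODUCT_NAME:rfc1034identifier]
# 3=[.$]
# 4=[someRandomVariable]
# 5=[]
```

Fields 2 and 4 hold the text between the parentheses. The empty field 5 comes from the closing parenthesis at the end of the line, and the loop in the one-liner never prints it because it only visits even indices.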

If you did want to do this in POSIX shell alone, the following would be an option:

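#!/bin/sh

# Read each line verbatim (no backslash processing, no whitespace trimming).
while IFS= read -r line; do
  # Keep going while the line still contains an opening parenthesis.
  while expr "$line" : '.*(' >/dev/null; do
    line="${line#*(}"              # drop everything up to and including the first "("
    printf '%s\n' "${line%%)*}"    # print everything before the next ")"
  done
done < inp.txt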

This steps through each line of input, slicing it up using the parentheses and printing each slice. Note that this uses expr, which is most likely an external binary, but is at least specified by POSIX.1.
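If the external expr call is a concern, the same loop can probably be written with a case pattern match, which is part of the shell itself. The following is a sketch under that assumption rather than a drop-in replacement. The function name extract_parens is made up for illustration, and unlike the script above it reads standard input and only prints a slice when a closing parenthesis follows the opening one.

```shell
#!/bin/sh

# Hypothetical helper: print the text inside every (...) pair found on stdin,
# using only shell built-in syntax (no expr, awk, or grep).
extract_parens() {
  # "|| [ -n "$line" ]" also processes a final line that lacks a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    while :; do
      case $line in
        *"("*")"*) ;;     # a "(" followed later by a ")": keep slicing
        *) break ;;       # nothing left to extract on this line
      esac
      line=${line#*(}               # drop everything through the first "("
      printf '%s\n' "${line%%)*}"   # print everything before the next ")"
    done
  done
}

# Example usage with the input file from above:
# extract_parens < inp.txt
```

As with the other approaches, this assumes parentheses are not nested.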