Dorothy P - 2 months ago
Bash Question

Search for a particular string, extract a number, and add a few lines below it containing that number

This is how my text file looks:

!
hello_group serial_1234
hello-domain serial_1234
!
!
hello_group serial_2345
hello-domain serial_2345
!





This is how I want to see my result :

!
hello_group serial_1234
hello-domain serial_1234
my_content xxxx.1234
my_another_content yyyy.1234
!
!
hello_group serial_2345
hello-domain serial_2345
my_content xxxx.2345
my_another_content yyyy.2345
!





I want to search for hello-domain and, in that line, extract the number that follows serial_. I then want to store that number in a variable and build my content from it, adding the prepared content below the hello-domain line.

Answer

Using sed

Try:

sed -E 's/hello-domain serial_([[:digit:]]+)/&\nmy_content xxxx.\1\nmy_another_content yyyy.\1/' file

For example, with your input data:

$ sed -E 's/hello-domain serial_([[:digit:]]+)/&\nmy_content xxxx.\1\nmy_another_content yyyy.\1/' file
!
hello_group serial_1234
hello-domain serial_1234
my_content xxxx.1234
my_another_content yyyy.1234
!
!
hello_group serial_2345
hello-domain serial_2345
my_content xxxx.2345
my_another_content yyyy.2345
!

How it works:

The sed script consists of a single substitute command:

s/hello-domain serial_([[:digit:]]+)/&\nmy_content xxxx.\1\nmy_another_content yyyy.\1/

This looks for a line containing hello-domain serial_ followed by one or more digits, ([[:digit:]]+). With -E (extended regular expressions), the parentheses form a capture group, so those digits are saved in group 1.

When such a match is found, it is replaced with itself, &, followed by a newline, \n, then my_content xxxx. and group 1, \1, then another newline, \n, then my_another_content yyyy. and group 1, \1. Note that \n in the replacement text is a GNU sed feature; other sed implementations may require a backslash followed by a literal newline instead.
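For a sed that does not understand \n in the replacement (BSD/macOS sed, for instance), the same substitution can be sketched with a backslash followed by a real newline. This is a minimal sketch, not part of the original answer: the temporary sample file and the variable names sample and sed_result are assumptions made for illustration.

```shell
# Build a small sample file (hypothetical temp path) shaped like the question's input.
sample=$(mktemp)
printf '%s\n' '!' 'hello_group serial_1234' 'hello-domain serial_1234' '!' > "$sample"

# Same substitution as above, but each newline in the replacement is written
# as a backslash followed by an actual line break instead of \n.
sed_result=$(sed -E 's/hello-domain serial_([[:digit:]]+)/&\
my_content xxxx.\1\
my_another_content yyyy.\1/' "$sample")

printf '%s\n' "$sed_result"
rm -f "$sample"
```

Inside single quotes the shell passes the backslash and the line break through to sed unchanged, which is why the script spans three lines.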

Using awk

$ awk -F_ '{print} /hello-domain serial_/{print "my_content xxxx." $NF; print "my_another_content yyyy." $NF}' file
!
hello_group serial_1234
hello-domain serial_1234
my_content xxxx.1234
my_another_content yyyy.1234
!
!
hello_group serial_2345
hello-domain serial_2345
my_content xxxx.2345
my_another_content yyyy.2345
!

How it works:

  • -F_

    This makes _ the field separator. As a result, the number we want is the last field on the line, which awk denotes $NF.

  • print

    This prints each line to output.

  • /hello-domain serial_/{print "my_content xxxx." $NF; print "my_another_content yyyy." $NF}

    For any line that matches the regex hello-domain serial_, we also print two more lines, each ending with the last field (that is, the number) on the current line.
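Since the question asks to store the number in a variable, the awk approach can also be sketched with an explicit variable instead of $NF. This is a variation rather than the answer's exact command: it uses the default whitespace field separator, matches the second field strictly, and the names sample, awk_result and num are assumptions chosen for illustration.

```shell
# Build a small sample file (hypothetical temp path) shaped like the question's input.
sample=$(mktemp)
printf '%s\n' '!' 'hello_group serial_2345' 'hello-domain serial_2345' '!' > "$sample"

awk_result=$(awk '
{ print }                                   # echo every input line unchanged
$1 == "hello-domain" && $2 ~ /^serial_[0-9]+$/ {
    num = $2                                # copy the second field, e.g. serial_2345
    sub(/^serial_/, "", num)                # strip the prefix, leaving only the digits
    print "my_content xxxx." num
    print "my_another_content yyyy." num
}' "$sample")

printf '%s\n' "$awk_result"
rm -f "$sample"
```

Matching on $1 and anchoring the pattern on $2 means the extra lines are added only when the line is exactly a hello-domain entry with a numeric serial, which may or may not be stricter than the real data needs.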
