DomainsFeatured - 1 year ago
Linux Question

Match Anything In Between Strings For Linux Grep Command

I have read the post "grep all characters including newline", but I am not working with XML, so my case is a bit different.

I have the following data:

Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>

Using this command
cat file.txt | grep -o '<tag.*tag>\|^--.*'
I get:

<tag>Example line 1</tag>

However, I want the output to be:

<tag>Example line 1</tag>
<tag>Example line 2</tag>

How can I match anything between the strings, including the newline?

Note: I need to use <tag> and </tag> as the delimiting strings because other files can contain multiple tags and text in between the lines. I will update the sample data to show that.

Answer

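This is easier to do with GNU awk (gawk), using </tag> as the record separator. RT holds the text that matched RS, so a trailing record with no closing tag is skipped, and sub() drops any text that precedes <tag> in the record:

awk -v RS='</tag>' 'RT {gsub(/\n/, ""); sub(/^.*<tag>/, "<tag>"); print $0 RT}' file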

<tag>Example line 1</tag>
<tag>Example line 2</tag>
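
If gawk is not available, a grep-based variant that stays close to the original command may also work. The sketch below is not part of the answer above. It assumes the text between <tag> and </tag> never contains a `<` character, and it builds a hypothetical sample file whose layout is inferred from the expected output in the question. The function name `extract_tags` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file; its layout is an assumption based on the
# expected output shown in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>
EOF

# extract_tags FILE
# Delete every newline so the whole file becomes a single line, then let
# grep -o print each <tag>...</tag> pair on its own line.
# [^<]* stops at the next '<', so a match cannot run past its closing tag.
extract_tags() {
    tr -d '\n' < "$1" | grep -o '<tag>[^<]*</tag>'
}

extract_tags "$sample"
```

Like the awk command, this removes the newlines inside each match rather than replacing them with spaces. Because the whole file is joined into one line first, it is probably best suited to files that fit comfortably in memory.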