DomainsFeatured - 1 month ago
Linux Question

Match Anything In Between Strings For Linux Grep Command

I have read the post "grep all characters including newline", but I am not working with XML, so my case is a bit different.

I have the following data:

Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>


Using this command
cat file.txt | grep -o '<tag.*tag>\|^--.*'
I get:

<tag>Example line 1</tag>


However, I want the output to be:

<tag>Example line 1</tag>
<tag>Example line 2</tag>


How can I match anything between the strings, including the newline?

Note: I need to use <tag and tag> as the delimiting strings because other files can contain multiple tags, with other text in between the lines. Will update sample data to show that.

Answer

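This is easier with gnu-awk (gawk), using </tag> as the record separator. RT holds the text that matched RS, so the RT condition skips the trailing record that has no closing tag. Because the sample data has other text between the tags, everything before <tag is dropped from each record before printing:

awk -v RS='</tag>' 'RT && match($0, /<tag.*/) {s = substr($0, RSTART); gsub(/\n/, "", s); print s RT}' file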

<tag>Example line 1</tag>
<tag>Example line 2</tag>
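
RT is a gawk extension, so the one-liner above will not work with other awk implementations. If gawk is not available, a rough portable sketch is to track state line by line instead of changing the record separator. This is a different technique from the one in the answer, and it assumes <tag as the opener and </tag> as the closer; the function name extract_tags and the temporary sample file are made up for illustration.

```shell
#!/bin/sh
# Recreate the sample data from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>
EOF

# extract_tags (hypothetical helper): print every <tag ... </tag> span
# on a single line, using only POSIX awk features.
extract_tags() {
  awk '
    {
      line = $0
      while (line != "") {
        if (!inside) {
          start = index(line, "<tag")
          if (start == 0) break            # no opener left on this line
          line = substr(line, start)       # drop text before the opener
          inside = 1
          buf = ""
        }
        stop = index(line, "</tag>")
        if (stop == 0) {                   # span continues on the next line
          buf = buf line
          break
        }
        buf = buf substr(line, 1, stop + 5)  # keep up to the end of the closer
        print buf
        inside = 0
        line = substr(line, stop + 6)      # keep scanning the rest of the line
      }
    }
  ' "$1"
}

out=$(extract_tags "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

Joining the lines in a buffer removes the newlines inside a span, which is the same effect gsub(/\n/, "") has in the gawk version. Note that index(line, "<tag") also matches longer names that start with tag, so tighten the opener if other files need that distinction.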