akash12300 - 13 days ago
Perl Question

Extract content from a file between two occurrences of the same pattern

I have a log file from which I need to extract the entries belonging to a particular type of log. An entry can span multiple lines.
I cannot post the log file here directly, but it has the following format:


<date-format> Thread-MESSAGE1 random-message
line 1
line 2
line 3
line 4
<date-format> Thread-MESSAGE1 random-message2
line 5
<date-format> Thread-MESSAGE2 random-message3
line 6
line 7
line 8
line 9
<date-format> Thread-MESSAGE3 random-message4
<date-format> Thread-MESSAGE1 random-message5
<date-format> Thread-MESSAGE1 random-message6
line 10
line 11
<date-format> Thread-MESSAGE7 random-message7
<date-format> Thread-MESSAGE8 random-message9
<date-format> Thread-MESSAGE9 random-message10
<date-format> Thread-MESSAGE1 random-message11


I need the output to be:

<date-format> Thread-MESSAGE1 random-message
line 1
line 2
line 3
line 4
<date-format> Thread-MESSAGE1 random-message2
line 5
<date-format> Thread-MESSAGE1 random-message5
<date-format> Thread-MESSAGE1 random-message6
line 10
line 11
<date-format> Thread-MESSAGE1 random-message11


I tried using sed, but using ' Thread-MESSAGE1' as both the start and the end pattern did not work when there are two consecutive log entries with the 'MESSAGE1' key.

I thought of using a negative lookahead in Perl (which worked), but unfortunately I cannot use Perl, and neither 'sed' nor 'awk' supports negative lookahead in patterns.

Most recently I tried the following 'sed' command:


tac source_file.log | sed -n '{/<date-format> Thread-/!H; /<date-format> Thread-/{H;d;x} /<date-format> Thread-MESSAGE1/p; d;}' > test.log


The idea was to reverse the output in test.log afterwards, but after adding the curly braces in 'Thread-/{H;d;x}' I get an 'extra characters after command' error.
Is there a better alternative? Or is there a way I can group commands using curly braces in sed?

Answer

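You can use this awk command. On every line whose second field starts with Thread-, it sets the flag p according to whether that field equals kw. Every line is printed while p is true, so continuation lines follow their header: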

awk -v kw='Thread-MESSAGE1' '$2 ~ /^Thread-/ {p = ($2 == kw)} p' file

<date-format> Thread-MESSAGE1 random-message
line 1
line 2
line 3
line 4
<date-format> Thread-MESSAGE1 random-message2
line 5
<date-format> Thread-MESSAGE1 random-message5
<date-format> Thread-MESSAGE1 random-message6
line 10
line 11
<date-format> Thread-MESSAGE1 random-message11
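
The command above relies on the thread tag being the second whitespace-separated field. If the real timestamp contains a space, that no longer holds. Here is a minimal sketch of the same flag idea that matches on the whole line instead. The timestamp shape (YYYY-MM-DD HH:MM:SS), the function name filter_log and the variables hdr and tag are assumptions for illustration, not taken from the real log:

```shell
#!/bin/sh
# filter_log reads a log on stdin and prints only the wanted entries.
# hdr: assumed shape of every header line, up to and including "Thread-"
# tag: the message type to keep
filter_log() {
  awk -v hdr='^[0-9-]+ [0-9:]+ Thread-' -v tag='MESSAGE1' '
    # A header line re-decides whether the entry is wanted. The "( |$)"
    # stops MESSAGE1 from also matching MESSAGE10.
    $0 ~ hdr { keep = ($0 ~ (hdr tag "( |$)")) }
    # Print every line while the flag is set.
    keep
  '
}

# Demo on made-up data.
filter_log <<'EOF'
2024-01-01 10:00:00 Thread-MESSAGE1 first
line 1
2024-01-01 10:00:01 Thread-MESSAGE2 second
line 2
2024-01-01 10:00:02 Thread-MESSAGE10 third
2024-01-01 10:00:03 Thread-MESSAGE1 fourth
EOF
```

On the demo input this should print the "first" header, "line 1" and the "fourth" header. Adjust hdr to whatever the real date format looks like.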

If this doesn't work out, I suggest you post more realistic sample data.
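
As for grouping commands in sed: commands inside { } are separated by newlines or ';', and a separator is also needed after the closing '}' before the next command. The one-liner in the question has '}' followed directly by another address, which is probably what GNU sed reports as 'extra characters after command'. Note also that 'd' ends the current cycle, so the 'x' after it would never run. Below is a sed-only sketch that needs no tac; it keeps a flag in the hold space. The timestamp shape and the function name keep_entries are assumptions for illustration:

```shell
#!/bin/sh
# keep_entries reads a log on stdin; the hold space carries a "keep" flag.
keep_entries() {
  sed -n '
    # Header line: clear the flag, then set it again if the tag is wanted.
    /^[0-9-]* [0-9:]* Thread-/{
      x
      s/.*//
      x
      /^[0-9-]* [0-9:]* Thread-MESSAGE1 /{
        x
        s/.*/keep/
        x
      }
    }
    # Every line: swap in the flag, print the line if the flag is set,
    # then swap back so the flag stays in the hold space.
    x
    /keep/{
      x
      p
      x
    }
    x
  '
}

# Demo on made-up data.
keep_entries <<'EOF'
2024-01-01 10:00:00 Thread-MESSAGE1 first
line 1
2024-01-01 10:00:01 Thread-MESSAGE2 second
line 2
2024-01-01 10:00:03 Thread-MESSAGE1 third
EOF
```

Putting each command on its own line avoids the separator problem entirely. The awk version is shorter and easier to adapt, so this is mainly useful where only sed is available.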