Young Young - 10 months ago 35

Perl: Find a match, remove the same lines, and get the last field

I'm a Perl newbie, so please pardon me for asking this basic question.

I have a text file on server1 that contains sentences (whitespace is the field separator) spread over many lines.

I need to match lines containing my keyword, remove the same lines, and extract only the last field, so I tried:

my @allmatchedlines;

open(output1, "ssh user1@server1 cat /tmp/myfile.txt |");

while(<output1>) {
    @allmatchedlines = $_ if /mysearch/;
}

my @uniqmatchedline = split(/ /, @allmatchedlines);

my $lastfield = $uniqmatchedline[-1];
print "$lastfield\n";

and it gives me this output:

1

I don't know why it's giving me just "1".

Could someone please explain why I'm getting "1" and how I can get the last field of the matched line correctly?

Thank you!

Answer Source

Apart from two likely errors in the code, one thing is unclear in "remove the same lines and extract only the last field": once duplicate matching lines are removed, there may still be multiple distinct sentences that match the pattern, and it is not stated which of them should supply the last field.

Until a clarification comes, here is code that picks the last field from the last such sentence.

use warnings 'all';
use strict;

use List::MoreUtils qw(uniq);

my @allmatchedlines;

# open the ssh process with the remote file ...

while (<output1>) {
    push @allmatchedlines, $_ if /mysearch/;
}

# drop duplicate lines, keeping the first occurrence of each
my @unique_matched_lines = uniq @allmatchedlines;

# split the last unique line on whitespace and take its last field
my $lastfield = ( split ' ', $unique_matched_lines[-1] )[-1];

print $lastfield, "\n";

Note that the (default) pattern ' ' used in split splits on any amount of whitespace and skips leading whitespace. The regex / / turns off this behavior and splits on every single space, which can produce empty fields. You most likely want to use ' '.
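A minimal sketch of the difference between the two patterns. The sample line is made up for illustration and is not data from the question.

```perl
use strict;
use warnings;

# Hypothetical sample line with leading spaces, repeated spaces,
# and the trailing newline a line read from a file would carry.
my $line = "  error   on host   db01\n";

# ' ' skips leading whitespace and splits on runs of any whitespace,
# so the trailing newline is dropped as well.
my @by_whitespace = split ' ', $line;

# / / splits on every single space: empty fields appear, and the
# newline stays attached to the last field.
my @by_single_space = split / /, $line;

print scalar(@by_whitespace),   " fields with ' '\n";
print scalar(@by_single_space), " fields with / /\n";
```

With / / the last field still ends in a newline unless the line is chomped first, which matters when the last field is what you are after.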

For comments see the original post below.

The statement @allmatchedlines = $_ if /mysearch/; assigns to the array on every iteration in which the line matches, overwriting whatever was in it. So you end up with only the last line that matched mysearch. You want push @allmatchedlines, $_ ... to collect all those lines.
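To see the two behaviors side by side, here is a small sketch. The input lines are invented stand-ins for the remote file, and @assigned and @pushed are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for the remote file.
my @lines = (
    "alpha mysearch one\n",
    "beta other two\n",
    "gamma mysearch three\n",
);

my (@assigned, @pushed);
for (@lines) {
    @assigned = $_ if /mysearch/;      # replaces the array contents each time
    push @pushed, $_ if /mysearch/;    # appends, keeping every match
}

print "assigned: ", scalar(@assigned), " line(s)\n";
print "pushed:   ", scalar(@pushed),   " line(s)\n";
```

Only the push version ends up holding both matching lines; the assignment version holds just the last one.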

Also, as shown in the answer by Justin Schell, split expects a scalar, so it evaluates @allmatchedlines in scalar context and gets its length, which is 1 as explained above. You should have

my @words_in_matched_lines = map { split } @allmatchedlines;
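A short sketch of why the array turns into a number, and what the map version returns instead. The two lines in @allmatchedlines are assumed sample data, and @wrong and @words are illustrative names.

```perl
use strict;
use warnings;

# Hypothetical matched lines.
my @allmatchedlines = ("alpha mysearch one\n", "gamma mysearch three\n");

# split evaluates its second argument in scalar context, so the array
# becomes its element count and that number is what gets split.
my @wrong = split / /, @allmatchedlines;

# map calls split once per line (on $_, with the default ' ' pattern)
# and flattens the resulting word lists into one array.
my @words = map { split } @allmatchedlines;

print "wrong: @wrong\n";
print "words: @words\n";
```

Here @wrong holds a single element, the count of lines, which is the same effect that produced "1" in the question.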

Once all this is straightened out, the array @uniqmatchedline will hold words; if that is the intention, then its name is misleading.

To get the unique elements of an array you can use uniq from the module List::MoreUtils:

use List::MoreUtils qw(uniq);

my @unique_elems = uniq @whole_array;
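If List::MoreUtils is not installed, a %seen hash with grep is a common substitute that also preserves order. The sketch below uses that swapped-in idiom rather than uniq, and takes the last field of each unique line, which is only one possible reading of the question. The sample lines are assumptions.

```perl
use strict;
use warnings;

# Hypothetical matched lines, including one duplicate.
my @whole_array = ("a mysearch x\n", "b mysearch y\n", "a mysearch x\n");

# Keep an element only the first time it is seen; order is preserved.
my %seen;
my @unique_elems = grep { !$seen{$_}++ } @whole_array;

# Last whitespace-separated field of each unique line.
my @last_fields = map { (split ' ')[-1] } @unique_elems;

print "$_\n" for @last_fields;
```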