lmo lmo - 5 months ago 13
Linux Question

Using cat command to combine non-adjacent sections of a file

Is it possible to concatenate disparate lines in a file using the cat command? In particular, I have a tab-delimited file (see dataset) with headers. I would like to prepend the header line to a set of observations (lines) that have been filtered using, say, grep:

head -1 filename
grep -E CA filename


The output of these would be combined using cat. (Background: I am looking into using GNU/Linux commands for simple data work, usually to reduce data set size.)

For example, I have a file that looks like the following:

var1 var2 var3
1 MT 500
30 CA 40000
10 NV 1240
...


Here the white space between fields is a tab (\t or the like). I would like to select a subset of lines 2 through N using grep, but place the first row (the variable names) at the top of the output using Unix/GNU/Linux commands.

The desired output would be:

var1 var2 var3
30 CA 40000
35 CA 65000
15 CA 2500
...

Answer

If you're running the commands from a shell (including shell scripts), you can run each command separately and redirect the output:

head -1 filename > outputfile
grep -E CA filename >> outputfile

The first command will overwrite outputfile (or create it if it does not exist), because a single > was used. The second command will append to outputfile, because >> was used.
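To see the two redirections working together, here is a minimal end-to-end sketch. The scratch directory, the file name sample.tsv, and the sample rows are assumptions made for illustration (modelled on the data in the question), and head -n 1 is used as the POSIX spelling of head -1:

```shell
# Scratch directory so no existing file is overwritten (hypothetical location).
workdir=$(mktemp -d)
infile="$workdir/sample.tsv"
outfile="$workdir/outputfile"

# Small tab-delimited sample; the rows are illustrative only.
printf 'var1\tvar2\tvar3\n1\tMT\t500\n30\tCA\t40000\n10\tNV\t1240\n15\tCA\t2500\n' > "$infile"

head -n 1 "$infile" > "$outfile"      # > truncates (or creates) the file, then writes the header
grep -E CA "$infile" >> "$outfile"    # >> appends the matching rows after the header

cat "$outfile"
```

Running the first command again would reset outputfile to just the header, which is why the order of > and >> matters.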

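If you want to do this in a single command, the following worked in bash. The parentheses run both commands in a subshell so that one redirection captures their combined output, and && means grep only runs if head succeeds: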

(head -1 filename && grep -E CA filename) > outputfile
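A possible variation, sketched here under the same assumptions (hypothetical scratch directory, sample.tsv, and sample rows), groups the commands with braces instead of parentheses. Braces run the commands in the current shell rather than a subshell, and ; runs grep regardless of whether head succeeded:

```shell
# Illustrative sample data in a scratch directory (names are assumptions).
workdir=$(mktemp -d)
infile="$workdir/sample.tsv"
outfile="$workdir/outputfile"
printf 'var1\tvar2\tvar3\n1\tMT\t500\n30\tCA\t40000\n10\tNV\t1240\n15\tCA\t2500\n' > "$infile"

# Brace group: note the space after { and the ; before the closing }.
# One redirection applies to the output of both commands.
{ head -n 1 "$infile"; grep -E CA "$infile"; } > "$outfile"

cat "$outfile"
```

For this task the result is the same as with the parenthesised form; the difference only matters if the grouped commands need to change variables or the working directory of the calling shell.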

If you want the output to go to standard output, leave off the parentheses and the redirection:

head -1 filename && grep -E CA filename
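One caveat with the grep approach is that CA is matched anywhere on the line, so a header or another column containing those letters would also be selected. As a hedged alternative rather than part of the original answer, the sketch below uses awk to keep the first line and only those rows whose second field is exactly CA. The file name, the sample rows, and the CAN row are hypothetical and exist only to show the difference:

```shell
# Illustrative sample; the CAN row is hypothetical and would also match "grep CA".
workdir=$(mktemp -d)
infile="$workdir/sample.tsv"
printf 'var1\tvar2\tvar3\n1\tMT\t500\n30\tCA\t40000\n5\tCAN\t700\n15\tCA\t2500\n' > "$infile"

# -F '\t' splits fields on tabs; NR == 1 keeps the header line;
# $2 == "CA" keeps rows whose second column is exactly CA.
filtered=$(awk -F '\t' 'NR == 1 || $2 == "CA"' "$infile")

printf '%s\n' "$filtered"
```

Because the header test and the row filter live in one awk program, the file is read once and no grouping or second redirection is needed.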