panbar - 4 months ago
Bash Question

Manipulating column text in unix

I have a tab-delimited text file like this:


test.txt


chrom1 start1 end1
2 8828280 8828281
2 8828952 8828953
2 115627275 115627276
3 63945545 63945546
3 109753479 109753480
3 109753640 109753641
4 31116488 31116489
4 31116523 31116524


How can I do the following in a Unix shell?


  1. change the column name "chrom1" to "chr" and

  2. add "chr" in front of each value in that column.



The output should look like this:

chr start1 end1
chr2 8828280 8828281
chr2 8828952 8828953
chr2 115627275 115627276
chr3 63945545 63945546
chr3 109753479 109753480
chr3 109753640 109753641
chr4 31116488 31116489
chr4 31116523 31116524

Answer

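You can use awk. Setting FS and OFS to a tab keeps the file tab-delimited. The first field becomes "chr" on the header line (NR==1) and "chr" followed by its old value on every other line. The trailing 1 prints each line:

awk 'BEGIN{FS=OFS="\t"} {$1 = "chr" (NR==1 ? "" : $1)} 1' test.txt

Output (columns are aligned here for readability; the actual separator is a tab):
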
chr   start1     end1
chr2  8828280    8828281
chr2  8828952    8828953
chr2  115627275  115627276
chr3  63945545   63945546
chr3  109753479  109753480
chr3  109753640  109753641
chr4  31116488   31116489
chr4  31116523   31116524
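
If you prefer sed, the same result can probably be had with two substitutions. This is only a sketch, not part of the answer above: it assumes the header's first field is literally "chrom1" and that the file has no blank lines (a blank line would become a bare "chr"). The temporary file and the variable names are made up for the demonstration, and the sample uses just a few rows from the question.

```shell
# Build a small tab-delimited sample (a subset of the rows in the question).
tmp=$(mktemp)
printf 'chrom1\tstart1\tend1\n2\t8828280\t8828281\n3\t63945545\t63945546\n' > "$tmp"

# Line 1: rename the header field "chrom1" to "chr".
# Lines 2 to end: prepend "chr" to the line, which prefixes the first field.
sed_result=$(sed -e '1s/^chrom1/chr/' -e '2,$s/^/chr/' "$tmp")

# The awk command from the answer, run on the same sample for comparison.
awk_result=$(awk 'BEGIN{FS=OFS="\t"} {$1 = "chr" (NR==1 ? "" : $1)} 1' "$tmp")

printf '%s\n' "$sed_result"
rm -f "$tmp"
```

Both commands write to standard output and leave the input untouched, so redirect to a new file (for example `> out.txt`) to keep the result. The awk version is a bit more robust, since it does not depend on what the original header text is.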