vahap eldem - 5 months ago
Bash Question

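Inserting new lines to fill gaps in a numbered sequence

I have a question about inserting new lines into a large text file with a Bash script. The first column of each line ends in a sequence number, and I want to add a line for every number that is missing.

My file: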

E.coli.1 gi|1035701777|ref|WP_064543348.1| 85.62 160 23 0 12 171 1 160 4,00E-103 300
E.coli.2 gi|1035701777|ref|WP_064543348.1| 85.62 160 23 0 1 160 1 160 3,00E-103 300
E.coli.5 gi|1036669825|ref|WP_064721309.1| 96.69 393 13 0 2 394 1 393 0.0 748
E.coli.6 gi|1036669825|ref|WP_064721309.1| 96.69 393 13 0 2 394 1 393 0.0 748
E.coli.7 gi|1037427804|ref|WP_064741043.1| 67.95 78 25 0 1 78 1 78 9,00E-33 114


My expected output:

E.coli.1 gi|1035701777|ref|WP_064543348.1| 85.62 160 23 0 12 171 1 160 4,00E-103 300
E.coli.2 gi|1035701777|ref|WP_064543348.1| 85.62 160 23 0 1 160 1 160 3,00E-103 300
E.coli.3
E.coli.4
E.coli.5 gi|1036669825|ref|WP_064721309.1| 96.69 393 13 0 2 394 1 393 0.0 748
E.coli.6 gi|1036669825|ref|WP_064721309.1| 96.69 393 13 0 2 394 1 393 0.0 748
E.coli.7 gi|1037427804|ref|WP_064741043.1| 67.95 78 25 0 1 78 1 78 9,00E-33 114

Answer

If I have understood your problem correctly, you can solve it using awk:

awk -F '[.[:blank:]]+' 'p{for (;p<$3; p++) print f p} NF>3{p=$3+1; f=$1 "." $2 "."}1' file

Output:

E.coli.1    gi|1035701777|ref|WP_064543348.1|   85.62   160 23  0   12  171 1   160 4,00E-103   300
E.coli.2    gi|1035701777|ref|WP_064543348.1|   85.62   160 23  0   1   160 1   160 3,00E-103   300
E.coli.3
E.coli.4
E.coli.5    gi|1036669825|ref|WP_064721309.1|   96.69   393 13  0   2   394 1   393 0.0 748
E.coli.6    gi|1036669825|ref|WP_064721309.1|   96.69   393 13  0   2   394 1   393 0.0 748
E.coli.7    gi|1037427804|ref|WP_064741043.1|   67.95   78  25  0   1   78  1   78  9,00E-33    114
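
For readers who find the one-liner above hard to follow, here is a commented variant of the same idea. It is only a sketch, not the original answer's exact method: it splits on whitespace and takes the number after the last dot of the first column, whereas the one-liner splits on dots and blanks and reads the third field. The function name fill_gaps and the shortened demo records are made up for illustration, and the sketch assumes every non-empty line starts with an ID of the form PREFIX.N in ascending order.

```shell
#!/usr/bin/env bash
# fill_gaps is a hypothetical helper name (not from the original answer).
# For each input line whose first column looks like PREFIX.N, it prints a
# bare "PREFIX.N" line for every N missing since the previous record,
# then prints the line itself unchanged.
fill_gaps() {
  awk '
    NF == 0 { print; next }          # pass empty lines through untouched
    {
      id = $1                        # e.g. "E.coli.5"
      n = id
      sub(/.*\./, "", n)             # keep only what follows the last dot
      prefix = substr(id, 1, length(id) - length(n))   # e.g. "E.coli."
      n += 0                         # force numeric comparison
      if (seen)
        for (i = prev + 1; i < n; i++)
          print prefix i             # emit each missing ID
      print                          # emit the original record
      prev = n
      seen = 1
    }
  ' "$@"
}

# Demo on shortened, made-up records (reads from stdin when no file is given)
result=$(printf '%s\n' 'E.coli.1 a' 'E.coli.2 b' 'E.coli.5 c' | fill_gaps)
printf '%s\n' "$result"
```

With a real file you would call it as fill_gaps file > file.filled. Because the prefix is derived from the first column as a whole, this variant does not depend on the ID containing exactly two dots.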