Dgstah - 1 month ago
Linux Question

Line Breaks in Unix File

I have a file whose records have fields delimited by |. A few records contain a stray line break, so part of the record spills onto the next line. The number of | characters should be the same on every line. How do I identify the lines that contain such a break, and join each broken pair into one line so that the number of '|' per line is consistent throughout?

The file looks something like this:

DeptID|EmpFName|EmpLName|Salary
Engg|Sam|Le
wis|1000
Engg|Smith|Davis|2000
HR|Denis
|Lillie|1500
HR|Danny|Borr
inson|3000
IT|David|Letterman|2000
IT|John|Newman|3000


I want to count the number of '|' in each line.

In this case, every line should have 3 '|', but because of the line breaks that is not the case.

My desired output is:

DeptID|EmpFName|EmpLName|Salary
Engg|Sam|Lewis|1000
Engg|Smith|Davis|2000
HR|Denis|Lillie|1500
HR|Danny|Borrinson|3000
IT|David|Letterman|2000
IT|John|Newman|3000

Answer

Given that a record is split across at most two lines, as stated by the OP in the question, sed offers an easy solution:

$ cat ip.txt 
DeptID|EmpFName|EmpLName|Salary
Engg|Sam|Le
wis|1000
Engg|Smith|Davis|2000
HR|Denis
|Lillie|1500
HR|Danny|Borr
inson|3000
IT|David|Letterman|2000
IT|John|Newman|3000

$ sed '/.*|.*|.*|/! {N; s/\n//}' ip.txt 
DeptID|EmpFName|EmpLName|Salary
Engg|Sam|Lewis|1000
Engg|Smith|Davis|2000
HR|Denis|Lillie|1500
HR|Danny|Borrinson|3000
IT|David|Letterman|2000
IT|John|Newman|3000
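
  • /.*|.*|.*|/! selects every line that does not contain at least three | (in a basic regex, | is a literal character)
    • {N; s/\n//} appends the next line to the pattern space and removes the newline between the two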


Use grouping and a quantifier to specify the count instead of repeating the pattern:

sed '/\(.*|\)\{3\}/! {N; s/\n//}' ip.txt

With extended regular expressions (-E, or -r in GNU sed), where a literal | must be escaped:

sed -E '/(.*\|){3}/! {N; s/\n//}' ip.txt
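
The sed commands above rely on a record being split across at most two lines. If a record might be split across more lines, one possible alternative is to accumulate lines in awk until the expected number of | has been seen. This is a sketch under assumptions, not part of the original answer: the helper name join_records and the file split3.txt are hypothetical, and it assumes a break never falls directly after the last | of a record.

```shell
# join_records FILE [PIPES]  (hypothetical helper name)
# Keeps appending physical lines to a buffer until the buffer holds at
# least PIPES (default 3) | characters, then prints it as one record.
join_records() {
    awk -v want="${2:-3}" '
        {
            buf = buf $0                      # append the line; its newline is dropped
            tmp = buf
            # gsub returns the number of replacements, i.e. the number of |
            if (gsub(/[|]/, "", tmp) >= want) {
                print buf
                buf = ""
            }
        }
        END { if (buf != "") print buf }      # flush an incomplete trailing record
    ' "$1"
}

# Hypothetical test input: the second record is split across three lines.
printf '%s\n' \
    'DeptID|EmpFName|EmpLName|Salary' \
    'Engg|Sam|Le' \
    'w' \
    'is|1000' \
    'HR|Denis' \
    '|Lillie|1500' > split3.txt

join_records split3.txt
```

Run on split3.txt, this prints the header followed by Engg|Sam|Lewis|1000 and HR|Denis|Lillie|1500. Passing a second argument changes the expected number of |, for example join_records data.txt 5 for six-field records.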