technopathe - 4 months ago
Bash Question

wrapping a long string in multiple lines starting from a position

I have a file that looks like this:

FirstSentences1 bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd
.
.
.
NthhhSentencesN klkdlffjsldfsljflsfjlskfjldkjflsfjlfkdjfdfjojjij




I have to get the following output:

FirstSentences1 bfjkjhdfhizhfzibfkje
FirstSentences1 zfzfiuzehfizdjfldfsd
FirstSentences1 fsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgi
SecondSentences2 jgoejrgrjgiorjofgjei
SecondSentences2 rjgoergd
.
.
.
NthhhSentencesN klkdlffjsldfsljflsfj
NthhhSentencesN lskfjldkjflsfjlfkdjf
NthhhSentencesN dfjojjij




Explanation:

For example, take the first line:

FirstSentences1 bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj




We take the string "bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj" and wrap it every 20 characters, repeating the prefix at the start of each output line.

Do you know a way to do this?

Answer

You can do it with a short script utilizing string indexes and a nested loop:

#!/bin/bash

declare -i len=${2:-20}     ## take length as 2nd arg (filename is 1st)

while read -r line; do      ## read each line
    while [ ${#line} -gt 0 ]; do            ## if characters remain
        printf "%s\n" "${line:0:$((len))}"  ## print len chars
        line="${line:$((len))}"             ## strip len chars from line
    done
done < "$1"
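
The two substring expansions that do the work in the inner loop can be tried on their own. A minimal sketch (the variable names s, chunk and rest are only for illustration and are not part of the script above):

```bash
#!/bin/bash

s="bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj"
len=20

chunk="${s:0:len}"   # the first len characters of s
rest="${s:len}"      # everything after the first len characters

printf '%s\n' "$chunk"   # bfjkjhdfhizhfzibfkje
printf '%s\n' "$rest"    # zfzfiuzehfizdjfldfsdfsljfklj
```

The offset and length are evaluated arithmetically, so a bare len behaves the same as $((len)) there. Repeating the two steps until the string is empty is all the inner loop does.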

Example Input File

$ cat dat/longsent.txt
bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd

Example Use/Output

Wrapping at the default of 20 characters per line:

$ bash wrap.sh dat/longsent.txt
bfjkjhdfhizhfzibfkje
zfzfiuzehfizdjfldfsd
fsljfklj
fjlskdjfjoijrgeojrgi
jgoejrgrjgiorjofgjei
rjgoergd

Wrapping at 10 characters per line:

$ bash wrap.sh dat/longsent.txt 10
bfjkjhdfhi
zhfzibfkje
zfzfiuzehf
izdjfldfsd
fsljfklj
fjlskdjfjo
ijrgeojrgi
jgoejrgrjg
iorjofgjei
rjgoergd

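Note: you should validate that len is greater than 0 (with len set to 0 the inner loop never removes any characters from line, so it never terminates), and you can add || test -n "$line" to the outer while condition so that a final line without a trailing newline (a non-POSIX line ending) is still processed. Both are omitted above for brevity.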
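
A sketch of how those two additions might look, written as a function that reads standard input. The function name wrap_lines and the error messages are hypothetical and chosen only for this example:

```bash
#!/bin/bash

# wrap_lines [LEN]: read lines from stdin and print each one in
# chunks of at most LEN characters (default 20).
wrap_lines() {
    local len=${1:-20} line

    # accept digits only, so the arithmetic below cannot misfire
    case $len in
        ''|*[!0-9]*)
            printf 'error: length must be a positive integer\n' >&2
            return 1 ;;
    esac
    len=$((10#$len))            # force base 10, so "010" means ten
    if [ "$len" -le 0 ]; then   # a length of 0 would loop forever
        printf 'error: length must be greater than 0\n' >&2
        return 1
    fi

    # the || [ -n "$line" ] part handles a last line with no newline
    while read -r line || [ -n "$line" ]; do
        while [ -n "$line" ]; do
            printf '%s\n' "${line:0:len}"   # print up to len chars
            line="${line:len}"              # drop what was printed
        done
    done
}
```

It would be called as, for example, wrap_lines 10 < dat/longsent.txt, and returns a non-zero status without reading anything when the length is not usable.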


Including the Line Prefix

If your datafile includes the prefixes (e.g. FirstSentence1, ...) and you need them in your output, read the prefix into its own variable ahead of line, then print it (left-justified, with some sane field width) before each wrapped chunk, e.g.:

#!/bin/bash

declare -i len=${2:-20}     ## take length as 2nd arg (filename is 1st)
declare -i wdth=22          ## set min field width for prefix (so cols align)

while read -r prefix line; do      ## read prefix, then rest of line
    while [ ${#line} -gt 0 ]; do   ## if characters remain
        ## print len chars w/prefix width set to wdth, left-justified
        printf "%-*s %s\n" $wdth "$prefix" "${line:0:$((len))}"
        line="${line:$((len))}"    ## strip len chars from line
    done
done < "$1"

Example Input File w/Prefix

$ cat dat/longsentpfx.txt
FirstSentence1   bfjkjhdfhizhfzibfkjezfzfiuzehfizdjfldfsdfsljfklj
SecondSentences2 fjlskdjfjoijrgeojrgijgoejrgrjgiorjofgjeirjgoergd

Example Use/Output

$ bash wrap.sh dat/longsentpfx.txt
FirstSentence1         bfjkjhdfhizhfzibfkje
FirstSentence1         zfzfiuzehfizdjfldfsd
FirstSentence1         fsljfklj
SecondSentences2       fjlskdjfjoijrgeojrgi
SecondSentences2       jgoejrgrjgiorjofgjei
SecondSentences2       rjgoergd

$ bash wrap.sh dat/longsentpfx.txt 10
FirstSentence1         bfjkjhdfhi
FirstSentence1         zhfzibfkje
FirstSentence1         zfzfiuzehf
FirstSentence1         izdjfldfsd
FirstSentence1         fsljfklj
SecondSentences2       fjlskdjfjo
SecondSentences2       ijrgeojrgi
SecondSentences2       jgoejrgrjg
SecondSentences2       iorjofgjei
SecondSentences2       rjgoergd

Let me know if you have additional questions.

Note: to set the width to exactly one character past the longest prefix, you would need to read all the prefix values and find the longest one (then add 1) before writing any of the wrapped lines. If your datafile is short, you could read the prefixes and lines into a pair of indexed arrays and scan the lengths in the prefix array first. If the datafile is huge, you could scan the file twice (not optimal), or you can just set some predetermined width as done above.
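
For a short datafile, the two-array approach could be sketched roughly as follows. The function name wrap_prefixed and the array names are assumptions made for this example, and the whole input is held in memory, so it is not suited to huge files:

```bash
#!/bin/bash

# wrap_prefixed [LEN]: read "prefix text" lines from stdin and print the
# text in chunks of LEN characters (default 20), each chunk preceded by
# its prefix padded to the width of the longest prefix.
wrap_prefixed() {
    local -i len=${1:-20} wdth=0 i
    local prefix line
    local -a prefixes=() lines=()

    [ "$len" -gt 0 ] || return 1    # a length of 0 would loop forever

    # pass 1: store every line and remember the longest prefix length
    while read -r prefix line || [ -n "$prefix" ]; do
        prefixes+=("$prefix")
        lines+=("$line")
        if [ "${#prefix}" -gt "$wdth" ]; then
            wdth=${#prefix}
        fi
    done

    # pass 2: print the wrapped chunks; the literal space in the format
    # supplies the extra column after the longest prefix
    for ((i = 0; i < ${#prefixes[@]}; i++)); do
        line=${lines[i]}
        while [ -n "$line" ]; do
            printf '%-*s %s\n' "$wdth" "${prefixes[i]}" "${line:0:len}"
            line=${line:len}
        done
    done
}
```

Called as wrap_prefixed 10 < dat/longsentpfx.txt, this should align the wrapped text one column past SecondSentences2 instead of at the fixed width of 22.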
