Justin - 6 months ago
Bash Question

Bash if statement based on file number

I have a quick question about an if statement in bash. Currently I have the following, which takes all the even-numbered csv files and prints each of their lines three times.

sed -i 'p;p;p' *[02468].csv


However, is there a way I can print each line only twice if the odd-numbered file (the one right after each even-numbered file) has fewer than 20 lines? In other words,

sed -i 'p;p' *[02468].csv IF (# lines of [13579].csv RIGHT after each [02468].csv < 20)


To clarify, say, 5.csv has 19 lines. Then I will print 4.csv's lines twice. But say 7.csv has 21 lines. Then I will print 6.csv's lines thrice.

Sample Input even-numbered csv file:

STATE UNIVERSITY


Desired output if odd-numbered csv file (even-numbered+1) < 20 lines:

STATE UNIVERSITY
STATE UNIVERSITY


Desired output if odd-numbered csv file (even-numbered+1) > 20 lines:

STATE UNIVERSITY
STATE UNIVERSITY
STATE UNIVERSITY

Answer

Try the following:

threshold=20
prevLineCount=$threshold
while IFS= read -r fname; do
  [[ $fname =~ ([0-9])\.csv$ ]] || continue # match the last digit before the .csv suffix; skip names that don't match
  if (( ${BASH_REMATCH[1]} % 2 == 0 )); then # even
    sedScript='p;p;p'
    (( prevLineCount < threshold )) && sedScript='p;p'
    sed -n "$sedScript" "$fname"
  else # odd
    prevLineCount=$(wc -l < "$fname") # count lines
    # Don't print odd-numbered files
  fi
done < <(printf '%s\n' *[0-9].csv | sort -r)

Note that for safety I've omitted the -i to prevent in-place updating; add it once you've confirmed that the script works as intended.
Also note that -n was added: without it, each line would be printed one additional time, because sed prints every (possibly modified) input line by default.
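
To see the effect of -n in isolation, here is a small sketch that runs the same 'p;p' script with and without it on a throwaway one-line file. The temporary file and the variable names are only for illustration and are not part of the script above.

```bash
# Create a one-line sample file (hypothetical test data).
demo=$(mktemp)
printf 'STATE UNIVERSITY\n' > "$demo"

# Count the output lines; grep -c '' counts every line it reads.
with_n=$(sed -n 'p;p' "$demo" | grep -c '')     # 2: only the two explicit prints
without_n=$(sed 'p;p' "$demo" | grep -c '')     # 3: two explicit prints plus the default print

echo "with -n: $with_n, without -n: $without_n"
rm -f "$demo"
```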

Assumptions:

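  • Reverse-sorting the filenames yields the desired processing order (highest index first). Note that sort -r compares the names as text, not numerically, so this is only guaranteed when all the numbers have the same number of digits (for example, zero-padded names).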

  • Filenames have no embedded newlines (such files would be very rare).

  • If the first filename processed (the one that sorts highest) is even-numbered, there is no odd-numbered file to compare against, so each of its lines is printed 3 times; replace prevLineCount=$threshold with prevLineCount=0 to default to printing each line twice instead.
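
If the sort-order assumption is a concern, one alternative is to look up each even-numbered file's neighbour by name instead of relying on processing order. The following is only a sketch, not the approach used above: repeat_even_csvs is a hypothetical helper, and it assumes the files are named <number>.csv with no leading zeros (a name like 08.csv would break the arithmetic). As above, -i is left out, and an even-numbered file without a following odd-numbered file gets three copies of each line.

```bash
# Hypothetical helper: for every even-numbered N.csv in the current directory,
# print each line twice if (N+1).csv has fewer than $1 lines (default 20),
# otherwise three times.
repeat_even_csvs() {
  local threshold fname num next script lines
  threshold=${1:-20}
  for fname in *[02468].csv; do
    [ -e "$fname" ] || continue            # the glob matched nothing
    num=${fname%.csv}
    case $num in
      ''|*[!0-9]*) continue ;;             # skip names that are not purely numeric
    esac
    next="$((num + 1)).csv"                # the odd-numbered file right after this one
    script='p;p;p'                         # default: three copies of each line
    if [ -f "$next" ]; then
      lines=$(( $(wc -l < "$next") ))      # arithmetic strips any padding from wc
      [ "$lines" -lt "$threshold" ] && script='p;p'
    fi
    sed -n "$script" "$fname"
  done
}
```

Run it as repeat_even_csvs 20 from the directory that contains the csv files. With the example from the question (5.csv has 19 lines, 7.csv has 21), the lines of 4.csv come out twice and the lines of 6.csv three times.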
