Clabis Clabis - 1 month ago 10
Bash Question

Bash Shell: Infinite Loop

The problem is the following: I have a file in which each line has this form:

id|lastName|firstName|gender|birthday|joinDate|IP|browser


I want to sort all the first names in that file alphabetically and print them one per line, with each name appearing only once.

I have written the following script, but for some reason it ends up in an infinite loop:

array1=()
while read LINE
do
        if [ ${LINE:0:1} != '#' ]
        then
                IFS="|"
                array=($LINE)
                if [[ "${array1[@]}" != "${array[2]}" ]]
                then
                        array1+=("${array[2]}")
                fi
        fi
done < $3
echo ${array1[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort


NOTES


  • if [ ${LINE:0:1} != '#' ]
    : this test is there because the file contains comment lines that I don't want to print

  • $3
    : the file name

  • array1
    : holds all the distinct names


Answer

There's a much simpler and cleaner way to achieve this, without having to touch the IFS variable or use arrays. You can use a "for" loop over the output of a pipeline:

First I created a file with the same structure as yours:

$ cat file
id|lastName|Douglas|gender|birthday|joinDate|IP|browser
id|lastName|Tim|gender|birthday|joinDate|IP|browser
id|lastName|Andrew|gender|birthday|joinDate|IP|browser
id|lastName|Sasha|gender|birthday|joinDate|IP|browser
#id|lastName|Carly|gender|birthday|joinDate|IP|browser
id|lastName|Madson|gender|birthday|joinDate|IP|browser

Here's the script I wrote using "for":

#!/bin/bash

for LINE in `cat file | grep -v "^#" | awk -F'|' '{print$3}' | sort -u`
do
        echo "$LINE"
done

And here's the output of this script:

$ ./script.sh
Andrew
Douglas
Madson
Sasha
Tim

Explanation:

for LINE in `cat file`

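Creates a loop over the output of the command between the backticks. The shell runs that command and substitutes its output in place (command substitution), then splits the output on whitespace, so the loop sees one name per iteration as long as no name contains a space. For example, to store the current date in a variable you could use "VARDATE=`date`".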
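One caveat with this kind of loop: the substituted output is split on whitespace, not on line boundaries. The sketch below uses a hypothetical temporary file (not the one from the question) with a first name that contains a space, and compares the "for" loop with a "while read" loop:

```shell
#!/bin/bash
# Hypothetical sample data: the first line contains a space on purpose.
sample=$(mktemp)
printf '%s\n' 'Mary Ann' 'Tim' > "$sample"

# for + command substitution: the output is split on whitespace,
# so "Mary Ann" is seen as two separate items.
for_count=0
for word in `cat "$sample"`
do
        for_count=$((for_count + 1))
done

# while + read: one iteration per line, whatever the line contains.
while_count=0
while IFS= read -r line
do
        while_count=$((while_count + 1))
done < "$sample"

echo "for iterations:   $for_count"
echo "while iterations: $while_count"
rm -f "$sample"
```

For the sample file in this answer the difference does not matter, because every first name is a single word. Since the pipeline already prints one name per line, running it without the surrounding loop gives the same listing.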

grep -v "^#"

The option -v is used to exclude results matching the pattern, in this case the pattern is "^#". The "^" character means "line begins with". So grep -v "^#" means "exclude lines beginning with #".
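A minimal sketch of that filter on its own, using three made-up records instead of the real file:

```shell
#!/bin/bash
# Hypothetical records; the second one is commented out with a leading "#".
filtered=$(printf '%s\n' 'id|Smith|Tim' '#id|Jones|Carly' 'id|Brown|Sasha' | grep -v '^#')

# Only the two lines that do not start with "#" remain.
echo "$filtered"
```

Note that the pattern is anchored at the start of the line, so a "#" that appears later in a line does not cause that line to be dropped.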

awk -F'|' '{print$3}'

The -F option changes the field separator from the default (runs of spaces and tabs) to whatever you put after it, in this case the "|" character. '{print$3}' then prints the third field of each line.
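As a small illustration, assuming a few made-up records, the sketch below shows the field extraction by itself and a variant in which awk also skips the comment lines, so the separate grep is not needed:

```shell
#!/bin/bash
# Hypothetical records; one is a comment, one first name contains a space.
records='1|Smith|Tim|male
#2|Jones|Carly|female
3|Brown|Mary Ann|female'

# -F'|' makes "|" the field separator, so $3 is the whole first-name field.
third=$(printf '%s\n' "$records" | awk -F'|' '{ print $3 }')

# The pattern !/^#/ selects only lines that do not start with "#".
no_comments=$(printf '%s\n' "$records" | awk -F'|' '!/^#/ { print $3 }')

echo "$third"
echo "---"
echo "$no_comments"
```

Because the separator is "|" rather than whitespace, a first name such as "Mary Ann" stays in one field.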

sort -u

Finally, "sort -u" sorts the names alphabetically and, because of the -u option, prints each distinct name only once.
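If you would rather keep a "while read" loop like the one in the question, a rough sketch could look like the following. It sets IFS only for the read command instead of for the whole script, and leaves the de-duplication to "sort -u" rather than to an array comparison. The temporary file and its contents are made up for the example; in the original script the input would be "$3".

```shell
#!/bin/bash
# Hypothetical sample file standing in for the real data.
input=$(mktemp)
cat > "$input" <<'EOF'
1|Smith|Tim|male|1990-01-01|2010-01-01|10.0.0.1|Firefox
# a comment line
2|Jones|Andrew|male|1991-02-02|2011-02-02|10.0.0.2|Chrome
3|Brown|Tim|male|1992-03-03|2012-03-03|10.0.0.3|Safari
EOF

# IFS='|' applies to this read only; the rest of the script keeps the default.
names=$(
        while IFS='|' read -r id lastName firstName rest
        do
                # Skip comment lines (they start with "#").
                case "$id" in
                        '#'*) continue ;;
                esac
                # Skip lines that have no third field, such as empty lines.
                [ -n "$firstName" ] || continue
                printf '%s\n' "$firstName"
        done < "$input" | sort -u
)

echo "$names"
rm -f "$input"
```

Here "Tim" appears twice in the input but only once in the output, which is the job that "sort -u" does in the pipeline above.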
