Fshamri - 1 year ago
Bash Question

Search a word and count its occurrence in a file

I want to search for 3 words and count their occurrences in tens of files. The file names consist of a prefix plus a timestamp, like


I want to search them for the following words: OK, RETRY and DROP,
then write their counts to a new file. The desired output should look like this:

filename OK RETRY DROP
XXX20160622XXX 221 305 400 //those values are the count of words
....... ... ... ...

I have tried the following:

fileName=$(date --date="-1 day" +"%Y%m%d")
cd /advdata/ticketdatashareA/FTM_Sms/
format=*`echo $fileName`*
for i in $format;
do
if [[ "$i" == "$format" ]]
then
echo "No Files"
else
echo -n "file name $i :" | cut -c21-49 ; echo '\t' `grep OK $i | wc -l`; echo '\t' `grep "RETRY" $i | wc -l`; echo '\t' `grep "DROP" $i | wc -l`;
fi
done

What I got is:

\t 107
\t 0
\t 0


This is a solution for Bash:

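declare -a words=( OK RETRY DROP )

for file in FTM.FC102.*; do
    printf '%s ' "$file"    # filename first; '%s' keeps any % in the name literal
    for word in "${words[@]}"; do
        # grep -o prints one match per line; wc -l counts them
        grep -o "$word" "$file" | wc -l | tr '\n' ' '
    done
    echo                    # end this file's line of output
done | rs 0 $(( ${#words[@]} + 1 )) # alternatively:  | tr -s ' ' '\t'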


  • We store the words to look for in the array words.
  • We loop through the files (change the pattern to match your needs).
  • For each file, we construct a line starting with the filename, then...
  • For each word, we run grep -o on the file to get every match on its own line.
  • We count the matches with wc -l (using tr to turn the trailing newline of its output into a space).
  • At the end of the line, we emit a newline with a bare echo to end the output for this file.
  • We pipe everything to rs to format the columns nicely. This utility is available on at least BSD system... If you don't have it, just remove the pipe and live with wonky columns, or use | tr -s ' ' '\t' instead, which does a half-decent job.

Note that it does not print the header line, though.
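If you do want the header, one possible variant is sketched below. It is only a sketch under a few assumptions: count_words is a made-up helper name, the words are kept in a plain space-separated string (so they must not contain whitespace) instead of an array, and the columns are separated by tabs rather than formatted with rs.

```shell
#!/bin/bash
# Sketch only: count_words is a hypothetical helper, not part of the script above.
# Assumes the search words contain no whitespace.
words="OK RETRY DROP"

count_words() {
    # Header row: "filename" followed by each word, tab-separated.
    printf 'filename'
    for word in $words; do
        printf '\t%s' "$word"
    done
    printf '\n'

    # One row per file: the name, then one count per word.
    for file in "$@"; do
        [ -f "$file" ] || continue    # skip a glob pattern that matched nothing
        printf '%s' "$file"
        for word in $words; do
            # grep -o prints each match on its own line and wc -l counts them;
            # the arithmetic expansion strips any padding wc adds.
            printf '\t%s' "$(( $(grep -o -- "$word" "$file" | wc -l) ))"
        done
        printf '\n'
    done
}

# Same file pattern as the script above; change it to match your needs.
count_words FTM.FC102.*
```

Redirect the call to a file (for example count_words FTM.FC102.* > counts.txt, where counts.txt is just a placeholder name) to get the result into a new file. The sample run that follows was made with the original script, not with this variant.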

With two files with the following contents:

$ cat text1
Neque porro quisquam est qui dolorem ipsum quia dolor sit amet,
consectetur, adipisci velit...

$ cat text2
There is no one who loves pain itself, who seeks after it and wants to
have it, simply because it is pain...

... and with the "words" a, b and c, the script does this:

$ bash script.sh
text1  4      0      3
text2  7      1      1
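
A note on why the script uses grep -o instead of the grep WORD file | wc -l from the question: without -o, grep prints (and wc -l counts) matching lines, so a line that contains the word several times is counted once. A small sketch with a throwaway sample file (the file contents are made up for illustration):

```shell
#!/bin/bash
# Sketch: compare counting matching lines with counting individual matches.
tmp=$(mktemp)                        # temporary sample file, hypothetical data
printf 'OK OK OK\nRETRY OK\n' > "$tmp"

# grep -c reports the number of lines that contain the word.
lines=$(( $(grep -c -- 'OK' "$tmp") ))

# grep -o prints every match on its own line, so wc -l counts occurrences.
matches=$(( $(grep -o -- 'OK' "$tmp" | wc -l) ))

echo "lines=$lines matches=$matches"
rm -f "$tmp"
```

Here the word appears four times on two lines, so the two approaches give different numbers. If each word occurs at most once per line in your files, both give the same result.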