codeREXO - 17 days ago
Java Question

Sorting string occurrences from text file

I have stored the strings from a file in an ArrayList, used a HashSet to find the unique strings, and counted the occurrences of each one with Collections.frequency.

I am looking to list the top 5 words and their number of occurrences. I should be able to accomplish this without implementing a hashtable, treemap, etc. How can I go about achieving this?

Here is my ArrayList:

List<String> word_list = new ArrayList<String>();

while (INPUT_TEXT1.hasNext()) {
    String input_word = INPUT_TEXT1.next();
    word_list.add(input_word);
}

INPUT_TEXT1.close();

int word_list_length = word_list.size();

System.out.println("There are " + word_list_length + " words in the .txt file");
System.out.println("\n\n");

System.out.println("word_list's elements are: ");

for (int i = 0; i < word_list.size(); i++) {
    System.out.print(word_list.get(i) + " ");
}

System.out.println("\n\n");


Here is my HashSet:

Set<String> unique_word = new HashSet<String>(word_list);

int number_of_unique = unique_word.size();

System.out.println("unique words are: ");

for (String e : unique_word) {
    System.out.print(e + " ");
}

System.out.println("\n\n");

// Parallel arrays: word[i] occurs freq[i] times
String[] word = new String[number_of_unique];
int[] freq = new int[number_of_unique];

int count = 0;

System.out.println("Frequency counts : ");

for (String e : unique_word) {
    word[count] = e;
    freq[count] = Collections.frequency(word_list, e);

    System.out.println(word[count] + " : " + freq[count] + " time(s)");
    count++;
}


Could it be that I am overthinking a step? Thanks in advance

Answer

You can do this with a HashMap that holds each unique word as the key and its frequency as the value, and then sort the entries by value in reverse order, as explained in the steps below:

(1) Load the words into word_list

(2) Find the unique words in word_list

(3) Store the unique words in a HashMap, with each unique word as the key and its frequency as the value

(4) Sort the HashMap entries by value (frequency) in descending order and keep the first 5

You can refer to the code below:
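
As a side note, steps (2) and (3) can probably be folded into a single pass over word_list, which avoids one Collections.frequency scan per unique word. This is only a minimal sketch: the class name WordCounter and the method countWords are made up for illustration and are not part of the code in this answer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, introduced only for this sketch.
class WordCounter {

    // Builds the word -> frequency map in one pass over the list.
    static Map<String, Integer> countWords(List<String> wordList) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : wordList) {
            // merge() stores 1 for a new key, otherwise adds 1 to the existing count
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }
}
```

The resulting map can then be sorted by value exactly as in step (4).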

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.stream.Stream;

public class TopWords { // class name is a placeholder

    public static void main(String[] args) {

        List<String> word_list = new ArrayList<>();
        // Load your words into word_list here

        // Find the unique words in the list
        String[] uniqueWords = word_list.stream().distinct()
                                        .toArray(String[]::new);
        Map<String, Integer> wordsMap = new HashMap<>();

        // Store each unique word in the map as the key, with its frequency as the value
        for (String uniqueWord : uniqueWords) {
            int frequency = Collections.frequency(word_list, uniqueWord);
            System.out.println(uniqueWord + " occurred " + frequency + " times");
            wordsMap.put(uniqueWord, frequency);
        }

        // Sort the entries by frequency (the map's values) in descending order and keep the first 5
        Stream<Entry<String, Integer>> topWords = wordsMap.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(5);

        // Print the top 5 words to the console
        System.out.println("Top 5 Words:::");
        topWords.forEach(System.out::println);
    }
}
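
If a map really is off the table, as the question suggests, the word[] and freq[] arrays from the question might be enough on their own. The sketch below sorts an array of indices by frequency so that the two arrays stay paired. The class name TopWordsWithArrays, the method topWords, and the alphabetical tie-break are assumptions added for illustration; the sample input in main is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical class; the arrays mirror word[] and freq[] from the question.
class TopWordsWithArrays {

    // Returns up to `limit` lines of the form "word : count", highest count first.
    static List<String> topWords(List<String> wordList, int limit) {
        String[] word = new HashSet<>(wordList).toArray(new String[0]);
        int[] freq = new int[word.length];
        for (int i = 0; i < word.length; i++) {
            freq[i] = Collections.frequency(wordList, word[i]);
        }

        // Sort indices rather than the arrays, so word[i] and freq[i] stay paired.
        Integer[] order = new Integer[word.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Descending by frequency; ties are broken alphabetically (an assumption).
        Arrays.sort(order, (a, b) -> freq[a] != freq[b]
                ? Integer.compare(freq[b], freq[a])
                : word[a].compareTo(word[b]));

        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(limit, order.length); i++) {
            result.add(word[order[i]] + " : " + freq[order[i]]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the words read from the file
        List<String> words = Arrays.asList("a", "b", "a", "c", "b", "a");
        topWords(words, 5).forEach(System.out::println);
    }
}
```

Like the HashMap version, this still calls Collections.frequency once per unique word, so it trades speed for staying within the collections the question already uses.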