bryantee - 1 year ago
Javascript Question

JavaScript: Using an object to iterate over array and keep track of item frequency

I need a function to return the most common string found in an array. I would like to use an object to keep track of the word frequencies. Using getter and setter methods seems like the most viable option, where the setter is used to change the value for each key representing a word. Then, after I sort the object by frequency value, I can return the word with the highest frequency. Am I overthinking this problem?

Answer Source

Here is how this can be solved using Array.prototype.reduce():

var words = ["one", "three", "three", "three", "two", "two"];

var frequencies = words.reduce(function(memo, word) {
    // either start the count at 1 on the first encounter, or increase it by 1
    memo[word] = (memo[word] + 1) || 1;
    return memo;
}, {}); // note the empty object being passed in here - that's the initial value for the variable "memo"


var mostFrequentWord = Object.keys(frequencies)
  .reduce(function(highest, current) {
    return frequencies[highest] > frequencies[current] ? highest : current;
  }, "");

console.log("most frequent word: " + mostFrequentWord + 
"\ncount: " + frequencies[mostFrequentWord])

To get the highest value, it's as simple as running reduce again, this time over Object.keys(frequencies).
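If you want both steps behind a single call, here is a minimal sketch. The function name mostFrequentWord and the switch from a plain object to a Map are my own assumptions, not part of the code above. A Map has no inherited keys, so a word like "constructor" starts from zero like any other word, whereas a plain {} already has a "constructor" property. Note also that this version keeps the first word seen on a tie, while the snippet above keeps the last one.

```javascript
// Hypothetical helper: the name and the use of Map are assumptions for illustration.
function mostFrequentWord(words) {
  // Step 1: count occurrences. A Map has no inherited keys, so a word such as
  // "constructor" starts at 0 like any other word.
  const frequencies = words.reduce(function (memo, word) {
    memo.set(word, (memo.get(word) || 0) + 1);
    return memo;
  }, new Map());

  // Step 2: reduce the [word, count] entries to the entry with the highest count.
  // On a tie, the word that was seen first wins.
  const best = Array.from(frequencies.entries()).reduce(function (highest, current) {
    return current[1] > highest[1] ? current : highest;
  }, [undefined, 0]);

  return { word: best[0], count: best[1] };
}

const result = mostFrequentWord(["one", "three", "three", "three", "two", "two"]);
console.log("most frequent word: " + result.word + "\ncount: " + result.count);
```

For an empty array this returns a word of undefined and a count of 0, which the caller would need to handle.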

EDIT: addressing a comment:

Is there any advantage to using .reduce() over .forEach() in your first loop? You're just returning the same object every time so it seems that .forEach() would work just as well and perhaps be a little clearer

Well, it's somewhat down to style: both of these can achieve the same result. The way they get there is different, though, and I'd argue that for this reason reduce has at least a minuscule advantage. Here is why:

  1. reduce and forEach communicate different intent. While they both can be used to achieve similar results, the difference in how they operate does make them a bit biased to some operations.

    • For reduce the intent is "I want to take this collection of things, go through it and return one thing". It's perfectly suitable to find minimums or maximums, or sums, for example. So, you would use it if you have an array at the start and want to end with something else (though sometimes, you can also return an array).
    • The intent of forEach is slightly different, though - it is "I want to go through this collection and do something with each item". Essentially, it is for when you want to perform the same operation on each item: say, you might be console.logging them, validating them or uploading them. In general, you will have some code that takes one item and does something with it, and you will just apply it to all items through forEach.
  2. reduce is self-contained. It may not look like much, and it may not be much depending on context, but you have to recognise that the entirety of the functionality is contained within reduce. This makes it vastly easier to grasp within a larger context, since you have everything you need in one place. Let's rewrite it using forEach and I will try to show the difference:

var frequencies = {}; //<- instantiation needs to be separate

words.forEach(function(word) { //<- population needs to be separate
    frequencies[word] = (frequencies[word] + 1) || 1;
});

console.log(frequencies); //<- usage is separate

So, you make the function one line shorter (no return) but gain one line because of the instantiation of the variable. This looks completely fine now, because it's isolated, but in a larger codebase you might have code between each of the sections. That makes it harder to keep all the logic in your head: if you read just the forEach loop, you don't have the full context, and when you scroll up to the declaration of frequencies you might no longer be able to see the forEach. What's more, you don't even know what state frequencies will be in when you get to the forEach - would it have some values pre-populated? Would it be set to null? Would it be an array instead of an object? Not only do you have to find the initial declaration of frequencies, you also have to track down whether it was changed at any point before the loop ran.
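To make that concern concrete, here is a contrived sketch. The function names and the stray assignment in the middle are hypothetical; they only stand in for "other code" that might sit between the declaration and the loop in a bigger file.

```javascript
// Contrived illustration: everything between the declaration and the loop is hypothetical.
function countWithForEach(words) {
  var frequencies = {}; // declared here...

  // ...imagine many unrelated lines here, one of which touches the variable:
  frequencies.two = 10; // a stale value left behind by other code

  words.forEach(function (word) { // ...and populated much later
    frequencies[word] = (frequencies[word] + 1) || 1;
  });
  return frequencies;
}

function countWithReduce(words) {
  // The accumulator is created in the reduce call itself, so nothing can
  // modify it before the first callback runs.
  return words.reduce(function (memo, word) {
    memo[word] = (memo[word] + 1) || 1;
    return memo;
  }, {});
}

var sample = ["one", "two", "two"];
console.log(countWithForEach(sample).two); // 12, because of the stale value
console.log(countWithReduce(sample).two);  // 2
```

Reading only the forEach loop in the first function gives no hint that "two" will be wrong; you have to find the earlier assignment to see why.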

Now, with that said, let's re-examine what reduce does - everything you need to know about how it operates is in a single place. The declaration, all changes and the final assignment of frequencies all happen within the span of a few lines of code, so you will not need to find anything else for the context to make sense, regardless of how much code you have. Yes, you might need to know what words contains; however, the same holds true for forEach.

Regarding both of these points, I would say that reduce is clearer to understand. The only reason why forEach would seem like the simpler solution is if you only ever do things using a regular for loop and you need a functional replacement. Yet the declarative approach differs from the imperative one - a forEach and a for are different. Neither is inherently better, but they do have strengths and weaknesses depending on the situation. A reduce operation is the better functional approach in this situation.
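As a small illustration of the difference in intent described in point 1, here is a sketch with made-up data. The array counts and the logged array are assumptions for the example, with logged standing in for whatever per-item side effect you actually need.

```javascript
var counts = [1, 3, 2];

// reduce: a collection goes in, a single value comes out.
var total = counts.reduce(function (sum, n) { return sum + n; }, 0);
var largest = counts.reduce(function (max, n) { return n > max ? n : max; }, -Infinity);

// forEach: the same action is performed for every item and nothing is returned.
var logged = [];
counts.forEach(function (n) {
  logged.push("count: " + n); // stands in for logging, validating or uploading
});

console.log(total, largest, logged.length); // 6 3 3
```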
