user3198603 user3198603 - 3 months ago 27
Java Question

Memory complexity of Boyer–Moore majority vote algorithm?

Per mine understanding , To find the majority element Boyer–Moore majority vote algorithm is O(1) i.e. it constant and not proportional to size of input.
Then why thi wiki link mentions logarithmic space {\displaystyle O(\log n)} O(\log n)

Here is the program for reference

public class MajorityElement {
/* Function to print Majority Element */
void printMajority(int a[], int size) {
/* Find the candidate for Majority */
int cand = findCandidate(a, size);

/* Print the candidate if it is Majority */
if (isMajority(a, size, cand))
System.out.println(" " + cand + " ");
else
System.out.println("No Majority Element");
}

/* Function to find the candidate for Majority */
int findCandidate(int a[], int size) {
int maj_index = 0, count = 1;
int i;
for (i = 1; i < size; i++) {
if (a[maj_index] == a[i])
count++;
else
count--;
if (count == 0) {
maj_index = i;
count = 1;
}
}
return a[maj_index];
}

/*
* Function to check if the candidate occurs more than n/2 times
*/
boolean isMajority(int a[], int size, int cand) {
int i, count = 0;
for (i = 0; i < size; i++) {
if (a[i] == cand)
count++;
}
if (count > size / 2)
return true;
else
return false;
}

Answer

This is why Wikipedia cannot always be relied upon, at least not without some critical thinking on the part of the reader. (Which should not be considered a reason to not use Wikipedia; it's an immensely valuable resource thanks to a vast and committed team of volunteer contributors.)

There are two common models used to measure space and time complexity: the uniform cost model and the logarithmic cost model. The uniform cost model assumes that the storage cost of a single value is o(1) (regardless of the magnitude of that value) and that the time complexity of a single simple arithmetic computation is also o(1). If the values are very large, then these simplifications are not correct, so one might want to use the logarithmic model. In the logarithmic model, we measure the size of the problem not by the count of values, but rather by the total size in bits of the values. (A different Wikipedia article provides a discussion of these models. Also see the references.)

This has little impact on simple arithmetic. The cost of adding two N-bit numbers is o(N) and the cost of adding a vector of numbers whose total size is N bits is o(N), just as it would be with the simplifying assumption that the size of the problem is measured in values and the cost of adding two values is o(1). But if multiplication and division are involved, the complexity computations get much more complicated, and it is really not worth going down that road unless the numbers are really very very big, as in, for example, various encryption algorithms which include operations on values whose size is thousands of bits.

While there are algorithms which involve arithmetic on numbers sufficiently large that it needs to be taken into account for an accurate analysis, there really are no practical algorithms which involve so many inputs that the size of the address of a value (in the random access machine) needs to be taken into account. There are not 2256 subatomic particles in the entire universe, so it is completely reasonable to assume that a limited-bit-width register is sufficient for any addressing purpose, which includes counting the number of participating objects.

Consequently, categorizing an algorithm which needs to maintain a count of inputs as O(log N) just because the counter might have an arbitrary number of bits in some alternative universe is, at best, pedantry, and (in my opinion) contributes nothing to the understanding of the complexity of given algorithms.

Nonetheless, pedants have as much right as anyone to contribute to Wikipedia; indeed, it might be theorized that the Wikipedia culture invites pedantry. That still needs to be balanced against the Wikipedia insistence that authors not include "original research", which would include (again, in my opinion) reinterpreting the storage complexity of an algorithm in a way which contradicts commonly published results. (And that might explain the "citation-needed" marker in the Wikipedia article in question.)

Comments