ananya ananya - 11 months ago 40
C++ Question

Number of subsets of an array (size n) of size k having difference between maximum and minimum elements at most x

I came across this question during a test. Given an array of size n find number of subsets of size k having difference between maximum and minimum element at most x.

Constraints 1<=a[i],n,k,x<=10^6
Example:n=5 k=3 x=5 a={1,2,3,4,5}
output: 10

My Approach So Far:
I first sorted the array and considered the greatest element . Now using linear search I found the smallest element whose such that difference is less than or equal to x.

Now if I take that greatest element and selected any k-1 elements between them .Selection process was (index of greatest - index of smallest)C(k-1) and I summed up these.

Complexity is coming around O(nnk). I wasn't stuck but my solution couldn't pass the test cases.

Answer Source

I think you're on the right track. As I understand it, this is what you have so far:

  • Sort the array.
    • You don't say how you're doing this, you can do it in O(n) time using a radix sort, so I'll assume O(n) time.
  • For each element:
    • Count how many other elements are between E − x and E (where E is the value of this element). Call the result m.
      • You're doing this in worst-case O(n) time by using a linear search.
    • Compute mCk−1.
      • You don't say how you're doing this, but you can precompute this for all possible m in O(n) time and then just use a constant-time lookup, so I'll assume that's what you're doing.

This approach gives the correct result, and requires worst-case O(n2) time. (You mention that you were getting O(n2k) time; but I don't see any reason for that.)

The major optimization that you seem to be missing is that instead of restarting your linear search for each element, you can resume your linear search from where the previous element's linear search left off. That way, all of your linear searches put together will add up to linear time, since they amount to just a single pass over the array. (In other words, your linear searches will have amortized constant time.)

This gives us an algorithm in O(n) time:

sort a
precompute nCr(m, k - 1) for all m in (0 .. n-1)
set total_num_subsets := 0
set min_element_index := 0
for max_element_index in 0 .. n-1 do:
    while a[min_element_index] + x < a[max_element_index] do:
        set min_element_index := min_element_index + 1
    set num_eligible_elements := max_element_index - min_element_index
    set num_subsets := nCr(num_eligible_elements, k - 1)
    set total_num_subsets := total_num_subsets + num_subsets

(There are some additional minor optimizations we can do — for example, we can start the max_element_index iteration at k rather than 0, and we can use memoization rather than precomputation to avoid computing mCk−1 for larger values of m than we actually end up needing — but they don't change the algorithmic complexity.)