ananya - 9 months ago 36

C++ Question

I came across this question during a test. Given an array of size n find number of subsets of size k having difference between maximum and minimum element at most x.

`Constraints 1<=a[i],n,k,x<=10^6`

Example:n=5 k=3 x=5 a={1,2,3,4,5}

output: 10

I first sorted the array and considered the greatest element . Now using linear search I found the smallest element whose such that difference is less than or equal to x.

Now if I take that greatest element and selected any k-1 elements between them .Selection process was (index of greatest - index of smallest)C(k-1) and I summed up these.

Complexity is coming around O(nnk). I wasn't stuck but my solution couldn't pass the test cases.

Answer Source

I think you're on the right track. As I understand it, this is what you have so far:

- Sort the array.
- You don't say how you're doing this, you can do it in O(
*n*) time using a radix sort, so I'll assume O(*n*) time.

- You don't say how you're doing this, you can do it in O(
- For each element:
- Count how many other elements are between
*E*−*x*and*E*(where*E*is the value of this element). Call the result*m*.- You're doing this in worst-case O(
*n*) time by using a linear search.

- You're doing this in worst-case O(
- Compute
_{m}*C*_{k−1}.- You don't say how you're doing this, but you can precompute this for all possible
*m*in O(*n*) time and then just use a constant-time lookup, so I'll assume that's what you're doing.

- You don't say how you're doing this, but you can precompute this for all possible

- Count how many other elements are between

This approach gives the correct result, and requires worst-case O(*n*^{2}) time. (You mention that you were getting O(*n*^{2}*k*) time; but I don't see any reason for that.)

The major optimization that you seem to be missing is that instead of restarting your linear search for each element, you can resume your linear search from where the previous element's linear search left off. That way, all of your linear searches *put together* will add up to linear time, since they amount to just a single pass over the array. (In other words, your linear searches will have *amortized constant time*.)

This gives us an algorithm in O(*n*) time:

```
sort a
precompute nCr(m, k - 1) for all m in (0 .. n-1)
set total_num_subsets := 0
set min_element_index := 0
for max_element_index in 0 .. n-1 do:
while a[min_element_index] + x < a[max_element_index] do:
set min_element_index := min_element_index + 1
set num_eligible_elements := max_element_index - min_element_index
set num_subsets := nCr(num_eligible_elements, k - 1)
set total_num_subsets := total_num_subsets + num_subsets
```

(There are some additional minor optimizations we can do — for example, we can start the `max_element_index`

iteration at `k`

rather than `0`

, and we can use memoization rather than precomputation to avoid computing _{m}*C*_{k−1} for larger values of *m* than we actually end up needing — but they don't change the algorithmic complexity.)