abhi divekar - 2 months ago 3
C++ Question

# C++ strange jump in unsigned long long int values

I have the following question, which is actually from a coding test I recently took:

## Question:

A function
`f(n) = a*n + b*n*(floor(log(n)/log(2))) + c*n*n*n`
exists.

At a particular value, let
`f(n) = k`
;

Given
`k, a, b, c`
, find
`n`
.

For a given value of
`k`
, if no
`n`
value exists, then return 0.

## Limits:

``````1 <= n < 2^63-1
0 < a, b < 100
0 <= c < 100
0 < k < 2^63-1
``````

The logic here is that since
`f(n)`
is purely increasing for a given a, b and c, I can find
`n`
by binary search.

The code I wrote was as follows:

``````#include<iostream>
#include<stdlib.h>
#include<math.h>
using namespace std;

unsigned long long logToBase2Floor(unsigned long long n){
return (unsigned long long)(double(log(n))/double(log(2)));
}

#define f(n, a, b, c) (a*n + b*n*(logToBase2Floor(n)) + c*n*n*n)

unsigned long long findNByBinarySearch(unsigned long long k, unsigned long long a, unsigned long long b, unsigned long long c){
unsigned long long low = 1;
unsigned long long high = (unsigned long long)(pow(2, 63)) - 1;
unsigned long long n;
while(low<=high){
n = (low+high)/2;
cout<<"\n\n          k= "<<k;
cout<<"\n f(n,a,b,c)= "<<f(n,a,b,c)<<"  low = "<<low<<"  mid="<<n<<"  high = "<<high;
if(f(n,a,b,c) == k)
return n;
else if(f(n,a,b,c) < k)
low = n+1;
else high = n-1;
}
return 0;
}
``````

I then tried it with a few test cases:

``````int main(){
unsigned long long n, a, b, c;
n = (unsigned long long)pow(2,63)-1;
a = 99;
b = 99;
c = 99;
cout<<"\nn="<<n<<"  a="<<a<<"  b="<<b<<"  c="<<c<<"    k = "<<f(n, a, b, c);
cout<<"\nANSWER: "<<findNByBinarySearch(f(n, a, b, c), a, b, c)<<endl;
n = 1000;
cout<<"\nn="<<n<<"  a="<<a<<"  b="<<b<<"  c="<<c<<"    k = "<<f(n, a, b, c);
cout<<"\nANSWER: "<<findNByBinarySearch(f(n, a, b, c), a, b, c)<<endl;
return 0;
}
``````

Then something weird happened.

The code works for the test case
`n = (unsigned long long)pow(2,63)-1;`
, correctly returning that value of n. But it did not work for
`n=1000`
. I printed the output and saw the following:

``````n=1000  a=99  b=99  c=99    k = 99000990000

k= 99000990000
f(n,a,b,c)= 4611686018427387904  low = 1  mid=4611686018427387904  high = 9223372036854775807
...
...
k= 99000990000
f(n,a,b,c)= 172738215936  low = 1  mid=67108864  high = 134217727

k= 99000990000
f(n,a,b,c)= 86369107968  low = 1  mid=33554432  high = 67108863

k= 99000990000
f(n,a,b,c)= 129553661952  low = 33554433  mid=50331648  high = 67108863**
...
...
k= 99000990000
f(n,a,b,c)= 423215328047139441  low = 37748737  mid=37748737  high = 37748737
``````

Something didn't seem right mathematically. How was it that the value of
`f(1000)`
was greater than the value of
`f(33554432)`
?

So I tried the same code in Python, and got the following values:

``````>>> f(1000, 99, 99, 99)
99000990000L
>>> f(33554432, 99, 99, 99)
3740114254432845378355200L
``````

So, the value is definitely larger.

## Questions:

• What is happening exactly?

• How should I solve it?

### What is happening exactly?

The problem is here:

``````unsigned long long low = 1;
// Side note: This is simply (2ULL << 62) - 1
unsigned long long high = (unsigned long long)(pow(2, 63)) - 1;
unsigned long long n;
while (/* irrelevant */) {
n = (low + high) / 2;
// Some stuff that do not modify n...
f(n, a, b, c) // <-- Here!
}
``````

In the first iteration, you have `low = 1` and `high = 2^63 - 1`, which mean that `n = 2^63 / 2 = 2^62`. Now, let's look at `f`:

``````#define f(n, a, b, c) (/* I do not care about this... */ + c*n*n*n)
``````

You have `n^3` in `f`, so for `n = 2^62`, `n^3 = 2^186`, which is probably way too large for your `unsigned long long` (which is likely to be 64-bits long).

### How should I solve it?

Preamble, I am using `ull_t` because I am lazy, and you should avoid macro in C++, prefer using a function and let the compiler inline it. Also, I prefer a loop against using the `log` function to compute the log2 of an `unsigned long long`.

``````using ull_t = unsigned long long;

constexpr auto log2 (ull_t n) {
ull_t log = 0;
while (n >>= 1) ++log;
return log;
}

constexpr auto f (ull_t n, ull_t a, ull_t b, ull_t c) {
return a * n + b * n * log2(n) + c * n * n * n;
}
``````

The main problem here is the way you find your bounds for `n`, so you should try to find a better upper (and maybe lower) bound(s) for `n` to start with. You should also split your function into two cases:

• If c is 0, you have `f(n, a, b, 0) = a * n + b * n + log2(n) = n * (a + b * log2(n))`.

Depending on the value of `a` and `b`, getting a bound for `n` may be difficult (see comments). I personally would try all possible values of `log2(n)` (there are only 64 possible values given the range of `n`, so this would have the same complexity as a binary search) and for each of these value, check if the corresponding `n` matches the given `k`:

``````// Find n when c = 0
constexpr auto find_n (ull_t k, ull_t a, ull_t b)
for (ull_t l = 1; l < 64; ++l) {
auto n = k / (a + b * l);
if (f(n, a, b, 0) == k)
return n;
}
return 0ULL;
}
``````
• If c is not 0, the dominant term is clearly `c * n * n * n` when `n` is large (we do not care when `n` is small), so you should starts with something like (approximation, this will not work for all values of `k`/`n`):

high = 2((64 - log2(c)) / 4)