I'm trying to adapt a C program on reinforcement learning, https://webdocs.cs.ualberta.ca/~sutton/book/code/pole.c, to Python to participate in the OpenAI Gym. I've copied the `get_box` function into a small test program:

    #include <stdio.h>

    int get_box(float x, float x_dot, float theta, float theta_dot);

    int main() {
        int box;
        box = get_box(0.01, 0.01, 0.01, 0.01);
        printf("The value of box is : %x\n", box);
        return 0;
    }

    #define one_degree 0.0174532 /* 2pi/360 */
    #define six_degrees 0.1047192
    #define twelve_degrees 0.2094384
    #define fifty_degrees 0.87266

    int get_box(x,x_dot,theta,theta_dot)
    float x,x_dot,theta,theta_dot;
    {
        int box=0;

        if (x < -2.4 ||
            x > 2.4 ||
            theta < -twelve_degrees ||
            theta > twelve_degrees) return(-1); /* to signal failure */

        if (x < -0.8) box = 0;
        else if (x < 0.8) box = 1;
        else box = 2;

        if (x_dot < -0.5) ;
        else if (x_dot < 0.5) box += 3;
        else box += 6;

        if (theta < -six_degrees) ;
        else if (theta < -one_degree) box += 9;
        else if (theta < 0) box += 18;
        else if (theta < one_degree) box += 27;
        else if (theta < six_degrees) box += 36;
        else box += 45;

        if (theta_dot < -fifty_degrees) ;
        else if (theta_dot < fifty_degrees) box += 54;
        else box += 108;

        return(box);
    }

which I call `scratch.c`. Compiling and running it gives:

    gcc scratch.c -lm
    ./a.out
    The value of box is : 55

However, if I go through the conditional statements manually I would expect to get 1 + 3 + 27 + 54 = 85, which is also what I get with my Python program. Why does the program print 55?

The `%x` conversion prints the value in hexadecimal. If you use `printf("%d\n", box)` instead of `printf("%x\n", box)`, you'll get the decimal value printed. The result is the same number either way: 0x55 = 5*16 + 5 = 85.