Alexandre Pieroux - 8 months ago 42

C Question

Here is the portion of the code that trigger this error (this is part of a bigger file that have roughly 1000 lines and is intended to implement the hypercuts algorithm) :

`void get_combination_cuts_characteristics(uint32_t* cuts,`

uint32_t nb_dim_cut,

struct hypercuts_dimension** dimensions,

uint32_t* return_cuts,

uint32_t sum_cuts,

struct classifier_rule** rules,

uint32_t nb_rules,

uint32_t* children_rules_sum,

uint32_t* max_rules)

{

// Array of children

uint32_t nb_children = (uint32_t) 0x1 << sum_cuts;

uint32_t* min_index = chkmalloc(sizeof(*min_index) * nb_dim_cut);

uint32_t* max_index = chkmalloc(sizeof(*max_index) * nb_dim_cut);

uint32_t* current_index = chkmalloc(sizeof(*current_index) * nb_dim_cut);

uint32_t children_array[nb_children];

for (uint32_t i = 0; i < nb_children; ++i)

children_array[i] = 0;

// For each rules we compute the number of rule each child get.

uint32_t min_value;

uint32_t max_value;

uint32_t nb_cuts;

uint32_t subregion_size;

uint32_t index;

for (uint32_t i = 0; i < nb_rules; ++i)

{

for (uint32_t j = 0; j < nb_dim_cut; ++j)

{

min_value = rules[i]->statements[dimensions[j]->id]->value;

max_value = rules[i]->statements[dimensions[j]->id]->value | rules[i]->statements[dimensions[j]->id]->mask;

nb_cuts = (uint32_t)0x1 << cuts[j];

subregion_size = (dimensions[j]->max_dim - dimensions[j]->min_dim) + 1;

subregion_size = subregion_size / nb_cuts;

if(subregion_size == 0)

continue;

// Fit the interval in the region of the dimension

if(min_value < dimensions[j]->min_dim)

min_value = dimensions[j]->min_dim;

if(max_value > dimensions[j]->max_dim)

max_value = dimensions[j]->max_dim;

// Compute the minimal and maximal index of the rule in this dimension

min_index[j] = (min_value - dimensions[j]->min_dim) / subregion_size;

max_index[j] = (max_value - dimensions[j]->min_dim) / subregion_size;

current_index[j] = min_index[j];

}

// Locate the first child

index = get_multi_dimension_index(min_index, nb_dim_cut, cuts);

children_array[index] ++;

// Locate all the other children that the rule span

while(get_next_dimension_index(current_index, min_index, max_index, nb_dim_cut))

{

index = get_multi_dimension_index(current_index, nb_dim_cut, cuts);

children_array[index]++;

}

}

// Set the return variables

uint32_t rules_sum = 0;

uint32_t max_rule_child = 0;

for (uint32_t i = 0; i < nb_children; ++i)

{

rules_sum += children_array[i];

if(max_rule_child < children_array[i])

max_rule_child = children_array[i];

}

if(max_rule_child < *max_rules || ((max_rule_child == *max_rules) && (rules_sum < *children_rules_sum)))

{

*max_rules = max_rule_child;

*children_rules_sum = rules_sum;

for (uint32_t i = 0; i < nb_dim_cut; ++i)

return_cuts[i] = cuts[i];

}

free(min_index);

free(max_index);

free(current_index);

}

gdb tell me that I got a segfault at the line

`rules_sum += children_array[i];`

`if(max_rule_child < *max_rules || ((max_rule_child == *max_rules) && (rules_sum < *children_rules_sum)))`

Another tricky thing is that if I put a fprint before the for loop, one after and one inside I run fine...

Here is what valgrind give me:

`Invalid read of size 4`

==8397== at 0x4017EE: get_combination_cuts_characteristics (hypercuts.c:775)

==8397== by 0x401947: get_optimal_cut_combination (hypercuts.c:663)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x401B78: set_nb_cuts (hypercuts.c:485)

==8397== by 0x4025B2: build_node (hypercuts.c:219)

==8397== by 0x40273D: build_node (hypercuts.c:285)

==8397== by 0x4029B2: new_hypercuts_classifier (hypercuts.c:143)

==8397== by 0x403B02: main (hypercuts_test.c:277)

==8397== Address 0x11fefff77a is not stack'd, malloc'd or (recently) free'd

==8397==

==8397==

==8397== Process terminating with default action of signal 11 (SIGSEGV)

==8397== Access not within mapped region at address 0x11FEFFF77A

==8397== at 0x4017EE: get_combination_cuts_characteristics (hypercuts.c:775)

==8397== by 0x401947: get_optimal_cut_combination (hypercuts.c:663)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x40198F: get_optimal_cut_combination (hypercuts.c:678)

==8397== by 0x401B78: set_nb_cuts (hypercuts.c:485)

==8397== by 0x4025B2: build_node (hypercuts.c:219)

==8397== by 0x40273D: build_node (hypercuts.c:285)

==8397== by 0x4029B2: new_hypercuts_classifier (hypercuts.c:143)

==8397== by 0x403B02: main (hypercuts_test.c:277)

I'am out of idea and I come here to have help that could give me hints or new ideas. This function is called by another recursive one (build_node: I'am talking of 4 recursive calls in the case of the segmentation fault so not too many) and is executed fine 3 times before it fault. This give me the feeling that something is messing around with the stack (a pointer or an array), but I didn't found tools to profile the stack and I checked that portion of code many times.

To give a little of details about that part of the code: this intend to perform a linear optimisation on the number of cuts to perform in a multidimensional space. This particular function give the caracteristics of the cuts performed and is performed at the end of each optimisation steps.

Thanks in advance !!

Answer

This should be straightforward to debug.

I'd investigate the valgrind crash first as it tends to be more precise. The line, `if(max_rule_child < *max_rules || ((max_rule_child == *max_rules) && (rules_sum < *children_rules_sum)))`

, has two pointers being dereferenced. One or more of them are sure to be rubbish. Carefully check max_rules and children_rules_sum are pointing to a valid address. Add a debug statement and see if the values change.

The other line,`rules_sum += children_array[i];`

, seems possible as well. There doesn't appear to be a check of `index`

being less than `nb_children`

. Use the same strategy and add some debug statements of the two values. A write past the end of the Array will corrupt the stack. The stack corruption could overwrite children_rules_sum or max_rules resulting in the valgrind crash.