Smaointe - 1 year ago 70
C Question

# C Bitset Implementation Using Structs Unexpected Output

The below code, which creates and uses a bitset, is from the following tutorial "Intro to Bit Vectors". I am rewriting this code to try to learn and understand more about C structs and pointers.

``````#include <stdio.h>
#include <stdlib.h>

#define WORDSIZE 32
#define BITS_WORDSIZE 5

// Create a bitset
int initbv(int **bv, int val) {
*bv = calloc(val/WORDSIZE + 1, sizeof(int));
return *bv != NULL;
}

// Place int 'i' in the biset
void set(int bv[], int i) {
bv[i>>BITS_WORDSIZE] |= (1 << (i & MASK));
}

// Return true if integer 'i' is a member of the bitset
int member(int bv[], int i) {
int boolean = bv[i>>BITS_WORDSIZE] & (1 << (i & MASK));
return boolean;
}

int main() {
int *bv, i;

int s1[] = {32, 5, 0};
int s2[] = {32, 4, 5, 0};

initbv(&bv, 32);

// Fill bitset with s1
for(i = 0; s1[i]; i++) {
set(bv, s1[i]);
}

// Print intersection of bitset (s1) and s2
for(i = 0; s2[i]; i++) {
if(member(bv, s2[i])) {
printf("%d\n", s2[i]);
}
}

free(bv);
return 0;
}
``````

Below, is what I have rewritten to make use of structs.

``````#include <stdio.h>
#include <stdlib.h>

#define WORDSIZE 32
#define BITS_WS 5

struct bitset {
int *bv;
};

/* Create bitset that can hold 'size' items */
struct bitset * bitset_new(int size) {
struct bitset * set = malloc(sizeof(struct bitset));

set->bv = calloc(size/WORDSIZE + 1, sizeof(int));

return set;
}

/* Add an item to a bitset */
int bitset_add(struct bitset * this, int item) {
return this->bv[item>>BITS_WS] |= (1 << (item & MASK));
}

/* Check if an item is in the bitset */
int bitset_lookup(struct bitset * this, int item) {
int boolean = this->bv[item>>BITS_WS] & (1 << (item & MASK));
printf("%d\n", boolean);
return boolean;
}

int main() {
struct bitset * test = bitset_new(32);

int num = 5;

printf("%d\n", bitset_lookup(test, num));

return 0;
}
``````

What I have rewritten is not working as expected. To test the implementation, in main() I try a bitset_lookup and expect a return value of 0 or 1 but I get a value of 32. I believe it must be something to do with my use of pointers, although I cannot see what I am doing wrong.

Any help appreciated!

That is not a tutorial, it is misleading examples at best.

First of all, use an unsigned type. I recommend `unsigned long` (for various reasons, none of them critical). The `<limits.h>` header file defines constant `CHAR_BIT`, and the number of bits you can use in any unsigned integer type is always `CHAR_BIT * sizeof (unsigned_type)`.

Second, you can make the bit map (or ordered bit set) dynamically resizable, by adding the size information to the structure.

The above boils down to

``````#include <stdlib.h>
#include <limits.h>

#define ULONG_BITS (CHAR_BIT * sizeof (unsigned long))

typedef struct {
size_t         ulongs;
unsigned long *ulong;
} bitset;

#define BITSET_INIT { 0, NULL }

void bitset_init(bitset *bset)
{
if (bset) {
bset->ulongs = 0;
bset->ulong  = NULL;
}
}

void bitset_free(bitset *bset)
{
if (bset) {
free(bset->ulong);
bset->ulongs = 0;
bset->ulong  = NULL;
}
}

/* Returns: 0 if successfully set
-1 if bs is NULL
-2 if out of memory. */
int bitset_set(bitset *bset, const size_t bit)
{
if (bset) {
const size_t  i = bit / ULONG_BITS;

/* Need to grow the bitset? */
if (i >= bset->ulongs) {
const size_t   ulongs = i + 1; /* Use better strategy! */
unsigned long *ulong;
size_t         n = bset->ulongs;

ulong = realloc(bset->ulong, ulongs * sizeof bset->ulong[0]);
if (!ulong)
return -2;

/* Update the structure to reflect the changes */
bset->ulongs = ulongs;
bset->ulong  = ulong;

/* Clear the newly acquired part of the ulong array */
while (n < ulongs)
ulong[n++] = 0UL;
}

bset->ulong[i] |= 1UL << (bit % ULONG_BITS);

return 0;
} else
return -1;
}

/* Returns: 0 if SET
1 if UNSET
-1 if outside the bitset */
int bitset_get(bitset *bset, const size_t bit)
{
if (bset) {
const size_t  i = bit / ULONG_BITS;

if (i >= bset->ulongs)
return -1;

return !(bset->ulong[i] & (1UL << (bit % ULONG_BITS)));
} else
return -1;
}
``````

In a `bitset` structure, the `ulong` is a dynamically allocated array of `ulongs` `unsigned long`s. Thus, it stores `ulongs * ULONG_BITS` bits.

`BITSET_INIT` is a preprocessor macro you can use to initialize an empty bitset. If you cannot or do not want to use it, you can use `bitset_init()` to initialize a bitset. The two are equivalent.

`bitset_free()` releases the dynamic memory allocated for the bitset. After the call, the bit set is gone, and the variable used is re-initialized. (Note that it is perfectly okay to call `bitset_free()` on an un-used but initialized bit set, because calling `free(NULL)` is perfectly safe and does nothing.)

Because the OS/kernel will automatically release all memory used by a program (except for certain types of shared memory segments), it is not necessary to call `bitset_free()` just before a program exits. But, if you use bit sets as part of some algorithm, it is obviously a good practice to release the memory no longer needed, so that the application can potentially run indefinitely without "leaking" (wasting) memory.

`bitset_set()` automatically grows the bit set when necessary, but only to as large as is needed. This is not necessarily a good reallocation policy: `malloc()`/`realloc()` etc. calls are relatively slow, and if you happen to call `bitset_set()` in increasing order (by increasing bit number), you end up calling `realloc()` for every `ULONG_BITS`. Instead, it is often a good idea to adjust the new size (`ulongs`) upwards -- the exact formula you use for this is your reallocation policy --, but suggesting a good policy would require practical testing with practical programs. The shown one works, and is quite robust, but may be a bit slow in some situations. (You'd need to use at least tens of thousands of bits, though.)

The `bitset_get()` function return value is funky, because I wanted the function to return a similar value for both "unset" and "outside the bit set", because the two are logically similar. (That is, I consider the bit set, a set of set bits; in which case it is logical to think of all bits outside the set as unset.)

A much more traditional definition is

``````int bitset_get(bitset *bset, const size_t bit)
{
if (bset) {
const size_t  i = bit / ULONG_BITS;

if (i >= bset->ulongs)
return 0;

return !!(bset->ulong[i] & (1UL << (bit % ULONG_BITS)));
} else
return 0;
}
``````

which returns 1 only for bits set, and 0 for bits outside the set.

Note the `!!`. It is just two not operators, nothing too strange; making it a not-not operator. `!!x` is 0 if `x` is zero, and 1 if `x` is nonzero.

(A single not operator, `!x`, yields 1 if `x` is zero, and 0 if `x` is nonzero. Applying not twice yields the not-not I explained above.)

To use the above, try e.g.

``````int main(void)
{
bitset train = BITSET_INIT;

printf("bitset_get(&train, 5) = %d\n", bitset_get(&train, 5));

if (bitset_set(&train, 5)) {
printf("Oops; we ran out of memory.\n");
return EXIT_FAILURE;
} else
printf("Called bitset_set(&train, 5) successfully\n");

printf("bitset_get(&train, 5) = %d\n");

bitset_free(&train);

return EXIT_SUCCESS;
}
``````

Because we do not make any assumptions about the hardware or system we are running (unless I goofed somewhere; if you notice I did, let me know in the comments so I can fix my goof!), and only stuff that the C standard says we can rely on, this should work on anything you can compile the code with a standards-compliant compiler. Windows, Linux, BSDs, old Unix, macOS, and others.

With some changes, it can be made to work on microcontrollers, even. I'm not sure if all development libraries have `realloc()`; even `malloc()` might not be available. Aside from that, on things like 32-bit ARMs this should work as-is just fine; on 8-bit AVRs and such it would be a good idea to use `unsigned char` and `CHAR_BIT`, instead, since they tend to emulate the larger types rather than supporting them in hardware. (The code above would work, but be slower than necessary.)

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download