Federico Federico - 3 months ago 19
C Question

How to Hash CPU ID in C

I'm trying to short the cpu id of my microcontroller (STM32F1).

The cpu id is composed by 3 word ( 3 x 4 bytes). This is the id string built from the 3 word:

980416578761680031125348904


I found a very useful library that do this.

The library is Hashids and there is a C code.

I try to build a test code on PC with "Code Blocks IDE" and the code works.

But when I move the code into the embedded side (Keil v5 IDE), I get an error on strdup() function: "strdup implicit declaration of function".

The problem is related to the strdup function isn't a standard library function and ins't included into string.h.

I will avoid to replace the strdup function with a custom function (that mimic the behaviour of strdup) to avoid memory leak because strdup copy strings using malloc.

Is there a different approach to compress long numbers?

Thanks for the help!

<---Appendix--->

This is the function that uses the strdup.

/* common init */
struct hashids_t *
hashids_init3(const char *salt, size_t min_hash_length, const char *alphabet)
{
struct hashids_t *result;
unsigned int i, j;
size_t len;
char ch, *p;

hashids_errno = HASHIDS_ERROR_OK;

/* allocate the structure */
result = _hashids_alloc(sizeof(struct hashids_t));
if (HASHIDS_UNLIKELY(!result)) {
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}

/* allocate enough space for the alphabet and its copies */
len = strlen(alphabet) + 1;
result->alphabet = _hashids_alloc(len);
result->alphabet_copy_1 = _hashids_alloc(len);
result->alphabet_copy_2 = _hashids_alloc(len);
if (HASHIDS_UNLIKELY(!result->alphabet || !result->alphabet_copy_1
|| !result->alphabet_copy_2)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}

/* extract only the unique characters */
result->alphabet[0] = '\0';
for (i = 0, j = 0; i < len; ++i) {
ch = alphabet[i];
if (!strchr(result->alphabet, ch)) {
result->alphabet[j++] = ch;
}
}
result->alphabet[j] = '\0';

/* store alphabet length */
result->alphabet_length = j;

/* check length and whitespace */
if (result->alphabet_length < HASHIDS_MIN_ALPHABET_LENGTH) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_LENGTH;
return NULL;
}
if (strchr(result->alphabet, ' ')) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_SPACE;
return NULL;
}

/* copy salt */
result->salt = strdup(salt ? salt : HASHIDS_DEFAULT_SALT);
result->salt_length = (unsigned int) strlen(result->salt);

/* allocate enough space for separators */
result->separators = _hashids_alloc((size_t)
(ceil((float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR) + 1));
if (HASHIDS_UNLIKELY(!result->separators)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}

/* non-alphabet characters cannot be separators */
for (i = 0, j = 0; i < strlen(HASHIDS_DEFAULT_SEPARATORS); ++i) {
ch = HASHIDS_DEFAULT_SEPARATORS[i];
if ((p = strchr(result->alphabet, ch))) {
result->separators[j++] = ch;

/* also remove separators from alphabet */
memmove(p, p + 1,
strlen(result->alphabet) - (p - result->alphabet));
}
}

/* store separators length */
result->separators_count = j;

/* subtract separators count from alphabet length */
result->alphabet_length -= result->separators_count;

/* shuffle the separators */
hashids_shuffle(result->separators, result->separators_count,
result->salt, result->salt_length);

/* check if we have any/enough separators */
if (!result->separators_count
|| (((float)result->alphabet_length / (float)result->separators_count)
> HASHIDS_SEPARATOR_DIVISOR)) {
unsigned int separators_count = (unsigned int)ceil(
(float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR);

if (separators_count == 1) {
separators_count = 2;
}

if (separators_count > result->separators_count) {
/* we need more separators - get some from alphabet */
int diff = separators_count - result->separators_count;
strncat(result->separators, result->alphabet, diff);
memmove(result->alphabet, result->alphabet + diff,
result->alphabet_length - diff + 1);

result->separators_count += diff;
result->alphabet_length -= diff;
} else {
/* we have more than enough - truncate */
result->separators[separators_count] = '\0';
result->separators_count = separators_count;
}
}

/* shuffle alphabet */
hashids_shuffle(result->alphabet, result->alphabet_length,
result->salt, result->salt_length);

/* allocate guards */
result->guards_count = (unsigned int) ceil((float)result->alphabet_length
/ HASHIDS_GUARD_DIVISOR);
result->guards = _hashids_alloc(result->guards_count + 1);
if (HASHIDS_UNLIKELY(!result->guards)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}

if (HASHIDS_UNLIKELY(result->alphabet_length < 3)) {
/* take some from separators */
strncpy(result->guards, result->separators, result->guards_count);
memmove(result->separators, result->separators + result->guards_count,
result->separators_count - result->guards_count + 1);

result->separators_count -= result->guards_count;
} else {
/* take them from alphabet */
strncpy(result->guards, result->alphabet, result->guards_count);
memmove(result->alphabet, result->alphabet + result->guards_count,
result->alphabet_length - result->guards_count + 1);

result->alphabet_length -= result->guards_count;
}

/* set min hash length */
result->min_hash_length = min_hash_length;

/* return result happily */
return result;
}

Answer

The true question seems to be

Is there a different approach to compress long numbers?

There are many. They differ in several respects, including which bits of the input contribute to the output, how many inputs map to the same output, and what manner of transformations of the input leave the output unchanged.

As a trivial examples, you can compress the input to a single bit by any of these approaches:

  • Choose the lowest-order bit of the input
  • Choose the highest-order bit of the input
  • The output is always 1
  • etc

Or you can compress to 7 bits by using using the number of 1 bits in the input as the output.

None of those particular options is likely to be of interest to you, of course.

Perhaps you would be more interested in producing 32-bit outputs for your 96-bit inputs. Do note that in that case on average there will be at least 264 possible inputs that map to each possible output. That depends only on the sizes of input and output, not on any details of the conversion.

For example, suppose that you have

uint32_t *cpuid = ...;

pointing to the hardware CPU ID. You can produce a 32-bit value from it that depends on all the bits of the input simply by doing this:

uint32_t cpuid32 = cpuid[0] ^ cpuid[1] ^ cpuid[2];

Whether that would suit your purpose depends on how you intend to use it.