gabe appleton - 2 months ago
C Question

Issues translating C++ string manipulation to C

I have a small portion of my program that does base conversion, in this case from a byte buffer (base 256) to base 58.

I'm trying to translate this portion into C, so that when I need to write other implementations of it I can just reuse the same code.

From the original C++:

static unsigned int divide_58(string& x) {
    const size_t length = x.length();
    size_t pos = 0;
    char *quotient = new char[length];

    for (size_t i = 0; i < length; ++i) {
        const size_t j = i + 1 + x.length() - length;
        if (x.length() < j)
            break;

        const unsigned int value = base2dec(x.c_str(), j); //defined elsewhere; consistent in both

        quotient[pos] = (unsigned char)(value / 58);
        if (pos != 0 || quotient[pos] != ascii[0])
            pos++;

        size_t len = 4;
        char *temp_str = dec2base(value % 58, len); //defined elsewhere; consistent in both
        x.replace(0, j, temp_str, len); //Replace the contents at 0 thru j with the whole contents of temp_str, moving things appropriately
        free(temp_str);
    }

    // calculate remainder
    const unsigned int remainder = base2dec(x.c_str(), x.length()); //defined elsewhere; consistent in both

    // remove leading "zeros" from quotient and store in 'x'
    x.assign(quotient, quotient + pos);

    return remainder;
}


I translated this to the following bit of C:

static unsigned int divide_58(char *x, size_t &length) {
    const size_t const_length = length;
    size_t pos = 0;
    char *quotient = (char*) malloc(sizeof(char) * const_length);

    for (size_t i = 0; i < const_length; ++i) {
        const size_t j = i + 1 + length - const_length;
        if (length < j)
            break;

        const unsigned int value = base2dec(x, j); //defined elsewhere; consistent in both

        quotient[pos] = (unsigned char)(value / 58);
        if (pos != 0 || quotient[pos] != ascii[0])
            pos++;

        size_t len = 4;
        char *temp_str = dec2base(value % 58, len); //defined elsewhere; consistent in both
        memcpy(x, temp_str, len);
        free(temp_str);

        memmove(x + len, x + j, length - j);
        length -= j;
        length += len;
    }

    // calculate remainder
    const unsigned int remainder = base2dec(x, length); //defined elsewhere; consistent in both

    // remove leading "zeros" from quotient and store in 'x'
    memcpy(x, quotient, pos);
    free(quotient);
    length = pos;

    return remainder;
}


This works in almost all cases, but in my Linux testing environment (and on none of my local machines) it produces the wrong answer, even though both environments agree that the input is correct.

Failing example: https://app.shippable.com/runs/57cf7ae56f908e0e00c5e451/1/console (build_ci -> make cpytest cov=true)

Working example: https://travis-ci.org/gappleto97/p2p-project/jobs/158036360#L392

I know that the standard is to provide the shortest example of the problem, but as near as I can tell this is the shortest example. Can y'all help me out?

For the MCVE folks, you can verify this yourself via my git repo.

git clone https://github.com/gappleto97/p2p-project
cd p2p-project
git checkout develop
make cpytest
git checkout c_issue
rm -r build
make cpytest


The first call to make will have failing tests. The second will not. The second is using the C++ code provided here, the first is using the C code provided here. It's been bumped up to Python for ease of testing, but I've narrowed it down to this function. That will probably be useless though because I can only replicate the bug on Shippable.

Answer

You have translated this C++ line:

    x.replace(0, j, temp_str, len);

into this C code:

    memcpy(x, temp_str, len);
    memmove(x + len, x + j, length - j);
    length -= j;
    length += len;

That is not equivalent code.

One problem is that you never zero-terminate x.

Another is that the code produces completely different strings when j is less than 4, because memcpy overwrites the bytes from x[j] up to x[3] before memmove gets to read them.

Look at this C++ code:

#include <iostream>
#include <string>
#include <string.h>
using namespace std;

int main() {
    string s = "abcdefgh";
    char t[10] = "01234567";
    cout << "Original: " << s << endl;

    int j = 2;

    // The c++ way
    s.replace(0, j, t, 4);
    cout << "C++: " << s << endl;

    // Your C way
    char x[16] = "abcdefgh";  // sized so the longer result and its terminator fit
    size_t length = strlen(x);
    memcpy(x, t, 4);
    memmove(x + 4, x + j, length - j);
    cout << "C  : " << x << endl;

    return 0;
}

Output:

Original: abcdefgh
C++: 0123cdefgh
C  : 012323efgh

The same code with j = 6:

Output:

Original: abcdefgh
C++: 0123gh
C  : 0123ghgh
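To see where the j = 2 result comes from, it can help to replay the two calls one step at a time. The following is only a sketch in plain C: the helper name copy_then_shift and its cap parameter are made up for illustration and are not part of your code. Unlike the demo, it tracks the length the way your function does and terminates the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative helper (hypothetical name): replays the memcpy-then-memmove
 * sequence from the question on a NUL-terminated buffer of capacity `cap`.
 * Returns the new length. */
static size_t copy_then_shift(char *x, size_t cap, size_t j,
                              const char *t, size_t len) {
    size_t length = strlen(x);
    assert(j <= length && length - j + len < cap);

    /* Step 1: overwrite x[0..len). Any original bytes in x[j..len) are lost. */
    memcpy(x, t, len);
    printf("after memcpy : %.*s\n", (int)length, x);

    /* Step 2: shift what is now at x[j..length) to start at x[len]. */
    memmove(x + len, x + j, length - j);
    length = length - j + len;
    x[length] = '\0';
    printf("after memmove: %s\n", x);
    return length;
}
```

Called on a 16-byte buffer holding "abcdefgh" with j = 2, this prints 0123efgh after the memcpy and 012323efgh after the memmove: the bytes c and d are gone before they can be shifted. With j = 6 and the same length bookkeeping, the result is 0123gh, which matches the C++ version, so the extra "gh" in the demo output shows up only because the stale tail is printed without a terminator.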

Conclusion: your C code is not equivalent to the C++ replace call.
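One possible way to get the same behaviour as x.replace(0, j, temp_str, len) in C is to shift the tail first and only then copy the new prefix in. This is a minimal sketch, not a drop-in patch: the name replace_prefix, the cap parameter and the return codes are assumptions made for the example. It takes the length through a pointer, since C has no reference parameters such as size_t &length, and it refuses to write past the buffer, because the string grows by len - j bytes whenever j is smaller than len.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: replace the first j bytes of x with len bytes from src.
 * *length is the current string length, cap is the total buffer size.
 * Returns 0 on success, -1 if the result plus its terminator would not fit. */
static int replace_prefix(char *x, size_t *length, size_t cap,
                          size_t j, const char *src, size_t len) {
    assert(j <= *length);

    size_t new_length = *length - j + len;
    if (new_length + 1 > cap)
        return -1;                        /* caller must provide a larger buffer */

    /* Move the surviving tail out of the way before touching the prefix. */
    memmove(x + len, x + j, *length - j);
    memcpy(x, src, len);

    x[new_length] = '\0';                 /* keep the buffer zero-terminated */
    *length = new_length;
    return 0;
}
```

With "abcdefgh" in a 16-byte buffer this gives 0123cdefgh for j = 2 and 0123gh for j = 6, the same strings the C++ version prints. If the buffer passed to divide_58 is only as large as the original input, the -1 path would be hit as soon as the string needs to grow, which is something std::string handles by reallocating.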