Zerowalker Zerowalker - 14 days ago 5
C# Question

Using fixed char[] to copy data and creating string from it, not always using entire char[], safe?

So I noticed that manipulating Strings is super slow when it comes to anything that causes them to resize, basically removing or adding characters (in my case removing).

So I figured that using a stackalloc or fixed temporary buffer and just copying all the data to it, except what I don't want, amounts to the same thing as removing.

But I need to allocate this buffer with the same length as the input, because that's the upper limit:
the result can never be longer than the input, but it will almost certainly be shorter.

So here is the code. I wonder whether this way of doing it is actually safe,
because a large part of the buffer may never be used.

//Remove all unnecessary spaces
private unsafe static string FormatCodeUnsafe(string text)
{
    int length = text.Length;
    var charbuffer = new char[length];
    int index = 0;
    fixed (char* charbuf = charbuffer)
    fixed (char* strptr = text)
    {
        char* charptr = charbuf;
        for (int i = 0; i < length; i++)
        {
            char c = strptr[i];

            if (i > 0)
            {
                if (c == ' ' && strptr[i - 1] == ',')
                    continue;
                if (c == ' ' && strptr[i - 1] == ')')
                    continue;
                if (c == ' ' && strptr[i - 1] == ' ')
                    continue;
            }
            if (i < length - 1)
            {
                if (c == ' ' && strptr[i + 1] == ' ')
                    continue;
                if (c == ' ' && strptr[i + 1] == ',')
                    continue;
                if (c == ' ' && strptr[i + 1] == '(')
                    continue;
            }

            *charptr = c;
            charptr++;
            index++;
        }
    }
    //Return the result
    return new string(charbuffer, 0, index);
}


EDIT:

It's hard to choose between the answers, as both give good examples and explanations.
I would like to accept both for helping out, but I have to choose one.

Thanks!:)

Answer

Manipulating strings is slow because strings are immutable: each time you add to, concatenate, or replace parts of a string, a new string is created.
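To make the immutability point concrete, here is a minimal sketch. The class name `ImmutabilityDemo`, the helper `RemoveSpacesSlow`, and the sample text are made up for illustration. `string.Remove` returns a new string and leaves the original untouched, so removing characters one at a time in a loop allocates a new string for every removal.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: removes every space by calling Remove repeatedly.
    // Each Remove call allocates a new string; the previous one becomes garbage.
    public static string RemoveSpacesSlow(string text)
    {
        int i = 0;
        while (i < text.Length)
        {
            if (text[i] == ' ')
                text = text.Remove(i, 1); // 'text' now refers to a new, shorter string
            else
                i++;
        }
        return text;
    }

    public static void Main()
    {
        string original = "a, b";
        string shorter = original.Remove(2, 1);

        Console.WriteLine(original); // prints "a, b" (unchanged)
        Console.WriteLine(shorter);  // prints "a,b"
        Console.WriteLine(RemoveSpacesSlow("a b  c")); // prints "abc"
    }
}
```

For long inputs with many removals, this pattern copies the remaining text over and over, which is where the slowdown comes from.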

Because string manipulation is very common, the .NET Framework provides another class, StringBuilder (in System.Text), which is mutable and lets you do this efficiently. When you are done, you can get the resulting string by calling the ToString() method on the StringBuilder instance.

Your code could look like this:

// Requires: using System.Linq; (for Contains) and using System.Text; (for StringBuilder)
private static readonly char[] SkipCharacters = new[] {',', '(', ')'};

//Remove all unnecessary spaces
private static string FormatCode(string text)
{
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < text.Length; i++)
    {
        var character = text[i];
        //set defaults - so that we do not have to check
        //for the start and end of the string
        char previous = 'x';
        char next = 'x';
        if (i > 0)
        {
            previous = text[i - 1];
        }
        if (i < text.Length - 1)
        {
            next = text[i + 1];
        }
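        // The inner parentheses matter: && binds tighter than ||, so without
        // them every character followed by a skip character would be dropped.
        if ( character == ' ' &&
                ( SkipCharacters.Contains( previous ) ||
                  SkipCharacters.Contains( next ) ) )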
        {
            continue;
        }
        builder.Append( character );
    }
    return builder.ToString();
}
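
The version above uses one shared set of skip characters, which is not quite the same rule set as the loop in the question. As a sketch only, here is a StringBuilder variant that mirrors the question's conditions one to one: a space is dropped when it follows ',', ')' or ' ', or when it precedes ' ', ',' or '('. The class name `FormatCodeSketch` and the helper `IsOneOf` are illustrative names, not part of the original code. It keeps the question's behavior as written, including the fact that a run of two or more spaces is dropped completely.

```csharp
using System;
using System.Text;

public static class FormatCodeSketch
{
    public static string FormatCode(string text)
    {
        // Capacity = input length: the result can never be longer than the input,
        // so the builder never has to grow.
        var builder = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == ' ')
            {
                bool afterSkip = i > 0 && IsOneOf(text[i - 1], ',', ')', ' ');
                bool beforeSkip = i < text.Length - 1 && IsOneOf(text[i + 1], ' ', ',', '(');
                if (afterSkip || beforeSkip)
                    continue;
            }
            builder.Append(c);
        }
        return builder.ToString();
    }

    // Hypothetical helper: true if 'value' equals any of the three candidates.
    private static bool IsOneOf(char value, char a, char b, char c)
    {
        return value == a || value == b || value == c;
    }

    public static void Main()
    {
        Console.WriteLine(FormatCode("Foo (a, b) ;")); // prints "Foo(a,b);"
    }
}
```

Passing the input length as the initial capacity avoids internal reallocations, at the cost of reserving the same amount of memory as the char[] buffer in the question.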

Using unsafe code might be a little faster than this managed approach, but the gain is offset by the fact that you may waste a lot of space (the buffer is always as large as the whole text) and that the code is potentially dangerous and harder to maintain. That said, if your benchmarks show that the unsafe version performs significantly better, nothing stops you from using it, as long as you are careful :-) .
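
A rough way to run such a benchmark is sketched below, under a few assumptions. The class name `FormatBenchmarkSketch`, the helper names, the synthetic input, and the iteration count are all made up for illustration. It compares a pointer-free char[] buffer (the same idea as in the question, just with ordinary array indexing) against a pre-sized StringBuilder. Note that `new string(char[], int, int)` copies only the requested range, so the unused tail of the buffer never ends up in the result. A Stopwatch loop like this only gives a coarse impression; a dedicated tool such as BenchmarkDotNet would give more reliable numbers.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;

public static class FormatBenchmarkSketch
{
    // Same conditions as the loop in the question.
    private static bool ShouldSkip(string text, int i)
    {
        if (text[i] != ' ') return false;
        bool afterSkip = i > 0 &&
            (text[i - 1] == ',' || text[i - 1] == ')' || text[i - 1] == ' ');
        bool beforeSkip = i < text.Length - 1 &&
            (text[i + 1] == ' ' || text[i + 1] == ',' || text[i + 1] == '(');
        return afterSkip || beforeSkip;
    }

    // Managed version of the buffer idea: no pointers, no unsafe keyword.
    public static string WithCharBuffer(string text)
    {
        var buffer = new char[text.Length];
        int count = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (!ShouldSkip(text, i))
                buffer[count++] = text[i];
        }
        // Only the first 'count' characters are copied into the new string.
        return new string(buffer, 0, count);
    }

    public static string WithStringBuilder(string text)
    {
        var builder = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (!ShouldSkip(text, i))
                builder.Append(text[i]);
        }
        return builder.ToString();
    }

    public static TimeSpan Time(Func<string, string> format, string input, int iterations)
    {
        format(input); // warm-up call so JIT compilation is not measured
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            format(input);
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Synthetic input; replace it with text that resembles the real workload.
        string input = string.Concat(Enumerable.Repeat("Foo (a, b) ; ", 1000));
        const int iterations = 1000;

        Console.WriteLine("char[] buffer: " +
            Time(WithCharBuffer, input, iterations).TotalMilliseconds + " ms");
        Console.WriteLine("StringBuilder: " +
            Time(WithStringBuilder, input, iterations).TotalMilliseconds + " ms");
    }
}
```

To include the pointer-based method from the question in the comparison, it could be passed to `Time` in the same way, provided the project is compiled with unsafe code enabled.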