Arty Arty - 6 months ago 29
Python Question

Passing a bytearray from Python to C and returning it

I need to XOR every byte of a bytearray quickly.
The pure Python version

for i in range(len(str1)): str1[i]=str1[i] ^ 55


is very slow.

So I wrote this module in C.
I know C very poorly and had never written anything in it before.

With this variant

PyArg_ParseTuple (args, "s", &str)


everything works as expected. However, I need to use "s*" instead of "s", because the data can contain embedded null bytes. When I change "s" to "s*", Python crashes on the call:

PyArg_ParseTuple (args, "s*", &str) // crash


In case another beginner like me wants to use my example as a starting point, I am including everything needed to build and run it on Windows.

Argument parsing and value building are documented at http://docs.python.org/dev/c-api/arg.html

test_xor.c

#include <Python.h>

static PyObject* fast_xor(PyObject* self, PyObject* args)
{
    const char* str;
    int i;

    if (!PyArg_ParseTuple(args, "s", &str))
        return NULL;

    for (i = 0; i < sizeof(str); i++) { str[i] ^= 55; }
    return Py_BuildValue("s", str);
}

static PyMethodDef fastxorMethods[] =
{
    {"fast_xor", fast_xor, METH_VARARGS, "fast_xor desc"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
initfastxor(void)
{
    (void) Py_InitModule("fastxor", fastxorMethods);
}


test_xor.py

import fastxor
a=fastxor.fast_xor("World") # works with "s"
print a
a=fastxor.fast_xor("Wo\0rld") # does not work with "s" (embedded null); this is why I want "s*"


compile.bat

rem use http://bellard.org/tcc/
tiny_impdef.exe C:\Python26\python26.dll
tcc -shared test_xor.c python26.def -IC:\Python26\include -LC:\Python26\libs -ofastxor.pyd
test_xor.py

Answer

You don't need to build an extension module to do this quickly; NumPy can do it. To answer your question as asked, though, you need C code like this:

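/* Must come before Python.h so that "s#" lengths are Py_ssize_t, not int */
#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <stdlib.h>

static PyObject * fast_xor(PyObject* self, PyObject* args)
{
    const char* str;
    char * buf;
    Py_ssize_t count;
    Py_ssize_t i;
    PyObject * result;

    /* "s#" returns the pointer and the length, so embedded nulls are fine */
    if (!PyArg_ParseTuple(args, "s#", &str, &count))
    {
        return NULL;
    }

    /* Work on a copy: the input string is immutable */
    buf = (char *)malloc(count ? count : 1);
    if (buf == NULL)
    {
        return PyErr_NoMemory();
    }

    for (i = 0; i < count; i++)
    {
        buf[i] = str[i] ^ 55;
    }

    /* Build a new Python string from exactly count bytes */
    result = Py_BuildValue("s#", buf, count);
    free(buf);
    return result;
}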

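You can't modify the contents of a string object in place, because strings in Python are immutable. Use "s#" to get both the char * pointer and the buffer length; since the length is passed explicitly, embedded nulls are handled correctly. The "s*" format crashes in your code because it expects the address of a Py_buffer structure rather than of a char *, so the call writes past your pointer variable.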
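The immutability point can also be seen from the Python side. The following is only a sketch, assuming Python 3, where bytes plays the role of the Python 2 str used in this thread; the helper names xor_copy and xor_in_place are hypothetical and not part of the fastxor module.

```python
KEY = 55

def xor_copy(data, key=KEY):
    """Return a new bytes object; the input (possibly immutable) is left untouched."""
    return bytes(b ^ key for b in data)

def xor_in_place(buf, key=KEY):
    """Modify a mutable bytearray in place, as the loop in the question does."""
    for i in range(len(buf)):
        buf[i] ^= key

immutable = b"Wo\0rld"            # embedded null byte, length 6
try:
    immutable[0] = immutable[0] ^ KEY
    mutated = True
except TypeError:
    mutated = False               # bytes objects reject item assignment

copy = xor_copy(immutable)        # a new object of the same length

mutable = bytearray(immutable)
xor_in_place(mutable)             # allowed: bytearray is mutable
```

The C function above follows the same pattern as xor_copy: it reads the input, writes into a separate buffer, and returns a new object.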

If you can use NumPy:

In [1]: import fastxor

In [2]: a = "abcdsafasf12q423\0sdfasdf"

In [3]: fastxor.fast_xor(a)
Out[3]: 'VUTSDVQVDQ\x06\x05F\x03\x05\x047DSQVDSQ'


In [5]: import numpy as np

In [6]: (np.frombuffer(a, np.int8)^55).tostring()
Out[6]: 'VUTSDVQVDQ\x06\x05F\x03\x05\x047DSQVDSQ'

In [7]: a = a*10000

In [8]: %timeit fastxor.fast_xor(a)
1000 loops, best of 3: 877 us per loop

In [15]: %timeit (np.frombuffer(a, np.int8)^55).tostring()
1000 loops, best of 3: 1.15 ms per loop
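If neither a C extension nor NumPy is available, a 256-entry translation table is another option worth timing. This is a sketch only, assuming Python 3; xor_bytes and XOR_TABLE are hypothetical names, and no timing is claimed for it here.

```python
KEY = 55

# Precompute the XOR of every possible byte value with the key once.
XOR_TABLE = bytes(b ^ KEY for b in range(256))

def xor_bytes(data):
    """XOR every byte of a bytes or bytearray object with KEY.

    translate() maps each byte through the table inside the interpreter's
    C code, so there is no per-byte Python loop. It returns a new object
    of the same type as the input.
    """
    return data.translate(XOR_TABLE)

sample = b"abcdsafasf12q423\0sdfasdf"   # same test string as above
encoded = xor_bytes(sample)
decoded = xor_bytes(encoded)            # XOR with the same key is its own inverse
```

The table has to be rebuilt only when the key changes, so the setup cost is paid once per key rather than once per call.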