Felipe Santos - 5 months ago
Python Question

How to safely manipulate user code in Python?

Suppose I want to write a program that takes a sorting algorithm as input and determines empirically whether or not the algorithm is partially correct. How would I convert the string input into an executable program? I have read other threads that suggested using exec or eval, but every answer recommended against them because of the security risks. Is there a way to write such a program that does not involve converting a string to executable code, or will it be inherently risky no matter the implementation? Lastly, is there another programming language that would be a better fit for such a program?


Executing Arbitrary Code

No matter what language you choose, if you read code from the user and execute it, it will be dangerous. No ifs, ands, or buts. The same caveats that apply to Python's exec and eval also apply to their equivalents in JavaScript, PHP, and many other languages.
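
To make the danger concrete, here is a minimal sketch of why a common "fix" (stripping the builtins before calling eval) does not make things safe. The variable names are only illustrative. The expression below is harmless in itself, but it shows that evaluated code can still reach every class loaded in the interpreter without naming a single builtin.

```python
# Attempted "sandbox": the evaluated expression gets no builtins at all.
restricted_globals = {'__builtins__': {}}

# The expression never names a builtin. It walks from a tuple literal up to
# `object` and asks for all of its direct subclasses.
user_expression = "().__class__.__base__.__subclasses__()"

leaked_classes = eval(user_expression, restricted_globals)

# The "sandboxed" code now holds references to real interpreter classes.
print(len(leaked_classes) > 0)
print(int in leaked_classes)
```

From a list like this, a determined user can usually find a class that leads back to file or process access, which is why restricting the namespace is not treated as a security boundary.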

Safely Executing Code from a String

There are safe ways to map strings to predefined functions, but there is no safe way to compile/interpret and execute arbitrary code.

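One good example of how to safely map strings to functions is the following:

functions = {
    'print': print,
    'str': str,
    'int': int,
}

# The input is only ever used as a dictionary key; it is never executed.
name = input('Choose one of the above functions: ')
if name not in functions:
    raise SystemExit('Unknown function: ' + name)
chosen = functions[name]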
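
Applied to the sorting question, the same idea could look like the sketch below. It assumes the algorithms are written ahead of time and the user only picks one by name; bubble_sort, ALGORITHMS, and run_named_sort are hypothetical names chosen for illustration.

```python
def bubble_sort(items):
    """Plain bubble sort that returns a new sorted list."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data


# Whitelist: only these callables can ever be reached from user input.
ALGORITHMS = {
    'builtin': sorted,
    'bubble': bubble_sort,
}


def run_named_sort(name, items):
    """Look the name up in the whitelist and call the matching function."""
    try:
        sort_fn = ALGORITHMS[name]
    except KeyError:
        raise ValueError(f'unknown algorithm: {name!r}') from None
    return sort_fn(items)


print(run_named_sort('bubble', [3, 1, 2]))
```

A string such as "__import__('os')" is just a key that is not in the dictionary, so it is rejected with a ValueError instead of being executed. The trade-off is that the user cannot supply a new algorithm this way.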

Static Code Analysis

As for avoiding execution altogether: potentially, but in practice no. There are ways of evaluating a sorting algorithm without running it, but they are unlikely to be effective, reproducible, or accurate unless the code is compiled or at least interpreted. Static code analysis is difficult and can only go so far.

One simple example of how difficult static code analysis can be, even with a single if statement, is the following:

for index, value in enumerate(range(10)):
    # index is 0 on the first pass, so `and` short-circuits and old is not read
    if index and old - value == 1:
        pass
    old = value

Some static analysis tools (Pylint, for example) report an error for this code because old is assigned after the line where it is first used. However, since bool(0) evaluates to False, old is only ever checked from the second iteration onward, after it has already been defined, so the code runs without issue.
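
The gap between what a static check sees and what happens at run time can be reproduced with a deliberately naive checker built on the standard ast module. This is only a sketch and not how Pylint works internally. The function name is hypothetical, and the snippet is a loop of the same shape in which the read of old is guarded by `index and ...`.

```python
import ast
import builtins

SNIPPET = """
for index, value in enumerate(range(10)):
    if index and old - value == 1:
        pass
    old = value
"""


def names_read_before_assignment(source):
    """Naive check: flag names first read before any assignment in source order."""
    tree = ast.parse(source)
    names = [node for node in ast.walk(tree) if isinstance(node, ast.Name)]
    # ast.walk is breadth-first, so sort by position to get textual order.
    names.sort(key=lambda node: (node.lineno, node.col_offset))
    assigned, flagged = set(), []
    for node in names:
        if isinstance(node.ctx, ast.Store):
            assigned.add(node.id)
        elif node.id not in assigned and not hasattr(builtins, node.id):
            if node.id not in flagged:
                flagged.append(node.id)
    return flagged


flagged = names_read_before_assignment(SNIPPET)
print(flagged)

# Running the same snippet (our own fixed text, not user input) raises nothing.
exec(SNIPPET, {})
```

The checker reports old because it only looks at textual order, while the actual run never reads old before it has been assigned. Doing better means reasoning about control flow and values, which is where static analysis gets hard.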

Think of the complexity of the inputs, the complexity of the outputs, and the number of variations of a sorting algorithm that would all be equivalent. The easiest way to test code is to run it. Dynamic analysis has its own limitations, since it can only show that the code works for the inputs you tried. But by running the code on given inputs and comparing the results to the desired outputs, you can get a good idea of whether it works as it should, which is very difficult with static analysis alone.
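
A minimal sketch of such an empirical check is shown below. It assumes the algorithm is already available as a Python callable that returns a new sorted list; check_sort, insertion_sort, and broken_sort are hypothetical names, and the trial counts and value ranges are arbitrary choices.

```python
import random


def check_sort(sort_fn, trials=200, max_len=20, seed=0):
    """Return the first random input that sort_fn gets wrong, or None."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        size = rng.randint(0, max_len)
        data = [rng.randint(-50, 50) for _ in range(size)]
        # Pass a copy so an in-place sort cannot alter the reference input.
        if sort_fn(list(data)) != sorted(data):
            return data
    return None


def insertion_sort(items):
    """Correct reference implementation used to exercise the checker."""
    out = []
    for item in items:
        pos = len(out)
        while pos > 0 and out[pos - 1] > item:
            pos -= 1
        out.insert(pos, item)
    return out


def broken_sort(items):
    """Deliberately wrong: silently drops duplicate values."""
    return sorted(set(items))


print(check_sort(insertion_sort))
print(check_sort(broken_sort) is not None)
```

Passing every trial only shows that no counterexample was found among the inputs tried, not that the algorithm is correct. The harness also does nothing to contain hostile code: it still has to call the function, so it is only appropriate for code you already trust.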