I have a list of numbers. This list is stored in two ways: either as an in-memory Python object, or as a Redis list (Redis running on the same server).
I'm comparing the time it takes to retrieve these two lists, using Python's timeit module:
import timeit
import redis
POOL = redis.ConnectionPool(host='127.0.0.1', port=6379, db=0)
my_server = redis.Redis(connection_pool=POOL)
# Time fetching the full list from Redis (best of 7 runs, 1000 calls each)
print(min(timeit.Timer('pylist1 = my_server.lrange("nums:5", 0, -1)', setup='from __main__ import my_server').repeat(7, 1000)))
pylist = my_server.lrange("nums:5", 0, -1)
# Time binding a new name to the list that is already in memory
print(min(timeit.Timer('pylist2 = pylist', setup='from __main__ import pylist').repeat(7, 1000)))
In the comparison you've put up here, the second case is basically just measuring how long Python takes to bind a new name to an existing value. So it doesn't surprise me that this is vastly faster than communicating with a different process (Redis). What does surprise me is that you would consider fetching the value from Redis at all when you have the option of simply keeping it in memory.
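To make the point about name binding concrete, here is a minimal sketch. The list contents are made up for illustration and stand in for whatever `lrange` returned. It shows that `pylist2 = pylist` copies nothing, whereas even a plain in-process copy does real work:

```python
import timeit

# Hypothetical stand-in for the list fetched from Redis.
pylist = [str(n) for n in range(1000)]

# Assignment binds a second name to the same list object; nothing is copied.
pylist2 = pylist
same_object = pylist2 is pylist

# A shallow copy builds a new list, so its cost grows with the list length.
pylist3 = list(pylist)
is_copy = (pylist3 is not pylist) and (pylist3 == pylist)

# Time both operations; the setup string rebuilds the same hypothetical list.
setup = 'pylist = [str(n) for n in range(1000)]'
bind_time = min(timeit.Timer('pylist2 = pylist', setup=setup).repeat(3, 1000))
copy_time = min(timeit.Timer('pylist3 = list(pylist)', setup=setup).repeat(3, 1000))

print(same_object, is_copy)
print(bind_time, copy_time)
```

The exact timings depend on the machine, but the binding case should come out far cheaper than the copy, and both avoid the socket round trip that a Redis call involves.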
So you need to be clearer about why you are using Redis for this in the first place. It will always be slower than in-process memory; no benchmark is needed for that. You need to ask, "Why am I not just using Python lists and dictionaries?" There are several valid answers: your data is too large to fit into memory; you need cache-specific features, such as letting values expire after a while; or you want to use it for IPC or persistence. Once you know the answer, it will inform the benchmarking you want to do, and the question becomes more like "How do I obtain the benefits and features I listed above for the least performance penalty?" Redis may not be the only answer. You might consider shelve for persistence, or perhaps even a full-on relational database, or Mongo, or whatever.
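If persistence turns out to be the reason, a rough sketch of the shelve option could look like the following. The file location and the key name are assumptions picked for this example, not anything from the question:

```python
import os
import shelve
import tempfile

# Hypothetical storage path; a real application would use a fixed location.
path = os.path.join(tempfile.mkdtemp(), 'nums_store')

# Write the list to disk under a key (the key name is just an example).
db = shelve.open(path)
db['nums:5'] = [1, 2, 3, 4, 5]
db.close()

# Reopen the shelf, as a later run of the program would, and read it back.
db = shelve.open(path)
restored = db['nums:5']
db.close()

print(restored)
```

This gives persistence without a separate server process, but it is not shared safely between concurrent writers, so it would not cover the IPC use case.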
In short, once you have a good idea of why, the how often solves itself.