Rolf of Saxony - 1 year ago 77
Python Question

# Why does "".join() appear to be slower than +=?

I have read the question "Why is ''.join() faster than += in Python?" and its answers, as well as this great explanation of the code behind the curtain: https://paolobernardi.wordpress.com/2012/11/06/python-string-concatenation-vs-list-join/

Yet my tests suggest otherwise, and I am baffled.

Am I doing something simple incorrectly? I'll admit that I'm fudging the creation of x a bit, but I don't see how that would affect the outcome.

``````
import time
x="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
y=""
t1 = (time.time())
for i in range(10000):
    y+=x
t2 = (time.time())
#print (y)
print (t1,t2,"=",t2-t1)
``````

(1473524757.681939, 1473524757.68521, '=', 0.0032711029052734375)

``````
import time
x="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
y=""
t1 = (time.time())
for i in range(10000):
    y=y+x
t2 = (time.time())
#print (y)
print (t1,t2,"=",t2-t1)
``````

(1473524814.544177, 1473524814.547544, '=', 0.0033669471740722656)

``````
import time
x=10000*"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
y=""
t1 = (time.time())
y= "".join(x)
t2 = (time.time())
#print (y)
print (t1,t2,"=",t2-t1)
``````

(1473524861.949515, 1473524861.978755, '=', 0.029239892959594727)

As can be seen, `"".join()` is much slower, and yet we're told that it's meant to be quicker.

These values are very similar in both Python 2.7 and Python 3.4.

Edit:
OK, fair enough. I'll delete this once it's reached the maximum number of down votes.
The "one huge string" thing is the kicker.

``````
import time
x=[]
for i in range(10000):
    x.append("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
y=""
t1 = (time.time())
y= "".join(x)
t2 = (time.time())
#print (y)
print (t1,t2,"=",t2-t1)
``````

(1473526344.55748, 1473526344.558409, '=', 0.0009288787841796875)

An order of magnitude quicker.
Mea culpa!

You called `''.join()` on one huge string, not a list (multiplying a string produces a larger string). This forces `str.join()` to iterate over that huge string, joining 740,000 individual `'x'` characters (10,000 copies of a 74-character string). In other words, your `str.join()` test does 74 times more work than your concatenation tests.
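To make the difference concrete, here is a small sketch of what `str.join()` gets to iterate over in each case. The names `piece`, `as_string` and `as_list` are mine, and I am assuming the literal in your question is 74 characters long:

```python
piece = "x" * 74                # stand-in for the 74-character literal (assumed length)

as_string = 10000 * piece       # multiplying a str gives one long str
as_list = [piece] * 10000       # a list holding 10,000 separate strings

# Iterating over a str yields one-character strings,
# so join() sees one item per character.
items_from_string = sum(1 for _ in as_string)
items_from_list = sum(1 for _ in as_list)

print(type(as_string).__name__, items_from_string)   # str 740000
print(type(as_list).__name__, items_from_list)       # list 10000

# Both inputs join to the same result; only the number of items differs.
assert "".join(as_string) == "".join(as_list)
```

The final string is identical either way, which is why the mistake is easy to miss when you only look at the output.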

To conduct a fair trial, you need to start with the same inputs for both, and use the `timeit` module to reduce the influence of garbage collection and other processes on your system.

That means both approaches need to work from a list of strings (your assignment examples rely on repeatedly adding a string literal, stored as a constant):

``````
from timeit import timeit

testlist = ['x' * 74 for _ in range(100)]

def strjoin(testlist):
    return ''.join(testlist)

def inplace(testlist):
    result = ''
    for element in testlist:
        result += element
    return result

def concat(testlist):
    result = ''
    for element in testlist:
        result = result + element
    return result

for f in (strjoin, inplace, concat):
    timing = timeit('f(testlist)', 'from __main__ import f, testlist',
                    number=100000)
    print('{:>7}: {}'.format(f.__name__, timing))
``````

On my MacBook Pro, on Python 3.5, this produces:

``````
strjoin: 0.09923043003072962
inplace: 1.0032496969797648
 concat: 1.0027298880158924
``````

On 2.7, I get:

``````
strjoin: 0.118290185928
inplace: 0.85814499855
 concat: 0.867822885513
``````

`str.join()` is still the winner here.
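If you want to rerun the comparison at the size used in the question (10,000 strings rather than 100), something along these lines should do. It is only a sketch: `pieces`, `number=20` and `repeat=3` are arbitrary choices of mine, and the timings will depend on your machine and Python version, so I am not quoting any. It uses `timeit.repeat()` and takes the minimum of the runs, since the fastest run is the one least disturbed by other processes:

```python
from timeit import repeat

# Same shape as the question's data: 10,000 strings of 74 characters.
pieces = ["x" * 74 for _ in range(10000)]

def strjoin(parts):
    return "".join(parts)

def inplace(parts):
    result = ""
    for part in parts:
        result += part
    return result

# Both approaches must build the same string for the timings to be comparable.
assert strjoin(pieces) == inplace(pieces)

for func in (strjoin, inplace):
    # repeat() returns one total time per run (each run calls func 20 times).
    timings = repeat(lambda: func(pieces), number=20, repeat=3)
    print("{:>7}: {:.6f}".format(func.__name__, min(timings)))
```

Checking that both functions return the same string first guards against the kind of mismatch in the question, where the two tests were not given equivalent inputs.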
