AK47 - 6 months ago 43

Python Question

Q. Write an algorithm that returns the second largest number in an array

`a = [1, 2, 3, 4, 5]`

print(max([x for x in a if x != max(a)]))

>> 4

I'm trying to figure out how this algorithm works and whether or not pythons internal magic will make this as efficient as writing a linear algorithm which just loops over the list

`a`

Correct me if I'm wrong:

- The call to would be O(n)
`max(a)`

- would also be O(n)
`[x for x in a]`

Would python be smart enough to cache the value of

`max(a)`

And then the final

`max([listcomp])`

Is there any fancy business going on internally that would cache the

`max(a)`

Recommended for you: Get network issues from **WhatsUp Gold**. **Not end users.**

Answer Source

The easy way to find out is timing it. Consider this timing code:

```
for i in range(1, 5):
a = list(range(10**i))
%timeit max([x for x in a if x != max(a)])
```

17.6 µs ± 178 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 698 µs ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) 61.6 ms ± 340 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) 6.31 s ± 167 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Each time it multiplied the number of elements by 10 the runtime increased by 100. That's almost certainly `O(n**2)`

. For an `O(n)`

algorithm the runtime would increase linearly with the number of elements:

```
for i in range(1, 6):
a = list(range(10**i))
max_ = max(a)
%timeit max([x for x in a if x != max_])
```

4.82 µs ± 27.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) 29 µs ± 161 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each) 262 µs ± 3.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) 2.42 ms ± 13 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 24.9 ms ± 231 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

But I'm not certain the algorithm really does what is asked. Consider the list `a=[1,3,3]`

even the `heapq`

module tells me that the second largest element is `3`

(not 1 - what your algorithm returns):

```
import heapq
>>> heapq.nlargest(2, [1,3,3])[0]
3
```

Recommended from our users: **Dynamic Network Monitoring from WhatsUp Gold from IPSwitch**. ** Free Download**