Jean-Paul Calderone - 8 months ago

How do I drive Ansible programmatically and concurrently?

I would like to use Ansible to execute a simple job on several remote nodes concurrently. The actual job involves grepping some log files and then post-processing the results on my local host (which has software not available on the remote nodes).

The command line ansible tools don't seem well-suited to this use case because they mix together ansible-generated formatting with the output of the remotely executed command. The Python API seems like it should be capable of this though, since it exposes the output unmodified (apart from some potential unicode mangling that shouldn't be relevant here).

A simplified version of the Python program I've come up with looks like this:

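from sys import argv
import ansible.inventory
import ansible.runner

# Run "sleep 10" through the command module on every host in the inventory
# file named on the command line, using up to 10 parallel forks.
runner = ansible.runner.Runner(
    pattern='*', forks=10,
    module_name='command', module_args='sleep 10',
    inventory=ansible.inventory.Inventory(argv[1]),
)
results = runner.run()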

Here, sleep 10 stands in for the actual log-grepping command; the idea is just to simulate a command that will not complete immediately.
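
For context on the post-processing step, here is a minimal sketch of how the value returned by Runner.run() could be consumed locally. It assumes the Ansible 1.x result shape (a dictionary with "contacted" and "dark" keys); the sample data, host names, and both helper functions are hypothetical and only illustrate the idea.

```python
# Assumed result shape for the Ansible 1.x Runner API: "contacted" maps each
# reachable host to its module result, "dark" maps unreachable hosts to an
# error description. The data below is made up for illustration.
sample_results = {
    "contacted": {
        "node1": {"stdout": "ERROR disk full\nERROR timeout", "rc": 0},
        "node2": {"stdout": "", "rc": 1},
    },
    "dark": {
        "node3": {"failed": True, "msg": "SSH timed out"},
    },
}


def collect_stdout(results):
    """Return {host: list of stdout lines} for every contacted host."""
    lines_by_host = {}
    for host, result in results.get("contacted", {}).items():
        lines_by_host[host] = result.get("stdout", "").splitlines()
    return lines_by_host


def unreachable_hosts(results):
    """Return the sorted names of hosts that could not be contacted."""
    return sorted(results.get("dark", {}))


if __name__ == "__main__":
    for host, lines in sorted(collect_stdout(sample_results).items()):
        print(host, len(lines), "matching lines")
    print("unreachable:", unreachable_hosts(sample_results))
```

Because the remote output arrives as plain strings, any local-only tooling can be applied to the collected lines afterwards.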

However, upon running this, I observe that the amount of time taken seems proportional to the number of hosts in my inventory. Here are the timing results against inventories with 2, 5, and 9 hosts respectively:

exarkun@top:/tmp$ time python two-hosts.inventory
real 0m24.285s
user 0m0.216s
sys 0m0.120s
exarkun@top:/tmp$ time python five-hosts.inventory
real 0m55.120s
user 0m0.224s
sys 0m0.160s
exarkun@top:/tmp$ time python nine-hosts.inventory
real 1m57.272s
user 0m0.360s
sys 0m0.284s
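
As a quick sanity check on those numbers, the following sketch divides each wall-clock time by the host count. The timings are copied from the runs above; the function name is just illustrative. If the forks were really running in parallel, the total would stay near 10 seconds instead of the per-host figure doing so.

```python
# Wall-clock ("real") times from the runs above, keyed by number of hosts.
real_seconds = {2: 24.285, 5: 55.120, 9: 60 + 57.272}

SLEEP_SECONDS = 10  # duration of the remote command


def seconds_per_host(timings):
    """Return {host_count: wall-clock seconds divided by host_count}."""
    return {hosts: total / hosts for hosts, total in timings.items()}


per_host = seconds_per_host(real_seconds)

if __name__ == "__main__":
    for hosts in sorted(per_host):
        print("%d hosts: %.1f s per host" % (hosts, per_host[hosts]))
```

Each run works out to roughly 11 to 13 seconds per host, which is what one would expect from a 10 second command plus connection overhead executed one host at a time.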

Some other random observations:

  • ansible all --forks=10 -i five-hosts.inventory -m command -a "sleep 10"
    exhibits the same behavior

  • ansible all -c local --forks=10 -i five-hosts.inventory -m command -a "sleep 10"
    appears to execute things concurrently (but only works for local-only connections, of course)

  • ansible all -c paramiko --forks=10 -i five-hosts.inventory -m command -a "sleep 10"
    appears to execute things concurrently

Perhaps this suggests the problem is with the ssh transport and has nothing to do with using ansible via the Python API as opposed to from the command line.

What is wrong here that prevents the default transport from taking only around ten seconds regardless of the number of hosts in my inventory?


Some investigation reveals that ansible is looking for the hosts in my inventory in ~/.ssh/known_hosts. My configuration has HashKnownHosts enabled. ansible is never able to find the host entries it is looking for because it doesn't understand the hashed known_hosts entry format.
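
To illustrate why a plain text search for a host name fails, here is a small sketch of the hashed entry format as OpenSSH writes it: the host field becomes |1|salt|digest, where the digest is an HMAC-SHA1 of the host name keyed with the salt, and both parts are base64 encoded. The function names and example host names are hypothetical, and the plain-entry branch ignores wildcards and [host]:port forms for brevity.

```python
import base64
import hashlib
import hmac

HASH_MAGIC = "|1|"  # prefix OpenSSH uses for hashed host fields


def hash_host(hostname, salt):
    """Build a hashed known_hosts host field for hostname using salt (bytes)."""
    digest = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return "%s%s|%s" % (
        HASH_MAGIC,
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def entry_matches(host_field, hostname):
    """Return True if a known_hosts host field refers to hostname."""
    if not host_field.startswith(HASH_MAGIC):
        # Unhashed entry: a comma-separated list of names and addresses.
        return hostname in host_field.split(",")
    try:
        salt_b64, digest_b64 = host_field[len(HASH_MAGIC):].split("|")
    except ValueError:
        return False  # malformed hashed field
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


if __name__ == "__main__":
    field = hash_host("node1.example.com", b"\x01" * 20)
    print(field)
    print(entry_matches(field, "node1.example.com"))
```

The host name never appears in the hashed field, so a lookup that only compares strings cannot match it; the salt has to be read from each entry and the HMAC recomputed.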

Whenever ansible's ssh transport can't find the known hosts entry, it acquires a global lock for the duration of the module's execution. The result of this confluence is that all execution is effectively serialized.
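
The effect can be modelled with a short simulation. This is only a sketch of the behaviour described above, not ansible's actual code: it uses threads and a threading.Lock in place of ansible's forked workers and its lock, and run_hosts, known_hosts, and the host names are made up for illustration.

```python
import threading
import time


def run_hosts(hosts, known_hosts, task_seconds):
    """Run one simulated task per host and return the elapsed wall-clock time.

    Hosts missing from known_hosts hold a shared lock for the whole task,
    mimicking the serialization described above.
    """
    lock = threading.Lock()

    def run_one(host):
        if host in known_hosts:
            time.sleep(task_seconds)  # no lock needed: runs in parallel
        else:
            with lock:  # lock held for the full duration of the task
                time.sleep(task_seconds)

    threads = [threading.Thread(target=run_one, args=(h,)) for h in hosts]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start


if __name__ == "__main__":
    hosts = ["a", "b", "c", "d"]
    print("no hosts known:  %.2f s" % run_hosts(hosts, set(), 0.1))
    print("all hosts known: %.2f s" % run_hosts(hosts, set(hosts), 0.1))
```

With no known hosts the elapsed time grows with the number of hosts, while with every host known it stays close to the duration of a single task, matching the timings in the question.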

A temporary work-around is to give up some security and disable host key checking by putting host_key_checking = False into ~/.ansible.cfg. Another work-around is to use the paramiko transport (but this is incredibly slow, perhaps tens or hundreds of times slower than the ssh transport, for some reason). A third work-around is to let some unhashed entries get added to the known_hosts file for ansible's ssh transport to find.
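
For the first work-around, the sketch below renders the configuration fragment as text rather than touching any real file. It assumes the setting belongs in the [defaults] section of the config file, which is where ansible's general options normally live; render_ansible_cfg is a hypothetical helper.

```python
import configparser
import io


def render_ansible_cfg(host_key_checking):
    """Return ini text with host_key_checking set in a [defaults] section."""
    config = configparser.ConfigParser()
    # Assumption: host_key_checking is read from the [defaults] section.
    config["defaults"] = {"host_key_checking": str(host_key_checking)}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Prints the fragment that would go into ~/.ansible.cfg.
    print(render_ansible_cfg(False))
```

Keep in mind that this turns off a real safety check, so it is better suited to diagnosing the slowdown than to permanent use.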