Eytan Eytan - 5 months ago 9
Linux Question

ssh + ssh is stuck on remote machine

I wrote the following simple script to reboot ~100 Linux machines:

for LINUX_MACHINE in $LIST_OF_LINUX_MACHINES
do
  ssh "$LINUX_MACHINE" /var/tmp/restart.sh
done


After running this script a couple of times, the ssh process sometimes gets stuck, and the loop hangs on the current machine.

What could cause ssh to hang in these rare cases, and how can I avoid it?

Answer

I'd suggest something rather different -- instead of having a fixed delay between instances, set a fixed maximum number of instances to run at a time. For instance, with that value at 25:

numprocs=25
timeout=5
xargs -P "$numprocs" -J '{}' -n 1 -- \
  perl -e 'alarm shift; exec @ARGV' -- "$timeout" \
    ssh -nxaq -o ConnectTimeout=5 -o StrictHostKeyChecking=no '{}' /tmp/reboot.sh \
  <hostnames # if a file; use < <(awk ...) if a script providing per-line info

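The perl wrapper sets an alarm of $timeout seconds and then execs ssh, so a session that is still running when the alarm fires is killed instead of blocking everything behind it; -n additionally keeps ssh from reading stdin.

Note that -J {} is a BSD xargs extension (GNU xargs does not provide it) which avoids bugs implicit in the specification for the (standards-mandated) -I {} xargs behavior. If it's not available, -I '{}' can be used instead -- but do read the man page to understand caveats.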
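For systems where -J is not available, the same idea can be sketched with -I. This is only an illustration under a few assumptions, not a drop-in replacement: it assumes timeout(1) from GNU coreutils is installed (used here in place of the perl alarm wrapper), and the names reboot_hosts, SSH_CMD and timeout_secs are made up for the example. SSH_CMD exists only so the function can be tried with a harmless stand-in such as echo. The `|| echo` is there because ssh exits with status 255 on connection errors, and xargs stops reading further input when a command exits with 255.

```shell
#!/usr/bin/env bash
# Sketch: bounded-parallel reboot using the portable -I option.
# Assumptions: GNU coreutils timeout(1) is available; reboot_hosts,
# SSH_CMD and timeout_secs are hypothetical names for this example.

numprocs=25      # maximum number of simultaneous ssh sessions
timeout_secs=5   # hard per-host limit enforced by timeout(1)

reboot_hosts() {
  # Reads one hostname per line from stdin.
  # The hostname is passed to sh as a positional parameter ($3), so it is
  # never spliced into the script text. "|| echo" turns any failure
  # (including ssh's exit status 255) into a logged message, so xargs
  # keeps processing the remaining hosts.
  xargs -P "$numprocs" -I '{}' sh -c '
      timeout "$1" "$2" -nxaq -o ConnectTimeout=5 -o StrictHostKeyChecking=no \
        "$3" /tmp/reboot.sh || echo "FAILED: $3" >&2
    ' sh "$timeout_secs" "${SSH_CMD:-ssh}" '{}'
}
```

A dry run such as `printf 'host1\nhost2\n' | SSH_CMD=echo reboot_hosts` prints the ssh arguments that would be used for each host without connecting anywhere; the real invocation would be `reboot_hosts <hostnames`.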