user3395103 - 5 months ago
Perl Question

How to wait for running process to complete in perl when running process is not child process?

I am going through a Perl script which uses

waitpid($pid, 0)

to wait for the current process to complete. The statement written right after this prints its output before the process has completed.

I want to know why waitpid is not waiting for the process to complete first.

Also, the running process is controlled by a different module that is not part of this Perl script. Only the PID and the name of the process are accessible. I can't change anything in the module which invokes the process.


Update: A simple-minded one-line approach with kill 0, $pid was added at the end, with comments.

We need to detect completion of an external program, which had not been started by this script. The question asks about using waitpid. To copy my early comment:

You cannot. You can only wait on a child process. See perldoc wait (or waitpid, it's the same), first sentence.

The wait and waitpid functions wait for signals delivered to the script regarding the fate of its child(ren). There is no reason for the script to receive such signals about processes that it did not start.
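To see this directly, here is a minimal sketch (the helper name is_waitable is made up for illustration). Called on a PID that is not a child of the script, waitpid does not block: it returns -1 at once, which is why the statement after it runs right away.

```perl
use strict;
use warnings;

# Hypothetical helper: blocks until $pid exits if it is our child and
# returns true; returns false at once if waitpid rejects the PID.
sub is_waitable {
    my ($pid) = @_;
    my $ret = waitpid($pid, 0);
    return $ret == $pid ? 1 : 0;
}

# A real child: waitpid blocks until it exits, then returns its PID.
my $child = fork();
die "fork failed: $!" unless defined $child;
if ($child == 0) { sleep 1; exit 0 }
print "child $child: ", (is_waitable($child) ? "waited" : "not waitable"), "\n";

# Not our child (here: our own parent): returns immediately, no waiting.
my $other = getppid();
print "pid $other: ", (is_waitable($other) ? "waited" : "not waitable"), "\n";
```

The second call is the situation in the question: the PID is valid, but it belongs to a process that this script did not start.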

We know the process's id and its name. Its PID can be used to poll for whether it is running. Using pid on its own is not completely reliable since in between our checks the process can finish and a random new one be assigned the same pid. We can use the program's name to strengthen this.

On a Linux system information about a process can be obtained by utilizing (the many) ps options. Either of these returns the program's full invocation

ps --no-headers -o cmd PID
ps --no-headers -p PID -o cmd

The returned string may start with the interpreter's path (for a Perl script, for example), followed by the program's full name. The version ps -p PID -o comm= returns only the program's name, but I find that it may break that word at a hyphen (if there is one), resulting in an incomplete name. This may need tweaking on some systems, so please consult man ps. If there is no process with the given PID, we get nothing back.

Then we can check for PID and if found check whether the name for that PID matches the program. The program's name is known and we could just hardcode that. However, it is still obtained by the script as it starts, using the above ps command, to avoid ambiguities. (Then it is also in the same format for later comparison.) This itself is checked against the known name since there is no guarantee that the PID at the time of script execution is indeed for the expected program.

use warnings;
use strict;

# For testing. Retrieve your PID as appropriate for real use
my $ext_pid = $ARGV[0] || $$;

my $cmd_get_name = "ps --no-headers -o cmd $ext_pid";

# For testing.  Replace 'sleep' by your program name for real use
my $known_prog_name = 'sleep';

# Get the name of the program with PID
my $prog_name = qx($cmd_get_name);

# Test against the known name, exit if there is a mismatch
if ($prog_name !~ /\Q$known_prog_name\E/) {
    warn "Mismatch between:\n$prog_name\n$known_prog_name";
    exit;
}

# Poll while the PID exists and still belongs to the same program.
# \Q...\E keeps characters in the command line from acting as regex syntax
my $name;
while ( $name = qx($cmd_get_name) and $name =~ /\Q$prog_name\E/ ) {
    print "Sleeping 1 sec ... \n";
    sleep 1;
}
# regex above may need slight adjustment, depending on format of ps return

The command output received via qx() above (backtick operator) contains a newline. If that proves to be a problem in what the script does, it can be chomp-ed, which would require a slight adjustment. The remaining loophole is that the very same program may have finished and been restarted between the checks, with the same PID.
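The chomp adjustment mentioned above could look like the following sketch. The sub names ps_cmd and still_running are invented here. ps is called in list form so that no shell is involved, and the name is compared with index rather than a regex so that no quoting is needed. It assumes a procps-style ps, as used above.

```perl
use strict;
use warnings;

# Hypothetical helper: command line of $pid as reported by ps, without
# the trailing newline; undef if ps printed nothing (no such process).
sub ps_cmd {
    my ($pid) = @_;
    open(my $ps, '-|', 'ps', '--no-headers', '-o', 'cmd', '-p', $pid)
        or die "Cannot run ps: $!";
    my $line = <$ps>;
    close $ps;    # ps exits non-zero when the PID is absent; not an error here
    return undef unless defined $line;
    chomp $line;
    return length($line) ? $line : undef;
}

# True while $pid exists and its command line still contains $name.
sub still_running {
    my ($pid, $name) = @_;
    my $cmd = ps_cmd($pid);
    return (defined($cmd) && index($cmd, $name) >= 0) ? 1 : 0;
}

# Usage: script.pl PID NAME
if (@ARGV == 2) {
    my ($pid, $name) = @ARGV;
    sleep 1 while still_running($pid, $name);
    print "$name ($pid) is gone\n";
}
```

The loophole described above is unchanged: a restart of the same program under the same PID between two polls would go unnoticed.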

This can be tested by running the following in a shell, where script.pl stands for wherever the script above was saved

sleep 30 &
script.pl `ps aux | grep sleep | grep -v grep | awk '{print $2}'`

The backticks substitute the PID column of the filtered ps output into the script's command line. If several processes match, the output contains multiple words and the script uses the first one as the PID. If there is a problem with this, run the ps filtering first and then manually enter the PID as the script's input argument. The sleep of 30 seconds above is to give enough time to do all this on the command line.

The code can be simplified by matching $name with a hard-coded $prog_name, if the program name is unique enough.

If the process is owned by the same user as the script one can use kill 0, $pid, as

while ( kill 0, $ext_pid ) { sleep 1 }

Then you'd either have to make another call to check the name, or be content with the (small) possibility that $ext_pid no longer refers to the process you expect.
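The kill 0 check can also be made to cover a process owned by another user, as in this sketch (the sub name pid_alive is invented here). kill 0 sends no signal; it returns false with $! set to EPERM when the process exists but we may not signal it, and to ESRCH when there is no such process.

```perl
use strict;
use warnings;

# Hypothetical helper: true if some process currently has this PID.
# Says nothing about which program it is, so the PID-reuse caveat stands.
sub pid_alive {
    my ($pid) = @_;
    return 1 if kill 0, $pid;    # we are allowed to signal it, so it exists
    return $!{EPERM} ? 1 : 0;    # exists but is owned by someone else
}

# Usage: script.pl PID
if (@ARGV) {
    my $ext_pid = $ARGV[0];
    sleep 1 while pid_alive($ext_pid);
    print "Process $ext_pid is gone\n";
}
```

This avoids spawning ps on every poll, at the cost of not knowing which program holds the PID.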

The module Proc::ProcessTable can be used for all of this instead. On a Windows system this would be the way to go.