NinjaKC NinjaKC - 28 days ago 15
Linux Question

PHP pthreads failing when run from cron

Ok, so lets start slow...

I have a pthreads script running and working for me, tested and working 100% of the time when I run it manually from the command line via ssh. The script is as follows with the main thread process code adjusted to simulate random process' run time.

class ProcessingPool extends Worker {
public function run(){}
}
class LongRunningProcess extends Threaded implements Collectable {
public function __construct($id,$data) {
$this->id = $id;
$this->data = $data;
}

public function run() {
$data = $this->data;
$this->garbage = true;

$this->result = 'START TIME:'.time().PHP_EOL;

// Here is our actual logic which will be handled within a single thread (obviously simulated here instead of the real functionality)
sleep(rand(1,100));

$this->result .= 'ID:'.$this->id.' RESULT: '.print_r($this->data,true).PHP_EOL;
$this->result .= 'END TIME:'.time().PHP_EOL;

$this->finished = time();
}
public function __destruct () {
$Finished = 'EXITED WITHOUT FINISHING';
if($this->finished > 0) {
$Finished = 'FINISHED';
}

if ($this->id === null) {
print_r("nullified thread $Finished!");
} else {
print_r("Thread w/ ID {$this->id} $Finished!");
}
}

public function isGarbage() : bool { return $this->garbage; }

public function getData() {
return $this->data;
}
public function getResult() {
return $this->result;
}

protected $id;
protected $data;
protected $result;
private $garbage = false;
private $finished = 0;
}

$LoopDelay = 500000; // microseconds
$MinimumRunTime = 300; // seconds (5 minutes)

// So we setup our pthreads pool which will hold our collection of threads
$pool = new Pool(4, ProcessingPool::class, []);

$Count = 0;

$StillCollecting = true;
$CountCollection = 0;
do {

// Grab all items from the conversion_queue which have not been processed
$result = $DB->prepare("SELECT * FROM `processing_queue` WHERE `processed` = 0 ORDER BY `queue_id` ASC");
$result->execute();
$rows = $result->fetchAll(PDO::FETCH_ASSOC);

if(!empty($rows)) {

// for each of the rows returned from the queue, and allow the workers to run and return
foreach($rows as $id => $row) {
$update = $DB->prepare("UPDATE `processing_queue` SET `processed` = 1 WHERE `queue_id` = ?");
$update->execute([$row['queue_id']]);

$pool->submit(new LongRunningProcess($row['fqueue_id'],$row));

$Count++;
}
} else {
// 0 Rows To Add To Pool From The Queue, Do Nothing...
}


// Before we allow the loop to move on to the next part, lets try and collect anything that finished
$pool->collect(function ($Processed) use(&$CountCollection) {
global $DB;

$data = $Processed->getData();
$result = $Processed->getResult();


$update = $DB->prepare("UPDATE `processing_queue` SET `processed` = 2 WHERE `queue_id` = ?");
$update->execute([$data['queue_id']]);

$CountCollection++;

return $Processed->isGarbage();
});
print_r('Collecting Loop...'.$CountCollection.'/'.$Count);


// If we have collected the same total amount as we have processed then we can consider ourselves done collecting everything that has been added to the database during the time this script started and was running
if($CountCollection == $Count) {
$StillCollecting = false;
print_r('Done Collecting Everything...');
}

// If we have not reached the full MinimumRunTime that this cron should run for, then lets continue to loop
$EndTime = microtime(true);
$TimeElapsed = ($EndTime - $StartTime);
if(($TimeElapsed/($LoopDelay/1000000)) < ($MinimumRunTime/($LoopDelay/1000000))) {
$StillCollecting = true;
print_r('Ended To Early, Lets Force Another Loop...');
}

usleep($LoopDelay);

} while($StillCollecting);

$pool->shutdown();


So while the above script will run via a command line (which has been adjusted to the basic example, and detailed processing code has been simulated in the above example), the below command gives a different result when run from a cron setup for every 5 minutes...

/opt/php7zts/bin/php -q /home/account/cron-entry.php file=every-5-minutes/processing-queue.php


The above script, when using the above command line call, will loop over and over during the run time of the script and collect any new items from the DB queue, and insert them into the pool, which allows 4 processes at a time to run and finish, which is then collected and the queue is updated before another loop happens, pulling any new items from the DB. This script will run until we have processed and collected all processes in the queue during the execution of the script. If the script has not run for the full 5 minute expected period of time, the loop is forced to continue checking the queue, if the script has run over the 5 minute mark it allows any current threads to finish & be collected before closing. Note that the above code also includes a code based "flock" functionality which makes future crons of this idle loop and exit or start once the lock has lifted, ensuring that the queue and threads are not bumping into each other. Again, ALL OF THIS WORKS FROM THE COMMAND LINE VIA SSH.

Once I take the above command, and put it into a cron to run for every 5 minutes, essentially giving me a never ending loop, while maintaining memory, I get a different result...

That result is described as follows... The script starts, checks the flock, and continues if the lock is not there, it creates the lock, and runs the above script. The items are taken from the queue in the DB, and inserted into the pool, the pool fires off the 4 threads at a time as expected.. But the unexpected result is that the run() command does not seem to be executed, and instead the __destruct function runs, and a "Thread w/ ID 2 FINISHED!" type of message is returned to the output. This in turn means that the collection side of things does not collect anything, and the initiating script (the cron script itself /home/account/cron-entry.php file=every-5-minutes/processing-queue.php) finishes after everything has been put into the pool, and destructed. Which prematurely "finishes" the cron job, since there is nothing else to do but loop and pull nothing new from the queue, since they are considered "being processed" when processed == 1 in the queue.

The question then finally becomes... How do I make the cron's script aware of the threads that where spawned and run() them without closing the pool out before they can do anything?

(note... if you copy / paste the provided script, note that I did not test it after removing the detailed logic, so it may need some simple fixes... please do not nit-pick said code, as the key here is that pthreads works if the script is executed FROM the Command Line, but fails to properly run when the script is executed FROM a CRON. If you plan on commenting with non-constructive criticism, please go use your fingers to do something else!)

Joe Watkins! I Need Your Brilliance! Thanks In Advance!

Answer

After all of that, it seems that the issue was with regards to user permissions. I was setting this specific cron up inside of cpanel, and when running the command manually I was logged in as root.

After setting this command up in roots crontab, I was able to get it to successfully run the threads from the pool. Only issue I have now is some threads never finish, and sometimes I am unable to close the pool. But this is a different issue, so I will open another question elsewhere.

For those running into this issue, make sure you know who the owner of the cron is as it matters with php's pthreads.

Comments