J. Bug - 10 months ago
Linux Question

Calling MATLAB on a Linux-based cluster: MATLAB session stops before the m-file is completely executed

I'm running a bash script that submits PBS jobs on a Linux-based cluster multiple times. Each submission calls MATLAB, reads some data, performs calculations, and writes the results back to my directory.

This process works fine, with one exception. For some calculations the m-file starts, loads everything, then performs the calculation, but the job terminates while printing the results to stdout.

The PBS log file shows no error messages, and MATLAB shows no error messages either.
The code runs perfectly on my own computer. I am out of ideas.

If anyone has an idea what I could do, I would appreciate it.

Thanks in advance.

Is there a way to force MATLAB to reach the end of the file? Might that help?

Edit @18:00:
As requested by HBHB in the comments below, here is how MATLAB is called from an external *.sh file:

#PBS -l nodes=1:ppn=2
#PBS -l pmem=1gb
#PBS -l mem=1gb
#PBS -l walltime=00:05:00
module load MATLAB/R2015b

matlab -nosplash -nodisplay -nojvm -r "addpath('./data/calc');myFunc("$a","$b"),quit()"

Here $a and $b come from a loop within the calling bash script, and ./data/calc points to the directory where myFunc is located.

Edit @18:34: If I perform the calculation manually, everything runs fine. So the input data is fine, and the problem seems to narrow down to PBS?

Edit @21:27: I put an until loop around the MATLAB call that checks whether MATLAB produced the desired data. If not, it should restart MATLAB after some delay. But still, MATLAB stops after finishing the calculation, while printing the result (some matrices), and the job itself finishes too. The checking part of the restart is never reached.

What I don't understand: the job stays in the queue, as I intended with the small delay, so is the sleep $w being executed? But when I check the error files, they only show the frozen MATLAB in its first round, recognizable by i. Here is that part of the code. Maybe you can help me.

#w=w wait
until [[ -e ./temp/$b/As$a && -e ./temp/$b/Bs$a && -e ./temp/$b/Cs$a && -e ./temp/$b/lamb$a ]]
do
echo $i
matlab -nosplash -nodisplay -nojvm -r "addpath('./data/calc');myFunc("$a","$b"),quit()"
sleep $w
done

Answer

You are most likely choking your MATLAB process with too little memory. Your PBS file:

#PBS -l nodes=1:ppn=2 
#PBS -l pmem=1gb 
#PBS -l mem=1gb 
#PBS -l walltime=00:05:00

You are limiting your physical memory to 1 GB. MATLAB with no files loaded already uses around 900 MB of virtual memory. Try:

#PBS -l nodes=1:ppn=1
#PBS -l pvmem=5gb
#PBS -l walltime=00:05:00
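If you want to see how much virtual memory a process currently has mapped, on Linux you can read the VmSize field from /proc/<pid>/status. This is only a minimal sketch, assuming a Linux node where /proc is available; vm_size_kb is a hypothetical helper name, not a PBS or MATLAB command.

```shell
#!/usr/bin/env bash

# vm_size_kb: hypothetical helper (not part of PBS or MATLAB).
# Prints the current virtual memory size, in kB, of the process with the given PID.
vm_size_kb() {
    local pid="$1"
    # /proc/<pid>/status contains a line such as "VmSize:    12345 kB"
    awk '/^VmSize:/ {print $2}' "/proc/${pid}/status"
}

# Example: report the value for the current shell.
echo "VmSize of this shell: $(vm_size_kb $$) kB"
```

Passing the PID of a running MATLAB process instead of $$ gives a number you can compare against the pvmem request.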

Additionally, this is something you should contact your local system administrator about. Without system logs, I can't tell you for sure why your job is being cut short (my guess is resource limits). As a system administrator at an HPC center, I can tell you that they would be able to tell you in about 5 minutes why your job is not working correctly. Different HPC centers also use different PBS configurations, so mem might not even be recognized. This is something your local administrators can help you with much better than StackOverflow.
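While you wait for the admins, one thing you can check yourself is the exit status of the MATLAB call inside the job script. In bash, a command that was terminated by signal N reports an exit status of 128 + N, so a value such as 137 points to a SIGKILL rather than a normal exit. Whether your batch system really enforces its limits this way is an assumption on my part. The sketch below uses a hypothetical helper, describe_exit_status, and a stand-in command in place of the real MATLAB call.

```shell
#!/usr/bin/env bash

# describe_exit_status: hypothetical helper name used only for this sketch.
# Turns a shell exit status into a short human-readable message.
describe_exit_status() {
    local status="$1"
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        # Bash reports 128 + N when a command was terminated by signal N.
        echo "terminated by signal $((status - 128))"
    else
        echo "exited with error code $status"
    fi
}

# Stand-in for the real MATLAB call: a child shell that kills itself with SIGKILL.
status=0
bash -c 'kill -KILL $$' || status=$?
describe_exit_status "$status"
```

In the real job script you would replace the stand-in line with the matlab call and write the message to the job's log, so a silent kill at least leaves a trace.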