Running compiled Matlab under Condor
Problems
Multiple users on the same node
When running compiled Matlab codes under Condor, you need to be aware that the default settings of Matlab as well as the Matlab runtime environment (also called MCR) might bite you. It seems that when an MCR job is run, the directories
/tmp/.matlab
and
/tmp/.mcr_cache_v78
(for Matlab 2008a) are created and used by the executables. However, most of the time these are not cleaned up after the job has finished. If another user's MCR program then starts on this node, it will wait forever, since it cannot create a lock file in a directory that is owned by another user. It is currently not fully clear whether this is also an issue if the same user runs multiple different programs on the same node.
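To see whether a node might be affected, one could check whether the directories named above exist and belong to someone else. The following is only a rough sketch: the function name check_mcr_dirs is made up for illustration, and the paths are the ones quoted above for Matlab 2008a (other versions will likely use a different cache directory name).

```shell
#!/bin/bash
# Hypothetical helper (not part of Matlab or Condor): report directories
# that exist but are not owned by the user running this script.
check_mcr_dirs() {
    local dir blocked=0
    for dir in "$@"; do
        # -O is true only if the file exists and is owned by the current user.
        if [ -e "$dir" ] && [ ! -O "$dir" ]; then
            echo "owned by another user: $dir"
            blocked=1
        fi
    done
    return "$blocked"
}

# Paths as quoted in the text above (Matlab 2008a).
check_mcr_dirs /tmp/.matlab /tmp/.mcr_cache_v78 || echo "MCR jobs may hang on this node"
```

Directories that do not exist or that belong to you are silently skipped, so no output means nothing suspicious was found.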
Solution
Matlab runs on all available cores
By default, a compiled Matlab executable starts computational threads on all available cores, which does not work well with Condor, since Condor typically schedules one job per core.
Solution
Possible solutions
Multiple users on the same node
After consulting with Keith Thorne and Jamie Rollins, the attached wrapper script was devised. It should solve this issue by creating a unique temporary directory and redirecting
$HOME
to that place. Please report success and failures to us!
It is important to use the wrapper script as the executable, e.g. if your mcc-compiled binary is named
GW_detection
and you need the arguments
a 50 --verbose
you need to run it like
/path/to/atlas-matlab-init /path/to/GW_detection a 50 --verbose
or in your submit file:
Executable = /path/to/atlas-matlab-init
Arguments = /path/to/GW_detection a 50 --verbose
[...]
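For readers who do not have the attachment at hand, the idea behind such a wrapper can be sketched roughly as follows. This is not the attached atlas-matlab-init script, only a minimal illustration of the approach described above (unique temporary directory, $HOME redirected to it). The function name run_with_private_home and the directory prefix matlab-home are assumptions made for this example.

```shell
#!/bin/bash
# Minimal sketch of a wrapper in the spirit of atlas-matlab-init.
# NOT the attached script; names and layout here are assumptions.

run_with_private_home() {
    # Create a unique scratch directory for this job.
    local scratch rc
    scratch=$(mktemp -d "${TMPDIR:-/tmp}/matlab-home.XXXXXX") || return 1

    # Run the real program in a subshell with HOME pointing at the
    # scratch directory, so the caller's environment stays untouched.
    (
        export HOME="$scratch"
        "$@"
    )
    rc=$?

    # Clean up and hand the program's exit status back to the caller.
    rm -rf "$scratch"
    return "$rc"
}

# When called with arguments, treat them as the command line to run,
# e.g.: wrapper /path/to/GW_detection a 50 --verbose
if [ "$#" -gt 0 ]; then
    run_with_private_home "$@"
    exit $?
fi
```

The program is deliberately not started with exec, so that the wrapper stays alive long enough to remove the scratch directory and return the job's exit status to Condor.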
Matlab runs on all available cores
Either use File->Preferences->General->Multithreading from the Matlab menu before compiling your code or add
maxNumCompThreads(1)
to your script.
--
CarstenAulbert - 10 Jun 2009