POV-Ray on Slurm Workload Manager demo scripts
Each compute node needs the following packages and services:

- make, povray, ffmpeg
- munged
- slurmd
- NFS client
Install the rendering tools:

```
sudo apt -y install make povray povray-examples ffmpeg
```

Install munge:

```
sudo apt -y install munge
```

The start of the munge.service will fail; you have to make a little change:
```
sudo systemctl edit --system --full munge
```

Change the ExecStart line to read:
```
ExecStart=/usr/sbin/munged --force
```

Copy /etc/munge/munge.key from another compute node in the cluster and check its user, group, and permission flags:
```
# ls -al /etc/munge/munge.key
-r-------- 1 munge munge 1024 May 24 16:55 /etc/munge/munge.key
```
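One way to get the key in place, assuming root SSH access and an existing node named node01 (both are assumptions, adjust to your cluster):

```
# copy the shared key from an existing node (node01 is a placeholder)
sudo scp root@node01:/etc/munge/munge.key /etc/munge/munge.key
# restore the owner and mode shown above
sudo chown munge:munge /etc/munge/munge.key
sudo chmod 0400 /etc/munge/munge.key
```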
Restart the munge.service:

```
sudo systemctl restart munge
```

Test that munge works:
```
munge -n | ssh <clusternode> unmunge
```

Install slurmd:

```
sudo apt -y install slurmd
```

The start of the slurmd.service will fail, too.
To fix this, edit the slurmd systemd unit file:
```
sudo systemctl edit --system --full slurmd
```

and change it from:
```
[Service]
Type=forking
EnvironmentFile=/etc/default/slurmd
ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS
PIDFile=/var/run/slurm-llnl/slurmd.pid
```

to:
```
[Service]
Type=simple
EnvironmentFile=/etc/default/slurmd
ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS -cD
PIDFile=/var/run/slurm-llnl/slurmd.pid
```

Next, copy /etc/slurm-llnl/slurm.conf from another node in your cluster.
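A minimal sketch of that copy, assuming root SSH access and node01 as the name of an existing node (both placeholders):

```
sudo scp root@node01:/etc/slurm-llnl/slurm.conf /etc/slurm-llnl/slurm.conf
```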
Restart, check and enable the slurmd.service:
```
sudo systemctl restart slurmd.service
sudo systemctl status slurmd.service
sudo systemctl enable slurmd.service
```

Restart the slurmctld.service on your slurmctld node:
```
sudo systemctl restart slurmctld.service
```

Check the node's slurmd status with sview:
```
sview
```
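If you prefer the command line to the sview GUI, sinfo reports the same node state:

```
sinfo -N -l
```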
Install the NFS client and create the mount point:

```
sudo apt -y install nfs-common
sudo mkdir -p /nfs/data
```

Add the share to /etc/fstab:
```
192.168.0.113:/data /nfs/data nfs auto,noatime,nolock,bg,nfsvers=4,intr,tcp,actimeo=1800,rsize=8192,wsize=8192 0 0
```

Mount the share:

```
sudo mount /nfs/data
```
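To check that the mount worked, df should list the export (192.168.0.113:/data is the example server from the fstab line above):

```
df -h /nfs/data
```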
Render your first POV-Ray animation:

```
koppi@x200:~/data/demo-povay-slurm$ ./run.sh -s sphere
submitting job sphere.pov with 300 frames
executing: sbatch --hint=compute_bound -n 1 -J povray -p debug -t 8:00:00 -O -J sphere -a 0-300 povray.sbatch sphere 300 '+A0.01 -J +W1280 +H720'
* created povray job 33237 in /home/koppi/data/demo-povay-slurm/sphere-33237
executing: sbatch --hint=compute_bound -n 1 -J povray -p debug -t 8:00:00 --job-name=ffmpeg --depend=afterok:33237 -D sphere-33237 sphere-33237/ffmpeg.sbatch
* created ffmpeg job 33238 for /home/koppi/data/demo-povay-slurm/sphere-33237
done
```

Watch the job queue:
```
$ watch squeue

          JOBID  PARTITION  NAME    USER   ST  TIME  NODES  NODELIST(REASON)
 33237_[44-300]  debug      sphere  koppi  PD  0:00  1      (Resources)
          33238  debug      ffmpeg  koppi  PD  0:00  1      (Dependency)
       33237_43  debug      sphere  koppi  R   0:03  1      dell
       33237_42  debug      sphere  koppi  R   0:04  1      x220
       33237_41  debug      sphere  koppi  R   0:05  1      x200
```
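The output above shows the pattern: one array job renders the frames, and a dependent ffmpeg job (--depend=afterok) assembles them once every frame has finished. The repo's real povray.sbatch is not reproduced here; purely as an illustration of the pattern, this is a minimal sketch of such an array script, with argument names and the clock mapping as assumptions:

```
#!/bin/bash
# Minimal sketch of a frame-rendering array script (NOT the repo's
# actual povray.sbatch). Arguments follow the sbatch call shown above:
#   $1 = scene name, $2 = total frames, $3 = extra POV-Ray options
SCENE=$1
FRAMES=$2
OPTS=$3

# Each array task renders exactly one frame.
FRAME=$SLURM_ARRAY_TASK_ID

# Map the frame number to POV-Ray's animation clock in [0,1].
CLOCK=$(echo "scale=6; $FRAME / $FRAMES" | bc)

# +K sets the clock value, +O the output file name.
povray $OPTS +K$CLOCK +O$SCENE-$FRAME.png $SCENE.pov
```

The dependent ffmpeg job can then stitch the resulting $SCENE-*.png frames into a video.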