Running jobs
HPC and Data for Lattice QCD
Batch jobs
Jobs should be submitted to the parallel environment gpe1, requesting 1 slot. Each job is scheduled in one of the two available queues. To allow the job to select the right CPU cores and GPU device, the following environment variables are defined:

| Variable | Description |
|---|---|
| GPE_AFFMSK | CPU affinity mask |
| GPE_DEVICE | GPU device selector |

An executable should be launched using the taskset tool as shown in the following example script:

    #!/bin/sh
    #$ -cwd
    #$ -pe gpe1 1

    . /scratch/adm/env.sh

    taskset ${GPE_AFFMSK} command [arg ...]
    status=$?
    if [ $status != 0 ]; then
        echo "Execution failed" >&2
        exit 1
    fi

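If the script is run outside the gpe1 parallel environment, GPE_AFFMSK is likely to be empty. The unquoted expansion then disappears and taskset tries to parse the command name as the mask. The following is a minimal sketch of a guard that could be called before the taskset line. The function name `require_gpe_env` is hypothetical and not part of the cluster environment.

```shell
#!/bin/sh
# Hypothetical helper: fail early if the variables described above are
# missing or empty in the job's environment.
require_gpe_env() {
    for name in GPE_AFFMSK GPE_DEVICE; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set; was the job submitted with -pe gpe1 1?" >&2
            return 1
        fi
    done
    return 0
}
```

In a job script this could be used as `require_gpe_env || exit 1` directly after sourcing `/scratch/adm/env.sh`.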
Inside the executable, the environment variable GPE_DEVICE should be used to select the correct device, as in this example:
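
    char* envp;
    int dev;
    cudaError_t cstatus;

    /* GPE_DEVICE is defined for jobs running in the gpe1 environment */
    envp = getenv("GPE_DEVICE");
    assert(envp != NULL);
    dev = atoi(envp);

    /* Select the GPU that was assigned to this job */
    cstatus = cudaSetDevice(dev);
    assert(cstatus == cudaSuccess);
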
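The C fragment above converts GPE_DEVICE with atoi, which yields 0 when no conversion can be performed, so a malformed value would silently select device 0. As a sketch, the job script could validate the value before launching the executable. This assumes GPE_DEVICE holds a non-negative decimal device index, which the use of atoi suggests but the text does not state. The function name `is_device_index` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a non-negative decimal integer.
is_device_index() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit character
        *) return 0 ;;
    esac
}
```

A possible call in the job script is `is_device_index "${GPE_DEVICE:-}" || { echo "Bad GPE_DEVICE" >&2; exit 1; }`.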
WARNING: Jobs will be started without a Kerberos/AFS token. When reading files from AFS file systems, make sure that reading is possible without a token. For writing files, it is suggested to use either the local or the Lustre file system, where no Kerberos tokens are used to evaluate access permissions.
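Because a missing token only shows up once the job runs, it can help to test file access at the start of the job script, before any expensive computation begins. The following is a minimal sketch that opens the input file for reading and creates a scratch file in the output directory. The function name `check_io` and the `.write_test` file name are hypothetical, and the paths passed in are up to the user.

```shell
#!/bin/sh
# Hypothetical helper: verify that an input file can be opened for reading
# and that a file can be created in the output directory, using the
# permissions the job actually has.
check_io() {
    input=$1
    outdir=$2
    # Opening the file is what triggers the permission check.
    # The subshell keeps a failed redirection from terminating the script.
    if ! ( : < "$input" ) 2>/dev/null; then
        echo "Cannot read $input (does it need an AFS token?)" >&2
        return 1
    fi
    probe="$outdir/.write_test.$$"
    if ! ( : > "$probe" ) 2>/dev/null; then
        echo "Cannot write to $outdir" >&2
        return 1
    fi
    rm -f "$probe"
    return 0
}
```

It could be called as `check_io input_file output_directory || exit 1`, with the job's own input file and a directory on the local or Lustre file system.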