HPC and Data for Lattice QCD

krun

Information related to the normal user

NOTE: Never use krun on a subset of Boards before
      setting the corresponding Root Mask (see 
      /zroot/tools/rootmask and HOWTO-rootboard)!!
      Otherwise the PC is likely to crash :-(

unit/run commands

(0) Check whether the PLD driver is loaded:

		# /sbin/lsmod

	The output should list "plddrv" (or some variant of it).
(1) Check whether the machine is used by someone else:
	Running `ps -aux` and searching for every possible
	program (krun, taco, testmem, ...) is neither convenient
	nor safe.
	Instead, use

		# /sbin/lsmod

	and check whether the column "Used by" shows a
	number > 0. This indicates that other processes
	are using the PBs.
	The processes that use the PBs can be listed with

		# lsof /dev/apemem
(2) Load the driver by hand:

		# /sbin/insmod [-f] $ZROOT/lib/<kernel-vers>/<driver-version.o>

	The kernel version is printed by `uname -r`.
	E.g.: /sbin/insmod $ZROOT/lib/2.2.12/plddrv.o
(3) Check whether the PB is present/seen:

		# krun -H

	This performs a hard reset; the output is self-explanatory.
(4) Most common options to krun:
        -h, --help   	print help message
        -V, --version	print version number
        -H             	hard reset
        -t <mask>      	set Tz exception mask
        -j <mask>      	set Jn exception mask (recommended: -j 0x88)
        -G node|board|unit|crate|rack|tower

(5) Masking of Jn exceptions:
	Jn ALU DENORM and Jn LUT DENORM are (the only!) exceptions
	that can safely be ignored, because they serve only to
	monitor underflow (i.e. rounding of numbers to zero). To
	mask them, use

		# krun -j 0x88
(6) Running on more than one PB:
	OS Environment:
	---------------
    	OSSETCHREGS 	must be set to 1
			(otherwise communications cannot be opened)
    	OSNOROOT 	must be unset
			(otherwise the RootBoard is ignored)
    	OSPBMASK 	selects the PBs that execute a program
			(hexadecimal bit mask, bit n = PB#n);
			if not set, all PBs of unit#0 are used
			1 -> only PB#0
			2 -> only PB#1
			4 -> only PB#2
			8 -> only PB#3
			3 -> PB#0 & PB#1
			e -> PB#1 & PB#2 & PB#3
			etc.

	Opening communications:
	-----------------------
	By default the communications on each PB are closed.
		krun -o Z       (opens the Z direction)
		krun -o YZ      (opens the Y and Z directions)
		etc.
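
	The driver check and the multi-PB environment settings described
	above can be sketched as a few POSIX shell helpers. This is only a
	sketch under assumptions: the function names are hypothetical and
	not part of the krun tools, the driver is assumed to appear in the
	lsmod output under a name starting with "plddrv", and OSPBMASK is
	assumed to be a hexadecimal bit mask in which bit n selects PB#n
	(as the examples above suggest).

```shell
#!/bin/sh
# Pre-flight sketch for a multi-PB run. All function names are hypothetical.

# Read lsmod-style text on stdin and print the use count (third column)
# of the first module whose name starts with "plddrv".
# Returns 1 and prints nothing if no such module is listed.
pld_use_count() {
    awk '$1 ~ /^plddrv/ { print $3; found = 1; exit }
         END { exit !found }'
}

# Translate a hexadecimal OSPBMASK value into the list of selected PBs,
# assuming bit n of the mask selects PB#n. An optional "0x" prefix is
# accepted. Returns 1 if the argument is not a valid hexadecimal number.
decode_pbmask() {
    hex=${1#0x}
    mask=$(printf '%d' "0x$hex" 2>/dev/null) || return 1
    pb=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask % 2)) -eq 1 ]; then
            out="$out${out:+ }$pb"
        fi
        mask=$((mask / 2))
        pb=$((pb + 1))
    done
    echo "$out"
}

# Check the environment settings required for running on several PBs:
# OSSETCHREGS must be 1 and OSNOROOT must not be set at all.
check_multi_pb_env() {
    rc=0
    if [ "${OSSETCHREGS:-}" != "1" ]; then
        echo "OSSETCHREGS must be set to 1" >&2
        rc=1
    fi
    if [ -n "${OSNOROOT+set}" ]; then
        echo "OSNOROOT must be unset" >&2
        rc=1
    fi
    return $rc
}
```

	On the real machine the helpers would be used along the lines of
	`/sbin/lsmod | pld_use_count` (a result > 0 means the PBs are in
	use) and `decode_pbmask "$OSPBMASK"` before calling krun.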

Running on the SF simulator:

(0) Use the bit-identical arithmetic in the SF simulator:
    - check whether the environment variable SF_USE_FILU is set to 1
    - or set SF_USE_FILU right away:

		setenv SF_USE_FILU 1      (csh)
		export SF_USE_FILU=1      (zsh)
		krun ...

    See HOWTO.sf for further details.
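
    The check-or-set step for SF_USE_FILU can be sketched as a small
    helper for POSIX-style shells (sh, bash, zsh; not csh). The function
    name is hypothetical and not part of the SF simulator tools.

```shell
#!/bin/sh
# Make sure SF_USE_FILU is 1 in the current shell before krun is started.
# Prints a short message saying whether the variable had to be changed.
ensure_sf_filu() {
    if [ "${SF_USE_FILU:-}" = "1" ]; then
        echo "SF_USE_FILU already set to 1"
    else
        SF_USE_FILU=1
        export SF_USE_FILU
        echo "SF_USE_FILU set to 1"
    fi
}
```

    Call `ensure_sf_filu` in the same shell session, then start krun as
    usual.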