CAOS
HPC and Data for Lattice QCD
HOWTO CAOS
In the following we assume you have downloaded all the APEmille software under your KROOT directory.
Installation
To install CAOS, perform the following steps:
- Driver compilation: go to the directory KROOT/lib/caosdrv and run make.
  WARNING: the machine where you compile the drivers must run the same kernel version as your unit PCs.
- Extra Perl module compilation: go to the directory KROOT/lib/caolib and run make.
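The two build steps above can be scripted. The following is a minimal sketch, not part of the CAOS distribution: build_dir and the DRY_RUN switch are hypothetical names introduced here, and /path/to/KROOT is a placeholder for your real KROOT directory. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build the CAOS drivers and the extra Perl module.
# KROOT is a placeholder default; DRY_RUN is a hypothetical switch of this sketch.
KROOT="${KROOT:-/path/to/KROOT}"
DRY_RUN="${DRY_RUN:-1}"

build_dir() {
    # Run make inside the given directory, or only print the command in dry-run mode.
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $1 && make)"
    else
        (cd "$1" && make)
    fi
}

# The drivers must be compiled on the same kernel version as the unit PCs,
# so print the kernel release of the build machine for comparison.
echo "building on kernel $(uname -r)"
build_dir "$KROOT/lib/caosdrv"
build_dir "$KROOT/lib/caolib"
```

Set DRY_RUN=0 to actually run make, and compare the printed kernel release with the output of uname -r on a unit PC before installing the drivers.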
Configuration
- Create the Unix devices for the PB-BOARD on each unit PC:
mknod /dev/ampb0 c 120 0
mknod /dev/ampb1 c 120 1
mknod /dev/ampb2 c 120 2
mknod /dev/ampb3 c 120 3
- Create the Unix device on each unit PC where a ROOT-BOARD is installed:
mknod /dev/amrb0 c 121 0
- On the master machine set the following environment variables:
  - PCSLAVENAMES as the list of names of your unit PCs.
    Example:
    - setenv SLAVENAMES pcam0:pcam1:pcam2:pcam3 (if you use tcsh)
    - export SLAVENAMES=pcam0:pcam1:pcam2:pcam3 (if you use bash)
  - ROOTNAMES as the list of PC names where you have a ROOT-BOARD.
    Example:
    - setenv ROOTNAMES pcam0:pcam4 (if you use tcsh)
    - export ROOTNAMES=pcam0:pcam4 (if you use bash)
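The device creation and the colon-separated name lists can be handled with a small script. This is a sketch under assumptions: make_pb_devices, split_names and the DRY_RUN switch are hypothetical helpers, not CAOS tools. The major number 120 and the minors 0 to 3 are taken from the mknod commands above. Creating devices for real requires root.

```shell
#!/bin/sh
# Sketch: print (or run) the mknod commands for the four PB-BOARD devices.
# DRY_RUN is a hypothetical switch of this sketch; it defaults to printing only.
DRY_RUN="${DRY_RUN:-1}"

make_pb_devices() {
    minor=0
    while [ "$minor" -le 3 ]; do
        cmd="mknod /dev/ampb$minor c 120 $minor"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
        minor=$((minor + 1))
    done
}

# Split a colon-separated host list (the format used in the examples above)
# into one name per line.
split_names() {
    echo "$1" | tr ':' '\n'
}

make_pb_devices
split_names "pcam0:pcam1:pcam2:pcam3"
```

The one-name-per-line output of split_names is convenient for loops such as: for pc in $(split_names "$SLAVENAMES"); do ...; done.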
Running CAOS
To run CAOS:
- Insert the ampbdrv.o driver on each unit PC where a PB-BOARD is installed. Log in as root on each unit PC and type:
insmod KROOT/lib/caosdrv/ampbdrv.o
- Insert the amrbdrv.o driver on each unit PC where a ROOT-BOARD is installed. Log in as root and type:
insmod KROOT/lib/caosdrv/amrbdrv.o
- Start the slave process on each unit PC:
KROOT/bin/slave.pl [slave_name]
where the optional argument slave_name is the name of the slave.
- Start the root process on each unit PC where a ROOT-BOARD is installed:
KROOT/bin/root.pl [root_name]
where the optional argument root_name is the name of the root.
- From the master machine run
KROOT/bin/caos -C configuration filename.jex
where configuration can be one of the following:
- board bid to run a program on a single board. bid specifies the board number.
- unit uid to run a program on a unit (4 boards). uid specifies the unit number.
- crate cid to run a program on a crate (16 boards). cid specifies the crate number.
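The final step can be wrapped in a small helper that checks the configuration keyword before building the command line. This is a minimal sketch: caos_cmd and boards_in are hypothetical helpers introduced here, /path/to/KROOT is a placeholder, and the program name pippo.jex is borrowed from the Example section. The helper only prints the command and does not run it.

```shell
#!/bin/sh
# Sketch: assemble the caos command line described above.
# KROOT is a placeholder default; caos_cmd and boards_in are hypothetical helpers.
KROOT="${KROOT:-/path/to/KROOT}"

# Number of boards covered by each configuration keyword (from the list above).
boards_in() {
    case "$1" in
        board) echo 1 ;;
        unit)  echo 4 ;;
        crate) echo 16 ;;
        *)     return 1 ;;
    esac
}

# Print the command for: caos -C <configuration> <id> <program.jex>
caos_cmd() {
    if ! boards_in "$1" >/dev/null; then
        echo "unknown configuration: $1" >&2
        return 1
    fi
    echo "$KROOT/bin/caos -C $1 $2 $3"
}

caos_cmd unit 1 pippo.jex
```

Rejecting unknown keywords up front avoids starting a session on the wrong set of boards because of a typo.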
CAOS options
CAOS also accepts the following options:
-j mask     : set jmille mask register (REG[0])
-t mask     : set tmille mask register (REG[1])
-f value    : set memory refresh value register (REG[0x200])
-F          : set PLD fast modality
-r value    : set master root mask register (REG[9])
-n x y z    : set node [x,y,z] as default node
-p filename : load a script filename
-i          : interactive mode
-H          : execute a hard reset at start
-R          : execute a hard reset before exit
-o string   : open the machine along the X, Y, Z dimensions; "string" may be one of: x, y, z, xy, xz, or xyz
-s          : skip program loading except system variables and data
-T          : tower|crate|unit|pb tid cid aid pid
-V          : show release version
-v          : verbose flag
-h          : show this help
CAOS interactive
CAOS commands can be executed either interactively, as caos -C configuration -i, or from a command file, as caos -C configuration -p commandfile. The commands are specified using the so-called TACO-like syntax.
TACO-like syntax
CAOS also supports the TACO-like syntax to write and read each device.
- Write access to devices: w device nodes addr num : data
- Read access to devices: r device nodes addr num
where device is one of:
- ar Altera Register
- tr Tz Register
- td Tz Data Memory
- tp Tz Program Memory
- jr Jn Register
- jd Jn Data Memory
- jp Jn Program Memory
and nodes are specified as
- all all nodes
- n node specified by node_abs_id
- [x,y,z] node specified by triple of node_abs_x, node_abs_y, node_abs_z
- [x1,y1,z1][x2,y2,z2] slice specified by two triples of node_abs_x, node_abs_y, node_abs_z
- def default node
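The read and write forms above can be generated by a short script, for example to fill a command file for the -p option. This is a sketch under assumptions: valid_device, taco_read and taco_write are hypothetical helpers, and the address and data values in the write example are purely illustrative, since the accepted number formats are not specified here. The first read reproduces the status-register command from the Example section.

```shell
#!/bin/sh
# Sketch: format TACO-like commands.
# valid_device, taco_read and taco_write are hypothetical helpers.

# Succeeds only for the device codes listed above.
valid_device() {
    case "$1" in
        ar|tr|td|tp|jr|jd|jp) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints: r device nodes addr num
taco_read() {
    valid_device "$1" || return 1
    echo "r $1 $2 $3 $4"
}

# Prints: w device nodes addr num : data
taco_write() {
    valid_device "$1" || return 1
    echo "w $1 $2 $3 $4 : $5"
}

taco_read jr all 6 1          # Jn register, all nodes, addr 6, num 1
taco_read jr "[0,0,0]" 6 1    # same read on one node; quote the brackets in the shell
taco_write jd def 0 1 0x1     # illustrative values only (format is an assumption)
```

Redirecting the output to a file (for instance a hypothetical status.cmd) gives a command file that can be passed as caos -C configuration -p status.cmd. Whether a command file needs a closing quit line is not stated here, so check this on your installation.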
CAOS Daemon
If you would like to use the CAOS daemon resource manager:
- start the caosd daemon
- start any caos session from a shell where the USECAOSD variable is set.
Example
Suppose you want to run the program pippo.jex with a jmille mask value of 0x88 on unit 1:
caos -H -C unit 0 -j 0x88 pippo.jex
The -H option executes a hardware reset before loading and running your program.
If you would like to read the status register of each jmille after running a program:
caos -C unit 1 -i
APE_master --> r jr all 6 1
APE_master --> quit