Debugging of APEmille Communication Links
Notations:
- Comm Registers and bits refer to Sasha's documentation
"Ch-registers.rtc 14-Dec-99"
Slices:
- Comm slices/Piggy Backs are numbered such that
Comm0 is closest to Jn1
Comm1 is closest to Jn3
Comm2 is closest to Jn5
Comm3 is closest to Jn7
- "caos" prints the four Comm slices in order of increasing BID,
and within each line in the order: Comm0 Comm1 Comm2 Comm3
- Data errors seen in the bits 23-16 and 7-0 of any (even or odd bank)
32-bit data on Jn correspond to the UPPER cables, i.e. Comm0 and Comm1.
- Data errors seen in the bits 31-24 and 15-8 of any (even or odd bank)
32-bit data on Jn correspond to the LOWER cables, i.e. Comm2 and Comm3.
- Schematically, the nibbles of the hex-representation of the data
pass through U(pper) or L(ower) slices as follows: 0xLLUULLUU
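
The bit-to-cable rule above can be sketched as a small helper. This is only
an illustration, assuming the expected and the observed 32-bit word on Jn are
both known; the names UPPER_MASK, LOWER_MASK and classify_error are
hypothetical and not part of "caos" or any APEmille tool.

```python
# Masks follow the 0xLLUULLUU layout described in this note.
UPPER_MASK = 0x00FF00FF  # bits 23-16 and 7-0  -> UPPER cables (Comm0, Comm1)
LOWER_MASK = 0xFF00FF00  # bits 31-24 and 15-8 -> LOWER cables (Comm2, Comm3)


def classify_error(expected, observed):
    """Return which cable group(s) the differing bits of a 32-bit word map to."""
    diff = (expected ^ observed) & 0xFFFFFFFF
    cables = []
    if diff & UPPER_MASK:
        cables.append("UPPER (Comm0/Comm1)")
    if diff & LOWER_MASK:
        cables.append("LOWER (Comm2/Comm3)")
    return cables


if __name__ == "__main__":
    # A flipped bit 16 falls in the range 23-16, i.e. the UPPER cables.
    print(classify_error(0x00000000, 0x00010000))
```

An empty list means the two words are identical; both entries are returned
when the differing bits span upper and lower byte lanes.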
Directions:
(Most) directions in the Comm-world refer to an active, i.e. sender
point of view. In particular:
- "+x" on the PB front panel labels the connector over which data
is sent out when communication is done in send_x_plus direction
- TAO directions refer to the passive (receiver) point of view:
send direction    remote address    TAO direction
-------------------------------------------------
send_x_plus       0x01000000        x_minus
send_x_minus      0x01800000        x_plus
send_y_plus       0x02000000        y_minus
send_z_plus       0x03000000        z_minus
i.e. in the assignment b = a[x_plus] the node [x,y,z] receives data
from node [x+1,y,z] which corresponds to data transfer along "send_x_minus"
- Exception registers refer to the direction of data movement,
for instance removing the "-x" cable on PB1 causes exceptions
in the Registers labeled Xplus when transferring data in send_x_plus
direction:
   PB0                  PB1
    |                    |
    |+x -----.           |+x
    |         \          |
    |          \         |
    |           \        |
    |-x          `-----> |-x
         send_x_plus
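
The relation between send directions, remote addresses and TAO directions can
be written as a lookup. This is a minimal sketch: it contains only the four
entries listed in this note (the remaining directions are deliberately left
out rather than guessed), and the names SEND_TABLE and send_direction_for_tao
are hypothetical.

```python
# send direction -> (remote address, TAO direction), as listed in this note.
SEND_TABLE = {
    "send_x_plus": (0x01000000, "x_minus"),
    "send_x_minus": (0x01800000, "x_plus"),
    "send_y_plus": (0x02000000, "y_minus"),
    "send_z_plus": (0x03000000, "z_minus"),
}


def send_direction_for_tao(tao_direction):
    """Return (send direction, remote address) for a TAO (receiver-side) direction."""
    for send, (address, tao) in SEND_TABLE.items():
        if tao == tao_direction:
            return send, address
    raise KeyError("TAO direction not listed in this note: " + tao_direction)


if __name__ == "__main__":
    # b = a[x_plus] corresponds to a transfer along send_x_minus.
    send, address = send_direction_for_tao("x_plus")
    print(send, hex(address))
```

A KeyError for a direction such as y_plus only means that the entry is
missing from the table in this note.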
EDAC Error Registration:
NO EDAC errors/exceptions are registered unless ALL directions are opened!!!
Dummy Transfers:
Whenever in RUN-mode, the Comms perform dummy communications (dummy data with
alternating 0 and 1 bits) along the last activated direction. At the beginning
of a program (or during the whole program if no remote communications are done)
the dummy communications are along the send_*_plus directions.
For this reason, it is advisable to perform a HARDRESET to stop any ongoing
dummy transfers (e.g. in case of an exception???).