Lustre FAQ
HPC and Data for Lattice QCD
What do I do when the Lustre filesystem is not mounted to /work on a node?
Try to execute the following commands, in this order:
$> lustre_unload
$> lustre_load
$> mount /work
After executing lustre_unload, please check that all Lustre modules are indeed unloaded.
If lsmod lists only the three QPACE modules (gs, torus and qmem), proceed with lustre_load.
Otherwise, get rid of the remaining Lustre modules by executing
$> lustre_rmmod
(and check again with lsmod).
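The module check above can be sketched as a small shell helper. Note that the list of Lustre module names below is an assumption (typical Lustre client modules); the exact set on a QPACE node may differ:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if lsmod no longer lists any
# Lustre-related kernel module (the module-name list is an assumption).
lustre_modules_unloaded() {
    ! lsmod | awk 'NR > 1 {print $1}' \
        | grep -Eq '^(lustre|lnet|obdclass|ptlrpc|lov|osc|mdc|libcfs)'
}

# Possible recovery sequence following the FAQ entry above:
#   lustre_unload
#   lustre_modules_unloaded || lustre_rmmod
#   lustre_load
#   mount /work
```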
Lustre Failover
In Wuppertal, the Lustre servers are prepared for failover - the MDS features an active/passive failover, whereas the OSSes are configured for active/active failover. When there is a problem with a Lustre partition, it can simply be mounted from the corresponding failover server. The following pairs are configured:
qmdsW0 and qmdsW1
qossW2 and qossW3
- The MDS server qmdsW0 dies. Then do the following on qmdsW1:
$> mount /mnt/mgs
$> mount /mnt/mdt-work
- qossW3 no longer has access to the partition /mnt/ost31. Then do the following on qossW2:
$> mount /mnt/ost31
For the Jülich installation, please contact the admins.
How do I start/stop the Lustre filesystem?
Wuppertal - start Lustre
To start the Lustre filesystem in Wuppertal, please execute the following commands in the specified order:
- On qmdsW1:
$> mount /mnt/mgs
- On qossW2:
$> mount /mnt/ost20
$> mount /mnt/ost21
$> mount /mnt/ost22
$> mount /mnt/ost23
- On qossW3:
$> mount /mnt/ost30
$> mount /mnt/ost31
$> mount /mnt/ost32
$> mount /mnt/ost33
- On qmdsW1:
$> mount /mnt/mdt-lustre
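The start procedure above could be wrapped in a small script. This is only a sketch: it assumes passwordless ssh from an admin node to the servers; the host names and mount points are the ones listed above:

```shell
#!/bin/sh
# Sketch: start Lustre in Wuppertal in the documented order
# (MGS first, then all OSTs, then the MDT). Passwordless ssh
# from an admin node is an assumption.
start_lustre_wuppertal() {
    ssh qmdsW1 'mount /mnt/mgs' || return 1
    for ost in 20 21 22 23; do
        ssh qossW2 "mount /mnt/ost$ost" || return 1
    done
    for ost in 30 31 32 33; do
        ssh qossW3 "mount /mnt/ost$ost" || return 1
    done
    ssh qmdsW1 'mount /mnt/mdt-lustre'
}
```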
Wuppertal - stop Lustre
Either umount the partitions listed above in the reverse order, or execute the following script on qmasterW:
$> /usr/sbin/qshutdown -l
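The manual alternative (unmounting in reverse order) can be sketched as follows. This is not the qshutdown script itself, just a hand-written counterpart to the start procedure, with the same passwordless-ssh assumption:

```shell
#!/bin/sh
# Sketch: stop Lustre in Wuppertal by unmounting in reverse order
# (MDT first, then all OSTs, then the MGS). Not /usr/sbin/qshutdown;
# passwordless ssh from an admin node is an assumption.
stop_lustre_wuppertal() {
    ssh qmdsW1 'umount /mnt/mdt-lustre' || return 1
    for ost in 33 32 31 30; do
        ssh qossW3 "umount /mnt/ost$ost" || return 1
    done
    for ost in 23 22 21 20; do
        ssh qossW2 "umount /mnt/ost$ost" || return 1
    done
    ssh qmdsW1 'umount /mnt/mgs'
}
```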
Jülich
If there are any problems with the Lustre filesystem in Jülich (server side), please contact the local admins.