HPC and Data for Lattice QCD

Lustre FAQ

What do I do when the Lustre filesystem is not mounted to /work on a node?

Run the following commands, in this order:

$> lustre_unload
$> lustre_load
$> mount /work 

After running lustre_unload, check that all Lustre modules are in fact unloaded. If lsmod lists only the three QPACE modules (gs, torus and qmem), proceed with lustre_load. Otherwise, remove the remaining Lustre modules by running

$> lustre_rmmod

(and then check again with lsmod).
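
The check described above can be sketched as a small shell helper. This is only an illustration: the function names leftover_modules and remount_work are hypothetical and not part of the QPACE tools, and the sketch assumes that a cleanly unloaded node lists nothing but gs, torus and qmem in lsmod.

```shell
#!/bin/sh
# Read lsmod-style output on stdin and print every module name that is not
# one of the three QPACE modules. Empty output means the unload was clean.
# (Hypothetical helper, not part of the installed tools.)
leftover_modules() {
    awk 'NR > 1 && $1 != "gs" && $1 != "torus" && $1 != "qmem" { print $1 }'
}

# Hypothetical wrapper around the documented steps. It is only defined here,
# not called, so loading this file does not touch the node.
remount_work() {
    lustre_unload
    # Remove leftovers once, as described in the text above.
    if [ -n "$(lsmod | leftover_modules)" ]; then
        lustre_rmmod
    fi
    # Check again and stop if modules are still present.
    if [ -n "$(lsmod | leftover_modules)" ]; then
        echo "modules still loaded, not continuing" >&2
        return 1
    fi
    lustre_load && mount /work
}
```

Running "lsmod | leftover_modules" by hand prints the modules that would still have to be removed before lustre_load.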

Lustre Failover

In Wuppertal, the Lustre servers are set up for failover: the MDS uses active/passive failover, whereas the OSS are configured for active/active failover. When there is a problem with a Lustre partition, it can simply be mounted from the corresponding failover server. The following pairs are configured:

  • qmdsW0 and qmdsW1
  • qossW2 and qossW3
Two examples:
  • The MDS server qmdsW0 dies. Then do the following on qmdsW1:
        $> mount /mnt/mgs
        $> mount /mnt/mdt-work
        
  • qossW3 no longer has access to the partition /mnt/ost31. Then do the following on qossW2:
        $> mount /mnt/ost31
        

For the Jülich installation, please contact the admins.
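
For the Wuppertal pairs listed above, the question "which server do I mount from?" can be answered with a simple lookup. The following is a minimal sketch; the function name failover_partner is hypothetical, and only the two pairs named in this FAQ are encoded.

```shell
#!/bin/sh
# Print the failover partner of a Wuppertal Lustre server.
# Only the pairs named in this FAQ are known; anything else is an error.
# (Hypothetical helper, not part of the installed tools.)
failover_partner() {
    case "$1" in
        qmdsW0) echo qmdsW1 ;;
        qmdsW1) echo qmdsW0 ;;
        qossW2) echo qossW3 ;;
        qossW3) echo qossW2 ;;
        *) echo "unknown server: $1" >&2; return 1 ;;
    esac
}
```

For the second example above, "failover_partner qossW3" prints qossW2, which is the server on which /mnt/ost31 is then mounted.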

How do I start/stop the Lustre filesystem?

Wuppertal - start Lustre

To start the Lustre filesystem in Wuppertal, please execute the following commands in the specified order:

  • On qmdsW1:
      $> mount /mnt/mgs
      
  • On qossW2:
      $> mount /mnt/ost20
      $> mount /mnt/ost21
      $> mount /mnt/ost22
      $> mount /mnt/ost23
      
  • On qossW3:
      $> mount /mnt/ost30
      $> mount /mnt/ost31
      $> mount /mnt/ost32
      $> mount /mnt/ost33
      
  • On qmdsW1:
      $> mount /mnt/mdt-lustre
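
The start sequence above can also be written down as data, which makes the order easy to review and to reverse for a manual shutdown. This is a sketch only: start_order, stop_order and print_plan are hypothetical names, and the script merely prints the commands to run on each server; it does not log in anywhere or mount anything.

```shell
#!/bin/sh
# Start order as "server mountpoint" pairs, copied from the list above.
start_order() {
    cat <<'EOF'
qmdsW1 /mnt/mgs
qossW2 /mnt/ost20
qossW2 /mnt/ost21
qossW2 /mnt/ost22
qossW2 /mnt/ost23
qossW3 /mnt/ost30
qossW3 /mnt/ost31
qossW3 /mnt/ost32
qossW3 /mnt/ost33
qmdsW1 /mnt/mdt-lustre
EOF
}

# The same list in reverse, for unmounting by hand.
stop_order() {
    start_order | awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}

# Turn "server mountpoint" pairs on stdin into readable instructions.
# $1 is the command to show, e.g. mount or umount.
print_plan() {
    while read -r host target; do
        printf '%s: %s %s\n' "$host" "$1" "$target"
    done
}
```

"start_order | print_plan mount" lists the ten mount steps in order; "stop_order | print_plan umount" lists the corresponding umount steps in reverse.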
      
Wuppertal - stop Lustre

Either umount the partitions listed above in reverse order, or execute the following script on qmasterW:

$> /usr/sbin/qshutdown -l

Jülich

If there are any problems with the Lustre filesystem in Jülich (server side), please contact the local admins.