HPC and Data for Lattice QCD
SGE
Generic information
Realtime
To use the real (wall-clock) time, and not the CPU time of apemaster, for accounting and scheduling, set execd_params to SHARETREE_RESERVED_USAGE, ACCT_RESERVED_USAGE (see qconf -sconf and sge_conf(5)), and set control_slaves to FALSE and job_is_first_task to TRUE in all parallel environments.
Priorities
Useful documentation
The job priorities are calculated by
job_priority = weight_urgency * normalized_urgency_value +
weight_ticket * normalized_ticket_value +
weight_POSIX_priority * normalized_POSIX_priority_value
As can be seen from our scheduler configuration (qconf -ssconf), only the ticket priority is used (weight_priority seems to be the weight_POSIX_priority, but it is not mentioned at all in sched_conf(5)). The parameters halftime and compensation_factor are the important ones for this scheduling.
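A minimal sketch of the priority formula above, to illustrate what "only ticket priority is used" means in practice. The weight values and the normalized inputs are hypothetical, not taken from our scheduler configuration; the function name job_priority and its argument names are chosen here for illustration only.

```python
def job_priority(urgency, ticket, posix,
                 weight_urgency, weight_ticket, weight_posix):
    """Weighted sum of the three normalized priority contributions."""
    return (weight_urgency * urgency
            + weight_ticket * ticket
            + weight_posix * posix)


# Hypothetical weights: only the ticket term is switched on.
weights = dict(weight_urgency=0.0, weight_ticket=1.0, weight_posix=0.0)

# Two jobs with the same ticket value but different urgency and POSIX priority.
a = job_priority(urgency=0.9, ticket=0.4, posix=0.8, **weights)
b = job_priority(urgency=0.1, ticket=0.4, posix=0.2, **weights)

# A third job that differs only in its ticket value.
c = job_priority(urgency=0.1, ticket=0.7, posix=0.2, **weights)
```

With these weights, jobs a and b end up with the same priority, while c ranks higher: the ordering depends on the ticket value alone.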
From sched_conf(5):
halftime
When executing under a share based policy, the scheduler "ages" (i.e.
decreases) usage to implement a sliding window for achieving the share
entitlements as defined by the share tree. The halftime defines the
time interval in which accumulated usage will have been decayed to half
its original value. Valid values are specified of type time as
specified in queue_conf(5).
compensation_factor
Determines how fast Grid Engine should compensate for past usage below
or above the share entitlement defined in the share tree. Recommended
values are between 2 and 10, where 10 means faster compensation.
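The effect of halftime can be sketched with a simple decay function. This assumes a smooth exponential decay between the half-value points, which the man page text above does not spell out; the function name decayed_usage is hypothetical, and elapsed and halftime only need to be given in the same unit. The compensation_factor is not modelled here.

```python
def decayed_usage(usage, elapsed, halftime):
    """Usage remaining after `elapsed`, assuming it halves every `halftime`."""
    if halftime <= 0:
        # This sketch only covers a positive halftime.
        raise ValueError("halftime must be positive")
    return usage * 0.5 ** (elapsed / halftime)


# Hypothetical numbers: 1000 units of accumulated usage, halftime of 168.
after_one = decayed_usage(1000.0, 168, 168)   # one halftime later
after_two = decayed_usage(1000.0, 336, 168)   # two halftimes later
```

Under this assumption, usage that is one halftime old counts half and usage that is two halftimes old counts a quarter, so recent usage dominates the share calculation.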
Old HOWTO
Handling projects in GRD
* Adding a new project
- For each project a GRD user access list (ACL) has to be defined:
qconf -au user1[,user2,...] acl_name
(Note: acl_name should be equal to project name)
- Then add the project:
qconf -aprj
and define name and acl (user access list defined before).
- To enforce definition of a project for each job, the list of projects
has to appear in the board queue configuration. In order to avoid
modification of all board queues this is only done for board zero of
each unit, i.e. queues matching b??0.
* Other useful commands:
- Show list of access lists: qconf -sul
- Show user access list: qconf -su acl_name
- Modify access list: qconf -mu acl_name
- Add user to access list: qconf -au user1[,user2,...] acl_name
- Delete user from access list: qconf -du user1[,user2,...] acl_name
- Show list of projects: qconf -sprjl
- Modify project: qconf -mprj prj_name
- Delete project: qconf -dprj prj_name
* Hints:
Users do not have to define a project for every qsub (or in every job
script). They simply have to create a file $HOME/.grd_request or
./grd_request (in the submit directory) containing the line '-P prj_name'.
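As an illustration of the hint above, the following sketch writes such a request file. It is demonstrated in a temporary directory so that no real file is touched; in practice the target would be the home directory. The helper name write_request_file is hypothetical, and prj_name stands for an existing project.

```python
from pathlib import Path
import tempfile


def write_request_file(directory, project):
    """Create .grd_request in `directory` with a default '-P project' line."""
    path = Path(directory) / ".grd_request"
    path.write_text(f"-P {project}\n")
    return path


# Write the file into a throwaway directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    request_file = write_request_file(tmp, "prj_name")
    content = request_file.read_text()
```

The file then holds the single line '-P prj_name', which is what would otherwise have to be passed to every qsub call.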