Tuesday, September 23, 2014

Maui scheduler partition setup (test and clarification)

The Maui scheduler provides a node partition function.
You can also set up multiple queues and node properties, but so far I haven't been able to enforce them with the acl_users setting.
However, a detailed (and working) Maui partition setup guide seems hard to find...

OK, our test cluster has a master (c6) and 2 nodes (c61, c62), each with 2 procs.

In the maui.cfg:

NODECFG[c61]    PARTITION=group1
NODECFG[c62]    PARTITION=group2

This assigns the 2 nodes to 2 separate partitions.

Then we assign each user a list of available partitions.
(To assign by group instead, replace USERCFG with GROUPCFG, followed by the user group name.)

SYSCFG[base]    PLIST=
USERCFG[DEFAULT]    PLIST=group1
USERCFG[user1]    PLIST=group1
USERCFG[user2]    PLIST=group2
USERCFG[user3]    PLIST=group2:group1

With this setting, user1 can only use nodes in group1, and user2 only nodes in group2.

If user1 submits 3 jobs, the 3rd will be queued since group1 has only 2 procs available.
User3 will use nodes in group2 first, and then in group1,
which means that the order of the partition list matters.
Unlisted users fall under the rule for DEFAULT (which must be written in all capitals).
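
The behaviour described above can be sketched as a small toy model. This is only an illustration of the rules observed on the test cluster, not Maui's actual scheduling code: the function name place_job and the assumption that every job asks for a single proc are mine.

```python
# Toy model only: NOT Maui's scheduling algorithm, just the rules observed above.
# Assumption: every job requests exactly one proc.
FREE_PROCS = {"group1": 2, "group2": 2}  # c61 and c62, 2 procs each

PLIST = {
    "DEFAULT": ["group1"],
    "user1": ["group1"],
    "user2": ["group2"],
    "user3": ["group2", "group1"],
}


def place_job(user, free):
    """Return the first listed partition with a free proc, or None if the job queues."""
    # Unlisted users fall back to the DEFAULT entry.
    for partition in PLIST.get(user, PLIST["DEFAULT"]):
        if free[partition] > 0:
            free[partition] -= 1
            return partition
    return None


free = dict(FREE_PROCS)
print([place_job("user1", free) for _ in range(3)])  # ['group1', 'group1', None]

free = dict(FREE_PROCS)
print([place_job("user3", free) for _ in range(3)])  # ['group2', 'group2', 'group1']
```

The third user1 job gets None (queued), while user3 fills group2 before falling back to group1.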

There is a catch when using partitions, however...

By default, a job must fit into a single partition and cannot span partitions.
For example, if user3 submits a job with:

#PBS -l nodes=2:ppn=2

The job will be accepted but will stay queued without any error message.
Running checkjob on it gives:

cannot select job for partition group2 (NodeCount)
cannot select job for partition group1 (NodeCount)
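
To see why both partitions are rejected, here is a rough sketch of the "must fit into one partition" check. It is my own simplification (the NODES table and the function partitions_that_fit are hypothetical names), not what Maui does internally.

```python
# Simplified fit check: a job fits a partition only if that partition alone
# has enough nodes with enough procs each. Hypothetical helper, not Maui code.
NODES = {
    "c61": {"partition": "group1", "procs": 2},
    "c62": {"partition": "group2", "procs": 2},
}


def partitions_that_fit(nodes, ppn, plist):
    """Return the partitions from plist that can hold nodes x ppn on their own."""
    fits = []
    for partition in plist:
        usable = [
            name
            for name, info in NODES.items()
            if info["partition"] == partition and info["procs"] >= ppn
        ]
        if len(usable) >= nodes:
            fits.append(partition)
    return fits


# nodes=2:ppn=2 needs two nodes, but each partition has only one.
print(partitions_that_fit(2, 2, ["group2", "group1"]))  # []
# nodes=1:ppn=2 fits in either partition.
print(partitions_that_fit(1, 2, ["group2", "group1"]))  # ['group2', 'group1']
```

An empty result corresponds to the (NodeCount) rejection for every partition in the list.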

However, you can use the QoS flag "SPAN" to let the job span multiple partitions:

#PBS -l nodes=2:ppn=2,qos=SPAN

Because the rules are combined with a logical OR, the SYSCFG[base] PLIST must be kept empty.
If:

SYSCFG[base]     PLIST=
USERCFG[user2]    PLIST=group2

The two rules will give a PLIST of group2.
If:

SYSCFG[base]     PLIST=group1:group2
USERCFG[user2]    PLIST=group2


The two rules will give a PLIST of group1:group2, which allows user2 to use both partitions.
If the SYSCFG line is omitted, the effect is the same as in the latter case. (Here's the pitfall)
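
The OR behaviour can be modelled as a plain union of the two lists. This is a minimal sketch of what was observed in the tests above, with hypothetical helper names (parse_plist, effective_plist_or); treating an omitted SYSCFG line as "all partitions" follows the observation in this post, not any documented rule I can point to.

```python
# Sketch of the observed OR rule: effective PLIST = SYSCFG list + USERCFG list.
# Helper names are made up for illustration.
ALL_PARTITIONS = "group1:group2"


def parse_plist(value):
    """Split a colon-separated PLIST value, ignoring empty entries."""
    return [p for p in value.split(":") if p]


def effective_plist_or(sys_plist, user_plist):
    """Union of both lists, keeping first-seen order.

    sys_plist=None models an omitted SYSCFG line, which behaved here
    as if every partition were listed.
    """
    if sys_plist is None:
        sys_plist = ALL_PARTITIONS
    merged = []
    for partition in parse_plist(sys_plist) + parse_plist(user_plist):
        if partition not in merged:
            merged.append(partition)
    return merged


print(effective_plist_or("", "group2"))               # ['group2']
print(effective_plist_or("group1:group2", "group2"))  # ['group1', 'group2']
print(effective_plist_or(None, "group2"))             # ['group1', 'group2']
```

Only the empty SYSCFG PLIST leaves the user restricted to group2.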

You can also use AND rules by appending an ampersand (&) to the PLIST.
In this case, the SYSCFG PLIST must not be left empty, or all jobs will stay queued:

SYSCFG[base]    PLIST=group1,group2&
USERCFG[user1]    PLIST=group1&
USERCFG[user2]    PLIST=group2&
USERCFG[user3]    PLIST=group2,group1&

Watch out: the separator here is a comma instead of a colon.
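
One way to picture the AND form is as an intersection of the two lists. That reading is my assumption, chosen because it matches the "empty SYSCFG queues everything" behaviour; the helper names are hypothetical and this is not taken from Maui itself.

```python
# Assumed model of the AND rule: effective PLIST = USERCFG list limited to
# what SYSCFG also lists. Illustration only.
def parse_and_plist(value):
    """Strip the trailing '&' and split the comma-separated partition names."""
    return [p for p in value.rstrip("&").split(",") if p]


def effective_plist_and(sys_plist, user_plist):
    """Keep the user's partitions (in the user's order) that SYSCFG also lists."""
    allowed = set(parse_and_plist(sys_plist))
    return [p for p in parse_and_plist(user_plist) if p in allowed]


print(effective_plist_and("group1,group2&", "group2,group1&"))  # ['group2', 'group1']
print(effective_plist_and("group1,group2&", "group1&"))         # ['group1']
print(effective_plist_and("", "group1&"))                       # []
```

Under this model an empty SYSCFG PLIST leaves every user with no usable partition, so every job stays queued.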