The SAM Batch Adapter package is a Python API that serves as an interface between SAM and the batch systems used for submitting user jobs. The package is fully configurable and makes no assumptions about the underlying batch systems. It comes with a full set of administrative commands that can be used for adapter configuration. Once configured, it contains knowledge about all batch systems available to the local SAM stations. An overview of the SAM Batch Adapter package, its requirements and design, including (somewhat simplified) class diagrams, can be found here.
A SAM station can have any number of batch systems available for submitting and running user jobs. An adapter should be configured for each of those batch systems. A station's batch adapter configuration is kept in a local Python module, which is updated every time a valid administrative command is executed. The adapter configuration consists of the batch commands and queues available to users, as well as the default batch system limits. Each batch command is described by its type (e.g., job submission command) and by a command string, which may contain any number of predefined string templates (e.g., qstat %__BATCH_JOB_ID__). A command can be associated with any number of possible outcomes, each characterized by the command's exit status and by its output string, which may also contain templates.
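To make the idea of string templates concrete, here is a minimal sketch of how a command string such as qstat %__BATCH_JOB_ID__ could be filled in before execution. The function name fill_template and the dictionary-of-values interface are assumptions made for illustration only; they are not taken from the Batch Adapter API.

```python
import re

# Matches templates written as %__NAME__, the form used in this document.
TEMPLATE_RE = re.compile(r"%__([A-Z]+(?:_[A-Z]+)*)__")


def fill_template(command_string, values):
    """Replace every %__NAME__ in command_string with values["NAME"].

    Hypothetical helper, not part of the real package.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for template %__" + name + "__")
        return str(values[name])

    return TEMPLATE_RE.sub(substitute, command_string)


prepared = fill_template("qstat %__BATCH_JOB_ID__", {"BATCH_JOB_ID": 12345})
print(prepared)  # qstat 12345
```

Raising an error for a template without a value is a design choice of this sketch; it avoids handing a half-filled command string to the batch system.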
The Batch Adapter API does not execute batch commands. It only provides functionality for preparing commands before their execution and for analyzing their outcome. It is the responsibility of the API user to execute the commands and interpret their results.
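The division of labor described above can be sketched from the client's side: the client runs an already prepared command itself and then looks up the outcome by exit status. The function run_and_describe and the KNOWN_OUTCOMES table are hypothetical names, and the Python interpreter stands in for a real batch command so that the sketch runs without a batch system.

```python
import subprocess
import sys

# Hypothetical outcome table: exit status -> outcome description.
KNOWN_OUTCOMES = {0: "Success", 1: "Failure"}


def run_and_describe(argv, known_outcomes):
    """Run a prepared command (given as an argument list) and describe the result."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    description = known_outcomes.get(completed.returncode, "Unknown outcome")
    return completed.returncode, completed.stdout.strip(), description


# Stand-in for a batch command that succeeds and prints a job identifier.
fake_submit = [sys.executable, "-c", "print('12345.pbshost')"]
print(run_and_describe(fake_submit, KNOWN_OUTCOMES))
# (0, '12345.pbshost', 'Success')
```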
There are several types of queues that can be configured for a given
adapter. The Batch Adapter API does not make any assumptions about client
usage of those queues, so that different clients may use the same type of
queue for different purposes. Adding new queue types is straightforward,
which makes the API fairly flexible and extensible. The batch queues can
have different limits configured, and those limits override the default
adapter limits.
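The override rule for limits can be illustrated with a short sketch. The limit names are the ones reported by the administrative commands later in this document; the function effective_limits and the numeric values are assumptions chosen for the example.

```python
def effective_limits(adapter_defaults, queue_limits):
    """Return the limits that apply to a queue.

    Queue limits take precedence; adapter defaults fill in the rest.
    """
    merged = dict(adapter_defaults)
    merged.update(queue_limits)
    return merged


# Hypothetical values, for illustration only.
adapter_defaults = {
    "Maximum number of processes per user": 10,
    "Maximum cpu time per event": 60,
}
queue_limits = {"Maximum number of processes per user": 1}

print(effective_limits(adapter_defaults, queue_limits))
# {'Maximum number of processes per user': 1, 'Maximum cpu time per event': 60}
```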
The SAM Batch Adapter API is used by the SAM Job Submission Handler (i.e., by commands such as sam submit and sam run project). The SAM Job Submission Handler makes Batch Adapter API calls to obtain the adapter configuration for a given station, the requested batch queue, and various batch commands. These are used for preparing several wrapper scripts:
There are currently three different
queue types that are recognized by the SAM Job Submission Handler:
interactive, consumer and project queues (note that
project queues may be associated with a single consumer queue).
Different queue types are used to support several different modes of running
SAM jobs via the Batch Adapter API:
The SAM Job Submission Handler understands several predefined command types that are used for job submission, lookup, and killing:
The predefined templates that are understood by the SAM Job Submission Handler and that can be used to form the batch command strings, as well as to define their output, are listed below:
As mentioned before, the SAM Batch Adapter package comes with a full set of
administrative commands
that can be used for viewing and modifying the local adapter configuration
for a given SAM station.
In this example we create a configuration for a new station. We assume that the station's name is "d0station", and that it uses the PBS batch system for submitting jobs. The queues configured for SAM are "sam_short" (intended for short SAM jobs), "sam_long" (intended for long SAM jobs), and a special queue "sam_project" (intended only for SAM projects). The "sam_project" queue requires the resource "pmaster".
We start by adding the new station's configuration:
d0test> sambatch list configured stations
Configured stations: ['samadams', 'cab-test', 'cab', 'd0mainz', 'central-analysis', 'clued0', 'sammy', 'fnal-farm', 'generic_station']
d0test> sambatch add station config --station=d0station
Created new configuration module for station d0station.
Added configuration for station d0station.
d0test> sambatch list configured stations
Configured stations: ['samadams', 'fnal-farm', 'd0mainz', 'clued0', 'central-analysis', 'cab', 'generic_station', 'cab-test', 'd0station', 'sammy']
d0test> sambatch display station config --station=d0station
Station: d0station
Available Adapters: []
d0test>
The next step is to add the adapter and its queues. The "sam_short" and "sam_long" queues will be added as consumer queues, while "sam_project" will be added as a project queue with "sam_long" as its associated consumer queue.
d0test> sambatch add adapter --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added batch adapter PBS for station d0station.
d0test> sambatch add consumer queue --queue=sam_short --description="Short SAM jobs" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added consumer queue sam_short to batch adapter PBS for station d0station.
d0test> sambatch add consumer queue --queue=sam_long --description="Long SAM jobs" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added consumer queue sam_long to batch adapter PBS for station d0station.
d0test> sambatch add project queue --queue=sam_project --description="SAM projects" --adapter=PBS --station=d0station --consumer-queue=sam_long
Updated batch configuration for station d0station.
Added project queue sam_project to batch adapter PBS for station d0station.
d0test> sambatch display station config --station=d0station
Station: d0station
Default Adapter: PBS
Available Adapters: ['PBS']
Adapter: PBS
Default Queue: sam_short
Available Queues: ['sam_short', 'sam_project', 'sam_long']
Consumer Queue: sam_short (Short SAM jobs)
Project Queue: sam_project (SAM projects)
Consumer Queue: sam_long (Long SAM jobs)
Consumer Queue: sam_long (Long SAM jobs)
d0test>
At this point we decide not to allow SAM jobs to be submitted directly into the "sam_long" queue, so we remove it from the list of available queues. This does not affect the configuration of our "sam_project" queue, which keeps "sam_long" as its associated consumer queue:
d0test> sambatch delete queue --queue=sam_long --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Deleted queue sam_long from batch adapter PBS for station d0station.
d0test> sambatch display station config --station=d0station
Station: d0station
Default Adapter: PBS
Available Adapters: ['PBS']
Adapter: PBS
Default Queue: sam_short
Available Queues: ['sam_short', 'sam_project']
Consumer Queue: sam_short (Short SAM jobs)
Project Queue: sam_project (SAM projects)
Consumer Queue: sam_long (Long SAM jobs)
d0test>
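One data model that is consistent with the output above is for a project queue to hold its own reference to the associated consumer queue, independently of the adapter's list of available queues. The classes below are a hypothetical sketch of that idea and are not taken from the package's actual class design.

```python
class Queue:
    def __init__(self, name, description, consumer_queue=None):
        self.name = name
        self.description = description
        # Only project queues carry an associated consumer queue.
        self.consumer_queue = consumer_queue


class Adapter:
    def __init__(self, name):
        self.name = name
        self.available_queues = {}

    def add_queue(self, queue):
        self.available_queues[queue.name] = queue

    def delete_queue(self, name):
        # Removes the queue from the available list only; references held
        # by project queues are left untouched.
        del self.available_queues[name]


pbs = Adapter("PBS")
sam_short = Queue("sam_short", "Short SAM jobs")
sam_long = Queue("sam_long", "Long SAM jobs")
sam_project = Queue("sam_project", "SAM projects", consumer_queue=sam_long)
for queue in (sam_short, sam_long, sam_project):
    pbs.add_queue(queue)

pbs.delete_queue("sam_long")
print(list(pbs.available_queues))       # ['sam_short', 'sam_project']
print(sam_project.consumer_queue.name)  # sam_long
```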
We also decide to set a limit for the number of parallel user jobs for the "sam_short" queue:
d0test> sambatch list limit types
Available limit types: ['Maximum number of processes per user', 'Maximum cpu time per event']
d0test> sambatch set queue limit --limit="Maximum number of processes per user" --value=1 --queue=sam_short --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Limit for "Maximum number of processes per user" has been set to "1" (queue: sam_short, adapter: PBS, station: d0station).
d0test> sambatch display station config --station=d0station
Station: d0station
Default Adapter: PBS
Available Adapters: ['PBS']
Adapter: PBS
Default Queue: sam_short
Available Queues: ['sam_short', 'sam_project']
Consumer Queue: sam_short (Short SAM jobs)
Limits:
Maximum number of processes per user: 1
Project Queue: sam_project (SAM projects)
Consumer Queue: sam_long (Long SAM jobs)
d0test>
We still have to add the adapter commands. Since the "sam_project" queue requires a special resource, we'll need two submission commands: one for the consumer wrapper scripts, and one for the project wrapper scripts. For the user's convenience, we'll add standard job lookup and kill commands as well:
d0test> sambatch list command types
Available command types: ['job submit command', 'job lookup command', 'job kill command', 'project submit command', 'project lookup command', 'project kill command', 'consumer submit command', 'consumer lookup command', 'consumer kill command', 'process submit command', 'process lookup command', 'process kill command']
d0test> sambatch list command templates
Available command templates: ['%__USER_PROJECT__', '%__USER_SCRIPT__', '%__USER_SCRIPT_ARGS__', '%__USER_JDF__', '%__USER_JOB_OUTPUT__', '%__USER_JOB_ERROR__', '%__USER_NAME__', '%__BATCH_JOB_ID__', '%__BATCH_JOB_NAME__', '%__BATCH_QUEUE__', '%__BATCH_FLAGS__', '%__BATCH_HOST__', '%__UNIX_PROCESS_ID__', '%__UNIX_HOST__']
d0test> sambatch add command --command-type="job submit command" --command-string="qsub -q %__BATCH_QUEUE__ -o %__USER_JOB_OUTPUT__ -e %__USER_JOB_ERROR__ %__USER_SCRIPT__" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added batch command of type "job submit command" to batch adapter PBS for station d0station.
d0test> sambatch add command --command-type="project submit command" --command-string="qsub -l nodes=1:pmaster -k oe -q %__BATCH_QUEUE__ %__USER_SCRIPT__" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added batch command of type "project submit command" to batch adapter PBS for station d0station.
d0test> sambatch add command --command-type="job lookup command" --command-string="qstat %__BATCH_JOB_ID__.%__BATCH_HOST__" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added batch command of type "job lookup command" to batch adapter PBS for station d0station.
d0test> sambatch add command --command-type="job kill command" --command-string="qdel %__BATCH_JOB_ID__.%__BATCH_HOST__" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added batch command of type "job kill command" to batch adapter PBS for station d0station.
d0test> sambatch display station config --station=d0station
Station: d0station
Default Adapter: PBS
Available Adapters: ['PBS']
Adapter: PBS
Default Queue: sam_short
Available Queues: ['sam_short', 'sam_project']
Consumer Queue: sam_short (Short SAM jobs)
Limits:
Maximum number of processes per user: 1
Project Queue: sam_project (SAM projects)
Consumer Queue: sam_long (Long SAM jobs)
Available Commands: ['job kill command', 'job lookup command', 'job submit command', 'project submit command']
Command: qdel %__BATCH_JOB_ID__.%__BATCH_HOST__
Type: job kill command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
Command: qstat %__BATCH_JOB_ID__.%__BATCH_HOST__
Type: job lookup command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
Command: qsub -q %__BATCH_QUEUE__ -o %__USER_JOB_OUTPUT__ -e %__USER_JOB_ERROR__ %__USER_SCRIPT__
Type: job submit command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
Command: qsub -l nodes=1:pmaster -k oe -q %__BATCH_QUEUE__ %__USER_SCRIPT__
Type: project submit command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
d0test>
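Because command strings are typed by hand, a misspelled template is an easy mistake to make. The sketch below checks a command string against the template list printed by sambatch list command templates above. The function unknown_templates is a hypothetical helper; whether the package performs a similar check is not stated here.

```python
import re

# Template names as listed by "sambatch list command templates" above.
KNOWN_TEMPLATES = {
    "%__USER_PROJECT__", "%__USER_SCRIPT__", "%__USER_SCRIPT_ARGS__",
    "%__USER_JDF__", "%__USER_JOB_OUTPUT__", "%__USER_JOB_ERROR__",
    "%__USER_NAME__", "%__BATCH_JOB_ID__", "%__BATCH_JOB_NAME__",
    "%__BATCH_QUEUE__", "%__BATCH_FLAGS__", "%__BATCH_HOST__",
    "%__UNIX_PROCESS_ID__", "%__UNIX_HOST__",
}

TEMPLATE_RE = re.compile(r"%__[A-Z]+(?:_[A-Z]+)*__")


def unknown_templates(command_string):
    """Return the templates in command_string that are not in the known list."""
    return [t for t in TEMPLATE_RE.findall(command_string)
            if t not in KNOWN_TEMPLATES]


good = "qsub -l nodes=1:pmaster -k oe -q %__BATCH_QUEUE__ %__USER_SCRIPT__"
bad = "qsub -q %__BATCH_QUEU__ %__USER_SCRIPT__"  # deliberately misspelled
print(unknown_templates(good))  # []
print(unknown_templates(bad))   # ['%__BATCH_QUEU__']
```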
The final step is to define the successful submission result and to add it to both submission commands:
d0test> sambatch add command result --command-type="job submit command" --exit-status=0 --command-output="%__BATCH_JOB_ID__.%__BATCH_HOST__" --description="Successful job submission" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added exit status 0 outcome for batch command of type "job submit command" (adapter: PBS, station: d0station).
d0test> sambatch add command result --command-type="project submit command" --exit-status=0 --command-output="%__BATCH_JOB_ID__.%__BATCH_HOST__" --description="Successful project submission" --adapter=PBS --station=d0station
Updated batch configuration for station d0station.
Added exit status 0 outcome for batch command of type "project submit command" (adapter: PBS, station: d0station).
d0test> sambatch display station config --station=d0station
Station: d0station
Default Adapter: PBS
Available Adapters: ['PBS']
Adapter: PBS
Default Queue: sam_short
Available Queues: ['sam_short', 'sam_project']
Consumer Queue: sam_short (Short SAM jobs)
Limits:
Maximum number of processes per user: 1
Project Queue: sam_project (SAM projects)
Consumer Queue: sam_long (Long SAM jobs)
Available Commands: ['job kill command', 'job lookup command', 'job submit command', 'project submit command']
Command: qdel %__BATCH_JOB_ID__.%__BATCH_HOST__
Type: job kill command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
Command: qstat %__BATCH_JOB_ID__.%__BATCH_HOST__
Type: job lookup command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 1
Outcome Description: Failure
Command: qsub -q %__BATCH_QUEUE__ -o %__USER_JOB_OUTPUT__ -e %__USER_JOB_ERROR__ %__USER_SCRIPT__
Type: job submit command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 0
Expected Output: %__BATCH_JOB_ID__.%__BATCH_HOST__
Outcome Description: Successful job submission
Exit Status: 1
Outcome Description: Failure
Command: qsub -l nodes=1:pmaster -k oe -q %__BATCH_QUEUE__ %__USER_SCRIPT__
Type: project submit command
Known Outcomes:
Exit Status: 0
Outcome Description: Success
Exit Status: 0
Expected Output: %__BATCH_JOB_ID__.%__BATCH_HOST__
Outcome Description: Successful project submission
Exit Status: 1
Outcome Description: Failure
d0test>
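An expected output such as %__BATCH_JOB_ID__.%__BATCH_HOST__ lets a client recover the job identifier and host from what the submission command prints. The following is a minimal sketch of one way to do that, by turning the expected output into a regular expression. The function names and the sample host name pbs.example.org are assumptions for illustration, and the real API may analyze output differently.

```python
import re

TEMPLATE_RE = re.compile(r"%__([A-Z]+(?:_[A-Z]+)*)__")


def output_pattern(expected_output):
    """Build a regex with one named group per template in expected_output.

    Assumes each template appears at most once in the string.
    """
    parts = []
    position = 0
    for match in TEMPLATE_RE.finditer(expected_output):
        # Literal text between templates must match exactly.
        parts.append(re.escape(expected_output[position:match.start()]))
        parts.append("(?P<%s>\\S+?)" % match.group(1))
        position = match.end()
    parts.append(re.escape(expected_output[position:]))
    return re.compile("".join(parts))


def parse_output(expected_output, actual_output):
    """Return a dict of template values, or None if the output does not match."""
    match = output_pattern(expected_output).fullmatch(actual_output.strip())
    return match.groupdict() if match else None


expected = "%__BATCH_JOB_ID__.%__BATCH_HOST__"
print(parse_output(expected, "12345.pbs.example.org\n"))
# {'BATCH_JOB_ID': '12345', 'BATCH_HOST': 'pbs.example.org'}
print(parse_output(expected, "qsub: unknown queue"))
# None
```

With two outcomes registered for exit status 0, a client could try the outcome that has an expected output first and fall back to the generic "Success" outcome when the output does not match.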
At this point the PBS batch adapter for d0station should be ready for use.