Chapter 3: Working with Machines
Machines :
Before you can schedule jobs to run on a machine, you must define the machine to
CA Workload Automation AE. You can define real machines, virtual machines, or real
machine pools.
Real Machines :
A real machine is any single machine that meets the following criteria:
■ It has been identified in the appropriate network so that CA Workload Automation
AE can access it.
■ An agent, CA AutoSys WA Connect Option, CA NSM, or CA UJMA is installed, which
lets CA Workload Automation AE run jobs on it.
■ It is defined to CA Workload Automation AE as a real machine using JIL.
A real machine must meet these conditions to run jobs. However, for CA Workload
Automation AE to perform intelligent load balancing and queuing while running jobs, it
must know the relative processing power of the real machines. CA Workload
Automation AE uses virtual machines to provide load balancing and queuing.
Virtual Machines :
A virtual machine is a machine definition that references one or more existing real
machine definitions. By defining virtual machines to CA Workload Automation AE and
submitting jobs to run on those machines, you can specify the following:
■ Run-time resource policies (or constraints) at a high level.
■ That CA Workload Automation AE automatically runs those policies in a
multi-machine environment.
Note: Previous releases of CA Workload Automation AE required that all machines in a
virtual machine be of the same type. In the current release, the component real
machines in a virtual machine definition can be UNIX or Windows machines or a mix of
both.
Real Machine Pools :
A real machine pool is similar to a virtual machine in that it references one or more
existing real machine definitions. Real machine pools are used specifically to integrate
with CA Automation Suite for Data Centers for load balancing. When a job is scheduled
that references a real machine pool, the node names associated with the machines
referenced are used by CA Automation Suite for Data Centers to assign the real machine
to run the job.
The localhost Definition :
The localhost machine name is a reserved name. You cannot define a machine for
localhost by creating an insert_machine: localhost definition.
By default, the localhost value is resolved to the name of the machine where the CA
Workload Automation AE scheduler was installed. You can override the reserved
localhost value to the name of another real machine using the local machine definition
setting. On UNIX, you can configure this setting using the LocalMachineDefinition
parameter in the configuration file. On Windows, you can configure this setting using
the Local Machine Definition field in the Scheduler window of CA Workload Automation
AE Administrator (autosysadmin).
You must create a machine definition in the database for the machine resolved from the
localhost. To create a machine definition, use the insert_machine JIL command.
As part of the CA Workload Automation AE installation process, your administrator must
have created a machine definition for the default localhost (the machine where the
scheduler was installed) in the database. If you configure the local machine definition
setting to another machine, you must create a definition for that machine in the
database. For example, if you configure the local machine definition setting to a
machine named prod, you must define machine prod using the insert_machine: prod
command.
Note: For more information about the LocalMachineDefinition parameter in the
configuration file (UNIX), see the Administration Guide. For more information about the
Local Machine Definition (Windows), see the Online Help.
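As a minimal sketch of the override case described above, the following JIL defines the machine prod and a job that targets localhost. The job name report_job, the command, and the type and opsys values are hypothetical and chosen only for illustration. The sketch assumes that the local machine definition setting has already been set to prod.

```jil
/* Machine that localhost resolves to when the local machine  */
/* definition setting is prod. The type and opsys values are  */
/* assumptions for this sketch.                               */
insert_machine: prod
type: a
opsys: linux

/* Hypothetical job; the scheduler resolves localhost when it */
/* runs the job.                                              */
insert_job: report_job
job_type: CMD
command: /bin/echo "localhost resolved"
machine: localhost
```

If the prod definition is missing, the localhost value cannot be resolved and jobs defined to run on localhost fail.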
How the localhost Value is Resolved :
If the machine: localhost attribute is specified in a job definition, the scheduler tries to
resolve the localhost value when it runs the job. The localhost value is resolved as
follows:
■ The scheduler checks the value of the LocalMachineDefinition parameter (UNIX) or
the Local Machine Definition field (Windows).
■ If the local machine definition setting is set to a value other than “localhost”, the
scheduler searches the database for a machine definition with that name. For
example, suppose that LocalMachineDefinition is set to agentmach. If an
agentmach machine definition is found and all conditions are satisfied, the job runs
on agentmach. If the scheduler cannot find an agentmach machine definition, or if
it finds multiple agentmach machine definitions, the scheduler does not resolve
localhost. All jobs defined to run on the localhost machine fail.
■ If the local machine definition is not defined or is set to “localhost”, the scheduler
searches the database for a machine definition corresponding to the machine
where the scheduler was started (the default localhost). For example, suppose that
the scheduler was started on a machine named prodserver and
LocalMachineDefinition is not defined. When the job runs, the scheduler searches
for a machine definition named prodserver. If the scheduler cannot find the
prodserver definition, or if it finds multiple prodserver definitions, the scheduler
does not resolve localhost. All jobs defined to run on the localhost machine fail.
■ In a high availability failover where the shadow scheduler takes over from the
primary scheduler, the localhost is resolved in the same way. To run a job on the
localhost, the shadow scheduler first checks its local machine definition setting,
which may be different from the setting for the primary scheduler. If the local
machine definition is not defined, the localhost is resolved to the machine where
the shadow scheduler was started.
Define a Machine :
Before you can schedule jobs to run on a machine, you must define the machine to CA
Workload Automation AE.
Follow these steps:
1. Do one of the following:
■ Issue JIL in interactive mode.
■ Open a JIL script in a text editor.
2. Specify the following definition:
insert_machine: machine_name
node_name: address
type: type
machine_name
Defines a unique name for the machine to add.
address
Specifies the IP address or DNS name of the machine. The default is the
machine_name specified on the insert_machine statement.
type
Specifies the type of machine you are defining. Options are the following:
a
Specifies a CA Workload Automation Agent for UNIX, Linux, Windows,
i5/OS, or z/OS machine. This is the default.
c
Specifies a CA AutoSys Workload Automation Connect Option machine.
l
Specifies a 4.5.1 real UNIX machine. You must specify a lowercase l.
L
Specifies a 4.5.1 real Windows machine. You must specify a capital L.
n
Specifies an r11 real Windows machine or a virtual machine that consists
only of r11 real Windows machines (type n).
p
Specifies a real machine pool managed by CA Automation Suite for Data
Centers.
Note: In the documentation, the type "p" machine is referred to as the real
machine pool.
r
Specifies an r11 real UNIX machine.
u
Specifies a CA NSM or a CA Universal Job Management Agent (CA UJMA)
machine.
v
Specifies a virtual machine. The virtual machine can consist of CA
Workload Automation Agent machines (type a), r11 real UNIX machines
(type r), and r11 real Windows machines (type n).
3. (Virtual or CA Automation Suite for Data Centers machine pool types only) Specify
the following attribute:
machine
Specifies a real machine as a component of the virtual machine or real machine
pool. The specified machine must have been defined to CA Workload
Automation AE as a real machine.
4. (CA Workload Automation Agent for UNIX, Linux, Windows, i5/OS, or z/OS machine
only) Specify the following attribute:
opsys
Specifies the operating system where the CA Workload Automation Agent is
installed. Options are the following:
aix
Specifies a CA Workload Automation Agent for UNIX installed on an AIX
computer.
hpux
Specifies a CA Workload Automation Agent for UNIX installed on an HP-UX
computer.
linux
Specifies a CA Workload Automation Agent for Linux.
i5os
Specifies a CA Workload Automation Agent for i5/OS.
solaris
Specifies a CA Workload Automation Agent for UNIX installed on a Solaris
computer.
windows
Specifies a CA Workload Automation Agent for Windows.
zos
Specifies a CA Workload Automation Agent for z/OS.
5. Specify additional optional attributes as required:
■ agent_name
■ character_code
■ description
■ encryption_type
■ factor
■ heartbeat_attempts (CA Workload Automation Agent for UNIX, Linux,
Windows, i5/OS, or z/OS only)
■ heartbeat_freq (CA Workload Automation Agent for UNIX, Linux, Windows,
i5/OS, or z/OS only)
■ key_to_agent
■ max_load
■ opsys
■ port
6. Do one of the following:
■ Enter exit if you are using interactive mode.
■ Redirect the script to the jil command if you are using a script.
The data is loaded into the database and the machine is defined.
Examples: Defining Real Machines :
The following examples define real machines:
Example: Define a CA Workload Automation Agent
This example defines a machine named eagle where the agent WA_AGENT runs on the
node myagenthostname and uses 49154 as its main input port.
insert_machine: eagle
type: a
agent_name: WA_AGENT
node_name: myagenthostname
port: 49154
max_load: 100
factor: 1.0
Example: Define an r11 Real Windows Machine
This example defines a Windows real machine named jaguar.
insert_machine: jaguar
type: n
max_load: 100
factor: 1.0
Example: Define an r11 Real UNIX Machine
This example defines a UNIX real machine named jaguar.
insert_machine: jaguar
type: r
max_load: 100
factor: 1.0
Examples: Defining Virtual Machines :
The following examples define virtual machines:
Example: Define a Virtual Machine to Include Two Real Windows Machines
This example defines a virtual machine named giraffe to include two real Release 11.3.6
Windows machines (cheetah with a factor value of 5.0 and a max_load value of 400, and
lily with a factor value of 2 and a max_load value of 15).
insert_machine: cheetah
type: a
opsys: windows
insert_machine: lily
type: a
opsys: windows
insert_machine: giraffe
type: v
machine: cheetah
max_load: 400
factor: 5.0
machine: lily
max_load: 15
factor: 2
Example: Define a Windows Virtual Machine with Subsets of r11 Real Machines
This example defines two r11 Windows real machines (lion and lotus), and a virtual
machine (gorilla), which is composed of slices, or subsets, of the max_load specified for
the real machines. Although the real machines were defined with specific max_load
values (100 and 80), the virtual machine only makes use of the reduced loads specified
in the virtual machine definition (10 and 9).
insert_machine: lion
type: n
max_load: 100
factor: 1
insert_machine: lotus
type: n
max_load: 80
factor: .8
insert_machine: gorilla
type: v
machine: lion
max_load: 10
machine: lotus
max_load: 9
Example: Define a Virtual Machine with Default Real Machines
This example defines a virtual machine (sheep), which is composed of two Release
11.3.6 UNIX real machines (warthog and camel). Because the max_load and factor
attributes are not defined for the real machines, they use the default values for these
attributes (a factor of 1.0 and a max_load of none, indicating unlimited load units).
insert_machine: warthog
opsys: linux
insert_machine: camel
opsys: solaris
insert_machine: sheep
type: v
machine: warthog
machine: camel
Example: Define a Virtual Machine with r11 Real Machines
This example defines two r11 UNIX real machines (lion and lotus), and a virtual machine
(zebra), which is composed of the two real machines. The virtual machine is a superset
of the two real machines and uses the max_load and factor attributes defined for them.
insert_machine: lion
type: r
max_load: 100
factor: 1
insert_machine: lotus
type: r
max_load: 80
factor: .9
insert_machine: zebra
type: v
machine: lion
machine: lotus
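To show how a job uses one of these definitions, the following sketch submits a job to the virtual machine zebra from the preceding example. The job name vm_job and the command are hypothetical. When the job starts, the scheduler selects one of the component real machines (lion or lotus) to run it.

```jil
/* Hypothetical job that runs on whichever component machine of */
/* the virtual machine zebra the scheduler selects.             */
insert_job: vm_job
job_type: CMD
command: /bin/date
machine: zebra
```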
Examples: Defining Real Machine Pools :
The following example defines a real machine pool:
Example: Define a Real Machine Pool
This example defines a real machine pool (DCAPOOL), which is composed of three real
machines (MWIN, MLIN, and MSQL). DCAPOOL is monitored by CA Automation Suite for
Data Centers. When you define a job to reference DCAPOOL, CA Automation Suite for
Data Centers is used for machine selection.
insert_machine: MWIN
node_name: myhost
insert_machine: MLIN
node_name: myhost1
insert_machine: MSQL
node_name: myhost2
insert_machine: DCAPOOL
type: p
machine: MWIN
machine: MLIN
machine: MSQL
Delete a Real Machine :
To delete a real machine definition, specify the following subcommand in the JIL script:
delete_machine: name_of_real_machine
[remove_references: y]
[force: y]
name_of_real_machine
Specifies the name of the machine to delete.
remove_references: y
(Optional) Instructs JIL to remove references to the specified machine from the
definitions of machine pools and virtual machines. We recommend that you use
this option when you delete real machines that are referenced in the definitions of
any machine pools or virtual machines. If you do not instruct JIL to remove the
references, you cannot delete the real machine until you delete all of the
references manually.
force: y
(Optional) Use this option to delete a machine that is in use.
Example: Delete a Real Machine Not Referenced in Virtual Machines or Real Machine
Pools
This example deletes the real machine definition for the computer named jaguar:
delete_machine: jaguar
Example: Delete a Real Machine Currently Referenced by a Virtual Machine
This example explicitly removes the reference to the real machine hyena from the virtual
machine carnivores, and then deletes the real machine definition itself.
delete_machine: carnivores
machine: hyena
delete_machine: hyena
Example: Delete a Real Machine and All Virtual Machine References Implicitly
This example deletes the real machine panther and implicitly removes all references to it
from any virtual machines or real machine pools.
delete_machine: panther
remove_references: y
Delete a Virtual Machine :
To delete a virtual machine, specify the delete_machine: machine_name subcommand
without the machine attribute in the JIL script. When you delete a virtual machine, the
definitions for its component real machines are not deleted.
You can delete real machine references from a virtual machine until only one
reference remains. You cannot delete the last reference. To remove every real machine
reference from a virtual machine, you must delete the virtual machine itself.
Note: For more information about deleting virtual machines and the related attributes,
see the Reference Guide.
Example: Delete a Virtual Machine
This example deletes the virtual machine definition named gorilla:
delete_machine: gorilla
Delete a Real Machine Pool :
To delete a real machine pool, specify the delete_machine: machine_name
subcommand without the machine attribute in the JIL script. When you delete a real
machine pool, the definitions for its component real machines are not deleted.
You can delete real machine references from a real machine pool until only one
reference remains. You cannot delete the last reference. To remove every real machine
reference from a real machine pool, you must delete the real machine pool itself.
Note: For more information about deleting real machine pools and the related
attributes, see the Reference Guide.
Example: Delete a Real Machine Pool
This example deletes the real machine pool definition named DCAPOOL:
delete_machine: DCAPOOL
Delete a Real Machine from a Virtual Machine or Real Machine Pool :
To delete a virtual machine or real machine pool reference to a real machine, specify
the following subcommand in the JIL script:
delete_machine: virtual_machine_name
machine: real_machine_name_referenced
Example: Delete a Real Machine from a Virtual Machine
This example deletes the real machine named camel from the virtual machine named
sheep. The machine definitions for sheep and camel are not deleted from the database.
delete_machine: sheep
machine: camel
Specifying Machine Load (max_load)
You can use the max_load attribute to define the maximum load (in load units) that a
machine can reasonably handle. The max_load attribute is valid in a real machine
definition or for component machines in a virtual machine definition.
Load units are arbitrary values, the range of which is user-defined. You can use any
weighting scheme you prefer. For example, a load unit with a range of 10 to 100 would
specify that machines with limited processing power are expected to carry a load of only
10, while machines with ample processing power can carry a load of 100. There is no
direct relationship between the load unit value and any of the machine's physical
resources. Therefore, we recommend that you use conventions that are meaningful to
you. You cannot use zero (0) or negative numbers as load units.
The max_load attribute is primarily used to limit the load on a machine. As long as a
job's load will not exceed a machine's maximum load, the max_load attribute does not
influence which machine a job runs on.
If you do not define the max_load attribute in a machine definition, CA Workload
Automation AE does not limit the load on the machine.
Example: Set the Maximum Load for a Real Machine
Suppose that the range of possible load values is 1 to 100. This example sets the
maximum load for a relatively low-performance real machine.
max_load: 20
Specifying Job Load (job_load) :
For load balancing to work, you must assign a job_load value to every job that impacts
the load on a machine. The job_load attribute in a job definition defines the relative
amount of processing power the job consumes (the relative load the job places on a
machine).
Load units are arbitrary values, and the range is user-defined. You can use any weighting
scheme you prefer. You can use the max_load attribute to assign a real machine a
maximum job load. Then, you can use the job_load attribute in the job definition to
assign the job a load value that indicates the relative amount of the machine's load that
the job should consume. These attributes let you control machine loading and prevent a
machine from being overloaded.
Example: Define the Relative Processing Load for a Job
Suppose that the range of possible load values is 1 to 100. This example sets the load for
a job that typically uses 10% of the CPU.
job_load: 10
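Putting the two attributes together, the following sketch pairs a machine definition that sets max_load with a job definition that sets job_load, again assuming load values from 1 to 100. The machine name loadhost, the job name extract_job, the command, and the type and opsys values are hypothetical.

```jil
/* Hypothetical machine that accepts at most 100 load units. */
insert_machine: loadhost
type: a
opsys: linux
max_load: 100

/* Hypothetical job that places 10 load units on the machine */
/* while it runs.                                            */
insert_job: extract_job
job_type: CMD
command: /bin/sleep 60
machine: loadhost
job_load: 10
```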
Specifying Queuing Priority (priority) :
When a job is ready to run on a designated machine but the current load on that
machine is too large to accept the new job’s load, CA Workload Automation AE queues
the job for that machine so it runs when sufficient resources are available.
For job queuing to take place, you must define the priority attribute in the job
definition. The queue priority establishes the relative priority of all jobs queued for a
given machine. A lower number indicates a higher priority. If you do not set the
priority attribute or the priority is set to 0, the job runs immediately on a machine and is
not put in the queue. The job ignores any other job or machine load settings defined.
Example: Set the Job to Run with Highest Priority
This example sets the job to run with the highest priority without overriding the
machine load control mechanism.
priority: 1
Example: Set the Job to Run in the Background
This example sets the job to run in the background when the machine load is low.
priority: 100
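The following sketch shows how the priority attribute interacts with the load attributes. All machine names, job names, commands, and values are hypothetical, and the comments assume that job_a starts before the other two jobs become ready.

```jil
/* Hypothetical machine that accepts at most 20 load units. */
insert_machine: smallhost
type: a
opsys: linux
max_load: 20

/* Assumed to start first; occupies 10 of the 20 load units. */
insert_job: job_a
job_type: CMD
command: /bin/sleep 300
machine: smallhost
job_load: 10
priority: 1

/* If job_a is still running, 10 + 15 = 25 exceeds max_load, */
/* so this job is queued until enough load units are free.   */
insert_job: job_b
job_type: CMD
command: /bin/sleep 300
machine: smallhost
job_load: 15
priority: 10

/* No priority attribute, so this job is not queued and */
/* starts immediately regardless of the load settings.  */
insert_job: job_c
job_type: CMD
command: /bin/sleep 300
machine: smallhost
job_load: 15
```

Omitting priority, as job_c does, bypasses the load control mechanism, so set a nonzero priority on every job that should respect max_load.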
Specifying Relative Processing Power (factor) :
You can use the factor attribute to determine the relative processing power for a
machine. To calculate the relative processing power for each machine, the scheduler
multiplies the available CPU cycles by the factor attribute value:
(Available CPU Cycles) x (Factor Attribute Value) = Relative Processing Power
The scheduler determines which of the agent machines specified in the job definition
have the best calculated usage (highest relative processing power). The scheduler starts
the job on the agent machine with the best calculated usage (highest relative processing
power).
Setting the factor value to zero (0) results in a calculated usage of zero (0) but does not
disqualify the machine. The scheduler selects an agent machine with a factor value of
zero (0) only when all other available machines specified in the job definition also have a
factor value of zero (0).
Notes:
■ Factor units are arbitrary, user-defined values. The value consists of a real number,
typically between 0.0 and 1.0. You can set factor units to a value containing a
decimal, such as 0.5. If you do not define the factor attribute in a machine
definition, the value defaults to 1.0.
■ The factor attribute is valid in a real machine definition or for component machines
in a virtual machine definition.
■ Sometimes the scheduler identifies multiple machines as having the best calculated
usage (highest relative processing power). In such cases, the scheduler randomly
selects one of those machines and starts the job on it. To allow the scheduler to
start the job on any machine specified in the job definition, set the factor attribute
value for all of those machines to zero (0).
■ For more information about the factor attribute in machine definitions, see the
Reference Guide.
Example: Set the Factor for a Low-Performance Real Machine
This example sets the factor for a low-performance real machine, on a scale of 0.0 to
1.0.
factor: 0.1
Example: Set the Factor for a High-Performance Real Machine
This example sets the factor for a high-performance real machine, on a scale of 0.0 to
1.0.
factor: 1.0
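As a worked illustration of the relative processing power formula in this section, the following sketch defines two hypothetical machines and shows the arithmetic in comments. The machine names, the factor values, and the available CPU figures (80 and 90) are assumptions chosen for the example; the CPU figures are not part of the machine definitions.

```jil
/* Hypothetical high-performance machine. */
insert_machine: fasthost
type: a
opsys: linux
factor: 1.0

/* Hypothetical low-performance machine. */
insert_machine: slowhost
type: a
opsys: linux
factor: 0.5

/* Assumed available CPU: fasthost 80, slowhost 90.          */
/*   fasthost: 80 x 1.0 = 80  relative processing power      */
/*   slowhost: 90 x 0.5 = 45  relative processing power      */
/* A job that lists both machines starts on fasthost because */
/* 80 is the higher value.                                   */
```

Although slowhost has more available CPU in this scenario, its lower factor reduces its calculated usage below that of fasthost.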
Machine Status :
Real machines have a run-time status attribute designed to reflect the machine’s
availability. The machine status lets the scheduler run more efficiently by not wasting
time trying to contact machines that are out of service. If a job is scheduled for a
machine that is offline, it is set to PEND_MACH status until the machine comes back
online. In the case of a virtual machine, offline machines are not considered as possible
candidates for running a job.
A machine can have one of the following statuses:
Online
Indicates that the machine is available and accepting jobs to run.
Offline
Indicates that the machine has been manually removed from service and will not
accept jobs to run.
Missing
Indicates that the scheduler has verified that the machine is not responding and has
automatically removed it from service. The machine will not accept jobs to run.
Unqualified
Indicates that the scheduler is attempting to qualify the status of an agent before
switching the machine from an online to missing status. The machine will not
accept jobs to run.
Empty
Indicates that a virtual machine or real machine pool does not contain any
component machines. Jobs scheduled to machines in this status will not run.
Take a Machine Offline Manually
To manually take a machine offline (for example, during hardware service), use the
sendevent command to send a MACH_OFFLINE event.
When you send a MACH_OFFLINE event, jobs that are currently running run to
completion even though the machine’s status is offline. You can use the autorep
command to monitor running jobs.
If you shut a machine down for servicing, you may want to let the running jobs complete
before continuing. With the machine offline, you can service the machine while the
scheduler continues running. All jobs that are scheduled to start on the offline machine
are put in PEND_MACH status until the machine returns to service.
Note: For more information, see the Reference Guide.
Example: Manually Take a Machine Offline
This example takes the machine cheetah offline:
sendevent -E MACH_OFFLINE -n cheetah
The scheduler log displays a message similar to the following when the machine is
offline:
[11/28/2005 15:38:21] CAUAJM_I_40245 EVENT: MACH_OFFLINE MACHINE: cheetah
Put a Machine Online Manually
To manually put a machine online, use the sendevent command to send a
MACH_ONLINE event.
When you send a MACH_ONLINE event for a machine, jobs with a status of
PEND_MACH on that machine are automatically started.
Note: For more information, see the Reference Guide.
Example: Manually Put a Machine Online
This example returns the machine cheetah to online status:
sendevent -E MACH_ONLINE -n cheetah
The scheduler log displays a message, similar to the following, when the machine is
online:
[11/28/2005 15:38:21] CAUAJM_I_40245 EVENT: MACH_ONLINE MACHINE: cheetah
How Status Changes Automatically
When the scheduler verifies that a real machine is not reachable, it uses the following
process to manage machine and job status:
■ If the scheduler fails to contact a machine's agent, the scheduler marks the machine
as unqualified and logs a message similar to the following:
[11/28/2005 16:01:46] CAUAJM_W_40290 Machine cheetah is in question. Placing
machine in the unqualified state.
■ The scheduler puts all jobs scheduled to start on the unqualified machine in
PEND_MACH status. The scheduler checks the GlobalPendMachStatus parameter
(on UNIX) or Global Pend Mach Status field (on Windows) value. If the status is set
to any valid value other than the default (PEND_MACH), the scheduler checks the
GlobalPendMachDelay parameter (on UNIX) or Global Pend Mach Delay field (on
Windows) value. If the delay interval is set to the default value (zero), the scheduler
immediately sends a CHANGE_STATUS event for the job. If the delay interval is set
to a value other than the default, the scheduler waits for the specified interval
before sending the CHANGE_STATUS event.
■ The scheduler attempts to qualify the status of that machine by pinging the agent
every 10 seconds.
■ If the agent responds, the scheduler sends a MACH_ONLINE event and the machine
returns to service.
■ When the machine returns to service, the scheduler starts all jobs in PEND_MACH
status for that machine. The scheduler checks the GlobalPendMachInterval
parameter (on UNIX) or Global Pend Mach Interval field (on Windows) value. If the
interval is set to the default value (zero), the scheduler does not wait between job
starts. If the interval is set to a value other than the default, the scheduler waits for
the specified interval before starting jobs in PEND_MACH status, and then repeats
that cycle until all of the jobs are restarted.
■ If the agent fails to respond after three attempts, the scheduler marks the machine
as missing, issues a MACHINE_UNAVAILABLE alarm, and logs a message similar to
the following:
[11/28/2005 16:01:46] CAUAJM_I_40253 Machine cheetah is not responding. Taking
offline.
■ The scheduler puts all jobs scheduled to start on the missing machine in
PEND_MACH status based on the values set for the GlobalPendMachStatus and
GlobalPendMachDelay parameters (on UNIX) or Global Pend Mach Status and
Global Pend Mach Delay fields (on Windows). These values control the status of
jobs that are scheduled on a machine that is currently offline.
■ If the machine definition is updated, the scheduler marks the machine as
unqualified, logs the following message, and pings the agent until the machine
returns to service or is marked missing:
[11/28/2005 16:01:46] CAUAJM_W_40291 Machine cheetah has been updated. Placing
machine in the unqualified state.
■ Otherwise, the scheduler pings the missing machine’s agent every 60 seconds to
check its availability.
■ If the agent responds, the scheduler sends a MACH_ONLINE event and the machine
returns to service.
■ When the machine returns to service, the scheduler starts all jobs in PEND_MACH
status for that machine based on the value set for the GlobalPendMachInterval
parameter (on UNIX) or Global Pend Mach Interval field (on Windows). This
parameter controls the starting of jobs in PEND_MACH status.
Notes:
■ If you understand the cause of a missing machine and intervene to correct it, you
can use the sendevent command to send a MACH_ONLINE event to bring the
machine back online instead of waiting for the scheduler to do so.
■ For more information about the GlobalPendMachInterval, GlobalPendMachStatus,
or GlobalPendMachDelay parameters on UNIX, see the Administration Guide. For
more information about the Global Pend Mach Interval, Global Pend Mach Status,
or Global Pend Mach Delay fields on Windows, see the Online Help.
More Information:
Controlling Jobs in PEND_MACH Status
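The qualification cycle described above can be sketched as a small state machine. This is only an illustrative model, not the scheduler's implementation: the names MachineState and simulate_recovery are hypothetical, and the sketch assumes that each ping happens at the end of its interval. The 10-second and 60-second intervals, the three-attempt limit, and the event and alarm names come from the text.

```python
from enum import Enum


class MachineState(Enum):
    ONLINE = "online"
    UNQUALIFIED = "unqualified"
    MISSING = "missing"


UNQUALIFIED_PING_SECONDS = 10  # ping interval while the machine is unqualified
MISSING_PING_SECONDS = 60      # ping interval once the machine is missing
MAX_UNQUALIFIED_ATTEMPTS = 3   # failed pings before the machine is marked missing


def simulate_recovery(ping_results):
    """Walk an unqualified machine through a sequence of ping outcomes.

    ping_results is an iterable of booleans (True means the agent responded).
    Returns (final_state, elapsed_seconds, events).
    """
    state = MachineState.UNQUALIFIED
    elapsed = 0
    failures = 0
    events = []
    for responded in ping_results:
        # Assumption: one full interval passes before each ping.
        if state is MachineState.UNQUALIFIED:
            elapsed += UNQUALIFIED_PING_SECONDS
        else:
            elapsed += MISSING_PING_SECONDS
        if responded:
            events.append("MACH_ONLINE")
            state = MachineState.ONLINE
            break
        if state is MachineState.UNQUALIFIED:
            failures += 1
            if failures == MAX_UNQUALIFIED_ATTEMPTS:
                events.append("MACHINE_UNAVAILABLE")
                state = MachineState.MISSING
    return state, elapsed, events
```

For example, simulate_recovery([False, False, False, False, True]) models three failed 10-second pings, the transition to missing, and a recovery on the second 60-second ping.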
How Status Affects Jobs on Virtual Machines
If a job is defined to run on a virtual machine or a list of machines and one of those
machines is offline, the job will run on another available machine with which it is
associated.
If, however, all machines in the virtual list are offline, the scheduler puts the job in
PEND_MACH status. If any of the machines with which the job is associated comes back
online, the scheduler removes the job from PEND_MACH status and runs it on the
online machine, subject to the queuing criteria.
Load Balancing
Load balancing can be implemented using inherent features of CA Workload
Automation AE or by integrating with CA Automation Suite for Data Centers. Using
real machine pools provides automatic load balancing through CA Automation Suite for
Data Centers.
When not using CA Automation Suite for Data Centers for load balancing, you can
implement load balancing (where the workload is spread across multiple machines
based on each machine's capabilities) by using the machine attribute to specify a virtual
machine or multiple real machines in a job definition. This is also an easy way to help
ensure reliable job processing. For example, the scheduler can use load balancing to
determine which of the machines in a job definition is best suited to run the job, and
automatically start it on that machine.
The advantages of building a virtual machine are as follows:
■ Its definition can be changed and the new construct is immediately applied globally.
■ The max_load and factor values can vary between machines.
Alternatively, you can specify a list of real machines in the job's machine attribute. The
system configuration includes machines of varying processing power. CA Workload
Automation AE uses one of various load balancing methods to choose a real machine. If
you specify the cpu_mon or vmstat load balancing methods in the configuration file, CA
Workload Automation AE chooses which machine to run on based on the available
processing power obtained from the agent. If you specify the job_load load balancing
method in the configuration file, CA Workload Automation AE chooses which machine
to run on based on the max_load and factor attributes for each real machine in
conjunction with the job definition’s priority and job_load attributes. If you specify the
UNIX-only rstatd load balancing method in the configuration file, CA Workload
Automation AE chooses which machine to run on based on the information obtained
from the remote UNIX computer’s remote kernel statistics daemon.
In either case, CA Workload Automation AE uses the following process to determine the
available relative processing cycles for each machine:
1. CA Workload Automation AE calculates the number of load units available on each
real machine in the specified virtual machine. To do this, CA Workload Automation
AE uses the load balancing method specified in the configuration file.
Notes:
■ For the CA Workload Automation Agent on UNIX, Linux, Windows, or i5/OS
(machine type: a), CA Workload Automation AE uses cpu_mon, rstatd, or the
job_load method. If the machine method specified in the configuration file is
set to the cpu_mon or vmstat methods, the scheduler runs a CPU Monitoring
(OMCPU) job to determine the available CPU cycles. This is the default. For the
CA Workload Automation Agent on UNIX and Linux only (opsys: aix, hpux, linux,
or solaris), CA Workload Automation AE supports the rstatd method.
■ For the legacy agent on UNIX, CA Workload Automation AE only uses vmstat,
rstatd, or the job_load method.
■ For the legacy agent on Windows, CA Workload Automation AE only uses
vmstat or the job_load method.
2. CA Workload Automation AE performs the following calculation:
Machine Usage = Available Load Units * Factor value
3. CA Workload Automation AE chooses the machine with the most relative load units
available, based on the calculation in Step 2.
Notes:
■ If a real machine in the virtual machine is not online, the scheduler does not
attempt to contact it and it is not considered in the load balancing algorithm.
■ If the machines have equal max_load and factor values, it is equivalent to defining a
job and specifying the following in the machine field:
machine: cheetah, camel
■ If the factor attribute is not specified for a machine, CA Workload Automation AE
assumes the default factor value for each machine (1.0).
■ On UNIX, the load balancing method is specified using the MachineMethod
parameter in the CA Workload Automation AE configuration file. On Windows, the
method is specified using the Machine Method field on the Scheduler window of CA
Workload Automation AE Administrator. For more information, see the
Administration Guide or Online Help for CA Workload Automation AE Administrator.
■ The cpu_mon machine method does not apply to z/OS machines (CA Workload
Automation Agent on z/OS) because the OMCPU job is not supported on z/OS.
■ If the load balancing request is sent to a legacy agent, CA Workload Automation AE
uses the vmstat method to obtain the available CPU cycles.
Example: Load Balancing With a Virtual Machine
This example defines a virtual machine (marmot) with three real machines (cheetah,
hippogriff, and camel):
insert_machine: marmot
machine: cheetah
factor: 1
machine: hippogriff
factor: .8
machine: camel
factor: .3
To start a job on this virtual machine, specify marmot in the job's machine attribute. The
scheduler performs the necessary calculations to determine which machine runs the
job, and reflects these calculations in its output log. The output is similar to the
following:
EVENT: STARTJOB JOB: test_mach
[11/22/2005 10:16:53] CAUAJM_I_40245 EVENT: STARTJOB JOB: tvm
[11/22/2005 10:16:54] CAUAJM_I_10208 Checking Machine usages:
[11/22/2005 10:16:59] <cheetah=78>
[11/22/2005 10:17:02] <hippogriff=80*[.80]=64>
[11/22/2005 10:17:07] <camel=20*[.30]=6>
[11/22/2005 10:17:11] CAUAJM_I_40245 EVENT: CHANGE_STATUS STATUS: STARTING
JOB: tvm
[11/22/2005 10:17:11] CAUAJM_I_10082 [cheetah connected for tvm]
Note that although cheetah reported fewer available load units than hippogriff (78 versus
80), cheetah was picked because of the result of the factor calculation (after the factor
is applied, cheetah has a machine usage of 78, while hippogriff has only 64). Thus, the
factors weight each machine to account for variations in processing power.
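The selection in this example can be reproduced with a few lines of Python. This is a minimal sketch of the Machine Usage calculation from Step 2, not the scheduler's code; the function name choose_machine is hypothetical, and the available load units are the values shown in the log output above.

```python
def choose_machine(available_load_units, factors):
    """Apply Machine Usage = Available Load Units * Factor and pick the maximum.

    Machines without an explicit factor use the default of 1.0.
    Returns (chosen_machine, usage_by_machine).
    """
    usage = {
        name: units * factors.get(name, 1.0)
        for name, units in available_load_units.items()
    }
    return max(usage, key=usage.get), usage


# Available load units as reported in the log for the marmot example.
available = {"cheetah": 78, "hippogriff": 80, "camel": 20}
factors = {"cheetah": 1.0, "hippogriff": 0.8, "camel": 0.3}

best, usage = choose_machine(available, factors)
print(best)  # cheetah
```

Without the factors, hippogriff (80 units) would win; the factor of 0.8 reduces its usage value to 64, which is below the 78 computed for cheetah.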
Load Balancing Using Virtual Resource Dependencies
Load balancing can also be performed using virtual resources combined with virtual
machines or machine lists for basic throttling and serialization. If you assign a virtual
machine or list of real machines to a job along with virtual resource dependencies, the
jobs can be dispatched to the various machines based on resource availability.
Example: Load Balancing Using Machine Virtual Resources
Suppose that you have three machines (sloth, tiger, and leopard) capable of running
several applications but they vary in capacity or physical resources, for example CPU
speed, memory, or utilization. You also have various jobs that use different runtime
resources. Jobs with low resource usage can run anywhere. Jobs that use more
resources are limited to where they can run. The number of concurrent jobs and their
requirements must be controlled to avoid overburdening any machine.
■ Define the real machine definitions for sloth, tiger, and leopard. Then define a
virtual machine, domain, that references all the machines where the jobs should be
allowed to run.
insert_machine: sloth
insert_machine: tiger
insert_machine: leopard
insert_machine: domain
type: v
machine: sloth
machine: tiger
machine: leopard
■ Define the maximum amount of virtual resources available to each machine.
Remember, they are virtual resources. They do not really exist. The virtual resource
amounts are approximations based on estimated capabilities of the machines. In
this example, the sloth machine has the fewest capabilities while the leopard
machine has the most.
insert_resource: job_weight
res_type: r
machine: sloth
amount: 2
insert_resource: job_weight
res_type: r
machine: tiger
amount: 10
insert_resource: job_weight
res_type: r
machine: leopard
amount: 30
■ Define jobs to run on one of the real machines referenced by the virtual machine
domain. Identify the virtual resource units each job consumes. Similar to the
resource definition, these values are approximations based on perceptions or
expectations of what the job consumes while running. The quantity required for the
job determines where it can run and the mix of jobs that can run concurrently.
insert_job: quick_job
machine: domain
command: efficient_report
resources: (job_weight,quantity=1,free=y)
insert_job: heavy_job
machine: domain
command: analytical_report
resources: (job_weight,quantity=5,free=y)
insert_job: beastly_job
machine: domain
command: quarterly_update
resources: (job_weight,quantity=10,free=y)
Based on these definitions, quick_job can run on any of the machines defined
to the virtual machine named domain because each machine has at least one unit of the
job_weight virtual resource defined to it. The jobs heavy_job and beastly_job can only
be scheduled to real machines tiger and leopard. The two jobs cannot be scheduled
concurrently to the tiger machine as that would exceed the virtual resources defined to
it. If the job heavy_job is already running on the tiger machine when job beastly_job is
being scheduled, the job beastly_job would be scheduled to run on the leopard real
machine.
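The reasoning in this example can be checked with a short sketch. The capacities and quantities below are taken from the JIL definitions above; the function eligible_machines and the in_use bookkeeping are hypothetical and only model the arithmetic, not how the product tracks resources.

```python
# job_weight units defined per machine (from the insert_resource statements).
capacity = {"sloth": 2, "tiger": 10, "leopard": 30}

# job_weight units each job requires (from the resources attributes).
demand = {"quick_job": 1, "heavy_job": 5, "beastly_job": 10}


def eligible_machines(job, in_use=None):
    """Return the machines that still have enough job_weight units for the job.

    in_use maps a machine name to the units currently held by running jobs.
    """
    in_use = in_use or {}
    return [
        machine
        for machine, total in capacity.items()
        if total - in_use.get(machine, 0) >= demand[job]
    ]


print(eligible_machines("quick_job"))                 # all three machines
print(eligible_machines("heavy_job"))                 # tiger and leopard
print(eligible_machines("beastly_job", {"tiger": 5})) # leopard only
```

The last call models heavy_job already holding 5 units on tiger: only 5 of tiger's 10 units remain, so beastly_job (10 units) can be placed only on leopard.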
Load Balancing Using Virtual and Real Resource Dependencies
You can also implement load balancing using virtual and real resources as dependencies
to a job. Virtual and real resource dependencies can be defined to both virtual machines
and real machine pools.
If you assign a virtual machine or real machine pool to a job with virtual resource
dependencies, real resource dependencies, or both, the job runs on the machine that satisfies the resource
dependencies. If the job has real resource dependencies and two or more machines
satisfy the specified metrics, CA Automation Suite for Data Centers returns the best
machine based on the lowest overall utilization. If the job has only virtual resource
dependencies and two or more machines satisfy the specified metrics, the job runs on
the first machine that satisfies the virtual resource requirements.
Note: If CA Workload Automation AE is not integrated with CA Automation Suite for
Data Centers, real resource dependencies are ignored and the job is submitted on the
first machine that satisfies the virtual resource requirements. If the job does not have
virtual resources, it runs on the machine as determined by load balancing using the
max_load and factor values.
Example: Load Balancing Using Virtual and Real Resource Dependencies
To define a job that gets submitted on a machine that satisfies both the real and
the virtual resource dependencies, do the following:
■ Define a renewable resource ren_glb1 at the global level:
insert_resource: ren_glb1
res_type: R
amount: 10
■ Define a real machine pool DCAPOOL to include three real machines (MWIN, MLIN,
and MSOL) that are discovered and monitored by CA Automation Suite for Data
Centers for real time load balancing:
insert_machine: DCAPOOL
type: p
machine: MWIN
machine: MLIN
machine: MSOL
■ Define a job job_load with real and virtual resource dependencies:
insert_job: job_load
job_type: CMD
command: sleep 1
machine: DCAPOOL
owner: autosys
date_condition: 0
alarm_if_fail: 1
resources: (ren_glb1, quantity=2, free=y) and (MEM_INUSE_PCT, VALUE=30,
VALUEOP=LTE)
■ Generate a report for the job (job_load) to view whether the real and virtual
resource dependencies are satisfied on the MWIN, MLIN, and MSOL machines:
job_depends -J job_load -r
The report might resemble the following:
Job Name Machine
-------- ----------
job_load MLIN
Virtual Resources
-----------------
Resource Type Amount Satisfied?
-------- ---- ------ ----------
ren_glb1 R 2 YES
Real Resources
-----------------
Resource Satisfied?
----------- -----------
MEM_INUSE_PCT, VALUE=30, VALUEOP=LTE NO
----------------------------------------------------------------------------
Job Name Machine
-------- ----------
job_load MSOL
Virtual Resources
-----------------
Resource Type Amount Satisfied?
-------- ---- ------ ----------
ren_glb1 R 2 YES
Real Resources
-----------------
Resource Satisfied?
----------- -----------
MEM_INUSE_PCT, VALUE=30, VALUEOP=LTE YES
----------------------------------------------------------------------------
Job Name Machine
-------- ----------
job_load MWIN
Virtual Resources
-----------------
Resource Type Amount Satisfied?
-------- ---- ------ ----------
ren_glb1 R 2 YES
Real Resources
-----------------
Resource Satisfied?
----------- -----------
MEM_INUSE_PCT, VALUE=30, VALUEOP=LTE YES
The report shows that the virtual resource (ren_glb1) is satisfied on all the
machines. However, the real resource MEM_INUSE_PCT is satisfied only on the MWIN
and MSOL machines. When you start the job (job_load), CA Automation Suite for
Data Centers selects the machine with the lowest overall utilization from
the MWIN and MSOL machines.
■ Start the job (job_load):
sendevent -E STARTJOB -J job_load
■ Generate a detailed report for the job (job_load) to view the machine on which the
job runs:
autorep -J job_load -d
The resulting report might resemble the following:
Job Name Last Start Last End ST Run/Ntry Pri/Xit
___________________________ ____________________ ____________________ __ ________ _______
job_load 10/07/2010 15:06:21 10/07/2010 15:06:23 SU 689/1 0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------
STARTING 10/07/2010 15:06:21 1 PD 10/07/2010 15:06:22 MSOL
RUNNING 10/07/2010 15:06:22 1 PD 10/07/2010 15:06:22 MSOL
<Executing at WA_AGENT>
SUCCESS 10/07/2010 15:06:23 1 PD 10/07/2010 15:06:23 MSOL
The report shows that CA Automation Suite for Data Centers selected MSOL as the
machine with the lowest overall utilization.
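The two-stage selection in this example (filter on the real resource metric, then pick the lowest overall utilization) can be sketched as follows. All numeric readings below are invented for illustration; the source reports only which machines satisfied MEM_INUSE_PCT and that MSOL was chosen. The function select_machine is hypothetical and does not represent the CA Automation Suite for Data Centers interface.

```python
def select_machine(pool, mem_limit_pct):
    """Pick the least-utilized machine whose MEM_INUSE_PCT is <= mem_limit_pct.

    pool maps machine name -> (mem_in_use_pct, overall_utilization_pct).
    Returns None when no machine satisfies the metric (what the product does
    in that case is not modeled here).
    """
    qualified = {
        name: utilization
        for name, (mem_in_use, utilization) in pool.items()
        if mem_in_use <= mem_limit_pct  # VALUEOP=LTE
    }
    if not qualified:
        return None
    return min(qualified, key=qualified.get)


# Hypothetical readings chosen to match the report: MLIN fails the memory
# check, MWIN and MSOL pass, and MSOL has the lower overall utilization.
pool = {"MWIN": (28, 55), "MLIN": (45, 30), "MSOL": (20, 40)}

print(select_machine(pool, 30))  # MSOL
```

In these made-up readings MLIN has the lowest overall utilization, yet it is never considered because it does not satisfy the real resource dependency.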
Load Balancing Using Real Resource Pools
If you assign the real machine pool to a job without any real resource dependencies, CA
Automation Suite for Data Centers monitors these machines and decides the best
machine with the least overall utilization for job submission.
Note: This does not apply to virtual machines although you may create a virtual machine
that is composed of real machines that are monitored by CA Automation Suite for Data
Centers.
Example: Load Balancing Using Real Machine Pools
To define a job that gets submitted on the machine with the lowest
overall utilization, do the following:
■ Define a real machine pool DCAPOOL to include three real machines (MWIN, MLIN,
and MSOL) that are discovered and monitored by CA Automation Suite for Data
Centers for real time load balancing:
insert_machine: DCAPOOL
type: p
machine: MWIN
machine: MLIN
machine: MSOL
■ Define a job job_load and assign the real machine pool DCAPOOL to it:
insert_job: job_load
machine: DCAPOOL
command: sleep 1
owner: autosys
CA Automation Suite for Data Centers monitors the three machines (MWIN, MLIN,
and MSOL) and decides the best machine with the least overall utilization for job
submission. For example, if the overall utilization of MWIN is 68%, MLIN is 50%, and
MSOL is 56%, CA Automation Suite for Data Centers selects the MLIN machine for job
submission.
■ Start the job (job_load):
sendevent -E STARTJOB -J job_load
■ Generate a detailed report for the job (job_load) to view the machine on which the
job runs:
autorep -J job_load -d
The resulting report might resemble the following:
Job Name Last Start Last End ST Run/Ntry Pri/Xit
___________________________ ____________________ ____________________ __ ________ _______
job_load 10/07/2010 14:35:42 10/07/2010 14:35:43 SU 687/1 0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------
STARTING 10/07/2010 14:35:42 1 PD 10/07/2010 14:35:42 MLIN
RUNNING 10/07/2010 14:35:42 1 PD 10/07/2010 14:35:42 MLIN
<Executing at WA_AGENT>
SUCCESS 10/07/2010 14:35:43 1 PD 10/07/2010 14:35:43 MLIN
Forcing a Job to Start
If you use the sendevent command to send a FORCE_STARTJOB event to a job, CA
Workload Automation AE immediately starts the job on the machine that is specified in
the job definition, regardless of the current load on the machine or the job_load value
that is set for the job. If the job was defined to run on a virtual machine or a list of real
machines, CA Workload Automation AE checks which machine has the most processing
power available and runs the job on that machine, even if the job_load value set for the
job exceeds the max_load value set for the machine.
Notes:
■ If you send a FORCE_STARTJOB event to a job in ON_ICE or ON_HOLD status, the
job's status does not revert to its previous status when it completes.
■ If you send a FORCE_STARTJOB event to a job in RESWAIT status, the
FORCE_STARTJOB is ignored and the job remains in the RESWAIT status. You can
remove or alter the resource requirements of the job so the job is no longer in
RESWAIT and can be started.
■ If you send a FORCE_STARTJOB event to a job in FAILURE or TERMINATED status
that has a virtual resource dependency with free=Y or free=N and has not released
the virtual resources, the FORCE_STARTJOB event verifies if the job's current status
is FAILURE or TERMINATED and schedules the job using the already held virtual
resources. Before force starting the job, the scheduler does not re-evaluate other
resource dependencies.
Example: Force a Job to Start
This example describes the effects of forcing a job to start. Assume you scheduled Job1
to run every Monday at 3:00 A.M. On Sunday, you sent a JOB_ON_HOLD event to put
the job in ON_HOLD status, so that the job does not run as scheduled on Monday. If you
send a FORCE_STARTJOB event to Job1 on Wednesday at 2:00 P.M., Job1 runs to
completion (either success or failure), and then runs again as scheduled on Monday at
3:00 A.M. The job did not revert to the ON_HOLD status after you forced it to start on
Wednesday.
Queuing Jobs
Queuing is a mechanism used in CA Workload Automation AE to determine the run order of
jobs that cannot run immediately. There is no actual physical queue. Instead, CA
Workload Automation AE uses queuing policies, which are based on the use and
subsequent interaction of the job_load and priority attributes in a job definition and the
max_load and factor attributes in a machine definition. Jobs that are in a queued state
already meet their starting conditions, but cannot start due to external conditions.
Jobs that meet their starting conditions but are in the ON_HOLD or ON_ICE state also do
not start; however, these jobs are not considered queued jobs. To place a job on hold or
on ice, send a JOB_ON_HOLD or JOB_ON_ICE event using the sendevent command. Jobs
in these states do not start until you take them off hold or off ice.
When a job leaves a queued state, the scheduler determines whether to start the job by
re-evaluating starting conditions for that job unless you configure CA Workload
Automation AE to skip starting condition evaluation for queued jobs. If a job that is
contained in a box fails its starting condition checks when leaving the queue, the
scheduler places that job in the ACTIVATED state. If a job that is not contained in a box
fails its starting condition checks when leaving the queue, the scheduler places that job
in the INACTIVE state. If the job meets its starting conditions, or if you configure the
system to skip starting condition evaluation for queued jobs, the job starts.
When you take a job off hold, the scheduler re-evaluates starting conditions for that job.
When you take a job off ice, the scheduler does not restart the job until its starting
conditions recur, even if those conditions were met while the job was on ice.
When you instruct the scheduler to bypass execution of a job, the scheduler starts the
job when it meets its starting conditions. The scheduler simulates running these jobs,
but the agent does not execute commands associated with the jobs. Bypassed jobs
evaluate as successfully completed on the scheduler machine. When you issue the
JOB_OFF_NOEXEC event, the agent executes commands associated with the specified
job the next time the job starts. Changing the executable status of a job does not affect
evaluation of starting conditions.
Notes:
■ When you take jobs off hold or when jobs leave a queued state, the scheduler does
not re-evaluate date and time conditions. Jobs that meet their date and time
conditions while they are in a queued state or on hold start as soon as they leave
the queue or are taken off hold unless other starting conditions apply and are not
satisfied.
■ If you configure CA Workload Automation AE to skip starting condition evaluation
for queued jobs, those jobs start immediately upon leaving a queued state.
■ For more information about configuring CA Workload Automation AE to skip
starting condition evaluation for queued jobs, see the Administration Guide or the
Online Help.
The following sections discuss queuing jobs and give examples of how to use load
balancing and queuing to optimize job processing in your environment.
How CA Workload Automation AE Queues Jobs
For queuing to be most effective, you must set the priority attribute for all jobs. By
default, the priority attribute is set to 0, indicating that the job should not be queued
and should run immediately. When you let the priority attribute default for a job, it runs
even if its job load would push the machine over its load limit. However, even when jobs
have a priority of 0, CA Workload Automation AE tracks job loads on each machine so
that jobs with non-zero priorities can be queued.
Note: If the job has resource dependencies, CA Workload Automation AE does not use
the following process to limit the job load on machines and to queue jobs for
processing. Instead, the resource manager (CA Workload Automation AE) is used to
select the best machine to run the job.
CA Workload Automation AE uses the following process to limit the job load on
machines and to queue jobs for processing:
■ If you set a job_load value for a job and you assigned a max_load for every real
machine comprising a virtual machine, CA Workload Automation AE checks if each
machine has sufficient available load units before running the job.
When more than one job is queued, the priority value is considered first when
deciding which job to run next. If there are insufficient load units available to run
the highest priority job, no lower priority jobs are considered.
■ If each real machine has sufficient load units, CA Workload Automation AE employs
the load balancing and factor algorithms to determine on which machine the job should
start.
■ If only one of the machines has sufficient load units, the job runs on that machine.
■ If none of the machines has sufficient load units, CA Workload Automation AE puts
the job in QUE_WAIT status for all the machines. The job stays in QUE_WAIT status
until one of the machines has sufficient load units available.
Note: If a job is in QUE_WAIT status and you want it to run immediately, do not force
the job to start. Instead, use the sendevent command to send a CHANGE_PRIORITY
event that changes the job's priority to 0.
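The placement decision described above can be sketched in Python. This is a simplified model, not the product's algorithm: the function place_job and its arguments are hypothetical, and the sketch assumes that a job with priority 0 is still placed on the machine that the load balancing calculation favors.

```python
def place_job(job_load, priority, free_units, usage):
    """Return the machine a job starts on, or "QUE_WAIT" if it must wait.

    free_units: load units still available per machine
                (max_load minus the job_load of running jobs).
    usage:      load balancing result per machine; higher is better.
    """
    if priority == 0:
        # Priority 0 jobs are not queued; they run even if that pushes a
        # machine over its load limit.
        candidates = list(free_units)
    else:
        candidates = [m for m, free in free_units.items() if free >= job_load]
    if not candidates:
        return "QUE_WAIT"
    # With one candidate this returns it; with several, load balancing decides.
    return max(candidates, key=lambda m: usage[m])


usage = {"cheetah": 78, "camel": 6}
print(place_job(50, 1, {"cheetah": 30, "camel": 60}, usage))  # camel
print(place_job(50, 1, {"cheetah": 30, "camel": 40}, usage))  # QUE_WAIT
print(place_job(50, 0, {"cheetah": 30, "camel": 40}, usage))  # cheetah
```

The third call shows why changing a queued job's priority to 0 lets it run immediately: the load unit check is skipped entirely.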
Example: Job Queuing
This example shows a simple job queuing scenario that uses a previously defined
machine named lion with a max_load of 100:
insert_job: jobA
machine: lion
job_load: 80
priority: 1
insert_job: jobB
machine: lion
job_load: 90
priority: 1
In this example, if jobA is running when jobB is ready to start, the combined job_load
(80 + 90) exceeds the max_load of 100, so CA Workload Automation AE puts jobB in
QUE_WAIT status until jobA completes, at which point jobB can run.
Example: Job Queuing and Load Balancing
This example shows a situation in which a machine has 80 load units and multiple jobs
are waiting to start. In this example, JobB and JobC are executing, while JobA and JobD
are queued (in the QUE_WAIT state) and waiting for available load units. The numbers
in the following illustration indicate the job_load assigned to each job, and the
max_load set for the machine.
The following JIL statements define the machine and the jobs in this example:
insert_machine: cheetah
max_load: 80
insert_job: JobA
machine: cheetah
job_load: 50
priority: 70
insert_job: JobB
machine: cheetah
job_load: 50
priority: 50
insert_job: JobC
machine: cheetah
job_load: 30
priority: 60
insert_job: JobD
machine: cheetah
job_load: 30
priority: 80
In this example, JobB and JobC are already running because their starting conditions
were satisfied first. After JobB or JobC completes, JobA is considered to start before
JobD because JobA has a higher priority (a lower priority value indicates a higher priority).
How soon JobA starts is determined by a combination of its priority and job_load
attributes, and the max_load machine attribute. The result differs based on whether
JobB or JobC finishes first, as follows:
■ If JobB finishes first, 50 load units become available, so JobA runs. After JobA or
JobC completes, sufficient load units become available, so JobD runs.
■ If JobC finishes first, only 30 load units become available, so both JobA and JobD
remain queued until JobB completes.
■ After JobB completes, a total of 80 load units become available, so both JobA and
JobD are eligible to run. Because JobA has a higher priority, it runs first. JobD runs
shortly after.
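Both outcomes can be traced with a small simulation. The loads, priorities, and max_load are taken from the JIL statements above; the functions start_queued and simulate are hypothetical and model only the rule that the highest-priority waiting job blocks the jobs behind it.

```python
MAX_LOAD = 80

# name: (job_load, priority); a lower priority value is served first.
JOBS = {
    "JobA": (50, 70),
    "JobB": (50, 50),
    "JobC": (30, 60),
    "JobD": (30, 80),
}


def start_queued(running, queued):
    """Start queued jobs in priority order until the next one does not fit."""
    started = []
    for name in sorted(queued, key=lambda n: JOBS[n][1]):
        free = MAX_LOAD - sum(JOBS[r][0] for r in running)
        if JOBS[name][0] > free:
            break  # the highest-priority waiting job blocks the rest
        running.add(name)
        started.append(name)
    for name in started:
        queued.remove(name)
    return started


def simulate(finish_order):
    """Return [(finished_job, jobs_started_as_a_result), ...]."""
    running, queued = {"JobB", "JobC"}, {"JobA", "JobD"}
    timeline = []
    for done in finish_order:
        running.discard(done)
        timeline.append((done, start_queued(running, queued)))
    return timeline


print(simulate(["JobB", "JobC"]))  # [('JobB', ['JobA']), ('JobC', ['JobD'])]
print(simulate(["JobC", "JobB"]))  # [('JobC', []), ('JobB', ['JobA', 'JobD'])]
```

In the second run, 30 load units are free after JobC finishes, which would be enough for JobD, but JobD stays queued because the higher-priority JobA cannot start yet.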
Using a Virtual Machine as a Subset of a Real Machine
One variety of virtual machine can be considered a subset of a real machine. Typically,
you would use this type of virtual machine to construct an individual queue on a given
machine. One use for this construct might be to limit the number of jobs of a certain
type that run on a machine at any given time.
Example: Define a Virtual Machine as a Subset of a Real Machine
This example shows how to define a virtual machine that functions as a subset of a real
machine, thereby acting as a queue.
In this example, cheetah is a real machine with a max_load value of 80. If you create
three different print jobs, but you want only one job to run on a machine at a time, you
can use a combination of the max_load attribute for a virtual machine and the job_load
attributes for the jobs themselves to control how the jobs run.
To implement this scenario, you would first create the virtual machine named
cheetah_printQ as follows:
insert_machine: cheetah_printQ
machine: cheetah
max_load: 15
Next, you would define the three print jobs as follows:
insert_job: Print1
machine: cheetah_printQ
job_load: 15
priority: 1
insert_job: Print2
machine: cheetah_printQ
job_load: 15
priority: 1
insert_job: Print3
machine: cheetah_printQ
job_load: 15
priority: 2
Although the real machine cheetah has a max_load value of 80, meaning that all three
jobs (with their job_load values of 15) could run on it simultaneously, the virtual
machine cheetah_printQ effectively resets the real machine's max_load to 15. Because
each job is defined to run on cheetah_printQ, not cheetah, only one of the jobs can run
at a time because each job requires all of the load units available on the specified
machine.
Note: The load units associated with a virtual machine have no interaction with the load
units for the real machine. This example implies that the virtual load of 15 does not
subtract from the load units of 80 for the real machine. Load units are simply a
convention that lets the user restrict concurrent jobs running on any one machine.
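The independence of the two sets of load units can be illustrated with a small sketch. The class LoadPool is hypothetical bookkeeping, not a product interface; the max_load and job_load values come from the example above.

```python
class LoadPool:
    """Tracks load units for one machine definition (real or virtual)."""

    def __init__(self, max_load):
        self.max_load = max_load
        self.in_use = 0

    def try_start(self, job_load):
        """Reserve load units if they fit; return False if the job must wait."""
        if self.in_use + job_load > self.max_load:
            return False
        self.in_use += job_load
        return True

    def finish(self, job_load):
        """Release the load units held by a completed job."""
        self.in_use -= job_load


cheetah = LoadPool(max_load=80)         # the real machine
cheetah_printQ = LoadPool(max_load=15)  # the virtual machine acting as a queue

# Print1, Print2, and Print3 each have a job_load of 15 and target the queue.
results = [cheetah_printQ.try_start(15) for _ in ("Print1", "Print2", "Print3")]

print(results)         # [True, False, False]
print(cheetah.in_use)  # 0
```

Only the first print job starts; the other two wait. The real machine's own 80 load units are untouched, which matches the note that the virtual load does not subtract from the real machine's load units.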
Using a Virtual Machine to Combine Subsets of Real Machines
You can also define virtual machines to combine subsets (or slices) of real machines into
one virtual machine. You might do this, for example, if there are two machines that are
print servers and you want only one print job to run at a time on each.
Example: Define a Virtual Machine to Combine Subsets of Real Machines
This example defines a virtual machine (printQ) that uses subsets of the loads available
on two real machines to control where jobs run.
To implement this, you would create the virtual machine named printQ, and specify two
real machines (cheetah and camel), as shown in the following JIL statements:
insert_machine: printQ
type: v
machine: cheetah
max_load: 15
machine: camel
max_load: 15
When a job is ready to start on printQ, CA Workload Automation AE checks whether either
component real machine (cheetah or camel) has enough load units available to run the
job.
■ If neither machine has enough available load units, the product puts the job in
QUE_WAIT status and starts it when there are enough load units.
■ If only one machine has enough available load units, the product starts the job on
that machine.
■ If both machines have enough available load units, the product checks the usage on
each, and starts the job on the machine with the most available CPU resources.
User-Defined Load Balancing
As an alternative to using the load balancing methods that CA Workload Automation AE
provides, you can write your own programs or batch files to determine which machine to use
at run time. If you specify the name of a program or batch file as the value of the
machine attribute in the job definition, the scheduler runs that program or batch file at job
run time, and substitutes its output for the machine name.
If the machine returned by the script is offline, the product puts the job in PEND_MACH
status for that machine. When the missing machine returns to service, the pending job
runs on it regardless of whether the script would return a different machine name at
that point in time. Because a machine must be defined for the scheduler to run a job on
it, you must have previously defined the machine returned by the script to CA Workload
Automation AE.
Example: User-Defined Load Balancing
This example shows how you would specify a user-defined program or batch file in place
of a real or virtual machine for processing a job.
For example, you might supply the following:
insert_job: run_free
machine: `/usr/local/bin/pick_free_mach`
command: $HOME/DEL_STUFF
At run time, the script /usr/local/bin/pick_free_mach runs on the scheduler machine.
The standard output is substituted for the name of the machine, and the job runs on
that machine.
Important! The escape character in the machine value above is the backtick character
(`), not an apostrophe ('). You must enclose a program or batch file used as the machine
attribute value in backtick characters as shown for the scheduler to recognize that the
machine value specifies a script. The apostrophe and quotation mark characters do not
work in this case.
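The contents of pick_free_mach are not shown in this guide. As a rough idea of what such a program could look like, the following Python sketch prints the name of the least-loaded machine to standard output. Everything in it is an assumption for illustration: the machine names are borrowed from earlier examples, the load figures are placeholders, and a real script would have to obtain current readings by some site-specific means. Every name the script can print must already be defined to CA Workload Automation AE.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for /usr/local/bin/pick_free_mach."""
import sys


def pick_machine(loads):
    """Return the machine name with the lowest load reading."""
    return min(loads, key=loads.get)


def main():
    # Placeholder readings; replace with a real measurement for your site.
    loads = {"cheetah": 0.72, "camel": 0.31}
    # Only the machine name goes to standard output, because that output
    # is what replaces the machine attribute value.
    sys.stdout.write(pick_machine(loads) + "\n")


if __name__ == "__main__":
    main()
```

Keeping diagnostics off standard output (for example, by writing them to standard error) avoids mixing them into the machine name.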