Disaster recovery configurations
This topic describes agent configuration for a disaster recovery implementation and provides a disaster recovery scenario.
The disaster recovery approach is to connect agents to a load balancer server, which determines which server node communicates with each agent.
To set up agents to connect to the load balancer server:
Install the Deployment Automation server application on a number of servers. For details, see Install for disaster recovery.
These servers form a disaster recovery cluster behind a load balancer, so make sure that the External Agent URL and External User URL system settings point to the IP address of the load balancer.
To ensure that agents can communicate with each server node within the cluster, when you install an agent, point the agent to the load balancer server. The load balancer decides which server node communicates with the agent.
When installing agents, configure the agents to communicate through the load balancer server:
- In the agent installer Server Details, in Hostname or address, use the IP address of the load balancer server.
- In the agent installer Server Details, in Agent Communication Port, use the port number of the load balancer server.
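Before you run the agent installer, it can help to confirm that the agent machine can open a TCP connection to the load balancer address and port you plan to enter. The following Python snippet is a minimal sketch of such a check. The function name can_reach and the address and port in the usage comment are illustrative assumptions, not product values, and a successful TCP connection does not prove that the agent protocol itself works through the load balancer.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only tests network reachability; it does not speak the
    agent communication protocol.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example usage (hypothetical load balancer address and port):
#   can_reach("192.0.2.10", 7918)
```

If the check fails, verify the load balancer address, the port, and any firewall rules between the agent machine and the load balancer before installing the agent.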
Configure agent relay failover
To configure agent relay failover, specify two or more target servers for the agent relay to connect to. If a server fails, the agent relay switches to another server from the list.
When the agent relay starts, it connects to the source server defined in the agentrelay.jms_proxy.server_host parameter. If that server fails, the agent relay selects a failover server from the list. The agent relay then continues to use that failover server until it fails, even if the source server or a previously used server becomes available again.
Note: If a large number of agents are configured to use failover, reconnecting can take a while.
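The failover behavior described above can be modeled in a few lines of Python. This is an illustrative sketch only, not the agent relay's actual implementation. The class name, the is_up health probe, and the choice to try failover servers in list order are all assumptions; the documentation does not state how the relay picks a server from the list.

```python
from typing import Callable, List, Optional


class RelayFailoverSketch:
    """Toy model of sticky failover; not the product's implementation."""

    def __init__(self, source: str, failover: List[str],
                 is_up: Callable[[str], bool]) -> None:
        self.source = source            # models agentrelay.jms_proxy.server_host
        self.failover = list(failover)  # models the failover server list
        self.is_up = is_up              # hypothetical health probe
        self.current: Optional[str] = None

    def start(self) -> Optional[str]:
        # On startup the relay tries the source server first.
        if self.is_up(self.source):
            self.current = self.source
            return self.current
        self.current = None
        return self.ensure_connected()

    def ensure_connected(self) -> Optional[str]:
        # Sticky behavior: stay on the current server while it is healthy,
        # even if the source server has come back.
        if self.current is not None and self.is_up(self.current):
            return self.current
        failed = self.current
        # Assumption: candidates are tried in list order.
        for candidate in self.failover:
            if candidate != failed and self.is_up(candidate):
                self.current = candidate
                return candidate
        self.current = None
        return None
```

In this model, once the relay has moved to a failover server, a recovered source server is not used again unless it also appears in the failover list and the current server fails.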
To configure agent relay failover:
Open the agent relay properties file: <agentrelay_install_directory>\conf\agentrelay.properties
In the agentrelay.jms_proxy.failover_hosts_with_ports parameter, enter a list of failover server locations in this format:
<IP address or hostname>:<JMS port>
Separate each server definition with a comma.
Example: Two failover servers:
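As a hedged illustration, the following Python sketch builds the property line for two failover servers in the host:port, comma-separated format described above. The hostname, the IP address, the port 7918, and the helper name format_failover_hosts are placeholder assumptions; substitute the addresses and JMS ports of your own servers.

```python
from typing import Iterable, Tuple


def format_failover_hosts(servers: Iterable[Tuple[str, int]]) -> str:
    """Join (host, port) pairs into a comma-separated host:port list."""
    parts = []
    for host, port in servers:
        host = host.strip()
        if not host:
            raise ValueError("host must not be empty")
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port for {host}: {port}")
        parts.append(f"{host}:{port}")
    # Server definitions are separated with a comma.
    return ",".join(parts)


# Two hypothetical failover servers.
value = format_failover_hosts([
    ("dr-server-1.example.com", 7918),
    ("192.0.2.20", 7918),
])
print("agentrelay.jms_proxy.failover_hosts_with_ports=" + value)
# prints: agentrelay.jms_proxy.failover_hosts_with_ports=dr-server-1.example.com:7918,192.0.2.20:7918
```

The printed line can then be placed in agentrelay.properties.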
Disaster recovery with cold standby
To ensure server availability in case of system failure, a cold standby strategy is implemented for disaster recovery.
When the primary system fails, the cold standby is brought online and promoted to the primary server through the load balancer. When online, the standby reestablishes connections with all agents, performs recovery, and proceeds with any queued processes.
Because the most resource-intensive work is handed off to agents, avoid installing agents on the same hardware as the primary server.
When using the cold standby data center configuration, you typically configure the data tier with network storage and a clustered database. The service tier performs best when it is on a dedicated, stable, multicore machine with a fast connection to the data tier. Maintain a standby machine and keep it ready in case the primary server goes down.
The following diagram displays a typical cold standby data center configuration.
Figure: Disaster recovery with a cold standby system