Brokerage Firm Upgrades from Disaster Recovery to Automatic Failover and Recovery with Sizzling-Hot-Takeover (SZT)
Situation
A brokerage firm located in the Midwestern U.S. provides trading services and trading suggestions to customers through its brokers. It operates a home-grown application running on a redundant pair of HPE NonStop Servers using HPE Shadowbase Active/Passive Disaster Recovery for backup in the event of a primary node failure.
Problem
The firm’s application was mission-critical, yet it ran in a Disaster Recovery (DR) architecture. In a DR architecture (with any backup solution), the IT team can only begin the recovery process after an outage has occurred. Management would also need to authorize the failover, which slowed the application’s recovery and prolonged the outage. This delay increased the risk of an extended recovery period, costing the brokerage significant revenue while the application was offline. In addition, the recovery team would need to be assembled before the process could even start, potentially while team members were in remote locations and difficult to reach.
For these reasons, the brokerage needed to improve upon its DR architecture, removing the possibility of human error and enabling rapid recovery in the event of an outage.
Solution
- Implement HPE Shadowbase Sizzling-Hot-Takeover (SZT) to enable automatic failover and recovery to a known-working state.
- Periodically test the automatic failover and recovery process.
- Switch the polarity of the nodes to ensure the standby environment is available.
- Send test transactions to the backup node to ensure the data is properly replicating to the primary node.
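The periodic test described above can be illustrated with a small in-memory simulation. This is a minimal sketch, not Shadowbase code: `Node`, `link`, and `failover_drill` are hypothetical names, and replication is modeled as an immediate copy to the peer, whereas a real replication engine ships changes between systems.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for one server and its database."""
    name: str
    db: dict = field(default_factory=dict)
    peer: Optional["Node"] = field(default=None, repr=False, compare=False)

    def apply(self, key, value):
        """Process a transaction locally, then copy it to the peer."""
        self.db[key] = value
        if self.peer is not None:
            # Simplified stand-in for bi-directional replication.
            self.peer.db[key] = value


def link(a, b):
    """Connect two nodes so each replicates to the other."""
    a.peer, b.peer = b, a


def failover_drill(active, standby):
    """Swap roles, send a test transaction to the former backup,
    and check that it arrived on the former primary."""
    active, standby = standby, active        # switch polarity
    active.apply("TEST-ORDER", "drill")      # test transaction on the new active node
    replicated = standby.db.get("TEST-ORDER") == "drill"
    return active, standby, replicated
```

A drill would call `link(a, b)` once and then run `failover_drill(a, b)` on a schedule. The returned flag shows whether the test transaction reached the other node.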
In the “Final HPE Shadowbase Sizzling-Hot-Takeover Architecture” above, brokers are connected to a broker application that routes data to the “active” NonStop Server for processing security buy/sell orders. This NonStop Server is connected to another NonStop Server via Shadowbase bi-directional replication in an active/almost-active SZT architecture. The application on the “Standby” node is up and running, and waiting for requests (it is not processing live transactions). If the active node fails, the network will switch the requests to the hot Standby node, which will process the broker requests.
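The routing behavior in this paragraph can be sketched as follows. It is an illustration under stated assumptions: `AppNode`, `Router`, and the `healthy` flag are hypothetical, and in the case study the switch is performed by the network rather than by application code.

```python
class AppNode:
    """Hypothetical stand-in for the application on one server."""

    def __init__(self, name):
        self.name = name
        self.healthy = True  # assumed to be set by some health probe

    def handle(self, request):
        return f"{self.name} processed {request}"


class Router:
    """Sends every request to a single active node and swaps roles on failure."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def send(self, request):
        if not self.active.healthy:
            # The standby application is already running, so the takeover
            # amounts to redirecting traffic.
            self.active, self.standby = self.standby, self.active
        if not self.active.healthy:
            raise RuntimeError("no healthy node available")
        return self.active.handle(request)
```

Because only one node receives requests at any moment, the sketch also reflects why both nodes can keep their databases open for writes without competing updates.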
Additional Notes About this Architecture
- In this configuration, the Primary and Standby servers are identically (or similarly) configured, with active/almost-active bi-directional replication used to maintain synchronization between the two databases.
- In an HPE Shadowbase SZT architecture, the application on the Standby server is up and running, but is not processing any broker requests.
- Though the applications are active on both nodes, client interactions (broker requests) are only directed to the primary node.
- This is an important distinction between an HPE Shadowbase SZT architecture and a fully active/active architecture. In a fully active/active architecture, both nodes are actively processing broker requests.
- Data collisions are impossible, since broker requests are sent to only one node at a time in this architecture.
- Brokers will not notice whether their request is processed on the Primary or Standby server. Both servers are similarly configured, with all processes active at all times (the databases are opened for read/write access, external connections are enabled, etc.).
- To perform a zero downtime migration (ZDM) for an upgrade, the network router is switched so it sends broker requests to the Standby server. The Primary server is taken offline and replication to the offline server is stopped. The offline server is upgraded and restarted, and Shadowbase replication is then used to resynchronize the offline server’s database after it is brought back online in preparation for a switchover.
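The zero downtime migration sequence in the last note can be walked through in a short simulation. Everything here is a hypothetical sketch: `Site`, the `router` dictionary, and `zero_downtime_upgrade` are illustrative names, and the queue of changes stands in for whatever the replication engine holds while the upgraded server is offline.

```python
class Site:
    """Hypothetical stand-in for one server, its software version, and its database."""

    def __init__(self, name, version):
        self.name = name
        self.version = version
        self.online = True
        self.db = {}


def zero_downtime_upgrade(router, primary, standby, new_version, requests_during_upgrade):
    """Follow the steps from the note and report whether the databases match afterwards."""
    router["target"] = standby            # 1. redirect broker requests to the Standby
    primary.online = False                # 2. take the Primary offline; replication to it stops
    queued = []
    for key, value in requests_during_upgrade:
        standby.db[key] = value           # the Standby keeps processing requests
        queued.append((key, value))       # changes wait for the offline server
    primary.version = new_version         # 3. upgrade the offline server
    primary.online = True                 # 4. restart it
    for key, value in queued:
        primary.db[key] = value           # 5. resynchronize its database
    return primary.db == standby.db
```

The sketch stops once both databases agree. The switchover back to the upgraded server, if desired, is the same router change in the opposite direction.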
Outcomes
- Preserves application availability – This restores management’s confidence in the application and in its IT team’s ability to execute productively on projects, and it avoids disrupting incoming revenue from customers.
- Automatically fails over and recovers – If the primary node fails or is taken offline, routers switch the clients to the hot backup node within a sub-second time frame. This shifts failover and recovery from a manual procedure to a digital business process.
- Rapidly recovers with little to no user impact – Takeover is virtually instantaneous and imposes very little impact on user processing, since the application is already active (up and running and waiting for requests) on the backup node.
- Familiarizes operations staff with their replication engine – Periodic testing ensures the staff is well-versed with the replication engine, understands it, and can use it.
- Offloads querying/reporting from the Primary to the Standby server – This capability reduces the load on the Primary node and utilizes the capacity of the Standby server for productive work.
- Enables zero downtime migration – This architecture allows the firm to upgrade its application or database, or to make other changes that would normally require production downtime. This migration leverages the Shadowbase bi-directional replication capabilities to keep the databases synchronized, even if the data formats/schemas are changed.
- Reduces risk inherent in active/passive architectures – An SZT architecture allows the application to be up and running on the standby node, eliminating the risk that it (or the database) will not come up when the failover happens. This capability is analogous to “failing over to a known-working system” and reduces the potential for failover faults that can otherwise occur in an active/passive environment.
HPE Shadowbase Products of Interest
Contact us or your HPE Shadowbase representative, and learn how Shadowbase software will benefit you.