Sizzling-Hot-Takeover (SZT) Replication Systems

Not all applications require the same level of business continuity protection. Less critical applications and data can tolerate longer recovery times and larger amounts of lost data, while highly critical applications may not be able to tolerate any downtime or data loss. To satisfy this range of needs, the Shadowbase business continuity product suite supports both high availability and continuous availability solutions.

Two parameters are used to measure a business continuity solution: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO is the time taken to fail over and resume business services following an outage. RPO is the amount of data lost as a result of an outage. The closer these parameters are to zero (faster recovery, less lost data), the more effective the business continuity solution.
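As a rough illustration of how these two parameters relate to a single outage, the sketch below computes the observed downtime and data-loss window from three timestamps, which can then be compared against RTO and RPO targets. The function name, the timestamps, and the choice to express lost data as a time window (rather than a transaction count) are assumptions made for this example; they are not part of any Shadowbase interface.

```python
from datetime import datetime, timedelta


def measure_outage(last_replicated_commit, failure_time, service_restored):
    """Return (data_loss_window, downtime) for one outage.

    data_loss_window: time between the last change that reached the backup
                      and the failure (compare against the RPO target).
    downtime:         time between the failure and the resumption of
                      service (compare against the RTO target).
    """
    data_loss_window = failure_time - last_replicated_commit
    downtime = service_restored - failure_time
    return data_loss_window, downtime


# Hypothetical outage: replication was 200 ms behind, takeover took 3 s.
failure = datetime(2024, 1, 1, 12, 0, 0)
loss, down = measure_outage(
    last_replicated_commit=failure - timedelta(milliseconds=200),
    failure_time=failure,
    service_restored=failure + timedelta(seconds=3),
)
print(loss.total_seconds(), down.total_seconds())  # 0.2 3.0
```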

Figure 1 — The Shadowbase Business Continuity Continuum

While Shadowbase disaster recovery solutions support high availability, Shadowbase Sizzling-Hot-Takeover (SZT, also known as “Sizzling-Hot-Standby”) solutions support continuous availability (see Figure 1) and offer significant benefits.

Figure 2 — Sizzling-Hot-Takeover (SZT) with Reverse Replication

A Shadowbase SZT system is similar to an active/passive architecture using uni-directional replication, except that the backup system is immediately ready to start processing transactions (Figure 2). In this configuration, the backup system has its applications up and running, with the local copy of the application database already open for read/write access. It is essentially an active/active architecture in which all user transactions are directed to the primary node, which avoids the data collision issues that can arise in fully active/active systems. With Shadowbase data replication operating, the SZT system can take over processing very quickly because its local database is kept synchronized with the active database and is consistent and accurate. Because the applications are already running with the database open for read/write access, no time is spent starting applications or switching them from read-only database access.

When the active node fails, switching the users or their transactions to the standby node is all that is required for failover. The switch can be done in seconds or less, leading to very low RTOs (continuous availability). To operate in this mode, the replication engine must allow application processes to open the target database for read/write access while the replication engine is also writing to it. Shadowbase technology supports this; some other replication products do not.
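The idea that failover reduces to a routing change can be sketched with a toy model. Everything here (the `Node` and `Router` classes, the use of `ConnectionError` to signal a failed node) is hypothetical and exists only to illustrate the behavior described above; it is not how Shadowbase or any particular network layer implements the switch.

```python
class Node:
    """Toy application node whose applications are already running."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.handled = []

    def process(self, txn):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.handled.append(txn)
        return f"{self.name}:{txn}"


class Router:
    """Directs every user transaction to a single active node."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def submit(self, txn):
        try:
            return self.active.process(txn)
        except ConnectionError:
            # Failover is only a routing change: the standby's applications
            # are already running with the database open for read/write.
            self.active, self.standby = self.standby, self.active
            return self.active.process(txn)


primary, standby = Node("primary"), Node("standby")
router = Router(primary, standby)

first = router.submit("t1")   # handled by the primary
primary.up = False            # simulate an outage of the active node
second = router.submit("t2")  # rerouted to the standby

print(first, second)  # primary:t1 standby:t2
```

In this sketch the standby does no startup work at takeover time, which is the property that keeps the recovery time short.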

The Shadowbase SZT configuration has another big advantage over active/passive disaster recovery systems: the absence of failover faults. In active/passive systems, the standby system is not actively involved in the application. Testing a failover to the backup system requires taking an outage of the production system, something that many companies are unwilling to do. What if the backup system cannot be made operational, or recovery back to the primary system is unsuccessful? Consequently, the passive system in an active/passive configuration is often untested, and it is not known whether it is really operational. If a failover is then attempted to a nonfunctioning backup system, the application is down and may remain unavailable for an extended period. In a Shadowbase SZT configuration, however, it is known at all times that the backup node is working, because it can easily be exercised without an outage of the primary system by periodically submitting test or verification transactions to the application. Consequently, failover is guaranteed to be successful and can be automated, which is a requirement if very short RTOs are to be satisfied.
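The verification transactions mentioned above might look something like the following minimal sketch. The function names and the `"OK"` reply are assumptions for illustration only; a real check would submit an application-specific test transaction to the standby on a schedule and raise an alert on failure.

```python
def standby_is_healthy(submit_test_txn, expected_reply):
    """Return True if the standby answers a test transaction correctly.

    submit_test_txn is a hypothetical callable that sends one verification
    transaction to the standby application and returns its reply.
    """
    try:
        return submit_test_txn() == expected_reply
    except Exception:
        # Any error (timeout, refused connection, bad state) means the
        # standby cannot be trusted as a failover target right now.
        return False


def healthy_standby():
    return "OK"


def broken_standby():
    raise TimeoutError("no reply from standby application")


print(standby_is_healthy(healthy_standby, "OK"))  # True
print(standby_is_healthy(broken_standby, "OK"))   # False
```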

The Shadowbase SZT system is also configured with reverse replication up and running, so that after a takeover the passive (now active) system will have a backup once the failed primary system is recovered. This ensures continued availability protection after an outage of one system (Figure 2). With reverse replication enabled, Shadowbase replication on the backup (now active) node queues the changes that the node makes to its copy of the database. When the formerly active node is recovered, Shadowbase replication replays the queued updates to rapidly resynchronize the two databases. This step restores continuous availability protection in case the now-active node fails. To operate in this mode, the replication engine must support bi-directional replication. Shadowbase technology supports this; some other replication products do not.
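The queue-and-replay behavior of reverse replication can be modeled in a few lines. This is a toy sketch under stated assumptions: the `ReverseReplicator` class, the dictionary standing in for a database, and the key/value change format are all hypothetical and say nothing about how Shadowbase stores or applies queued changes.

```python
from collections import deque


class ReverseReplicator:
    """Queues changes made on the now-active node while its peer is down."""

    def __init__(self):
        self.queue = deque()

    def capture(self, key, value):
        # Record each change in commit order for later replay.
        self.queue.append((key, value))

    def replay(self, target_db):
        """Apply queued changes to the recovered node in order; return count."""
        applied = 0
        while self.queue:
            key, value = self.queue.popleft()
            target_db[key] = value
            applied += 1
        return applied


active_db = {"acct-1": 100}     # database on the now-active node
recovered_db = {"acct-1": 100}  # failed node's database as of the outage

replicator = ReverseReplicator()
for key, value in [("acct-1", 80), ("acct-2", 50)]:
    active_db[key] = value          # application update after takeover
    replicator.capture(key, value)  # queued because the peer is down

applied = replicator.replay(recovered_db)  # peer is back: resynchronize
print(applied, recovered_db == active_db)  # 2 True
```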

A Shadowbase SZT configuration can achieve a zero RPO if synchronous replication is used, or RPOs measured in tens or hundreds of milliseconds if asynchronous replication is used. In addition, RTOs measured in seconds or subseconds are possible, which is considerably better than is typically achievable using an active/passive architecture.

Shadowbase SZT configurations are suitable for applications that require continuous availability, but for which some small data loss is acceptable. A Shadowbase SZT configuration also offers the best solution for applications that cannot avoid or tolerate data collisions. Typical examples include telco applications (many call-related transactions worth pennies) and point-of-sale (POS) applications, whose transactions, like ATM transactions, generally have low value. Availability still matters, however: if a POS application goes down, retailers cannot service customers using credit or debit cards. Shadowbase SZT should be considered the absolute minimum business continuity architecture for any mission-critical application, since it is only a small step from an active/passive architecture and yields a significant improvement in RTO.

While a Shadowbase SZT architecture certainly offers continuous availability, there are other Shadowbase business continuity solutions which should also be considered, particularly a fully active/active architecture.
