By themselves, HPE Shadowbase ETL and SOLV can produce snapshot loads of all or part of a source database into a target environment. These loads are often called point-in-time loads because they are initiated at a particular point in time. When used in this mode, changes made to the source database while the load runs, or after the load completes, are not replicated to the target environment.
During a SOLV and Shadowbase ETL load, the target database is loaded in a consistent state in the sense that SOLV only reads and loads consistent source database data (i.e., unless configured to do so, it does not read through locked or otherwise dirty data). However, since the load takes time, if the source database is being updated while the load is occurring, the target will hold load-time consistent data: each item is consistent as of the time it was read, but the target as a whole may not match the source as of any single instant. When run in this mode, to ensure that the target matches the entire source after the load completes, updates to the source database must be stopped while the snapshot load is taken.
SOLV and Shadowbase ETL offer several data consistency levels. By default, the data being loaded is locally consistent (unlocked and not dirty). They also support a dirty load level, which reads through locks, and a file/table-level lock consistency mode, which locks the entire source file or table before the load begins. The file/table-level lock mode ensures that the point-in-time load of the target is consistent with the entire source when the load completes.
Shadowbase ETL and SOLV use the Shadowbase database replication engine for certain services. If the Shadowbase engine is being used to replicate the source file or table being loaded to the target, then Shadowbase ETL and SOLV can integrate with it to load the target database while database change replication is occurring. Shadowbase ETL and SOLV fully cooperate with the Shadowbase data replication engine to ensure that the data being replicated and the data being loaded are properly serialized when applied to the target database. Hence, while the load occurs, the Shadowbase engine keeps the part of the target database that has already been loaded synchronized with any subsequent changes that the application makes to the source database. This approach is particularly useful for high-volume and large database applications because database change replication is not suspended while the load takes place, as it is with some other data loading products. Instead, the queue of database changes is consumed while the load takes place, and the target is always kept current with the source’s database changes during the load. The target does not grow stale after the load completes because the Shadowbase data replication engine continues to keep the target database synchronized with the source database.
SOLV uses the Shadowbase replication engine’s powerful data transformation, data filtering, and data cleansing routines and scripts. Many data mappings are performed out-of-the-box by the Shadowbase data replication engine. More complex data mapping functions must be developed once within Shadowbase technology. The Shadowbase data replication engine, Shadowbase ETL, and SOLV can then all use these same functions. The routines and mappings only need to be written and tested once, since they serve both database change replication and the loading function. They do not have to be re-implemented in a separate custom ETL tool, script, or program.
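The write-once idea can be illustrated with a minimal sketch. It is not Shadowbase code: the names `map_customer`, `apply_load`, and `apply_change`, and the field names, are hypothetical and chosen only to show one mapping function serving both the load path and the change-replication path.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]

def map_customer(row: Row) -> Row:
    """Hypothetical mapping: cleanse and rename source fields for the target."""
    return {
        "customer_id": row["CUSTID"].strip(),
        "name": row["NAME"].strip().title(),
    }

def apply_load(rows: Iterable[Row], mapping: Callable[[Row], Row]) -> List[Row]:
    """Load path: map every row read from the source."""
    return [mapping(r) for r in rows]

def apply_change(event: Row, mapping: Callable[[Row], Row]) -> Row:
    """Change-replication path: map a single change event with the same function."""
    return mapping(event)

# The same mapping is reused by both paths, so it is written and tested once.
loaded = apply_load([{"CUSTID": " 42 ", "NAME": "ada lovelace"}], map_customer)
changed = apply_change({"CUSTID": "43", "NAME": "GRACE HOPPER "}, map_customer)
```

Because both paths call the same function, a fix to the mapping takes effect for loaded data and replicated changes alike.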
When used alongside Shadowbase data replication, SOLV integrates the loading of data with the source application’s online database changes so that the changes to be replicated are properly sequenced and applied to the target database at the same time as the data is being loaded. At the end of the load, the target database is current and there is no queue of database changes that accumulated during the load that needs to be replicated and applied; instead, these changes are consumed and applied during the load.
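One simple way to picture a load that is serialized with ongoing changes is the toy model below. It is an illustrative assumption, not a description of Shadowbase internals: the loader walks the keys in ascending order, and a change is replicated to the target only if its key lies in the range already loaded, since the loader will pick up the current value of any key it has not yet reached. The function name `load_with_replication` and the in-memory dictionaries are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A change is (key, new_value); a new_value of None means "delete".
Change = Tuple[int, Optional[str]]

def load_with_replication(source: Dict[int, str],
                          steps: List[List[Change]],
                          block_size: int = 2) -> Dict[int, str]:
    """Toy model: load `source` in key order while changes keep arriving."""
    target: Dict[int, str] = {}
    high_water = float("-inf")  # highest key the loader has copied so far
    loading = True
    step = 0
    while loading or step < len(steps):
        # 1. The application changes the source; replicate only changes
        #    that fall inside the already-loaded key range.
        for key, value in (steps[step] if step < len(steps) else []):
            if value is None:
                source.pop(key, None)
            else:
                source[key] = value
            if key <= high_water:
                if value is None:
                    target.pop(key, None)
                else:
                    target[key] = value
        step += 1
        # 2. The loader copies the next block of not-yet-loaded keys.
        if loading:
            block = sorted(k for k in source if k > high_water)[:block_size]
            if block:
                for k in block:
                    target[k] = source[k]
                high_water = block[-1]
            else:
                loading = False
                high_water = float("inf")  # load done: replicate everything
    return target

source = {1: "a", 2: "b", 3: "c", 4: "d"}
steps = [[(1, "a2"), (4, "d2")], [(2, None), (5, "e")], [(3, "c2")]]
target = load_with_replication(source, steps)
```

In this model the target matches the source when the load finishes, and no backlog of changes is left to apply afterward.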
SOLV leverages the Shadowbase technology that allows the source and target to be similar (such as like-to-like replication), or markedly different (such as loading a non-relational source file into a target SQL table or when loading a data warehouse). Once Shadowbase replication is configured, SOLV automatically uses this information to perform the load.
The Shadowbase ETL utility uses and extends the SOLV loading capabilities to allow for reading and injecting events from flat files into the Shadowbase replication engine for processing, as well as producing flat files of database data or database change events that can then be subsequently processed by an ETL tool.
Figure 1 depicts a typical HPE Shadowbase ETL scenario for producing flat files of your source data. In this figure, a source database is read, parsed, and formatted into a flat file, such as a typical fixed-position (FP) or comma-separated values (CSV) format, for a subsequent program, application, or ETL tool to process. More specifically, SOLV reads the source data of interest from the source database and sends that data to the Shadowbase replication engine to transform, filter, and process it (via the embedded Shadowbase ETL Toolkit) into the desired flat file target format. The flat file is produced by the Shadowbase user exit functionality using the Shadowbase ETL Toolkit, typically written as a consumptive user exit. A consumptive user exit receives the data changes, parses them into the desired output format, and writes them directly into the target flat file. Shadowbase data replication can also be active on the source data being processed, allowing database changes to be added to the output file. This application has proven popular, for instance, in telco environments to capture call detail record (CDR) data for loading into a downstream data warehouse system.
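To make the two flat-file formats concrete, the following sketch formats one record as a CSV line and as a fixed-position line. It uses only the Python standard library and does not use the Shadowbase ETL Toolkit or user exit API. The record fields, column widths, and the function names `to_csv_line` and `to_fixed_line` are assumptions for illustration.

```python
import csv
import io
from typing import Dict, List, Tuple

def to_csv_line(record: Dict[str, object], columns: List[str]) -> str:
    """Format one record as a CSV line; the csv module quotes fields as needed."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow([record[c] for c in columns])
    return buf.getvalue()

def to_fixed_line(record: Dict[str, object], layout: List[Tuple[str, int]]) -> str:
    """Format one record as fixed-position text: each field is truncated or
    space-padded on the right to its declared width."""
    return "".join(
        str(record[name])[:width].ljust(width) for name, width in layout
    ) + "\n"

# Hypothetical call detail record.
cdr = {"caller": "5551234", "callee": "5559876", "seconds": 73,
       "note": "roaming, intl"}

csv_line = to_csv_line(cdr, ["caller", "callee", "seconds", "note"])
fp_line = to_fixed_line(cdr, [("caller", 10), ("callee", 10), ("seconds", 6)])
```

The CSV form must quote the `note` field because it contains a comma, whereas the fixed-position form relies on an agreed column layout that the downstream reader has to share.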
For example, Figure 2 depicts a typical scenario where an application is actively modifying the source database, and SOLV is unloading that source data into a target flat file in CSV or fixed-position format along with any changes made to the source for the data of interest. The application change data is merged with the load data. In addition to generating flat files for output, HPE Shadowbase replication can also process flat files of input data and inject the events they contain into the replication engine, which then applies them to a target database or writes them to another target flat file.
Figure 3 depicts a typical HPE Shadowbase ETL scenario for injecting ETL data into the replication engine. In this figure, the HPE Shadowbase File Chaser reads data to inject into the Shadowbase replication engine from a variety of formats, including a flat file, an application change data log file, a middleware queue, or another application-generated file. Shadowbase File Chaser (or, in some cases, a user exit built directly into the Shadowbase replication engine) reads the events as they are generated and injects them into the replication stream for subsequent processing by the Shadowbase replication engine.
The HPE Shadowbase File Chaser is a component of the SOLV loading product that can monitor source objects (flat files, application log files, queues, etc.) and chase the data insertions into these objects, typically at EOF (end of file). The Shadowbase File Chaser will read and prepare that data and inject it into the Shadowbase data replication engine for processing.
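The general idea of chasing insertions at EOF can be sketched as follows. This is a generic illustration using the Python standard library, not the File Chaser implementation. The function name `chase`, the newline-delimited record format, and the file name are assumptions.

```python
import os
import tempfile
from typing import List, Tuple

def chase(path: str, offset: int) -> Tuple[List[str], int]:
    """Return the complete records appended since `offset`, plus the new offset.

    A trailing partial record (no newline yet) is left for the next call,
    so a half-written event is not passed downstream.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n") + 1  # 0 when no complete record is available
    records = data[:end].decode("utf-8").splitlines()
    return records, offset + end

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "events.log")
    # The writer has appended two full records and part of a third.
    with open(path, "wb") as f:
        f.write(b"insert,1\nupdate,1\ninse")
    first, pos = chase(path, 0)
    # The writer finishes the third record; the chaser resumes where it left off.
    with open(path, "ab") as f:
        f.write(b"rt,2\n")
    second, pos = chase(path, pos)
```

Tracking the byte offset between calls is what lets the reader resume at the previous end of file without rereading or skipping records.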
Shadowbase ETL environments typically have very customer-specific requirements for data conversions and filtering/processing, and usually require developing these conversion functions in the bundled Shadowbase user exit customization capability. Services are available from Gravic and HPE to help users implement such custom data transformations.
The Shadowbase ETL and SOLV products use patented technology, with additional patents pending. (See U.S. Patent Numbers 6,745,209, 7,003,531, and 7,321,904 at www.uspto.gov.)