Replication Guide and Reference
Source data most likely has to be subsetted, transformed, or enhanced as
part of the replication process, particularly in the support of decision
support or data warehousing. This section sorts these requirements into those
that are easily fulfilled by using the Control Center and those that require
direct manipulation of the control tables.
The Control Center supports all of the following basic capabilities:
- User copy target tables - These are the most common target
tables, and are copies of the replication source without an overhead
time-stamp column. These tables require a primary key.
- Point-in-time target tables - These are plain copies of source
tables with timestamp columns added to specify when updates were made. These
tables require a primary key.
- Base aggregate target tables - These tables are used if you
need to summarize your source table contents on a regular basis. For example,
they could be used to track a daily inventory of items from a source table.
- Change aggregate target tables - These tables are used if you
need to summarize the results of the changes made between each Apply program
refresh operation. They could be used on their own, where daily summarizations
of change events are required, or as input to an application that must apply
the summarizations to a cumulative table.
- Noncondensed CCD target tables - These tables are used for
history tables or audit trail tables. They can contain changes only, or they
can include the initial state of the source table.
- Condensed CCD target tables - These target tables can be used
as an alternative to a point-in-time copy when it is important to keep the
last state of any record, (that is, no records are to be deleted). A delete is
processed as an update, with the IBMSNAP_OPERATION column acting as the
logical delete flag.
- Updateable replica target tables - These target tables can be
used as sources for copying data back to the replication source table and are
used in update-anywhere scenarios.
- Column subsetting - If you want only a subset of the source
columns to be copied, use the Control Center. All target table types support
vertical fragmentation.
- Row subsetting - You can split a single source table into
various target tables based on the source column contents; for example,
department number or region code; or you specify certain source rows while
defining replication subscriptions. All target table types support row
subsetting. You might need to define a view replication source, in order to
bring together all the columns needed to define the row subset.
- Before and after images - Both before and after images can be
defined in replication sources and subscriptions. Before images do not make
sense with a base aggregate target table type. All other target table types
can make use of before-image columns.
- Column renaming - Source site column names can be renamed for
point-in-time and user-copy target table types.
- Computed columns - New columns can be derived from the existing
source columns on the basis of SQL expressions. New columns can be aggregate
functions such as COUNT or SUM in the case of aggregate target table types, or
simple derivations in the case of all other target table types.
User-defined functions for DB2 Universal Database can be specified by using
the computed column facility in the Control Center.
- Triggers - On DB2 Universal Database, triggers can be defined
on the target table such that row updates on the target can trigger other
processing, such as data transformations.
- Before and after run-time processing - Run-time statements
perform special processing just before or after changes are replicated. For
example, data transformation can be applied to source data before replicating
the data or to target data after it is copied.
Depending on the DB2 platform, user processing involving logic might be
invoked through the SQL before and after processing:
- Calls to stored procedures can be made through a before or after
processing statement, or triggers can be used to do more complex
transformations.
- On DB2 Universal Database, before or after SQL statements can invoke
user-defined functions through the use of triggers.
The transformations listed below require manipulation of the IBM
Replication control tables outside of the Control Center. The techniques
required to implement these transformations are not discussed in this book.
For planning purposes, when you find that you will use many of these
techniques in the replication scenario, you should allow additional
implementation time.
- Data consolidation - Union scenarios can be used to consolidate
identically structured data into a common consolidated target table.
- Outer join of source tables - Join scenarios can be used when
one or more of the sources are subject to change. An outer join can be
implemented with multiple subscriptions to a single table.
- Poorly structured data - Many older applications often have
poorly structured data; for example, conflicting sources for common data
items, no primary keys defined, poorly normalized tables, and heavily encoded
data values.
Although IBM Replication might be able to handle these individually, there
are some extreme situations where it is not useful to attempt replication
before some manual data cleansing is done.
General rule: - If the data transformation can be
expressed by using SQL, IBM Replication will almost certainly be able to
execute it. The SQL is introduced either as part of the SELECT statement that
does the copying or as part of the user-specified SQL before and after
statements. This SQL might include procedural logic by including database
triggers, user-defined functions, and stored procedure calls as described in "Basic Data Enhancement".
IBM Replication can also be supplemented with procedural code in
user-written programs. For example, a program could be scheduled to manipulate
data in the CD or CCD tables before execution of the Apply program. Or, if the
changes are so radical that the data cannot be changed in place, you could go
as far as writing a program to read the data changes from the CD or CCD tables
and then applying them to the target tables yourself. (See "Defining SQL Statements or CALL Procedures for the Replication Subscription" to learn about running SQL and Call procedures with replication
subscriptions.)
[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]
[ DB2 List of Books |
Search the DB2 Books ]