Oracle Data Integrator E-L-T mechanism: Introduction to Declarative Design

To design an integration process with conventional ETL systems, a developer needs to design each step of the process. Consider, for example, a common case in which sales figures must be summed over time for different customer age groups. The sales data comes from a sales management database, and age groups are described in an age distribution file. To combine these sources and then insert and update the appropriate records in the customer statistics system, you must design each step, which includes:

Load the customer sales data in the engine

Load the age distribution file in the engine

Perform a lookup between the customer sales data and the age distribution data

Aggregate the customer sales grouped by age distribution

Load the target sales statistics data into the engine

Determine what needs to be inserted or updated by comparing aggregated information with the data from the statistics system

Insert new records into the target

Update existing records in the target
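The steps listed above can be sketched as hand-written procedural code. This is only an illustration of the conventional approach, not how any particular ETL engine works: the in-memory lists and the names `sales`, `age_groups`, `target`, `lookup_group`, and `run_etl` are all hypothetical stand-ins for the sales database, the age distribution file, and the statistics system.

```python
# Hypothetical in-memory stand-ins for the three systems in the example.
sales = [  # customer sales rows, as loaded from the sales management database
    {"customer": "A", "age": 23, "amount": 100.0},
    {"customer": "B", "age": 37, "amount": 250.0},
    {"customer": "C", "age": 29, "amount": 50.0},
]
age_groups = [  # rows loaded from the age distribution file
    {"group": "18-29", "min": 18, "max": 29},
    {"group": "30-49", "min": 30, "max": 49},
]
target = {"18-29": 120.0}  # existing statistics: age group -> total sales


def lookup_group(age):
    """Lookup step: find the age group that contains one customer age."""
    for g in age_groups:
        if g["min"] <= age <= g["max"]:
            return g["group"]
    return None


def run_etl(sales, target):
    # Aggregate the customer sales grouped by age distribution.
    totals = {}
    for row in sales:
        group = lookup_group(row["age"])
        if group is not None:
            totals[group] = totals.get(group, 0.0) + row["amount"]
    # Compare the aggregates with the data already in the target.
    to_insert = {g: t for g, t in totals.items() if g not in target}
    to_update = {g: t for g, t in totals.items() if g in target}
    # Insert new records, then update existing ones.
    target.update(to_insert)
    target.update(to_update)
    return to_insert, to_update


inserted, updated = run_etl(sales, target)
```

Every step, including the generic insert/update bookkeeping, is spelled out by hand, so the business logic (lookup and aggregation) is mixed in with the mechanics of loading and writing.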

This method requires specialized skills, depending on the steps that need to be designed. It also requires significant development effort, because even a repetitive succession of tasks, such as managing inserts and updates in a target, must be developed anew for each task. Finally, with this method, maintenance requires significant effort. Changing the integration process requires a clear understanding of what the process does as well as knowledge of how it is done. With the conventional ETL method of design, the logical and technical aspects of the integration are intertwined.

Declarative Design is a design method that focuses on “What” to do (the Declarative Rules) rather than “How” to do it (the Process). In our example, “What” the process does is:

Relate the customer age from the sales application to the age groups from the statistical file

Aggregate customer sales by age groups to load sales statistics

“How” this is done, that is, the underlying technical aspects or technical strategies for performing this integration task (such as creating temporary data structures or calling loaders), is clearly separated from the declarative rules.
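As a rough illustration of the “What”, the two rules for the sales example can be written as a single SQL statement. The sketch below uses Python's built-in `sqlite3` module purely as a convenient SQL engine. The table and column names (`sales`, `age_groups`, `grp`, `age_min`, `age_max`) are assumptions made for this example, and the statement is not code generated by Oracle Data Integrator.

```python
import sqlite3

# In-memory database with hypothetical source tables and sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (customer TEXT, age INTEGER, amount REAL);
CREATE TABLE age_groups (grp TEXT, age_min INTEGER, age_max INTEGER);
INSERT INTO sales VALUES ('A', 23, 100.0), ('B', 37, 250.0), ('C', 29, 50.0);
INSERT INTO age_groups VALUES ('18-29', 18, 29), ('30-49', 30, 49);
""")

# The declarative rules only: relate ages to age groups, then aggregate.
# Nothing here says where the data is staged or how the target is loaded.
RULES = """
SELECT g.grp AS age_group, SUM(s.amount) AS total_sales
FROM sales s
JOIN age_groups g ON s.age BETWEEN g.age_min AND g.age_max
GROUP BY g.grp
ORDER BY g.grp
"""
stats = conn.execute(RULES).fetchall()
```

The statement names the sources, the relationship between them, and the aggregation. Staging, loading, and the insert/update strategy are left to whatever executes it.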

Declarative Design in Oracle Data Integrator uses the well-known relational paradigm to declare, in the form of an Interface, the declarative rules for a data integration task, which include the designation of sources, targets, and transformations.

Declarative rules often apply to metadata to transform data and are usually described in natural language by business users. In a typical data integration project (such as a Data Warehouse project), these rules are defined during the specification phase in documents written by business analysts in conjunction with project managers. They can very often be implemented using SQL expressions, provided that the metadata they refer to is known and qualified in a metadata repository.

The four major types of Declarative Rules are mappings, joins, filters and constraints:

A mapping is a business rule implemented as an SQL expression. It is a transformation rule that maps source columns (or fields) onto one of the target columns. It can be executed by a relational database server at run-time. This server can be the source server (when possible), a middle tier server or the target server.

A join operation links records in several data sets, such as tables or files. Joins are used to link multiple sources. A join is implemented as an SQL expression linking the columns (fields) of two or more data sets. Joins can be defined regardless of the physical location of the source data sets involved. For example, a JMS queue can be joined to an Oracle table. Depending on the technology performing the join, it can be expressed as an inner join, right outer join, left outer join, or full outer join.

A filter is an expression applied to the columns of source data sets. Only the records matching this filter are processed by the data flow.

A constraint is an object that defines the rules enforced on data sets' data. A constraint ensures the validity of the data in a given data set and the integrity of the data of a model. Constraints on the target are used to check the validity of the data before integration in the target.
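The four rule types can be seen side by side in a small SQL sketch, again run through Python's `sqlite3` module. All table names, column names, and sample rows are hypothetical. The `CHECK` clause is used here only as a simple analogy for a target constraint and does not reproduce how Oracle Data Integrator checks data before integration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, first_name TEXT, last_name TEXT, age INTEGER);
CREATE TABLE orders (customer_id INTEGER, amount REAL, status TEXT);
-- Constraint: the target only accepts non-negative totals.
CREATE TABLE customer_totals (
    customer_name TEXT PRIMARY KEY,
    total REAL CHECK (total >= 0)
);
INSERT INTO customers VALUES (1, 'Ann', 'Lee', 23), (2, 'Bob', 'Ray', 37);
INSERT INTO orders VALUES (1, 100.0, 'CLOSED'), (1, 40.0, 'OPEN'), (2, 250.0, 'CLOSED');
""")

conn.execute("""
INSERT INTO customer_totals (customer_name, total)
SELECT UPPER(c.last_name) || ', ' || c.first_name,   -- mapping
       SUM(o.amount)                                  -- mapping
FROM customers c
JOIN orders o ON o.customer_id = c.id                 -- join
WHERE o.status = 'CLOSED'                             -- filter
GROUP BY c.id, c.last_name, c.first_name
""")
rows = conn.execute(
    "SELECT customer_name, total FROM customer_totals ORDER BY customer_name"
).fetchall()

# A row that violates the target constraint is refused.
try:
    conn.execute("INSERT INTO customer_totals VALUES ('BAD, Row', -5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Each rule is a plain SQL expression over known columns, which is why such rules can be declared once the metadata they refer to is available.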

Table: Examples of declarative rules gives examples of declarative rules.
