Sunday, March 30, 2008

informatica Transformation Types

INFORMATICA

Source Qualifier Transformation

Transformation type: Active (Connected)

Join data originating from the same source database. You can join two or more tables with primary key-foreign key relationships by linking the sources to one Source Qualifier transformation.

Filter rows when the Power Center Server reads source data. If you include a filter condition, the Power Center Server adds a WHERE clause to the default query.

Specify an outer join rather than the default inner join. If you include a user-defined join, the Power Center Server replaces the join information specified by the metadata in the SQL query.

Specify sorted ports. If you specify a number for sorted ports, the Power Center Server adds an ORDER BY clause to the default SQL query.

Select only distinct values from the source. If you choose Select Distinct, the Power Center Server adds a SELECT DISTINCT statement to the default SQL query.

* You can use mapping parameters and variables in the SQL query, user-defined join, and source filter of a Source Qualifier transformation

U can used a mapping variable or parameters in a source filter

* Tip: You must connect the columns in the Source Qualifier transformation to another transformation or target before you can generate the default query.

* Heterogeneous Joins

To perform a heterogeneous join, use the Joiner transformation. Use the Joiner transformation when you need to join the following types of sources:

  • Join data from different source databases
  • Join data from different flat file systems
  • Join relational sources and flat files

The Power Center Server supports two kinds of outer joins:

  • Left. Power Center Server returns all rows for the table to the left of the join syntax and the rows from both tables that meet the join condition.
  • Right. Power Center Server returns all rows for the table to the right of the join syntax and the rows from both tables that meet the join condition.

E.G 1:

SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.COMM, EMP.DEPTNO, DEPT.LOC ,DEPT.DNAME

FROM

EMP RIGHT OUTER JOIN DEPT

ON

DEPT.DEPTNO=EMP.DEPTNO

ORDER BY EMP.EMPNO

U CAN USE EMP INNER JOIN DEPT ON EMP.DEPTNO=DEPT.DEPTNO

E.G 2: $$P_SOURCE_DNAME=NAGPUR ( IF STRING GIVE SINGLE QUOTES)

SELECT EMP.EMPNO, EMP.ENAME, EMP.JOB, EMP.SAL, EMP.COMM, EMP.DEPTNO, DEPT.LOC ,DEPT.DNAME

FROM

EMP INNER JOIN DEPT

ON

DEPT.DEPTNO=EMP.DEPTNO AND DEPT.DNAME='$$P_SOURCE_DNAME'

ORDER BY

EMP.EMPNO

Joiner Transformation

Transformation type:

Connected

Active


To perform a heterogeneous join, use the Joiner transformation.

You can use the Joiner transformation to join source data from two related heterogeneous sources residing in different locations or file systems. Or, you can join data from the same source.

The Joiner transformation joins two sources with at least one matching port. The Joiner transformation uses a condition that matches one or more pairs of ports between the two sources.

If you need to join more than two sources, you can add more Joiner transformations to the mapping.

In the following example, the Aggregator transformation and the Source Qualifier transformation are the input transformations for the Joiner transformation.

Figure 7-1 shows the Joiner transformation joining two pipelines:

Figure 7-1. Sample Mapping with a Joiner Transformation

The Joiner transformation accepts input from most transformations. However, there are some limitations on the pipelines you connect to the Joiner transformation. You cannot use a Joiner transformation in the following situations:

  • Either input pipeline contains an Update Strategy transformation.
  • You connect a Sequence Generator transformation directly before the Joiner transformation.

· We can use “AND” condition in one joiner but not “OR” condition

· E.g.: order_id=order_id1 AND Employee_id=employees_id1

If you join Char and Varchar data types, the Power Center Server counts any spaces that pad Char values as part of the string. So if you try to join the following:

Char (40) = “abcd”

Varchar (40) = “abcd”

Then the Char value is “abcd” padded with 36 blank spaces, and the Power Center Server does not join the two fields because the Char field contains trailing spaces.

U can join data from multiple sources or same sources.

You can also join same source data by creating a second instance of the source.

Lookup Transformation Overview

Transformation type:

Passive

Connected/Unconnected


Use a Lookup transformation in a mapping to look up data in a flat file or a relational table, view, or synonym. You can import a lookup definition from any flat file or relational database to which both the PowerCenter Client and Server can connect.

You can use multiple Lookup transformations in a mapping.

You can use the Lookup transformation to perform many tasks, including:

  • Get a related value. For example, your source includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
  • Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
  • Update slowly changing dimension tables. You can use a Lookup transformation to determine whether rows already exist in the target
  • If you cache the lookup, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Power Center Server inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.

Note: If you use a flat file lookup, you must use a static cache.

# What happens when the Lookup transformation finds multiple rows that match the lookup condition. You can select the first or last row returned from the cache or lookup source, or report an error.

Dynamic Cache

If you configure a Lookup transformation to use a dynamic cache, you can only use the equality operator (=) in the lookup condition.

If the ORDER BY statement does not contain the condition ports in the same order they appear in the Lookup condition, the session fails with the following error message:

CMN_1701 Error: Data for Lookup [] fetched from the database is not sorted on the condition ports. Please check your Lookup SQL override

Difference

Table 9-1. Lookup Caching Comparison

Uncached

Static Cache

Dynamic Cache

You cannot insert or update the cache.

You cannot insert or update the cache.

You can insert or update rows in the cache as you pass rows to the target.

You cannot use a flat file lookup.

You can use a relational or a flat file lookup.

You can use a relational lookup only.

Table 8-1. Differences Between Connected and Unconnected Lookups

Connected Lookup

Unconnected Lookup

Receives input values directly from the pipeline.

Receives input values from the result of a :LKP expression in another transformation.

You can use a dynamic or static cache.

You can use a static cache.

Can return multiple columns

Designate one return port (R). Returns one column from each row.

If there is no match for the lookup condition, the PowerCenter Server returns the default value for all output ports.

If there is no match for the lookup condition, the PowerCenter Server returns NULL.

Supports user-defined default values.

Does not support user-defined default values.

Union Transformation Overview

Transformation type:

Connected

Active


The Union transformation is a multiple input group transformation that you can use to merge data from multiple pipelines into one pipeline branch. Using the Union transformation to merge data from multiple sources is similar to using the UNION ALL SQL statement to combine the results from two or more SQL statements. Similar to the UNION ALL statement, the Union transformation does not remove duplicate rows.

You can connect heterogeneous sources to a Union transformation. The Union transformation merges sources with matching ports and outputs the data from one output group with the same ports as the input groups.

  • You can create multiple input groups, but only one output group.
  • All input groups and the output group must have matching ports. The precision, datatype, and scale must be identical across all groups.
  • The Union transformation does not remove duplicate rows. To remove duplicate rows, you must add another transformation such as a Router or Filter transformation.
  • You cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.

When you add a Union transformation to a mapping, you must verify that you connect the same ports in all input groups. If you connect all ports in one input group, but do not connect a port in another input group, the Power Center Server passes Nulls to the unconnected port.

Update Strategy Transformation Overview

Transformation type:

Active

Connected


Update Strategy is used to maintain all the historic data or just the most recent changes.

In Power Center, you set your update strategy at two different levels:

  • Within a session. When you configure a session, you can instruct the Power Center Server to either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.
  • Within a mapping. Within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

Table 19-1. Constants for Each Database Operation

Operation

Constant

Numeric Value

Insert

DD_INSERT

0

Update

DD_UPDATE

1

Delete

DD_DELETE

2

Reject

DD_REJECT

3

Forwarding Rejected Rows

You can configure the Update Strategy transformation to either pass rejected rows to the next transformation or drop them. By default, the PowerCenter Server forwards rejected rows to the next transformation

Aggregator and Update Strategy Transformations

When you connect Aggregator and Update Strategy transformations as part of the same pipeline, you have the following options:

· Position the Aggregator before the Update Strategy transformation. In this case, you perform the aggregate calculation, and then use the Update Strategy transformation to flag rows that contain the results of this calculation for insert, delete, or update.

· Position the Aggregator after the Update Strategy transformation. Here, you flag rows for insert, delete, update, or reject before you perform the aggregate calculation. How you flag a particular row determines how the Aggregator transformation treats any values in that row used in the calculation. For example, if you flag a row for delete and then later use the row to calculate the sum, the PowerCenter Server subtracts the value appearing in this row. If the row had been flagged for insert, the PowerCenter Server would add its value to the sum.

Rank Transformation Overview

Transformation type:

Active

Connected


The Rank transformation allows you to select only the top or bottom rank of data. You can use a Rank transformation to return the largest or smallest numeric value in a port or group. You can also use a Rank transformation to return the strings at the top or the bottom of a session sort order

The Rank transformation differs from the transformation functions MAX and MIN, in that it allows you to select a group of top or bottom values, not just one value.

As an active transformation, the Rank transformation might change the number of rows passed through it. You might pass 100 rows to the Rank transformation, but select to rank only the top 10 rows, which pass from the Rank transformation to another transformation.

You can connect ports from only one transformation to the Rank transformation. The Rank transformation allows you to create local variables and write non-aggregate expressions.

The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.

Like the Aggregator transformation, the Rank transformation allows you to group information.

Sorter Transformation Overview

Transformation type:

Connected

Active


The Sorter transformation allows you to sort data. You can sort data in ascending or descending order according to a specified sort key. You can also configure the Sorter transformation for case-sensitive sorting, and specify whether the output rows should be distinct. The Sorter transformation is an active transformation. It must be connected to the data flow.

You can sort data from relational or flat file sources. You can also use the Sorter transformation to sort data passing through an Aggregator transformation configured to use sorted input.

When you create a Sorter transformation in a mapping, you specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order.

Distinct Output Rows

You can configure the Sorter transformation to treat output rows as distinct. If you configure the Sorter transformation for distinct output rows, the Mapping Designer configures all ports as part of the sort key. When the PowerCenter Server runs the session, it discards duplicate rows compared during the sort operation.

Router Transformation Overview

Transformation type:

Connected

Active


A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.

If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task. The Router transformation is more efficient.

Likewise, when you use a Router transformation in a mapping, the PowerCenter Server processes the incoming data only once. When you use multiple Filter transformations in a mapping, the PowerCenter Server processes the incoming data for each transformation.

A Router transformation has the following types of groups:

  • Input
  • Output

Input Group

The Designer copies property information from the input ports of the input group to create a set of output ports for each output group.

Output Groups

There are two types of output groups:

  • User-defined groups
  • Default group

You cannot modify or delete output ports or their properties.

There is no group filter condition associated with the default group

User-Defined Groups

You create a user-defined group to test a condition based on incoming data. A user-defined group consists of output ports and a group filter condition.

When you connect transformations to a Router transformation in a mapping, consider the following rules:

  • You can connect one group to one transformation or target.
  • You can connect one output port in a group to multiple transformations or targets.
  • You can connect multiple output ports in one group to multiple transformations or targets.
  • You cannot connect more than one group to one target or a single input group transformation.
  • You can connect more than one group to a multiple input group transformation, except for Joiner transformations, when you connect each output group to a different input group.

Sequence Generator Transformation Overview

Transformation type:

Passive

Connected


The Sequence Generator transformation generates numeric values. You can use the Sequence Generator to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

The Sequence Generator transformation is a connected transformation. It contains two output ports that you can connect to one or more transformations.

When NEXTVAL is connected to the input port of another transformation, the PowerCenter Server generates a sequence of numbers. When CURRVAL is connected to the input port of another transformation, the PowerCenter Server generates the NEXTVAL value plus one.

You can make a Sequence Generator reusable, and use it in multiple mappings. You might reuse a Sequence Generator when you perform multiple loads to a single target.

You can perform the following tasks with a Sequence Generator transformation:

  • Create keys.
  • Replace missing values.
  • Cycle through a sequential range of numbers.

Replacing Missing Values

Use the Sequence Generator transformation to replace missing keys by using NEXTVAL with the IIF and ISNULL functions.

To replace null values in the ORDER_NO column, for example, you create a Sequence Generator transformation with the desired properties and drag the NEXTVAL port to an Expression transformation. In the Expression transformation, drag the ORDER_NO port into the transformation (along with any other necessary ports). Then create a new output port, ALL_ORDERS.

In ALL_ORDERS, you can then enter the following expression to replace null orders:

IIF( ISNULL( ORDER_NO ), NEXTVAL, ORDER_NO )

For example, you might connect NEXTVAL to two target tables in a mapping to generate unique primary key values. The PowerCenter Server creates a column of unique primary key values for each target table.

Figure 14-1 illustrates connecting NEXTVAL to two target tables in a mapping:

Figure 14-1. Connecting NEXTVAL to Two Target Tables in a Mapping

For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the PowerCenter Server generates the following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables:

T_ORDERS_PRIMARY TABLE:

PRIMARY KEY

T_ORDERS_FOREIGN TABLE: PRIMARY KEY

1

2

3

4

5

6

7

8

9

10

If you want the same generated value to go to more than one target that receives data from a single transformation then consider following example

Figure 14-2. Mapping With a Sequence Generator and an Expression Transformation

For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the PowerCenter Server generates the following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables:

T_ORDERS_PRIMARY TABLE:

PRIMARY KEY

T_ORDERS_FOREIGN TABLE:

PRIMARY KEY

1

1

2

2

3

3

4

4

5

5

CURRVAL

CURRVAL is the NEXTVAL value plus one or NEXTVAL plus the Increment By value. You typically only connect the CURRVAL port when the NEXTVAL port is already connected to a downstream transformation. When a row enters the transformation connected to the CURRVAL port, the PowerCenter Server passes the last-created NEXTVAL value plus one.

Figure 14-3 illustrates connecting CURRVAL and NEXTVAL ports to a target:

Figure 14-3. Connecting CURRVAL and NEXTVAL Ports to a Target

For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the PowerCenter Server generates the following values for NEXTVAL and CURRVAL:

NEXTVAL

CURRVAL

1

2

2

3

3

4

4

5

5

6

If you connect the CURRVAL port without connecting the NEXTVAL port, the PowerCenter Server passes a constant value for each row.

Figure 14-4 illustrates connecting only the CURRVAL port to a target:

Figure 14-4. Connecting Only the CURRVAL Port to a Target

For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the PowerCenter Server generates the following constant values for CURRVAL:

CURRVAL

1

1

1

1

1

If you want to use the Sequence Generator transformation to cycle through a series of values, the current value must be greater than or equal to Start Value and less than the end value.

At the end of each session, the PowerCenter Server updates the current value to the last value generated for the session plus one if the Sequence Generator Number of Cached Values is 0.

For example, if the PowerCenter Server ends a session with a generated value of 101, it updates the Sequence Generator current value to 102 in the repository. The next time the Sequence Generator is used, the PowerCenter Server uses 102 as the basis for the next generated value. If the Sequence Generator Increment By is 1, when the PowerCenter Server starts another session using the Sequence Generator, the first generated value is 102.

Mapplets Overview


A mapplet is a reusable object that you create in the Mapplet Designer. It contains a set of transformations and allows you to reuse that transformation logic in multiple mappings.

Mapplet Input

Mapplet input can originate from a source definition and/or from an Input transformation in the mapplet. You can create multiple pipelines in a mapplet. You can use multiple source definitions and source qualifiers or Input transformations. You can also use a combination of source definitions and Input transformations.

Unconnected ports in mapplets do not display in the Mapping Designer.

If you use a Sequence Generator transformation, you must use a reusable Sequence Generator transformation

You cannot include the following objects in a mapplet:

    • Normalizer transformations
    • COBOL sources
    • XML Source Qualifier transformations
    • XML sources
    • Target definitions
    • Other mapplets

To maintain valid mappings, use the following rules when you edit a mapplet that is used by mappings:

  • Do not delete a port from the mapplet. The Designer deletes mapplet ports in the mapping when you delete links to an Input or Output transformation or when you delete ports connected to an Input or Output transformation.
  • Do not change the datatype, precision, or scale of a mapplet port. The datatype, precision, and scale of a mapplet port is defined by the transformation port to which it is connected in the mapplet. Therefore, if you edit a mapplet to change the datatype, precision, or scale of a port connected to a port in an Input or Output transformation, you change the mapplet port itself.
  • Do not change the mapplet type. If you remove all active transformations from an active mapplet, the mapplet becomes passive. If you add an active transformation to a passive mapplet, the mapplet becomes active.

Viewing the Mapplet

When you use a mapplet in a mapping, the Designer displays the mapplet object, which contains only the input and output ports of the mapplet. However, you can expand the mapplet by choosing Mappings-Expand from the menu.

When the Designer expands the mapplet, it displays the entire mapping with the mapplet transformations. It does not display the Input and Output transformations. You can view the mapping in this expanded form, but you cannot edit it. To continue designing the mapping, choose Mappings-Unexpand

No comments: