Sunday, September 16, 2012

Informatica Transformation - 1


Informatica Transformation

There are basically 2 types of transformation present in Informatica

ACTIVE transformation Properties:- 

·        Changes the number of rows that passes through the transformation.
·        Changes the transaction boundary: - commit or rollback kind of transaction based on expression.
·        Change the row type: update, insert delete or reject kind off.

Constraints with Active transformation: - designer won’t allow connecting multiple active transformations or an active and a passive transformation to the same downstream transformation input group.
                              But the sequence generator transformation [Passive Transformation] is exception for the above rule, because you can connect a sequence generator transformation and an active transformation to the same downstream transformation. Why it happens, because a sequence generator transformation does not receive records. It only produces unique values.

PASSIVE transformation Properties: - A Passive transformation does not change the number of rows that passes through the transformation, maintains the transaction boundary, and maintain the row types. Designer will allow you to connect multiple passive transformations to the same downstream transformation. The transformation that originates the branch can be active or passive.

 Multi-group Transformation: - This is also a categorization of transformation. Transformation has different input and output groups. Those transformations which has more than one input and output group are called as multi group Transformation like

·         Custom [contain any number of input and output group],
·         Joiner [can contain only 2 input groups]
·         Router [contains only one input group and can have multiple output group]
·         Union [contain multiple input group and single output group]
·         Unstructured data [can contain multiple output groups]
·         And all XML related transformation also falls into multi group transformation.

All multi-group transformation is active transformation.

Blocking Transformation: - Some multiple input group transformations require the integration service to block data at an input group while the integration service waits for a row from a different input group.  A  blocking transformation is a multiple input group transformation that blocks incoming data. Custom transformation and Joiner transformation comes into this category.

Transformation based on the expression: - Below is the list of transformation, where you can enter expressions.

·         Aggregator
·         Data Masking
·         Expression
·         Filter
·         Rank
·         Router
·         Update strategy
·         Transaction control

Defining expression strings in Parameter files:-
The Integration Service expands the mapping parameters and variables in an expression after it parses the expression. If you have an expression that changes frequently, you can define the expression string in a parameter file so that you do not have to update the mappings that use the expression when the expression changes.

To define an expression string in a parameter file, you create a mapping parameter or variable to store the expression string, and set the parameter or variable to the expression string in the parameter file. The parameter or variable you create must have IsExprVar set to true. When IsExprVar is true, the Integration Service expands the parameter or variable before it parses the expression.

For example, to define the expression IIF (color=‘red’, 5) in a parameter file, perform the following steps:


1.       In the mapping that uses the expression, create a mapping parameter $$Exp. Set IsExprVar to true and set the data type to String.
2.       In the Expression Editor, set the expression to the name of the mapping parameter as follows:
3.       Configure the session or workflow to use a parameter file.
4.       In the parameter file, set the value of $$Exp to the expression string as follows: $$Exp
5.       $$Exp=IIF(color=‘red’,5)


Using Local Variables
Use local variables in Aggregator, Expression, and Rank transformations. You can reference variables in an expression or use them to temporarily store data. Variables are an easy way to improve performance.

You might use variables to complete the following tasks:

·         Temporarily store data.
·         Simplify complex expressions.
·         Store values from prior rows.
·         Capture multiple return values from a stored procedure.
·         Compare values.
·         Store the results of an unconnected Lookup transformation.


Consider the following factors when you configure variable ports in a transformation:

·         Port order. The Integration Service evaluates ports by dependency. The order of the ports in a transformation must match the order of evaluation: input ports, variable ports, output ports.
·         Data type. The data type you choose reflects the return value of the expression you enter.
·         Variable initialization. The Integration Service sets initial values in variable ports, where you can create counters.

Variable Initialization
The Integration Service does not set the initial value for variables to NULL. Instead, the Integration Service uses the following guidelines to set initial values for variables:

·         Zero for numeric ports
·         Empty strings for string ports
·         01/01/1753 for Date/Time ports with PMServer 4.0 date handling compatibility disabled
·         01/01/0001 for Date/Time ports with PMServer 4.0 date handling compatibility enabled

Using Default Values for Ports

All transformations use default values that determine how the Integration Service handles input null values and output transformation errors. Input, output, and input/output ports are created with a system default value that you can sometimes override with a user-defined default value. Default values have different functions in different types of ports:

·         Input port. The system default value for null input ports is NULL. It displays as a blank in the transformation. If an input value is NULL, the Integration Service leaves it as NULL.

·         Output port. The system default value for output transformation errors is ERROR. The default value appears in the transformation as ERROR(‘transformation error’). If a transformation error occurs, the Integration Service skips the row. The Integration Service notes all input rows skipped by the ERROR function in the session log file.

The following errors are considered transformation errors:

·         Data conversion errors, such as passing a number to a date function.
·         Expression evaluation errors, such as dividing by zero.
·         Calls to an ERROR function.

·         Input/output port. The system default value for null input is the same as input ports, NULL. The system default value appears as a blank in the transformation. The default value for output transformation errors is the same as output ports. The default value for output transformation errors does not display in the transformation.


Note: Variable ports do not support default values. The Integration Service initializes variable ports according to the data type.

 Entering User-Defined Default Values
You can override the system default values with user-defined default values for supported input, input/output, and output ports within a connected transformation:

·         Input ports. You can enter user-defined default values for input ports if you do not want the Integration Service to treat null values as NULL.
·         Output ports. You can enter user-defined default values for output ports if you do not want the Integration Service to skip the row or if you want the Integration Service to write a specific message with the skipped row to the session log.
·         Input/output ports. You can enter user-defined default values to handle null input values for input/output ports in the same way you can enter user-defined default values for null input values for input ports. You cannot enter user-defined default values for output transformation errors in an input/output port.

Note: The Integration Service ignores user-defined default values for unconnected transformations. For example, if you call a Lookup or Stored Procedure transformation through an expression, the Integration Service ignores any user-defined default value and uses the system default value only.

Entering Constant Expressions
A constant expression is any expression that uses transformation functions (except aggregate functions) to write constant expressions. You cannot use values from input, input/output, or variable ports in a constant expression.

Some valid constant expressions include:

500 * 1.75
TO_DATE('January 1, 1998, 12:05 AM')
ERROR ('Null not allowed')
ABORT('Null not allowed')
SYSDATE

Entering ERROR and ABORT Functions
Use the ERROR and ABORT functions for input and output port default values, and input values for input/output ports. The Integration Service skips the row when it encounters the ERROR function. It aborts the session when it encounters the ABORT function.

 Skipping Null Records
Use the ERROR function as the default value when you do not want null values to pass into a transformation. For example, you might want to skip a row when the input value of DEPT_NAME is NULL. You could use the following expression as the default value:

ERROR('Error. DEPT is NULL')
The following figure shows a default value that instructs the Integration Service to skip null values:


When you use the ERROR function as a default value, the Integration Service skips the row with the null value. The Integration Service writes all rows skipped by the ERROR function into the session log file. It does not write these rows to the session reject file.

DEPT_NAME       RETURN VALUE

Housewares         Housewares
NULL 'Error. DEPT is NULL' (Row is skipped)

The following session log shows where the Integration Service skips the row with the null value:
TE_11019 Port [DEPT_NAME]: Default value is: ERROR(<<Transformation Error>> [error]: Error. DEPT is NULL
... error('Error. DEPT is NULL')
).
CMN_1053 EXPTRANS: : ERROR: NULL input column DEPT_NAME: Current Input data:
CMN_1053 Input row from SRCTRANS: Rowdata: ( RowType=4 Src Rowid=2 Targ Rowid=2
DEPT_ID (DEPT_ID:Int:): "2"
DEPT_NAME (DEPT_NAME:Char.25:): "NULL"
MANAGER_ID (MANAGER_ID:Int:): "1" Produce Produce
)

Aborting the Session
Use the ABORT function to abort a session when the Integration Service encounters null input values.

Entering User-Defined Default Output Values
You can enter user-defined default values for output ports if you do not want the Integration Service to skip rows with errors or if you want the Integration Service to write a specific message with the skipped row to the session log. You can enter default values to complete the following functions when the Integration Service encounters output transformation errors:

·         Replace the error with a constant value or constant expression. The Integration Service does not skip the row.

·         Abort the session with the ABORT function.

·         Write specific messages in the session log for transformation errors.

You cannot enter user-defined default output values for input/output ports.

 Replacing Errors
If you do not want the Integration Service to skip a row when a transformation error occurs, use a constant or constant expression as the default value for an output port. For example, if you have a numeric output port called NET_SALARY and you want to use the constant value ‘9999’ when a transformation error occurs, assign the default value 9999 to the NET_SALARY port. If there is any transformation error (such as dividing by zero) while computing the value of NET_SALARY, the Integration Service uses the default value 9999.

Aborting the Session
Use the ABORT function as the default value in an output port if you do not want to allow any transformation errors.

Writing Messages in the Session Log or Row Error Logs
You can enter a user-defined default value in the output port if you want the Integration Service to write a specific message in the session log with the skipped row. The system default is ERROR (‘transformation error’), and the Integration Service writes the message ‘transformation error’ in the session log along with the skipped row. You can replace ‘transformation error’ if you want to write a different message.

When you enable row error logging, the Integration Service writes error messages to the error log instead of the session log and the Integration Service does not log Transaction Control transformation rollback or commit errors. If you want to write rows to the session log in addition to the row error log, you can enable verbose data tracing.

Working with ERROR Functions in Output Port Expressions
If you enter an expression that uses the ERROR function, the user-defined default value for the output port might override the ERROR function in the expression.

For example, you enter the following expression that instructs the Integration Service to use the value ‘Negative Sale’ when it encounters an error:

The following examples show how user-defined default values may override the ERROR function in the expression:

·         Constant value or expression. The constant value or expression overrides the ERROR function in the output port expression.

For example, if you enter ‘0’ as the default value, the Integration Service overrides the ERROR function in the output port expression. It passes the value 0 when it encounters an error. It does not skip the row or write ‘Negative Sale’ in the session log.

·         ABORT. The ABORT function overrides the ERROR function in the output port expression.

If you use the ABORT function as the default value, the Integration Service aborts the session when a transformation error occurs. The ABORT function overrides the ERROR function in the output port expression.

·         ERROR. If you use the ERROR function as the default value, the Integration Service includes the following information in the session log:

−Error message from the default value
−Error message indicated in the ERROR function in the output port expression
−Skipped row

For example, you can override the default value with the following ERROR function:
The Integration Service skips the row, and includes both error messages in the log.
TE_7007 Transformation Evaluation Error; current row skipped...
TE_7007 [<<Transformation Error>> [error]: Negative Sale
... error('Negative Sale')
]

Sun Sep 20 13:57:28 1998

TE_11019 Port [OUT_SALES]: Default value is: ERROR(<<Transformation Error>> [error]: No default value ERROR('No default value') IIF( TOTAL_SALES>0, TOTAL_SALES, ERROR ('Negative Sale'))

... error('No default value')

General Rules for Default Values
Use the following rules and guidelines when you create default values:

·         The default value must be either a NULL, a constant value, a constant expression, an ERROR function, or an ABORT function.
·         For input/output ports, the Integration Service uses default values to handle null input values. The output default value of input/output ports is always ERROR(‘Transformation Error’).
·         Variable ports do not use default values.
·         You can assign default values to group by ports in the Aggregator and Rank transformations.
·         Not all port types in all transformations allow user-defined default values. If a port does not allow user-defined default values, the default value field is disabled.
·         Not all transformations allow user-defined default values. For a list of transformations that allow user-defined default values, see Table 1-1 on page 15.
·         If a transformation is not connected to the mapping data flow, the Integration Service ignores user-defined default values.
·         If any input port is unconnected, its value is assumed to be NULL and the Integration Service uses the default value for that input port.
·         If an input port default value contains the ABORT function and the input value is NULL, the Integration Service immediately stops the session. Use the ABORT function as a default value to restrict null input values. The first null value in an input port stops the session.
·         If an output port default value contains the ABORT function and any transformation error occurs for that port, the session immediately stops. Use the ABORT function as a default value to enforce strict rules for transformation errors. The first transformation error for this port stops the session.
·         The ABORT function, constant values, and constant expressions override ERROR functions configured in output port expressions.

 Configuring Tracing Level in Transformations

When you configure a transformation, you can set the amount of detail the Integration Service writes in the session log.

Tracing Level
Description
Normal
Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.
Terse
Integration Service logs initialization information and error messages and notification of rejected data.
Verbose Initialization
In addition to normal tracing, Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.
Verbose Data
In addition to verbose initialization tracing, Integration Service logs each row that passes into the mapping. Also notes where the Integration Service truncates string data to fit the precision of a column and provides detailed transformation statistics.
Allows the Integration Service to write errors to both the session log and error log when you enable row error logging.
When you configure the tracing level to verbose data, the Integration Service writes row data for all rows in a block when it processes a transformation.

By default, the tracing level for every transformation is Normal. Change the tracing level to a Verbose setting only when you need to debug a transformation that is not behaving as expected. To add a slight performance boost, you can also set the tracing level to Terse, writing the minimum of detail to the session log when running a workflow containing the transformation.

When you configure a session, you can override the tracing levels for individual transformations with a single tracing level for all transformations in the session.


Reusable Transformations
Mappings can contain reusable and non-reusable transformations. Non-reusable transformations exist within a single mapping. Reusable transformations can be used in multiple mappings.
You can create most transformations as a non-reusable or reusable. However, you can only create the External Procedure transformation as a reusable transformation.
When you add instances of a reusable transformation to mappings, you must be careful that changes you make to the transformation do not invalidate the mapping or generate unexpected data.

Instances and Inherited Changes
When you add a reusable transformation to a mapping, you add an instance of the transformation. The definition of the transformation still exists outside the mapping, while an instance of the transformation appears within the mapping.
Since the instance of a reusable transformation is a pointer to that transformation, when you change the transformation in the Transformation Developer, its instances reflect these changes. Instead of updating the same transformation in every mapping that uses it, you can update the reusable transformation once, and all instances of the transformation inherit the change. Note that instances do not inherit changes to property settings, only modifications to ports, expressions, and the name of the transformation.

Mapping Variables in Expressions
Use mapping parameters and variables in reusable transformation expressions. When the Designer validates the parameter or variable, it treats it as an Integer datatype. When you use the transformation in a mapplet or mapping, the Designer validates the expression again. If the mapping parameter or variable does not exist in the mapplet or mapping, the Designer logs an error.

Creating Reusable Transformations
You can create a reusable transformation using the following methods:
  • Design it in the Transformation Developer. In the Transformation Developer, you can build new reusable transformations.
  • Promote a non-reusable transformation from the Mapping Designer. After you add a transformation to a mapping, you can promote it to the status of reusable transformation. The transformation designed in the mapping then becomes an instance of a reusable transformation maintained elsewhere in the repository.
If you promote a transformation to reusable status, you cannot demote it. However, you can create a non-reusable instance of it.
Note: Sequence Generator transformations must be reusable in mapplets. You cannot demote reusable Sequence Generator transformations to non-reusable in a mapplet.

Promoting Non-Reusable Transformations
The other technique for creating a reusable transformation is to promote an existing transformation within a mapping. By checking the Make Reusable option in the Edit Transformations dialog box, you instruct the Designer to promote the transformation and create an instance of it in the mapping.

Creating Non-Reusable Instances of Reusable Transformations
You can create a non-reusable instance of a reusable transformation within a mapping. Reusable transformations must be made non-reusable within the same folder. If you want to have a non-reusable instance of a reusable transformation in a different folder, you need to first make a non-reusable instance of the transformation in the source folder, and then copy it into the target folder.

Reverting to Original Reusable Transformation
If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.













0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home