Joiner Transformation


Joiner Transformation in Informatica

The Joiner transformation is an active, connected transformation used to join two heterogeneous sources. It joins the sources on a condition that matches one or more pairs of ports between them, so the two sources must have at least one matching port. The two input pipelines are a master pipeline and a detail pipeline (or branch). To join more than two sources, join the output of one Joiner transformation with another source. Joining n sources in a mapping takes n-1 Joiner transformations, so joining 3 tables takes 2 Joiner transformations.
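The n-1 rule can be illustrated outside Informatica with a minimal Python sketch. The `join_two` and `join_all` helpers and the sample rows are hypothetical stand-ins for Joiner transformations and source tables, not Informatica APIs.

```python
def join_two(left, right, key):
    """Stand-in for one Joiner: inner-join two lists of dict rows on a shared key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def join_all(sources, key):
    """Join n sources by chaining two-way joins; returns (rows, joins_used)."""
    result = sources[0]
    joins_used = 0
    for source in sources[1:]:
        # Each additional source costs exactly one more two-way join.
        result = join_two(result, source, key)
        joins_used += 1
    return result, joins_used


# Hypothetical sample rows, for illustration only.
emp = [{"DEPTNO": 10, "ENAME": "CLARK"}, {"DEPTNO": 20, "ENAME": "SMITH"}]
dept = [{"DEPTNO": 10, "DNAME": "ACCOUNTING"}, {"DEPTNO": 20, "DNAME": "RESEARCH"}]
loc = [{"DEPTNO": 10, "LOC": "NEW YORK"}, {"DEPTNO": 20, "LOC": "DALLAS"}]

rows, joins_used = join_all([emp, dept, loc], "DEPTNO")
print(joins_used)  # 2 joins for 3 sources
```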


Example: To join the EMP and DEPT tables.
  • EMP and DEPT will be the source tables.
  • Create a target table JOINER_EXAMPLE in the Target Designer. The table should contain all ports of the EMP table plus DNAME and LOC.
  • Create the shortcuts in your folder.
Creating Mapping:
  1. Open the folder where we want to create the mapping.
  2. Click Tools -> Mapping Designer.
  3. Click Mapping -> Create and give the mapping a name. Ex: m_joiner_example
  4. Drag EMP, DEPT, and the target into the mapping. Create a Joiner transformation and link the ports as shown below.
[Figure: clip_image002, mapping with EMP and DEPT linked through the Joiner transformation to the target]
  5. Specify the join condition in the Condition tab. See the steps under JOIN CONDITION.
  6. Set the master in the Ports tab. See the steps under MASTER and DETAIL TABLES.
  7. Mapping -> Validate
  8. Repository -> Save.
  • Create a session and workflow as described earlier. Run the workflow and check the data in the target table.
  • Make sure to give connection information for all tables.
JOIN CONDITION:
The join condition contains ports from both input sources that must match for the PowerCenter Server to join two rows.
Example: DEPTNO=DEPTNO1 in the mapping above.
  1. Edit Joiner Transformation -> Condition Tab
  2. Add condition
  • We can add as many conditions as needed.
  • Only the = operator is allowed.
If we join Char and Varchar data types, the PowerCenter Server counts any spaces that pad Char values as part of the string. So if you try to join the following:
Char (40) = “abcd” and Varchar (40) = “abcd”
then the Char value is “abcd” padded with 36 blank spaces, and the PowerCenter Server does not join the two fields because the Char field contains trailing spaces.
Note: The Joiner transformation does not match null values.
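The padding and null behaviour described above can be mimicked in a few lines of Python. This is only an analogy: `pad_char` and `keys_match` are hypothetical helpers, and Python's `None` stands in for a database NULL.

```python
def pad_char(value, length):
    """Mimic a fixed-width Char(n) column by right-padding the value with spaces."""
    return value.ljust(length)


def keys_match(master_key, detail_key):
    """Equality-only comparison in which a null (None) key never matches anything."""
    if master_key is None or detail_key is None:
        return False
    return master_key == detail_key


char_value = pad_char("abcd", 40)  # "abcd" followed by 36 spaces
varchar_value = "abcd"             # Varchar keeps only the 4 characters

print(len(char_value) - len(varchar_value))            # 36
print(keys_match(char_value, varchar_value))           # False: trailing spaces count
print(keys_match(char_value.rstrip(), varchar_value))  # True once the padding is trimmed
print(keys_match(None, None))                          # False: nulls do not match
```

Trimming the padded side before the comparison, as `rstrip` does here, is one way to make the two values comparable.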
MASTER and DETAIL TABLES
In a Joiner transformation, one table is called the MASTER and the other the DETAIL.
  • The MASTER table is always cached. We can make either table the MASTER.
  • Edit Joiner Transformation -> Ports Tab -> Select M for the Master table.
The table with fewer rows should be made the MASTER to improve performance.
Reason:
  • When the PowerCenter Server processes a Joiner transformation, it reads rows from both sources concurrently and builds the index and data caches based on the master rows. A table with fewer rows is read quickly, so the cache can be built while the table with more rows is still being read.
  • The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
JOINER TRANSFORMATION PROPERTIES TAB
  • Case-Sensitive String Comparison: If selected, the PowerCenter Server uses case-sensitive string comparisons when performing joins on string columns.
  • Cache Directory: Specifies the directory used to cache master or detail rows and the index to these rows.
  • Join Type: Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer.
  • Tracing Level
  • Joiner Data Cache Size
  • Joiner Index Cache Size
  • Sorted Input
JOIN TYPES
In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation acts in much the same manner, except that tables can originate from different databases or flat files.
Types of Joins:
  • Normal
  • Master Outer
  • Detail Outer
  • Full Outer
Note: A normal or master outer join performs faster than a full outer or detail outer join.
Example: In EMP, we have employees with DEPTNO 10, 20, 30 and 50. In DEPT, we have DEPTNO 10, 20, 30 and 40. DEPT will be the MASTER table as it has fewer rows.
Normal Join:
With a normal join, the PowerCenter Server discards all rows of data from the master and detail sources that do not match, based on the condition.
  • Only the employees of DEPTNO 10, 20 and 30 will be there, because only they match.
Master Outer Join:
This join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
  • All employees of DEPTNO 10, 20 and 30 will be there.
  • The employees of DEPTNO 50 will also be there, with the DNAME and LOC columns set to NULL.
Detail Outer Join:
This join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
  • All employees of DEPTNO 10, 20 and 30 will be there.
  • There will be one record for DEPTNO 40, with the EMP columns set to NULL.
Full Outer Join:
A full outer join keeps all rows of data from both the master and detail sources.
  • All employees of DEPTNO 10, 20 and 30 will be there.
  • The employees of DEPTNO 50 will also be there, with the DNAME and LOC columns set to NULL.
  • There will be one record for DEPTNO 40, with the EMP columns set to NULL.
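The four results in the EMP/DEPT example can be reproduced with a short Python sketch. It is a simplified model of the join semantics described above, not Informatica code. The `joiner` function, the ENAME and DNAME values, and the use of `None` for NULL are assumptions made for illustration. The port names DEPTNO (EMP, detail) and DEPTNO1 (DEPT, master) follow the join condition used earlier.

```python
def joiner(master, detail, master_key, detail_key, join_type="normal"):
    """Join lists of dict rows using the master/detail semantics described above."""
    # Cache the master rows by join key.
    cache = {}
    for m in master:
        cache.setdefault(m[master_key], []).append(m)

    null_master = dict.fromkeys(master[0])  # all master ports set to None
    null_detail = dict.fromkeys(detail[0])  # all detail ports set to None
    keep_unmatched_detail = join_type in ("master outer", "full outer")
    keep_unmatched_master = join_type in ("detail outer", "full outer")

    out, matched_keys = [], set()
    for d in detail:
        # A null key never matches, so it is never looked up in the cache.
        matches = cache.get(d[detail_key], []) if d[detail_key] is not None else []
        for m in matches:
            out.append({**d, **m})
        if matches:
            matched_keys.add(d[detail_key])
        elif keep_unmatched_detail:
            out.append({**d, **null_master})
    if keep_unmatched_master:
        for m in master:
            if m[master_key] not in matched_keys:
                out.append({**null_detail, **m})
    return out


# Placeholder rows: only the DEPTNO values come from the example above.
dept = [{"DEPTNO1": n, "DNAME": f"DEPT_{n}"} for n in (10, 20, 30, 40)]
emp = [{"ENAME": f"EMP_{n}", "DEPTNO": n} for n in (10, 20, 30, 50)]

for join_type in ("normal", "master outer", "detail outer", "full outer"):
    rows = joiner(dept, emp, "DEPTNO1", "DEPTNO", join_type)
    print(join_type, [(r["DEPTNO"], r["DEPTNO1"]) for r in rows])
# normal       [(10, 10), (20, 20), (30, 30)]
# master outer [(10, 10), (20, 20), (30, 30), (50, None)]
# detail outer [(10, 10), (20, 20), (30, 30), (None, 40)]
# full outer   [(10, 10), (20, 20), (30, 30), (50, None), (None, 40)]
```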
USING SORTED INPUT
  • Use sorted input to improve session performance.
  • To use sorted input, we must pass data to the Joiner transformation sorted by the ports that are used in the join condition.
  • We check the Sorted Input option in the Properties tab of the transformation.
  • If the option is checked but we are not passing sorted data to the transformation, the session fails.
  • We can sort the data with a Sorter transformation, or with the Source Qualifier in the case of relational tables.
JOINER CACHES
The Joiner always caches the MASTER table. We cannot disable caching. It builds the index cache and the data cache based on the MASTER table.
1) Joiner Index Cache:
  • All columns of the MASTER table used in the join condition are in the JOINER INDEX CACHE.
  • Example: DEPTNO in our mapping.
2) Joiner Data Cache:
  • Master columns that are not in the join condition but are used for output to another transformation or the target table are in the data cache.
  • Example: DNAME and LOC in our mapping example.
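The split between the two caches can be pictured with a small Python sketch. The in-memory layout here (a dict keyed by the join ports that points into a list of data rows) is an assumption made for illustration, since the actual cache file format is not described here. `build_master_cache` and the sample DEPT rows are hypothetical.

```python
def build_master_cache(master_rows, condition_ports, output_ports):
    """Split master rows into an index cache (join ports) and a data cache (other output ports)."""
    index_cache = {}  # join key -> positions of the matching rows in data_cache
    data_cache = []   # one entry per master row, holding the non-key output ports
    for row in master_rows:
        key = tuple(row[p] for p in condition_ports)
        index_cache.setdefault(key, []).append(len(data_cache))
        data_cache.append({p: row[p] for p in output_ports})
    return index_cache, data_cache


# Hypothetical sample rows for the DEPT master table.
dept = [
    {"DEPTNO": 10, "DNAME": "ACCOUNTING", "LOC": "NEW YORK"},
    {"DEPTNO": 20, "DNAME": "RESEARCH", "LOC": "DALLAS"},
]

index_cache, data_cache = build_master_cache(dept, ["DEPTNO"], ["DNAME", "LOC"])
print(index_cache)  # {(10,): [0], (20,): [1]}

# Probing with a detail row's DEPTNO returns the cached DNAME and LOC values.
hits = [data_cache[i] for i in index_cache.get((20,), [])]
print(hits)  # [{'DNAME': 'RESEARCH', 'LOC': 'DALLAS'}]
```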



Creating Joiner Transformation


Follow the steps below to create a joiner transformation in Informatica:
  • Go to the mapping designer and click on Transformation->Create.
  • Select the joiner transformation, enter a name and click on OK.
  • Drag the ports from the first source into the joiner transformation. By default the designer creates the input/output ports for the source fields in the joiner transformation as detail fields.
  • Now drag the ports from the second source into the joiner transformation. By default the designer configures the second source ports as master fields.
  • Edit the joiner transformation, go to the ports tab and check any box in the M column to switch the master/detail relationship for the sources.
  • Go to the condition tab and click on the Add button to add a condition. You can add multiple conditions.
  • Go to the properties tab and configure the properties of the joiner transformation.

Configuring Joiner Transformation


Configure the following properties of the joiner transformation:
  • Case-Sensitive String Comparison: When checked, the integration service uses case-sensitive string comparisons when joining on string columns. This option is checked by default.
  • Cache Directory: Directory used to cache the master or detail rows. The default directory path is $PMCacheDir. You can override this value.
  • Join Type: The type of join to be performed: Normal Join, Master Outer Join, Detail Outer Join or Full Outer Join.
  • Tracing Level: Level of detail written to the session log file.
  • Joiner Data Cache Size: Size of the data cache. The default value is Auto.
  • Joiner Index Cache Size: Size of the index cache. The default value is Auto.
  • Sorted Input: If the input data is in sorted order, check this option for better performance.
  • Master Sort Order: Sort order of the master source data. Choose Ascending if the master source data is sorted in ascending order. If you choose Ascending, you must also enable the Sorted Input option. The default value for this option is Auto.
  • Transformation Scope: You can choose the transformation scope as Transaction, All Input or Row.


Join Condition


The integration service joins the two input sources based on the join condition. The join condition contains ports from both input sources that must match. Only the equal (=) operator is allowed between the join columns. For example, to join the employees and departments tables, specify the join condition as department_id1 = department_id. Here department_id1 is the port of the departments source and department_id is the port of the employees source.

Join Type


The joiner transformation supports the following four types of joins.
  • Normal Join
  • Master Outer Join
  • Detail Outer Join
  • Full Outer Join

We will learn about each join type with an example. Let us say we have the following Students and Subjects tables as the sources.
Table Name: Subjects
Subject_Id Subject_Name
-----------------------
1          Maths
2          Chemistry
3          Physics

Table Name: Students
Student_Id  Subject_Id
---------------------
10          1
20          2
30          NULL

Assume that the Subjects source is the master and the Students source is the detail, and that we join these sources on the Subject_Id port.

Normal Join:

The joiner transformation outputs only the records that match the join condition and discards all the rows that do not match the join condition. The output of the normal join is
Master Ports       |   Detail Ports
---------------------------------------------
Subject_Id Subject_Name Student_Id Subject_Id
---------------------------------------------
1          Maths        10         1
2          Chemistry    20         2

Master Outer Join:

In a master outer join, the joiner transformation keeps all the records from the detail source and only the matching rows from the master source. It discards the unmatched rows from the master source. The output of master outer join is
Master Ports       |   Detail Ports
---------------------------------------------
Subject_Id Subject_Name Student_Id Subject_Id
---------------------------------------------
1          Maths        10         1
2          Chemistry    20         2
NULL       NULL         30         NULL

Detail Outer Join:

In a detail outer join, the joiner transformation keeps all the records from the master source and only the matching rows from the detail source. It discards the unmatched rows from the detail source. The output of detail outer join is
Master Ports       |   Detail Ports
---------------------------------------------
Subject_Id Subject_Name Student_Id Subject_Id
---------------------------------------------
1          Maths        10         1
2          Chemistry    20         2
3          Physics      NULL       NULL

Full Outer Join:

The full outer join first brings the matching rows from both the sources and then it also keeps the non-matched records from both the master and detail sources. The output of full outer join is
Master Ports       |   Detail Ports
---------------------------------------------
Subject_Id Subject_Name Student_Id Subject_Id
---------------------------------------------
1          Maths        10         1
2          Chemistry    20         2
3          Physics      NULL       NULL
NULL       NULL         30         NULL

Sorted Input


Use the sorted input option in the joiner properties tab when both the master and detail are sorted on the ports specified in the join condition. Sorted input improves performance because the integration service performs the join with fewer disk I/Os. The gain is most noticeable with large data sets.

Steps to follow for configuring the sorted input option:
  • Sort the master and detail sources by using either the source qualifier transformation or the sorter transformation.
  • Sort both sources on the ports used in the join condition, either in ascending or descending order.
  • Specify the Sorted Input option in the joiner transformation properties tab.
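One way to see why sorted input helps is a merge-style join, sketched below in Python. Treating the sorted join as a merge join is an assumption made for illustration, not a description of how the integration service is implemented. The `check_sorted` and `sorted_join` functions and the sample rows are hypothetical. The sketch handles a normal join on ascending input only, and it raises an error on unsorted input, loosely mirroring a failed session.

```python
def check_sorted(rows, key):
    """Raise ValueError if rows are not in ascending order on `key`."""
    for prev, cur in zip(rows, rows[1:]):
        if cur[key] < prev[key]:
            raise ValueError(f"input is not sorted on {key}")


def sorted_join(master, detail, key):
    """Normal (inner) join of two inputs that are both sorted ascending on `key`."""
    check_sorted(master, key)
    check_sorted(detail, key)
    out, i = [], 0
    for d in detail:
        # Advance past master rows whose key is smaller than the detail key.
        while i < len(master) and master[i][key] < d[key]:
            i += 1
        # Emit each master row with the same key. `i` stays put so the next
        # detail row with this key can reuse the same group of master rows.
        j = i
        while j < len(master) and master[j][key] == d[key]:
            out.append({**master[j], **d})
            j += 1
    return out


# Placeholder rows sorted on DEPTNO.
dept = [{"DEPTNO": n, "DNAME": f"DEPT_{n}"} for n in (10, 20, 30, 40)]
emp = [{"ENAME": f"EMP_{n}", "DEPTNO": n} for n in (10, 20, 30, 50)]

print([r["DEPTNO"] for r in sorted_join(dept, emp, "DEPTNO")])  # [10, 20, 30]
```

Because both inputs arrive in key order, only the group of master rows for the current key has to be kept at hand, rather than the whole master table.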

Why the joiner transformation is called a blocking transformation


The integration service blocks and unblocks the source data depending on whether the joiner transformation is configured for sorted input or not. 

Unsorted Joiner Transformation

For an unsorted joiner transformation, the integration service reads all the master rows before it reads the detail rows. It blocks the detail source while it caches all the master rows. Once it has read all the master rows, it unblocks the detail source and reads the detail rows.

Sorted Joiner Transformation

Blocking logic may or may not be possible for a sorted joiner transformation. The integration service uses blocking logic if it can do so without blocking all sources in the target load order group. Otherwise, it does not use blocking logic.

Joiner Transformation Performance Improvement Tips


To improve the performance of a joiner transformation, follow the tips below:
  • If possible, perform joins in a database. Performing joins in a database is faster than performing joins in a session.
  • Join sorted data when possible. You can improve the session performance by configuring the Sorted Input option in the joiner transformation properties tab.
  • Specify the source with fewer rows and with fewer duplicate keys as the master and the other source as detail.

Limitations of Joiner Transformation


The limitations of the joiner transformation are:
  • You cannot use a joiner transformation when the input pipeline contains an update strategy transformation.
  • You cannot connect a sequence generator transformation directly to the joiner transformation.

Tharun Katanguru
