Creating a joined data set
A joined data set combines the results of two or more data sets that are related through a common key. Creating a joined data set is similar to joining tables in a database using a SQL JOIN clause. Use a joined data set to combine data from different data sources in which a relationship exists, as shown in the following example. Figure 2‑18 illustrates data sets that return data about customers and orders. The data sets are related through the CustomerID field. You can retrieve order information for each customer by joining the data sets.
Figure 2‑18 Data sets with a common field returning customer and order data
Figure 2‑19 shows the results of joining the customers and orders data sets on the CustomerID key, and displaying only the CustomerName and Amount fields in the joined data set.
Figure 2‑19 Data rows returned when the customers and orders data sets are joined
BIRT Designer Professional supports the functionality of joined data sets available in the open-source version, and provides the following additional features:
*The capability to join more than two data sets
*The capability to join on more than one key
*Support for additional join operators: <>, <, >, <=, >=
*Support for the side-by-side join
Unlike the other types of supported joins (inner, left outer, right outer, and full outer), the side-by-side join links data sets without requiring a key. The resulting joined data set displays the selected fields side by side. Figure 2‑20 shows two data sets that do not share a common field. The first data set returns customer data, and the second data set returns order data.
Figure 2‑20 Data sets without a common field
Figure 2‑21 shows the results of joining the customers and orders data sets using the side-by-side join. When using this type of join, do not misinterpret the results. As Figure 2‑21 shows, the data from the two data sets appear side by side, implying that each customer has a relationship with an order when, in fact, no such relationship exists.
Figure 2‑21 Results of a side-by-side join
How to create a joined data set
This procedure assumes that you have created the data sets to include in the joined data set.
1 In Data Explorer, right-click Data Sets, and choose Join Data Set.
2 In New Data Set, in Data Set Name, optionally type a name for the joined data set.
3 Specify the data sets to use in the joined data set. From Available data sets, drag each data set to the editing area.
Figure 2‑22 shows three data sets in the editing area.
Figure 2‑22 Three data sets selected for a joined data set
4 Specify the fields from each data set to include in the joined data set. Perform the following tasks for each data set:
1 Select a data set by clicking anywhere in the image of the data set. Do not click a field name.
2 Choose Output Columns.
3 In Edit Data Set Properties, under Select output columns, select the desired data set fields, then choose OK.
5 Specify the conditions for joining the data sets. Perform the following tasks for each pair of data sets. In the example shown in Figure 2‑22, specify a condition for joining the first and second data sets, and a condition for joining the second and third data sets.
1 Select the arrow between two data sets.
2 Choose Conditions.
3 In Define join type and join conditions, specify the following information:
*In Join Type, select the type of join to use.
*If you select a join type other than Side-By-Side, define a join condition.
*Choose New.
*Select the fields on which to join, and select an operator that specifies how to compare the values in the fields being joined. Figure 2‑23 shows a join condition that combines data when the CUSTOMERNUMBER value in the Customers data set is equal to the CUSTOMERNUMBER value in the Orders data set.
*Choose OK.
Figure 2‑23 Joining data sets on a common field
Define join type and join conditions displays the specified condition, as shown in Figure 2‑24.
Figure 2‑24 Definition of an inner join
4 Choose OK.
6 Choose Finish to save the joined data set.
Joining on more than one key
You can specify more than one join condition when joining two data sets. For example, you can join a customers data set with a sales offices data set, shown in Figure 2‑25, to find the names of customers and sales managers that are located in the same city and state.
Figure 2‑25 Data sets with two common fields, City and State
You would create the following join conditions:
*The first condition, shown in Figure 2‑26, compares the State values in the Customers and SalesOffices data sets and looks for a match.
Figure 2‑26 Joining on a State field
*The second condition, shown in Figure 2‑27, compares the City values in both data sets and looks for a match.
Figure 2‑27 Joining on a City field
The joined data set returns the results shown in Figure 2‑28, if the join type is fullOuter.
Figure 2‑28 Data rows returned by the joined data set
Specifying a join condition not based on equality
The condition for joining values in two fields is usually based on equality (=), as shown in all the examples so far. Less common are join conditions that use any of the other comparison operators: not equal (<>), greater than (>), less than (<), greater than or equal to (>=), and less than or equal to (<=).
The following example shows the use of joins that are not based on equality. In the example, a Sales data set is joined with a Commissions data set. The joined data set uses a >= join and a < join to look up the commissions to pay to sales managers, based on their sales totals and management levels.
Figure 2‑29 shows the Sales and Commissions data sets. In the Commissions data set, each level has four commission rates. For level 1, a commission rate of 25% is paid if a sales total is between 75000 and 100000, 20% is paid if a sales total is between 50000 and 75000, and so on.
Figure 2‑29 Data sets returning sales and commission rates data
The following join conditions specify the fields on which to join and how to compare the values in the fields being joined:
*The first condition, shown in Figure 2‑30, compares the Level values in the Sales and Commissions data sets and looks for a match.
Figure 2‑30 Joining on the Level field and looking for a match
*The second condition, shown in Figure 2‑31, uses the >= operator to compare the TotalSales values in the Sales data set with the LowRange values in the Commissions data set.
Figure 2‑31 Joining on TotalSales and LowRange fields using the >= operator
*The third condition, shown in Figure 2‑32, uses the < operator to compare the TotalSales values in the Sales data set with the HighRange values in the Commissions data set.
Figure 2‑32 Joining on TotalSales and HighRange fields using the < operator
The second and third join conditions check if a sales total is greater than or equal to LowRange and less than HighRange.
The joined data set returns the results shown in Figure 2‑33.
Figure 2‑33 Data rows returned by the joined data set