Grouping data
A GROUP BY clause groups data by column value. For example, consider the following information object:
SELECT orderNumber
FROM OrderDetails
The first 10 data rows returned by this information object are as follows:
orderNumber
10100
10100
10100
10100
10101
10101
10101
10101
10102
10102
Each order number appears more than once. For example, order number 10100 appears four times. If you add a GROUP BY clause to the information object,
you can group the data by order number so that each order number appears only once:
SELECT orderNumber
FROM OrderDetails
GROUP BY orderNumber
The first 10 data rows returned by this information object are as follows:
orderNumber
10100
10101
10102
10103
10104
10105
10106
10107
10108
10109
Typically, you use a GROUP BY clause to perform an aggregation. For example, the following information object returns order numbers and order totals. The Total column is an aggregate column. An aggregate column is a computed column that uses an aggregate function such as AVG, COUNT, MAX, MIN, or SUM. In the following example the information object returns the first 10 data rows. The data is grouped by order number and the total for each order appears.
SELECT orderNumber, (SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
Creating a GROUP BY clause
By default, the IO Design perspective creates a GROUP BY clause automatically. If you prefer, you can create a GROUP BY clause manually.
Creating a GROUP BY clause automatically
When an information object’s SELECT clause includes an aggregate column and one or more non-aggregate columns, the non-aggregate columns must appear in the GROUP BY clause. If the non-aggregate columns do not appear in the GROUP BY clause, the IO Design perspective displays an error message. For example, consider the following information object:
SELECT orderNumber, (SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
When you attempt to compile the information object, an error message appears in the Problems view.
To avoid this problem, the IO Design perspective automatically creates a GROUP BY clause:
SELECT orderNumber, (SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
If more than one column appears in the GROUP BY clause, you can change the order of the columns using the up and down arrows in Group By, as shown in Figure 3‑37.
Figure 3‑37 Changing the order of GROUP BY columns
Creating a GROUP BY clause manually
If automatic grouping does not generate the desired SQL query, create the GROUP BY clause manually. Create the GROUP BY clause manually if you want to group on a column that does not appear in the SELECT clause, for example:
SELECT (SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
How to create a GROUP BY clause manually
1 In the graphical information object editor, choose Group By.
2 In Group By, deselect Use Automatic Grouping.
3 In Available, expand the Computed and Output nodes to view the available columns.
By default, the IO Design perspective displays only output columns and non‑aggregate computed fields. To group on a column that is not an output column, choose Show all.
4 In Available, select the appropriate column, and choose Select. This action moves the column name to Selected, as shown in Figure 3‑38.
Figure 3‑38 Selecting a GROUP BY column
5 Repeat the previous step for each GROUP BY column.
6 To change the order of the GROUP BY columns, select a column in Selected, and use the up or down arrow.
Removing a column from the GROUP BY clause
By default, the IO Design perspective removes GROUP BY columns automatically. If you disable automatic grouping, you must remove GROUP BY columns manually.
Removing a GROUP BY column automatically
The IO Design perspective automatically removes a column from the GROUP BY clause when:
*You remove the column from the SELECT clause.
For example, consider the following information object:
SELECT orderNumber, productCode,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber, productCode
You remove the productCode column from the SELECT clause. The IO Design perspective automatically removes productCode from the GROUP BY clause:
SELECT orderNumber,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
*You manually add a column to the GROUP BY clause that does not appear in the SELECT clause and then enable automatic grouping.
For example, consider the following information object:
SELECT orderNumber,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber, productCode
The productCode column appears in the GROUP BY clause but not in the SELECT clause. You enable automatic grouping. Information Object Designer automatically removes productCode from the GROUP BY clause:
SELECT orderNumber,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
The IO Design perspective automatically removes the GROUP BY clause when:
*You remove all aggregate columns from the SELECT clause.
For example, consider the following information object:
SELECT orderNumber,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
You remove the aggregate column SUM(quantityOrdered * priceEach) from the SELECT clause. The IO Design perspective automatically removes the GROUP BY clause:
SELECT orderNumber
FROM OrderDetails
*You remove all non-aggregate columns from the SELECT clause.
For example, consider the following information object:
SELECT orderNumber,
(SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
GROUP BY orderNumber
You remove the orderNumber column from the SELECT clause. The IO Design perspective automatically removes the GROUP BY clause:
SELECT (SUM(quantityOrdered * priceEach)) AS Total
FROM OrderDetails
Removing a GROUP BY column manually
If you disable automatic grouping, you must remove GROUP BY columns manually.
How to remove a GROUP BY column manually
1 In the graphical information object editor, choose Group By.
2 In Group By, complete one of the following tasks:
*Select the column in Selected, and choose Deselect.
*To remove all Group By columns, choose Remove All.