EXPRESSION
Using the Expression instruction, you can create calculated dynamic columns using one or two columns from a database. You can create simple expressions or complex expressions. For example, a simple expression concatenates strings by joining several columns. A complex expression uses mathematical operations such as logarithms. For expressions that generate a decimal value, you can specify precision up to six decimal places.
An Expression instruction must have the following syntax:
Syntax
EXPRESSION [Target table] [Target column] [Expression]
Parameters
Target table
Define the target table where the expression column is created.
Target column
Defines the name of the new column. Do not use the following characters:
/ \ º ª - accents, dieresis, ?, !, *, @, #, &, $, o, ñ
Expression
The expression used to create a column value. To create expressions you use operators, statements, and functions.
BIRT Analytics Loader supports the following operators described in Table 5‑4.
Table 5‑4 Supported operators and statements
Name
Description
Example
+
Concatenate two text columns.
[Demo].[Customer].[Title]+[Demo].[Customer].[Surname]
+
Concatenate Numeric columns or strings.
STRING([CustomerID])+5+
[DaysSinceLastOrder]-1
+, -, *, /
Mathematical operators for adding, subtracting, multiplying, or dividing numeric columns.
STRING([Demo].[Customer].[Cust_ID]+[DEMO].[Customer].[Household_ID])
GT[>], LT [<],GE[>=],LE[<=],EQ [=], NE[<>],AND, OR, NOT
Assign Boolean values to columns, based on the conditions.
[Demo].[Customer].[Edats]EQ"Adults">0 or [Demo].[Customer].[Edats]EQ"Joves">0
Constants
Assign a constant to a table column.
[Demo].[Customer].[Household_ID]=1
Boolean If statement
Return a value depending on a condition. The condition can be simple or more complex. The return value can be any type, for example yes or no.
If([DEMO].[Order].[OrderDate]>[Demo].[Customer].[DOB],1,0)
Table 5‑5 presents the complete list of the functions. For more information about these functions, see Using BIRT Analytics.
Table 5‑5 Supported functions
Name and syntax
Description
ABS ([db].[table].[column])
Returns the absolute value of a numeric column, regardless of sign.
AGE ([db].[table].[column], date)
Returns the age of each value on a specified date.
ALLMONTHS ([db].[table].[column])
Returns the year and the month of each one of the values of a date column.
ALLQUARTERS ([db].[table].[column])
Returns the year and the quarter of each one of the values of a date column.
CEIL ([db].[table].[column])
Given a continuous numeric column, it returns the next integer value. For example, given the value 3.6, the CEIL function returns 4.
CHR ([db].[table].[column])
Given an integer numeric column, it returns the associated ASCII (American Standard Code for Information Interchange) value.
CODE ([db].[table].[column])
Returns the ASCII code for the first value of the text string. It only works for ASCII string columns, not for Unicode columns.
COUNT ([db].[table].[column])
Returns the number of records in a table.
DATE ([db].[table].[column], format)
Creates a date column converting input data to a date. You can use the following syntax:
DATE(<string>|<unicode>,<date format:string>)
DATETIME ([db].[table].[column], format)
Creates a date column converting input data to a datetime. You can use the following syntax:
DATETIME(<string>,<datetime format:string>) or DATETIME(<unicode>,<datetime format:string>)
DAY ([db].[table].[column])
Returns the part for the day in a date column.
DAYADD ([db].[table].[column], [db].[table].[column2])
Adds or removes the days required in a date column. The column returned is a date column.
DAYSTO ([db].[table].[column], date)
Returns the number of days between each value and a specified date.
DISTINCT ([db].[table].[column])
Returns the unique value or values in a column. For example, in a [Demo].[Customer].[Surname] column, it returns a single surname for each stored in the column.
EXP ([db].[table].[column])
Returns the exponential value of each of the values of a numeric column.
FLOOR ([db].[table].[column])
Returns the largest integer number less than or equal to the array.
HOUR ([db].[table].[column])
Returns the time of each date.
IF (A, B, C)
Returns a conditional function. A is the condition to be applied, B is the value to return if the condition is true, C is the value to return if the condition is false.
INSTR ([db].[table].[column], “string”)
Returns an integer that specifies the initial position of the first appearance of a string in another one. First value is 0.
INT ([db].[table].[column], format)
Returns the integer number that forms part of the value in a column, or converts the column that contains the numbers into a numeric column. You can use the following syntax:
INT(<string>|<unicode>, <thousands format:string>, <thousands grouping:integer|longint>)
ISNULL ([db].[table].[column])
Returns true for null values and false for not-null values.
ISNOTNULL ([db].[table].[column])
Returns true for not-null values and false for null values.
KURT ([db].[table].[column])
Returns the kurtosis of a column, which represents how values are distributed around the mean.
LCASE ([db].[table].[column])
Returns a lowercase text string, turning uppercase characters into lowercase.
LEFT ([db].[table].[column], n)
Returns the left part of a string of characters with the specified number of characters - where n is the quantity of characters.
The length of the strings in the column, resulting from an expression using the LEFTfunction, will be defined by the parameter provided in the function. The only exception that can occur is when the parameter is provided through a variable whose value is defined by a certain column. In this case the column result for LEFT will have a length equal to the original column.
LEN ([db].[table].[column])
Returns the number of characters in a text column, returning a number for each value in the column.
LOG ([db].[table].[column])
Returns the natural logarithm (logarithm in base e) of the values of the column.
LOG10 ([db].[table].[column])
Returns the decimal logarithm (logarithm in base 10) of the values of the column.
LONGINT ([db].[table].[column], format)
Returns the long integer part of a column, or converts the column that contains a number into a numeric column. You can use the following syntax:
LONGINT(<string>|<unicode>, <thousands format:string>, <thousands grouping:integer|longint>)
LTRIM ([db].[table].[column])
Returns a string that is a copy of a string with no initial spaces (deletes any initial spaces).
MAX ([db].[table].[column])
Returns the maximum value of a column. The column used must always be numeric (continuous or discrete).
MD5 ([db].[table].[column])
Returns an MD5 hash of a free text or a text field
MEAN ([db].[table].[column])
Returns the average value of a numeric column (continuous or discrete).
MEDIAN ([db].[table].[column])
Returns the value that divides the values of a column into two equal parts.
MID ([db].[table].[column], number1, number2)
Returns a string that contains a specified number of characters in a string. Two numbers are required: the first indicates the position where the string to be returned starts and the second indicates the number of characters to return (optional). If Number1 (the position indicator) is greater than the number of characters in the initial string, it returns a 0‑length string.
The length of the strings in the column, resulting from an expression using the MID function, will be defined by the parameter provided in the function. The only exception that can occur is when the parameter is provided through a variable whose value is defined by a certain column. In this case the column result for MID will have a length equal to the original column.
MIN ([db].[table].[column])
Returns the smallest value of the parameters of a numeric column (continuous or discrete).
MINUTE ([db].[table].[column])
Returns the minutes part of a date or time column.
MODE ([db].[table].[column])
Returns the value that is most repeated in the values of a column (the most frequent value).
MONTH ([db].[table].[column])
Returns the month part of a date column.
NOW ( )
Returns the current date and time in the engine’s default format.
POWER ([db].[table].[column], number)
Returns all the values of the column to the specified second parameter (number).
PROPER ([db].[table].[column])
Returns a new column in which all records begin with an uppercase letter and the rest are in lowercase.
PROPERSENTENCE ([db].[table].[column])
Returns a new column in which the first letter of each word in the description of the variables is in uppercase.
QUARTER ([db].[table].[column])
Returns the quarter of the year for each of the records in the date column.
RAND (n)
Returns a column based on random (n). For example, RAND (120) in the Customers table returns a column in which customers are randomly divided into 120 groups.
REAL ([db].[table].[column], format)
Returns the real value of a column. The source column must be a numeric column, a text string containing text, or a string representing a numeric column. You can use the following syntax:
REAL(<string>|<unicode>, <decimal format:string>, <thousands format:string>, <thousands grouping:integer|longint>)
REGEXMATCH ([db].[table].[column], regular_expression[, "i"])
Returns 1 if the column containing a string value matches the regular expression and 0 otherwise. To perform a case-insensitive test, use the value "i" as the optional third argument. For information about the syntax of regular expressions and examples of their use, see Using regular expression patterns to match and replace text strings.
REGEXREPLACE ([db].[table].[column], regular_expression, replacement[, "i"])
Returns a string containing the value of replacement in place of a matched regular expression in a column string value. To perform a case-insensitive test, use the value "i" as the optional fourth argument. For information about the syntax of regular expressions and examples of their use, see Using regular expression patterns to match and replace text strings.
REPLACE ([db].[table].[column], pattern, replacement)
Replaces the pattern text with the replacement text. Both pattern and replacement can be columns or text
RIGHT ([db].[table].[column],n)
Returns the right part of a string of characters with the specified number of characters - where n is the quantity of characters.
The length of the strings in the column, resulting from an expression using the RIGHT function, will be defined by the parameter provided in the function. The only exception that can occur is when the parameter is provided through a variable whose value is defined by a certain column. In this case the column result for RIGHT will have a length equal to the original column.
ROUND ([db].[table].[column])
Rounds the input number to the nearest value containing no decimal positions. It creates a column of the same type as of the source column.
ROW ([db].[table].[column])
Returns a unique value, starting from 0, for each of the existing records in the column used to create the expression.
RTRIM ([db].[table].[column])
Returns a text string, an exact copy of the specified string without trailing spaces.
SECOND ([db].[table].[column])
Returns the seconds part of a date or time column.
SECONDSTO ([db].[table].[column], [db].[table].[column])
Returns a longint value that is the seconds elapsed between two columns of date or time format.
SECSTO ([db].[table].[column], [db].[table].[column])
Deprecated.
SGN ([db].[table].[column])
Returns the values grouped by -1 for negative values, 0 for invalid values, because not entered, for example, and +1 for positive values.
SHA1([db].[table].[column])
Returns an SHA1 hash of a free text or a text field.
SKEW ([db].[table].[column])
Returns the skewness value of the values of a column with respect to the mean value.
SQRT ([db].[table].[column])
Returns the square root of n, where n is the mean of the values of the column.
STDEV ([db].[table].[column])
Returns the standard deviation of the values of the column, which is the square root of the variance.
STRING ([db].[table].[column], format)
Returns a text string that represents any other data type, except Unicode. Use any of the following syntaxes:
STRING(<integer>|<longint>|<real>, <decimal format:string>, <thousands format:string>, <thousands grouping:integer|longint>)
STRING(<date> <date format:string>[, <date separator:string>])
STRING(<datetime>, <datetime format:string>)
STRING(<datetime>, <datetime format:string>, <date separator:string>, <datetime separator:string>, <time separator:string>, <time decimal separator:string>)
STRING(<time>, <time format:string>)
STRING(<time>, <time format:string>, <time separator:string>, <time decimal separator:string>)
STRIP ([db].[table].[column])
Returns the entered column deleting any spaces.
SUM ([db].[table].[column])
Returns a new column with the cumulative sum of the values of the original column.
SUMSQ ([db].[table].[column])
Returns a new column with the cumulative sum of each of the values of the original column squared.
TIME ([db].[table].[column])
Returns the time of a string, date-and-time, or time column. You can use the following syntax:
TIME(<string>|<unicode>, <time format:string>)
TODAY ()
Returns current date in engine’s format.
TRIM ([db].[table].[column])
Returns a text string that contains the copy of the specified string with no spaces either at the beginning or the end of the string.
UCASE ([db].[table].[column])
Returns a new column with text in uppercase.
UNICODE([db].[table].[column], format)
Returns a unicode string that represents any other data type. You can use the following syntax:
UNICODE(<integer>|<longint>|<real>, <decimal format:string>, <thousands format:string>, <thousands grouping:integer|longint>)
UNICODE(<date>, <date format:string>[, <date separator:string>])
UNICODE(<datetime>, <datetime format:string>)
UNICODE(<datetime>, <datetime format:string>, <date separator:string>, <datetime separator:string>, <time separator:string>, <time decimal separator:string>)
UNICODE(<time>, <time format:string>)
UNICODE(<time>, <time format:string>, <time separator:string>, <time decimal separator:string>)
VAL ([db].[table].[column])
Returns a Unicode string that represents any other type of data.
WEEKDAY ([db].[table].[column])
Depending on the locale, returns the day of the week for each of the values. Default locale is en_US. F1=Sunday. In es_ES locale: 1=Monday.
WEEKNUMBER([db].[table].[column])
Returns the week number of a given date (as specified in ISO-8601).
YEAR ([db].[table].[column])
Returns the year of a date or date-and-time column.
Using LONGINT with DATE, TIME, or DATETIME arguments
For DATE, TIME, or DATETIME arguments, you must use the LONGINT function together with the STRING function, as shown in the following examples:
Incorrect: LONGINT(DATE("20130301"))
Correct: LONGINT(STRING(DATE("20130301", "yyyymmdd"), "yyyymmdd"), "", 0)
Incorrect: LONGINT(TIME("012345"))
Correct: LONGINT(STRING(TIME("012345", "hhMMss"), "hhMMss"), "", 0)
Incorrect: LONGINT(DATETIME("20120229 140600"))
Correct: LONGINT(REPLACE(STRING(DATETIME("20120229 140600", "yyyymmdd_hhMMss"), "yyyymmdd_hhMMss"), " ", ""), "", 0)
Using INT with DATE, TIME, or DATETIME arguments
For DATE, TIME, or DATETIME arguments, you must use the INT function together with the STRING function, as shown in the following examples:
Incorrect: INT(DATE("20130301"))
Correct: INT(STRING(DATE("20130301", "yyyymmdd"), "yyyymmdd"), "", 0)
Incorrect: INT(TIME("012345"))
Correct: INT(STRING(TIME("012345", "hhMMss"), "hhMMss"), "", 0)
Using format patterns for DATE, TIME, or DATETIME values
Format patterns are case sensitive and use the following components:
yyyy – represents the year with 4 digits
mm – represents the month with 2 digits
dd – represents the day with 2 digits
hh – represents the hour with 2 digits
MM – represents the minute with 2 digits
ss – represents the second with 2 digits
xxx – represents the millisecond with 3 digits. By default, a decimal point (.) separates milliseconds from seconds, for example 12.345.
Underscore (_) – represents any character
The following format patterns are supported when working with the DATE, TIME, or DATETIME functions:
DATE
ddmmyyyy
mmddyyyy
yyyymmdd
dd_mm_yyyy
mm_dd_yyyy
yyyy_mm_dd
TIME
hhMMss
hhMMssxxx
hh_MM_ss
hh_MM_ss_xxx
DATETIME
yyyymmdd_hhMMss
yyyymmdd_hhMMssxxx
yyyymmdd_hh_MM_ss
yyyymmdd_hh_MM_ss_xxx
yyyy_mm_dd_hhMMss
yyyy_mm_dd_hhMMssxxx
yyyy_mm_dd_hh_MM_ss
yyyy_mm_dd_hh_MM_ss_xxx
ddmmyyyy_hhMMss
ddmmyyyy_hhMMssxxx
ddmmyyyy_hh_MM_ss
ddmmyyyy_hh_MM_ss_xxx
dd_mm_yyyy_hhMMss
dd_mm_yyyy_hhMMssxxx
dd_mm_yyyy_hh_MM_ss
dd_mm_yyyy_hh_MM_ss_xxx
mmddyyyy_hhMMss
mmddyyyy_hhMMssxxx
mmddyyyy_hh_MM_ss
mmddyyyy_hh_MM_ss_xxx
mm_dd_yyyy_hhMMss
mm_dd_yyyy_hhMMssxxx
mm_dd_yyyy_hh_MM_ss
mm_dd_yyyy_hh_MM_ss_xxx
Using regular expression patterns to match and replace text strings
The REGEXMATCH and REGEXREPLACE functions use arbitrarily complex regular expressions to find a string in a column value. A regular expression is a sequence of literal string values and pattern-matching symbols enclosed in quotation marks (").
Table 5‑6 lists the pattern-matching symbols and shows examples of their use.
Table 5‑6 Regular expression pattern-matching symbols 
Symbol
Regular expression function
Examples
\
Enables or disables the regular expression function of the following character
"\|" matches the | character
"\t" matches a horizontal tab character
|
Separates alternative values
"gray|grey" matches "gray" and "grey"
()
Creates a group that defines operator precedence
"gr(a|e)y" matches "gray" and "grey"
?
Specifies zero or one of the preceding element
"colou?r" matches "color" and "colour"
*
Specifies zero or more of the preceding element
"ab*c" matches "ac", "abc", "abbc", and so on
+
Specifies one or more of the preceding element
"ab*c" matches "abc", "abbc", and so on, but does not match "ac"
{number}
Specifies an exact number of repetitions of the preceding element
"ab{2}c" matches "abbc"
"(ab){2}c" matches "ababc"
{min,max}
Specifies the minimum and maximum repetitions of the preceding element
"ab{2,3}c" matches "abbc" and "abbbc"
"(ab){2,3}c" matches "ababc" and "abababc"
{min,}
Specifies the minimum repetitions of the preceding element
"ab{2,}c" matches "abbc", "abbbc", "abbbbc", and so on
^
Requires the subsequent pattern to match the start of the string value
"^abc" matches "abc" in the string "abcdef" but not in the string "defabc"
$
Requires the preceding pattern to match the end of the string value
"abc$" matches "abc" in the string "defabc" but not in the string "abcdef"
.
Matches any single character
"a.c" matches "aac", "abc", "acc", and so on
[character expression]
Specifies an expression that matches a single character. The expression contains one or more characters or ranges of characters. The expression interprets other pattern-matching symbols as simple characters.
"gr[ae]y" matches "gray" and "grey"
"[abc]" matches "a", "b", and "c"
"[a.c]" matches "a", ".", and "c"
"[a-z]" matches any lowercase letter from "a" to "z"
"[abcx-z]" matches "a", "b", "c", "x", "y", and "z"
"[a-zA-Z]" matches any lowercase or uppercase letter
[^character expression]
Specifies an expression that matches any single character not in the expression.
"[^a-zA-Z]" matches any character that is not a lowercase or uppercase letter, for example "1" or "@"
Table 5‑7 lists and describes escape-sequence patterns that match specific characters or classes of characters.
Table 5‑7 Regular expression escape-sequence patterns
Pattern
Represents
Examples
\b
Backspace character (0x08)
 
\e
Escape character (0x1B)
 
\n
Newline character(0x0A)
 
\r
Return character (0x0D)
 
\t
Horizontal tab character (0x09)
 
\v
Vertical tab character (0x0B)
 
\d
Any decimal digit character: 0-9
 
\D
Any character not a decimal digit
 
\h
Any hexadecimal digit character: 0-9, a-f,and A-F
 
\H
Any character not a hexadecimal digit
 
\s
Any white-space character: space, horizontal tab, vertical tab, return, newline, formfeed (0x0C), next line (0x85)
"^\s+" matches white-space characters at the start of a string
\S
Any character not a white-space character
"\S$" matches a single non-white-space character at the end of a string
\w
Any alphanumeric, underscore (_), or multibyte character
 
\W
Any character not an alphanumeric, underscore (_), or multibyte character, such as control characters (0x01 through 0x1F) and punctuation characters
 
\digit
A back reference to an element enclosed in parentheses
"(abc)def(\1)" matches "abcdefabc"
The following examples show more complex usage of regular expressions as arguments to the REGEXMATCH and REGEXREPLACE functions:
REGEXMATCH([Demo].[Customer].[column], "H(ä|ae?)ndel")
Returns 1 for strings that match the words "Handel", "Händel", and "Haendel".
REGEXMATCH([Demo].[Customer].[column], "^[ \s]+|[ \s]+$")
Matches excess whitespace (space or tab) at the beginning or end of a line.
REGEXMATCH([Demo].[Customer].[column], "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,3}$")
Matches an e-mail address for which the user name part includes punctuation characters.
REGEXMATCH([Demo].[Customer].[column], "^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$")
Uses the \w sequence to match an e-mail address for which the user name part does not include punctuation characters.
REGEXMATCH(STRING([Demo].[Customer].[Income]), "^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")
Matches any number.
REGEXREPLACE(UNICODE([Demo].[Customer].[Income]), "^([+-]?\d+)\.?(\d*)$", "\1")
Replaces a real number by its integer part.
How to create an expression
This example creates a column that shows the creation of a column of the year a customer first made their purchase.
1 In the Transform tab, choose New. The list of transformation instructions appears.
2 In the list of transformation instructions, select Expression. Expression appears, as shown in Figure 5‑18.
Figure 5‑18 Creating an expression
3 In the Table fields, select a target database and a table.
4 In the Column field, type the name of the new column.
5 In the Expression field, type the expression.
6 In the Description field, type the description of the column.
7 Click “OK” to confirm your choices and return to the main Transform screen. The Expression instruction appears in Main file.
8 Choose “Save” at the top of the Loader Project window to save the project.
9 After running the project, verify that the new column was created in Explorer in the Data Management tab.