Matching string patterns
The previous section described some techniques for getting parts of a string for display. Sometimes you need to match patterns, rather than literal substrings, in string values. For example, use pattern-matching to:
* Filter rows to display only customers whose last names start with a particular string pattern.
* Search for string patterns, using wildcard characters, and replace them with a different string.
To perform pattern-matching, use regular expressions. A regular expression, also known as regexp, is an expression that searches for a pattern within a string. Many programming languages support regular expressions for complex string manipulation. JavaScript regular expressions are based on the regular expression features of the Perl programming language with a few differences.
In JavaScript, a regular expression is represented by the RegExp object, which you create by using a special literal syntax. Just as you specify a string literal as characters within quotation marks, you specify a regular expression as characters within a pair of forward slash (/) characters, as shown in the following example:
var pattern = /smith/;
This expression creates a RegExp object and assigns it to the variable pattern. The RegExp object finds the string “smith” within strings, such as smith, blacksmith, smithers, or mark smith. It does not match Smith or Mark Smith because the search is case sensitive.
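The behavior described above can be checked with a short sketch. It assumes plain JavaScript outside a report; the samples array and the matched list are hypothetical names used only for illustration.

```javascript
// The literal pattern from the text: case-sensitive search for "smith".
var pattern = /smith/;

// Sample values taken from the paragraph above.
var samples = ["smith", "blacksmith", "smithers", "mark smith", "Smith", "Mark Smith"];

// Collect the values in which the pattern is found.
var matched = [];
for (var i = 0; i < samples.length; i++) {
  if (pattern.test(samples[i])) {
    matched.push(samples[i]);
  }
}
// matched now holds: smith, blacksmith, smithers, mark smith
```

The two capitalized values are left out because the pattern has no i flag.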
You can perform complex pattern-matching by using any number of special characters along with the literal string to search. Table 11‑3 shows a few examples of regular expressions that contain special characters. There are many more special characters that you can use in a regular expression, too many to summarize in this section.
Table 11‑3 Examples of regular expressions
Regular expression
Description
/y$/
Matches any string that contains the letter “y” as its last character. The $ character specifies that the character to search for is at the end of a string.
Matches: Carey, tommy, johnny, Fahey.
Does not match: young, gayle, faye.
/^smith/i
Matches any string that starts with “smith”. The ^ character specifies that the string to search for is at the beginning of a string. The i flag makes the search case insensitive.
Matches: Smith, smithers, Smithsonian.
Does not match: blacksmith, John Smith.
/go*d/
Matches any string that contains this pattern. The asterisk (*) matches zero or more occurrences of the preceding character, which is “o” in this example.
Matches: gd, god, good, goood, goodies, for goodness sake.
Does not match: ged, gored.
/go?d/
Matches any string that contains this pattern. The question mark (?) matches zero or one occurrence of the preceding character, which is “o” in this example.
Matches: gd, god, godiva, for god and country.
Does not match: ged, gored, good, for goodness sake.
/go.*/
Matches any string that contains “go” followed by any number of characters. The period (.) matches any character, except the newline character.
Matches: go, good, gory, allegory.
/Ac[eio]r/
Matches any string that contains “Ac” followed by exactly one of e, i, or o, and then r.
Matches: Acer, Acir, Acor, Acerre, National Acer Inc.
Does not match: Aceir, Acior, Aceior.
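One way to confirm the entries in Table 11‑3 is to run each pattern against its listed values. The following is a minimal sketch in plain JavaScript; the examples array and the findMismatches function are hypothetical names, not part of BIRT.

```javascript
// Each entry pairs a pattern from Table 11-3 with the values the table
// lists as matching ("matches") and not matching ("misses").
var examples = [
  { pattern: /y$/, matches: ["Carey", "tommy", "johnny", "Fahey"], misses: ["young", "gayle", "faye"] },
  { pattern: /^smith/i, matches: ["Smith", "smithers", "Smithsonian"], misses: ["blacksmith", "John Smith"] },
  { pattern: /go*d/, matches: ["gd", "god", "good", "goood", "goodies", "for goodness sake"], misses: ["ged", "gored"] },
  { pattern: /go?d/, matches: ["gd", "god", "godiva", "for god and country"], misses: ["ged", "gored", "good", "for goodness sake"] },
  { pattern: /go.*/, matches: ["go", "good", "gory", "allegory"], misses: [] },
  { pattern: /Ac[eio]r/, matches: ["Acer", "Acir", "Acor", "Acerre", "National Acer Inc."], misses: ["Aceir", "Acior", "Aceior"] }
];

// Returns a description of every value whose result disagrees with the table.
function findMismatches(list) {
  var problems = [];
  for (var i = 0; i < list.length; i++) {
    var entry = list[i];
    for (var j = 0; j < entry.matches.length; j++) {
      if (!entry.pattern.test(entry.matches[j])) {
        problems.push(entry.pattern + " should match " + entry.matches[j]);
      }
    }
    for (var k = 0; k < entry.misses.length; k++) {
      if (entry.pattern.test(entry.misses[k])) {
        problems.push(entry.pattern + " should not match " + entry.misses[k]);
      }
    }
  }
  return problems;
}

var problems = findMismatches(examples);
// problems is an empty array when every table entry behaves as described.
```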
The RegExp object provides several functions for manipulating regular expressions. The following is an example of using a regular expression with the test( ) function to test for customer names that start with “national”:
var pattern = /^national/i;
var result = pattern.test(row["customerName"]);
The first statement specifies the string pattern to search. The second statement uses the test( ) function to check if the string pattern exists in the customerName field value. The test( ) function returns a value of true or false, which is stored in the result variable.
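To see both outcomes of test( ), the following sketch runs the same pattern against two values. In a report, BIRT supplies the row object; here a plain object stands in for it, and the startsWithNational function and the sample customer names are assumptions made for illustration.

```javascript
// Case-insensitive pattern anchored at the start of the string.
var pattern = /^national/i;

// Hypothetical helper: "row" is a plain object standing in for the BIRT row.
function startsWithNational(row) {
  return pattern.test(row["customerName"]);
}

var first = startsWithNational({ customerName: "National Acer Inc." });   // true
var second = startsWithNational({ customerName: "First National Bank" }); // false
```

The second value contains the word but does not start with it, so the ^ character prevents a match.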
If you are familiar with regular expressions in other languages, note that some of the syntax of JavaScript regular expressions differs from the syntax of Java or Perl regular expressions. Most notably, JavaScript uses forward slashes (/ /) to delimit a regular expression, whereas Java specifies a regular expression as a string within quotation marks (" "). Perl, like JavaScript, typically delimits a pattern with forward slashes.
Using pattern-matching in filter conditions
In BIRT Report Designer, regular expressions are particularly useful when creating filter conditions. For example, a filter condition can contain a regular expression that tests whether the value of a string field matches a specified string pattern. Only data rows that meet the filter condition are displayed. For example, you can create a filter to display only rows where a memo field contains the words “Account overdrawn”, where a customer e-mail address ends with “.org”, or where a product code starts with “S10”.
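The three conditions mentioned above could be written as the following patterns. This is a sketch in plain JavaScript that uses Array filter( ) to stand in for the report's filtering; it is not how the BIRT filter tool is invoked. The rows array, its field names, and its values are hypothetical.

```javascript
// Hypothetical data rows; field names and values are made up for illustration.
var rows = [
  { memo: "Account overdrawn on 3 May", email: "info@example.org", productCode: "S10_1678" },
  { memo: "Payment received", email: "sales@example.com", productCode: "S12_1099" },
  { memo: "account overdrawn", email: "help@example.org", productCode: "S100_200" }
];

// Memo field contains the words "Account overdrawn" (case sensitive).
var overdrawn = rows.filter(function (r) { return /Account overdrawn/.test(r.memo); });

// E-mail address ends with ".org"; the backslash makes the period literal.
var orgAddresses = rows.filter(function (r) { return /\.org$/.test(r.email); });

// Product code starts with "S10".
var s10Products = rows.filter(function (r) { return /^S10/.test(r.productCode); });
```

Note that /^S10/ also keeps a code such as S100_200, because the pattern checks only the first three characters.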
When using the filter tool in BIRT Report Designer to specify this type of filter condition, use the Match operator, and specify the regular expression, or string pattern, to match. Figure 11‑8 shows an example of specifying a filter condition that uses a regular expression.
Figure 11‑8 Example of regular expression
In this example, the filter condition is applied to a table in the report design. In the generated report, the table displays only customers whose names contain the word National. You can learn more about filtering data in the next chapter.
Using pattern-matching to search and replace string values
So far, this chapter has described some of the syntax that is used to create regular expressions. This section discusses how regular expressions can be used in JavaScript code to search for and replace string values.
Recall that in “Substituting string values,” earlier in this section, we used replace( ) to search for a specified string and replace it with another. Sometimes, you need the flexibility of searching for a string pattern rather than a specific string.
Consider the example that was discussed in that earlier section. The row["address"].replace("St.", "Street") expression replaces St. Mary Road with Street Mary Road. To avoid this type of erroneous search-and-replace action, use the following expression to search for “St.” only at the end of the string. The $ character specifies a match at the end of a string. The backslash (\) before the period makes the expression match a literal period rather than any character.
row["address"].replace(/St\.$/, "Street")
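One detail is worth checking when anchoring on “St.”: in a regular expression, an unescaped period matches any character, so writing it as \. limits the match to a literal period. The following sketch compares the two forms; the expandStreet function and the sample addresses are hypothetical.

```javascript
// Replaces "St." with "Street" only when it ends the string.
// The period is escaped so that it matches a literal "." and nothing else.
function expandStreet(address) {
  return address.replace(/St\.$/, "Street");
}

var atEnd = expandStreet("100 Main St.");    // "100 Main Street"
var atStart = expandStreet("St. Mary Road"); // unchanged: "St. Mary Road"

// With an unescaped period, any final character after "St" is replaced too.
var unescaped = "12 West Stn".replace(/St.$/, "Street"); // "12 West Street"
var escaped = expandStreet("12 West Stn");               // unchanged: "12 West Stn"
```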
Consider another example: A report displays the contents of a memo field. You notice that in the content, the word JavaScript appears as javascript, Javascript, and JavaScript. You want JavaScript to appear consistently in the report. To do so, write the following expression to search for various versions of the word and replace them with JavaScript:
row["memoField"].replace("javascript", "JavaScript").replace("Javascript", "JavaScript")
This expression searches for the specified strings only. It would miss, for example, JAVASCRIPT or javaScript. You can, of course, add as many versions of the word as you can think of, but this technique is not efficient.
An efficient and flexible solution is to use a regular expression to search for any and all versions of JavaScript. The following expression replaces all versions of JavaScript with the correct capitalization, no matter how the word is capitalized:
row["memoField"].replace(/javascript/gi, "JavaScript")
The g flag specifies a global search, causing all occurrences of the pattern to be replaced, not just the first. The i flag specifies a case-insensitive search.
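The effect of each flag can be seen by comparing three variations on a single value. This is a sketch in plain JavaScript; the memo string is a made-up sample standing in for row["memoField"].

```javascript
// Hypothetical memo text containing three capitalizations of the word.
var memo = "javascript, Javascript and JAVASCRIPT";

// Chained string replacements handle only the exact spellings listed.
var chained = memo.replace("javascript", "JavaScript").replace("Javascript", "JavaScript");
// "JavaScript, JavaScript and JAVASCRIPT"

// The i flag alone ignores case but replaces only the first occurrence.
var firstOnly = memo.replace(/javascript/i, "JavaScript");
// "JavaScript, Javascript and JAVASCRIPT"

// The g and i flags together replace every occurrence, however it is capitalized.
var all = memo.replace(/javascript/gi, "JavaScript");
// "JavaScript, JavaScript and JavaScript"
```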