Specifying the data to retrieve from Amazon DynamoDB
Once the report connects to an Amazon DynamoDB database, you create a data set and select the table from which to retrieve data. A data set can retrieve data from one table only.
After selecting a table, you select the attributes from which to retrieve data. BIRT maps each selected attribute to a data set column. Because DynamoDB is a schema-less database in which each table item can contain a different set of attributes, you can specify the number of items to scan to compile the list of attributes. Scanning items in a table can be resource intensive, so scan only as many items as you need. If all the table items contain the same attributes, specify 1 (the default) as the number of items to scan.
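The effect of the scan count can be sketched in a few lines of Python. This is only an illustration of the idea, not BIRT's implementation: `discover_attributes` is a hypothetical helper, and the sample items are invented.

```python
from itertools import islice


def discover_attributes(items, items_to_scan=1):
    """Return the attribute names found in the first `items_to_scan` items,
    in the order they are first seen."""
    seen = {}
    for item in islice(items, items_to_scan):
        for name in item:
            seen.setdefault(name, None)
    return list(seen)


# Hypothetical table items: the second item has attributes the first lacks.
catalog = [
    {"ProductID": 101, "Title": "Book 101"},
    {"ProductID": 201, "Title": "Road bike", "BicycleType": "Road", "Color": "Red"},
]

print(discover_attributes(catalog))     # ['ProductID', 'Title']
print(discover_attributes(catalog, 2))  # ['ProductID', 'Title', 'BicycleType', 'Color']
```

Scanning one item misses BicycleType and Color in this example, which is why a larger scan count can be needed when items differ.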
How to specify what data to retrieve from an Amazon DynamoDB database
1 In Data Explorer, right-click Data Sets, then choose New Data Set.
2 In New Data Set, specify the following information:
1 In Data Source Selection, select the Amazon DynamoDB data source to use. Data Set Type displays Amazon DynamoDB Data Set.
2 In Data Set Name, type a name for the data set.
3 Choose Next.
3 In New Data Set, in Query, do the following:
1 In DynamoDB Table, select the table from which to retrieve data.
2 In Number of Items to Scan for Attributes, type the number of table items to search for attributes, then choose Scan. The fewer items scanned, the faster the response.
Available Attributes displays the attributes defined in the scanned items. If you do not see the attributes you expect and want, increase the number of items to scan, then choose Scan.
3 In Available Attributes, select the attribute or attributes whose data to retrieve.
4 If Searchable by Composite Key is available, you can filter the data to retrieve by searching for a hash key value, a range key value, or both. For information about this task, see “Filtering by a composite primary key,” later in this chapter.
5 In Advanced Settings, specify the following options:
*In AWS fetch size, type the maximum number of items to return in each web service call to the database, or select No fetch size limit. If you select the latter, each fetch operation returns as much of the result set as fits in 1 MB, the limit set by Amazon DynamoDB. The smaller the fetch size, the faster the response time per web service call. The larger the fetch size, the fewer calls are needed to fetch the data.
*Select the Eventually consistent reads option to maximize the read throughput. Deselect this option to request a strongly consistent read, which returns a result that reflects all writes that received a successful response prior to the read.
*In Separator character(s) in a multi-valued set column, specify the character to use to separate values in a multi-value set. By default, BIRT returns a multi-value set as a string in the following format:
Value1|Value2|Value3
You can change the separator character to a comma, for example, to return results in the following format:
Value1,Value2,Value3
Figure 6‑2 shows an example of an Amazon DynamoDB query.
Figure 6‑2 Example of an Amazon DynamoDB query
4 Choose Finish to save the data set. Edit Data Set displays the columns, and provides options for editing the data set.
5 Choose Preview Results to view the data rows returned by the data set.
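Two of the Advanced Settings options in step 3 can be illustrated with a short Python sketch. It does not reproduce BIRT's behavior: `fetch_pages` and `format_set` are hypothetical helpers, the sketch ignores the 1 MB per-call limit, and sorting the set values is an assumption made only to keep the output stable.

```python
def fetch_pages(items, fetch_size=None):
    """Yield the result set one page per simulated web service call.
    fetch_size=None stands in for the No fetch size limit option."""
    if fetch_size is None:
        yield list(items)
        return
    for start in range(0, len(items), fetch_size):
        yield items[start:start + fetch_size]


def format_set(values, separator="|"):
    """Join a multi-value set into one string (sorted here for a stable order)."""
    return separator.join(sorted(values))


rows = list(range(10))  # stand-in for ten table items
print(len(list(fetch_pages(rows, 4))))  # 3 calls with a fetch size of 4
print(len(list(fetch_pages(rows))))     # 1 call with no fetch size limit

print(format_set({"Red", "Black"}))       # Black|Red
print(format_set({"Red", "Black"}, ","))  # Black,Red
```

A smaller fetch size produces more, smaller pages. A larger one produces fewer calls, which is the trade-off the AWS fetch size option describes.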
Filtering data
Amazon DynamoDB is designed to store large volumes of data across multiple servers. Database users must design tables for efficient write and read operations. As discussed earlier, the required primary key is the only part of a table that is indexed, and it is also used to hash-partition data across multiple servers.
Amazon DynamoDB supports two types of primary keys:
*Hash primary key, which consists of one attribute. For example, a product catalog table can use ProductID as its primary key.
*Composite primary key, which consists of two attributes. The first attribute is a hash attribute and the second attribute is a range attribute. For example, a forum table can use ForumName and Subject as its primary key, where ForumName is the hash attribute and Subject is the range attribute.
A table’s primary key type determines how you specify a filter condition, and how Amazon DynamoDB searches for data, as the following sections describe.
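As an illustration, the two key types can be written in the key schema shape that DynamoDB uses to describe a table, where each key attribute is marked HASH or RANGE. The attribute names come from the examples above. `primary_key_type` is a hypothetical helper added for this sketch, not part of BIRT or of any AWS library.

```python
# Hash primary key: one attribute.
product_catalog_keys = [
    {"AttributeName": "ProductID", "KeyType": "HASH"},
]

# Composite primary key: a hash attribute followed by a range attribute.
forum_keys = [
    {"AttributeName": "ForumName", "KeyType": "HASH"},
    {"AttributeName": "Subject", "KeyType": "RANGE"},
]


def primary_key_type(key_schema):
    """Classify a key schema as 'hash' or 'composite' (hypothetical helper)."""
    key_types = {element["KeyType"] for element in key_schema}
    return "composite" if key_types == {"HASH", "RANGE"} else "hash"


print(primary_key_type(product_catalog_keys))  # hash
print(primary_key_type(forum_keys))            # composite
```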
Filtering by a composite primary key
A composite key supports searching for a specific value in the hash attribute, and can include searching on the range attribute as well. Searching on both attributes narrows a search. When a composite primary key is defined for a table, Amazon DynamoDB uses its Query API to search on the key index only. This type of search is typically efficient.
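The reason a key search is typically efficient can be sketched with a toy in-memory table in Python. This is a simplified model under stated assumptions, not how DynamoDB is implemented: `CompositeKeyTable` is hypothetical, and only exact matches on the range attribute are modeled.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy table indexed on a (hash attribute, range attribute) pair."""

    def __init__(self, hash_attr, range_attr):
        self.hash_attr = hash_attr
        self.range_attr = range_attr
        # hash value -> {range value: item}
        self.partitions = defaultdict(dict)

    def put(self, item):
        self.partitions[item[self.hash_attr]][item[self.range_attr]] = item

    def query(self, hash_value, range_value=None):
        """Look up items by hash value, optionally narrowed by a range value.
        Only the matching partition is read; other items are never touched."""
        partition = self.partitions.get(hash_value, {})
        if range_value is None:
            return list(partition.values())
        return [partition[range_value]] if range_value in partition else []


threads = CompositeKeyTable("ForumName", "Subject")
threads.put({"ForumName": "Amazon DynamoDB", "Subject": "Thread 1"})
threads.put({"ForumName": "Amazon DynamoDB", "Subject": "Thread 2"})
threads.put({"ForumName": "Amazon S3", "Subject": "Thread 1"})

print([t["Subject"] for t in threads.query("Amazon DynamoDB")])  # ['Thread 1', 'Thread 2']
print(len(threads.query("Amazon DynamoDB", "Thread 2")))         # 1
```

Supplying both values narrows the result from every thread in the forum to a single item.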
If you select a table that uses a composite primary key, the query page of the data set editor displays the Searchable by Composite Key option, as shown in Figure 6‑3. This option is disabled if the selected table uses a hash primary key.
Figure 6‑3 Query page displaying the Searchable by Composite Key option
In this example, the hash attribute is ForumName and the range attribute is Subject. You can select one or both of these attributes on which to filter. Each attribute you select creates a corresponding data set parameter, as shown in Figure 6‑4.
Figure 6‑4 Data set parameters associated with the selected attributes in the composite primary key
You must edit each data set parameter to specify the attribute value to search for. Figure 6‑5 shows searching for the value Amazon DynamoDB in the ForumName hash attribute.
Figure 6‑5 Parameter value specified for an attribute in a composite primary key
Filtering by an attribute
You can filter data by any attribute selected in a data set. When filtering by an attribute that is not part of the primary key, Amazon DynamoDB uses its Scan API to read the entire table, then discards the items that do not match to produce the result set. This type of search is not efficient, and slows down as a table grows.
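The cost of this kind of search can be sketched in Python. The sketch only models the idea of reading every item before filtering: `scan_with_filter` is a hypothetical helper and the sample items are invented.

```python
def scan_with_filter(items, attribute, expected):
    """Examine every item, keep the matches, and report how many items were read."""
    examined = 0
    matches = []
    for item in items:
        examined += 1
        if item.get(attribute) == expected:
            matches.append(item)
    return matches, examined


# Hypothetical catalog items; one has no BicycleType attribute at all.
catalog_items = [
    {"ProductID": 201, "BicycleType": "Road"},
    {"ProductID": 202, "BicycleType": "Mountain"},
    {"ProductID": 203, "BicycleType": "Hybrid"},
    {"ProductID": 101},
]

matches, examined = scan_with_filter(catalog_items, "BicycleType", "Road")
print(len(matches), examined)  # 1 4
```

One item matches, but all four are read. The number of items read grows with the table, not with the size of the result.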
To filter by an attribute that is not part of a composite primary key, use the Filters page in the data set editor. Figure 6‑6 shows an example of a filter condition created for the Product Catalog data set, where the BicycleType attribute is equal to Road.
Figure 6‑6 A filter condition specified for an attribute
This filter condition uses the Equal to operator, which looks for an exact match. With this filter, a match is found only if the BicycleType attribute contains the single value Road. As mentioned earlier, however, an attribute can contain a multi‑value set, which BIRT returns in value1|value2|value3 format. If the BicycleType attribute contains a multi-value set, such as Road|Hybrid, there is no match.
If you do not know whether a string attribute contains a single value or a multi‑value set, do not use the Equal to operator in the filter condition. Instead, use the following operator:
Contains substring, or value in a set
To exclude a value when comparing values in a multi-value set, use the following operator:
Absence of substring, or value in a set
Figure 6‑7 shows a filter condition where the Color attribute must contain the value Red.
Figure 6‑7 A filter condition specified for an attribute that contains a multi‑value set
Figure 6‑8 shows the data rows returned when the filter condition in Figure 6‑7 is applied. The Color column in each row contains the value Red.
Figure 6‑8 Results of applying a filter condition
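The difference between the Equal to operator and the two set-aware operators can be sketched in Python. The sketch models a single value as a string and a multi-value set as a Python set, which is an assumption made for illustration. The function names are hypothetical, and the code does not show how BIRT or Amazon DynamoDB evaluates a filter internally.

```python
def equal_to(value, target):
    """Exact match: a multi-value set never equals a single value."""
    return value == target


def contains(value, target):
    """Substring test for a string, membership test for a set."""
    return target in value


def not_contains(value, target):
    """Opposite of contains: the substring or set value must be absent."""
    return target not in value


print(equal_to("Road", "Road"))              # True
print(equal_to({"Road", "Hybrid"}, "Road"))  # False
print(contains({"Road", "Hybrid"}, "Road"))  # True
print(contains("Road", "Road"))              # True
print(not_contains({"Red", "Black"}, "Blue"))  # True
```

The contains test matches whether the attribute holds a single value or a set, which is why it is the safer choice when the shape of the attribute is unknown.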