Data Set

The Data Set is a way to read data from different sources. You have to use Data Sets in the following document types:

  • Charts
  • DashBoards
  • KPI
  • GEO

1 How to create a New Data Set

To create a new Data Set open the Data Set menu item under Resources ->Data Providers

datasetSelection.png
To create new Data Set click on the button + on the right corner of the Data Set list. Then fill in the data set definition form.
SpagoBI supports dataset versioning. Therefore, each time you edit and save a dataset, the older version is archived and is still accessible from the lower part of the detail panel. 





datasetDetail.png

Some fields are common to all types of datasets:





  • Label
  • Name
  • Description
  • Scope - this is composed of two fields, whose combination allows the definition of fine-grained purpose datasets.
    In particular:

The following table summarizes the combination of scope options and intended usage:
DatasetPrivate
 Shared
 UserDataset created from file (CSV, XLS) or from QbE (My Data) for personal use onlyDataset created from file (CSV, XLS) or from QbE (My Data) and shared with other users
 TechnicalNot applicable
Dataset created by a BI developer to be used in one or more documents. Not visible to end users.
 EnterpriseNot available
Dataset of any type created by a technical user and certified by a trusted entity within the organization, and made available to all end users for reuse


In the following we provide details about how to create each type of dataset.

1.1 Query Data Set

To define a Query Data Set you need to define a Data Source and a Query.

queryDetail.gif

Since SpagoBI 3.5 it is also possible to define together with the base query a script that will be executed to modify the query at runtime just before its execution. The script can be defined in Javascript or in Groovy. The base query is bounded to the execution context of the script (variable query)together with its parameters (variable parameters) and all the profile attributes of the user that are executing the dataset (variable attributes).




query = query.replace("LIMIT 5","LIMIT " + parameters.get('limit'));
query.replace("lname, fname","lname, fname, '[" + attributes.get('family') + "]'");







where the base query is




SELECT lname, fname 
FROM customer
LIMIT 5







1.1.1 Query Data Set (Mongo)

The steps to create a Mongo dataset are:

Step1: Create a Mongo datasource in the administrator console. Notes: the type must be JDBC and the value for Class input field must be "mongo"
JDBC: {unit_host}:{port}/{db}
CLASS: mongo

Step2: now you can create a dataset. The procedure is the same of the query datasets. The difference here is the language.. JS instead of SQL. The script must respect some convention, in particular:

  • the return value of the query must be assigned to a variable with name “query”. For example


var query = db.store.find();





  • if the return value doesn’t come from a query, for example it’s a js variable, than it must be assigned to a variable with name “sbiDatasetfixedResult”. The result will be managed by SpagoBI accordingly to the type of the variable:
  • if it’s a primitive type the resulting dataset contains only a columns with name “result” and value equal to the value of the variable sbiDatasetfixedResult
  • if it’s an object, the resulting dataset contains a column for each property of the object. Example:
Query


sbiDatasetfixedResult = {a: 2, b:3}

Data Set


a   b
2   3





  • if it’s a list than the columns of the dataset are the union of the properties of all the objects contained in the list. Example:
Query


sbiDatasetfixedResult = [{a: 2, b:3},{a: 2, c:3}]



Data Set


a   b   c
2   3   
2       3





The result of a query in MongoDB can assume different shapes: Cursor, Document, List, fix value. SpagoBI can manage automatically the result of the query. The algorithm to understand how to manage the result is very simple

  • If in the query it finds the variable sbiDatasetfixedResult the result will be managed as described above
  • If in the query it finds a findOne the result will be managed as a single document
  • If in the query it finds an aggregate the result will be managed as an aggregation
  • In the ether cases the result will be managed as a Cursor
It’s possible to force the behaviour. In particular the result stored in the variable query, will be managed:

  • as cursor if in the script exist a variable with value LIST_DOCUMENTS_QUERY. Example:


var retVal=  "LIST_DOCUMENTS_QUERY"





  • a document if in the script exist a variable with value SINGLE_DOCUMENT_QUERY. Example


var retVal=  "SINGLE_DOCUMENT_QUERY"





1.2 Script Data Set

To define a Script Data Set you need to select the script language you want to use and to write the script.

scriptDetail.gif
The script must be written using Groovy or Javascript languages. It must return an XML string containing a list of values with the syntax shown below. If the script instead returns a single value this is automatically enveloped in such XML.




<ROWS>
  <ROW value="value1" ... />
  <ROW value="value2" ... />
  ...
</ROWS>







SpagoBI provides some Groovy and Javascript functions that can be used to return the value of a single or multi-value profile attribute. These functions are explained in the information window that can be opened from the Data Set Type Tab. New custom functions can be added in files predefinedGroovyScript.groovy and predefinedJavascript.js contained in the SpagoBIUtils.jar

Starting SpagoBI 3.5 dataset parameters and all profile attributes of the user that are executing the dataset are bounded in the context of the script (variable parameters and variable attributes)



if(parameters.get('unitsSold') > attributes.get('personalTarget')) {
'OK';
} else {
'KO';
}

















1.3 File Data Set

A dataset of type File reads data from an XLS or CSV file.

To define a File Data Set select the File type, then upload the file and set the proper options for parsing it.


  dsFile.png


1.4 Web Service Data Set DEPRECATED

Web Service Data Set can retrieve data from a web service implementing a well-defined WSDL (dataset.wsdl). It it possible to find the WSDL looking at our source code. The developer must apply such WS contract when implementing the data service. The service can be deployed anywhere.

When defining the dataset in SpagoBI, the user specifies the service endpoint (URL) and the operation (which is the third input of the service). Other parameters associated to the dataset are sent in the second input as a Map.

The resulting XML file has the following structure: the main tag is ROWS, each record defines a ROW tag, attributes are specified by their name and value in the ROW xml element;

WSdetail.gif

1.5 Java Class Data Set

To define a Java Class Data Set you need to define the Java Class that you waqnt to use.

jclassDetail.gif

1.6 Qbe Data Set

To define a Qbe Data Set you need to define the Data Source and Datamart that you want to use.

qbeDetail.gif
Once chosen your datamart you can click on the Lookup button of the Qbe Query field and a pop up window will appear showing a Qbe interface where you can easily define your query:

datasetQbeDetail.gif

1.7 Flat Data Set

A flat dataset allows the retrieval and storage of an entire table from a data source. In other words, it replaces a dummy query like "select * from Sales" by automatically retrieving all rows in a table.

To create a flat dataset, simply enter the table and the data source name, as shown below:
flatDataset.png

2 Parameters and Profile Attributes

2.1 Parameters and Profile Attributes in SQL statements

In a Data Set you can insert parameters or profile attributes, both single and multivalue. The correct syntax to insert a parameter or a profile attribute in a query or a script is the following one:

$P{name of the parameter}

While the syntax to insert a profile attribute is:

${name of the attribute}

As far as multi-value attributes are concerned, the behaviour is different if the user is defining a query or a script.

When inserting a parameter into the query  or script; the parameter must also be inserted in the table in dataset definition (in "type" tab), specifying its name  ("parameter_name" in the example above) and it type.

Possible type are:

  • String: provides to add single quote ( ') at the begnning and end of the passed value, if not already present
  • Number: the value is validated as a number; a numberFormat exception is throwed if it cannot be converted
  • Raw: provides to remove single quote (') if present
  • Generic: the value  is passed to the engine without any modification
Note: for multivalue parameter (expecially if they are STRING type), is better to specify the correct values configuration into the statement with the next syntax:

$P{<name of the parameter>(prefix;split&#95;char;suffix)}

For example:

SELECT FAMILY, STORE FROM ... WHERE FAMILY IN ($P{par&#95;family(';,;')})

In this case the final statement will be :

SELECT FAMILY, STORE FROM ... WHERE FAMILY IN ('Drink','Food')

2.2 Parameters in Qbe datasets

Open the Qbe wizard and click on "Add" button on the "Parameters" box in the bottom-right corner: you have to specify the name of the parameter and its type. Then you can drag&drop it into a filter within the query designer panel. When previewing the query result, the Qbe wizard will ask you the value to be considered for the parameter. When closing the Qbe wizard, the parameters definition will be inherited at the dataset definition level, so you don't have to define them again. Last step: when using the dataset within an anlytical document, you have to assign the document an analytical driver providing the parameter's name as the URL name for the document parameter.

3 Transformations

You can use directly the result of a dataset, or you can process it. At the moment the unique transformation available is the Pivot one.

dspivottransf.gif


This operation allows you to switch rows and columns of your resultset:

pivot.gif


Sometimes this is necessary in order to execute correctly some document types like the charts. Eventually, it is possible to require an automatic numeration for the output column by clicking on ‘Numbered Columns’ checkbox. (ie. It can return columns called x1,y1,x2,y2,…where ‘x’ and ‘y’ are the names of the original columns)

4 Advanced (persistent datasets)

This tab is used to make a dataset persistent, i.e., to write it on the default database.
Making a dataset persisent may be useful in case dataset calculation takes a considerable amount of time. Instead of recalculating the dataset each time the document using it is executed, the dataset is calculated once and then retrieved from a table to improve performance.

In order to force recalculation of the dataset, you should execute dataset preview again. This will store the newly generated data on the db table.
persistentDataset.png

5 Result

You can see the result of your Data Set by going to the last tab called Preview. There you can fill in eventual parameter values and then click on the preview button.

dsConPar.gif
A pop-up like the following one will appear with the result of your Data Set.

datasetresult.gif

Tags:
Created by bernabei on 2009/09/14 14:39
Last modified by Alberto Ghedin on 2017/11/30 14:36

This wiki is licensed under a Creative Commons 2.0 license
XWiki Enterprise 2.7.33694 - Documentation