A dataset is a representation of a table, view or data entity physically stored in a data source that Biuwer accesses through the corresponding Data Connection.
The easiest way to create datasets is to use Reverse Engineering on connections that support it, because the process is fast, efficient and almost fully automatic.
However, you can also create and manage datasets manually in Biuwer, in which case you must know the details of the corresponding table, view or data entity in the data source.
Datasets are also created or updated automatically when you upload data from files in CSV or Excel format. In this case, the generated datasets are called "Managed" because Biuwer manages the data.
Therefore, there are two types of Datasets:
Managed: Biuwer manages both the metadata and the data, and stores them for you in a CDW (Cloud Data Warehouse) specific to your organization. Managed datasets are available when you upload external files in CSV or Excel format, when you have connections to external applications accessible through an API, and for datasets that you define for use with the Biuwer data preparation module.
Not Managed: Biuwer holds only the metadata it needs to run queries, while your organization physically manages the data and is responsible for updating and maintaining it. This is the most common case when you work with SQL or NoSQL databases that your company manages, for example, those used by ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), Ecommerce, etc.
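The difference between the two types comes down to who stores and maintains the data. The following is a minimal sketch of that distinction, not Biuwer's internal model: the `Dataset` class, the `DatasetKind` enum, the `data_owner` helper and the sample values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class DatasetKind(Enum):
    MANAGED = "Managed"
    NOT_MANAGED = "Not Managed"


@dataclass(frozen=True)
class Dataset:
    name: str        # physical name of the table, view or data entity
    alias: str       # business-friendly name
    connection: str  # name of the Data Connection (hypothetical values below)
    kind: DatasetKind


def data_owner(dataset: Dataset) -> str:
    """Return who physically stores and maintains the data."""
    return "Biuwer" if dataset.kind is DatasetKind.MANAGED else "Organization"


uploaded = Dataset("sales_2023", "Sales 2023", "CSV uploads", DatasetKind.MANAGED)
erp_orders = Dataset("orders", "Orders", "ERP database", DatasetKind.NOT_MANAGED)

print(data_owner(uploaded))
print(data_owner(erp_orders))
```

In both cases Biuwer holds the metadata; only the ownership of the data itself changes.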
In the Data Center, the list of datasets that your Organization has defined so far in Biuwer is available in the menu "Datasets":
From this list you can perform the following operations:
Filter datasets by Name, Alias, Connection and whether they are Managed or Not Managed.
Create a new dataset, using the top right "Add" button.
From the context menu of each dataset, View the detail, Edit the dataset or Delete the dataset.
Use the "Add" button available in the dataset list to manually create a dataset.
A dataset creation dialog appears in which you must first choose the type of dataset to create, Managed or Not Managed. Depending on your choice, the parameters required for that type are enabled.
Creating a new managed dataset means that the Organization will have a table available in the DWH (Data Warehouse) managed by Biuwer, whose physical name matches the value of the dataset's "Name" attribute. This table is initially empty. You can insert data into it by uploading CSV or Excel files, or by using the Biuwer data preparation module.
Creating a new not managed dataset means that the Organization can query a table, view or data entity whose physical name matches the value of the dataset's "Name" attribute, in the SQL or NoSQL database system associated with the connection used. If no such table, view or data entity exists, any query against its data fields fails with an error. The Organization is responsible for ensuring that the data entity exists and contains the expected data before it is analyzed in Biuwer.
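To illustrate why the "Name" attribute of a not managed dataset must match an existing entity, here is a small sketch that uses an in-memory SQLite database as a stand-in for your own data source. It does not show how Biuwer builds its queries; the `query_dataset` helper and the table names are assumptions made for this example.

```python
import sqlite3

# Stand-in for a database managed by your organization.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 19.5)")


def query_dataset(connection, dataset_name):
    """Query the entity whose physical name equals the dataset's Name."""
    # Identifier quoting is simplified; do not use with untrusted input.
    try:
        rows = connection.execute(f'SELECT * FROM "{dataset_name}"').fetchall()
        return "ok", rows
    except sqlite3.OperationalError as exc:
        # Raised by SQLite when the table or view does not exist.
        return "error", str(exc)


status_existing, rows = query_dataset(conn, "orders")
status_missing, message = query_dataset(conn, "invoices")

print(status_existing, rows)
print(status_missing, message)
```

The first query succeeds because `orders` exists in the source; the second fails because no entity named `invoices` has been created there.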
When you access a Dataset within the Data Center, its full detail is displayed, with access to the data fields, a data preview with the first 100 records, and the configuration of the dataset's data policies.
In addition to viewing the details described below, you can edit the dataset, and you can delete it provided it has no active dependencies, that is, it is not used in any Data Model and therefore not in any Card.
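The deletion rule can be pictured as a simple dependency check. This is only a sketch under the assumption that each Data Model lists the datasets it uses; the `can_delete` function and the sample model names are hypothetical, not part of Biuwer.

```python
def can_delete(dataset_name, data_models):
    """Return (allowed, blocking_models) for a dataset deletion request."""
    blocking = sorted(
        model for model, datasets in data_models.items() if dataset_name in datasets
    )
    # Deletion is allowed only when no Data Model depends on the dataset.
    return len(blocking) == 0, blocking


# Hypothetical Data Models and the datasets each one uses.
data_models = {
    "Sales model": {"orders", "customers"},
    "Stock model": {"products"},
}

print(can_delete("orders", data_models))
print(can_delete("old_import", data_models))
```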
From this list you can view and manage the existing fields in the dataset. You can perform the following operations:
Filter fields, by any of their attributes.
Add a field, which can be either Standard or Calculated.
Launch Reverse Engineering for the dataset being displayed, in order to add new fields or update fields that have changed at the source.
Dataset fields marked as hidden are not shown to users when they compose data cards. It can still be useful to keep such fields, for example internal identifiers, for data validation.
The preview gives you an idea of the type of information available in the dataset before you model it and move on to building reports, charts, etc.
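The preview is limited to the first 100 records of the dataset. A minimal sketch of that behavior, assuming the records arrive as an iterable (the `preview` function and the generated rows are hypothetical):

```python
from itertools import islice

PREVIEW_LIMIT = 100  # the dataset detail previews the first 100 records


def preview(records, limit=PREVIEW_LIMIT):
    """Return at most `limit` records from the start of an iterable."""
    return list(islice(records, limit))


# Hypothetical source with more rows than the preview shows.
source_rows = ({"id": i, "amount": i * 1.5} for i in range(250))
sample = preview(source_rows)

print(len(sample), sample[0], sample[-1])
```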
Data policies let you dynamically display different subsets of the data in the same dataset to different users or groups of users. This reduces the number of pages and cards you need to design in Biuwer, because the same chart, table, map or KPI can show each user profile the information appropriate to it. For example, a Sales Dashboard can be designed and implemented in Biuwer to display:
Full details to the company's management team.
Data filtered by sales areas to each area sales manager.
Data filtered by clients to account executives, according to the clients managed by each one.
Data policies are explained in detail in the corresponding section.
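Conceptually, a data policy acts like a row filter applied per user profile before the same card is rendered. The sketch below mirrors the Sales Dashboard example with plain Python predicates. It is an illustration only and does not reflect how policies are configured in Biuwer; the profile names, the `policies` mapping and the sample rows are assumptions.

```python
sales = [
    {"area": "North", "client": "Acme", "amount": 100},
    {"area": "North", "client": "Globex", "amount": 250},
    {"area": "South", "client": "Initech", "amount": 80},
]

# Hypothetical policies: one row predicate per user profile.
policies = {
    "management": lambda row: True,                         # full details
    "north_area_manager": lambda row: row["area"] == "North",
    "account_executive": lambda row: row["client"] in {"Acme", "Initech"},
}


def visible_rows(rows, profile):
    """Return only the rows the given profile is allowed to see."""
    allowed = policies[profile]
    return [row for row in rows if allowed(row)]


def total_sales(rows):
    """The same KPI, computed over whatever rows the policy lets through."""
    return sum(row["amount"] for row in rows)


for profile in policies:
    print(profile, total_sales(visible_rows(sales, profile)))
```

A single KPI definition (`total_sales`) yields a different figure for each profile, which is why one dashboard can serve several audiences.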
Once we have datasets in Biuwer, we can manage their data fields whenever necessary: adding new ones, editing existing ones or deleting them.
When a particular field needs to be modified, we can edit it using the following dialog where we can modify:
The physical name of the field in the data entity.
The alias of the field in Biuwer. The alias is intended to be a business name, free of characters typical of physical names such as "_" or "-", and it is the name presented to the user in the final result of cards.
The description of the field in Biuwer. It does not appear in the end user interface, but it’s useful to explain the meaning of the field, how it was obtained, how it was calculated, aspects to take into account for analysis, etc.
The data type of the field: Text, Number, Date or Boolean.
The type of field: Dimension or Metric.
The default aggregation function for metric type fields, which depends on the selected data type.
Whether the field is hidden from the user.
Whether the field is calculated.
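The attributes listed above can be summarized as a small data structure. This is a sketch, not Biuwer's schema: the `Field` class is hypothetical, and the mapping from data type to default aggregation is an assumption, since the text only states that the default depends on the data type.

```python
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: illustrative defaults only; the real defaults are not documented here.
ASSUMED_DEFAULT_AGGREGATION = {
    "Number": "sum",
    "Text": "count",
    "Date": "count",
    "Boolean": "count",
}


@dataclass
class Field:
    name: str                 # physical name in the data entity
    alias: str                # business name shown in cards
    data_type: str            # Text, Number, Date or Boolean
    field_type: str           # Dimension or Metric
    description: str = ""     # not shown in the end user interface
    hidden: bool = False
    calculated: bool = False
    aggregation: Optional[str] = None

    def __post_init__(self):
        if self.data_type not in ASSUMED_DEFAULT_AGGREGATION:
            raise ValueError(f"unknown data type: {self.data_type}")
        if self.field_type not in ("Dimension", "Metric"):
            raise ValueError(f"unknown field type: {self.field_type}")
        # Only metrics carry an aggregation; fill in the assumed default.
        if self.field_type == "Metric" and self.aggregation is None:
            self.aggregation = ASSUMED_DEFAULT_AGGREGATION[self.data_type]


fields = [
    Field("customer_id", "Customer ID", "Number", "Dimension", hidden=True),
    Field("order_date", "Order Date", "Date", "Dimension"),
    Field("total_amount", "Total Amount", "Number", "Metric"),
]

# Hidden fields are kept in the dataset but not offered when composing cards.
visible_aliases = [f.alias for f in fields if not f.hidden]
print(visible_aliases)
print(fields[2].aggregation)
```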
Sometimes it is necessary or recommended to create calculated fields in a dataset. In this case you do not point to a physical field of the data entity, but define a logical expression that includes a formula suitable for the source data engine, which can include:
The physical data fields of the dataset
Basic arithmetic operators (+, -, *, /) and, when necessary, parentheses, brackets and curly braces.
Functions available in the function catalog.
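As an illustration of a calculated field, the sketch below evaluates an expression built from physical fields, arithmetic operators, parentheses and a function. SQLite stands in for the source data engine, and `ROUND` is a SQLite function used here as a placeholder for an entry in the function catalog. The table, the column names and the `net_amount` expression are assumptions made for this example.

```python
import sqlite3

# Stand-in for the source data engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines (unit_price REAL, quantity INTEGER, discount REAL)"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [(10.0, 3, 5.0), (4.0, 5, 0.0)],
)

# Hypothetical calculated field: no physical column holds this value,
# the engine computes it from the expression at query time.
net_amount = "ROUND((unit_price * quantity) - discount, 2)"

rows = conn.execute(
    f"SELECT {net_amount} AS net_amount FROM order_lines ORDER BY rowid"
).fetchall()
print(rows)
```

Because the expression is executed by the source engine, it must use syntax and functions that engine supports.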
You can also launch Reverse Engineering specifically on that dataset from within the Fields tab.