Monday, August 10, 2020

Tableau Software Information 0.3 [ IT ]

Blending Vs Joining


In today's digitally growing world we need to work on multiple and huge data sources at a same time, Data blending is a very powerful feature of tableau in such cases.

Data joining is useful when we want to combine two or multiple data sets  from a single data source. These data sets can be tables or sheets from Excel, Ms Access or SQL tables ,Oracle tables etc but their source is same. But when there is a need to combine multiple sources together then tableau has gives a provision of Data Blending.

Both Joining & Blending mainly used to combine data together but depending on their features and our requirements we can choose between Blending or Joining.

In this blog I will compare between Data Blending & Data Joining.


1    What is Data Blending in Tableau?

In today's business scenarios we come across a variety of data which come from variety of sources & we need to work with such data at  a same time or may be we need to combine variety of data together for different analyses.

  • Data Blending in Tableau is way of combining related data from two or more sources that can be analysed in a single visualization.
  • In blending we can combine data from multiple sources i.e. Excel file, Csv Files, SQL data, Oracle etc & many more .
  • A common field called “Linking field” has to be there while performing Data Blending.
  • Data Blending performs like a left join operation and does not accept or perform any other type of join.


2    Difference between Blending & Joining?

Sr.No.

Joining

Blending

1

Combine datasets from Same Data Source.

Able to blend datasets from multiple Data Sources.

2

Examples, Joining two or more Excel worksheets, Oracle tables etc

Example, Blending SQL table with Excel worksheet.

3

Common field is needed

Common field is needed called as ‘Linking Field’.

4

All the combined tables or sheets contains common set of Dimensions and Measures.

Each data source contains its own set of Dimensions and Measures.

5

We CAN have common set of Dimensions and Measures which extends the ability to create groups, formulas and other entities across the data.

We can NOT have common set of Dimensions and Measures which limits to create groups, formulas and other entities across the data.

6

In Joining, a query is sent to the database where the join is performed.

The results of the join are then sent back to Tableau and aggregated for display in the visualization.

In Blending, for every data source a query is sent to the database.

The result of these queries, including the aggregated data, are sent back and combined by Tableau.

7

A join combines the data and then aggregates.

A blend aggregates and then combines the data.

8

We perform join manually on datasets

It is functionality that Tableau performs automatically


   

3    When to Blend & when to join?

>    Join : Same data source.

Whenever the datasets to be combined are from same data source ,Data Joining is used.

For example, Sales information coming from various locations in Excel files.

    Blend : Different data sources.

Whenever the datasets to be combined are from different data sources, Data Blending is used.

For example, Actual Sales for geography like Region Name can come from one source (relational) and Month details from a different source (Excel File).

>    Join : Same level of granularity.

The data to be combined is stored in same data source, and the granularity of the data captured in each table is also same, then data Joining is the best way to combine data.

>    Blend : Different level of granularity.

The data data to be combined is stored in different databases, and the granularity of the data captured in each table is different in the two data sources, then data blending is the best way to combine data.

>    Join : Common set of Dimensions and Measures.

In Joining All the combined tables or sheets contains common set of Dimensions and Measures.

We CAN have common set of Dimensions and Measures which extends the ability to create groups, formulas and other entities across the data.

>    Blend : Own set of Dimensions and Measures.

In Data Blending, each data source contains its own set of Dimensions and Measures.

We can NOT have common set of Dimensions and Measures which limits to create groups, formulas and other entities across the data.

>    Joining Duplicate values that are not aggregated gives inflated result in Tableau.

Whenever datasets containing duplicate values are joined which are not aggregated correctly before joining then we may get inflated results.

>    Blending solves this problem automatically by considering the granularity.

In such cases Blending solves this problem automatically by considering the granularity chosen & combine data sources with aggregated fields directly.

In case you wish to visualize monthly sales for a region, you will be using Data Blending.


4    Examples of Joining & Blending?

Lets see an example of left join & blending so that we can understand clearly how they differ from each other.

Examples :

Left Join

  A query is sent to the database where the join is performed. A left join returns all rows from the left table and any corresponding rows from the right table. The results of the join are then sent back to Tableau and aggregated for display in the visualization.

—  In Our example The common columns are Student ID and State; where there is corresponding information from the right table, that data is returned. Otherwise, there is a null.

—  Suppose you have the same tables, but flip the order. This new left join produces different results. Again, a left join takes all the data from the new left table, but essentially ignores a row from the right table.

Data blending

—  A query is sent to the database for each data source that is used on the sheet. The results of the queries are sent back to Tableau as aggregated data and presented together in the visualization.

—  In our example If the linking fields are Student ID and State, This will be our resulting table.

—  Null indicates that

      A row in the left table does not have a corresponding row match in the right table,

—   Asterisk (*) indicates that

      There are multiple corresponding values in the rows in the right table,

—  The view uses all values from the primary data source (functioning as the left table) and the corresponding rows from the secondary data source (the right table) based on the linking field(s).


5    What are the Limitations of Data Blending?

    Data blending doesn't work for non-additive aggregates, such as COUNTD [Count Distinct],                   MEDIAN, and RAWSQLAGG.

       Non Additive aggregates are aggregate functions that produce results that can not be aggregated            along a dimension. Instead the values have top be calculated individually.

       All number functions except Min & Max functions are non additive aggregates.

    We can not publish blended data source as a single unit.

       For publishing blended data we have to publish each data sources separately & then blend the                published data source.

    Data from secondary data source must always be aggregated in calculations.

    Data blending behaves similarly to a left join, which may result in missing data from the                       secondary data source.

    Data Blending performs like a left join operation and does not accept or perform any other type of        join.

    A cube data source must be the primary data source while blending.

    BLENDING can lead to asterisks [ * ] if there are one-to-many relationships

 

I hope that this helps you to understand difference between Blending & Joining in Tableau. 

You can also visit my channel to watch video on the same topic.

https://www.youtube.com/watch?v=imiBVEQucM0

Please comment below if you need information on any other topic in tableau.