Tableau is one of the best data analysis and visualization tools on the market. Before you can create visually stunning dashboards that tell powerful, data-driven stories for your business, you have to manually connect your various data sources. Until then, you’ll only see a blank canvas. That’s why connecting your data to Tableau is one of the most critical steps in the process. Luckily, we’ve created a guide to help you get started.
Connecting to Data
To get started with creating Tableau views, you must first connect to one or more data sources. Once you open up a new workbook, you’ll be prompted to ‘Connect to Data’ in the top left corner. Tableau allows you to connect to a variety of data sources, including Tableau Servers, Excel spreadsheets, Google Sheets, and many more options.
While connecting to one data source may be sufficient in some cases, oftentimes it is necessary to analyze data from multiple data sources in tandem. To add another source, simply click Data, then New Data Source.
Connecting to additional data sources is the first step, but to actually analyze multiple sources together, you need to establish some kind of data blend between them. The three main types of data blending we’ll talk about today are Relationships, Joins, and Blends.
Choose the Right Data Blending Method
Each type of data blending has its pros and cons, and the best kind for your use will largely depend on a few different factors. Before going into each type’s benefits and drawbacks, we’ll quickly explain the difference between each data blending method.
- This is now the default method of blending within Tableau, and they will work in most cases. Relationships blend data at the data source level and allow for tables at different levels of detail, which grants you more flexibility.
- These combine data tables by adding additional columns to similar rows. Joins also operate at the data source level but can cause duplicate data when the level of detail differs between tables.
- These do not actually combine the two data sources. Instead of merging two tables, blend relationships query data sources separately and then aggregate them at the correct level.
|Relationships||Flexible – adapts to different levels of detail; little up-front work is needed from the user||Cannot be used between published Tableau server data sources||Default option – best to use when combining multiple Excel spreadsheets or other non-server data sources|
|Joins||Customizable; results in a single table which is best for filtering and aggregation||Cannot be used between published Tableau server data sources; could result in duplicate rows of data||Best to use when duplicate data is not a worry, or when you need to customize the blend manually|
|Blends||Operates at the worksheet level; can use published Tableau server data sources||Cannot be published; operate at worksheet level; no real blended table||Best to use when worksheet-level relationships are helpful, or if you need to combine published Tableau server data sources|
How to Define a Relationship Between Two Data Sources
Start by connecting the first data source as normal. Then, within the Data Source sheet, add your second data source. Next, drag the second source next to the first, and a dialog box will pop up.
If Tableau recognizes fields with the same name in both sources, it will automatically connect them, as seen in the example here. If not, then you can manually add the fields by clicking add more fields.
After establishing and defining the relationship between your data sources, you can begin creating views in any worksheet. Relationships exist throughout the entire workbook and will adapt based on the level of detail within a view.
How to Blend Data Sources Within a Worksheet
Once you have added multiple data sources, you’ll see them listed in the top-left corner. You can toggle between the sources by clicking on them, and start pulling in fields to the worksheet as usual from the first data source. To start pulling in fields from a secondary source, there needs to be a relationship between the two data sources. Tableau will automatically recognize connections between data sources when naming conventions are consistent. For example, if both of your data sources include a field named ‘Customer ID’, Tableau will understand their relationship, and you’ll be able to pull in fields from both sources. However, if naming conventions are not consistent between sources, you’ll likely run into the following error:
“In order to use fields from Secondary Data Source, a relationship needs to be created with Primary Data Source.”
This means that Tableau does not recognize any connection between the data sources. To use multiple sources within the same worksheet, you must establish a link between at least one field from each source. One way to establish a connection is by creating what is known as a Blend Relationship. To create or edit blend relationships, start by clicking Data, then Edit Blend Relationships. If Tableau recognizes a link, those blend relationships will show up here. If not, you’ll need to click Custom, and then choose the fields in each source that are equivalent to one another.
For example, perhaps the primary data source includes a field called ‘Month’, while the secondary data source includes a field called ‘Month*’. If you know that these two fields are the same, but have different names, this is where you can create a link between them. If there is more than one field that is available in both sources, be sure to add those pairs as well.
After establishing a relationship or a blend, you can start using fields from both data sources. Once fields from both data sources are in use, you’ll be able to see which is the primary data source. The primary data source is whichever source was used first on a given worksheet, and it is indicated by a blue checkmark. The secondary data source is any other source used on the same worksheet, and it’s indicated by an orange checkmark.
Another visual indication of your relationship or blend is the link indicator next to the field. When both sources are in use, you’ll see a red link, which means that the blend is being used actively in the view. If the link is greyed out, that means the link is not in use. Simply click the grey link to re-establish the blend.
The final, most in-depth type of data blending are joins. To start a join, double click on the first data source in the Data Source sheet, and then drag the second data source next to the first. From there, you can specify which type of join you’d like, with the options of ‘Inner’, ‘Left’, ‘Right’, and ‘Full Outer’. These options determine how the tables should relate to each other. For example, ‘Left’ joins keep every row in the left table, along with any rows in the right table that are related to rows in the left. ‘Inner’ joins only keep rows where a relationship exists between both tables.
After choosing which type of join, as well as outlying which fields relate to one another between the tables, you can begin creating views as usual. As mentioned above, joins are the most complex of the data blending methods and require a certain level of knowledge of the underlying data tables, as well as of the intricacies of the different types of joins.
While data blending in Tableau may be a bit intimidating to beginners, all it takes to become a Tableau expert is some trial and error. Start with some basic examples, and keep working up in difficulty until you feel comfortable!