What is a join in which rows that do not have matching values in common columns are nonetheless included in the result table?

Relationships are a dynamic, flexible way to combine data from multiple tables for analysis. A relationship describes how two tables relate to each other, based on common fields, but does not merge the tables together. When a relationship is created between tables, the tables remain separate, maintaining their individual level of detail and domains.

Think of a relationship as a contract between two tables. When you are building a viz with fields from these tables, Tableau brings in data from these tables using that contract to build a query with the appropriate joins.

What are relationships?

Relationships are the flexible, connecting lines created between the logical tables in your data source. Some people affectionately call relationships "noodles", but we usually refer to them as "relationships" in our help documentation.

We recommend using relationships as your first approach to combining your data because it makes data preparation and analysis easier and more intuitive. Use joins only when you absolutely need to.

Relationships provide several advantages over using joins for multi-table data:

  • You don't need to configure join types between tables. You only need to select the fields to define the relationship.
  • Related tables remain separate and distinct; they are not merged into a single table.
  • Relationships use joins, but they are automatic. Tableau automatically selects join types based on the fields being used in the visualization. During analysis, Tableau adjusts join types intelligently and preserves the native level of detail in your data.
  • Tableau uses relationships to generate correct aggregations and appropriate joins during analysis, based on the current context of the fields in use in a worksheet.
  • Multiple tables at different levels of detail are supported in a single data source. You can build data models that contain more tables, and reduce the number of data sources needed to build a viz.
  • Unmatched measure values are not dropped (no accidental loss of data).
  • Avoids data duplication and filtering issues that can sometimes result from joins.
  • Tableau will generate queries only for the data that is relevant to the current view.

Requirements for relationships

  • When relating tables, the fields that define the relationships must have the same data type.
  • You can't define relationships based on geographic fields.
  • Circular relationships aren't supported in the data model.
  • You can't define relationships between published data sources.
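
As a rough illustration of two of these requirements (matching data types and no circular relationships), the following Python sketch checks a hypothetical description of a data model. The `check_model` function, the tuple layout, and the table and field names are assumptions made for this example; they are not part of Tableau or any Tableau API.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (left_table, left_field, right_table, right_field).
# One field pair per relationship keeps the sketch short.
Relationship = Tuple[str, str, str, str]


def check_model(field_types: Dict[Tuple[str, str], str],
                relationships: List[Relationship]) -> List[str]:
    """Return the problems found; an empty list means both checks passed."""
    problems: List[str] = []
    parent: Dict[str, str] = {}  # union-find forest over table names

    def find(table: str) -> str:
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]  # path halving
            table = parent[table]
        return table

    for left_t, left_f, right_t, right_f in relationships:
        # Requirement: both fields in a pair must have the same data type.
        if field_types[(left_t, left_f)] != field_types[(right_t, right_f)]:
            problems.append(f"type mismatch: {left_t}.{left_f} vs {right_t}.{right_f}")
        # Requirement: no cycles. If both tables are already connected,
        # this relationship would close a loop.
        root_l, root_r = find(left_t), find(right_t)
        if root_l == root_r:
            problems.append(f"circular relationship: {left_t} - {right_t}")
        else:
            parent[root_l] = root_r
    return problems


field_types = {
    ("Orders", "customer_id"): "integer",
    ("Customers", "id"): "integer",
    ("Orders", "region_code"): "string",
    ("Customers", "region_code"): "string",
    ("Regions", "code"): "string",
}
tree = [
    ("Orders", "customer_id", "Customers", "id"),
    ("Orders", "region_code", "Regions", "code"),
]
loop = tree + [("Customers", "region_code", "Regions", "code")]

print(check_model(field_types, tree))  # []
print(check_model(field_types, loop))  # ['circular relationship: Customers - Regions']
```

The sketch does not cover the geographic-field or published-data-source requirements, which depend on metadata that this simplified model does not carry.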

Factors that limit the benefits of using related tables:

  • Dirty data in tables (i.e. tables that weren't created with a well-structured model in mind and contain a mix of measures and dimensions in multiple tables) can make multi-table analysis more complex.
  • Using data source filters will limit Tableau's ability to do join culling in the data. Join culling is a term for how Tableau simplifies queries by removing unnecessary joins.
  • Tables with a lot of unmatched values across relationships.
  • Interrelating multiple fact tables with multiple dimension tables (attempting to model shared or conformed dimensions).
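
To make the idea of join culling more concrete, here is a minimal Python sketch that decides which tables a query would need to touch, given the tables whose fields appear in a view. It assumes the data model is a tree and uses a hypothetical `tables_to_query` helper and made-up table names. It only illustrates the general idea of dropping unnecessary joins and should not be read as the algorithm Tableau actually uses.

```python
from typing import Dict, Iterable, Set


def tables_to_query(parent_of: Dict[str, str], used: Iterable[str]) -> Set[str]:
    """Return the tables needed to connect every table used in the view.

    parent_of maps each non-root table to the table it is related to on the
    way up to the root, so the model is assumed to be a single tree.
    """
    paths = []
    for table in used:
        path = [table]
        while path[-1] in parent_of:
            path.append(parent_of[path[-1]])
        paths.append(path[::-1])  # root first

    # Count how many leading tables all paths have in common.
    shared = 0
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        shared += 1

    needed: Set[str] = set()
    for path in paths:
        # Keep the deepest shared table and everything below it;
        # shared ancestors above that point can be culled.
        needed.update(path[max(shared - 1, 0):])
    return needed


# Hypothetical model: Orders is the root table.
model = {"Customers": "Orders", "Products": "Orders", "Categories": "Products"}

print(sorted(tables_to_query(model, ["Products", "Categories"])))
# ['Categories', 'Products']
print(sorted(tables_to_query(model, ["Customers", "Categories"])))
# ['Categories', 'Customers', 'Orders', 'Products']
```

In the first call, Orders and Customers contribute nothing to the view, so a query would not need to join them. In the second call, Orders and Products must stay because they connect the two tables in use.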

Most relational connection types are completely supported. Cubes, SAP HANA (with OLAP attribute), JSON, and Google Analytics are limited to a single logical table in Tableau 2020.2. Stored procedures can only be used within a single logical table.

Published data sources can't be related to each other.

Unsupported

  • Cube databases do not support the new logical layer. Connecting to a cube offers the same experience as pre-2020.2.
  • Stored Procedures: Don't support federation, relationships, or joins. They are represented in a single logical table, and don't allow opening the Join/Union canvas (physical layer).
  • Splunk: Doesn't support left joins (and therefore relating logical tables).
  • JSON: Doesn't support federation, custom SQL, joins, or relationships (only unions).
  • Data sources that do not support LOD calcs. For more information, see Data Source Constraints for Level of Detail Expressions.

Limited support

  • Salesforce and WDC Standard Connections: These are represented as joined tables within a logical table. Adding these connections is currently only supported for single, logical table data sources. Standard connections cannot join to an existing table.
  • SAP HANA: Doesn't currently support relating logical tables when the connection has the OLAP attribute set.

Create and define relationships

After you drag the first table to the top-level canvas of the data source, each new table that you drag to the canvas must be related to an existing table. When you create relationships between tables in the logical layer, you are building the data model for your data source.

Note: In Tableau 2020.3 and later, you can create relationships based on calculated fields, and compare fields used for relationships using operators in the relationship definition.

Create a relationship

You create relationships in the logical layer of the data source. This is the default view of the canvas that you see in the Data Source page.

Note: The Salesforce connector doesn't support inequality operators. Google Big Query and MapR connectors support non-equal joins starting with version 2021.4. The MapR connector is deprecated as of version 2022.3.

  1. Drag a table to the canvas.

  2. Drag another table to the canvas. When you see the "noodle" between the two tables, drop that table.

    The Edit Relationship dialog box opens. Tableau automatically attempts to create the relationship based on existing key constraints and matching fields to define the relationship. If it can't determine the matching fields, you will need to select them.

    To change the fields: Select a field pair, and then click in the list of fields below to select a new pair of matching fields.

    To add multiple field pairs: After you select the first pair, click Close, and then click Add more fields.

    If no constraints are detected, a Many-to-many relationship is created and referential integrity is set to Some records match. These default settings are a safe choice and provide a lot of flexibility for your data source. The default settings support full outer joins and optimize queries by aggregating table data before forming joins during analysis. All column and row data from each table becomes available for analysis.

    In many analytical scenarios, using the default settings for a relationship will give you all of the data you need for analysis. Using a many-to-many relationship will work even if your data is actually many-to-one or one-to-one. If you know the particular cardinality and referential integrity of your data, you can adjust the Performance Options settings to describe your data more accurately and optimize how Tableau queries the database.

  3. Add more tables following the same steps, as needed.
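
The default behavior described in step 2 (aggregate each table first, then combine the results with a full outer join) can be sketched in plain Python. A full outer join is the kind of join that keeps rows from both tables even when they have no matching value in the common column. The `aggregate` and `full_outer_join` helpers and the sample data are assumptions for illustration only; the queries that Tableau generates are not shown here.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


def aggregate(rows: List[Dict[str, Any]], key: str, measure: str) -> Dict[str, float]:
    """Sum a measure per key value, reducing a table to one row per key."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


def full_outer_join(left: Dict[str, float], right: Dict[str, float]
                    ) -> List[Tuple[str, Optional[float], Optional[float]]]:
    """Keep every key from either side; a missing side becomes None."""
    return [(key, left.get(key), right.get(key))
            for key in sorted(set(left) | set(right))]


sales = [
    {"region": "East", "amount": 100.0},
    {"region": "East", "amount": 50.0},
    {"region": "West", "amount": 70.0},
]
targets = [
    {"region": "East", "target": 120.0},
    {"region": "North", "target": 90.0},
]

result = full_outer_join(aggregate(sales, "region", "amount"),
                         aggregate(targets, "region", "target"))
for row in result:
    print(row)
# ('East', 150.0, 120.0)
# ('North', None, 90.0)
# ('West', 70.0, None)
```

North has no sales and West has no target, yet both regions remain in the result, and the East target is not repeated for each East sales row.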

After you have built your multi-table, related data source, you can dive into exploring that data. For more information, see How Analysis Works for Multi-table Data Sources that Use Relationships and Troubleshooting multi-table analysis.

Move a table to create a different relationship

To move a table, drag it next to a different table. Or, hover over a table, click the arrow, and then select Move.

Tip: Drag a table over the top of another table to replace it.

Change the root table of the data model

To swap the root table with another table: Right-click another logical table in the data model, and then select Swap with root to make the change.

To remove a table, hover over the table, click the arrow, and then select Remove.

Deleting a table in the canvas automatically deletes its related descendants as well.

View a relationship

  • Hover over the relationship line (noodle) to see the matching fields that define it. You can also hover over any logical table to see what it contains.

Edit a relationship

  • Click a relationship line to open the Edit Relationship dialog box. You can add, change, or remove the fields used to define the relationship. Add additional field pairs to create a compound relationship.

    To add multiple field pairs: After you select the first pair, click Close, and then click Add more fields.

  • The first table that you drag to the canvas becomes the root table for the data model in your data source. After you drag out the root table, you can drag out additional tables in any order. You will need to consider which tables should be related to each other, and the matching field pairs that you define for each relationship.
  • Before you start creating relationships, viewing the data from the data source before or during analysis can be useful to give you a sense of the scope of each table. For more information, see View Underlying Data. You can also use View Data to see a table's underlying data when a relationship is invalid.
  • If you are creating a star schema, it can be helpful to drag the fact table out first, and then relate dimension tables to that table.
  • Each relationship must be made of at least one matched pair of fields. Add multiple field pairs to create a compound relationship. Matched pairs must have the same data type. Changing the data type in the Data Source page does not change this requirement. Tableau will still use the data type in the underlying database for queries.
  • Relationships can be based on calculated fields. You can also specify how fields should be compared by using operators when you define the relationship.
  • Deleting a table in the canvas automatically deletes its related descendants as well.
  • You can swap the root table with another table. Right-click another logical table in the data model, and then select Swap with root to make the change.

You have several options for validating your data model for analysis. As you create the model for your data source, we recommend going to the sheet, selecting that data source, and then building a viz to explore record counts, unmatched values, nulls, or repeated measure values. Try working with fields across different tables to ensure everything looks how you expect it to.

What to look for:

  • Are your relationships in the data model using the correct matching fields for their tables?
  • What are the results of dragging different dimensions and measures into the view?
  • Are you seeing the expected number of rows?
  • Would compound relationships make the relationship more accurate?
  • If you changed any of the Performance Options settings from the default settings, are the values that you are seeing in the viz what you would expect? If they aren't, you might want to check the settings, or reset to the default.

Options for validating relationships and the data model:

  • Every table includes a count of its records, as a field named TableName(Count), at the level of detail for that table. To see the count for a table, drag its Count field into the view. To see the count for all tables, select the Count field for each table in the Data pane, and then click the Text Table in Show Me.
  • Click View Data in the Data pane to see the number of rows and data per table. Also, before you start creating relationships, viewing the data from the data source before or during analysis can be useful to give you a sense of the scope of each table. For more information, see View Underlying Data.
  • Drag dimensions onto rows to see the Number of Rows in the status bar. To see unmatched values, click the Analysis menu, and then select Table Layout > Show Empty Rows or Show Empty Columns. You can also drag different measures to the view, such as <YourTable>(Count) from one of the tables represented in your viz. This ensures that you will see all values of the dimensions from that table.
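
Outside of Tableau, the same kinds of checks (record counts, nulls, and unmatched values in the matching fields) can be approximated with a few lines of Python against sample rows. The `profile_relationship` function and the sample tables below are hypothetical and only sketch the idea.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def profile_relationship(left_rows: List[Row], right_rows: List[Row],
                         left_key: str, right_key: str) -> Dict[str, Any]:
    """Summarize counts, null keys, and key values found on one side only."""
    left_values = {row[left_key] for row in left_rows if row[left_key] is not None}
    right_values = {row[right_key] for row in right_rows if row[right_key] is not None}
    return {
        "left_count": len(left_rows),
        "right_count": len(right_rows),
        "left_nulls": sum(row[left_key] is None for row in left_rows),
        "right_nulls": sum(row[right_key] is None for row in right_rows),
        "left_only": sorted(left_values - right_values),
        "right_only": sorted(right_values - left_values),
    }


orders = [{"customer_id": 1}, {"customer_id": 1}, {"customer_id": 3}, {"customer_id": None}]
customers = [{"id": 1}, {"id": 2}]

report = profile_relationship(orders, customers, "customer_id", "id")
print(report)
```

Here the report shows four order rows, one of them with a null key, plus one customer ID that appears only in Orders (3) and one that appears only in Customers (2).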

Tip: If you would like to see the queries that are being generated for relationships, you can use the Performance Recorder in Tableau Desktop.

  1. Click the Help menu, and then select Settings and Performance > Start Performance Recording.
  2. Drag fields into the view to build your viz.
  3. Click the Help menu, and then select Settings and Performance > Stop Performance Recording.
  4. In the Performance Summary dashboard, under Events Sorted By Time, click an "Executing Query" bar and view the query below.

Another more advanced option is to use the Tableau Log Viewer on GitHub. You can filter on a specific keyword using end-protocol.query. For more information, start with the Tableau Log Viewer wiki page in GitHub.

Dimension-only visualizations

When using a multi-table data source with related tables: If you build a dimension-only viz, Tableau uses inner joins and you won't see the full unmatched domain.

To see partial combinations of dimension values, you can:

  • Use Show Empty Rows/Columns to see all of the possible rows. Click the Analysis menu, and then select Table Layout > Show Empty Rows or Show Empty Columns.
  • Add a measure to the view, such as <YourTable>(Count) from one of the tables represented in your viz. This ensures that you will see all values of the dimensions from that table.
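
The difference between the two cases can be illustrated with Python's built-in `sqlite3` module. The tables, data, and SQL below are assumptions for this sketch and are not the queries Tableau generates; they only show how an inner join hides a dimension value that has no match, while a join that preserves one table keeps it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (author_id INTEGER, name TEXT);
    CREATE TABLE books (book_id INTEGER, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Ben');
    INSERT INTO books VALUES (10, 1, 'Engines'), (11, 1, 'Notes');
""")

# Dimension-only case: an inner join drops Ben, who has no books.
inner_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors AS a
    JOIN books AS b ON a.author_id = b.author_id
    ORDER BY a.name, b.title
""").fetchall()

# Preserving every author (similar in effect to adding a measure from the
# authors table) keeps Ben, with an empty title.
left_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors AS a
    LEFT JOIN books AS b ON a.author_id = b.author_id
    ORDER BY a.name, b.title
""").fetchall()
conn.close()

print(inner_rows)  # [('Ada', 'Engines'), ('Ada', 'Notes')]
print(left_rows)   # [('Ada', 'Engines'), ('Ada', 'Notes'), ('Ben', None)]
```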

For more information, see How Analysis Works for Multi-table Data Sources that Use Relationships and Troubleshooting multi-table analysis.

Relationships (logical tables) versus joins (physical tables)

While similar, joins and relationships behave differently in Tableau, and are defined in different layers of the data model. You create relationships between logical tables at the top-level, logical layer of your data source. You create joins between physical tables in the physical layer of your data source.

Joins merge data from two tables into a single table before your analysis begins. Merging the tables together can cause data to be duplicated or filtered from one or both tables; it can also cause NULL rows to be added to your data if you use a left, right, or full outer join. When doing analysis over joined data, you need to make sure that you correctly handle the effects of the join on your data.

Note: When duplication or the filtering effects of a join might be desirable, use joins to merge tables together instead of relationships. Double-click a logical table to open the physical layer and add joined tables.

A relationship describes how two independent tables relate to each other but does not merge the tables together. This avoids the data duplication and filtering issues that might occur in a join and can make working with your data easier.
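
The duplication issue is easy to reproduce with Python's built-in `sqlite3` module. In this sketch the table names and values are made up, and the SQL is not what Tableau generates. It only contrasts summing a measure after a join with aggregating each table at its own level of detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, shipping_cost INTEGER);
    CREATE TABLE items (order_id INTEGER, product TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 5);
    INSERT INTO items VALUES (1, 'A'), (1, 'B'), (1, 'C'), (2, 'D');
""")

# Join first, then aggregate: each order row repeats once per item,
# so the shipping cost of order 1 is counted three times.
joined_total = conn.execute("""
    SELECT SUM(o.shipping_cost)
    FROM orders AS o
    JOIN items AS i ON o.order_id = i.order_id
""").fetchone()[0]

# Aggregate each table separately, at its own level of detail.
orders_total = conn.execute("SELECT SUM(shipping_cost) FROM orders").fetchone()[0]
item_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
conn.close()

print(joined_total)  # 35 (inflated by the join)
print(orders_total)  # 15 (true shipping total)
print(item_count)    # 4
```

With joined data you would need to compensate for the repeated order rows yourself. Keeping the tables separate lets each measure be aggregated before the results are combined.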

Relationships | Joins
Defined between logical tables in the Relationship canvas (logical layer) | Defined between physical tables in the Join/Union canvas (physical layer)
Don't require you to define a join type | Require join planning and join type
Act like containers for tables that are joined or unioned | Are merged into their logical table
Only data relevant to the viz is queried. Cardinality and referential integrity settings can be adjusted to optimize queries. | Run as part of every query
Level of detail is at the aggregate for the viz | Level of detail is at the row level for the single table
Join types are automatically formed by Tableau based on the context of analysis. Tableau determines the necessary joins based on the measures and dimensions in the viz. | Join types are static and fixed in the data source, regardless of analytical context. Joins and unions are established prior to analysis and don't change.
Rows are not duplicated | Merged table data can result in duplication
Unmatched records are included in aggregates, unless explicitly excluded | Unmatched records are omitted from the merged data
Create independent domains at multiple levels of detail | Support scenarios that require a single table of data, such as extract filters and aggregation

While both relationships and blends support analysis at different levels of detail, they have distinct differences. One reason you might use blends over relationships is to combine published data sources for your analysis.

Relationships | Blends
Defined in the data source | Defined in the worksheet between a primary and a secondary data source
Can be published | Can't be published
All tables are equal semantically | Depend on selection of primary and secondary data sources, and how those data sources are structured
Support full outer joins | Only support left joins
Computed locally | Computed as part of the SQL query
Related fields are fixed | Related fields vary by sheet (can be customized on a sheet-by-sheet basis)

There are many ways to combine data tables, each with their own preferred scenarios and nuances.

Relate

Use when combining data from different levels of detail.

  • Requires matching fields between two logical tables. Multiple matching field pairs can define the relationship.
  • Automatically uses correct aggregations and contextual joins based on how fields are related and used in the viz.
  • Supports many-to-many and outer joins.
  • Relationships are consistent for the entire workbook and can be published.
  • Can be published, but you can't relate published data sources.
  • Can't define relationships based on geographic fields.
  • Using data source filters limits the join culling benefits of relationships.

Join

Use when you want to add more columns of data across the same row structure.

  • Requires common fields between two physical tables.
  • Requires establishing a join clause and a join type.
  • Can join on a calculation.
  • Joined physical tables are merged into a single logical table with a fixed combination of data.
  • May cause data loss if fields or values are not present in all tables (dependent on join types used).
  • May cause data duplication if fields are at different levels of detail.
  • Can use data source filters.

Union

Use when you want to add more rows of data with the same column structure.

  • Based on matching columns between two tables.
  • Unioned physical tables are merged into a single logical table with a fixed combination of data.

Blend

Use when combining data from different levels of detail.

  • Can be used to combine published data sources, but can't be published.
  • Can be used between a relational data source and a cube data source.
  • Data sources can be blended on a per-sheet basis.
  • Are always effectively left joins (may lose data from secondary data sources).
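
The left-join behavior of a blend can be sketched in plain Python: each data source is aggregated to the linking field on its own, and the secondary result is then looked up for each value in the primary result. The `aggregate` and `blend` helpers and the sample data are assumptions for this illustration, not Tableau functionality.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Tuple


def aggregate(rows: List[Dict[str, Any]], link_field: str, measure: str) -> Dict[str, int]:
    """Sum a measure per value of the linking field."""
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row[link_field]] += row[measure]
    return dict(totals)


def blend(primary: Dict[str, int], secondary: Dict[str, int]
          ) -> Dict[str, Tuple[int, Optional[int]]]:
    """Left join: keep every primary key and attach the secondary value, if any."""
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


primary_rows = [
    {"state": "CA", "sales": 100},
    {"state": "CA", "sales": 40},
    {"state": "WA", "sales": 30},
]
secondary_rows = [
    {"state": "CA", "quota": 120},
    {"state": "OR", "quota": 60},
]

blended = blend(aggregate(primary_rows, "state", "sales"),
                aggregate(secondary_rows, "state", "quota"))
print(blended)  # {'CA': (140, 120), 'WA': (30, None)}
```

OR exists only in the secondary source, so it does not appear in the blended result. This is the data loss that the last bullet refers to.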
