If you’ve ever used calculations in Google Sheets, Excel, or for custom fields in Salesforce, you’ll know how powerful they can be in expanding the possibilities for analysis and helping you make sense of your data.
Tableau is no different – calculations are fundamental to improving your productivity throughout the visual analytics workflow. Furthermore, mastering Tableau’s calculations can give you what at times seem almost like superpowers with the ability to access data at different levels of detail and across rows with LODs and Table Calculations. So let’s dive in and see what’s possible!
Tableau Calculated Fields
It’s possible to create different types of calculated fields in Tableau, which can be used in diverse scenarios. In this post, we’ll look at some of the reasons for using calculations, and then explore the main types of calculations you can create along with some sample use cases.
The main types of calculations available to you in Tableau are:
- Basic expressions:
- Row-level calculations
- Aggregate calculations
- Table calculations
- Level of Detail (LOD) expressions:
- FIXED LOD
- INCLUDE LOD
- EXCLUDE LOD
We’ll explore these one by one, however, note that each type of calculation could be a multi-part blog on its own, so this will be a relatively high-level introduction.
Why Use Calculations At All?
Even with a thoughtfully planned and well prepared data source, new data fields based on existing data or transformations of data are frequently required during the course of analysis. A small sample of scenarios where calculations can be useful are as follows:
- Modifying the data, such as adding a tax percentage to a sales figure.
- Aggregating data, such as finding the total sum of sales for a region.
- Calculating ratios, such as the profit ratio for each product.
- Segmenting data, such as categorizing sales values into buckets of ‘High’, ‘Medium’, and ‘Low’.
- Converting data from one data type to another, such as from a text string to a date, or creating spatial points from numerical latitude and longitude values.
- Enabling user input, such as allowing for the input of different exchange rates or changing the measure that is displayed in a chart.
Basic Expressions: Row-Level Calculations
Let’s start with one of the examples above: adding a tax percentage. Let’s make it 10% to a sales figure. As a comparison, in Excel, this might look something like this: = A1 * 1.1.
The equals sign identifies this as a calculation. A1 is a reference to another cell. ‘*’ is an operator and 1.1 is a literal expression.
There are some similarities and differences in an equivalent calculation in Tableau:
A few points to note:
- Tableau calculations don’t have an equals sign at the start.
- The gray text on the first line, starting with //, is a comment. Multiple comments can be added to help readability and maintainability.
- When referencing other data fields (columns in your underlying data source), the name of that field is used: in this case, [Sales]. In Excel, you might copy your calculation down over many rows. In Tableau, this is not relevant as row-level calculations are applied to every row in the underlying data.
- The operator (*) and literal expression (1.1) are the same.
- Clicking on the “1 Dependency” message will list the places this calculation is used, such as nested within other calculations or within Tableau views.
After clicking OK, “Sales + Tax” will appear as a new field in the Data Pane and can be used in the view to create a table or visualization. Here’s a basic table with the original sales field and our new calculated field:
Basic Expressions: Aggregate Calculations
In the previous example, while the calculation was performed at the row-level, the results in the view are aggregated. This can be seen by the “SUM” aggregation that appears before both the “Sales” and “Sales + Tax” field names in the green ‘pills’ in the image above. Why is this?
Each row in the table represents many individual sales in that city, so all of those sales values need to be aggregated so they can be shown as one value per city (one row in the displayed table). They could be summed, as they are here, or averaged – or only the maximum or minimum sales value shown etc.
The type of aggregation can be changed by clicking the pill and selecting a different aggregation from the context menu:
Instead of picking the aggregation in the view, it’s also possible to set the aggregation directly in the calculated field by using an aggregated basic expression. For example, the below calculation returns the average sales value:
When this is added to the view, the average sales value for each unique combination of region, country, and city (the dimensions in the view), is shown in the table. Also, the green pill now has the letters “AGG” at the start to show that the value is already aggregated and the aggregation type cannot be changed.
What if we wanted to show the average of the sum of sales for each city within each region? For example, for the South Region, this would be 23,651 (the average of 33,256, 51,386, 5,352, and 4,608). This is where our next calculation type can help.
Table Calculations
When you drag and drop pills into or within the view in Tableau, two things are happening behind the scenes:
- If Tableau doesn’t already have the data in memory, it generates a query to the underlying data source to retrieve the required data.
- The results of the query are then visualized in the view according to the position of pills. This could be a table of data, a simple bar chart, or a more complex visualization.
It is the aggregated results of the query (sometimes referred to as a virtual table), that table calculations operate on. This virtual table could be very small, even if your underlying data consists of millions of records. Because of this, table calculations can be very computationally efficient.
Table calculations are also the only type of Tableau calculation that can be used to reference other values in the virtual table, required for scenarios such as ranking, moving average calculations, and time comparison calculations such as year-over-year.
Let’s return to our earlier question: what is the average of the sum of sales for each city within each region? Because the virtual tables include the sum of sales for each city, we can calculate the average of these using the following table calculation:
When this calculated field is added to the view, you can see that the pill has a small triangle in it, indicating that it is a table calculation.
The calculation is returning 27,324, which is not the result we were looking for. By default, the window calculation is calculating “Table (Down)”, which means the calculation is being performed across the full virtual table in a downwards direction (based on the current configuration of the view). I.e. it is calculating the average of all the values highlighted in yellow below.
In fact, what we want is for the calculation to calculate the average for each region. To do that, we need to change the way the calculation is computed by editing the table calculation:
We can then select a custom configuration so that the calculation restarts for each region (the unchecked dimension, as can be seen below). Our calculation is now restarting for every region, so we now see the averages at the expected level of detail.
At this point, concepts such as addressing and partitioning come into play, which are beyond the scope of this post (see this help page for more information). However, it’s fair to say that table calculations provide a great deal of flexibility and control in how they are computed, and when used effectively, enable quick and insightful exploratory analysis.
Next, let’s create a summarized view of the data at the region level:
What if our regional directors wanted to keep this high-level view but still be able to see the average of the sum of each city’s sales by region? Adding the table calculation that we used previously doesn’t work because the country and city dimensions are no longer in the view, therefore being unavailable in the edit table calculation dialog.
Fortunately, level of detail expressions can help with this very situation.
Level of Detail Expressions
When you need to calculate a value at a level of detail that is different to the visualization level of detail (determined by the dimensions in the view), level of detail expressions – commonly referred to as LODs – are the tool to use.
LODs come in three different ‘flavors’:
- FIXED: Calculates a value at a specified level of detail independent of the view.
- INCLUDE: Calculates a value at a more granular level of detail than the view.
- EXCLUDE: Calculates a value at a less granular level of detail than the view.
To replicate the window calculation used earlier, we need to include three dimensions in the calculation to ensure we uniquely identify each city, as city names are not unique across countries and regions. We can do this in two ways.
The first is to use a FIXED LOD that includes the three required dimensions:
The second is to use an INCLUDE LOD. Because we already have the region in the view, we can just INCLUDE the additional two dimensions:
We can then take the average of these calculations by setting the aggregation type in the view. Both of these LOD expressions return the same values as the table calculation used earlier, but with only the region dimension in the view:
And that’s it!
Summary
There’s lots more to explore to fully understand how each of these calculation types work, but I hope this introduction highlights the flexibility of Tableau and some of the types of analyses it’s possible to perform with it. For more information, please see Tableau’s help page on understanding calculations and this blog post on Tableau’s website.