Data and AI Training

Excel Lesson - Summarise Data with pivot tables or formulas

We can use either pivot tables or formulas to summarise data. Both approaches have their pros and cons, and the best approach depends on the circumstances.

Case Study Data

In this lab we use a fictitious dataset of about 700 transactions over a two-year period. The company sells about six different types of product to customers in about five countries. The data has some categorical fields, such as Country, Segment and Product, that we may want to group by. It also has some numeric fields, such as Quantity and Discount, that we may want to aggregate (sum or count).

The data also has a few other important columns:

TransactionId is a unique identifier of the transaction (row).
OrderId identifies an order. An order has one or more transactions, so OrderId is not unique.

First we format our data as an Excel table named Store. This will make all subsequent operations simpler.

Next we add three useful calculated columns to the Store table based on these formulas:

GrossSales = Quantity * SalePrice
Sales = GrossSales - Discount
Profit = Sales - COGS (Cost of Goods Sold)
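
As a minimal sketch, the three calculated columns could be entered in the Store table using structured references. This assumes the source columns are named exactly Quantity, SalePrice, Discount and COGS, as in the formulas above; adjust the names to match the actual headers. The text before each formula is the new column's header, not part of the formula.

```excel
GrossSales:  =[@Quantity]*[@SalePrice]
Sales:       =[@GrossSales]-[@Discount]
Profit:      =[@Sales]-[@COGS]
```

Each formula is typed once in the first data row of its column; by default Excel then fills it down the rest of the table.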

Summarise with Pivot Tables

Getting Started

We start by creating some pivots on the Store table, for example, Sales and Number of Transactions (both in the Values area) by Country (in the Rows area) and Product (in the Columns area). We experiment with different configurations by putting one or more categorical fields into the Rows and/or Columns areas and one or more numeric fields into the Values area.

Appearance

We experiment with the appearance and formatting of our pivot tables, for example:

outline or tabular format,
with or without subtotals and grand totals,
add a blank line and / or repeating labels.

We experiment with conditional formats, for example:

add data bars, then configure these so that they only show on the detail values (and not subtotals),
add icons to show outlier values.

Once we have more than one categorical field in either the Rows or Columns area, we can experiment with expanding and collapsing sections.

Sorting

The pivot table will order the categorical values in alphabetical order. We may want to change this by either:

manually dragging and dropping the values, which affects a single pivot table only.
creating a custom list of the values of a field, which affects all pivot tables using that field.

Grouping

The Store data provides a Date column, but we may want to group the pivot table at a higher level, for example by Year and Month. We can do this in two ways:

drag the Date field on to the Rows or Columns area. Excel usually then creates a new set of fields in the pivot list for the year, quarter and month (known as grouping a field).
create new Year and Month calculated columns in the Store dataset, and then use these to group the data. This is the better and more robust approach.
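
A minimal sketch of the second approach, assuming the date column is named Date (as described above) and holds true Excel dates rather than text. The text before each formula is the new column's header.

```excel
Year:   =YEAR([@Date])
Month:  =MONTH([@Date])
```

Month returns a number from 1 to 12, so the months sort in calendar order when the field is placed in a pivot table.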

Filtering

We filter the data in the pivot by either:

adding a categorical field to the Filter area in the pivot list
creating a slicer

Slicers have some advantages over fields in the Filter area:

slicers can float above the Excel cell surface, and be resized
we can configure the layout e.g. the number of columns
slicers are generally more intuitive to use, for example they display multiple selected values more clearly
a single slicer can control one or several pivot tables

Other features

We look at several other useful features of pivot tables:

drill to detail to show the rows in the source data (the Store table) that contribute to the value in a cell in a pivot table. This is useful for investigating suspicious values.

Summarise with Formulas

Pivot tables are great for getting a fast, simple summary of our data and for exploring the data quickly. However, they have limitations, especially when it comes to creating a polished report. Using formulas is an alternative approach, and in some cases a better one, for a few reasons:

the formula approach has more analytical power, e.g. we can calculate distinct counts and certain types of ratios, which a standard pivot table (one not built on the Data Model) cannot do
we have all the advantages that formulas bring: flexibility of layout and presentation

This approach uses functions such as UNIQUE, SUM, FILTER, SUMIFS and COUNTIFS.

UNIQUE builds the row and column headers
SUMIFS and COUNTIFS calculate the aggregated numerical values
SUM and FILTER provide an alternative method to calculate the aggregated numerical values
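
To illustrate these roles, here is a sketch for a single country. The cell addresses (A2 to D2) are assumptions chosen for illustration, and the Sales column is the calculated column added earlier. The text before each formula is the cell it is entered in.

```excel
A2:  =UNIQUE(Store[Country])
B2:  =SUMIFS(Store[Sales], Store[Country], A2)
C2:  =COUNTIFS(Store[Country], A2)
D2:  =SUM(FILTER(Store[Sales], Store[Country]=A2))
```

B2 and D2 should return the same total for the country in A2; the SUM and FILTER version is longer but lets us swap SUM for another aggregation.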

This approach can also take advantage of dynamic array formulas. Since the row and column headers are created by a spill function, we can pass the array of row or column header values as arguments into the SUMIFS function to generate a row / column / grid of values rather than a single value. This avoids the need to write mixed references in formulas and also the need to copy formulas.
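
As a sketch of this idea, assume the list of countries spills down from cell A2 (the cell addresses are illustrative only). The spilled range operator # refers to the whole spilled list, so one formula per column returns a value for every country.

```excel
A2:  =UNIQUE(Store[Country])
B2:  =SUMIFS(Store[Sales], Store[Country], A2#)
C2:  =COUNTIFS(Store[Country], A2#)
```

The results in B2 and C2 spill down as many rows as the list in A2, so nothing needs to be copied.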

In this exercise, we use functions to achieve the same results as the pivot tables that we have built previously. We use these steps:

create a column with the list of countries (using UNIQUE, and note how this formula produces a spilled result)
calculate Sales by Country (using SUMIFS, first using copy / paste then using the spilled range operator)
calculate number of transactions by Country (using COUNTIFS)
create a row with the list of countries (using UNIQUE and TRANSPOSE)
calculate Sales by Country and Product in various ways
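
The last two steps might look like the sketch below, with products down the side and countries across the top. The layout and cell addresses are assumptions for illustration. Three ways of filling the grid are shown; each would be used on its own, not together.

```excel
A2:  =UNIQUE(Store[Product])
B1:  =TRANSPOSE(UNIQUE(Store[Country]))

Option 1, one spilled formula in B2 that fills the whole grid:
B2:  =SUMIFS(Store[Sales], Store[Product], A2#, Store[Country], B1#)

Option 2, a single-cell formula in B2 with mixed references, copied and pasted across the grid:
B2:  =SUMIFS(Store[Sales], Store[Product], $A2, Store[Country], B$1)

Option 3, SUM and FILTER in B2 with mixed references, copied and pasted across the grid:
B2:  =SUM(FILTER(Store[Sales], (Store[Product]=$A2)*(Store[Country]=B$1), 0))
```

In option 3 the final argument of FILTER (0) is returned when a product and country combination has no rows, so the cell shows 0 instead of an error.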

We also check whether these formulas adapt well when the Store data changes. For example, there is no need to change the formulas if the Store data expands to include new rows or new countries.

Excel has two very new functions, GROUPBY and PIVOTBY. These can create a pivot table in a single formula. Note that these may not yet be available in students’ versions of Excel 365.
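
For versions of Excel that include these functions, a minimal sketch is shown below. Each formula is entered in its own empty cell with room for the result to spill; only the required arguments are used, and the optional arguments (headers, totals, sorting, filtering) are left at their defaults.

```excel
Sales by Country:
=GROUPBY(Store[Country], Store[Sales], SUM)

Sales by Country (rows) and Product (columns):
=PIVOTBY(Store[Country], Store[Product], Store[Sales], SUM)
```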

Formulas have greater analytical power

We create a calculation that a standard pivot table cannot do: the number of orders for each country. OrderId is not unique to a row, so we have to count the distinct values of OrderId.
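
One possible way to do this, sketched under the assumption that the list of countries spills down from cell A2 and that OrderId has no blank values, is to filter the OrderId column to one country, remove duplicates, and count what is left. The cell addresses are illustrative, and the two options are alternatives.

```excel
A2:  =UNIQUE(Store[Country])

Option 1, a single-cell formula in B2, copied down next to each country:
B2:  =ROWS(UNIQUE(FILTER(Store[OrderId], Store[Country]=A2)))

Option 2, one spilled formula in B2 (requires MAP and LAMBDA):
B2:  =MAP(A2#, LAMBDA(c, ROWS(UNIQUE(FILTER(Store[OrderId], Store[Country]=c)))))
```

Comparing the result with the COUNTIFS transaction count for the same country shows how many transactions share an order.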