Data and AI Training

Home | Power BI | Excel | Python | SQL | Generative AI | Visualising Data | Analysing Data

Football Match Dataset

This data lists the football matches played in the current season of the English Premier League as of 31st January 2025.

Match Record

This data is in an CSV file in a public web location here.

The column (variable) names and explanations are listed here for convenience.

Key to results data:

Title Description
Div League Division
Date Match Date (dd/mm/yy)
Time Time of match kick off
HomeTeam Home Team
AwayTeam Away Team
FTHG Full Time Home Team Goals
FTAG Full Time Away Team Goals
FTR Full Time Result (H=Home Win, D=Draw, A=Away Win)
HTHG Half Time Home Team Goals
HTAG Half Time Away Team Goals
HTR Half Time Result (H=Home Win, D=Draw, A=Away Win)

Match Statistics (where available)

Title Description
Referee Match Referee
HS Home Team Shots
AS Away Team Shots
HST Home Team Shots on Target
AST Away Team Shots on Target
HC Home Team Corners
AC Away Team Corners
HF Home Team Fouls Committed
AF Away Team Fouls Committed
HY Home Team Yellow Cards
AY Away Team Yellow Cards
HR Home Team Red Cards
AR Away Team Red Cards

The original data is from the football data website.

Premier League Team Profile

The Excel spreadsheet here has an Excel table named Team with a row for each of the teams in the Premier League. The columns are:

This also includes some fictional data on the ticket sales for each match, used in an advanced Power BI exercise

Logos

The logos of each team are in a public web folder at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/. Each logo is a .svg filetype. The filename is the three letter abbreviation. For example, The Chelsea logo file is at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/CHE.svg

Power BI Case Study – Build a League Table from the Match Results

A sports league table is a report summarising the performance to date of several teams over a certain period. It is built up from the results of matches between the teams – the operational data. Building a league table is a typical traditional BI task!

In this case study, we use Power BI to build the league table for this season’s English Premier League – our intention is to reproduce the results that we see in the league tables from the BBC or the Premier League.

At first, this may seem an easy task – a single, tiny data source and well-defined business rules. But it does throw up some interesting challenges along the way, for example:

League Table Data

The league table columns are:

Data preparation

We need to restate the match record as a pair of results both for the home and away teams. This will involve several operations in the query editor including

Data Modeling

We will start by using the auto-date time feature, but the later import and configure a custom Dates table.

We will create a relationship between our results data and the Dates and Team tables to form a start schema pattern

DAX Calculations

We will build several measures ot use on the league table.

Visual Exercises

The league table show the logo of each team. These logos are available in a public web location as image files. We will want to show these also on our Power BI report

Advanced Exercise

There is an advanced exercise to calculate ticket sales by month. This is based on some fictional data that lists the total proceeds of ticket sales and the number of days on sale. In this data tickets go on sale a few months before the match date.

For this exercise, we assume that the sales for each are evenly distributed over the sale period for that match.