Data Training Courses

Home | Power BI | Excel | Python | SQL | Generative AI | Visualising Data | Analysing Data

Football Match Dataset

This data lists the football matches played in the current season of the English Premier League as of 8th January 2025.

Football Data

An Excel spreadsheet contains several useful tables and is here.

This includes:an Excel file with data in an Excel table named Team with a row for each of the teams in the Premier League. The columns are:

This also includes some fictional data on the ticket sales for each match, used in an advanced Power BI exercise

Logos

The logos of each team are in a public web folder at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/. Each logo is a .svg filetype. The filename is the three letter abbreviation. For example, The Chelsea logo file is at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/CHE.svg

Match Record

This data is in an CSV file in a public web location here.

The column (variable) names and explanations are listed here for convenience.

Key to results data:

Title Description
Div League Division
Date Match Date (dd/mm/yy)
Time Time of match kick off
HomeTeam Home Team
AwayTeam Away Team
FTHG Full Time Home Team Goals
FTAG Full Time Away Team Goals
FTR Full Time Result (H=Home Win, D=Draw, A=Away Win)
HTHG Half Time Home Team Goals
HTAG Half Time Away Team Goals
HTR Half Time Result (H=Home Win, D=Draw, A=Away Win)

Match Statistics (where available)

Title Description
Referee Match Referee
HS Home Team Shots
AS Away Team Shots
HST Home Team Shots on Target
AST Away Team Shots on Target
HC Home Team Corners
AC Away Team Corners
HF Home Team Fouls Committed
AF Away Team Fouls Committed
HY Home Team Yellow Cards
AY Away Team Yellow Cards
HR Home Team Red Cards
AR Away Team Red Cards

The original data is from the football data website.

Power BI Lesson Plan

The objective is to build a league table from the match record. This will involve several challenges.

Firstly we need to understand both the football match dataset and the league table.

The league table columns are:

Data preparation

We need to restate the match record as a pair of results both for the home and away teams. This will involve several operations in the query editor including

Data Modeling

We will start by using the auto-date time feature, but the later import and configure a custom Dates table.

We will create a relationship between our results data and the Dates and Team tables to form a start schema pattern

DAX Calculations

We will build several measures ot use on the league table.

Visual Exercises

The league table show the logo of each team. These logos are available in a public web location as image files. We will want to show these also on our Power BI report

Advanced Exercise

There is an advanced exercise to calculate ticket sales by month. This is based on some fictional data that lists the total proceeds of ticket sales and the number of days on sale. In this data tickets go on sale a few months before the match date.

For this exercise, we assume that the sales for each are evenly distributed over the sale period for that match.