Home | Power BI | Excel | Python | SQL | Generative AI | Visualising Data | Analysing Data
This data lists the football matches played in the current season of the English Premier League as of 31st January 2025.
This data is in an CSV file in a public web location here.
The column (variable) names and explanations are listed here for convenience.
Key to results data:
Title | Description |
---|---|
Div | League Division |
Date | Match Date (dd/mm/yy) |
Time | Time of match kick off |
HomeTeam | Home Team |
AwayTeam | Away Team |
FTHG | Full Time Home Team Goals |
FTAG | Full Time Away Team Goals |
FTR | Full Time Result (H=Home Win, D=Draw, A=Away Win) |
HTHG | Half Time Home Team Goals |
HTAG | Half Time Away Team Goals |
HTR | Half Time Result (H=Home Win, D=Draw, A=Away Win) |
Match Statistics (where available)
Title | Description |
---|---|
Referee | Match Referee |
HS | Home Team Shots |
AS | Away Team Shots |
HST | Home Team Shots on Target |
AST | Away Team Shots on Target |
HC | Home Team Corners |
AC | Away Team Corners |
HF | Home Team Fouls Committed |
AF | Away Team Fouls Committed |
HY | Home Team Yellow Cards |
AY | Away Team Yellow Cards |
HR | Home Team Red Cards |
AR | Away Team Red Cards |
The original data is from the football data website.
The Excel spreadsheet here has an Excel table named Team with a row for each of the teams in the Premier League. The columns are:
This also includes some fictional data on the ticket sales for each match, used in an advanced Power BI exercise
The logos of each team are in a public web folder at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/. Each logo is a .svg filetype. The filename is the three letter abbreviation. For example, The Chelsea logo file is at https://zomalextrainingstorage.blob.core.windows.net/datasets/epl-svg/CHE.svg
A sports league table is a report summarising the performance to date of several teams over a certain period. It is built up from the results of matches between the teams – the operational data. Building a league table is a typical traditional BI task!
In this case study, we use Power BI to build the league table for this season’s English Premier League – our intention is to reproduce the results that we see in the league tables from the BBC or the Premier League.
At first, this may seem an easy task – a single, tiny data source and well-defined business rules. But it does throw up some interesting challenges along the way, for example:
The league table columns are:
We need to restate the match record as a pair of results both for the home and away teams. This will involve several operations in the query editor including
We will start by using the auto-date time feature, but the later import and configure a custom Dates table.
We will create a relationship between our results data and the Dates and Team tables to form a start schema pattern
We will build several measures ot use on the league table.
The league table show the logo of each team. These logos are available in a public web location as image files. We will want to show these also on our Power BI report
There is an advanced exercise to calculate ticket sales by month. This is based on some fictional data that lists the total proceeds of ticket sales and the number of days on sale. In this data tickets go on sale a few months before the match date.
For this exercise, we assume that the sales for each are evenly distributed over the sale period for that match.