The hands-on labs and guided exercises on the course may use one or more of the following datasets.
A useful dataset for many exercises is a Dates table. A Dates table is an essential component of any dimensional data model.
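As an illustration of what a Dates table typically contains, here is a minimal Python sketch that builds one row per calendar day with a few common attributes. The function name `build_dates_table` and the column names are assumptions chosen for this example and are not taken from the course materials; the course's own Dates table may have different columns.

```python
from datetime import date, timedelta


def build_dates_table(start: date, end: date) -> list[dict]:
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,  # 1 to 4
            "month": current.month,
            "month_name": current.strftime("%B"),     # locale-dependent name
            "day": current.day,
            "weekday": current.isoweekday(),          # 1 = Monday ... 7 = Sunday
            "weekday_name": current.strftime("%A"),   # locale-dependent name
            "is_weekend": current.isoweekday() >= 6,
        })
        current += timedelta(days=1)
    return rows


# Hypothetical date range, chosen only for illustration.
dates = build_dates_table(date(2024, 1, 1), date(2024, 12, 31))
print(len(dates))  # 366, since 2024 is a leap year
print(dates[0])
```

Fact tables such as sales or transactions can then be joined to this table on the date column, so that reports can group by year, quarter, month or weekday without recomputing those attributes each time.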
Bank Churn, fictional dataset, details of 10K bank customers and ex-customers (age, salary, tenure, balance, ...) and, importantly, whether they left the bank.
Football Match Record Record of matches in the English Premier League.
Iris, classic dataset, length and width of petals and sepals of 3 species of iris, 150 observations.
London Bike Hire Daily record of the number of bikes hired since July 2010 as part of London’s public transport system.
London Journey Types Monthly record over more than a decade of the number of passengers taking different journey types (bus, underground, tram, ...).
Montana Fictional dataset of 700 sales, by country, segment, product.
MT Cars Classic dataset of 32 cars (1973-74 models) taken from the 1974 Motor Trend US magazine.
Price Paid Property sales in England from 1995 sourced from the Land Registry.
Retail Bank Fictional dataset of savings and loan balances of a retail bank.
SEGRO Share Price History Daily share prices and trading volumes, originally sourced from the SEGRO website.
Strictly Results for the 2023 series of the BBC’s Strictly dancing competition.
Superstore Fictional dataset of 10K transactions (with associated sales, profit) by customer, product, geography, date.
Survey - LBAG Topics Survey conducted in 2020 to gauge interest about possible topics for future events of a meetup group.
Survey - Session Feedback Scores Scores from attendee feedback questionnaire of speaker sessions.
Time Series Datasets, various time series datasets including economic and financial data.
Titanic, partial passenger list of the Titanic.
World Bank Indicators Economic development indicators by country and year.
UK 2024 election The results of recent UK elections. Sourced from the House of Commons Library.
This contains audio MP3 files for use in voice-to-text transcription AI models. The landing page is here.