Zomalex | Data and AI Training

Strictly Dataset

Strictly is a Saturday evening light entertainment BBC show. It is a dancing competition. The latest series was in autumn 2023. Originally about 15 couples compete for the glitterball trophy. A couple is made of a celebrity and a professional dancer. There are also four judges; in this series these were Craig, Motsi, Shirley and Anton.

Each week, couples perform a dance, and the four judges give a score out of ten. The audience can also vote although details of the audience results are not released. When all couples have danced, the audience and judge scores are combined and the two couples with the least combined votes go into the dance off. The judges then vote to save one of the couples and the other couple is eliminated. There are a few exceptions; no couples are eliminated in the first week and in this series two couples withdrew from the competition and so there were no eliminations into two of the weeks.

The BBC website has more details and up-to-date information here.

strictly 2018 dream team

Data Description and Links

The first exercise is to import, shape, and visualise the results from week 1 of series 21 (September 2023). This data is on a Wikipedia page at Strictly Series 21 Wiki Page Use the table named Week 1.

The second exercise is to analyse the results for all 13 week of the full series 21. This data is on an Excel spreadsheet in a public web location. The URL is at Full Strictly Series 21 data

Wikipedia provides separate tables for each week but for this exercise, the results data for all weeks has been appended into one single dataset. The results data has the following columns:

Week e.g. Week 01, Week 02
Couple e.g. in the format Celebrity and Professional Dancer e.g. Nigel and Katya
Music (artist and song)
Dance, e.g. Waltz, Foxtrot
Scores – these are provided in a single column in the format : Total Score (Craig, Motsi, Shirley, Anton) e.g. 30 (6,7,8,9)
Result e.g. Safe, Bottom Two, Eliminated, NA (not applicable, for weeks where no couple was eliminated)

There are two other tables that may be useful.

Judge: contains a row for each judge; full name, link to a Wikipedia page, link to image
Couple: contains a row for each couple; celebrity full name and notability, professional dancer’s full name

Suggested Initial Approach

You will need to clean and reshape the data in several ways before you can build any good visualisation. Build a bar chart to show the judges’ scores for each couple? Some questions you may want to consider.

Which judge is the most generous?
Which judge is the most influential?
Do the couples at the bottom of the leader board (which is the judges scores only) often / always go into the dance off?

Particular Challenges

Wikipedia has organised the results data for presentational purpose but is this the best shape for analysis and visualisation? Is it better to calculate the total score or average score? If you use the Couple table, look out for data quality issues!

Power BI Labs: AI prompt to build calculated Couple column

We may use an AI assistant, such as ChatGPT, to help us build the Query Editor step to get the Couple column. The prompt could be something like:

In PowerQuery I have a table named Couple with Celebrity and Professional Partner columns.  These two columns have full names 
e.g 
Celebrity, Partner
Les Dennis, Nancy Xu 

Add a new column to the Couple table.  This has the first names of Celebrity  & Professional Partner in the format 
<celebrity first name> & <Professional Partner first name> e.g. Les & Nancy.
Name this column Couple.