Home | Power BI | Excel | Python | SQL | Generative AI | Visualising Data | Analysing Data
AI tools can help us explore, understand and visualise a dataset. For best results, we provide information about the data in an initial scene-setting prompt. We can do this in a few ways:
- as a description of the structure: the table name(s) and a list of column names. If we want to be thorough, we can also include the descriptions of each column (data type, whether nullable, the set of valid values if appropriate), but this usually is not necessary. For example, this ChatGPT session analyses a dataset from the NHS about the number and type of hospital appointments over the last decade.
- with some AI tools we can upload the data file - it is often better if this is a CSV rather than an Excel file.
- if we can’t upload the data file, copy and paste the data, or at least the column headers and a few representative rows into the prompt.
- as an image of the structure of the table(s): column names and possibly data types. This could be a snapshot of the model view of a Power BI data model, or an entity relation diagram of a database,
The AI can then
- provide a summary of the data and pattern e.g. identify as a star-schema arrangement of tables and label each table as a fact or dimension table,
- suggest strengths, weakness and improvements to the data structure,
- suggest a list of questions that we can ask to understand more about the data, and
- provide those questions in actionable form perhaps as SQL or Python code.
- if data was provided, the AI can analyse and summarise the data and build a few charts to provide insights.
After the initial prompt, we can prompt with follow-on questions, for example:
- how would we add a new column that combines X and Y?
- summarise the data: sum column X and group by column Y
- what are the factors that influence the values in column Z?
Note: data professionals will often use a language such as SQL, Python or DAX (the data modelling and calculation language of Power BI) for our data analysis. There is a specific section later in the course for each of these.
If we need to write code as part of our data analysis tasks, AI tools can help us write better and faster. It can even write the code for us. For example, we can use it for
- SQL - for example, ChatGPT writes the SQL based on a description of the data and instructions here
- data modelling and DAX (the calculation and data modelling language of Power BI) - see here
- Python - see this example
Tutorial - Analyse NHS England Outpatient Activity
We will use an AI tool to analyse the number of outpatient appointments in England over the past several years by outcome (attended, did not attend etc). The data is in a CSV file - download from here and have a look at it. The original source data is from NHS Digital here.
Suggested Prompts
- Describe the attached dataset (Don’t forget to attach the file)
- Show attendance trends over the years.
- What is the trend in the percentage of patient cancellations? (Possible follow on: Why did this dip in the 2020-21 year?)
- Are there any other insights available from the data?
Data Analysis Exercises
There are four group exercises to practice data analysis using Generative AI tools: Strictly, Bank Churn, Titanic and Supercar.
There will be four groups and each group will have a different cases study and use a different AI tool.
- Strictly, Copilot
- Bank Churn, ChatGPT
- Titanic, Gemini
- Supercar, Claude
There will be about 30 minutes for the exercise. Within your groups, please:
- introduce yourself: videos on!
- elect a spokesperson
- start with the suggested prompts
There will be about 10 minutes for each group’s presentation to the class. In the presentation, the spokesperson will
- explain the dataset and background (class will not be familiar with this)
- describe your results
- include any insights, conclusions, opinions