Data and AI Training

Tidy Data

Tidy data is a good starting point for analysis and visualisation. It is tabular data with one variable per column and one observation per row.

Source (raw) data is rarely tidy. It is often wide data, with the same variable split over several columns for presentation purposes.

For example, the dataset below holds each month's values in a separate column. This is wide data.

Product Jan Feb Mar
Alpha 101 102 103
Bravo 201 202 203

It is much better to reshape (unpivot) this data into a long format such as:

Product Month Amount
Alpha Jan 101
Alpha Feb 102
Alpha Mar 103
Bravo Jan 201
Bravo Feb 202
Bravo Mar 203
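
The reshape above can be sketched in plain Python. This is a minimal illustration rather than a definitive method: the helper name `unpivot` and the column name `Month` are assumptions made for this example, and the figures are the ones from the tables above.

```python
# Wide data: one row per product, one column per month.
wide_rows = [
    {"Product": "Alpha", "Jan": 101, "Feb": 102, "Mar": 103},
    {"Product": "Bravo", "Jan": 201, "Feb": 202, "Mar": 203},
]
month_columns = ["Jan", "Feb", "Mar"]


def unpivot(rows, id_column, value_columns, var_name, value_name):
    """Turn every value column of every wide row into its own long row.

    Hypothetical helper written for this example, not a library function.
    """
    return [
        {id_column: row[id_column], var_name: column, value_name: row[column]}
        for row in rows
        for column in value_columns
    ]


long_rows = unpivot(wide_rows, "Product", month_columns, "Month", "Amount")

# Print one long row per line: product, month, amount.
for row in long_rows:
    print(row["Product"], row["Month"], row["Amount"])
```

The month names start out as column headings and end up as values in a single column, so the month becomes data that can be filtered and grouped. Spreadsheet and data tools usually provide this as a built-in operation (often called unpivot or melt), so a hand-written helper like this is rarely needed in practice.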

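This is tidy data: each row is one observation (one product in one month) and each column is one variable. In this form we can filter, group, and compare values by month as well as by product. That is awkward in the wide format, where the month is held in the column headings rather than in the data.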
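
As a rough illustration of that benefit, the sketch below totals the long-format rows by month and then by product using the same small function. The function name `total_by` and the tuple layout (product, month, amount) are assumptions made for this example.

```python
from collections import defaultdict

# Long (tidy) data: one row per product per month.
long_rows = [
    ("Alpha", "Jan", 101),
    ("Alpha", "Feb", 102),
    ("Alpha", "Mar", 103),
    ("Bravo", "Jan", 201),
    ("Bravo", "Feb", 202),
    ("Bravo", "Mar", 203),
]


def total_by(rows, key_index, amount_index=2):
    """Sum the amount for each distinct value found at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[amount_index]
    return dict(totals)


by_product = total_by(long_rows, key_index=0)
by_month = total_by(long_rows, key_index=1)

print(by_product)
print(by_month)
```

Because month is an ordinary column, grouping by it needs no special handling. With the wide layout, the same totals would need code that names each month column explicitly.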