Tidy data is a good place to start analysis and visualisation. Tidy data is tabular data that has one variable per column and one observation per row.
Source (raw) data is rarely tidy. It is often wide data, with the same variable split over several columns for presentation purposes.
For example, the dataset below has each month's values in a separate column. This is wide data.
Product | Jan | Feb | Mar |
---|---|---|---|
Alpha | 101 | 102 | 103 |
Bravo | 201 | 202 | 203 |
It is much better to reshape (unpivot) this data into a long format such as:
Product | Month | Amount |
---|---|---|
Alpha | Jan | 101 |
Alpha | Feb | 102 |
Alpha | Mar | 103 |
Bravo | Jan | 201 |
Bravo | Feb | 202 |
Bravo | Mar | 203 |
This is known as tidy data. Because the month is now a value in its own column rather than a column heading, we can filter, group and compare amounts by month as well as by product, which is awkward to do with the data in wide format.
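As an illustration, the reshape above can be sketched in Python with pandas, using `DataFrame.melt` to unpivot the month columns. This is a minimal sketch rather than the only way to do it: the variable names (`wide`, `tidy`, `by_month`) and the output column names `Month` and `Amount` are choices made for this example, and the data is typed in by hand from the wide table.

```python
import pandas as pd

# The wide table from the example: one column per month.
wide = pd.DataFrame(
    {
        "Product": ["Alpha", "Bravo"],
        "Jan": [101, 201],
        "Feb": [102, 202],
        "Mar": [103, 203],
    }
)

# Unpivot: keep Product as the identifier and turn every other
# column heading into a value in a new "Month" column.
tidy = wide.melt(id_vars="Product", var_name="Month", value_name="Amount")

# melt lists all Jan rows first, then Feb, then Mar. A stable sort by
# Product groups the rows per product while keeping the month order.
tidy = tidy.sort_values("Product", kind="stable").reset_index(drop=True)

# With Month as a column, comparing by month is a simple group-by.
by_month = tidy.groupby("Month", sort=False)["Amount"].sum()

print(tidy)
print(by_month)
```

Here `tidy` has six rows and three columns, one row per product and month, and `by_month` holds the total amount for each month.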