Data Analysis with Python

Who is this course for? Python is a very popular, versatile, open source language that is often used for data analysis. This course is aimed at people who would like to start using Python for their data analysis tasks.

Learning Objectives: Attendees will have a broad understanding of Python’s numpy and pandas packages and be able to use these to analyse and visualise data in Python.

Pre-requisites: Completion of the Python Foundation course, or equivalent knowledge of the fundamentals of the language.

Course Content:
The course teaches Python for data analysis. It covers in detail the most popular Python packages for data analysis: numpy for working with numerical data, pandas for data analysis and manipulation, and matplotlib and seaborn for data visualisation.

Manipulate numerical data with the numpy package.

  • Create numpy arrays from lists, and in other ways.
  • Examine numpy arrays: determine shape, size.
  • Vectorised operations.
  • Operate on numpy arrays: filter, add, multiply.
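As a rough illustration of the numpy topics above, the following sketch creates arrays, inspects them, and applies vectorised operations and a boolean filter. The variable names and values are invented for this example and are not taken from the course material.

```python
import numpy as np

# Create arrays from a Python list and with a built-in constructor
# (values are made up for illustration)
prices = np.array([10.0, 12.5, 9.0, 15.0])
quantities = np.arange(1, 5)          # 1, 2, 3, 4

# Examine an array
print(prices.shape)                   # (4,)
print(prices.size)                    # 4

# Vectorised arithmetic: applied element by element, no explicit loop
totals = prices * quantities          # 10.0, 25.0, 27.0, 60.0
with_fee = totals + 1.0               # add a flat amount to every element

# Filter with a boolean mask
large = totals[totals > 20]           # 25.0, 27.0, 60.0
print(large)
```

The filter works because `totals > 20` is itself an array of True/False values, which numpy accepts as an index.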

Clean, tidy and analyse data with the pandas package.

  • Import data into pandas from CSV and Excel files and from Python data structures.
  • The DataFrame and Series objects
  • Examine data frames with head(), tail() and describe(), and with the shape and values attributes
  • Operate on a DataFrame: calculate new columns, filter rows, order rows, group by, …
  • Exercise: calculate and visualise FX rates
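The pandas operations listed above might look something like the sketch below. The FX figures, column names and the `inverse_rate` calculation are assumptions made up for illustration; the actual course exercise may differ. The CSV text is read from an in-memory string so that the example runs without any external file.

```python
import io
import pandas as pd

# Hypothetical FX data, inlined so the example is self-contained
csv_text = """date,currency,rate
2024-01-01,EUR,0.90
2024-01-01,GBP,0.80
2024-01-02,EUR,0.92
2024-01-02,GBP,0.78
"""
fx = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Examine the data frame
print(fx.head())
print(fx.shape)                       # (4, 3) at this point
print(fx.describe())

# Calculate a new column
fx["inverse_rate"] = 1 / fx["rate"]

# Filter rows, order rows, group by
eur = fx[fx["currency"] == "EUR"]
ordered = fx.sort_values("rate", ascending=False)
mean_rate = fx.groupby("currency")["rate"].mean()
print(mean_rate)
```

A single column such as `fx["rate"]` is a Series, while `fx` itself is a DataFrame. For a real file, `pd.read_csv` takes a file path in place of the `StringIO` object.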

Visualise data with seaborn and matplotlib packages.

  • Create different types of charts from pandas datasets including line, scatter, bar and column charts
  • Create small multiples (trellis charts)
  • Customise these charts to improve their readability and appeal: titles, axes, colours, …
  • Exercises to visualise several datasets
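To give a flavour of the charting topics above, here is a minimal sketch that draws small multiples (one panel per currency) with matplotlib alone and customises titles, axis labels and colour. The data is invented for illustration. The sketch deliberately uses only matplotlib to keep dependencies small; seaborn provides higher-level functions for the same kind of trellis chart.

```python
import matplotlib
matplotlib.use("Agg")                 # non-interactive backend; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: one rate per day for two currencies
data = pd.DataFrame({
    "day": [1, 2, 3, 1, 2, 3],
    "currency": ["EUR"] * 3 + ["GBP"] * 3,
    "rate": [0.90, 0.92, 0.91, 0.80, 0.78, 0.79],
})

# Small multiples: one panel per currency, sharing the y axis
currencies = sorted(data["currency"].unique())
fig, axes = plt.subplots(1, len(currencies), sharey=True, figsize=(8, 3))
for ax, cur in zip(axes, currencies):
    subset = data[data["currency"] == cur]
    ax.plot(subset["day"], subset["rate"], color="tab:blue", marker="o")
    ax.set_title(cur)                 # panel title
    ax.set_xlabel("Day")
axes[0].set_ylabel("Rate")
fig.suptitle("Illustrative FX rates")
fig.tight_layout()
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` displays the figure; `fig.savefig(...)` writes it to a file instead.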

In lab exercises, attendees import data from text/CSV files, Excel spreadsheets and databases; clean, shape and transform the data; build charts to visualise the results; and finally export the transformed data.
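The import, clean, transform and export workflow described above could be sketched as follows. This is an assumption about what such a lab might look like, not a copy of one: the `sales` table, its columns and its values are hypothetical, and an in-memory SQLite database stands in for a real data source.

```python
import sqlite3
import pandas as pd

# Set up a throwaway in-memory database with made-up rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(" north ", 100.0), ("south", None), ("north", 50.0)],
)
conn.commit()

# Import: load the query result into a DataFrame
sales = pd.read_sql_query("SELECT region, amount FROM sales", conn)
conn.close()

# Clean: trim stray whitespace and drop rows with a missing amount
sales["region"] = sales["region"].str.strip()
sales = sales.dropna(subset=["amount"])

# Transform: total amount per region
summary = sales.groupby("region", as_index=False)["amount"].sum()

# Export: with no path given, to_csv returns the CSV text as a string
csv_out = summary.to_csv(index=False)
print(csv_out)
```

Passing a file name to `to_csv` writes the result to disk rather than returning a string.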

Below are some images from the lab exercises on the course.

A snippet of Python code from one of the lab exercises.