Data Training from Zomalex

Home | Power BI | Excel | Python | SQL | Visualising Data | Generative AI

Retail Bank Dataset

This exercise relates to a fictional retail bank, which provides both savings accounts and loans to customers.

The bank’s IT system provide a history over the last few years of the loan and saving balances for each customer at the end of each month. Each client may have a loan, a saving account, both or neither (possibly these are prospective customers who have not been onboarded yet).

The tables are in an Excel file in a public location on the web here. Import data initially from the Excel tables: LoanProduct, LoanData, Client, ArrearsStatus. Then later from SavingProduct, SavingData

There is a Dates table in a public location on the web here.

The data is provided over several tables: Client, Loan Product, Saving Product, Loan Data, Savings Data, Arrears Status. The ‘Saving Product’ table has a row for each type of savings product (e.g., notice account, easy access account) that the bank offers. The key column is the unique saving product code.

The ‘Loan Product’ table has a row for each type of loan (e.g., unsecured loan, interest only mortgage) that the bank offers. The key column is the unique loan product code.

The ‘Client’ table lists all clients: the unique client code and client name. Clients are split into four geographical regions (North, South, East and West).

The ‘Loan Data’ table has the following columns:

The ‘Saving Data’ table is similar to the ‘Loan Data’ table but does not have a credit limit or arrears status columns. The Balance represents the value of the funds that the client has deposited with the bank.

The ‘Arrears Status’ table shows whether client repayments against a loan are up to date and if in arrears by how much (less than 30 days, 31 to 60 days or above 60 days)

A ‘Dates’ (calendar) table is also provided if you wish to use it. This contains a row for every date in the years of interest (2019 through 2024).

Suggested Initial Approach

Start by importing the tables related to loans only, build a few visuals and get the model into a good shape before extending the model with the ‘Savings Data’ and ‘Savings Product’ tables.

Build a bar chart to show the number of transactions by loan product type.

Build a column chart to show loan balances over time.

Particular Challenges

Balance is a semi-additive measure, like an ‘inventory level’. We can add it up by all dimensions but not by date. We may need to write a [Latest Balance] measure.

The final data model is not a star schema pattern!