Category Archives: Data Visualisation

Using The R Script Visual in Power BI To Build Charts With Precise Layouts

Financial regulators are unreasonable people – at least that’s the impression those of us who report a bank’s risk numbers to them would have you believe. Not only do they demand daily reporting but they also require the charts that we provide to have a defined layout and format. This does raise a problem for Power BI in that while it has very good visual practices, these are not exactly those of the regulator. However, we can use the R visual to create a chart in exactly the fashion that a regular demands.

One very important chart for regulators is the market risk back test chart. It is realy two charts in one; a bar and area chart. It compares the profit and loss (P&L) that the bank, or part of the back such as a legal entity or desk, against a margin of safety, known as value-at-risk or VaR, that we promise that the P&L will stay within most of the time. On those days that the P&L exceeds the VaR, that’s a breach. If we have too many breaches, the regulator will take action.

The regulator wants to see the VaR as an area chart with VaR as a mirror image both above and below the x (date) axis since a breach occurs if the bank makes either a profit or a loss outside this VaR envelope. Regulators want this VaR envelope to be partly transparent so that we can superimpose the P&L on it. The daily P&L must be shown as blue bars with breaches shown in red.

In Power BI chart we can get close to these requirements – close, but no cigar.

Backtest built in native PBI

There are three problems

    • The VaR envelope is shown as two boundary lines rather than area chart. The Power BI combo chart visual combines a bar chart (used for the P&L) with a line chart but we can’t combine with any other chart
      • The P&L bars are all blue; there is no means of changing the colour of individual bars based on a condition
      •  The snapshot below show synchronized axes for P&L and VaR but if the the ranges of these two measures are sufficient different, the chart will automatically have different scales for P&L and VaR, which is not what we want and is potentially misleading.

However, we can meet all requirements if we use the R visual. We can use the fine tuning allowed in the plotting package to define our chart elements precisely. The snapshot below shows the R chart.

Backtest built in PBI R Script Visual

The R visual creates a script editor in a lower pane. It provides the first 2 lines of R code to create an R dataframe (R’s equivalent of a table) that contains all unique values of the columns in the values well. From here we can add a few lines of R code firstly to determine if there is a breach on a given day and then to plot the chart. The R code uses two very popular R packages. The dplyr package enables us to manipulate dataframes and the ggplot2 package gives us the ability to plot exactly the chart we require. The ggplot call first creates the P&L bar chart. It selects the colour of the P&L bars based on the breach condition. It then overlays this with partially transparent area charts – one for the upper VaR bound, then one for the lower (negative) VaR bound.

PBI R Script Editor Window

Even with a small amount of R code like this, it is best to build and test the R code within the RStudio, the typical development environment for R, then copy it into the code window for the visual once it is fully working.

The ggplot2 package allows fine tuning of a charts appearance – enough to satisfy the most unreasonable regulator.

This article first appeared on the Microsoft MVP Technical Tuesdays blog  here
The source data is here.

Analysis of London SW12 property price history

I live in a part of south London called Balham. It’s about 6 miles south of the central area and part of a postal district named SW12 (the 12th SouthWest district in London). It was famous a long time go for being lampooned in a radio comedy as “Balham, Gateway to the South” but things have changed. I’ve lived here over 20 years and property prices have shot up and I wanted to analyse this.

The UK Land Registry provide details of every one of the 18 million property transactions since 1995 in one rather large CSV file which can be found here. This contains the price paid, the date of transfer, the postcode and the house number. An example of a postcode is SW12 0EN. The UK postcode narrows down to a set of about 20 properties, usually on the same street – together with the house number it uniquely identifies the property. There are over 600 postcodes in SW12 and over 1 million in the UK.

My friends at the Information Lab have kindly made publicly available a dataset that geocodes every postcode in the UK. The details are in this blog post. Geocoding ties a location, in this case a postcode, to a latitude/longitude so Tableau can map it. So now I can plot all the house purchases on a map.

This compares the property sales in prices in 1995 and 2013. The size of each circle represents the purchase price. Hover over a particular spot to see the sales details.

This motion chart shows the 14,000 property transactions since 1995, year by year. Running it through the years gives an impression of the activity, including the dip in 2008 due to the credit crisis.

I’m only interested in my neck of the woods but the Land Registry data obviously covers the whole country.

Obviously you can do a lot more with this data – for example filter by price paid only to look at properties within a certain price band. Or you can look at those properties that have been bought & sold several times in the last 18 years and use these to calculate average annual price increases.

Viewing Core Tier 1 Ratios In EU stress test data

The two charts below each give a different perspective the distribution of the core tier 1 ratios (CT1R) of all the banks in the EU stress test data.

Figure 1 shows the distribution of the CT1R of all the banks grouped by their home country. The UK’s 4 banks have very similar CT1R about the 10% mark. Other countries have a much wider range. In contrast, Spain’s banks’ CT1Rs vary over a wide range from 3.82% (CAJA DE AHORROS DEL MEDITERRANEO) to 22.2% (BANCA MARCH SA).

Figure 1 – This strip plot provides a useful summary of the distribution of the CT1Rs in a way that we can easily compare countries.

Figure 2 provides a different perspective. It shows the risk weighted assets (RWA), a measure of size, against the CT1R. Banks with data points in the bottom right of this chart are large banks with poor core tier 1 ratios and are perhaps of most concern. Santander for example has RWA of €600 billion and a core tier 1 ratio of just over 7%.

Figure 2 – This scatter plot highlights large banks with poor CT1Rs

Both charts are implemented in Tableau Public and are interactive – you can hover over each point to identify the bank and read the exact CT1R as in the snapshot below.
Scatter Plot Details show Brushing Of Single Item

Brushing and Linking in Microsoft Power View

Power View is Microsoft’s new data visualisation and exploration tool. This first look shows off a useful feature, known as brushing and linking, which is useful when a dashboard shows several charts that are different perspectives of the same dataset. Brushing allows the user to set to choose a set of data points in one of the charts and highlights this data. Linking then highlights the related data in other charts. It can be a powerful help to analysis.

Figure 1 shows a Power View screen with two charts. Both charts are different perspectives on the same data which is the sovereign exposures of 90 EU banks as of Dec 2010. The bar chart on the left shows the exposure by bank (sorted top down) and the column chart on the right shows the exposure by maturity. This is a simple, but useful, example of what Stephen Few, a data visualisation expert, calls a ‘faceted analytical display’.

Power View screen
Figure 1 – PowerView screen

As you can see, HSBC has the highest exposure. We can click on the HSBC bar to see its maturity profile. Figure 2 shows the result. The maturity chart now changes; the total exposures are now de-emphasised but still visible while HSBC’s exposures are highlighted. We can see that HSBC has a high proportion of debt with shorter maturities, especially the 3 month tenor. If you regard shorter term maturities as a less risky situation then this is a good thing for HSBC.

Figure 2 - Power View screen: HSBC selected
Figure 2 – Power View screen: HSBC selected

Figure 3 shows the corresponding picture for BNP PARIBAS, the bank with the 2nd highest exposure overall. It has a very different maturity profile with more longer dated exposures especially the 10 year.

Figure 3 - Power View screen: BNP PARIBAS selected
Figure 3 – Power View screen: BNP PARIBAS selected

We can click on a particular column in the maturity chart to see how that particular maturity is spread across the banks. For example, in Figure 4 the longest (15 year) maturity is selected and we can see that Dexia has the highest exposure to this maturity even though it has the 16th highest exposure overall.

Figure 4 - Power View screen: 15Y tenor selected
Figure 4 – Power View screen: 15Y tenor selected.

Alternatives to ‘Top N’ concentration tables in risk management reports

Market risk management reports often contain many ‘Top N’ tables. Figure 1 shows a typical example. Each of the four tables shows the top 5 underliers (companies) for a different risk factor – in this example delta, gamma, vega and time-weighted vega.
Figure 1
Figure 1

Figure 1 actually underestimates the prevalence of these ‘Top N’ tables. Often there may be about for about 9-10 ‘Top 10’ tables and even more if underliers are split between indexes and individual stocks.

It is not easy for readers of these tables to gain insight about the data in these tables quickly . Firstly, they suffer for all the usual problems of tables of text and numbers. Secondly, it’s quite hard to follow the details for a particular underlier across all risk factors. Say we are interested in ‘Underlier F’ since it has the highest delta. We have to scan the other three tables to find it and in fact we don’t find it in the gamma and vega tables since it is not in the Top 5 in these risk factors. Thirdly, the tables don’t provide that much data – there are only 20 data points (4 tables of 5 numbers each).

There are a few alternatives that may improve on this. We can add data-bars that provide some visual cues about the relative values as in Figure 2.
Figure 2
Figure 2

However it would be much better to see all the data for each and every underlier that appears in any of the tables. In our example, that will be between 5 and fewer than 20 underliers since we can expect the same underlier to be present in several tables. Figure 3 shows the data organised with each underlier on a separate row. Now it is easy to see all the values for a particular underlier. Although it is now harder to determine the biggest value since we are no longer ordering by value, the data-bars help us quickly home in on the important values.
Figure 3
Figure 3

The data points represent a snapshot of the position as of close of the previous business day and it is very useful to know the recent history – how has this value been arrived at. In Figure 4, sparklines were added to show a micro-chart of the each value over the last 10 days.
Figure 4
Figure 4

This report was implemented in Excel and the sparklines were added using a very useful third-party Excel add-in, XLCubed.

Figure 4 has made good use of precious space and increased the density of data points. As mentioned Figure 1 represents 20 data points. Figure 4 however manages to represent twenty times as many data points (i.e. 400 data points; 10 underliers by four risk factors by 10 days) while only taking up about twice the screen or page real-estate.

Looking at Libor data

Libor is in the news. Barclays, and probably some other banks have been fidding it. There will be a lot of lawsuits soon enough. It’s interesting to look at the actual Libor rate values and get a feel for how they vary. The British Bankers Association (BBA) publish the Libor rates for the last 6 months with a 2 month delay – data from Dec 2011 to May 2012 is currently available here. The data is for 10 currencies and for 15 tenors from overnight to 12 months. I’ve used the May data to create the three graphs below.

Figure 1 shows the yield curve on May 1st for each currency. This is a motion chart – you can use the slider to move through the month and see how the yield curve changes from day to day.

Figure 2 shows how the various LIBOR rates for each currency change through the month for a particular selected tenor.

Figure 3 shows how the various LIBOR rates for each tenor change through the month for a particular selected currency.

The graphs highlight some notable features, for example

  • Something happened to the DKK rate on the 24th May and for the next few days – it drops suddenly.
  • The DKK rate is above the GBP rate for all tenors – except the two month tenor (until it has its drop on May 24th)
  • Some rates are almost completely flat (e.g. JPY) most seem to drop very slightly during the month (e.g. GBP) whereas the USD rate was the exception to the rule – it rose slightly.

Visualising the EBA stress results

In July 2011, the European Banking Authority (EBA) published results of a stress test of 90 of the biggest banks in the EU.  The purpose of the stress test was to test the resilience of the banks in an “adverse but plausible” scenario to give investors good data and transparency about the stability of these banks.  The details are here

The data provided for each bank included the risk weighted assets (RWAs), core tier 1 (CT1) capital and ratios, exposures at default (EADs) to various sectors of the economy and also a breakdown of sovereign exposures by country and residual maturity.

The EBA provided a report into each bank but also thoughtfully made the raw data available.  I have mashed this up and made some visualisations using Tableau Public data visualisation software.  (See for more details).   These provide great insight into the data.

The first visualisation shows the exposures to each sovereign country.  Use the ‘bank’ selector to look at all banks or for a particular bank. You can ask questions such as “what is RBS’s exposure to sovereigns?

The second visualisation shows the exposures by bank.  Use the ‘country’ selector to look at all sovereign exposure or for a particular country.   You can ask questions such as “which banks are most exposed to Greece?”

I find this “heat map” style of visualisation useful for an “at-a-glance” overview of the data. Much of the power of the visualisation is the interactivity so I suggest that you have a play.

This next visualisation shows the the exposure to the PIIGS – the five countries that have been of most concern (Portugal, Ireland, Italy, Greece and Spain) during the credit crisis.  I have grouped the banks by their home country. Obviously banks have a lot of domestic exposure but it’s interesting to see that French, German and Belgian banks have a significant amount of Greek debt – and even more Spanish debt.

Of course, capital adequacy data as of December 2010 are very old news. A lot has happened in the year since the EBA published these results last July. Many of the banks have improved their CT1 ratios (indeed, one purpose of the stress tests was to galvanise banks into exactly this action). Bankia, a Spanish bank has been re-structured. Over the next few weeks, I will be bringing the data more up-to-date and providing more visualisations.