Category Archives: Power BI

Using R in Power BI Part (3) – Create a R Script Visual

The third way of using R in Power BI is to create an R Script visual.  This is especially useful when we want to create a visual such as small multiples that is not available (yet) in the suite of standard or custom PBI visuals.

 Image that we have a daily time-series for December 2015 for the profit and loss (P&L), and we want to show a trellis of bar charts; one chart  for each  region (Asia, EMEA and US) and one for a  Volcker flag (which is either In or Out). So we want to show six charts in all.   We also want to show a bar on each chart in a different colour for each of three clusters.   Given a data frame named dataset with date, Region, Volcker, Cluster and P&L columns, we can create a this in R using the ggplot2 library with the following code.

 ggplot R code

Once we have written and tested this code in RStudio, we can create a R Script Visual. The RScript visual provides  a dataframe (R’s equivalent of a table)  named dataset.  This contains  a column for each of the fields in the Values well – in this case Region, Cluster, VolckerFlag, Date and P&L.

 R Script Values Well

We can copy our R  snippet into the bottom of the R Script Visual code window. 

R script editor

 Once done we can see our chart in Power BI.

R visual


Using R in Power BI Part (2) – Execute A R Step in a Query

The second way of using R in Power BI is as a step in a query in the Query Editor. Imagine we have a Task query that has a TaskOwner column that contains a comma separated list of one or more owners – like the snapshot below.

Task Query

Our objective is to create a new query that contains the mapping of tasks to owners – this requires us to split the TaskOwner column and create a row for each task and owner.

Since we are creating an additional dataset, let’s start by duplicating the Task query and rename to TaskOwnerMap. We can now insert a R step (rightmost icon on the Transform ribbon below).

Run R Script button

This opens a dialog where we can paste our R code snippet.

run r script window

The code above uses the functions from the tidyr package. The separate function splits the TaskOwner columns into several columns, one for each owner. The gather function then unpivots these columns to create an attribute and value column. The  select chooses the columns required for our final dataset.  (It’s best to develop, debug and test in the RStudio or other development environment beforehand.)

This transforms the data as required – we’ve named our result dataframe and we expand this in the next step to see the columns.

QE Final TaskOwnerMap

This particular example (splitting a column) is more easily accomplished directly in the Query Editor than using R but I wanted to show a simple case to explain the mechanics of the process.

Using R in Power BI Part (1) – Load a R Script

There are three ways to use R in Power BI

  1. Load a R Script as a data source
  2. Add an R Script as a QE step – introduced in the July 2016 update
  3. Plot a R Visual

In this article we’ll look at the first of these.  The subsequent articles will look at the other two ways.  We’ll use as an example in a typical reporting and visualisation challenge taken from the area of report progress of task to meet certain regulatory requirements – in this instance the 11 principles for good aggregation and reporting of data issued in by the BCBS – details here if you are interested.

Imagine the source data looks like this.

Excel Compliance Task Data Source

This shows a row for each task with some properties (Key, Name, Start Date, End Date). It also shows which task are applicable for which regions (denoted by the x in the cell).  The set of regions is implicit in the column names.  It also shows the owners of each task – if there is more than one owner, these are in a comma separated-list. In a similar way, it also shows the principles covered by each task.  There are 11 principles and we know this list will only have integers between 1 and 11. We also have on a separate sheet the details of the principles

Excel Compliance Principle List

This is not a good structure for reporting and analysis.  We need to tease out the data into separate, related tables –

  • Task,
  • TaskPrincipleMap (to hold the many to many relationships between Tasks and Principles),
  • Region,
  • TaskRegionMap (again to hold the many to many relationships between Tasks and Regions)

and combine this with the Principle table to arrive at a data model that looks something like this

Target Data Map

The R script below uses the tidyr and dplyr packages to perform a sequence of data transformations to achieve exactly this.  These include projecting columns (select), filtering rows (filter), splitting a comma-separated column into individual values (spread) and unpivot (gather).  The result is a set of data frames (R’s equivalent of a table) that correspond to the tables required.

Compliance Data Transform R Code

We can paste this RScript into the “Load R Script“ window.

Get Data R Script


execute R script window

One of the nice features of this is that you can then choose exactly which dataframes created by the script to import – and that you can load several dataframes not just one.

Navigator Window

This then loads our data into the Query Editor below – each selected R data frame becomes a separate query.  (in the snapshot below, I have renamed the queries from R naming convention to more typical table / query names e.g. from  to TaskPrincipleMap)

QE view

In this case, we can do these operations directly in Power BI of course – but R has many advance statistical capabilities that are not in Power BI; decision tress, clustering, correlations.  Some of these are shown in the R Script Showcase here.

Visualising Operational Risks

Operational risks results differ from other types of financial risk results such as market or credit risk. The latter are defined in numerical ratio terms – examples include profit and loss, value-at-risk and exposure. We can add these measure together naturally.

However, operational risk results are usually expressed in ordinal terms. For example, an operational risk and controls self-assessment (RCSA) estimates the inherent risks, assesses the controls that are in place to avoid and mitigate these risks and then estimates the residual risks after these controls are applied. The effectiveness of each control is also estimated – typically on a scale from adequate, inadequate, vulnerable and worst-of-all not controlled.

There are a set of 7 industry standard well defined event type categories; internal fraud, external fraud, employment practices, clients & products, damage to physical assets, business disruption and execution & delivery. Each of these is broken down into sub-categories. For example, internal fraud splits into unauthorised trading, bribery and others. Furthermore, many parts of the bank submit a RCSA assessment and these may be organised into a hierarchy.

Risk Managers want to view these as heat maps. They would view initially at an aggregated level, for example, at event type category and across all regions. The aggregated values should be the “worst case” of the detail values so if a department has estimated low for misappropriation of assets, medium for mismarking and high for bribery then the internal fraud aggregated score would be high.

The snapshot below is a typical heatmap of RCSA scores, mocked up in Excel. The columns list the inherent risks, mitigating controls these residual risks. These are broken down by region. The rows list the standard operational risk event types

Mockup RCSA Heatmap

The data for this heatmap is a table – first few rows below.

Mockup RCSA Data

Let’s build this heatmap in Power BI Desktop. Our first problem is that these are ordinal results. L, M, H, VH values represent low, medium, high and very high scores for risks. Similarly, A, I, V, NC, values represent adequate, inadequate, vulnerable and not controlled for effectiveness of controls. So we add another dataset that helps us map these ordinal scores to a numeric weight to indicate the order of these scores. The actual values of those weights don’t matter, just the relative values.

Mockup RCSA Weights

We’ll merge these two datasets in the Query Editor to get a combined data set like this below

RCSA Query Editor Merged Data


We then create two measures

WorstCaseWeight = MAX(RCSAData[Weight])

This measure aggregates as worst-case manner as required.

WorstCaseScore = SWITCH([WorstCaseWeight], 10, “L”, 20, “M”, 30, “H”, 40, “VH”, 110, “A”,120, “I”, 130, “V”, 140, “NC”)

This measure maps our worst case weight into the appropriate ordinal score. Power BI does not allow a string field in the values well so we can’t add our score values directly but it does allow any measure – even if that measure returns a string.   This is very useful for us.

Now we can visualise our heatmap both at an aggregated and a detail level.

Mockup RCSA Heatmap in PBID - summary

Mockup RCSA Heatmap in PBID - detail


This gives us basically what we need with the exception of conditional formatting of the cells. We have a heat map without the “heat” which is unsatisfactory. What we need is to be able to map these categorical scores to colours and apply these colours to the cell background. This is not yet possible for the matrix visual, or indeed for any Power BI visual but I hope that this will be a features of upcoming update – or that someone will build a heat map custom visual.

Using the Colour Brewer With Conditional Formatting

The Power BI Desktop May Update introduced conditional formatting of numerical columns in tables. The snapshot below shows a table with two columns, one of which has had conditional formatting applied.

Table With Conditional Formatting

The conditional formatting dialog is a context menu on numeric columns in the values well

Conditiional Formatting Dialog

You can set the colours at the end points of the colour scale, the minimum or maximum values from a colour picker but also choose a customer RGB colour value (for example, #FF0000 representing pure red). There is a lot of choice but very little guidance on the best combination of colours. Fortunately the colour brewer website  can help here by suggesting a few good colour schemes.  This site was developed by Cynthia A. Brewer in the Geography department of Pennsylvania State University.

Colour Brewer Landing Page

The simplest approach is to take the RGB values of the first and last colours in a scheme and enter these as the custom colours for the minimum and maximum values.
For example, the PL column in the table above used # edf8fb as the colour for the minimum value and # 006d2c as the colour for the maximum value

Choosing a custom RGB colour

The conditional formatting dialog allows to choose a diverging colour scheme. This is useful for a range of values centred around zero where you want to emphasize large absolute values either positive or negative. For example, on a column of profit and loss values it is useful to highlight those periods with either large profits or losses.

Table With Diverging Formatting

This is the first appearance of conditional formatting and there are of course some improvements that suggest themselves. It would be helpful to be able to

  • To be able to set some defaults for personal, team or organisational colour schemes – currently every column has to be set up manually
  • To have a set of discrete colours mapped to ranges (as is provided by the colour brewer site)

In the longer run it would be very useful to have conditional formatting based on a categorical measure as well as a numeric range. For example, imagine a DAX calculation that returns either “Breach” or “OK” – we would like to highlight the breaches.  Of course we could fake this now by our DAX calculation return -1 or 1 for OK and Breach respectively.

Quick Calcs in the Power BI May Update

The Power BI Desktop May Update introduced Quick Calcs. In fact, there is currently only one calculation currently, percentage of grand total. However, it shows the direction of travel – I assume that there will be other calculations added in future months; e.g. percentage of parent and time related calculations; moving average, year-to-date and so on.  It removes a stumbling block for many (perhaps most) users so that they can access typical common calculations; if you know DAX then you can write a concise DAX function for any of these calculations, but many users will not want to learn DAX.  Quick Calcs are useful with the new tooltips feature, also introduced in the May update

You can create a quick calc from a context menu on a measure placed in the Values or new Tooltips area.

Quick Calc Context Menu

The ‘Show value As’ drop down in the Quick Calc dialog provides only one option ‘Percent Of Grand Total’

Quick Calc Dialog

The snapshot below shows the tooltips area with two measures; the ‘% GT PL’ generated by the quick tooltip and a manually created DAX measure for comparison ‘% GT PL DAX’ that generates the same result. This has the formula

% GT PL DAX = [PL]/ CALCULATE( [PL],  ALL(FirmHierarchy) )

Quick Calc InTooltips

And the tooltip shows the PL measure (as usual, since in the values area) but now also the two measures in the tooltip area.

Tooltip Detail

This is the first appearance of Quick Calcs and there are of course some improvements that suggest themselves. It would be helpful to be able to –

  • format the result, just like you can a normal DAX measure
  • rename the result, again just like you can a normal DAX measure.  The name is set to ‘% GT <original measure>’  and is sensible but of course may not fit in with the pattern that you use to name measures
  • put the calculation into the fields list so it can reuse it in other visuals
  • see the DAX underlying the quick calc; this would be a good way of learning some practical DAX


Visualising A Daily Operational Process

The trading at a bank is managed at book level and there are thousands of books.  During the business day, each book moves through several different states starting with ‘Open’, then progressing through ‘Data OK’, ‘Calcs OK’, ‘Approved’, ‘Locked’ and finally ‘Closed’. This process is subject to strict deadlines – and it is important to monitor the overall progress during the day.

The visuals below show two ways of doing this. The bar chart shows the overall progress at a point in time – and it is more useful when it is extended into a side-by-side bar chart that can compare against the previous day or a historic average, so we can get an indication how well the current day is progressing.Power BI Desk Status SnapshotThe stacked area chart shows the progress through the day. The visual below shows at the end of the day – not all books have completed their journey to Closed status which could be a concern.

Power BI Desk Status Profile

I’ve simulated in Excel the source data for 100 books with snapshots of the status of each book taken at 30 minute intervals The main table has each book on a row and each column represents a half hourly timestamp.

Excel Status History

The book status is represented as an integer from 0 (Open) to 5 (Closed) and this mapping is in a separate table.

Excel Status List

The steps to make the charts are as follows

  • Import both the Excel sheets into Query Editor
  • Unpivot the data to obtain a row for each book at each timestamp
  • Load the data. The data model automatically joins the two tables on Status Id
  • In the data model, sort the Status Name column by Status Id so that our visualisations order the Status in the proper order rather than alphabetically
  • Build the visualisations

I have made a  5 minute YouTube video where I demonstrate building these charts

The Power BI visual is here.

If you would like to recreate it the sample source data is here.

First Look at Power BI Parameters

The Power BI Desktop April update contains several improvements, including templates, conditional columns and formatting tables but for me the most important are parameters since they make possible useful ways of interacting with Power BI reports.

For example, risk managers like to compare the VaR, a measure of the risk of a portfolios between any two dates they choose. Let’s call these two dates the “Start Date” and the “End Date”.  This is not possible by filtering the data – if we first filter all data by the Start Date, we have already filtered out the data for the End Date.

However, we can achieve this with parameters in the three easy steps. Firstly In the Query Editor, set up parameter Start Date with Date data type.

Manage Parameters Dialog

Create an ‘End Date’ parameter also.

Secondly create a new measure “VaR At Start Date” with the DAX below.

VaR At Start Date measure

Note that it references the “Start Date” Parameter. The MAX() is necessary since DAX has to be sure it is a single value. Of course our parameter contains only one value so we can safely do that.  Create a ‘VaR at End Date’ measure in the same way.

Finally add those measures as values well to a side-by-side bar chart and use another dimension on the ‘Axis’ well.

VaR Comparison Chart

The icing on the cake is to create the parametrised title for the chart by creating a DAX measure below that includes the Start Date and End Date parameters.

Comparison Title Measure

Chris Webb shows an advanced use of this technique in his blog.

Now users can change the parameters and the values on the chart refresh accordingly.

Enter Parameters Dialog

This first version of the parameters is very useful but there are clear ways that future versions may improve on this.

  • An obvious improvement would be to have a calendar picker for dates.
  • Also, at the moment we can only restrict values of our parameter to a list of allowed values – not helpful if your data contains 2 years of dates. It would be useful to set a range.
  • Furthermore we’d like to set a range of one parameter based on the value chosen by the user of another parameter – so that our End Date can never be earlier than our Start Date for instance.
  • When the user changes parameters, PBI desktop refreshes the data source. This is problematic if the data refresh takes more than few seconds.
  • In our example, changing the parameters does not actually require a data refresh.
  • Finally, parameters are available in the desktop but not yet in the service.

Having said all that, parameters do represent a big step forward in improving the usability of Power BI.


Using The R Script Visual in Power BI To Build Charts With Precise Layouts

Financial regulators are unreasonable people – at least that’s the impression those of us who report a bank’s risk numbers to them would have you believe. Not only do they demand daily reporting but they also require the charts that we provide to have a defined layout and format. This does raise a problem for Power BI in that while it has very good visual practices, these are not exactly those of the regulator. However, we can use the R visual to create a chart in exactly the fashion that a regular demands.

One very important chart for regulators is the market risk back test chart. It is realy two charts in one; a bar and area chart. It compares the profit and loss (P&L) that the bank, or part of the back such as a legal entity or desk, against a margin of safety, known as value-at-risk or VaR, that we promise that the P&L will stay within most of the time. On those days that the P&L exceeds the VaR, that’s a breach. If we have too many breaches, the regulator will take action.

The regulator wants to see the VaR as an area chart with VaR as a mirror image both above and below the x (date) axis since a breach occurs if the bank makes either a profit or a loss outside this VaR envelope. Regulators want this VaR envelope to be partly transparent so that we can superimpose the P&L on it. The daily P&L must be shown as blue bars with breaches shown in red.

In Power BI chart we can get close to these requirements – close, but no cigar.

Backtest built in native PBI

There are three problems

    • The VaR envelope is shown as two boundary lines rather than area chart. The Power BI combo chart visual combines a bar chart (used for the P&L) with a line chart but we can’t combine with any other chart
      • The P&L bars are all blue; there is no means of changing the colour of individual bars based on a condition
      •  The snapshot below show synchronized axes for P&L and VaR but if the the ranges of these two measures are sufficient different, the chart will automatically have different scales for P&L and VaR, which is not what we want and is potentially misleading.

However, we can meet all requirements if we use the R visual. We can use the fine tuning allowed in the plotting package to define our chart elements precisely. The snapshot below shows the R chart.

Backtest built in PBI R Script Visual

The R visual creates a script editor in a lower pane. It provides the first 2 lines of R code to create an R dataframe (R’s equivalent of a table) that contains all unique values of the columns in the values well. From here we can add a few lines of R code firstly to determine if there is a breach on a given day and then to plot the chart. The R code uses two very popular R packages. The dplyr package enables us to manipulate dataframes and the ggplot2 package gives us the ability to plot exactly the chart we require. The ggplot call first creates the P&L bar chart. It selects the colour of the P&L bars based on the breach condition. It then overlays this with partially transparent area charts – one for the upper VaR bound, then one for the lower (negative) VaR bound.

PBI R Script Editor Window

Even with a small amount of R code like this, it is best to build and test the R code within the RStudio, the typical development environment for R, then copy it into the code window for the visual once it is fully working.

The ggplot2 package allows fine tuning of a charts appearance – enough to satisfy the most unreasonable regulator.

This article first appeared on the Microsoft MVP Technical Tuesdays blog  here
The source data is here.

Time-Series Slicers In Power BI

In finance, we often need to look at a time-series over a large time-period, say a few years, then focus in on a much smaller range. Of course we can do this in Power BI desktop by putting the date (or month, quarter or year) in the filters area.

However, there are three custom visuals that we can use for convenient on canvas filters. The timeline, from Microsoft, shows a bar with a left stop (earliest date) and a right stop (latest date). The bar is divided into periods. There is a small slider that allows us to switch these periods between years, quarters, months, weeks and days. This works pretty well. I would like to see a few improvements however, to be able to disable certain periods. For example, when looking at a 5 year range, years and quarters are appropriate but, anything less than that is not a helpful.

The time brush, also from Microsoft and in experimental state, is even more convenient. It has a drag and drop feel; simply mouse-down at the start of the range and mouse-up at the end. The range on the slider is shaded. The time brush does have a couple of gremlins – for example, it shows a bar chart supposedly of the frequency of the observations but in that case I would expect a uniform distribution over time with my test data and it is clearly not showing that.

The advanced time slicer, from Cambridge Technology Partners, is an all-in-one visual, with a lower smaller bar chart showing the whole time range. It has drag selection like the time brush above and the larger chart above shows the selected period. Again there are some suggestions this is not ready for prime-time. The larger chart seems only to show the positive values and it calls itself the BrushChart in the visuals palette.

All three slider are shown in the example below. I will update this blog as they improve – all will become really useful slicers.