Data Training Courses

Approaches to Improving Data Quality

There are several approaches we can take to improve data quality. Each is outlined below.

Note: We will not cover data protection here (anonymising, securing, auditing/logging).

Visualise

A good visualisation often reveals unanticipated data issues that predefined rules do not catch.

With modern tools, exploratory visualisation is quick and easy.

Validate

Check against business rules e.g.

Inspect individual rows and totals - both counts and amounts.

Flag with exception reporting and alerts.
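
The checks above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `Order` record, the two row-level rules and the expected control totals are all hypothetical, chosen only to show row checks, count and amount checks, and an exception report side by side.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical record layout, for illustration only.
    order_id: str
    quantity: int
    amount: float


def validate(orders, expected_count, expected_total):
    """Return a list of exception messages; an empty list means all checks passed."""
    exceptions = []
    # Row-level checks against (assumed) business rules.
    for o in orders:
        if o.quantity <= 0:
            exceptions.append(f"{o.order_id}: quantity must be positive")
        if o.amount < 0:
            exceptions.append(f"{o.order_id}: amount must not be negative")
    # Total-level checks: both counts and amounts.
    if len(orders) != expected_count:
        exceptions.append(f"row count {len(orders)} != expected {expected_count}")
    total = sum(o.amount for o in orders)
    if abs(total - expected_total) > 0.005:
        exceptions.append(f"total amount {total:.2f} != expected {expected_total:.2f}")
    return exceptions


orders = [Order("A1", 2, 20.0), Order("A2", 0, 15.0), Order("A3", 1, -5.0)]
report = validate(orders, expected_count=3, expected_total=30.0)
for line in report:
    print("EXCEPTION:", line)
```

Note that the totals agree here even though two rows break a rule, which is why both individual rows and totals are worth inspecting.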

Certify

Agree ownership

Constrain / Enforce

Often worth flagging a ‘missing’ value as:

Reconcile

Often an automated daily process with sign-offs
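
As a rough sketch of what an automated reconciliation might do, the following compares two sets of amounts keyed by a shared identifier and lists the breaks. The `ledger` and `warehouse` names, the keys and the tolerance are assumptions made for the example. Treating "no breaks" as the condition for sign-off is likewise just one possible convention.

```python
def reconcile(source, target, tolerance=0.005):
    """Compare two {key: amount} mappings and return the breaks between them."""
    breaks = []
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            breaks.append((key, "missing in target"))
        elif key not in source:
            breaks.append((key, "missing in source"))
        elif abs(source[key] - target[key]) > tolerance:
            breaks.append((key, f"amount differs: {source[key]} vs {target[key]}"))
    return breaks


# Hypothetical data: a source ledger and the copy loaded downstream.
ledger = {"T1": 100.0, "T2": 250.0, "T3": 75.5}
warehouse = {"T1": 100.0, "T2": 205.0, "T4": 10.0}

breaks = reconcile(ledger, warehouse)
for key, reason in breaks:
    print(key, reason)

# One possible sign-off rule: only sign off when nothing is outstanding.
ready_for_sign_off = not breaks
```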

Adjust and comment

Adjustments are a fact of life.

Ensure a properly followed and documented process, for example

A process to fix the underlying causes must be in place.

Comments available on the final dashboard.

Results based on adjusted data should be visually differentiated, for example by using a different colour.
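
One way to keep adjustments visible is to store them separately from the source values and carry a flag and a comment through to the output. The sketch below assumes a hypothetical `Adjustment` record and uses an asterisk in place of a different colour. It illustrates the idea and is not a recommended schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    # Hypothetical fields, for illustration only.
    key: str
    new_value: float
    comment: str
    author: str


def apply_adjustments(values, adjustments):
    """Return {key: (value, is_adjusted, comment)} without altering the source values."""
    result = {key: (value, False, "") for key, value in values.items()}
    for adj in adjustments:
        if adj.key not in values:
            raise KeyError(f"cannot adjust unknown key {adj.key!r}")
        if not adj.comment.strip():
            raise ValueError("every adjustment needs a comment")
        result[adj.key] = (adj.new_value, True, adj.comment)
    return result


raw = {"Q1": 120.0, "Q2": 95.0}
adjusted = apply_adjustments(
    raw, [Adjustment("Q2", 105.0, "Late invoice booked in Q3", "analyst")]
)
for key, (value, is_adjusted, comment) in adjusted.items():
    marker = "*" if is_adjusted else " "  # stands in for a different colour
    print(f"{key}{marker} {value:8.2f}  {comment}")
```

Because the raw values are left untouched, the adjustment can be removed once the underlying cause has been fixed.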

Document

Describe the meaning and semantics of the data, and the process followed to transform it.

Open datasets are often supplied with metadata.

‘Reproducible research’ approach: provide the code that transforms the data, e.g. as a ‘notebook’, so others can analyse and repeat it.

Track (apply data lineage)

Assign a unique tag to each item of source data that accompanies it through the data journey.

Aggregation of data loses tags, but it is often the last stage of the journey.

Drill-through capabilities on visualisations are good for listing all the tags behind a suspicious aggregated value.

Often useful in the reconciliation process.
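
The tagging idea can be sketched as follows. The tag format (`sales-000001`) and the field names are invented for the example. A plain aggregate would drop the tags, so this sketch keeps the contributing tags alongside each total, which is one possible way to support drill-through.

```python
from collections import defaultdict


def tag_rows(rows, source_name):
    """Attach a lineage tag, unique within the source, to each row (format is illustrative)."""
    return [
        {**row, "tag": f"{source_name}-{i:06d}"}
        for i, row in enumerate(rows, start=1)
    ]


def aggregate(rows, key):
    """Sum amounts by key, keeping the contributing tags for drill-through."""
    totals = defaultdict(lambda: {"amount": 0.0, "tags": []})
    for row in rows:
        bucket = totals[row[key]]
        bucket["amount"] += row["amount"]
        bucket["tags"].append(row["tag"])
    return dict(totals)


rows = tag_rows(
    [
        {"region": "North", "amount": 10.0},
        {"region": "South", "amount": 7.5},
        {"region": "North", "amount": 2.5},
    ],
    source_name="sales",
)
by_region = aggregate(rows, "region")

# Drill through: which source items make up the North total?
print(by_region["North"]["amount"], by_region["North"]["tags"])
```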

Golden Source – Single Version of the Truth

Automate

BCBS 239 Data, Aggregation & Reporting