Data Classification – the role of Metadata in Data Protection

By Daragh O Brien
May 26, 2015
13min read

Metadata – it is often defined as “data about data”. Equally often it is considered a dry and technical issue, a question of documentation or the domain of the geek. Blessed are the geeks however, at least when it comes to the impact that metadata can have on critical business decisions and processes, such as Data Protection compliance and responses to a Data Security Breach.

We recently advised an organisation that had suffered a data breach. No “black hat hackers in the dead of night” here, just your run-of-the-mill configuration failure in a back office process that resulted in personal data of individuals being sent to places it should not have gone. In this case, our client had amended the processing of responses to a website expression of interest form to get around spam filters blocking their follow-up email. That follow-up email contained a copy of the form that had been submitted by the individual. An error in the configuration of some middleware resulted in each submitted form being sent to the next person to submit one: the (n-1)th form went to the nth submitter. The issue was spotted and resolved quickly, and only a small number of people were affected by it. However, it still happened.

The form contained a variety of questions, one of which enquired whether the applicant had any medical conditions, disabilities, or other special needs. The question was a simple “Yes/No”.

The number of individuals affected was far below the mandatory data breach reporting threshold in Ireland (100 individuals affected is the threshold). However, if the form contained any sensitive personal data or sensitive financial data it would need to be reported.

The question relating to medical or special needs was just a “yes/no”. But it was “Sensitive Personal data” within the meaning of the Data Protection Acts as it related to physical or mental health.

We contacted the DPC on behalf of the client to verify that that question, even if it only captured a “yes/no”, constituted Sensitive Personal data and would therefore require notification of the breach to the DPC under the Data Security Breach Code of Practice. The DPC deals with a tidal wave of notifications, and one thing we try to do is to ensure they only get the ones that need to be notified. We were advised that yes, that was the case and that notification of the breach would be required.
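The reporting rule as described above can be sketched as a small decision function. This is a simplified illustration of the rule as stated in this article, not the text of the Code of Practice: the names `Breach` and `must_notify` are hypothetical, the figure of five affected individuals is invented for the example, and treating "more than 100" as the trigger is an assumption.

```python
from dataclasses import dataclass

# Threshold quoted in the article. Treating "more than 100" as the
# trigger (rather than "100 or more") is an assumption.
REPORTING_THRESHOLD = 100


@dataclass(frozen=True)
class Breach:
    """Hypothetical summary of a data security breach."""
    individuals_affected: int
    has_sensitive_personal_data: bool
    has_sensitive_financial_data: bool


def must_notify(breach: Breach) -> bool:
    """Return True if the breach should be notified to the regulator."""
    # Sensitive data forces notification regardless of the numbers involved.
    if breach.has_sensitive_personal_data or breach.has_sensitive_financial_data:
        return True
    return breach.individuals_affected > REPORTING_THRESHOLD


# A small breach (the count of 5 is illustrative) that includes a
# yes/no health question.
incident = Breach(
    individuals_affected=5,
    has_sensitive_personal_data=True,
    has_sensitive_financial_data=False,
)
print(must_notify(incident))  # True
```

The point of the sketch is that the outcome depends entirely on how the health question is classified: flip `has_sensitive_personal_data` to `False` and the same incident falls below the threshold.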

This raises two important points about the importance of data classification and metadata:

  1. The way you classify your data, and the correct classification of that data, can affect your ability to comply with your obligations under the Data Protection Acts and, in particular, the data security breach code of practice. Given that the Draft Data Protection Regulation imposes a fine of 2% of global turnover for failing to comply, errors in data classification could be costly.
  2. The way you design your data is important. A “No” is still a recording of a fact relating to physical or mental health in the example above. It is a definitive statement. It would be far better to record NO FACT – a NULL value where there was no decision to be driven from a “Yes”. This is a simple “tickbox”, but in the implementation of the data model it would be important that that tick box record NULL rather than 0 in the database as that way no data is being recorded.
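The second point can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table name, column names, and the helper `to_db_value` are hypothetical, chosen only for this example; the idea is that an unticked box is stored as NULL (no fact recorded) and the schema refuses to store a 0.

```python
import sqlite3


def to_db_value(ticked: bool):
    """Map a tick box to a stored value: 1 if ticked, otherwise NULL (None)."""
    return 1 if ticked else None


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE expression_of_interest (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        -- Only 'ticked' (1) or 'nothing recorded' (NULL) may be stored.
        has_special_needs INTEGER
            CHECK (has_special_needs IS NULL OR has_special_needs = 1)
    )
    """
)
conn.executemany(
    "INSERT INTO expression_of_interest (name, has_special_needs) VALUES (?, ?)",
    [("Applicant A", to_db_value(True)), ("Applicant B", to_db_value(False))],
)

rows = conn.execute(
    "SELECT name, has_special_needs FROM expression_of_interest ORDER BY id"
).fetchall()
print(rows)  # [('Applicant A', 1), ('Applicant B', None)]
```

With the `CHECK` constraint in place, an attempt to insert 0 raises `sqlite3.IntegrityError`, so a definitive "No" cannot be recorded by accident.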

The experience handling this incident does raise one question though:

  • If questions relating to physical and mental health are considered sensitive personal data in the context of a data security breach notification, why does the Department of Education continue to insist that those same types of questions in POD are not sensitive personal data? (See answer to question 3 in this document.)

This is a final issue to watch for in metadata definition and the application of data classifications: CONSISTENCY. Inconsistent application of data classification rules can lead to situations where data security breach notification requirements are not met, or where other decisions and actions are taken with data which could give rise to data quality issues or criminal offences under Data Protection laws.

The solution to the consistency conundrum is, of course, effective Data Governance, either through your Data Governance office or your Data Protection Officer. Defining rules for determining the correct classification of data is essential. Ensuring that the decisions about classification are taken at the right level in the organisation (decision rights and accountabilities) is likewise important. Ultimately, should the worst case scenario arise, your organisation will need to be able to justify to a Regulator and, almost inevitably, the media why you took the decisions you did.
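One way to picture consistent classification is a single, governed register that every process consults, rather than each team deciding for itself. The sketch below is illustrative only: the field names, the `Classification` enum, and the functions `classify` and `contains_sensitive` are all hypothetical.

```python
from enum import Enum


class Classification(Enum):
    PERSONAL = "personal"
    SENSITIVE_PERSONAL = "sensitive personal"


# Hypothetical register, owned by whoever holds the decision rights.
# The classification attaches to the field, not to the value stored in it,
# so a yes/no health question is sensitive whatever the answer.
FIELD_CLASSIFICATION = {
    "name": Classification.PERSONAL,
    "email": Classification.PERSONAL,
    "has_special_needs": Classification.SENSITIVE_PERSONAL,
}


def classify(field: str) -> Classification:
    """Look up a field's classification; unclassified fields are an error."""
    try:
        return FIELD_CLASSIFICATION[field]
    except KeyError:
        raise LookupError(f"No classification recorded for field {field!r}") from None


def contains_sensitive(fields) -> bool:
    """Return True if any of the named fields is sensitive personal data."""
    return any(classify(f) is Classification.SENSITIVE_PERSONAL for f in fields)


print(contains_sensitive(["name", "email", "has_special_needs"]))  # True
```

Raising an error for an unknown field is a deliberate choice in this sketch: a field nobody has classified is surfaced for a decision instead of being silently treated as non-sensitive.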
