Automation is Great – Until it’s Not

By Carey Lening

July 22, 2021

35min read

Oftentimes, I find myself staring at Excel, slaving away on some menial data entry task. A thought pops into my mind: “I should really automate this thing, because this is hell, and if I write a little script, it’ll go so much faster!” With the excitement of someone whose programming skills rate only slightly above copying and pasting directly from Stack Overflow, I confidently march upstairs to seek the wise counsel of my husband, a software engineer by trade. I explain my problem, the envisioned hours of time I’ll recover by automating the thing, and my insanely naive assumptions about how hard it will be to write the program I need.

He patiently listens to me, sporting a slight, good-natured smirk. See, he’s heard this all before. It’s almost a weekly occurrence. Nearly every time, he reminds me that unless it’s routine, ongoing, and something that’ll take longer than a few hours to do, it’s probably not worth automating. If it’s nuanced and fiddly, it better be something that really sucks up your soul, because that will just be harder still.

I share this random window into my soul as a way of emphasizing that generally, I’m a fan of automation. I love tools that make people’s lives better and more fulfilling. And I’m a zealous advocate of privacy technologies that help people take back power when it comes to their data.

The Automation Trend

But lately, I’m noticing a trend, and it’s a worrying one. Bright-eyed engineers (with admittedly better coding skills) are trying to automate privacy and data protection challenges. While some do this rather sensibly, others, I think, have missed the mark. This is particularly true for the growing list of data erasure tools that have come to market over the last year or two. Each of these services pursues a noble goal: providing automation to streamline the confusing and tedious process that users must undergo when making an Article 17 erasure request under the GDPR, or a deletion request under the California Consumer Privacy Act or the Virginia Consumer Data Protection Act.

But my goal isn’t to lambast automated privacy tools. Rather, I want to nudge companies developing privacy-enhancing technologies to think critically about data protection and privacy in a way that actually solves the problem, instead of merely monetising privacy fears with opaque tools that nobody understands and that ultimately don’t get the job done because their makers don’t understand the problem.

Stop Monetizing Fear

Each of the services I reviewed offered paragraphs of breathless prose on how your privacy rights are at risk and under attack all the time. There’s often at least one or two bits about how users need to rise up against the privacy-sucking vampires of capitalism, and take back their data. Some highlight alarming statistics around data breaches or identity thieves, the sheer volume of companies who collect our data, or worse, how we’d all be better off if we owned and could commoditize our data. All of these things are true, but they’re also scary. And scared (and/or angry) is the emotional state these services want consumers to be in.

This approach leads to a few follow-on problems. Firstly, it doesn’t account for differing levels of risk. If you sign up for a newsletter (say, Privacat Insights or Castlebridge Insights) the only thing collected is an email address. If the newsletter site gets breached, that’s a bummer for me or Castlebridge (as site owners), but your privacy risk is extremely low because we don’t collect any additional information. But on the data erasure sites, all risks are valued equally — whether it’s Facebook and Google or a fairly benign, occasional website update. There’s no prioritization and no nuance.
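To make the point about prioritization concrete, here is a minimal sketch of what risk-weighted triage might look like. Everything in it is an assumption for illustration: the data categories, the weights, and the names `Service`, `risk_score` and `prioritise` are hypothetical, and a real risk assessment would need far more context than a lookup table.

```python
from dataclasses import dataclass

# Hypothetical sensitivity weights. The categories and numbers are
# illustrative assumptions, not values taken from any law or tool.
SENSITIVITY = {
    "email": 1,
    "name": 1,
    "location": 3,
    "financial": 4,
    "health": 5,
}


@dataclass(frozen=True)
class Service:
    name: str
    data_categories: frozenset


def risk_score(service):
    """Sum the weights of the data categories a service holds.

    Categories missing from SENSITIVITY count as 1.
    """
    return sum(SENSITIVITY.get(c, 1) for c in service.data_categories)


def prioritise(services):
    """Order services from highest to lowest score."""
    return sorted(services, key=risk_score, reverse=True)


newsletter = Service("newsletter", frozenset({"email"}))
social = Service(
    "social network",
    frozenset({"email", "name", "location", "financial"}),
)

for service in prioritise([newsletter, social]):
    print(service.name, risk_score(service))
```

Even something this crude would put the email-only newsletter at the bottom of the queue, rather than treating it the same as a service holding location and financial data.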

Secondly, each of these services presents users with a false sense of assurance. If you sign up with them, you’ll finally be secure: you’ll be a ‘ghost’ on the internet, and you’ll be able to ‘take back control’ of your data once and for all. The problem is, many of these sites require users to grant broad levels of access to their email or accounts, or worse, to share boatloads of additional personal information with the erasure service in order for it to act on the data subject’s behalf. These services also completely miss controllers who don’t regularly send emails, or who send emails through processors and service providers, which I’ll get to later.

Show Your Work (AKA, Be Transparent)

None of the half-dozen data deletion/erasure services I surveyed spent more than a paragraph or two discussing how they protect user data. For most, it was a few sentences, couched in the same boilerplate language about ‘taking user privacy and security seriously’. Some services trotted out how they use ‘military-grade encryption’ (ugh), or have passed a security assessment, with no additional details. One company implied that their tool had gone through a ‘strict external security assessment’ with a link to Google’s API Security Assessment framework page. Not a link to their specific security assessment, just a link to Google’s explainer which applies to “any app that accesses consumer data from restricted scopes and sends it to their servers”. It’s disingenuous, and honestly, borders on the deceptive.

If I’m engaging with a service to act as a trusted intermediary to protect my personal data, I want to know how they’re doing it, what they’re storing about me, how they’re using my data, and who they’re sharing things with. I want to understand a bit about how their machine-learning algorithm makes a decision on what constitutes a controller, and how much and what kind of data it needs to make that decision.

Instead, we know nothing of the data being processed and retained by most of these sites, where the data resides, who it’s shared with, or how it is used to train subsequent AI models. We don’t even have good details on the organisational or legal controls that are in place when data is processed or transferred. We don’t know if DPIAs have been or are regularly conducted, or if the tools themselves were built around Privacy by Design/Default principles.

In my opinion (and that of Article 13 GDPR), all companies should be offering this level of detail, particularly those firms peddling privtech. As someone said, ‘be the change you want to see in the world.’ If you’re in the business of helping people regain their privacy, model the seven key principles of data protection in your business practices.

For example, you could take the approach Signal has adopted, by sharing their code and processes online for external privacy and security analysis and review. But I’m not queen of the universe, and I’m also mindful that legitimate IP interests exist. Still, there’s plenty of room for organizations banking on privacy and data protection to step up their game and show their work.

Automating Nuance is Really, Really Hard

During the process of digging into these services, I reached out to a few privacy pros who are in the trenches when it comes to DSAR and erasure/deletion requests. Many folks, including @reggblinker, @nsqe, @privacymatters and @justcallmepips, offered a lot of great food for thought from the data processor point of view. The internal discussions on this topic in the Castlebridge team are probably NSFW. The IAPP has also discussed how these subject request vendors create headaches for processors, and its coverage is worth a read.

In short, processors and controllers want to honor SARs and deletion requests, but often can’t because the approaches used by automated tools don’t match legal reality. Erasure requests, in particular, are rarely binary or absolute. If you look at Article 17 GDPR, for example, there’s a ton of grey that makes a binary choice unworkable.

1. The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay where one of the following grounds applies:

(a)	the personal data are no longer necessary in relation to the purposes for which they were collected or otherwise processed;

(b)	the data subject withdraws consent on which the processing is based according to point (a) of Article 6(1), or point (a) of Article 9(2), and where there is no other legal ground for the processing;

(c)	the data subject objects to the processing pursuant to Article 21(1) and there are no overriding legitimate grounds for the processing, or the data subject objects to the processing pursuant to Article 21(2);

(d)	the personal data have been unlawfully processed;

(e)	the personal data have to be erased for compliance with a legal obligation in Union or Member State law to which the controller is subject;

(f)	the personal data have been collected in relation to the offer of information society services referred to in Article 8(1).

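It’s pretty clear that the right is conditional: the obligation to erase only kicks in where one of these grounds applies, and Article 17(3) goes on to list further exemptions. In other words, the GDPR provides a number of legitimate reasons to refuse an erasure request.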
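As a rough illustration of why a binary ‘delete / don’t delete’ switch doesn’t fit, the grounds quoted above can be sketched as a set of conditions that each need facts only the controller has. This is a simplified sketch, not legal logic: the `ErasureContext` fields and the `applicable_grounds` function are hypothetical names, and the Article 21(2) branch of ground (c) and ground (f) are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ErasureContext:
    """Facts a controller would need to establish (hypothetical model)."""
    still_necessary: bool = True                 # ground (a) applies when False
    consent_withdrawn: bool = False              # ground (b), first condition
    other_legal_ground: bool = False             # ground (b), second condition
    objected: bool = False                       # ground (c), Article 21(1) only
    overriding_legitimate_grounds: bool = False  # ground (c), second condition
    unlawfully_processed: bool = False           # ground (d)
    legal_obligation_to_erase: bool = False      # ground (e)


def applicable_grounds(ctx):
    """Return the Article 17(1) grounds that apply under this simplified model."""
    grounds = []
    if not ctx.still_necessary:
        grounds.append("(a)")
    if ctx.consent_withdrawn and not ctx.other_legal_ground:
        grounds.append("(b)")
    if ctx.objected and not ctx.overriding_legitimate_grounds:
        grounds.append("(c)")
    if ctx.unlawfully_processed:
        grounds.append("(d)")
    if ctx.legal_obligation_to_erase:
        grounds.append("(e)")
    return grounds


# Consent is withdrawn, but the controller has another legal ground:
# under this model, no erasure ground applies.
ctx = ErasureContext(consent_withdrawn=True, other_legal_ground=True)
print(applicable_grounds(ctx))
```

Notice that almost none of these inputs can be read off an email in someone’s inbox. An automated request can assert that a ground applies, but only the controller can check whether it actually does.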

And there’s more nuance…

And there’s one additional bit: under the GDPR, deletion requests, and rights requests in general, govern data controllers, not processors. Processors are obligated to assist controllers with SAR compliance, usually by notifying them that they’ve received a request from a data subject (Article 28(3)(e) GDPR), but not to act independently regarding that request. While the CCPA/CPRA does allow a service provider (which is similar to, but not equivalent to, a processor under the GDPR) to independently act on a request, it also gives them the option to punt to the business that is actually in control of the data.

Many of the requests sent by these automated services fail to account for this fact. For example, processors like MailChimp, SendGrid, HubSpot and Constant Contact send out thousands (or even hundreds of thousands) of emails on behalf of a dizzying number of controllers on a daily basis. But the metadata doesn’t always cleanly identify that fact, and because we have no transparency in terms of how the automation actually works, these processors are getting bombarded with requests they can’t practically or legally act upon.

Because these tools treat every sender in a user’s inbox as a controller, and every deletion request as legally required, controllers and processors have succumbed to alarm fatigue, and have begun to tune these sites out, or worse, added additional friction to the process by requiring the user to manually submit a request. The whole process adds another layer of frustration for everyone: for the controllers or processors who now have to deal with a mountain of poorly-targeted requests, and for data subjects who realize that the automated solution was not as ‘automated’ as they were initially led to believe.
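For the technically inclined, here is a hedged sketch of the kind of check that seems to be missing: looking at whether a message was merely sent via an email service provider before deciding who the likely controller is. I have no visibility into how any of these tools actually work, so this is purely illustrative. The processor domains are made-up placeholders, `likely_controller` is a hypothetical helper, and comparing the `From` and `Return-Path` headers is just one heuristic that will not always be reliable.

```python
from email import message_from_string
from email.utils import parseaddr

# Placeholder domains standing in for email service providers that send
# on behalf of many controllers. These are assumptions, not real
# infrastructure.
PROCESSOR_DOMAINS = {"mailer.example", "bulkmail.example"}


def domain_of(address):
    """Extract the lower-cased domain from an address header value."""
    return parseaddr(address)[1].rpartition("@")[2].lower()


def is_processor_domain(domain):
    """True if the domain is, or is a subdomain of, a listed processor."""
    return any(domain == p or domain.endswith("." + p) for p in PROCESSOR_DOMAINS)


def likely_controller(raw_message):
    """Return (from_domain, sent_via_processor) for a raw email message.

    The From domain is treated as the probable controller. The flag marks
    messages whose bounce address points at a processor instead.
    """
    msg = message_from_string(raw_message)
    from_domain = domain_of(msg.get("From", ""))
    return_domain = domain_of(msg.get("Return-Path", ""))
    via_processor = is_processor_domain(return_domain) and from_domain != return_domain
    return from_domain, via_processor


raw = (
    "From: Shop News <news@shop.example>\n"
    "Return-Path: <bounce@bounces.mailer.example>\n"
    "Subject: Hello\n"
    "\n"
    "Body text"
)
print(likely_controller(raw))
```

In this sketch, a request prompted by the sample message would be addressed to the shop rather than to the mail provider, and anything ambiguous would ideally go to a human for review instead of being fired off automatically.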

What’s the Solution?

Many of the problems I’ve identified centre around issues of understanding, communication and transparency. As I said at the outset, these companies are all serving laudable goals. Giving users better control and transparency over who collects what data about them is a good thing. Empowering users to take charge of their data is awesome. Even automation and the use of creative new machine learning models can be objectively beneficial, provided the development of those tools is not done in isolation. In order to build trust and gain acceptance, privtech providers must be transparent, collaborate with stakeholders (including data subjects, controllers & processors), and understand that the law is dynamic, messy and rarely solved by automation alone.

Companies in this space need to ensure that their internal practices, policies and governance adhere to and advance the underlying principles found in most data protection and privacy laws, namely:

Lawfulness
Fairness
Transparency
Data & storage minimization
Accuracy
Integrity & confidentiality
Accountability

Consider this a shameless plug to point out that Castlebridge has been helping companies get data protection, governance, and strategy right for 12 years. We’d be happy to help, so just get in touch!