Invasive tracking ‘endemic’ on sensitive support websites

Websites set up by police, charities and universities to help people get support for sensitive issues like addiction and sexual harassment are deploying tracking technologies that harvest information without proper consent

Sebastian Klovig Skelton, Data & ethics editor

Published: 04 Jun 2024 15:47

Dozens of university, charity and policing websites designed to help people get support for serious issues like sexual abuse, addiction or mental health are inadvertently collecting and sharing site visitors’ sensitive data with advertisers.

A variety of tracking tools embedded on these sites – including Meta Pixel and Google Analytics – mean that when a person visits them seeking help, their sensitive data is collected and shared with companies like Google and Meta, which may become aware that a person is looking to use support services before those services can even offer help.

According to privacy experts attempting to raise awareness of the issue, the use of such tracking tools means people’s information is being shared inadvertently with these advertisers. In many cases, this happens as soon as they enter the sites, because analytics tags begin collecting personal data before users have interacted with the cookie banner.

Depending on the configuration of the analytics in place, the data collected could include information about the site visitor’s age, location, browser, device, operating system and behaviours online.

While even more data is shared with advertisers if users consent to cookies, experts told Computer Weekly the sites do not provide an adequate explanation of how their information will be stored and used by programmatic advertisers.

They further warned the issue is “endemic” due to a widespread lack of awareness about how tracking technologies like cookies work, as well as the potential harms associated with allowing advertisers inadvertent access to such sensitive information.

Stef Elliott, a data governance and protection expert who has been raising the alarm about these practices since he noticed the issue in mid-2023, has since identified more than 50 sensitive sites with this kind of setup, including support sites related to sexual abuse, health conditions and the protection of children.

While The Guardian reported on the Met Police’s use of the Meta Pixel tool on its website in July 2023 after Elliott flagged the issue, he said the problem is much deeper than one organisation’s use of one particular tracking tool.

“There seems to be a real lack of understanding of the potential harms that can be associated with this,” he said, adding a major part of the problem is that the analytics are added to websites by developers as standard practice, without taking into account the sensitivity of a given site.

Elliott said although tracking technologies can be helpful to organisations for a variety of reasons, he has an issue with them being used without due care and attention. But whenever he raises the issue with authorities, whether that be the UK’s data regulator or his local MP, he gets “tumbleweeds” in return.

Given the sensitivity of the data being collected, Elliott and other experts are concerned that people may be discouraged from seeking much-needed assistance if they believe sensitive data about them is being sent to third parties.

Online privacy researcher Mark Richards, for example, said that when people enter sensitive physical environments like a doctor’s surgery or teacher’s office, there is a duty of care on organisations and people to safeguard the vulnerable people there.

“The basic premise that ‘these four walls are safe so you can talk to me and no one else will hear you’ is broken,” he said. “It is no longer true – when you’re online, you’re being watched, someone knows you walked into that office and the topic you’re trying to handle … it’s an intrusion into a space which is supposed to be protected.”

On the harms entailed by this widespread use of invasive tracking, Richards said if people do not trust a system, they will not use it.

“If you’ve got a child who’s depressed, who is having trust issues with the environments around them, and then they’re told to go into an environment which starts lying to them by showing them cookie banners saying they’re not going to be tracked; while at the same time they see their privacy blocking tool saying there’s Facebook and Google and YouTube loading, they’re going to start thinking, ‘how do I trust this website?’,” he said.

Digital surveillance on sensitive sites

While the experts Computer Weekly spoke with are choosing to not disclose which specific sites are affected so that people are not deterred from seeking help, technical breakdowns of the tracking on multiple sites have been shared with Computer Weekly to confirm the data collection and sharing taking place.

Giving the example of a visitor to a police sexual offences reporting page, Elliott said that with the tracking setup currently in place, cookies would be immediately deployed to start harvesting information about the user, which is linked back to individualised profiles that advertisers can use to target them.

While this takes place via login cookies dropped when users are signed into, for example, their Google or X accounts, such advertisers can also build up profiles of people without accounts through the use of advertiser IDs, which use various methods such as browser fingerprinting and IP addresses to correlate users with online activity.

Comparing the tracking to other forms of surveillance, Elliott said it was “more precise” than using video cameras or facial recognition because of the way it allows a wide variety of information about specific people to be linked to individualised profiles of them.

Highlighting another example of a university’s sexually transmitted infections (STI) webpage, Elliott said the page asks for people’s ages, postcodes and sexual preferences, then sends them to a search results page with that information in the URL, all of which takes place before any interaction with the cookie banner.
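
As a rough illustration of why form answers placed in a URL matter, the sketch below parses a made-up results-page address and lists what any script that reports the page URL to a third party would be able to read. The domain, path and parameter names (`age`, `postcode`, `preference`) are hypothetical and are not taken from the site Elliott described.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names; a real site could use anything.
SENSITIVE_KEYS = {"age", "postcode", "preference"}


def fields_visible_in_url(page_url: str) -> dict:
    """Return the sensitive form answers that are readable from the URL alone."""
    pairs = parse_qsl(urlsplit(page_url).query)
    return {key: value for key, value in pairs if key in SENSITIVE_KEYS}


# Invented example address (the .example domain is reserved for documentation).
EXAMPLE_URL = (
    "https://university.example/sti/results"
    "?age=19&postcode=AB1+2CD&preference=x&page=2"
)
```

Calling `fields_visible_in_url(EXAMPLE_URL)` returns the three form answers without any cookie being involved, which is why a tag that simply records page addresses can end up receiving them.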

“My issue is that if you knew that data was going to Facebook and Google, would you go on to the STI site and enter that data?” he said. “I think it undermines the purpose of the site, which is to help people in distress, in trauma.”

Tracking examples

Web tracking refers to the harvesting of information provided by users – both directly and indirectly – while they are visiting websites or using internet-based services like email or social media, and then reliably linking this information to a particular individual or device.

From a technical standpoint, tracking techniques fall into two broad categories: those that store data on a user’s device, generally known as stateful, and those that do not, known as stateless.

Cookies – which were originally implemented in web communications by Netscape programmer Lou Montulli in 1994 to allow users of e-commerce sites to store their items in a virtual shopping cart – are the most common version of the former technique, and are essentially small text files stored by web browsers to save information about users.

While stateless tracking works in a similar fashion, in that it allows specific individuals to be linked to specific online activity, it instead uses unique device configurations to identify people without storing any data directly on their machine.

As an example, while a stateful tracking technique would identify users by storing and exchanging a cookie between the client’s browser and the web server, a stateless tracking technique (such as device fingerprinting) would run a script to gather a range of information about the device, which can then be used to generate a unique hash value as an identifier.
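
The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular tracker is built: the cookie name `visitor_id`, the function names and the device attributes are all assumptions chosen for the example, and real fingerprinting scripts gather far more signals.

```python
import hashlib
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple


def stateful_id(cookie_header: str) -> Tuple[str, Optional[str]]:
    """Stateful: read the identifier from a cookie, or issue a new one.

    Returns (visitor_id, set_cookie_value). The second item is None when
    the browser already sent the cookie back.
    """
    received = SimpleCookie()
    received.load(cookie_header)
    if "visitor_id" in received:
        return received["visitor_id"].value, None

    new_id = uuid.uuid4().hex
    issued = SimpleCookie()
    issued["visitor_id"] = new_id
    issued["visitor_id"]["path"] = "/"
    # This string would be sent in a Set-Cookie response header and
    # stored on the visitor's device.
    return new_id, issued["visitor_id"].OutputString()


def stateless_id(device_attributes: dict) -> str:
    """Stateless: derive an identifier by hashing device characteristics.

    Nothing is stored on the device; the same attributes always produce
    the same hash.
    """
    canonical = "|".join(
        f"{key}={device_attributes[key]}" for key in sorted(device_attributes)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

In the stateful case the identifier only persists if the browser keeps and returns the cookie. In the stateless case, clearing cookies changes nothing, because the identifier is recomputed from the device itself on every visit.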

While the original intent behind cookies was to provide users with more cohesive and personalised experiences online, Montulli has since described the invasive data collection and advertising they have enabled as “detrimental to society”.

Looking at the tracking in place across the sensitive support sites, it is not limited to one technology or method, and can be identified using the developer tools section of various browsers.

In one charity website built to help people secure their personal devices in domestic abuse situations, using the developer tools to inspect the technical make-up of the page as it loads shows that calls are made through a Meta Tag, which fires a collection event using Meta Pixel.

If logged into Facebook, a dataset is provided and recorded by the company, including the fact the person has visited the page URL and associated event information. This page is served without a cookie banner.

In relation to the site of a different domestic violence charity, while it did have a cookie banner in that instance, the technical analysis provided to Computer Weekly shows it is not effective, as it has been set up to immediately deploy Google Analytics and advertising tags. The same site also deploys the Google collect event before a user has interacted with the cookie banner.
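
A check of this kind can be approximated from a HAR file, the network log that browser developer tools can export. The sketch below assumes the standard HAR layout (`log.entries[].request.url` and `startedDateTime`) and flags requests to a short list of tracker hostnames made before the moment the visitor clicked the cookie banner. The hostname list is illustrative rather than exhaustive, and the function name and consent timestamp are assumptions for the example, not part of the analysis shared with Computer Weekly.

```python
from datetime import datetime
from urllib.parse import urlsplit

# Illustrative, non-exhaustive list of hosts used by Google and Meta tags.
TRACKER_HOSTS = {
    "www.google-analytics.com",
    "www.googletagmanager.com",
    "connect.facebook.net",
    "www.facebook.com",
}


def _parse_time(stamp: str) -> datetime:
    # HAR timestamps are ISO 8601; older Python versions need "Z" rewritten.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def pre_consent_tracker_requests(har: dict, consent_time: str) -> list:
    """List tracker request URLs that started before the consent click."""
    cutoff = _parse_time(consent_time)
    flagged = []
    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        started = _parse_time(entry["startedDateTime"])
        if urlsplit(url).hostname in TRACKER_HOSTS and started < cutoff:
            flagged.append(url)
    return flagged
```

Loading an exported file with `json.load` and passing the result to `pre_consent_tracker_requests` gives a list of calls that fired before any choice was made. An empty list would suggest the banner is holding tags back, at least for the hosts on the list.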

In the example of a regional police website, a user’s information is collected via Google Analytics and various Google ad products when they click accept in the cookie banner. However, while someone would have to accept the cookies in this instance, experts told Computer Weekly the force does not provide adequate information about the tracking in place, and how people’s data will be stored and used by programmatic advertisers.

On another policing website, accepting the cookies will similarly launch Google Analytics (Universal and GA4) plus Google advertising code. The presence of these various trackers then allows event data and the client identifier (cid) to be captured. The cid acts as the match key for linking data points, and can be used to distinguish visitors along with other sensitive data points.
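
To show how a `cid` value works as a match key, the sketch below groups a handful of invented analytics hits by that value. The parameter names (`cid` for the identifier, `dl` for the page address) follow the publicly documented Universal Analytics Measurement Protocol, but the hit URLs, identifiers and page addresses are made up for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def pages_by_client_id(hit_urls: list) -> dict:
    """Group the page addresses reported in analytics hits by their cid value."""
    pages = defaultdict(list)
    for hit in hit_urls:
        params = parse_qs(urlsplit(hit).query)
        cid = params.get("cid", [None])[0]
        page = params.get("dl", [None])[0]
        if cid is not None and page is not None:
            pages[cid].append(page)
    return dict(pages)


# Invented hits; only the parameter names mirror the documented protocol.
EXAMPLE_HITS = [
    "https://www.google-analytics.com/collect?v=1&t=pageview&cid=111.222"
    "&dl=https%3A%2F%2Fforce.example%2Freport-an-offence",
    "https://www.google-analytics.com/collect?v=1&t=pageview&cid=333.444"
    "&dl=https%3A%2F%2Fforce.example%2Fcontact",
    "https://www.google-analytics.com/collect?v=1&t=pageview&cid=111.222"
    "&dl=https%3A%2F%2Fforce.example%2Fsupport-services",
]
```

Because two of the three hits share the same `cid`, the separate page views collapse into a single browsing history for that visitor.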

Commenting on the nature of the digital surveillance underpinning programmatic advertising, Richards likened it to a criminal act of trespass.

“Whenever you’re on a device … you’re doing your personal activities in your own environment. And when you’re doing them, you’re being watched through your own device,” he said. “To me, this is a sense of trespass – if someone was in my house, watching me do things, taking notes, I would feel like the police should be here taking them outside.

“They’re like Peeping Toms in some ways, peeking through even though they’re not supposed to … there’s no restraint, the digital streets you walk, they’re following.”

Richards added that while many people are at least attempting to claw back some privacy by, for example, using ad blockers – one of the most downloaded tools on the internet – or buying iPhones due to their higher degree of privacy over Android devices, “the saddest part is as much as people try to avoid all of this, they haven’t got much hope”.

He said that while it can be hard to quantify the cost of privacy invasions in financial terms, there can be clear emotional impacts from a loss of privacy, and that people’s attempts to stop their privacy being abused shows there is “obviously a sense of dislike and distrust in the system that modern IT has created”.

A black box ecosystem

Those Computer Weekly spoke with also highlighted the “black box” nature of the online advertising ecosystem this sensitive data is then sent into, noting that once data is taken from a person’s device and sent to the likes of Google, Meta and others, there is very little visibility over how it is used.

“They’re using search algorithms, they’re using AI, they’re using machine learning, they’re using techniques to try and correlate users to things they think the users will be interested in, and things they think users will buy, to make money,” said Richards.

“Do you think it’s okay that when a child visits a suicide support website, we’re now trusting that Facebook’s algorithms will pick up the child is interested in suicide and will present the appropriate adverts and recommended content for that child’s situation? What does that even look like? How do they avoid the machine-learnt risk that the best thing financially to show them is likely booze?”

Richards added the way in which advertisers collect and combine people’s personal data to make inferences about them as both individuals and groups can easily become very discriminatory, as it essentially runs off of stereotyping.

“You have organisations sitting there picking out topics and choosing how they’re going to target people,” he said, adding that in practice this means “picking out stereotypes of specific groups and using that as a means to reach out to that group of people”.

Mariano delli Santi, a legal and policy officer for Open Rights Group, added that the systemic use of tracking tools to monitor people’s behaviour for the purposes of serving them ads is particularly harmful for those with vulnerabilities or addictions.

Noting the difficulty behavioural advertising often has in accurately inferring people’s intentions, characteristics or status – “if you’re reading an article about a strike, that doesn’t tell whether you agree with the strike or not, so everything is based on guessing” – delli Santi said it is good at picking up on and exploiting people’s compulsive behaviours because of the regular and repetitive nature of these actions.

“The existing system of advertising is a system that inherently favours this exploitive model of online targeting, exactly because any system that profiles behaviour is a system that is very good at identifying vulnerabilities and addictions, and is very bad at identifying everything else,” he said. “There is a perverse incentive to target people based on their weak spots.”

Delli Santi also highlighted the fact that once the data is collected by the sensitive sites in question, there is simply no knowing what happens to it, or what other data about you it can be combined with, especially when purchasing further information about people from data brokers is so easy and accessible.

However, Shane Gohil, a data protection officer at DPO Centre, questioned whether the potential harms to individuals outweighed the good achieved by the sensitive sites, given how difficult it is to pinpoint specific harms that stem from a programmatic advertiser’s access to any given data point or the subsequent inferences made off the back of it.

He added that while it is “very difficult to track because of the advert ecosystem”, the severity of harms will be different depending on the context of the data collection.

“Let’s say it was gambling addiction, for example – if that went into the advertising ecosystem, who’s to say that gambling firms cannot serve those individuals ads? Because their position will be, ‘I don’t care if you're an addict or not, the fact is, you make me money’. I know it sounds awful, but that’s the way commercial vehicles will operate,” he said.

“The difficulty with other things – for example, visiting self-harm pages or police services – is I just don’t see how that materially affects someone. How could they bring damages, how could they demonstrate they were caused distress by this?

“Merely saying, ‘I don’t feel great about that information being in the ad system because they know I’ve been on that website’, doesn’t really constitute a [legal] harm or damage to someone.”

Gohil added it is much easier to legally demonstrate direct harm from the collection of sensitive personal data in other contexts, such as when it is used for things like insurance or claims handling. He also said that while personal data from the sensitive sites may be added to behavioural profiles, it is still legally very tricky to pinpoint the harm to the specific data collected during a site visit.

Despite the legal difficulty of linking specific harms to the collection of specific data points, Gohil added: “Would you tell your doctor everything if you looked over and there was someone in the room? That would make me think twice.”

A lack of awareness

All those Computer Weekly spoke with said the core reason why such invasive tracking has been allowed to flourish is a general lack of awareness around how these technologies work in practice.

“My concern is that we’ve got to the point where we have a lack of awareness of the tracking tools that are available for businesses, and the potential associated harms to people,” said Elliott, who noted it is standard for tracking tools to be embedded on websites from the get-go.

“When you set up a system, you either build it, buy it, or borrow it. I think lots of people have learnt the rudiments of website development, but don’t fully understand the functionality they’re implementing, and therefore the associated risks and harms they’re exposing individuals to.”

Gohil added that while on the one hand it’s hard for organisations to argue ignorance of the UK’s laws on digital tracking given they are now 20 years old, the readily available nature of today’s tracking tools – “which is simply copying a piece of code and placing it into your website” – means they get overlooked, especially by organisations like charities that are generally already strapped for resources.

“I visit many organisations and part of my overall audit would be to assess their website and their use of cookies,” he said. “And whilst I can say, ‘You’re non-compliant in this area’, having the legal [and] technical expertise to remove these cookies and set them up correctly is a resource that many charities probably don’t have.

“I see this a lot in SMEs and in small charities who genuinely want to do good things, but they’ve got much bigger problems – they’ve got no IT provider, shadow IT, volunteers managing the data and doing work, so there’s a balance.”

The role of big tech

For delli Santi, the root cause of pervasive behavioural tracking online is the structure of today’s internet platforms and their market dominance.

Noting that a lot of today’s web development relies on the use of plugins and tools that have been “pre-compiled” by big tech advertisers to extract and share people’s data, delli Santi said the standard inclusion of tracking tech by these firms acts as a way of reinforcing their market dominance, as it essentially brings more and more data into their orbit without them having to do anything.

Highlighting that tracking tools are even embedded in software developer kits (SDKs), which are platform-specific software packages used to build applications, delli Santi said “these tracking technologies are effectively viral”.

He added this tracking is “particularly pernicious” because it taps into the dynamics around web development, including that organisations want websites set up as cheaply and quickly as possible, and that many web developers will not necessarily be well-versed in how to code or the technical ins and outs of how the tracking tools they’re embedding work.

“You can clearly see the commercial value of having this system of surveillance installed, even if you [as the developer] didn’t necessarily want to, which is that as a digital platform I’m now getting access to your browsing habits in an environment where I wouldn’t otherwise have access to it,” he said. “Of course, we’re talking about very sensitive information here, so it’s even more valuable.”

Commenting on the responsibility of big tech advertisers, Elliott said firms that provide such tracking services will cite the fact their Ts&Cs prohibit advertising surveillance on particularly sensitive sites, and that they will flag improper uses of tracking to their customers, adding: “I’ve never heard of anyone having this flagged.”

However, delli Santi noted that while big tech firms have a responsibility to stop offering this tracking as standard, the organisations setting up the websites are data controllers in their own right, and are therefore responsible for installing software that performs tracking on behalf of these firms.

“If you’re writing a website and you have something that is ready to be deployed, ready to copy and paste, that’s perfect,” he said. “On the other hand, you’re responsible for how the personal data being collected in this way is used, because ultimately the decision to copy and paste that web code, to implement that plug-in, to implement the functionality of your website, was your decision, nobody else’s decision.”

Gohil ultimately came to a similar conclusion, noting that while he is very sympathetic to the pressures smaller organisations face when it comes to dealing with technical matters like data protection and tracking, there is no legal excuse at the end of the day.

“This is something that will unfortunately fall, when it comes down to data protection law, on the data controller, which is fundamentally the organisation that’s chosen to use these tools,” he said.

Google and Meta respond

Computer Weekly contacted both Google and Meta about the tracking and the claims made by data protection experts.

“Our policies require advertisers to have the necessary rights and permissions, including people’s consent where applicable, to use our Business Tools data, and we don’t want or permit advertisers to send sensitive information about people through our Business Tools,” said a Meta spokesperson. “We educate advertisers on properly setting up our Business Tools to prevent this from occurring. Our system is designed to filter out potentially sensitive data it is able to detect.”

According to Google, measurement tools like Analytics help businesses understand how users engage with their websites and apps through aggregate reports that provide insights into patterns of behaviour of their traffic and the performance of their online properties, all without identifying individual users.

The search giant also contends that Google Analytics does not combine data of different customers or track users across unaffiliated websites, and that the data collected by customers using Analytics would only be used for advertising purposes for that specific customer, and only if the customer links their Google Analytics account with their Google Ads account, exporting their own data for their own use.

On the use of wider tracking tools, Google said its customers own and control the data they collect on their properties, that it does not use their measurement data for its own ad targeting or profile building, and that businesses are required to give visitors proper notice of and, where legally required, obtain their consent for their collection of data using Google Analytics on their properties.

It further added that its policies do not allow serving advertisements to people based on sensitive information such as health, ethnicity, sexual orientation or negative financial situations, and that it has strict policies in place to prohibit customers from using Analytics to collect protected health information.

These customers are also prohibited from uploading any information to Google that could be used by the company to identify an individual.

On the use of Tag Manager, Google said that if a business uses the Tag Manager, the tool itself does not collect, retain or share any information about site visits, and is instead a tool to help customers manage the behaviour of the tags they place on their websites.

A lack of enforcement

While Elliott first raised the issues around websites inadvertently sharing sensitive personal data with third-party advertising platforms with the Information Commissioner’s Office (ICO) in July 2023, the regulator is yet to take any action in helping the affected organisations mitigate the risks to site visitors.

In his latest correspondence to the ICO, in mid-March 2024, Elliott noted that while the tracking activity was halted on the police.uk website’s sexual assault page, this only happened in response to press coverage at the time, and that the same functionality is deployed across 26 of England and Wales’ 43 police forces.

“[One policing website] alone inadvertently shared personal data on 245 individuals seeking support! I can’t confirm the total number of people impacted as I am awaiting a significantly overdue FOI reply,” Elliott told information commissioner John Edwards in an email.

He added that, nine months since he originally reached out to the ICO, “the endemic leaking of personal data through support sites, raised in my letters, continues”.

Elliott concluded the letter by asking whether the ICO had been in touch with any of the organisations highlighted to make them aware of the issues and help them with mitigation, and if it could provide a roadmap for when specific guidance would be issued by the regulator, but is yet to receive a response.

The data regulator wrote to 53 of the UK’s top 100 websites in November 2023, warning them that they faced enforcement action if they did not make changes to advertising cookies to comply with data protection law. Of those contacted, 38 changed their cookie banners in ways that achieved compliance.

“We expect all websites using advertising cookies or similar technologies to give people a fair choice over whether they consent to the use of such technologies. Where organisations continue to ignore the law, they can expect to face the consequences,” it said in a January 2024 press release.

“We will not stop with the top 100 websites. We are already preparing to write to the next 100 – and the 100 after that.”

In 2019, the ICO issued a report titled Update report into adtech and real time bidding, which found that online advertising companies were failing to comply with the law in key areas such as legality of data processing, transparency, use of sensitive data, accountability requirements and ensuring an adequate level of security throughout the supply chain.

“The creation and sharing of personal data profiles about people, to the scale we’ve seen, feels disproportionate, intrusive and unfair, particularly when people are often unaware it is happening,” wrote the ICO. “We outline that one visit to a website, prompting one auction among advertisers, can result in a person’s personal data being seen by hundreds of organisations, in ways that suggest data protection rules have not been sufficiently considered.”

In a since-deleted blog post published by the ICO in January 2020 about its adtech actions, the regulator’s then-executive director of technology and innovation, Simon McDougall, said: “The reform of real-time bidding has started and will continue,” noting that while industry engagement has been positive, much more still needs to be done to bring transparency and ensure the processing of personal data in advertising ecosystems is lawful.

“The most effective way for organisations to avoid the need for further regulatory scrutiny or action is to engage with the industry reform and transformation, and to encourage their supply chain to do the same,” he said. “I am both heartened at how much progress we have made, and disappointed that there are some who are still ignoring our message. Those who have ignored the window of opportunity to engage and transform must now prepare for the ICO to utilise its wider powers.”

However, Open Rights Group (ORG) has previously told Computer Weekly that, to date, the ICO has not taken any regulatory action against data protection infringements in the online advertising space that were revealed as a result of the regulatory update report.

Computer Weekly contacted the ICO about the tracking in place on sensitive support sites and every aspect of the story. A spokesperson responded that organisations using cookies and tracking pixels have responsibilities to use these technologies lawfully, fairly and transparently.

“We want them to make it easy for people to understand what is happening to their information and whether they want to give their permission for it to be shared,” they told Computer Weekly.

“We expect organisations providing these technologies to take action too. All too often there’s a lack of accountability for how these tools collect and use people’s personal information, with poor transparency and deceptive design.”

The spokesperson added that the ICO wrote to more than 20 NHS trusts and health charities using Meta Pixel last autumn to remind them of their responsibilities.

“We’re also engaging directly with companies providing these technologies, including Meta, to make our expectations clear. Tackling the potential harms caused by advertising technology is a priority for the ICO and we will not hesitate to act decisively to protect the public,” the spokesperson said.

“In November, we warned the top 100 websites in the UK that they faced enforcement action if their ‘reject all’ button for cookies was not as prominent as their ‘accept all’, achieving great success so far with more to come. We’ve noted the information that has come to light through this report and will be considering this matter further.”

In lieu of formal regulatory action from the ICO, delli Santi said antitrust legislation around digital markets – which there is a growing push for in Europe and the US – could help by stopping big tech firms from providing tracking technologies as standard in SDKs and other software development tools.

“Focusing on the small player like the charity is not going to solve the issue,” he said. “We know they probably never meant to share the data with these platforms in the first place, so there is a problem of market dominance and market dynamics which needs to be addressed.”

However, he added: “What we really need in the end is institutions which have the strength and integrity to actually take this action.”

For Richards, part of moving towards a fix is looking at how the modern internet has been built around tracking, and questioning the value of this to the general public.

“We’re getting content on social media platforms for free and we’re getting content from publishers which is subsidised. But that subsidy is not without cost,” he said. “A very large percentage of that is going straight to the advertising and tech industry to maintain a system that tracks us, to give profits to a few monopolies who managed to corner the market.”

However, regarding the technology itself, Richards added: “It needs the regulator to enforce the law.”
