Content filtering a potential challenge in digital single market

The proposed digital single market directive is intended to harmonise e-commerce and copyright throughout the European Union, but concerns have been raised over the technological impact this would have on UK industry

The digital single market is a new directive currently being debated within the European Commission, intended to harmonise e-commerce throughout the European Union (EU). It has been estimated that the legislation could contribute €415bn a year to the EU economy and create hundreds of thousands of jobs.

But there are concerns that Article 13 of the directive could have a massive impact on the tech industry, both in the UK and for anyone wanting to operate within the EU. Article 13 is intended to provide copyright holders with the means to protect their works from illegal sharing. However, the broad wording used in Article 13 means this could have wide-reaching implications throughout the technology sector, in terms of both technological issues and legislative difficulties, including with the General Data Protection Regulation (GDPR).

Section 1 of Article 13 in the legislation states: “Information society service providers that store and provide to the public access to large amounts of works or other subject matter uploaded by their users shall, in co-operation with rightholders, take measures to ensure the functioning of agreements concluded with rightholders for the use of their works or other subject matter or to prevent the availability on their services of works or other subject matter identified by rightholders through the cooperation with the service providers.

“Those measures, such as the use of effective content recognition technologies, shall be appropriate and proportionate. The service providers shall provide rightholders with adequate information on the functioning and the deployment of the measures, as well as, when relevant, adequate reporting on the recognition and use of the works and other subject matter.”

As it is currently worded, Article 13 essentially proposes that any online platform that hosts content uploaded by its users will be obligated to identify, filter and block files that infringe copyright.

Currently, rightholders rely on takedown notices to ask hosting companies to remove copyright-infringing content. The proposed Article 13 shifts the onus from the rightholder to the hosting company.

Many of the larger online platforms, such as YouTube or SoundCloud, already employ content recognition filters to stop their users uploading copyrighted content. However, many of the smaller platforms do not, and building such a filter is costly. “This would be a huge financial obligation,” says MEP Julia Reda. “YouTube invested $60m in the development of Content-ID.”

Further complication

Many existing content recognition technologies focus on detecting audio and video formats. However, the broad wording of the proposed article implies that all copyrighted content, such as e-books, research papers or images, must be filtered. This adds a further complication, because content recognition technologies are currently not geared towards text-based or image filtering.

Also, the broad definition of “information society service providers that store and provide to the public access to large amounts of works” means that the scope goes well beyond the media platforms that were the initial focus of the directive. This definition could include cloud storage companies such as Dropbox, as well as remote backup services such as iDrive. “Even if [a platform host] would not be required to install a filter, the proposal would in any case make [them] liable for any copyright infringement of its users, regardless of whether it is giving public access or not,” says Reda.

Cloud storage services do not, by default, provide public access to a user’s uploaded files. However, because users can choose to make files public, these providers could come under the remit of the proposed directive. Also, these companies could be indirectly compelled to install content filters. “Any web host that uses an algorithm to sort the content would be stripped of its limited liability for its user’s content,” says Reda.

This follows from the reinterpretation of the liability exemption in Article 14 of the e-commerce directive. Recital 38 (2) of the proposed digital single market directive reads: “In respect of Article 14, it is necessary to verify whether the service provider plays an active role, including by optimising the presentation of the uploaded works or subject matter or promoting them, irrespective of the nature of the means used.”

This interpretation is not in line with the e-commerce directive, or with its interpretation by the Court of Justice of the European Union (CJEU), which has always maintained that what matters for the application of Article 14 is whether the provider has knowledge of illegal activity.

Extranet providers could be similarly affected, if they are considered public. Extranets are platforms that enable companies to collaborate with contractors and subcontractors on large multidisciplinary projects. Because these projects often involve commercially sensitive or confidential information, having their content scanned for copyright infringement raises serious concerns.

Cloud storage providers often use encryption to protect their users’ content, which could hinder content filtering. “Dropbox and other cloud storage providers talk about their content being encrypted as it is going into store,” says Colin Tankard, managing director of Digital Pathways. “So the ability to actually scan that content for any licence infringement becomes impossible.”
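Tankard's point can be illustrated with a minimal Python sketch. It assumes the user, not the host, holds the encryption key, and it uses a toy hash-based stream cipher and a plain SHA-256 digest as stand-ins for real encryption and real content fingerprinting. All names here are hypothetical and nothing in it reflects how Dropbox or any other provider actually works.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream built from SHA-256 in counter mode (illustration only, not secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the original."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


def fingerprint(data: bytes) -> str:
    """Stand-in fingerprint: an exact-match digest of the bytes."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical reference work that a rightholder has registered with the host.
reference_work = b"bytes of a copyrighted work"
known_fingerprints = {fingerprint(reference_work)}

# The user encrypts the same work before it reaches the host's storage.
user_key = secrets.token_bytes(32)
stored_blob = toy_encrypt(user_key, reference_work)

# The host only sees ciphertext, so its filter finds no match.
host_can_match = fingerprint(stored_blob) in known_fingerprints

# The key holder can still recover the original file.
user_can_decrypt = toy_encrypt(user_key, stored_blob) == reference_work
```

In this sketch the host's filter has nothing recognisable to compare against, while the user can still recover the file. A filter could only work if the content were scanned before encryption or if the host held the key.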

Apart from the privacy aspect, there are significant security concerns over the fact that all content must be filtered for any possible instances of copyright infringement. If the content recognition has been supplied as a service, that means an external party could potentially be scanning confidential or sensitive information. “Technologically, you could provide that software as a service, plug-in,” says James Macintyre, technical director of Affinity Digital. “However, I would have some severe reservations as to how that data is interrogated.”

If the filtering is provided internally, then, according to the GDPR, these companies will not just be classed as data controllers (as the hosting company will retain user data), but also as data processors, because the filtering of the files will be classified as data processing under the terms of the GDPR. “Article 13 very much muddies the water, as you would need to be declared as a processor of that data,” says Macintyre. “This adds another layer of complexity to a project and so increases costs.”

As well as in-house content recognition systems, such as Content-ID or SoundCloud’s Content Recognition system, there are also external content recognition services available, such as Audible Magic. “Software will take a fingerprint from an uploaded file and send it to us,” says Mike Edwards, vice-president of licensing and European operations at Audible Magic. “This is checked with our database and we send back a response. Either it does not match anything, or, yes it does, and this is what it matches.”
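The request-and-response flow Edwards describes can be sketched roughly as follows. This is an illustrative Python outline, not Audible Magic's actual interface: the class and function names are hypothetical, and a SHA-256 digest stands in for a real audio fingerprint, which would be designed to survive re-encoding and editing in a way an exact hash does not.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(data: bytes) -> str:
    """Stand-in fingerprint: exact-match digest (real systems use perceptual fingerprints)."""
    return hashlib.sha256(data).hexdigest()


class ReferenceDatabase:
    """Hypothetical recognition service holding fingerprints supplied by rightholders."""

    def __init__(self) -> None:
        self._works: Dict[str, str] = {}

    def register(self, title: str, data: bytes) -> None:
        """Store the fingerprint of a reference work under its title."""
        self._works[fingerprint(data)] = title

    def identify(self, fp: str) -> Optional[str]:
        """Return the matching title, or None if the fingerprint is unknown."""
        return self._works.get(fp)


def check_upload(service: ReferenceDatabase, upload: bytes) -> Optional[str]:
    """Platform side: fingerprint the upload locally and send only the fingerprint."""
    return service.identify(fingerprint(upload))


service = ReferenceDatabase()
service.register("Example Song", b"reference recording bytes")

match = check_upload(service, b"reference recording bytes")
no_match = check_upload(service, b"a user's own recording")
```

Here `match` holds the title of the registered work and `no_match` is `None`, mirroring the two responses Edwards describes. Only the fingerprint leaves the platform in this sketch, not the uploaded file itself.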

The accuracy of content recognition technologies, whether it be false positives (where files are incorrectly identified as containing infringing content) or false negatives (where infringing content is missed), is also subject to debate. “There is an independent study that says almost a third of takedown requests are problematic, leading to frequent takedown of legal content,” says Reda.

This high proportion of problematic takedown requests could be the result of copyright owners relying only on the filename or metadata. Inferior content recognition technology could also be prone to a high level of false positives.

“We will recognise a copyright work, if the reference fingerprint is in the database, 99.9% of the time,” says Edwards. “Our false positive rate is extremely low – effectively zero – because our technology has been used for litigation support, so it has to be very accurate.”

Copyright exceptions

Article 13 does not mention whether content recognition technologies should recognise copyright exceptions, of which there are 22 in EU copyright law. These can range from copyrighted content being used for the purpose of parody, to research and private study.

This would mean hosting companies would need to dedicate more resources than the proposal first implies. Employees who are familiar with copyright law would be needed to review files flagged as infringing, because their use might fall under one of the EU copyright exceptions.

Also, copyright exceptions vary between countries. Although countries’ copyright laws cannot add exceptions, they can choose which exceptions they want to use. This jurisdictional issue adds a further legislative headache, as a user in one country may upload a file to a platform in another country, which is viewed by the public in a third country.

The sheer volume of data that would have to be scanned and approved before being uploaded would be staggering, and is increasing every day. “Last year we processed, for the first time, more than a billion identification requests in a single month,” says Edwards. “Twelve months later, we have processed more than two billion in a single month.”
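To put that volume in perspective, the short Python calculation below converts two billion requests a month into a per-second rate and estimates how many errors small rates would produce at that scale. The 0.1% miss rate follows from the 99.9% recognition figure Edwards quotes. The share of infringing uploads and the false positive rate are purely hypothetical values chosen for illustration, not figures from Audible Magic or any study.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def requests_per_second(monthly_requests: float) -> float:
    """Average request rate implied by a monthly total."""
    return monthly_requests / SECONDS_PER_MONTH


def expected_errors(volume: float, infringing_share: float,
                    miss_rate: float, false_positive_rate: float) -> dict:
    """Expected missed infringements and wrongly flagged legal uploads."""
    infringing = volume * infringing_share
    legal = volume - infringing
    return {
        "missed": infringing * miss_rate,
        "wrongly_flagged": legal * false_positive_rate,
    }


monthly = 2_000_000_000                 # figure quoted in the article
rate = requests_per_second(monthly)     # roughly 770 requests per second

errors = expected_errors(
    volume=monthly,
    infringing_share=0.01,              # hypothetical: 1% of uploads infringe
    miss_rate=0.001,                    # from the quoted 99.9% recognition rate
    false_positive_rate=0.0001,         # hypothetical: 0.01% of legal uploads flagged
)
```

Under these assumed rates, a month of traffic would yield about 20,000 missed infringements and about 198,000 legal uploads wrongly flagged, each of which might need human review. The point is not the specific numbers but that even very small error rates translate into large absolute counts at this scale.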

Finally, Article 13 may apply not just to content that is being uploaded, but also to content that is already available. “As it does not say anything to the contrary, I would say this proposal applies retroactively to all the content that is already online,” says Reda.

Copyright law needs to be revised to meet the demands of an ever-changing technological landscape, in a way that recognises the global nature of technology while ensuring that creators’ rights are respected and that they are properly compensated for their work.

However, as it currently stands, the wording of the proposed Article 13 of the digital single market directive does not provide the required clarity and could be harmful to the technology sector. “It is one of those situations where it is going to be really hard to enforce and practically impossible to deploy,” says Tankard.
