Big data offers businesses a range of practical and commercial benefits, but it can be fraught with privacy and legal issues. With raw digital data projected to grow globally at a rate of 40% per year, many companies are turning to it as a resource in their quest for market advantage.
Gathering information is a very costly exercise, requiring extensive storage facilities and a solid privacy compliance system. Even so, it has become the mission of some of the larger profile-based websites, which have an eye on its commercial value.
Social platforms such as Google and LinkedIn have created a revenue stream from the data collected by supplying businesses, individuals and public institutions with an insight into the behavioural patterns of their target audiences.
Key issues around data protection and privacy can be addressed by making sure that data is anonymous, which avoids having to tackle problems around the data subject’s consent. Guidelines from the Information Commissioner’s Office (ICO) include a Code of Practice on Anonymisation, which is aimed at helping companies manage risk.
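To make the idea of anonymising a data set more concrete, here is a minimal Python sketch of one common approach: strip direct identifiers, coarsen the remaining details, then check how small the smallest group of look-alike records is. The field names, the age bands and the use of the outward half of a UK postcode are all assumptions made for illustration. They are not taken from the ICO code, and passing a check like this does not by itself make data anonymous in the legal sense.

```python
from collections import Counter

# Hypothetical field names that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "email"}


def generalise(record):
    """Drop direct identifiers and coarsen the remaining personal details."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace an exact age with a ten-year band, e.g. 34 -> "30-39".
    decade = (out.pop("age") // 10) * 10
    out["age_band"] = f"{decade}-{decade + 9}"
    # Keep only the outward part of a UK postcode, e.g. "SW1A 1AA" -> "SW1A".
    out["postcode_area"] = out.pop("postcode").split()[0]
    return out


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same coarse details."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


# Invented sample records, for illustration only.
raw = [
    {"name": "A. Example", "email": "a@example.com", "age": 34, "postcode": "SW1A 1AA", "visits": 12},
    {"name": "B. Example", "email": "b@example.com", "age": 37, "postcode": "SW1A 2BB", "visits": 3},
    {"name": "C. Example", "email": "c@example.com", "age": 52, "postcode": "M1 1AE", "visits": 7},
]

released = [generalise(r) for r in raw]
k = smallest_group(released, ["age_band", "postcode_area"])
print(f"Smallest group size: {k}")
```

In this sample the third record is the only one in its age band and postcode area, so the smallest group has a single member and that individual could still be singled out. A business would need to coarsen the data further, or withhold that record, before treating the set as anonymised.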
Checking the source of big data
The source of the data set needs to be investigated. Is the data free to use? Unless specifically released as open data, most data sets will be subject to some controls in relation to their use. A business should identify any licence terms on which data is supplied and use this to engineer protection in the form of warranties in the agreement with the data owner.
If the data collected is going to be used in a commercial service delivered to users by the business, then those customers will expect to receive assurances as well.
If a business using big data is not clear on the extent to which it can re-use that data, then it will not be able to give its users the comfort they will seek. It is better to address these issues as far up the “data chain” as possible.
The issues are not new, but the scale of big data means businesses will need to multi-task as they consider the consequences of using data from multiple sources. In practice, the user will need to rely on the assurances or warranties it can obtain in the agreement with data set owners.
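The bookkeeping this implies can be sketched in a few lines of Python. The example below records, for each source, the uses its licence permits and whether a warranty has been obtained, then works out what a service built on all of the sources could safely do. The `Licence` class, the source names and the use labels are hypothetical, and treating the permitted uses of a combined data set as the overlap of the individual licences is a simplifying assumption rather than a statement of law.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Licence:
    """Hypothetical record of the terms on which one data set is supplied."""
    source: str
    permitted_uses: frozenset
    warranty_obtained: bool


def permitted_for_combined(licences):
    """Uses allowed by every licence (assumed: the overlap of all terms)."""
    uses = None
    for lic in licences:
        uses = lic.permitted_uses if uses is None else uses & lic.permitted_uses
    return uses or frozenset()


def missing_warranties(licences):
    """Sources for which no warranty has yet been obtained from the owner."""
    return [lic.source for lic in licences if not lic.warranty_obtained]


# Invented sources and terms, for illustration only.
sources = [
    Licence("retail_feed", frozenset({"internal_analysis", "commercial_service"}), True),
    Licence("survey_panel", frozenset({"internal_analysis"}), False),
]

allowed = permitted_for_combined(sources)
unwarranted = missing_warranties(sources)
print(sorted(allowed), unwarranted)
```

Here the combined data could only be used for internal analysis, and the business would still need to seek a warranty from the owner of the second source before giving assurances to its own users.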
How can a business become big data compliant?
The legal framework which regulates the big data business model is based on existing principles of intellectual property, confidentiality, contract and data protection law. Since the model generally boils down to a licence arrangement, a number of solutions used with other licence arrangements can be used equally well with big data.
Use of big data which has not been anonymised is clearly an area of risk. However, even where anonymised data is supplied, a business would be wise to request a warranty from the supplier giving assurances that the data is fully compliant with Data Protection Act (DPA) requirements.
This can include scrutiny of the information on use of data and privacy which was given to the data subjects at the point their data was collected. Transparency is important. It is good practice to tell individuals, in a privacy statement accessible at the point of collection, that their data may be used and disclosed to others in anonymised form, and the credibility of the data set will depend on it.
Warranties will also be required in relation to the ownership of and “freedom to use” the data to avoid disputes arising as a consequence of an infringement of intellectual property rights or a breach of confidentiality.
A data set will be protected under English law by Database Right. The rights in this will belong to the person who takes the initiative in “obtaining, verifying and presenting the content of a database, while assuming the risks involved in doing this” to quote from the Copyright and Rights in Databases Regulations 1997. This is an automatic right of ownership and should be respected.
A database which has been substantially formed by collecting data from various different databases may also be able to enjoy the benefit of the Database Right and this is a valuable way in which to regulate the onward use of a solution which is based on a database.
The ability of a user to re-use or resell the data provided needs to be clearly set out in the licence agreement at this stage of the supply chain. This way the value of the assembled data can be controlled with terms which restrict the use of the data purchased to within the user’s business.
The relationship between the data supplier and the business seeking to use big data will provide the channel in which to deal with the key issues. A licence or service agreement should contain contractual terms which protect the business and ensure that the revenue stream is properly secured.
Kim Walker is a partner at law firm Thomas Eggar LLP.