Big data is big news, but have you considered big data security?
Big data is built on analysis of data that is usually correlated from multiple sources. That makes big data security a non-trivial task because you need to know where the various data points that make up big data analytics sets reside and map out access and permissions.
In this podcast, ComputerWeekly.com storage editor Antony Adshead talks with Vigitrust CEO Mathieu Gorge about the definition of big data and what that means for big data security, as well as practical steps you can take to ensure you achieve big data security across all locations in your ecosystem.
What is big data?
Mathieu Gorge: So, let’s have a look at some important definitions to start with. If you look at data as facts and statistics, or figures collected about a particular topic, we very often talk about structured data versus unstructured data.
In the case of unstructured data, you have a piece of information, also known as a data point, which can be made up of several sub-sections, known as sub-data points.
So, the idea of big data is that if you put data in context, and mix and match a number of data points within unstructured or structured data, you end up with information, and that is what big data is about – making sense of the relation between data in context versus a piece of information or data point that resides on its own.
What are the implications for security in the storage of big data?
Gorge: Big data is about the intelligence behind the collection and correlation of that information, providing you with information, such as predicted value of a particular topic, and mixing and matching particular data points to try to make sense of a particular topic.
In other words, what you are doing is mining the data, giving that data some value.
The whole idea behind security in the storage of big data is knowing where the data is stored and who has access to it, and being able to track it for data protection and compliance purposes, and to answer Freedom of Information requests.
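As an illustration of that idea, the sketch below keeps a small registry of where each data point is stored and who can read each copy, so that a data protection or Freedom of Information request can be answered from one lookup. It is a minimal sketch only: the `StoredCopy` record, the `locate` helper and all locations and role names are hypothetical and do not come from the podcast.

```python
from dataclasses import dataclass, field


@dataclass
class StoredCopy:
    """One stored copy of a data point (hypothetical record layout)."""
    data_point: str                      # e.g. "customer_email"
    location: str                        # where this copy resides
    readers: set = field(default_factory=set)  # roles allowed to read it


def locate(registry, data_point):
    """Return every location holding the data point, with its readers."""
    return {
        copy.location: sorted(copy.readers)
        for copy in registry
        if copy.data_point == data_point
    }


# Illustrative registry; the entries are invented for this example.
registry = [
    StoredCopy("customer_email", "crm-db", {"sales", "support"}),
    StoredCopy("customer_email", "backup-tape-07", {"storage-admin"}),
    StoredCopy("order_history", "warehouse", {"analytics"}),
]

report = locate(registry, "customer_email")
for location, readers in report.items():
    print(location, readers)
```

In practice such a registry would be populated from discovery tooling rather than typed by hand, but the lookup shows the kind of question the inventory has to answer.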
What are best practices for securing an organisation's big data?
Gorge: To protect big data you need to understand what big data means for your organisation. The first thing to do is to look at what constitutes information for your organisation and break that down into data points.
Once you have identified the key data points, you can create a data [?] that will eventually fit into your overall ecosystem diagram that you normally use to manage security.
You also need to look at permissions and authorisations: who can access the data depending on where it is, and also depending on how it is accessed, whether from a mobile device, from within the network or from an extranet.
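A permission check that depends on the access channel could be sketched as follows. This assumes a simple three-level classification and a policy table keyed by role and channel; the roles, channels, levels and the `can_access` function are all hypothetical, and a real policy would probably also key on where the data resides.

```python
# Hypothetical classification levels, lowest to highest sensitivity.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

# Hypothetical policy: (role, channel) -> highest classification allowed.
POLICY = {
    ("analyst", "internal_network"): "confidential",
    ("analyst", "mobile"): "internal",
    ("partner", "extranet"): "public",
}


def can_access(role, channel, classification):
    """Allow access only if the policy lists the role/channel pair and the
    data is no more sensitive than that pair permits (default deny)."""
    allowed = POLICY.get((role, channel))
    if allowed is None:
        return False
    return LEVELS[classification] <= LEVELS[allowed]


# The same analyst gets different answers depending on the channel.
print(can_access("analyst", "internal_network", "confidential"))
print(can_access("analyst", "mobile", "confidential"))
```

The default-deny branch matters here: any role and channel combination that has not been mapped is refused rather than silently allowed.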
The next thing to do is to look at the risk posed to the data on its own, or to big data, as in data correlated and data points that could be linked.
So, you have internal risks and third-party risks. With third-party risks we’re not just talking about criminals and hackers, but also third-party organisations that have a valid reason to access that data.
In summary, the best way to do it is to map out the data, draw up a data flow diagram, classify the data and data points, identify the correlations and links between those data points, and then apply the standard mix of technical solutions, policies and procedures, and training.
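The classification and linking steps can be sketched in a few lines. The example below assumes each data point carries its own classification and that correlated data points are recorded as undirected links; it then treats a data point as being as sensitive as the most sensitive point it can be correlated with. The data points, levels and the `effective_classification` function are hypothetical, and the "take the highest level among linked points" rule is one possible choice, not a method described in the podcast.

```python
from collections import deque

# Hypothetical classification levels, lowest to highest sensitivity.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}

# Classification of each data point on its own (invented examples).
classification = {
    "postcode": "public",
    "purchase_history": "internal",
    "customer_name": "confidential",
    "store_opening_hours": "public",
}

# Pairs of data points that can be correlated with each other.
links = [
    ("postcode", "purchase_history"),
    ("purchase_history", "customer_name"),
]


def effective_classification(point):
    """Highest classification among all points reachable through links."""
    # Build an undirected adjacency map from the link list.
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Breadth-first search over everything correlated with `point`.
    seen = {point}
    queue = deque([point])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return max((classification[p] for p in seen), key=LEVELS.__getitem__)


print(effective_classification("postcode"))
print(effective_classification("store_opening_hours"))
```

A postcode is harmless on its own, but once it can be joined to a purchase history and a customer name it inherits the sensitivity of the combined set, which is the risk of correlated data points raised above.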
Dos and don'ts of big data security
Gorge: In terms of "dos" you really need to keep data protected from the permission perspective and from the availability perspective, so if you are storing the data you need to make sure it is available to the right people at the right time.
The big "don’t" of big data is being lured into big data marketing. A lot of suppliers use big data as a marketing tool. The reality is that most organisations have been practising big data security for the past few years, and have had no choice but to do so, even though it has not been labelled as "big data security".
Big data security relies on the same principles as network security and storage security. It is a little more complicated in that it involves mixing and matching data that could be anywhere in your ecosystem, which is why the most important thing to do is to map your ecosystem and the data flow within it.