Historically, the development environment has been something of a double-edged sword for the development team. It has tended to be separate from the production environment, so developers can do whatever they want without any risk to operational systems. The network has tended to be fairly self-enclosed as well, so they get very fast speeds while they are working. However, those positives are also negatives: what worked so blazingly fast in the development environment often delivers a poor user experience in production, where servers, storage and networks are slower.
On top of that, the development infrastructure has its own issues. Provisioning development environments can take a long time, even where golden images are used for the base versions. Tearing down these environments after a development cycle is not as easy as it could be. Declarative configuration tools, such as the open source Puppet, allow environments to be described in code and set up in a more automated manner, but they still leave a lot to be desired.
Physically configuring hardware and software environments, even with the help of automation like Puppet, still leaves the problem of getting hold of the right data. Making a physical copy of a database at a single point in time, or taking subsets as a form of pseudodata, does not address the central issue. In neither case is the data a true reflection of current real-world production data, so the results seen in the development and test environments cannot be guaranteed to hold when the application is pushed into production.
Trying to continue with inconsistent data between development, test and production environments will be slow and costly. The usual workaround is to take full copies of production databases and refresh them regularly, which is a lengthy process and also affects the performance of the operational network. Organisations struggle particularly with continuous integration where an application requires data from multiple production databases (e.g. Oracle, Sybase and SQL Server). As developers start to look at big data for their organisation, the problem gets worse: multiple different databases and data types (for example, Hadoop and NoSQL sources alongside existing SQL sources) may need to be used at the same time. Bringing these together as distinct copies across three different environments is just not viable.
Continuous development, integration and delivery require systems that are adaptable and fast to set up and tear down. Existing approaches make agile development difficult, requiring cascade processes that take too much time and involve too many iterations to fulfil the business’s need for continuous delivery.
What is needed is an infrastructure that bridges the gap between the different environments. Server and storage virtualisation do this at the hardware level, and virtual machines and container mechanisms such as Docker allow development and test environments to be stood up quickly within the greater IT platform. However, the issue of the data remains.
Creating an effective data environment requires the capability to use fresh data without adversely affecting overall storage needs or the time required to set up and tear down environments. The common approach of database snapshots, clones or subset copies involves too many compromises and costs. A new approach is needed: abstracting the database into a virtual environment.
The data virtualisation technology I’ve been looking at from start-up Delphix does just this. It can create a byte-for-byte, full-size virtual “copy” of a database in minutes, using near-live data and requiring barely any extra storage. The data created for development and testing can be refreshed or reset at any point and then deleted once the stage is over. Suddenly, each developer or tester can have their own environment without any impact on infrastructure.
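How a virtual copy can look full size yet need barely any extra storage is easiest to see with copy-on-write, a general technique in which every copy shares one set of unchanged blocks and stores only what it modifies. The Python below is a minimal sketch of that idea under stated assumptions; it is not Delphix's implementation, and the names SourceSnapshot and VirtualCopy and their methods are hypothetical, chosen purely for illustration.

```python
class SourceSnapshot:
    """Read-only set of data blocks taken from the source database.

    Hypothetical illustration: a real product would track blocks on disk.
    """

    def __init__(self, blocks):
        self.blocks = tuple(blocks)  # immutable, shared by every virtual copy


class VirtualCopy:
    """Looks like a full copy, but stores only the blocks it has changed."""

    def __init__(self, snapshot):
        self._snapshot = snapshot
        self._changed = {}  # block index -> private, modified block

    def __len__(self):
        # The copy appears full size regardless of how little it stores.
        return len(self._snapshot.blocks)

    def read(self, index):
        # Prefer this copy's own version; otherwise fall back to the shared block.
        if index in self._changed:
            return self._changed[index]
        return self._snapshot.blocks[index]

    def write(self, index, data):
        # Copy-on-write: the shared snapshot is never touched.
        if not 0 <= index < len(self._snapshot.blocks):
            raise IndexError("block index out of range")
        self._changed[index] = data

    def reset(self):
        # Discard private changes and return to the snapshot's state.
        self._changed.clear()

    def refresh(self, new_snapshot):
        # Point at fresher source data and drop private changes.
        self._snapshot = new_snapshot
        self._changed.clear()

    def extra_blocks(self):
        # Storage cost of this copy beyond the shared snapshot.
        return len(self._changed)


snapshot = SourceSnapshot([b"customers", b"orders", b"invoices"])
dev = VirtualCopy(snapshot)
test = VirtualCopy(snapshot)
dev.write(1, b"orders-v2")  # only dev sees this change
```

In this sketch the developer's copy and the tester's copy each appear to hold all three blocks, yet the only additional storage consumed is the single block the developer changed. Reset and refresh are cheap for the same reason: they discard private changes or repoint at a newer snapshot without copying the shared data again.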
By embracing DevOps and data virtualisation, everyone wins. Developers and testers get to spin up environments that exactly represent the real world. DBAs can spend more time on complex tasks that add distinct business value rather than creating routine copies. Sysadmins don’t have to struggle with the infrastructure needed to support multiple dev/test environments. Network admins can sleep easy knowing that huge volumes of data are no longer being copied back and forth, and storage admins don’t have their capacity drained by pointless copies of the same data.
More to the point, the business gets what it wants: fast, continuous delivery of new functionality, enabling it to compete far more strongly in its market. All without having to invest large amounts of extra hardware and time in an approach whose end result is not guaranteed.
Disclosure: Delphix is a Quocirca client