The number of businesses using NoSQL databases, which specialise in storing and managing unstructured and unpredictable data, is set to mushroom in the next few years.
For instance, a study from analyst firm 451 Research says the market for NoSQL systems will grow from $814m in 2015 to around $4.9bn in 2020.
However, it will still make up a small fraction of implementations, despite compound annual growth of 43%, according to 451. By 2020, the market share for NoSQL will be around 9%, compared with 90% for traditional relational databases.
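As a quick check on the arithmetic, the growth rate implied by the two market-size figures can be worked out directly. The sketch below uses only the numbers quoted above and treats "around $4.9bn" as exactly $4,900m, which is an assumption made for illustration.

```python
# Figures quoted from 451 Research in the text above.
start_value_musd = 814.0   # NoSQL market in 2015, in $m
end_value_musd = 4900.0    # forecast for 2020, in $m ("around $4.9bn", rounded)
years = 2020 - 2015

# Compound annual growth rate: (end / start) ** (1 / years) - 1
cagr = (end_value_musd / start_value_musd) ** (1 / years) - 1

print(f"Implied compound annual growth: {cagr:.1%}")
```

The result comes out at roughly 43%, in line with the rate 451 quotes.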
NoSQL is finding most of its growth from new sets of applications built in response to data sources that have come on stream during the past 10 years.
“The majority of existing applications are all developed for relational databases. NoSQL has grown in support of new applications where developers are looking at their requirement for the database,” says 451 research director for data platforms and analytics Matt Aslett.
These areas include the internet of things (IoT), where the volume of data from remote, connected devices can be both difficult to predict and varied. Irish firm Temetra collects data from remote utility meters and has found Basho’s Riak NoSQL database valuable in managing this data, which includes photos, keyed-in values and 15-minute data streams (see case study, below).
NoSQL databases are also finding traction in startups, which are more likely to look beyond incumbent database providers in search of lower costs and a better technological fit, says Aslett.
Seenit, a collaborative video platform used by brands such as Adidas, British Airways and Bacardi, launched two and a half years ago. Its platform and analytics engine are built on the Couchbase NoSQL database (see case study, below). Meanwhile, in finance, startup Kaiko – which analyses Bitcoin transactions – has been using NoSQL database Cassandra to help create reports for its clients (see case study, below).
However, well-established businesses are also finding value in NoSQL technologies. Publishing firm Haymarket has migrated its websites to MongoDB, a NoSQL database (see case study, below). Meanwhile, BBC Worldwide, the commercial arm of the UK’s public broadcaster, is using a NoSQL database from MarkLogic.
Developers are driving the adoption of NoSQL where older businesses are looking to do new things, typically in improving customer experience on the web, according to Aslett.
“To some extent, this is a developer-led phenomenon,” he says. “Some of the most significant adopters are startups and younger IT experts who are more likely to look for alternatives to established database vendors.
“However, we are also seeing greater mainstream adoption. This includes architectural transformation projects in very large, quite traditional enterprises. As they move to a new architecture, they are considering all database alternatives, as well as where cloud plays a role and how DevOps can benefit the business.”
Even where it remains advisable to use more established relational databases, NoSQL could still be involved with the application stack, reckons Aslett.
“There are a lot of applications out there for which NoSQL is not a good fit, such as financial transactions, but sometimes NoSQL will play a part,” he says. “It might not be responsible for transactions, but it could be responsible for distribution of the data to multiple sources. It is not one type of database versus the other.”
Temetra was founded in 2002 to store and manage data from water meters in Ireland. It has since grown rapidly to collect data from 12.5 million utility meters across the UK.
Co-founder and director Paul Barry says the unpredictable nature of the type and volume of the data it receives was behind its decision to deploy the Riak NoSQL database from Basho.
“There is a wide variety in the data collected from meters,” he says. “We still have people keying in readings, right through to large-scale industry users who send data automatically every 15 minutes via a fixed-line network.”
The business started by supporting its customers with a PostgreSQL database, but soon found its limitations. “It was not that PostgreSQL could not cope with the volume of the data, but it became too tricky to administer the database,” says Barry. “With master-slave replication, it becomes much more difficult as volumes go up.”
Another advantage of NoSQL is its ability to adapt to new data types, he adds. “When we were starting out, changes to the schema were not such a big deal. But when you have 10 million meters, some of which are sending data every 15 minutes, and you are storing data for years, making changes could take hours.
“NoSQL allows us to store data without such a rigid format. You can start adding new data in a new format and over time adjust the older data to the new schema. It becomes a living format, and you have to write applications to accommodate this philosophy,” says Barry.
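The "living format" Barry describes can be sketched in a few lines of Python. This is a minimal illustration, not Temetra's code: the field names, the version marker and the default unit are all hypothetical. Old and new documents sit side by side, and the application upgrades older ones when it reads them.

```python
# Hypothetical meter-reading documents; every field name here is illustrative.
CURRENT_VERSION = 2

def upgrade(doc: dict) -> dict:
    """Return a copy of doc in the newest format, leaving the input untouched."""
    upgraded = dict(doc)
    version = upgraded.get("schema_version", 1)  # documents without a marker count as v1
    if version == 1:
        # Assumed v1 layout: a single "reading" value with no unit recorded.
        upgraded["value"] = upgraded.pop("reading")
        upgraded["unit"] = "m3"  # assumed default unit for old readings
        upgraded["schema_version"] = CURRENT_VERSION
    return upgraded

# An older, keyed-in reading and a newer one that carries extra fields.
old_doc = {"meter_id": "A-100", "reading": 1523}
new_doc = {"schema_version": 2, "meter_id": "B-200", "value": 87.5,
           "unit": "m3", "photo_ref": "img-001"}

for doc in (old_doc, new_doc):
    print(upgrade(doc))
```

Because the upgrade happens on read, new-format data can be written immediately, and older records can be rewritten in the background at whatever pace suits the business.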
Founded in 2014, Kaiko collects and analyses trading data of the cryptocurrency Bitcoin. It redistributes the resulting intelligence in the form of reports and application programming interfaces (APIs).
Co-founder and software developer Vincent de Lagabbe says the team considered various SQL and NoSQL options before building a proof of concept on open-source NoSQL database Cassandra in late 2014. In 2015, it switched to a Cassandra distribution supported by DataStax to take advantage of its startup programme.
“The reason was a mix of maintenance, scalability and replication,” says de Lagabbe. “Currently, we’re holding around 6TB of data. If we wanted the same performance with SQL, we would need a very powerful server.
“Plus it would mean all the hassle of data replication – that is automatic with Cassandra. We are a small team, so we do not have a dedicated IT operations team. We are DevOps by default.”
“We are loading about 2-3GB of data per day,” adds de Lagabbe. “We need a way to scale without changing the whole system. In traditional SQL you have to manage the sharding and distribution yourself, and understand where it fits on the server.
“It is not impossible, but it is painful to do. We want to generate revenue and develop new products; we do not want to spend time managing operations when we only have a small team.”
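To give a sense of what managing sharding yourself involves, here is a minimal sketch of application-side shard routing. It is not Kaiko's code and does not describe how any particular database distributes data. The shard names and key format are hypothetical.

```python
import hashlib

# Hypothetical shard names; in practice these would be separate database servers.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]

def shard_for(key: str, shards: list[str]) -> str:
    """Pick a shard by hashing the key and taking it modulo the shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# Illustrative record keys.
keys = [f"trade-{n}" for n in range(1000)]

# Count how many keys land on a different shard once a fourth server is added.
bigger = SHARDS + ["db-shard-3"]
moved = sum(1 for k in keys if shard_for(k, SHARDS) != shard_for(k, bigger))

print(f"{moved} of {len(keys)} keys would have to move")
```

With this naive modulo scheme, adding a single server reassigns most keys, so the team would have to plan and run a large data migration. That is the kind of operational work de Lagabbe says a small team wants to avoid.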
Kaiko runs its main application on Amazon Web Services (AWS). It chose to license DataStax’s Cassandra distribution because it was more stable than the free open-source distribution and came with OpsCenter software, which automates administration tasks that would otherwise have to be scripted.
The startup is also experimenting with using Apache Spark to help analyse data on its NoSQL database.
Founded in January 2014, Seenit is a platform that helps organisations direct and collect video from customers and employees. It offers scripts to engage users around a topic and features tools to edit, analyse and curate content.
Having used Couchbase in a previous role, Seenit chief technology officer Dave Starling looked to the NoSQL database to help in the startup’s expansion because it allows data schemas to change over time.
“We did not know what our schema was going to look like in three months’ time. If we did not change it, we were probably doing the wrong thing,” he says.
The ability to scale was also important, adds Starling. Seenit stores several terabytes of video data on the Google Cloud Platform, as well as tens of gigabytes of metadata. It also uses Google machine learning APIs to analyse video for specific characteristics, to help customers find the videos they want more rapidly.
The ability to be flexible and grow has been the main business benefit, according to Starling. “We develop in [programming language] Python, which has data structure directly synonymous with Couchbase,” he explains.
“The underlying technology is not a factor in what we want to build. If the head of product development says they really want to offer a service, we can do a proof of concept and not worry about the limitations of what the database can do.”
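Starling's point about Python data structures can be illustrated with the standard library alone. Couchbase stores JSON documents, and a Python dictionary converts to and from JSON with no mapping layer in between. The record below is hypothetical and is not Seenit's actual schema. No database client is used here, only the serialisation step.

```python
import json

# Hypothetical video metadata record; all field names are illustrative.
video_meta = {
    "type": "video",
    "project": "example-campaign",
    "tags": ["outdoor", "interview"],
    "duration_seconds": 42.5,
    "uploader": {"name": "Example User", "role": "employee"},
}

# Serialise to the JSON text a document database would store...
document = json.dumps(video_meta)

# ...and read it back into an ordinary Python dictionary.
restored = json.loads(document)

print(restored["uploader"]["role"])
```

Adding a new field later is just a matter of adding a new key to the dictionary, which is why a changing schema does not hold up a proof of concept.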
It is not only startups and small businesses employing NoSQL. Haymarket Media Group dates back to the 1950s and is a leading UK business-to-business and consumer publisher. It is now on its second generation of technology to support its transition to web-based publishing, and is building its websites on NoSQL database MongoDB.
Head of architecture Peter Dignan says the move started when the company looked to update the technology underpinning the early websites for its consumer motoring titles, which were built on Microsoft Active Server Pages and dated back to 2002. Over time, the sites needed massive caching to support increasing demand, which made it a struggle to introduce more interactive, personalised features.
“When you are moving the cache above the front end, you are reducing customisation. We couldn’t take personal data and use it to segment users and serve the right content back to them,” he says.
Haymarket began to move the motoring brands to MongoDB in 2011 to support greater agility, explains Dignan.
“It allows flexibility in the schema, which helps when you’re trying to be agile,” he says. “In the consumer space, that’s really important when you are trying to win eyeballs with new features. With MongoDB, you can add new features without worrying about the schema. It is much more reliable and does not require downtime for major schema change.”
The system is hosted in a private cloud by AWS. MongoDB has been introduced to support all of Haymarket’s websites.