In our voyages from software conference to software conference, we technology journalists often find stories that are developing and worth sharing.
At Percona Live Europe last week, one such story came up around the open source scene that is developing in Russia and how one of its projects is now starting to open up to international use.
Think about Russia and you may not automatically think about open source software. However, the country has a strong software developer community that is looking to expand the number of its projects that are used internationally.
An example of this is ClickHouse, an open source data warehouse project available on GitHub. The technology was originally developed at Yandex, the Russian equivalent of Google.
As defined on TechTarget: a data warehouse is a ‘federated repository’ for all the data collected by an enterprise’s various operational systems – and the practice of data warehousing itself puts emphasis on the ‘capture’ of data from different sources for access and analysis.
ClickHouse is claimed to outperform comparable column-oriented database management systems (DBMS) currently available. According to those claims, it processes hundreds of millions to more than a billion rows, and tens of gigabytes of data, per server per second.
According to its development team, ClickHouse allows users to add servers to their clusters when necessary without investing time or money into any additional DBMS modification.
According to the development team's notes, “ClickHouse processes typical analytical queries two to three orders of magnitude faster than traditional row-oriented systems with the same available I/O throughput. The system’s columnar storage format allows fitting more hot data in RAM, which leads to shorter response times. ClickHouse is CPU efficient because of its vectorised query execution involving relevant processor instructions and runtime code generation.”
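To make the columnar storage point more concrete, here is a minimal sketch in plain Python. It is not ClickHouse code and says nothing about ClickHouse's actual on-disk format; the table, its three fields and the function names are all hypothetical. It only illustrates why an analytical query that needs one column touches far fewer bytes when each column is stored contiguously.

```python
import struct

# Hypothetical events table: (user_id, duration_ms, country_code).
rows = [(i, (i * 37) % 1000, i % 50) for i in range(10_000)]

# Little-endian, no padding: 8-byte id + two 4-byte ints = 16 bytes per row.
ROW_FORMAT = "<qii"


def row_store_sum(table):
    """Row layout: whole records are scanned even though one field is needed."""
    blob = b"".join(struct.pack(ROW_FORMAT, *r) for r in table)
    total = 0
    for _user_id, duration, _country in struct.iter_unpack(ROW_FORMAT, blob):
        total += duration
    return total, len(blob)


def column_store_sum(table):
    """Column layout: the duration column is stored, and scanned, on its own."""
    n = len(table)
    blob = struct.pack(f"<{n}i", *(r[1] for r in table))
    durations = struct.unpack(f"<{n}i", blob)
    return sum(durations), len(blob)


row_total, row_bytes = row_store_sum(rows)
col_total, col_bytes = column_store_sum(rows)

print("same answer:", row_total == col_total)
print("bytes scanned, row layout:   ", row_bytes)
print("bytes scanned, column layout:", col_bytes)
```

In this toy case the column layout scans a quarter of the bytes for the same result, which is the same effect that lets a columnar system keep more of the frequently queried data in RAM.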
The central go-to-market proposition here is that by minimising data transfers for most types of queries, ClickHouse enables companies to manage their data and create reports without using specialised networks that are aimed at high-performance computing.
The technology, which is essentially designed for Online Analytical Processing (OLAP), uses all available hardware to process each query as fast as possible, which the project says amounts to a speed of more than 2 terabytes per second.
The project is starting to expand and get more adopters. As part of its monitoring product launch at the event, database monitoring and management company Percona announced that it will use ClickHouse for load testing and to monitor accessibility and other performance KPIs. Percona’s leadership team originally hails from Russia, so there are a lot of relationships there as well.
Alongside Percona, Altinity is also looking to expand use of ClickHouse over time. Robert Hodges, CEO at Altinity, describes the company as a provider of the highest level of ClickHouse expertise on the market for deploying and running demanding analytic applications. The company also provides software to manage ClickHouse in Kubernetes, cloud and bare-metal environments.
Explaining how his firm has developed alongside the core ClickHouse technology proposition, Hodges says that the enterprise version of ClickHouse can run on a laptop, yet be ready to scale up for significant enterprise workloads.
“ClickHouse is very efficient at processing and handling time-series data… and it has SQL features which are great at monitoring specific cloud issues such as ‘last point query’ [a way of looking at the last thing that happened in a cloud application]. Say for example you had a bunch of Virtual Machines (VMs) running in the cloud and you wanted to know the CPU load on them, ClickHouse is good at getting that measure to you. It’s good to know the ‘current state’ of VMs, because from that point you can then drill in and look at the load over (for example) the last two weeks and so get a sharper idea of performance status,” said Altinity’s Hodges.
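The 'last point query' idea Hodges describes can be sketched in a few lines of plain Python. This is an illustration of the concept only, not ClickHouse SQL or any Altinity tooling; the `Reading` record, the VM names and the sample values are all made up for the example. The first function returns the most recent CPU reading per VM (the 'current state'), and the second drills into one VM's history from a chosen point in time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    vm: str          # hypothetical VM identifier
    ts: int          # timestamp in seconds since an arbitrary epoch
    cpu_load: float  # CPU load as a fraction between 0 and 1


def last_point(readings):
    """Return the most recent reading for each VM."""
    latest = {}
    for r in readings:
        current = latest.get(r.vm)
        if current is None or r.ts > current.ts:
            latest[r.vm] = r
    return latest


def history(readings, vm, since_ts):
    """Drill in: all readings for one VM from since_ts onwards, oldest first."""
    selected = (r for r in readings if r.vm == vm and r.ts >= since_ts)
    return sorted(selected, key=lambda r: r.ts)


# Made-up sample data for two VMs.
readings = [
    Reading("vm-a", 100, 0.20),
    Reading("vm-b", 100, 0.90),
    Reading("vm-a", 160, 0.35),
    Reading("vm-b", 160, 0.95),
    Reading("vm-a", 220, 0.30),
]

current = last_point(readings)
for vm, r in sorted(current.items()):
    print(vm, "latest load", r.cpu_load, "at", r.ts)
```

A database does the same job over far larger volumes of data, but the shape of the question is the same: first the latest value per entity, then a time-bounded look back at the one that stands out.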
Hodges also explains that Percona is interested in ClickHouse because it is so similar to MySQL – it has ‘surface similarities’ and makes it easy to load data in and pull data back out.
This project is an example of how open source communities can expand with new approaches to existing problems.