Atoti+ for Risk Management Aggregation – Comparison with the ClickHouse benchmark

Categories
Blog Financial Services Technology

As data volumes and market complexity have increased, the financial services industry is ever more reliant on technologies that can quickly and seamlessly help them perform data analytics.  

Performing calculations on hundreds of millions of rows of fast-moving data in real-time exceeds mainstream database capacities.Traditional databases cannot handle dynamic bucketing, non- linear, multi-step aggregations, statistics over simulations and time series in real time.

What’s more, identifying technologies that can handle these tasks is difficult. Standard analytical benchmarks such as those found on the TPC do not support this. 

This makes public benchmarks focused on fintech applications of this complexity extremely useful, and this is also why a recent blog post by Altinity on calculating Value-at- Risk (VaR) caught our eye. In that article, the ClickHouse database is compared to the AWS Redshift database for a VaR aggregation use case – that being non-linear aggregation of historical returns. In order to perform at interactive speed, that type of aggregation requires native support for array aggregation. Redshift is a popular general purpose data warehouse but does not support array aggregation and, therefore, cannot really be compared to more specialized technologies.

ActiveViam has been performing VaR aggregation on terabytes of data for nearly two decades for dozens of multinational banks and use cases ranging from front-office PnL to credit, market and liquidity risk as well as solving for regulatory requirements. 

In the blog, Altinity invited some “friendly competition”. Respectfully, we’re up for the challenge.

We ran the exact same benchmark using Atoti+ and found that Atoti+ (as well as its free community edition atoti) is not only faster, but less expensive to run. Specifically:

  •  The Atoti+ engine appears consistently five times faster than the ClickHouse engine.
  • Atoti+ appears to offer similar or better performance than ClickHouse, while running on an eight-times less expensive server.

Atoti+ and atoti, both powered by the same underlying ActivePivot technology, offer advanced data modelling features including native array aggregation and multidimensional cubes. Atoti+ does not require users to install, configure and manage a database like ClickHouse. It can work directly on the data like other popular Python libraries such as pandas.

Now we will show you how we ran the same benchmark referenced in the Altinity blog using Atoti+.

We use the same data generator that produces a dataset with:

  • 1,720,000 rows
  • One array of 1000 historical returns per row
  • About 7GB in size

We load it into Atoti+and define the “VaR” measure. It’s all contained in this small notebook.

(This is a very trivial way to calculate VaR. If you’re interested in more sophisticated analytics and what-if analysis please visit the public atoti/Atoti+ Python repository of fintech use cases at https://github.com/atoti/notebooks/tree/master/notebooks).

Then we benchmark the four queries described in the reference blog post, reusing the VaR measure. Atoti+ is running on an AWS m5.8xlarge instance, in the same condition as the ClickHouse benchmark.

Q1

cube.query(m["ValueAtRisk"], levels=[lvl["str0"]])

Q2

cube.query(m["ValueAtRisk"], levels=[lvl["str0"], lvl["str1"], lvl["int10"], lvl["int11"], lvl["dttime10"], lvl["dttime11"]])

Q3

cube.query(m["ValueAtRisk"], levels=[lvl["str0"], lvl["str1"], lvl["str2"],lvl["str3"], lvl["int10"], lvl["int11"], lvl["int12"], lvl["int13"], lvl["dttime10"], lvl["dttime11"], lvl["dttime12"], lvl["dttime13"]])

Q4

cube.query(m["PnL at index"], levels=[lvl["str0"], lvl["Scenarios"]], condition=(lvl["str1"] == "KzORBHFRuFFOQm"))

Here the queries are described in the Atoti+ Python API, a convenient way to run the benchmark directly from a notebook. Atoti+ also embeds an online user interface similar to Tableau and PowerBI, so in real life end users would start analysing VaR immediately from their browser, without having any query to write and without any additional software to deploy.

QueryClickHouseAtoti+Atoti+ advantage
Q1720 ms121 ms6 x
Q21040 ms194 ms5.4 x
Q31050 ms223 ms4.7 x
Q4450 ms65 ms7x

The Atoti+ engine is consistently five times faster for this use case than the ClickHouse engine, according to the Altinity test.

Behind that performance is special memory management, multicore processing, query compilation and proven performance in working with the largest financial institutions in the world that have been pushing the Atoti+engine in production for years. If you want to know more, you can watch this JavaOne presentation that discloses some of the secrets.

While a traditional database is sticky to its underlying hardware and persistent storage, Atoti+ with its Python library can be used on demand and anywhere, with a clean decoupling between storage and compute.

Additional benchmarks

Running the benchmark with Atoti+on a smaller AWS instance: m5.xlarge (4 CPUs, 16GB RAM, 0.19$/hour)



Query
ClickHousem
5.8xlarge 1.54$/hour
Atoti+
m5.xlarge 0.19$/hour
Q1720 ms663 ms
Q21040 ms798 ms
Q31050 ms877 ms
Q4450 ms110 ms

Atoti+ is faster or equivalent to ClickHouse, while running on an eight-times less expensive server.

Running the benchmark with Atoti+ on the same m5.8xlarge instance than ClickHouse, but duplicating the dataset 6 times:

  • 10,000,000 rows
  • One array of 1000 historical returns per row
  • About 42GB in size


Query
ClickHouse7GB datasetAtoti+42GB dataset
Q1720 ms610 ms
Q21040 ms740 ms
Q31050 ms770 ms
Q4450 ms90 ms

In sum, Atoti+can process a dataset six times larger than ClickHouse in less time and for the same cost. If you are interested, you can easily try it for yourself by reproducing this test with atoti, the free community version of Atoti+.

Contact us if you’d like more information on this use case and on this benchmark.