New options for in-memory computing are coming to the public cloud: Microsoft Azure just announced Ms-Series instances with 2TB of memory and 128 cores (!).

Servers with terabytes of memory are still considered high-end in the enterprise. Even in 2017 they are hard to get, harder than the smaller, standardized boxes, and that can get in the way of teams building in-memory computing solutions with ActivePivot. This situation has just come to an end: today you can get one of those servers at the snap of a finger, without capacity planning.

We’ve tested the new Ms-Series instances with ActivePivot on a data- and compute-intensive financial workload called “Expected Shortfall”. Beyond a benchmark of the new hardware, we also looked at the practical aspects of operating the solution in real life, such as dynamic resource allocation and I/O performance from cloud storage.

Complex calculations on the fly

Expected shortfall (ES) is a market risk calculation that was brought forward by financial regulators with FRTB (Fundamental Review of the Trading Book). FRTB is a new framework for banks to calculate and report the risks in their trading books, and to determine how much business they can do for a given amount of capital (a very hot topic, obviously!). According to this upcoming regulation, banks that want to estimate risk their own way (following the “Internal Model Approach”, or IMA) must use Expected Shortfall.

In short, ES is a better estimate of potential losses than Value at Risk (VaR), the previously used indicator, but ES relies on tens of times more simulations than VaR. A large financial institution can easily generate half a terabyte of raw data for a single day, running large compute grids with thousands of cores. Indeed, a single trade in the bank can generate tens of simulation vectors, depending on liquidity horizons, risk factors, standard vs. stressed market conditions…
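
To make the difference concrete, here is a minimal sketch of the two statistics on a vector of simulated losses: VaR is a quantile of the loss distribution, while ES averages the tail at and beyond that quantile (FRTB prescribes ES at the 97.5% confidence level). The class and method names are illustrative, not ActivePivot’s API.

```java
import java.util.Arrays;

// Illustrative only: VaR vs ES on one vector of simulated losses
// (positive numbers = losses).
public class ExpectedShortfall {

    // VaR: the loss at the given quantile of the simulated distribution.
    static double valueAtRisk(double[] losses, double confidence) {
        double[] sorted = losses.clone();
        Arrays.sort(sorted);
        int index = (int) Math.ceil(confidence * sorted.length) - 1;
        return sorted[index];
    }

    // ES: the average of the losses in the tail, from the VaR quantile up.
    static double expectedShortfall(double[] losses, double confidence) {
        double[] sorted = losses.clone();
        Arrays.sort(sorted);
        int from = (int) Math.ceil(confidence * sorted.length) - 1;
        double sum = 0;
        for (int i = from; i < sorted.length; i++) sum += sorted[i];
        return sum / (sorted.length - from);
    }
}
```

Because ES averages the whole tail instead of reading a single quantile point, every scenario beyond the quantile matters, which is one reason it requires so many more simulation vectors than VaR.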

FRTB Use Case

To calculate ES, vectors must be bucketed and aggregated the right way before statistics can be applied (quantile, expectation…). This used to be done in “end-of-day” batches producing pre-canned reports, but today expectations for productivity and precision have risen and require interactive solutions with short response times.
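
A minimal sketch of that bucketing step, with illustrative names: simulation vectors are summed element-wise per bucket (a desk, a book, a risk factor…), producing one aggregated vector per bucket on which the quantile statistics can then be applied.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: element-wise aggregation of simulation vectors
// per bucket. Each vector has one entry per market scenario.
public class VectorAggregation {

    static Map<String, double[]> aggregate(String[] buckets, double[][] vectors) {
        Map<String, double[]> totals = new HashMap<>();
        for (int t = 0; t < vectors.length; t++) {
            double[] total =
                totals.computeIfAbsent(buckets[t], k -> new double[vectors[0].length]);
            // Sum scenario by scenario into the bucket's running total.
            for (int s = 0; s < vectors[t].length; s++) total[s] += vectors[t][s];
        }
        return totals;
    }
}
```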

To reach that goal, ActivePivot puts the power of in-memory computing in the hands of business users and lets them do everything on the fly, calculations as well as analytics.

ActivePivot on Azure Ms-Series

To operate our Ms-Series instance we use the Linux operating system. A look at /proc/cpuinfo tells us what’s in the system:

  • 4 processors, Intel Xeon E7-8890 v3 @ 2.50GHz, 16 cores per processor with HT
  • 4 NUMA nodes
  • 2TB of memory (about 500GB per processor)

ActivePivot on Azure

Those are fairly recent and powerful processors with the “Haswell” architecture. It’s not just a lot of memory, but also the processing power to handle it. The clock frequency of the processors is average, so the true power comes from the large number of cores: 64 physical cores, and 128 hardware threads with hyper-threading enabled. Very few workloads really take advantage of that much parallelism. To do so, ActivePivot uses work-stealing thread pools that maximize the usage of the cores, and lock-free data structures to reduce contention among threads. We’ll see how it goes.
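
As an illustration of the principle (not ActivePivot’s internal code): on the JVM, parallel streams execute on a work-stealing ForkJoinPool, so even a simple reduction is split into chunks that idle threads steal from busy ones, keeping all 128 hardware threads occupied.

```java
import java.util.stream.IntStream;

// Illustrative only: a parallel reduction on the JVM's common
// ForkJoinPool, whose workers steal pending chunks from each other.
public class ParallelAggregation {

    static double parallelSum(double[] values) {
        return IntStream.range(0, values.length)
                        .parallel()                 // split into stealable sub-ranges
                        .mapToDouble(i -> values[i])
                        .sum();
    }
}
```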

This server has a NUMA architecture: the memory chips are distributed among the processors. When a processor reads data from its local memory chips, performance is maximal, but when it accesses a remote memory chip, performance degrades. This is an important concern, as most data-intensive workloads suffer a performance drop when running on NUMA systems. ActivePivot, however, is NUMA-aware and can assign data partitions in memory to the right processor, minimizing cross-node data movements and maximizing memory bandwidth. For the benchmark we leave the NUMA configuration in its default mode (one thread pool per NUMA node).

NUMA architecture

High-throughput data loading from cloud storage

In the benchmark we load two consecutive days of simulations, to calculate Expected Shortfall and also analyze day-to-day variations. For 10M trades (12 vectors per trade on average), that’s about 1TB of raw data.

We did not want to use SSD drives attached to the Ms-Series instance: coupling storage and compute creates rigidity, and staging the data on the SSDs would add delay. Instead, we load the data directly from cloud object storage (Azure Blob Storage), where it was written by the compute grid that generated the simulations.

Object storage is a simple technology that distributes files across a cluster of storage servers. Data is accessed as a service, over HTTP. Used directly, throughput is not great: a single connection to blob storage can download data at a rate of up to a few tens of megabytes per second, but not more.

ActiveViam has developed a cloud connector that opens tens of HTTP connections to download chunks of the data in parallel, transparently reassembling the chunks in the ActivePivot server.
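
The connector itself is proprietary, but the idea behind it can be sketched: split the blob into byte ranges, fetch them concurrently with HTTP “Range” requests, and reassemble them in order. The range arithmetic looks like this (illustrative code, each range given as an inclusive `{from, to}` pair):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: compute the inclusive byte ranges used to download
// a blob in parallel chunks over HTTP "Range" requests.
public class ChunkedDownload {

    static List<long[]> ranges(long blobSize, long chunkSize) {
        List<long[]> ranges = new ArrayList<>();
        for (long from = 0; from < blobSize; from += chunkSize) {
            // Last chunk may be shorter than chunkSize.
            ranges.add(new long[]{from, Math.min(from + chunkSize, blobSize) - 1});
        }
        return ranges;
    }
}
```

Each range maps directly to one HTTP request, so tens of ranges in flight means tens of concurrent connections, which is where the aggregate throughput comes from.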

Parallel Loading

After tuning the connector we were able to load the 1TB dataset in 9 minutes, a throughput of about 2GB/s between cloud storage and a single ActivePivot instance. This is very fast, and it reveals that the Ms-Series instances come with at least 16Gbps of networking. It also demonstrates that cloud storage scales very well across multiple connections, but of course we already knew that.

ActivePivot ingests the data as it goes and only needs a couple of minutes at the end to finalize its data structures; then it’s ready for interactive analysis. The Ms-Series instance itself is up and running about 2 minutes after you request it on demand. Overall, this advanced in-memory analytics solution goes from nothing to locked and loaded in 15 minutes! It’s terabyte-scale, yet agile enough to be used on demand.
Interactive multi-core analytics in seconds

Let’s now look at analytics performance, and how fast ActivePivot performs the calculations once the data is in.

The following dashboard, which executes more than 10 FRTB-related queries on the entire dataset, loads in about 15 seconds.

FRTB dashboard

Let’s look more specifically at Expected Shortfall calculations:

First result: it takes about 6 seconds to compute the total Expected Shortfall of the bank, including all the vectors for all the trades. Not bad for a workload that used to take hours on previous-generation systems…

FRTB Benchmark on Ms-Series

ActivePivot can further accelerate those calculations by pre-aggregating the facts for a selection of dimensions. The ActivePivot engine automatically detects which queries can be answered from the materialized aggregates. In the case of Expected Shortfall, pre-aggregation at the risk factor level is very efficient: it makes all the top-level queries return in a split second.
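
The principle can be sketched as follows (this is an illustration, not ActivePivot’s API): one summed vector is materialized per risk factor, so a top-level query only combines a handful of vectors instead of re-scanning millions of trade-level facts.

```java
import java.util.Map;

// Illustrative only: answer a top-level query from materialized
// per-risk-factor vectors instead of the raw trade-level facts.
public class PreAggregation {

    // Combine the few materialized vectors into the top-level vector,
    // on which the quantile statistic is then applied.
    static double[] topLevel(Map<String, double[]> perRiskFactor) {
        double[] total = null;
        for (double[] v : perRiskFactor.values()) {
            if (total == null) total = v.clone();
            else for (int s = 0; s < v.length; s++) total[s] += v[s];
        }
        return total;
    }
}
```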

Here are various queries benchmarked with and without pre-aggregation:

Query | From raw data | With pre-aggregation
Top-level expected shortfall | 4.6s | 473ms
Expected shortfall for each desk and for each book | 12s | 2s
Expected shortfall for one single book | 1s | 70ms
Day-to-day variation of expected shortfall for one book | 1.2s | 70ms

The right technology for the job, right now

Conclusion: a single Ms-Series instance powered by ActiveViam delivers interactive “Internal Model” FRTB analytics on millions of trades. This is the most advanced market risk analytics platform, capable of calculating Value at Risk or Expected Shortfall on the fly, at any level of detail. Furthermore, by leveraging the power of public cloud infrastructures, it only takes 15 minutes to get it up and running.

Clearly this is a new perspective on cloud computing. Public clouds started with renting commodity hardware by the dozen, but now they offer SSDs, GPUs, hyper-fast networking, large memory servers, etc. In a reversal of the existing trend, hardware in the public cloud has become more powerful, more recent, and more varied than in the enterprise.

It means that each workload can benefit from the right hardware, and this gives a clear advantage to “best-of-breed” solutions that select the best technology for each task and operate each component optimally: solutions that spin up a 1,000-node compute grid with GPUs, store the results in cloud storage, and put them in the hands of business users who analyse them interactively with in-memory computing.