This article presents a performance report of Atoti+ technology applied to a realistic IFRS 9 use case. If you want to see how you can easily build such a project using Atoti+’s Python API, head over to our Python-dedicated website atoti.io and check out these articles:
To go deeper on the topic, you can also download our eBook on IFRS 9.
IFRS 9 introduced various pain points to banks with large loan portfolios, one of them being the challenge posed by massive data volumes. In particular, the following aspects make the situation even more challenging and make it a true “Big Data” problem:
- Large lenders’s portfolios contain tens of millions of data points
- Portfolios with long-dated loans and associated financial products generate larger data volumes
- IFRS 9 requires tracking potential loan defaults over a 12-month period, which adds even more data points
In most cases, industry practitioners want to perform ECL variation explanation and vintage analysis. We will demonstrate how fast Atoti+ can aggregate and visualize a vintage matrix, perform an ECL aggregation from LGD, PD and EAD, and ECL variance explanation in both the absolute and relative sense.
For this test, we use Atoti+, the enterprise version of our technology, as opposed to the atoti community version. While atoti is great for prototyping and testing an IFRS 9 project, it has restrictions on data volumes that Atoti+ does not, which is obviously essential for a production project and for this performance test.
To simulate real-word data challenges, we created a dataset (123 GiB) based on a real consumer loan portfolio. Essentially it covers (for each reporting date):
- A 12-month period
- 4 legal entities (France, Germany, Italy, UK)
- 5 Segments (Credit Cards, Durables, Personal Loans, Automobile, Mortgages)
- 20 million clients
- 2 client types (Corporate, Individual)
- 40 million contracts
Here are the data models we used and a chart to illustrate contract distribution across different reporting dates, legal entities and segments:
- AWS EC2 Instance (r5.8xlarge) with 32vCPUs&256 GiB of Memory to deal with a half dataset
- AWS EC2 Instance (r5.16xlarge) with 64vCPUs&512 GiB of Memory to deal with a full dataset
We’ve prepared a set of dashboards above that help solve for IFRS 9 but the following relate to the most important use cases.
Day-to-day (DtD) Variation Monitor
Let’s take the EAD Analysis Dashboard as an example. The EAD evolution screen finishes data load in 1.55 seconds with EAD from 40 million contracts aggregated for each reporting date, legal entity and segment.
Expected Credit Loss (ECL) Variation Explanation
This sequence starts from an ECL Overview Dashboard that depicts the ECL and the month-over-month ECL variation breakdown by segments, stages and with a “new contract” flag. It takes 1.36 seconds to finish an on-the-fly ECL calculation at the contract level, the aggregation of 40 million contracts and then display the results.
Next we can right-click on any figure in Overview & Segments to drill down to Client-level ECL variation and explainers — the Probability of Default (PD) explainer, the Loss Given Default (LGD) explainer and the Exposure at Default (EAD) explainer. It takes a bit longer for this screen to pop out (1min 50s) as it fetches 100 (those with the highest ECL DtD variation) out of 20 million client-level ECL and explainers.
Finally we drill down to Contract Level ECL variation and explainers. This page displays in only 0.36 seconds.
Vintage Analysis is popular in Credit Risk Management.
This shows the number of cases with an EAD of more than 30 days past due per reporting index and opening date. The graph on the right illustrates the same in a relative sense. It takes 5.4 seconds to do a days past due calculation in a contract level (40 million) and to filter out cases with more than 30 days past due.
Our solution provides several different forms of vintage matrix. For instance, this one shows the cumulative percent of contracts with more than 30 days past due against months that have completed since the origination date. The filter of bad contracts and cumulative percent calculation against all contracts is accomplished in 1.06 seconds.
Query Time Spent
To prove the scalability of Atoti+, we first compare the impact of using 32 vCPUs vs. 64 vCPUs with half the dataset.
Here is the result: query time columns show the time spent for each dashboard to be loaded for display; the performance enhancement column represents the amount of time (relatively) saved after switching from a 32 vCPUs to a 64 vCPUs machine.
In most cases, the 64 vCPUs machine produces a significant query time improvement – some of them even achieve an almost 50% improvement. In the cases where the difference is not much e.g. 03_ECL_Analysis_Drill_to_Contracts, it’s because the query time is extremely short; i.e. less than 300 milliseconds in this case.
CPU usage is similar across different tests. Here is an example from the test with a 32 vCPUs machine against the half dataset. Clearly we are able to hit a very high range (80% to 100% ) in most cases.
40 Million Contracts
Now we present the dashboards query time when the full dataset is loaded.
The columns query time time for full/half indicates the time spent for each query with full/half dataset. The performance diff column shows the relative additional query time spent compared to the one with the half-dataset.
The results show the query time doesn’t double. Moreover, we even achieved a faster query in one Vintage analysis dashboard (07_Vintage_Matrix_percent_of_Contracts).
What makes it fast
The main factors are the common technologies leveraged in our in-memory aggregation and calculation framework e.g. Parallel Partition, Parallel Bitmap Aggregation.
In addition, a lot of research and development has gone into tailoring Atoti+ to perform risk data aggregation more efficiently. For instance, the “sum_product” function is one area of performance improvement we recently achieved.
Here is an example of the Atoti+ “sum_product” aggregation function for an on-the-fly ECL calculation implementation:
ecl_stage_1 = tt.agg.sum_product(contracts_store[‘LGD’], contracts_store[‘EAD’], contracts_store[‘PD12’])
ecl_stage_2 = tt.agg.sum_product(contracts_store[‘LGD’], contracts_store[‘EAD’], contracts_store[‘PDLT’])
ecl_stage_3 = tt.agg.sum_product(contracts_store[‘LGD’], contracts_store[‘EAD’])
m[‘ECL’] = tt.filter(ecl_stage_1, l[‘Stage’] == 1) + tt.filter(ecl_stage_2, l[‘Stage’] == 2) + tt.filter(ecl_stage_3, l[‘Stage’] == 3)
We picked it not just because it’s included in this case, but also because it’s widely used in risk aggregation such as SIMM, the FRTB standardized approach and the CVA risk-based capital charge, among other financial regulations.
These results show that Atoti+ is one of very few solutions capable of handling this specific “Big Data” challenge of monitoring IFRS 9 on large consumer loan portfolios. The popular vintage analysis dashboard, for instance, is displayed almost instantly in Atoti+.
Even in this simplified use case, it proves Atoti+’s ability to scale up to handle a portfolio of hundreds of millions of contracts.
Furthermore, Atoti+’s Python API makes it possible to build a comprehensive, fully-functional and high-performance IFRS 9 business application within hours rather than weeks, especially if you use our existing guides as a starting point.