Relational and multidimensional databases differ on almost any possible dimension: tables, columns and rows vs. cubes, measures and dimensions; queries across joint tables vs. pre-calculated aggregations across dimensions; Structured Query Language (SQL) vs. Multidimensional Expressions MDX. And the list of goes on and on.
Nevertheless, talk to a vendor from any of these two camps, and he will argue that the system can successfully perform any task. Cross-dimensional analytics, for instance, can be performed using both types of systems.
The Rebirth Of Multidimensional Analytics – Extract of the ActivePivot User Group 2013
In most cases, it’s best to leave the technological evaluations aside, and start with a close examination of your needs. Who are the personas that will analyze data and for what purpose? What will be the depth of the analysis? How often will analysis be run? How quickly should results be displayed? These are just some of the question you should ask yourself.
Here are some tips on the issues to be considered when evaluating your options – so that you can answer the most important question – ‘which system is best for you?’
What is the depth of your analysis?
One of the fundamental questions to begin with is the depth and complexity of the analysis. Will your analysis involve only a few static dimensions, like in “the profitability for the third quarter across the Northeast region for the XYZ product line?” Or alternatively, will queries potentially involve dozens or even hundreds of dimensions? An e-commerce business may have multiple online stores, each offering thousands of products (toys, books, IT etc…). Each online store has various attributes which you want to cross-join with information about inventory, competition pricing, and traffic information in order for you to analyse data in a relevant manner. This could easily end up with hundreds of dimensions.
If the planned business analytics will only involve a few dimensions and simple logic, you may not need more than a relational database solution. If this is not the case, then you should consider multidimensional DBMS systems.
Another aspect is the involved business logic. Does your analysis require straight-forward computations, such as summing up numbers or calculating averages, or does it involve more complex logic with matrices, statistical computations or predictive measures? Do you need to perform ‘what-if’ analysis to evaluate alternative scenarios prior to help you make business decisions? For such complex analysis you’d want an OLAP calculated measure type of functionality, so that new dimensions are taken into account automatically without having to write new queries, configure dashboards and reports.
How important is performance?
When analysis is performed for reporting purposes – for example, at the end of each business quarter – speed may not be an issue. Analysis can run overnight with results displayed the next morning. But if analysis is critical for timely business decisions, performance is crucial. When a financial institution evaluates the risk of different trading positions, decisions must be made immediately.
Evaluate, therefore, the performance needs of your analysis and consider the volume of data. OLAP solutions, for example, provide an adequate solution for analysis of a small number of dimensions. However, when attempting to handle more intensive analysis that involves many dimensions, legacy OLAP systems tend to collapse due to their “curse of dimensionality.” Modern OLAP systems use In-memory computing technology to address such challenges and enable limitless data navigation at high speed.
How dynamic is the data?
Depending on your industry segment and line of business, data freshness may be important. Ask yourself how often does data change? Should analysis be run on fresh data, or maybe it does not matter. For example, it may be acceptable to run reports for the accounting department on a weekly basis. Conversely it is not acceptable for companies in the likes of Fedex to run analytics based on where transported parcels were located yesterday. They need that information in real-time in order to be able to react to any disruption without impacting delivery timelines.
In the past, a very clear distinction used to be made between OLTP and OLAP systems. OLTP meant transaction centric, and OLAP analytics centric. Big Data disrupted the clear-cut frontier between these two approaches. Nowadays mixed workload systems enable addressing both needs. A significant advantage of mixed workload is that it removes the latency between the transactional environment where data is created, and the analytical environment where data is explored. As a result, you can perform queries that are executed in split seconds on data that gets refreshed on the fly.
We’ll cover mixed workload in more depth in our next post.
By no means is the above an all-inclusive list of evaluation questions. For each of these topics, you should dig much deeper to map and prioritize your needs prior to selecting a database solution.