Did Big Data Kill OLAP Cubes?

Did Big Data Kill OLAP Cubes? Not yet, but very possibly soon.

Think about the traditional usage and purpose of OLAP cubes in terms of their predominate deployment today. In most cases, enterprises are using cubes to aggregate data and pre-process data from multiple data source and/or a data warehouse to provide BI capabilities.

Many of these use cases are based upon data processing cycles that occur daily with large sets of data in bulk fashion. Well, that sounds quite a bit like Big Data requirements of processing large data sets in bulk fashion and then providing access to that post-processed data to analysts, scientists, etc.

So there is still clearly a correlation and applicability of OLAP cubes in the Big Data world.

OLAP cubes provide value in a number of ways, including abstracting report queries away from the database and providing fast access to knowledge through techniques that include pre-aggregated, pre-built analytics in the cube. This is where we start to breakdown in terms of the future of OLAP cubes in Big Data use cases.

In Big Data use cases, we need to provide much more ad-hoc, data exploration and knowledge self-discovery. This makes building the analytics in the cube based on requirements and assumptions very difficult. Even in the most “Agile” BI shops, this is a challenge.

This is where in-memory technologies, MPP and columnar databases become key enablers in the BI stack for Big Data. I’m writing a few new posts for SQL Server Pro mag and MSSQLDUDE that I’ll link to here to explain this in more technical terms over the next few days. Back here in Big Data Analytics, I’ll talk about generic MPP techniques.

For now, be prepared to hear the BI and database industry talk about maximizing in-memory cubes & databases for BI & reporting purposes, replacing OLAP cubes.

This does NOT preclude the need for semantic modeling and abstraction layers. And OLAP cubes still play a very important role in specific use cases that do not require large sets of ad-hoc query requirements.

However, Big Data architects do need to think about solving the traditional BI problems in a different way.

Advertisement

3 comments

  1. […] I guess this means that the answer to my question my last year is “No”. In all fairness to myself, I guess … I was really referring to MPP in-analtyics databases and Hadoop MR with Mahout where analytics are “in-engine” instead of forming logical in-memory OLAP models. There, that’s out of the way. The intention of those systems is to store large data sets AND allow analytics directly against those distributed data stores. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s