Tag Archives: mdx

Edit Pentaho Mondrian Models Inline in your Browser

23 Jul

Our friends at Ivy Software (http://www.ivy-is.co.uk/ivy-labs/ivy-software/) have updated one my favorite community marketplace tools available to Pentaho customers called Ivy Schema Editor. This is a very simple tool that is very powerful in that you can modify and edit your Mondrian semantic business models right in-line in your browser from the Pentaho User Console … Great job, guys!

ivy2 ivy1

I can now create new models inline and test the model through Analyzer in one place. To me, for anyone building an interactive BI solution with Pentaho, this seems like a must-have tool.


Getting Started with Mondrian For SSAS Developers

22 Oct

Over on my MSSQLDUDE blog, I showed you how the Pentaho BI Suite can auto-model and create OLAP analytics for you using the Mondrian ROLAP engine with SQL Server Adventure Works DW as the source database here and here. That is basically asking the Pentaho Enterprise Edition Suite (download an eval copy to get started here) to build the cube models for  you based on a series of automation and assumptions.

But if you, like me, have a bunch of SQL Server Analysis Server (SSAS) cubes that you want to move over to Pentaho or just use with the Mondrian ROLAP engine, it’s pretty straightforward to migrate those models over to Mondrian. You can then use the entire Pentaho Suite and visualizations or use Mondrian as an XMLA server, or even use Mondrian as a lightweight embedded MDX-based OLAP engine in your app. Take a look here at Pentaho’s Mondrian project home page for more info on the Mondrian ROLAP engine.

So, here I am going to walk you through the process of migrating over to the Pentaho Mondrian platform from SSAS … At least, the way that I did it! First here are a few differences in the engines to keep in mind as you migrate:

1. SSAS offers 3 storage options for your OLAP database: MOLAP, ROLAP & HOLAP. Mondrian is a pure-play relational OLAP or ROLAP engine, converting MDX queries from sources like XMLA client tools, to SQL queries against the source relational engine. In this example, I will keep the source RDMS database as-is in SQL Server and move my model over to Mondrian.

2. I am going to show you how to migrate just a base model, not the infrastructure of the server, which is fairly different from SSAS to Mondrian. This means we’ll ignore things like partitioning, pre-built aggregations structures, roles, KPIs and translations.

3. Mondrian has a very good aggregation capability and caching engine, similar to that of Microsoft SSAS. But since I do not migrate these in this approach, just take a look at this documentation on how to re-establish caching and aggregation tables in Mondrian here & here.  Keep in mind that Mondrian is pure ROLAP, so aggregation tables will live in the source database, not in a separate storage engine like the traditional SSAS MOLAP approach. I’ll save those more advanced topics for a later follow-up post.

As we migrate your SSAS cubes over to Mondrian land, make sure to note these conversions in terminology and options:

1. Data types

I convert the data types of measures from SSAS to Mondrian where you will find differences most notably in the numeric types that you will need to switch from the different SSAS number types to the “numeric” type in Mondrian.

2. Level types

Mondrian organizes dimensions into hierarchies with levels that represent the underlying dimension table columns. In SSAS, these are essentially equivalent to Dimension Attributes. So I convert the SSAS cube dimension attributes to Mondrian levels and convert the dimension attribute types to Mondrian level types. In Mondrian, you will see either regular or a varying set of time types. When migrating from SSAS, you need to modify from the different date types and make a decision on how you would want to handle the currency and geography types from SSAS.

3. Aggregation types

SSAS has about 12 or so aggregation functions for measures. In Mondrian, you will need to boil those down to sum, count, min, max, avg, count or distinct count. What I did was to remodel or remove unused measures that are of type Last / First Child, LastNonEmpty and any dimensions in a M:M relationship.

Many of the rest of the attributes and properties in SSAS are natural fits directly into the Mondrian schema such as format strings, visibility, calculated members, name columns, hide member if, etc.

So those are the basics. The pieces of SSAS that I do not script or automatically convert are the MDX scripts where you define named sets, calculations and scope inside of the SSAS cube. Those are MDX free-form scripts which you can rewrite in Mondrian as a Named Set or calculated measure. However, if you are using a lot of many-to-many or scope operators in your SSAS schema, you will need to find different ways of doing those in Mondrian 3.x (from the Pentaho 4.8 or 4.0 Suite) because they are not supported currently.

Now, if you have a fairly simple straight-forward cube in SSAS, you can script or automate a lot of this like I mentioned above, which can help a lot with the mundane rewriting of XML or using the Pentaho Schema Workbench to reconstitute and entire OLAP schema.

I wrote one in C# so that I can run it from the SSAS Windows server or from my laptop and it will perform all of those conversions and assumptions that I just listed including converting Measure Groups to Cubes in Mondrian and then generating Virtual Cubes to bring together measures from different fact tables. This is the familiar Virtual Cube concept that Mondrian 3.x utilizes. Again, you can do this all by hand or use a tool like this one.

Here is my beginning SSAS 2012 MOLAP (not Tabular) schema against Adventure Works DW 2012:


I’m just gonna build, deploy and browse that cube by slicing the Unit Cost by my Calendar Hierarchy:


Next, I’ll set the base schema in Mondrian by using my SSAS2MONDRIAN command-line converter tool:


All that the switches are doing on the command line are setting the source database & cube, telling the converter to just skip M:M relationships, including the SQL Server alias name in the XML output and then redirect the output to a Mondrian XML schema file. It will collect and iterate through all of the measure groups and dimensions, converting them to Mondrian, converting the similar types to the Mondrian types, converting the measure expression syntax in SSAS to Mondrian Calculated Measures and building a virtual cube, which is named on the command line as well. There are a few other behind-the-scenes migrations occurring, all of which you can hand-crank if you’d prefer. This is definitely a time saver and also helps to avoid human error of hand-editing XML.


Alright, so the migration is done, I’ve opened the output XML file in the Pentaho Schema Workbench (PSW) to & validate the Mondrian file and it looks good, no red X marks. I like to test my models with the MDX query editor, which also performs an additional level of schema file validation. When you go to open the MDX query editor in PSW, look in your standard error output. If you use this utility, or any other method of hand-generating Mondrian schema XML, you may need to tweak it a little bit before it will validate.


The MDX editor is showing the results that I am looking for, so I feel good from my simple quick & dirty smoke test. I can now publish the schema to the Pentaho platform and have fun with my new live Mondrian cube running in Pentaho, converted as-is from SSAS … you will see the name of the cubes (measure groups) as well as the grouped virtual cube:


And there is my slicing & dicing in the Pentaho OLAP Analyzer tool against SQL Server from Mondrian, just as it was in SSAS:


BTW, if you would like a copy of that Windows command-line program to automate the base model conversion, just send me an email: mkromer at pentaho dot com or download a copy of the executable here. You don’t need to use that, although it does simplify quite a bit of the tedious migration tasks. But the Mondrian XML schema is very straight forward and easy to understand and the Pentaho Schema Workbench is a nice tool to work you through adding in the rest of the pieces.

Look for follow-ups on my blog here to continue using Mondrian, especially for SSAS users and for BI developers who are new to tools like Pentaho.

More MDX Fun on NoSQL Databases

2 Oct

Let’s use Pentaho’s Business Analytics Suite one step further to do a few more interesting things with MDX OLAP models that we’ve put on top of MongoDB and Cassandra data stores.

Check over at those links above so that you can follow along with the Pentaho tools. I’m going to start from where I left off with an in-memory OLAP model built using the Instaview tool and the Mondrian MDX engine.

Just as a quick reminder, I will use the set-up from the Integration portion of Instaview for MongoDB using this JSON Query Expression:

{ “$query” : { “url” : “–firstpage–” }, “$orderby” : { “Count” : -1 } }

And I’m selecting just the Key, URL, NextURL and Count fields for this simple demonstration. One of the first changes that you can make to expand your OLAP modeling and MDX capabilities on MongoDB is to choose “Model” from the Pentaho Instaview UI and add measures, dimensions, properties, formatting and your own hierarchies to the model. This Model represents the logical model that your MDX queries will utilize from the in-memory layer when querying MongoDB.


Back in Instaview, you will navigate to the Visualize screen so that we can issue MDX queries to our MongoDB data. Click on the Administration button at the top and select “MDX”. Here will be presented with a blank MDX query box. This is essentially where we were running a few test MDX queries from our first model. In this demo, let’s try a few more interesting ways to interrogate and slice the Mongo data:

1. Let’s order the output of Next URLs by the most frequently hit pages, regardless of the order of the pages in the NextURL dimension. In MDX, we do this with the ORDER function and use BDESC to order the data Descending with a “B” for “Breaking Hierarchy” so that the report user can see the data in order of actual counts, not based on the order in the hierarchy:

SELECT [Measures].[Count] ON 0, ORDER([NextURL].Members,[Measures].[Count], BDESC) ON 1 FROM [Mongo2]

And as we saw before, Pentaho Analyzer will give you the results in an HTML Web page


2. It should be pretty clear now that the normal dimensional navigating of member data is available to you from these NoSQL sources, just like MDX against RDBMS sources. This is how you access a count from a single specific page or dimension member … Just put the member name in square brackets [ ] and refer to the Dimension name first:

SELECT [Measures].[Count] ON 0, [Nexturl].[/careers] ON 1 FROM [Mongo2]

Or use relative MDX functions which really begins to exploit the power we enjoy in MDX land such as FirstChild, Sibling, LastChild, etc:

SELECT [Measures].[Count] ON 0, [Nexturl].FirstChild ON 1 FROM [Mongo2]

SELECT [Measures].[Count] ON 0, [Nexturl].[/about].Sibling ON 1 FROM [Mongo2]

Results in those cases are single-row member calculations:
[Nexturl].[/about]    504
3. OK, I’m almost done … let’s try 1 more where I will now use the powerful FILTER function in MDX so that my sorted list of counts only shows counts > 1000

SELECT [Measures].[Count] ON 0,
ORDER ([NextURL].Members,[Measures].[Count], BDESC) ,
([Measures].[Count] > 1000) )
ON 1
FROM [Mongo2]


MDX on MongoDB

30 Sep

I had a few queries following my posting on MDX for Cassandra using Pentaho’s Big Data Analytics, namely about if this same capability was available on other NoSQL platforms, not just Cassandra. Simply put, the answer is yes. And I’ll give you a really short simple demo of Pentaho’s MDX Olap for MongoDB.

First, let me just point out that MongoDB is a very important NoSQL / Big Data partner for Pentaho as you can read about here. So you’ll see a lot of tight integration and built-in capabilities with MongoDB in the Pentaho Suite.

As I did in the Cassandra demo earlier, I’ll use Pentaho’s Instaview to build out a quick & easy template, but selecting the MongoDB template this time, instead of Cassandra.


This will direct Pentaho to generate a bunch of automated routines in the background through the PDI / Kettle tranformation to the Mondrian OLAP engine. Your end result will be an OLAP model for sending MDX queries against your MongoDB data, using an embedded in-memory model. First, tell PDI (through Instaview’s steps) where your MongoDB data store is located and give it a JSON query to pull the data that you want to use for your OLAP BI application:


In my simple demo, I am going to use the PageSuccessions sample data set, which you can find more about here. I really like this data set because I’ve worked with clickstream analytics in the past and Web log data stored in a NoSQL data store is a key, common NoSQL and Big Data use case, so it’s a good representative data set.

What I really like about the way that Pentaho’s tools work with MongoDB is the way that it queries the data store’s metadata to fill in the database and collection information from the source.  I wrote up a bit more about it here at this posting on KromerBigData about MongoDB with Pentaho tools.

When you are done with those steps, you will now have an auto-generated OLAP model for MDX querying. It is very simple in this case and you will want to modify the measure and dimensional properties to fit your reporting needs.


Time to start slicing & dicing … I can drag fields from the OLAP model onto the design surface to build a pivot table or use a data visualization option.


That data visualization tool is Pentaho’s Analyzer and is sending MDX queries to the in-memory data store from MongoDB’s data, which means that we can now use it to query MongoDB manually using MDX.

Go to Administration -> MDX in Analyzer and in the subsequent query box, you can use my demo MDX against PageSuccessions which will slice the counted hits per page and order it by total hits descending (DESC):

SELECT [Measures].[Count] ON 0, ORDER ([nextUrl].Members,[Measures].[Count], DESC) ON 1 FROM [Untitled 10]

The cube name “Untitled 10” is just a placeholder given to the auto-generated OLAP logical model that Instaview creates and is generated by Mondrian.

The results look like this:


And that’s it. Not too much different than my earlier Cassandra MDX OLAP demo using Pentaho. Just a slight difference in setting up the data source. The querying, reporting and model generation are all essentially the same otherwise.

Enjoy!! Best, Mark


Microsoft SQL/BI and other bits and pieces


Current & Breaking News | National & World Updates

Tech Ramblings

My Thoughts on Software

SQL Authority with Pinal Dave

SQL Server Performance Tuning Expert

Insight Extractor - Blog

Paras Doshi's Blog on Analytics, Data Science & Business Intelligence.

The SQL Herald

Databases et al...

Chris Webb's BI Blog

Microsoft Analysis Services, MDX, DAX, Power Pivot, Power Query and Power BI

Bill on BI

Info about Business Analytics and Pentaho

Big Data Analytics

Occasional observations from a vet of many database, Big Data and BI battles

Blog Home for MSSQLDUDE

The life of a data geek