Enterprise Security in Big Data Analytics

12 Aug

Here is a look at some of the plumbing involved in Big Data Analytics (BDA) solutions. When architecting a BDA solution, you cannot ignore its production and operational requirements. Getting your analytics to work against Big Data sources in a proof of concept, trial or laptop environment is a great first step and a very important phase of your project. It will help you get the executive-level buy-in you need to secure funding and approval to move forward.

Now, as you move into operationalizing a production-ready version of your BDA solution for your enterprise, you will find a mixed bag of available security solutions to harden the data layer of a Big Data Analytics solution. Here is a look at some of the options:

  1. Hadoop as a source
    This is, in my opinion, the most complicated aspect of your BDA solution. You may already be providing security on reports, analytical models and portals, but for the data in Hadoop itself, the traditional method of relying on Linux file permissions may be insufficient for your enterprise IT requirements, auditing and standards. Here are 3 options to look at:
    a. Secure access to the data stored in HDFS with Kerberos authentication, or protect the data itself with a 3rd party security provider such as Voltage Security or Protegrity.
    b. If you are using Cloudera, it now offers its own security layer on Hadoop called Sentry, which provides role-based authorization and is aimed at compliance with regulations including SOX, HIPAA and PCI.
    c. If you are using Hortonworks, the company is steering users toward the Apache project Knox: http://hortonworks.com/blog/introducing-knox-hadoop-security/. Knox is a gateway that sits between the client accessing the data in HDFS and your Hadoop cluster.
  2. MPP databases as source
    If your organization has invested in MPP databases (Teradata, PDW, Vertica, etc.) and you store your data there, you get the added benefit of data security, auditing, authentication, etc. from the database layer.
  3. OLAP as source
    If your analytics will be built on an OLAP engine (SAS, SSAS, Mondrian, etc.), then you can secure the data at this layer with ACLs and roles. However, if you allow detailed reporting off of the source data, below the OLAP layer in your solution, then you still need to secure the data layer for those BDA solutions.
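
To make the gateway pattern from option 1c a bit more concrete, here is a minimal Python sketch of how a client might address HDFS through Knox rather than talking to the cluster directly. It only builds the request URL and does not contact any server. The host name, port 8443, the "gateway" path segment and the "default" topology name are assumptions for illustration, and the helper knox_webhdfs_url is hypothetical; check your own Knox configuration for the real values.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, op="LISTSTATUS",
                     port=8443, gateway_path="gateway", topology="default"):
    """Build the WebHDFS URL a client would call on the Knox gateway.

    Hypothetical helper: port, gateway_path and topology are assumed
    defaults and must match your own gateway configuration.
    """
    # WebHDFS paths are appended after /webhdfs/v1/, so drop the leading slash.
    clean_path = quote(hdfs_path.lstrip("/"))
    query = urlencode({"op": op})
    return (f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1/{clean_path}?{query}")


# Example: list a directory through the gateway (host name is made up).
url = knox_webhdfs_url("knox.example.com", "/data/sales")
print(url)
```

The point of the sketch is that the client only ever sees the gateway address, so authentication and auditing can be enforced in one place in front of the cluster.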