As promised, here is a drill-down into one important use case where Big Data technologies become strategic for business: online (or interactive) marketing. Brand management, online advertising, marketing campaign analysis and social brand sentiment analysis (e.g. Twitter, Pinterest, Facebook) can become critical strategic advantages for a business, and they may require Big Data approaches such as Hadoop, MPP, MapReduce and in-memory analytics.
The reasons that the online marketing techniques listed above require a Big Data approach include:
- The data coming from social media, search engines, Web page tags (from online ad servers) and Web server logs is extremely large in volume, chatty and granular (i.e. event based)
- Those sources include a lot of “unstructured” data, such as logs and “extended data” tags, formatted in ways that make traditional data warehouse ETL very difficult
- Some of those sources may also include rich media that require specialized filters and adapters to search
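To make the second point more concrete, here is a minimal Python sketch of why such tags resist fixed-schema ETL: each event can carry a different set of keys. The tag URLs, the parameter names and the `ext` packing format (`key=value` pairs separated by `;`) are all assumptions invented for illustration, not the format of any particular ad server.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical ad-server tag requests; the host and parameter names are invented.
TAG_REQUESTS = [
    "http://ads.example.com/tag?campaign=spring_sale&placement=banner"
    "&ext=color%3Dred%3Bsize%3D300x250",
    "http://ads.example.com/tag?campaign=spring_sale&ext=geo%3DUS",
]


def flatten_tag(url):
    """Turn one tag request into a flat dict, expanding the 'ext' blob."""
    params = parse_qs(urlsplit(url).query)
    # Keep the first value of every ordinary parameter.
    record = {key: values[0] for key, values in params.items() if key != "ext"}
    # 'ext' packs extra key=value pairs separated by ';' (assumed format).
    for pair in params.get("ext", [""])[0].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            record["ext_" + key] = value
    return record


records = [flatten_tag(url) for url in TAG_REQUESTS]
for record in records:
    print(record)
```

The two records end up with different columns, which is exactly what makes a fixed warehouse schema awkward for this kind of source.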
Tools like Hadoop can help by storing the raw files in HDFS or Hive and running MapReduce jobs or queries against them to produce parsed results, which can then be loaded into a data warehouse for interactive querying by analysts. That extra parsing step with MapReduce makes data from those sources available for search engine optimization, marketing campaign analysis and sentiment analysis, which are just not possible with traditional BI and DW environments alone.
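As a rough illustration of that parsing step, the sketch below mimics the map and reduce phases in plain Python rather than on a real Hadoop cluster. It assumes Web server logs in Common Log Format and a `utm_campaign` query parameter on landing-page URLs. The sample lines and function names are invented, and a real job would read its input from HDFS and write its output to a warehouse staging table.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

# Invented sample lines in Common Log Format; a real job would read from HDFS.
RAW_LOG_LINES = [
    '203.0.113.5 - - [10/Oct/2012:13:55:36 +0000] '
    '"GET /landing?utm_campaign=spring_sale HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2012:13:56:01 +0000] '
    '"GET /landing?utm_campaign=spring_sale HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2012:13:57:12 +0000] '
    '"GET /landing?utm_campaign=newsletter HTTP/1.1" 200 1810',
    'malformed line that the mapper should skip',
]

# Captures the request path (with query string) from the quoted request field.
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def map_line(line):
    """Map step: emit (campaign, 1) for each parseable request line."""
    match = REQUEST_RE.search(line)
    if not match:
        return  # unparseable lines produce no output
    query = parse_qs(urlsplit(match.group(1)).query)
    for campaign in query.get("utm_campaign", []):
        yield campaign, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each campaign key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def run_job(lines):
    """Chain the map and reduce steps over an iterable of raw log lines."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)


campaign_hits = run_job(RAW_LOG_LINES)
print(campaign_hits)
```

The output is a small table of hits per campaign, which is the kind of parsed result that fits a conventional data warehouse far more easily than the raw log lines do.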
It’s important to keep in mind that many estimates put the share of an organization’s data assets available in a traditional DW at only around 10%. Adding these important data sources is very challenging, but much more feasible with Big Data technologies, and doing so can create a big strategic advantage for your business.