OLAP and data mining: What’s the difference?

Online analytical processing (OLAP) and data mining are often treated as synonyms. Here we attempt to single out what each term entails.

There is widespread confusion over the use of the terms OLAP and data mining. We take a closer look at what they mean, their functions in an organization, and how different they are from one another.

Defining OLAP and data mining

OLAP is a design paradigm, a way to seek information out of the physical data store. OLAP is all about summarization. It aggregates information from multiple systems and stores it in a multi-dimensional format. The underlying schema could be a star schema, a snowflake schema or a hybrid of the two.
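The aggregation described above can be illustrated with a minimal sketch, assuming a tiny star schema held in memory. The table names, keys and sales figures are all hypothetical and exist only for illustration; a real OLAP engine would do this work against a data warehouse.

```python
from collections import defaultdict

# Hypothetical dimension tables of a star schema (names and values are illustrative).
dim_region = {1: "North", 2: "South"}
dim_date = {10: 2010, 11: 2011}

# Hypothetical fact table: (region_key, date_key, sales_amount).
fact_sales = [
    (1, 10, 100.0),
    (1, 10, 300.0),
    (1, 11, 250.0),
    (2, 10, 400.0),
    (2, 11, 150.0),
    (2, 11, 350.0),
]

def roll_up(facts):
    """Aggregate fact rows into (region, year) cells with total, count and average."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region_key, date_key, amount in facts:
        # Resolve the foreign keys against the dimension tables.
        cell = (dim_region[region_key], dim_date[date_key])
        totals[cell] += amount
        counts[cell] += 1
    return {
        cell: {
            "total": totals[cell],
            "count": counts[cell],
            "average": totals[cell] / counts[cell],
        }
        for cell in totals
    }

cube = roll_up(fact_sales)
for cell in sorted(cube):
    print(cell, cube[cell])
```

Each cell of the resulting dictionary is one summarized point in the multi-dimensional store, keyed by its coordinates on the region and year dimensions.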

Data mining leverages information from within and outside the organization to help answer business questions. It involves ratios and algorithms such as decision trees, nearest neighbor classification and neural networks, along with clustering of data.

Why are OLAP and data mining considered synonymous?

OLAP and data mining are considered the same because of the perception people hold of their function. To add to the ambiguity, both terms fall under the business intelligence (BI) umbrella. Vendors complicate matters further by offering data mining solutions at the database level. Data mining was once considered a skillfully built statistical solution, but as a result of mergers and acquisitions, specialized tools are now available for predictive purposes.

OLAP has always been prevalent: it is easy to build and use, and is therefore used extensively. As data mines became easier to use and more widely available, the two terms began to be used synonymously.

Functions of OLAP and data mining

  • OLAP and data mining are used to solve different kinds of analytical problems. OLAP summarizes data and makes forecasts. For example, it answers operational questions like “What are the average sales of cars, by region and by year?”
  • Data mining discovers hidden patterns in data and operates at a detailed level instead of a summary level. For instance, in the telecom industry, where customer churn is a key factor, data mining would answer questions like “Who is likely to shift service providers, and what are the reasons for that?”
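To make the second kind of question concrete, here is a minimal sketch of nearest neighbor classification applied to churn. The customer features, the records and the function name `predict_churn` are hypothetical, chosen only for illustration. The sketch addresses the "who" part of the question; explaining the reasons would usually call for a more interpretable model such as a decision tree.

```python
import math
from collections import Counter

# Hypothetical customer history:
# (monthly_bill, support_calls_last_quarter) -> did the customer churn?
history = [
    ((20.0, 0), False),
    ((25.0, 1), False),
    ((30.0, 1), False),
    ((70.0, 5), True),
    ((80.0, 6), True),
    ((75.0, 4), True),
]

def predict_churn(customer, records, k=3):
    """Classify a customer by majority vote among the k nearest known customers.

    Distances are plain Euclidean distances on unscaled features, which is an
    assumption made for brevity; real data would normally be normalized first.
    """
    nearest = sorted(records, key=lambda r: math.dist(customer, r[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Score two individual customers, one record at a time.
print(predict_churn((78.0, 5), history))
print(predict_churn((22.0, 0), history))
```

Note that the prediction is made per customer from detailed records, whereas an OLAP query over the same data would return only summary figures such as an average per region.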

OLAP and data mining can complement each other. For instance, while OLAP pinpoints problems with the sales of a product in a certain region, data mining could be used to gain insight into the behavior of individual customers. Similarly, after data mining predicts something like a 5% increase in sales, OLAP could be used to track the net income.

Can OLAP and data mining exist independently?

Data mining is appropriate for an organization that wants a forward-looking perspective. However, an organization that simply wants to improve its operational efficiency can use OLAP. Thus, OLAP and data mining can exist independently. A lot of mid-sized companies do not use data mining because it requires high-end skills. A data mine is warranted only when there are specific business questions to address. OLAP, on the other hand, can easily be employed to further the goals of any business whose needs can be satisfied by reporting and by relating the variables involved.

Users for OLAP and data mining

The users of OLAP and data mining differ. In a typical organization, OLAP is used by regular front-office and back-office employees. They would use it predominantly for organization-wide reporting or small-scale analysis.

Data mining is used by business strategists. The strategists base their business moves on the information thrown up by the data mine.

Inadequacies of OLAP and data mining

OLAP is a dimensional model that can scale up, and its information can be sliced and diced for interrogation. It is a kind of BI cube, refreshed periodically from the source data. However, an OLAP solution lacks the capacity for predictive analysis.
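As a rough illustration of slicing and dicing, the sketch below assumes a small cube held as a dictionary keyed by (region, year, product). The dimension names, the figures and the helper functions `slice_cube` and `dice_cube` are hypothetical and are not taken from any particular OLAP product.

```python
# Hypothetical cube: (region, year, product) -> total sales. Values are illustrative.
DIMENSIONS = ("region", "year", "product")
cube = {
    ("North", 2010, "Sedan"): 400.0,
    ("North", 2011, "Sedan"): 250.0,
    ("North", 2011, "SUV"): 300.0,
    ("South", 2010, "Sedan"): 400.0,
    ("South", 2011, "SUV"): 500.0,
}

def slice_cube(cube, dimension, value):
    """Slice: fix one dimension to a single value and drop it from the cell keys."""
    i = DIMENSIONS.index(dimension)
    return {
        key[:i] + key[i + 1:]: total
        for key, total in cube.items()
        if key[i] == value
    }

def dice_cube(cube, **allowed):
    """Dice: keep cells whose coordinates lie in the allowed set for each named dimension."""
    def keep(key):
        return all(
            key[DIMENSIONS.index(name)] in values
            for name, values in allowed.items()
        )
    return {key: total for key, total in cube.items() if keep(key)}

sales_2011 = slice_cube(cube, "year", 2011)
north_sedans = dice_cube(cube, region={"North"}, product={"Sedan"})
print(sales_2011)
print(north_sedans)
```

Both operations only select and rearrange figures that are already stored in the cube, which is consistent with the point that OLAP by itself does not predict anything.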

A data mine is built to last indefinitely, which is a shortcoming, as a model cannot stay valid forever. Some data mining tools also enable the retention of older models.

About the author: Ramesh Babu is the Delivery Manager & BI/DW Practice Leader - Banking & Capital Markets at Mphasis. He is an accomplished Senior IT Manager and a Certified Project Management Professional with 19+ years of demonstrated experience in Information Systems, Project Management, DWH, Business Intelligence, Pre Sales and Techno Business Consulting.

(As told to Sharon D’Souza)
