A Data Science Central Community

**OLAP in the narrow sense**

OLAP is an integral part of a BI application. As the name suggests, the term is an acronym for online analytical processing: users, frontline employees to be precise, perform various types of data analysis online by themselves.

**But the concept of OLAP tends to be used in a very narrow sense**. It has almost become a synonym for multidimensional analysis. Based on a prebuilt data cube, the analysis summarizes data by specified dimensions/levels and presents the aggregate values as a table or a chart. It uses drilldown, aggregation (roll-up), rotation, and slicing to change the dimensions/levels and the range of the summarization. The idea behind multidimensional analysis is this: high-level aggregate results are too coarse to give good insight into an issue; instead, data needs to be sliced into smaller parts and drilled down to more detailed levels to make the analysis more valuable.
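The operations named above can be sketched in a few lines of plain Python. This is only an illustration: the `SALES` rows, the dimension names, and the helper functions `aggregate` and `slice_rows` are hypothetical and do not come from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, amount)
SALES = [
    (2018, "East", "Laptop", 100),
    (2018, "East", "Phone", 50),
    (2018, "West", "Laptop", 70),
    (2019, "East", "Laptop", 120),
    (2019, "West", "Phone", 30),
]
DIMS = ("year", "region", "product")


def aggregate(rows, dims):
    """Sum the amount for each combination of the given dimensions.

    Fewer dimensions give a coarser summary (aggregation);
    more dimensions give a finer one (drilldown).
    """
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Keep only the rows where one dimension has a fixed value (slicing)."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


by_year = aggregate(SALES, ["year"])                  # coarse summary
by_year_region = aggregate(SALES, ["year", "region"])  # drilldown
by_region_year = aggregate(SALES, ["region", "year"])  # rotation: same cells, axes swapped
east_by_product = aggregate(slice_rows(SALES, "region", "East"), ["product"])  # slice
```

Every result here is still just a summary over a fixed set of dimensions, which is exactly the scope that the narrow sense of OLAP covers.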

**OLAP in the broad sense**

Is online analytical processing only about multidimensional analysis?

There are data analysis scenarios in which a person with a lot of experience in a field makes guesses about the business. For example:

- An equity analyst predicts that stocks meeting certain conditions are most likely to rise;
- A sales manager knows which types of sales representatives are better at dealing with difficult customers;
- A tutor knows what the overall results of students with very strong subjects and very weak subjects tend to look like.

These guesses provide a basis for predictions. After running for some time, a business system accumulates a huge amount of data that can be used to verify them. Verified guesses can serve as principles to guide future decisions; if a guess is proved wrong, a new guess is made.

Guess verification is what OLAP should focus on. The guess-and-verify work aims to find, in historical data, principles or facts that support a conclusion. An OLAP tool helps to verify guesses through data manipulation.

Of course, **guesses are made by people experienced in a certain field**, not by the software. The analysis has to be online because, most of the time, guesses are made on the spot based on intermediate results. It is impossible and unnecessary to design a complete end-to-end path in advance, which means pre-modelling is infeasible. Because the work is ad hoc, IT resources are also usually unavailable at the moment a guess needs to be verified.

To solve this problem technically, frontline workers must be able to query and compute data in a flexible and interactive way. For the scenarios mentioned above, the computations might be as follows:

- For a stock that has been rising for 3 days in a month, find the probability that it keeps rising on the 4th day;
- Find the customers whose last orders were half a year ago but who placed an order after their sales representatives were changed;
- Get the English score rankings of the students whose Chinese and Math scores are both in the top 10;
- …
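Two of the computations listed above can be sketched in plain Python to show that each one takes several dependent steps rather than a single summarization. The function names, the input shapes, and the decision to count overlapping streaks separately are assumptions made for illustration; ties in scores are handled naively.

```python
def prob_rise_after_streak(prices, streak=3):
    """Estimate P(rise on the next day | the price rose on each of the previous `streak` days).

    `prices` is a list of daily closing prices for the period under study.
    Overlapping streaks are counted as separate cases (an assumption).
    """
    # rises[i] is True when the close on day i+1 is above the close on day i
    rises = [b > a for a, b in zip(prices, prices[1:])]
    cases = hits = 0
    for i in range(streak, len(rises)):
        if all(rises[i - streak:i]):   # the previous `streak` moves were all rises
            cases += 1
            hits += rises[i]           # did the next day rise as well?
    return hits / cases if cases else None


def english_rank_of_top_students(scores, top=10):
    """Return {student: English rank} for students in the top `top` of both Chinese and Math.

    `scores` maps a student name to a dict with "Chinese", "Math" and "English" scores.
    """
    def top_names(subject):
        ranked = sorted(scores, key=lambda n: scores[n][subject], reverse=True)
        return set(ranked[:top])

    both = top_names("Chinese") & top_names("Math")
    english_order = sorted(scores, key=lambda n: scores[n]["English"], reverse=True)
    return {n: english_order.index(n) + 1 for n in english_order if n in both}


# Hypothetical sample data
prices = [10, 11, 12, 13, 14, 13, 14, 15, 16, 15]
scores = {
    "A": {"Chinese": 90, "Math": 80, "English": 70},
    "B": {"Chinese": 85, "Math": 95, "English": 60},
    "C": {"Chinese": 70, "Math": 90, "English": 95},
    "D": {"Chinese": 60, "Math": 60, "English": 80},
}
probability = prob_rise_after_streak(prices)
ranks = english_rank_of_top_students(scores, top=2)
```

Neither computation is a drilldown, slice, or rotation of a prebuilt summary: the first needs ordered, row-to-row comparison, and the second needs set intersection followed by a ranking over a different column.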

**Limitations of multidimensional analysis**

Obviously these computations can be handled based on historical data. But is a multidimensional analysis method helpful?

I’m afraid not!

Multidimensional analysis has two drawbacks. First, the data cube must be created in advance, so users cannot reshape it on the fly, and each new analysis requires the cube to be rebuilt. Second, the analytic operations available over a data cube are limited to drilldown, aggregation, slicing and rotation, which makes it difficult to handle complex multi-step computations. The agile BI products that have become popular in recent years perform multidimensional analysis far more smoothly and with much more attractive interfaces than the early OLAP products did, but their essential functionality is unchanged and these limitations remain.

Multidimensional analysis still has value, for example in locating the exact source of a high cost. But it cannot derive from data the kind of principle that is crucial for predicting and guiding a future move. In this sense, online analytical processing should be more than multidimensional analysis.

**What kind of OLAP do we need?**

What functionalities should OLAP software for verifying a speculation have?

As mentioned previously, verifying a speculation is a process of data query and computation. **It is vital that the query and computation can be defined by frontline workers without the help of IT specialists**. In the current application context, an OLAP platform needs to have the following two functionalities:

1. Associated query

The first step in an analysis is acquiring data. Many organizations have their own data warehouses that non-IT employees can access and query. An important issue is that most OLAP software doesn't provide a convenient way for frontline employees to perform associated queries (queries that join related tables). Instead, IT specialists first need to create a model for the associated query, which is similar to creating a data cube for multidimensional analysis. A single model usually cannot cover all real-life demands, so help from IT is still needed. This makes online analytical processing not online any more.

2. Interactive computation

After the data is collected, computation begins. The distinguishing characteristic of speculation-verifying computation is that there is no ready-made program; the next step is decided based on the result of the previous step. The process is highly interactive, much like working with a calculator, except that what is processed is batches of structured data rather than numbers. The OLAP tool thus becomes a **data calculator**. Excel is interactive to some degree, which is why it is the most popular desktop BI tool. But Excel doesn't give sufficient support for multi-level data and regular operations, so it cannot handle the speculation-verifying computations in the scenarios above.
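The two functionalities can be illustrated with the customer scenario mentioned earlier, written as a sequence of small steps where each step works on the result of the previous one. This is a minimal sketch in plain Python: the `ORDERS` and `REP_CHANGES` tables are hypothetical, and it assumes that "half a year" means a gap of at least 182 days between two consecutive orders and that the sales representative change must fall inside that gap.

```python
from datetime import date, timedelta

# Hypothetical tables
ORDERS = [  # (customer, order_date)
    ("C1", date(2018, 1, 10)),
    ("C1", date(2018, 9, 1)),
    ("C2", date(2018, 1, 5)),
    ("C2", date(2018, 3, 1)),
    ("C3", date(2018, 2, 1)),
    ("C3", date(2018, 10, 15)),
]
REP_CHANGES = [  # (customer, change_date)
    ("C1", date(2018, 8, 1)),
    ("C2", date(2018, 2, 1)),
]
HALF_YEAR = timedelta(days=182)  # assumed definition of "half a year"

# Step 1: group the order dates by customer, in chronological order
orders_by_customer = {}
for customer, day in sorted(ORDERS):
    orders_by_customer.setdefault(customer, []).append(day)

# Step 2: look at the intermediate result and keep only the long gaps
long_gaps = [
    (customer, prev, nxt)
    for customer, days in orders_by_customer.items()
    for prev, nxt in zip(days, days[1:])
    if nxt - prev >= HALF_YEAR
]

# Step 3: associate the gaps with a second table (the rep changes)
changes_by_customer = {}
for customer, day in REP_CHANGES:
    changes_by_customer.setdefault(customer, []).append(day)

won_back = sorted({
    customer
    for customer, prev, nxt in long_gaps
    if any(prev < change <= nxt for change in changes_by_customer.get(customer, []))
})
```

In an interactive tool, the analyst would inspect `long_gaps` before deciding that the next step is to bring in the representative changes; the code only fixes one possible path through that exploration.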

In later articles, we'll analyze the currently popular computing techniques, identify the problems they have in handling these two types of computation, and suggest solutions.

https://www.linkedin.com/pulse/mr-jiangs-datatalk-room-what-kind-ol...

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC Powered by
