Hierarchies. Within each of these business nouns, some attributes have natural relationships to one another. On-Line Analytical Processing (OLAP) database systems can use these relationships to create aggregated views of quantitative variables for fast querying. “Although Data Mining techniques can operate on any kind of unprocessed or even unstructured information, they can also be applied to the data views and summaries generated by OLAP to provide more in-depth and often more multidimensional knowledge. In this sense, Data Mining techniques could be considered to represent either a different analytic approach (serving different purposes than OLAP) or as an analytic extension of OLAP.” (Statistica) Alongside the familiar product category hierarchy, date and time hierarchies are another common example. Just as a product rolls up to a sub-category and eventually a category, specific dates roll up to weeks, months, quarters, and years. With these hierarchies in place, our data mining models can perform a much richer set of analyses.
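To make the roll-up idea concrete, here is a minimal sketch in Python of a date hierarchy being used to aggregate a measure at different levels. The function names (`date_hierarchy`, `roll_up`) and the sample sales records are hypothetical and chosen only for illustration; a real OLAP engine would precompute and store these aggregates rather than calculate them on demand.

```python
from collections import defaultdict
from datetime import date


def date_hierarchy(d: date) -> dict:
    """Return the hierarchy levels that a single calendar date rolls up to."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        # ISO week number; near New Year it can belong to the adjacent ISO year.
        "week": d.isocalendar()[1],
    }


def roll_up(sales, level_names):
    """Sum sales amounts grouped by the requested hierarchy levels."""
    totals = defaultdict(float)
    for day, amount in sales:
        levels = date_hierarchy(day)
        key = tuple(levels[name] for name in level_names)
        totals[key] += amount
    return dict(totals)


# Hypothetical daily sales records: (date, amount)
sales = [
    (date(2023, 1, 15), 100.0),
    (date(2023, 2, 20), 50.0),
    (date(2023, 4, 3), 75.0),
]

by_quarter = roll_up(sales, ["year", "quarter"])
by_year = roll_up(sales, ["year"])
```

Here `by_quarter` groups the January and February rows together under the first quarter, while `by_year` collapses all three rows into a single total. The same daily records answer both questions because each date carries its full hierarchy.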
Quantitative Variables (Measures)
With the dimensions defined, attention turns to the actual quantitative variables, or measures, that the business stakeholders wish to analyze. Measures are usually numerical and can be aggregated using common functions such as Sum, Average (Mean), Min, and Max.
Granularity. The best analysis for data mining comes from the most granular measures. Identifying the granularity is a key part of understanding exactly what the outcomes of the data mining models mean. If a student taking a class is our example, we can capture each time a student enrolls in, completes, or attends a class, and each of these is a different level of granularity.
Aggregation. How these measures are aggregated is also a key part of the OLAP engine's handling of the data. Not all aggregations can be applied to all measures. The quantity on hand of a product, for example, cannot be summed in relation to a customer invoice. Furthermore, some measures are non-additive altogether. A Grade Point Average (GPA), for example, cannot be summed with the GPAs of other students to create a consolidated GPA. This measure may, however, be averaged across a set of courses, programs, or institutions, to name a few dimensions.
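The distinction between additive and non-additive measures can be sketched in a few lines of Python. The `ALLOWED` rules, the `aggregate` helper, and the sample rows below are assumptions made for illustration, not the behavior of any particular OLAP product; the point is only that the engine needs to know which aggregations are meaningful for each measure.

```python
from statistics import mean

# Hypothetical fact rows: (student, program, gpa)
rows = [
    ("Ana", "Biology", 3.5),
    ("Ben", "Biology", 2.5),
    ("Cy", "History", 4.0),
]

# Assumed rules: which aggregations make sense for each measure.
ALLOWED = {
    "gpa": {"avg", "min", "max"},                  # non-additive: no sum
    "enrollments": {"sum", "avg", "min", "max"},   # fully additive
}
FUNCS = {"sum": sum, "avg": mean, "min": min, "max": max}


def aggregate(measure, func_name, values):
    """Apply an aggregation only if it is meaningful for the measure."""
    if func_name not in ALLOWED[measure]:
        raise ValueError(f"{func_name} is not meaningful for {measure}")
    return FUNCS[func_name](values)


biology_gpas = [gpa for _, program, gpa in rows if program == "Biology"]
biology_avg = aggregate("gpa", "avg", biology_gpas)

# Summing GPAs is rejected rather than silently producing a meaningless number.
try:
    aggregate("gpa", "sum", biology_gpas)
    sum_rejected = False
except ValueError:
    sum_rejected = True
```

Averaging the Biology GPAs is allowed and yields a usable figure for the program, while the attempt to sum them raises an error.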
Data Warehouse vs. Data Mart
There are two primary schools of thought on how to organize data into databases for analysis. Ralph Kimball, generally regarded as the inventor of the Data Mart, believes that data sets should be organized into smaller, targeted models for pointed analysis. William (Bill) Inmon, on the other hand, believes that an entire enterprise's worth of data must be composed into a single data model for proper analysis. The arguments for either school of thought are outside the scope of this paper, but it is worth noting that the Kimball approach is the more prevalent in modern data warehousing.
Data Presentation
Presenting the findings of the data mining and analysis is the final leg of the process of applying predictive analytics to business decision making. There are many presentation options, but most fall into a few key types.
Dashboards. Dashboards are a popular means of presenting predictive analytics and general analysis. They are composed of high-level graphs and charts that are intended to present data so plainly that the findings cannot be misunderstood. Gauges, like those on an automobile dashboard, are often used to present this data in an easily understood manner.
Key Performance Indicators. When targets exist for measures analyzed in the data warehouse, a Key Performance Indicator (KPI) is another great way to convey a message graphically from the underlying data. Often these KPIs are presented using stoplight indicators of red, yellow, and green lights. The idea is that a user looking at the KPI can easily tell whether the measure is in good shape (green light), needs attention (yellow light), or is in trouble (red light).
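The stoplight logic can be illustrated with a short Python sketch. The `kpi_status` function and its 90% warning threshold are hypothetical; real thresholds are set by the business, and this version assumes a measure where higher values are better.

```python
def kpi_status(actual: float, target: float, warning_ratio: float = 0.9) -> str:
    """Map a measure and its target to a stoplight color.

    Assumes higher is better and that the target is positive.
    warning_ratio is an assumed threshold, not a standard value.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = actual / target
    if ratio >= 1.0:
        return "green"   # target met or exceeded
    if ratio >= warning_ratio:
        return "yellow"  # close to target, needs attention
    return "red"         # well below target


# Hypothetical quarterly enrollment figures against a target of 100
statuses = [kpi_status(actual, 100) for actual in (105, 95, 50)]
```

With a target of 100, the three sample values land in the green, yellow, and red bands respectively. For measures where lower is better, such as cost or defect counts, the comparison would need to be inverted.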