Analytical CRM with Microsoft Data Mining

Analytical CRM Overview

Analytical CRM analyzes customer data for a variety of purposes:

  • Campaign Analysis - To analyze and compare the effectiveness of customer and product promotions, marketing drives, and advertising.
  • Cross Sell Analysis - To analyze the characteristics of multi-product usage by customers, Identifying profitable trends usage of a base product suggests complementary product and service purchases. This also allows review of a organization's cross-selling plans.
  • Customer Attrition Analysis - To understand the reason and impact of customers ceasing to use the organization's products and services.
  • Customer Behavior - To understand customer trends, and define the lifetime activity patterns of the organization’s customers, in order to assess and guide the provision of products and services to the customer community.
  • Customer Complaints Analysis - To understand the pattern of complaints and the effectiveness of the resolution process.
  • Customer Interaction Analysis - Analysis of how the organization interacts with its customers, and the effectiveness of communications and channels in terms of winning new business. The analysis measures active threads of communication. A Thread is a series of sequential Communications on a given subject. Examples are a Complaint Thread initiated by a Customer, or a Product Sales thread initiated by the organization.
  • Customer Loyalty - To understand the determination a customer has for continuing to use the services of the Organization, while recognizing the customer has alternative choices.
  • Lead Analysis - To identify prospects for new product and service sales, and analyze the effectiveness of this activity.
  • Market Analysis - To identify the demographics of a market and the organization's customer base within the market; and compare the results with that of the target population and of peer organizations' customer bases.

Analytical CRM generally makes heavy use of data mining and other techniques to produce useful results for decision-making. It is at the analytical stage that the importance of fully integrated CRM software becomes most apparent. Logically speaking, the more information that the analytical software has available for analysis, the better its predictions and recommendations will be.

Sales Intelligence CRM

Sales Intelligence CRM is similar to Analytical CRM, but is intended as a more direct sales tool. Features include alerts sent to sales staff regarding:

  • Cross-selling/Up-selling/Switch-selling opportunities
  • Sales performance
  • Customer trends
  • Customer margins
  • Customer alignment

Importance of the Software

In general, Customer Relationship Management (CRM) representing a set of methods that companies use to interact with customers. The methods include employee training and special purpose CRM software. Accordingly Analytical CRM as a part of overall CRM solution should leverage on certain kind of software.

Microsoft Data Mining Tools

I have already mention that Analytical CRM solution makes heavy use of data mining and other techniques to produce useful results for decision-making. Microsoft provides in his BI platform powerful tools for development of complex and comprehensive data mining applications - Microsoft SQL Server Analysis Services – Data Mining Tools.

Microsoft SQL Server Analysis Services contains the features and tools you need to create complex data mining solutions.

  • A set of industry-standard data mining algorithms.
  • The Data Mining Designer, which you can use to create, manage, and explore data mining models, and then create predictions by using those models.
  • The Data Mining Extensions (DMX) language, which you can use to manage mining models and to create complex prediction queries.

You can use a combination of these features and tools to discover trends and patterns that exist in your data, and then use the trends and patterns to make intelligent decisions about difficult business problems.

Example: Design and development of Analytical CRM Solution using Microsoft Data Mining Tools

Business Case Definition

Since that definition of a reasonable business case could take a serious space in this article, I would “bet” on AdventureWorks sample database. Using this approach gives an additional benefit – a lot of Microsoft Tools users are aware about this sample and they do not need extensive introduction. Moreover, I could focus more on subject of this article instead on data model introduction.

The AdventureWorks sample databases are for the fictitious AdventureWorks -- a large, multinational manufacturing company. The company manufactures and sells metal and composite bicycles to North American, European and Asian commercial markets. While its base operation is located in Bothell, Washington with 290 employees, several regional sales teams are located throughout their market base.

Data stored in AdventureWorks database are in fact data that exists in common organization environment: customers, products, geography, transactions, time series, etc.

In the picture below you can see database schema of Adventure Works DW database (data warehouse like database based on AdventureWorks OLTP database).

 

Adventure Works 2008 Data Model

So, we have the model, now we have to define the business requirements. Suppose that Adventure works looking for solution in order to improve their customer relationship segment. They are considering the following scenarios:

  • Targeted Mailing Campaign: Supports the scenario of an analyst applying various data mining algorithms to Adventure Works customer and Internet sales data to determine the demographic attributes of customers who are likely to purchase a bicycle. The analyst can then apply the data mining model to a list of potential customers in order to determine which customers are most likely to respond to a targeted mailing that promotes Adventure Works bikes
  • Forecasting: Supports the scenario of an analyst investigating the growth of bicycle models by time and region.

How Microsoft Data Mining tools could help to achieve these requirements? Answer is given in the next chapter of this article.

Data Mining Development Cycle

 

Data Mining Development Cycle

Defining the Data Mining Problem

The first step in the data mining process is to clearly define the business problem, and consider ways to provide an answer to the problem.

This step includes analyzing business requirements, defining the scope of the problem, defining the metrics by which the model will be evaluated, and defining specific objectives for the data mining project. These tasks translate into questions such as the following:

  • What are you looking for? What types of relationships are you trying to find?
  • Does the problem you are trying to solve reflect the policies or processes of the business?
  • Do you want to make predictions from the data mining model, or just look for interesting patterns and associations?
  • Which attribute of the dataset do you want to try to predict?
  • How are the columns related? If there are multiple tables, how are the tables related?
  • How is the data distributed? Is the data seasonal? Does the data accurately represent the processes of the business?

To answer these questions, you might have to conduct a data availability study, to investigate the needs of the business users with regard to the available data. If the data does not support the needs of the users, you might have to redefine the project.

Here is the problem definition regarding our example:

Targeted Mailing Campaign

The marketing department of Adventure Works wants to increase sales by targeting specific customers for a mailing campaign. By investigating the attributes of known customers, the company hopes to discover patterns that they can then apply to potential customers. They hope to use the discovered patterns to predict which potential customers are most likely to purchase a product from the company.

Additionally, the marketing department wants to find any logical groupings of customers already in the company's database, such as customers who have similar demographics and buying patterns.

The company's database, Adventure Works DW, contains a list of past customers and a list of potential new customers.

Forecasting

As the sales analyst for Adventure Works, you have been asked to forecast the sales of individual bike models for the next year. In particular, you have been asked to find peak times for bike sales and to learn how sales lead or lag with regard to region. Additionally, you have been asked to determine whether sales of different models vary depending on the time of the year.

To find the requested information, you will investigate the company's data at the monthly level, and you will also divide sales into three regions: Europe, North America, and the Pacific.

After you complete the tasks in this lesson, you will be able to answer the following questions:

  • What time of year do sales peak?
  • How do the sales of different bike models interact over time?
  • Is there a pattern to sales with regard to the three regions?

Preparing Data

The second step in the data mining process is to consolidate and clean the data that was identified in the Defining the Problem step. Data can be scattered across a company and stored in different formats, or may contain inconsistencies such as incorrect or missing entries. For example, the data might show that a customer bought a product before the product was offered on the market, or that the customer shops regularly at a store located 2,000 miles from her home.

Data cleaning is not just about removing bad data, but about finding hidden correlations in the data, identifying sources of data that are the most accurate, and determining which columns are the most appropriate for use in analysis. For example, should you use the shipping date or the order date? Is the best sales influencer the quantity, total price, or a discounted price? Incomplete data, wrong data, and inputs that appear separate, but are in fact strongly correlated, can influence the results of the model in ways you do not expect. Therefore, before you start to build mining models, you should identify these problems and determine how you will fix them.

Typically, you are working with a very large dataset and cannot look through every transaction. Therefore, you have to use some form of automation, such as in Integration Services, to explore the data and find the inconsistencies. It is important to note that the data you use for data mining does not need to be stored in an Online Analytical Processing (OLAP) cube, or even in a relational database, although you can use both of these as data sources. You can conduct data mining using any source of data that has been defined as an Analysis Services data source. These can include text files, Excel workbooks, or data from other external providers.

Picture represents views containing customer and Internet sales data to determine the demographic attributes of customers who are likely to purchase a bicycle.

 

Preparing Data fro Data Mining

Exploring Data

The third step in the data mining process is to explore the prepared data.

You must understand the data in order to make appropriate decisions when you create the mining models. Exploration techniques include calculating the minimum and maximum values, calculating mean and standard deviations, and looking at the distribution of the data. For example, you might determine by reviewing the maximum, minimum, and mean values that the data is not representative of your customers or business processes, and that you therefore must obtain more balanced data or review the assumptions that are the basis for your expectations. Standard deviations and other distribution values can provide useful information about the stability and accuracy of the results. A large standard deviation can indicate that adding more data might help you improve the model. Data that strongly deviates from a standard distribution might be skewed, or might represent an accurate picture of a real-life problem, but make it difficult to fit a model to the data.

By exploring the data in light of your own understanding of the business problem, you can decide if the dataset contains flawed data, and then you can devise a strategy for fixing the problems or gain a deeper understanding of the behaviors that are typical of your business.

Picture below represents data for Targeted Mailing Campaign

Exploring Data

Building Models

The fourth step in the data mining process is to build the mining model or models. You will use the knowledge that you gained in the Exploring Data step to help define and create the models.

You define which data you want to use by creating a mining structure. The mining structure defines the source of data, but does not contain any data until you process it. When you process the mining structure, Analysis Services generates aggregates and other statistical information that can be used for analysis. This information can be used by any mining model that is based on the structure.

Before the model is processed, a data mining model is just a container that specifies the columns used for input, the attribute that you are predicting, and parameters that tell the algorithm how to process the data. Processing a model is also called training. Training refers to the process of applying a specific mathematical algorithm to the data in the structure in order to extract patterns. The patterns that you find in the training process depend on the selection of training data, the algorithm you chose, and how you have configured the algorithm.

You can also use parameters to adjust each algorithm, and you can apply filters to the training data to use just a subset of the data, creating different results. After you pass data through the model, the mining model object contains summaries and patterns that can be queried or used for prediction.

It is important to remember that whenever the data changes, you must update both the mining structure and the mining model. When you update a mining structure by reprocessing it, Analysis Services retrieves data from the source, including any new data if the source is dynamically updated, and repopulates the mining structure. If you have models that are based on the structure, you can choose to update the models that are based on the structure, which means they are retrained on the new data, or you can leave the models as is.

Picture below represents the model for Targeted Mailing Campaign. This model contains 3 types of attributes: key attribute, input attributes which have been used for prediction and predict attributes. Moreover, several algorithms are used: decision Tree, Clustering, Naïve Bayes and Neural Net.

 

Targeting Mailing Campaign Data Mining Model

Picture below represents the model for Forecasting. Time series algorithm is used to predict Quantity and Amount of sales.

 

Forecasting Data Mining Model

Exploring and Validating Models

The fifth step in the data mining process is to explore the mining models that you have built and test their effectiveness.

Before you deploy a model into a production environment, you will want to test how well the model performs. Also, when you build a model, you typically create multiple models with different configurations and test all models to see which yields the best results for your problem and your data.

Analysis Services provides tools that help you separate your data into training and testing datasets so that you can accurately assess the performance of all models on the same data. You use the training dataset to build the model, and the testing dataset to test the accuracy of the model by creating prediction queries. In SQL Server 2008 Analysis Services, this partitioning can be done automatically while building the mining model.

If none of the models that you created in the Building Models step perform well, you might have to return to a previous step in the process and redefine the problem or reinvestigate the data in the original dataset.

Picture below represents decision tree evaluation model for Targeting Mailing Campaign. Please consider that more darkness background color of the attribute node means higher probability for bying.

 

Decision tree

Next picture represents the influence of each attribute to predicted variable. For particular case, attributes home owner and education are less important for bike buyer the other attributes. You can set the level of confidentiality you want to consider.

 

Data Mining Attribute Relations

Next picture represents the result of usage of Time Series algorithm in forecasting scenario.

 

Forecasting Data Mining Results

Deploying and Updating Models

The last step in the data mining process is to deploy the models that performed the best to a production environment.

After the mining models exist in a production environment, you can perform many tasks, depending on your needs. The following are some of the tasks you can perform:

  • Use the models to create predictions, which you can then use to make business decisions. SQL Server provides the DMX language that you can use to create prediction queries, and Prediction Query Builder to help you build the queries.
  • Create content queries to retrieve statistics, rules, or formulas from the model.
  • Embed data mining functionality directly into an application. You can include Analysis Management Objects (AMO), which contains a set of objects that your application can use to create, alter, process, and delete mining structures and mining models. Alternatively, you can send XML for Analysis (XMLA) messages directly to an instance of Analysis Services.
  • Use Integration Services to create a package in which a mining model is used to intelligently separate incoming data into multiple tables. For example, if a database is continually updated with potential customers, you could use a mining model together with Integration Services to split the incoming data into customers who are likely to purchase a product and customers who are likely to not purchase a product.
  • Create a report that lets users directly query against an existing mining model.
  • Update the models after review and analysis. Any update requires that you reprocess the models.
  • Update the models dynamically, as more data comes into the organization, and making constant changes to improve the effectiveness of the solution should be part of the deployment strategy.

 

References:

SQL Server Books Online

Rate:
 
1 vote | 2 comments

Comments

This article has a focus on Data Mining usage in Analytical CRM. There is a lot of content on my site that covers the  rest of 99% effort as you mention, before you put "whipped cream on the cake"  (those articles are related to the DW architecture, data integration, data governance, etc)

Honestly, I believe this endeavour is MUCH BIGGER than data mining... data mining is really the cherry in the top of the pie but doesn't represent 1% of the effort to create business definitions and to gather the data...

I am a huge believer in analytical CRM but I believe the topic is far more complex than the use of technology. It goes deep in the business cases and business definitions.

Ideally, every company would be able to join the forces of CRM + Marketing Campaign + Analytics and predictive analyses (Customer Intellgence) + Competitive Intelligence (know the market)...

Glad to see that there are still people working hard to make it happen...

Cheers!
Marcelo Malheiros

Post new comment

The content of this field is kept private and will not be shown publicly.
CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
Image CAPTCHA
Enter the characters (without spaces) shown in the image.