Predictive Analytics

From EcomEvolve Knowledge Base

Introduction

Predictive analytics sounds complicated, but it is something we use in our everyday lives.

We avoid calling our friends at 2 o'clock in the morning because we know they will probably be asleep. People who sell sunscreen don't promote their products in the UK in January, because demand will be low. We make many judgements based on what we would describe as common sense (although data scientists would more formally call it "domain knowledge"), but they amount to the same thing: using our understanding of the world to predict future outcomes.

Things get more complicated where outcomes are difficult to judge. For example, if I want to offer free shipping to online customers, should I set a threshold for the order value? If so, what should that threshold be to maximise profits? These kinds of questions are much more difficult to answer without the development of some kind of mathematical model and experimentation. And the gains can be so marginal that only statistics can say if they have any effect at all.
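As a rough illustration of how statistics can say whether a marginal gain is real, the sketch below applies a two-proportion z-test to the conversion rates of two groups of visitors, for example one shown a free shipping threshold and one not. The test is one common choice rather than the only one, and the function name and the visitor and order counts are invented for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the assumption that both groups are the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 visitors per group,
# 200 orders without the offer and 230 orders with it.
z, p_value = two_proportion_z_test(200, 10_000, 230, 10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these invented numbers the apparent 15% uplift in orders is not significant at the usual 5% level, which is exactly the kind of marginal result that is hard to judge by eye.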

This is an investigation into the use of predictive analytics in online marketing. The goal is to create a plan of action that will enable the practical application of predictive analytics in online marketing projects. The starting reference for this research is the Predictive Analytics article on Wikipedia. That article is useful because it outlines some of the applications of predictive analytics together with the most widely used types of mathematical tools.

Benefits of predictive analytics

Predictive analytics has a large number of applications, but our interest is in the context of online marketing.

Our business is to help online retailers to become more profitable and we facilitate this by dividing the marketing process into three distinct steps:

  1. Targeting the right people (those who are most likely to make a purchase) at the right time.
  2. Engaging them using content that is likely to capture their attention and interest.
  3. Converting them into customers by providing information and assurance they need to make their decision.

Predictive analytics is an essential tool used in the delivery of this process, because it enables us to answer important questions, including:

  • Who are "the right people"?
  • What products will they want to buy?
  • When will they want to buy?
  • How do we engage their interest?
  • How can we persuade them to buy?

As you can imagine, this can be quite challenging and costly, but it is worth the effort. By focusing your investment in marketing on the people who are most likely to buy your products, you can become more efficient and increase the profitability of your business.

For online retailers, the impact can be seen in the form of higher conversion rates and a lower marketing cost of sale.

Using predictive analytics

Essential tools

How you go about using predictive analytics depends on the questions you are trying to answer. Whatever they are, though, you are going to need three things to find useful answers.

Requirement: Description

Data: Objective and subjective information relating to the question being asked. The more data you have, and the better its quality, the more likely it is that you can use it effectively. As a rule of thumb, capturing and organising your data typically consumes around 80% of any data analysis project.

Domain knowledge: Although intuitive knowledge can be misleading, experience of your market domain helps you to understand the limitations of the predictive models you build. It also increases your awareness of the quality of the data you use.

Mathematics and statistics: Important tools you can use to explore and understand data, create predictive models and test hypotheses. They have their limitations, though: all models are flawed, but some are useful.

The analytics process

Depending on the type of business, these are the steps that we would usually include.

Step 1 - Define your objective

One of the big challenges of using predictive analytics is staying focused. Once you begin to visualise data all kinds of interesting characteristics become apparent and it is all too easy to get sidetracked. The best way to avoid this is to clearly define what it is you are trying to achieve before you begin.

There is nothing wrong with setting big objectives, such as "How can we sell more to our existing customers?", so long as you realise that a single investigation is unlikely to provide a definitive answer. In fact, developing a project plan to answer a question like this one is a project in itself.

It is essential to be as specific as possible about the questions you ask of the data and, before any analysis, make sure you know how the answers are going to improve your business.

Step 2 - Collect the available data

Before you start any kind of analysis, you need to collect and evaluate the available data. The majority of online retailers will have access to the private datasets listed in Table 1.

Dataset: Description
Traffic and sales data: The website traffic and order data that is usually collected and analysed using Google Analytics. One of the most important benefits provided by tools like Google Analytics is that they enable you to see how people find your website and measure how likely they are to make purchases after they have arrived.
Ecommerce sales data: The sales data in your ecommerce system, which includes personal information about your customers, such as their contact details. This information can be used to build a detailed understanding of who your customers are and what might drive their buying behaviours.
Customer feedback: Customer reviews and product and service ratings. You can use this to measure the quality of the service you provide and the products you sell.

Table 1: Commonly available private datasets

It is possible that in the analysis process you might combine this with other information, held in other internal systems or made available by third parties. Examples are listed in Table 2.

Dataset: Description
National Statistics Postcode Lookup dataset: Can be used for geographic and demographic profiling.

Table 2: Commonly used public datasets

Step 3 - Define your hypotheses

Having set the objective for the project and reviewed the available data, this is a good time to pause and reflect. Is the available information sufficient for your needs and, if it is, how much time and effort will be required to develop the predictive models you want? It is better to stop or refine your goals at this point than to flog a dead horse.

Step 4 - Process the data

Processing the data usually means inspecting and cleaning it and transforming it into the required formats. You will also have to join different datasets together, such as adding geographic coordinates to personal address data.
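A minimal sketch of this kind of join is shown below: customer records are matched to a postcode lookup table after the postcodes have been cleaned into a common format. The field names, postcodes and coordinates are placeholders invented for the example, not values from a real dataset.

```python
def normalise_postcode(postcode):
    """Remove whitespace and upper-case a postcode so that variants match."""
    return "".join(postcode.split()).upper()


def add_coordinates(customers, postcode_lookup):
    """Return copies of the customer records with latitude and longitude added."""
    lookup = {
        normalise_postcode(postcode): coords
        for postcode, coords in postcode_lookup.items()
    }
    enriched = []
    for customer in customers:
        coords = lookup.get(normalise_postcode(customer["postcode"]))
        row = dict(customer)  # copy, so the source data is left untouched
        # Customers with no match keep empty coordinates for later inspection.
        row["latitude"], row["longitude"] = coords if coords else (None, None)
        enriched.append(row)
    return enriched


# Placeholder data (hypothetical postcodes and coordinates).
customers = [
    {"name": "Customer A", "postcode": "ab1 2cd"},
    {"name": "Customer B", "postcode": "ZZ9 9ZZ"},
]
postcode_lookup = {"AB1 2CD": (57.1, -2.1)}

enriched = add_coordinates(customers, postcode_lookup)
for row in enriched:
    print(row)
```

Keeping unmatched records, rather than dropping them, makes it easier to see how much of the data failed to join, which is itself a useful check on data quality.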

Step 5 - Develop your models

Once the data is processed, you create predictive models that forecast the likelihood of different outcomes based on input variables.
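To make this concrete, the sketch below fits a very small logistic regression model, which estimates the probability of a purchase from a single input variable. Logistic regression is only one of many possible model types, the training data is invented, and in practice you would normally use an established statistics library instead of hand-written gradient descent.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit weight and bias by batch gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict_probability(x, weight, bias):
    """Estimated likelihood of a purchase for input value x."""
    return sigmoid(weight * x + bias)


# Invented sample: number of site visits and whether the customer purchased.
visits = [1, 2, 3, 4, 5, 6, 7, 8]
purchased = [0, 0, 0, 1, 0, 1, 1, 1]

weight, bias = fit_logistic(visits, purchased)
for v in (1, 4, 8):
    print(v, round(predict_probability(v, weight, bias), 2))
```

The output of such a model is a likelihood, not a certainty, so it is usually used to rank or prioritise customers and not to make definite statements about any one of them.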

Step 6 - Report and plan next actions

At the end of the process, you will have a better understanding of how your research and modelling can be used to benefit the business. All predictive models are incomplete and flawed, but some are useful.

Step 7 - Monitor and review

When you find a model that is useful, it is important to remember that things change. For example, I doubt that any sales forecast for 2020 predicted the impact of COVID-19, and you can't assume that models developed before the pandemic will work as well after it.

Types of predictive models

One of the most important decisions made in any predictive analytics project is deciding which mathematical model to use. There are a lot of tools to choose from, which can be broadly categorised into three different types.

Predictive models

Developing a predictive model involves analysing the relationship between the behaviour of customers in a sample and the characteristics and attributes of those customers. This knowledge is then used to estimate the likelihood that a customer in another sample will behave in a similar way.

These kinds of models are used in a range of applications: in marketing, for instance, to help influence an online customer to make a purchase, and in finance, to predict the creditworthiness of an individual.

Descriptive models

While predictive models attempt to predict the action of a specific customer, descriptive models quantify information that allows customers to be formed into groups. Descriptive models are not used to rank or prioritise individuals, but to categorise them into groups, which might then be categorised further.

In marketing this process is sometimes described as segmentation, and often uses the following types of data:

  • Demographic (age, gender, ethnicity, etc.)
  • Geographic (country, locality etc.)
  • Psychographic (activities, interests, etc.)
  • Behavioural (when they buy, how they buy, etc.)

The challenge that businesses have to overcome to build these models is acquiring the customer data needed to build them. Established businesses can mine their various databases, but new businesses can only hypothesise and test using market research.

Decision models

Decision models describe the relationship between all of the factors that influence a decision and the forecast results of the decision. Decision models are generally used to develop decision logic or a set of business rules that will produce the desired action for every customer or circumstance.
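As a simple illustration, decision logic of this kind can be written as a set of ordered business rules that map a customer's circumstances to an action. The inputs, thresholds and actions below are all hypothetical and only show the general shape of such rules.

```python
def choose_action(purchase_probability, basket_value, free_shipping_threshold=50.0):
    """Return a marketing action for one customer (hypothetical rules)."""
    # Customers already very likely to buy do not need an incentive.
    if purchase_probability >= 0.7:
        return "no incentive"
    # Baskets at or above the threshold qualify for the free shipping offer.
    if basket_value >= free_shipping_threshold:
        return "offer free shipping"
    # Moderately interested customers get a low-cost nudge.
    if purchase_probability >= 0.3:
        return "send reminder email"
    return "no action"


print(choose_action(0.8, 20.0))
print(choose_action(0.5, 60.0))
print(choose_action(0.5, 20.0))
print(choose_action(0.1, 20.0))
```

Because every combination of inputs leads to exactly one action, rules like these can be reviewed by the business and tested against the forecast results of each decision.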

Applications of predictive analytics

The best way to familiarise ourselves with some of the models available is to look at some specific use cases. For this we are being guided by Marketing Data Science, by Thomas W. Miller.

Understanding markets

"Marketing data science, a specialisation of predictive analytics or data science, involves building models of seller and buyer preferences and using those models to make predictions about future marketplace behaviour." [1]

What are markets?

Markets are spaces where sellers and buyers get together and do business. Buyers represent the demand side and sellers represent the supply side. One of the important things that sellers want to know is what drives buyer choice: which factors determine whether a buyer purchases a product, and which of those factors are the most important?

How can you analyse them?

A good modelling technique for determining this is conjoint analysis, in which you present customers with product profiles made up of product attributes, which they can score, and then use their responses to rank the products.

To facilitate this you need a good understanding of the relative importance of each product attribute and the part-worths associated with the levels of each attribute. The part-worths reflect how much value an individual consumer associates with each level of a product attribute and can be positive or negative (though, with the sum-to-zero coding commonly used, the part-worths across the levels of an attribute total zero).

By summing the part-worths for each product profile, you can calculate a ranking for the products in the target market.
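The calculation can be sketched as follows. The attributes, levels and part-worth values are invented for illustration (in a real study they would be estimated from customer responses), and the values for each attribute have been chosen to total zero.

```python
# Hypothetical part-worths: one value per level of each product attribute.
part_worths = {
    "price": {"low": 1.5, "medium": 0.0, "high": -1.5},
    "delivery": {"next day": 0.8, "standard": -0.8},
    "brand": {"known": 0.4, "unknown": -0.4},
}

# Hypothetical product profiles: one level chosen for each attribute.
profiles = {
    "Product A": {"price": "low", "delivery": "next day", "brand": "known"},
    "Product B": {"price": "medium", "delivery": "standard", "brand": "known"},
    "Product C": {"price": "high", "delivery": "next day", "brand": "unknown"},
}


def profile_utility(profile, part_worths):
    """Sum the part-worths of the levels that make up one product profile."""
    return sum(part_worths[attribute][level] for attribute, level in profile.items())


def rank_profiles(profiles, part_worths):
    """Return profile names ordered from highest to lowest total utility."""
    return sorted(
        profiles,
        key=lambda name: profile_utility(profiles[name], part_worths),
        reverse=True,
    )


ranking = rank_profiles(profiles, part_worths)
for name in ranking:
    print(name, round(profile_utility(profiles[name], part_worths), 2))
```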

Developing new products

Developing new products is costly. Research undertaken by the Product Development & Management Association in the US in 2011 suggests that for every 7 products that are introduced only one is successful. There is little reason to believe this statistic will have improved in recent years.

Before developing a new product, you have to be realistic about your chances of success. In addition to understanding the feature requirements of your potential customers, it is essential to undertake a business analysis to evaluate your company's standing in the target market, and assess its position relative to competitors.

Commonly used tools for these types of analysis, which enable you to compare and evaluate multiple features, include:

  • Conjoint analysis.
  • Binomial Logistic regression.
  • Multinomial Logistic regression.
  • Interaction plots.

Positioning products

Everyone is familiar with the concept of similarity. We know that products such as movies, books and music can be classified using genres, and that PCs can be categorised as desktops, laptops and tablets. Some of the measures we use to facilitate this are objective, but some are subjective, requiring human judgement.

Mathematical tools exist to help us "measure the distance" between different products computationally.

Distance Matrix

For instance, a distance matrix shows how far any two products in a group are from each other. Each pair of products needs one measurement, so the total number of measurements required (y) for a number of products (n) would be:

y = n(n - 1) / 2
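The sketch below builds the pairwise distances for a small group of products and confirms that n products need n(n - 1) / 2 measurements. The product names and their two numeric scores are invented, and Euclidean distance is assumed as the measure, although other distance measures could be used.

```python
import math
from itertools import combinations

# Hypothetical products, each described by two numeric scores.
products = {
    "Desktop": (0.0, 0.0),
    "Laptop": (3.0, 4.0),
    "Tablet": (6.0, 8.0),
    "Phone": (6.0, 0.0),
}


def pairwise_distances(products):
    """Return the Euclidean distance for every unordered pair of products."""
    return {
        (a, b): math.dist(products[a], products[b])
        for a, b in combinations(sorted(products), 2)
    }


distances = pairwise_distances(products)
n = len(products)
print(len(distances), n * (n - 1) // 2)  # number of pairs, computed both ways
for pair, distance in distances.items():
    print(pair, round(distance, 2))
```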

The results could be plotted on a map, created using a Multidimensional Scaling algorithm.

Cluster Analysis

Cluster analysis is for classification problems where the classes are unknown in advance. Interrelationships between objects are used to define classes.

There are two types of clustering: hierarchical clustering and non-hierarchical clustering.

Hierarchical clustering facilitates the organisation of objects in a hierarchy, such as the categorisation of shops and amenities in a shopping centre.

Non-hierarchical clustering simply gathers objects into distinct groups.
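One widely used non-hierarchical technique is k-means, which is not named in the source material but serves as a simple illustration here. The sketch assumes that each object can be described by numeric coordinates; the sample points are invented, and the starting centroids are simply the first k points so that the result is repeatable.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (clusters, centroids)."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids


# Invented sample: two visibly separate groups of objects.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
clusters, centroids = k_means(points, k=2)
for cluster, centroid in zip(clusters, centroids):
    print(sorted(cluster), centroid)
```

Note that the number of groups (k) has to be chosen in advance, which is one practical difference from hierarchical clustering.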

Finding new customers

Selling to existing customers

Retaining customers

Promoting products

Recommending products

Forecasting sales

Assessing brands and prices

Social media marketing

Competitor analysis

  1. Miller, Thomas W. Marketing Data Science (FT Press Analytics), p. 5. Pearson Education. Kindle Edition.