- Which marketing medium (TV, radio, print, online ads) returns maximum return (ROI)?
- How much to spend on marketing activities to increase sales by some percent (15%)?
- Predict sales in future from investment spent on marketing activities
- Identifying Key drivers of sales (including marketing mediums, price, competition, weather and macro-economic factors)
- How to optimize marketing spend?
- Is online marketing medium better than offline?

Imagine you are a chief strategist and your role requires you to come up with strategies that can boost company's revenue growth and profitability. With the use of insights from marketing mix model, you know sales can be increased by 50 million dollars for every 5 million dollars you spend on advertising. MMM would also help you to determine how much to spend on each advertising medium to get maximum return on investment (ROI).

**Chart source : brafton.com**

## Types of Marketing Mediums

Let's break it into two parts - offline and online.**Offline Marketing**

- Print Media : Newspaper, Magazine
- TV
- Radio
- Out-of-home (OOH) Advertising like Billboards, ads in public places.
- Direct Mail like catalogs, letters
- Telemarketing
- Below The Line Promotions like free product samples or vouchers
- Sponsorship

**Online Marketing**

- Search Engine Marketing like Content Marketing, Backlink building etc.
- Pay per Click, Pay per Impression
- Email Marketing
- Social Media Marketing (Facebook, YouTube, Instagram, LinkedIn Ads)
- Affiliate Marketing

## Data Required for Marketing Mix Modeling

To build robust and accurate MMM model, we need data of different types. See the details below.`Product Data`

refers to product and sub-product information along with price and no of units sold / unsold. It is required to understand various aspects related to product - Whether it is a new product, which product category the product falls into? Is it the largest selling product? At what rate sales of this product is growing? What's the price for each product in the category?`Promotion Data`

includes details about days when promotion or offers were active and the offer type like free delivery, cash back etc.`Advertising Data`

It is used to measure how effective advertisement is running. In TV Advertising, Gross Rating Point (GRP) is a method which measures total audience exposure to advertisement. You can also include other traditional methods such as Newspapers and Magazines and Radio. You can include SPENDS which is dollar value spent on these methods. Digital marketing is popular these days and a large percentage of population can access internet now which opens up a whole new world for advertising. We can use the variables like Digital Spend, Affiliation Marketing Spend, Content Marketing Spend, Sponsorship Spend and Social Media Spend to capture online media investment information.`Seasonality`

Increase in sales due to seasonality can make calculation of identifying top drivers of sales biased. We would not be able to assess the impact of promotion on sales. For example, sales in woolen jackets will be higher in winter season than summer. We also need to incorporate holidays and major events planned in a particular month. For example, sales during Christmas season is generally high than average sales.`Geographical Data`

It refers to location of retail store. It can be city, state or postal code of store.`Macroeconomic Data`

Sales can also be affected by macro economic factors like inflation, unemployment rate, GDP etc. Companies generally report negative growth rate in sales during recession. We need to incorporate these factors in our model so that it understands recession and cyclical effects.`Sales`

It is not possible to build MMM without sales variable. Sales can be in volume in units as well as revenue($)

- Sales, product and promotion data are generally stored internally within company's relational databases management system.
- Advertising data is either managed by internal marketing team or through external marketing agency.
- Macroeconomic data can be extracted through websites like World Bank, IMF and Economagic.

## Data Preparation

The level of data can be decided depending on the implementation of promotion campaign. Ideally data should NOT be prepared at daily level as daily sales data generally has too much variation which leads to poor accuracy. We generally use weekly time period and aggregate data at weekly level as promotions are live from Monday to Sunday.Another question : Do we have data at product level which means product information in each invoice? See the example below.

```
+----------------+------------------+-------------+--------------+-------+
| Transaction ID | Transaction Date | Customer ID | Product Code | Sales |
+----------------+------------------+-------------+--------------+-------+
| 1 | 24-09-2019 | 12 | AAA | 824 |
| 1 | 24-09-2019 | 12 | AAB | 809 |
+----------------+------------------+-------------+--------------+-------+
```

If you observe the above table, you would see transaction data at daily level. We can transform it to weekly level without losing any required information. Transformed data contains number of distinct customers, no of transactions, total sales of different products in each week.
```
+------+--------------------+--------------+-----------+-----------+
| Week | Distinct Customers | Transactions | Sales AAA | Sales AAB |
+------+--------------------+--------------+-----------+-----------+
```

## Data Exploration and Transformation

Before building a model, we as analysts need to perform various quality checks. It's a crucial stage of building a model. If your data is not prepared and handled correctly, model development would be of less use. Accuracy and robustness of model can also be deteriorated badly if you compromise data cleaning and transformation.`Importing Data`

If you have data in CSV or Excel format, you need to load data into R/Python. After importing data, you need to ensure whole data have been loaded correctly by checking the number of rows and columns. Length of character variables can be truncated while importing.`Missing Values`

It is important to check the number of missing values in each variable. Many statistical algorithms are not immune to missing values. They simply remove the rows when missing values exist. It can cause a loss of a good amount of data depending on percentage of missing values. Sometimes missing values are specially coded as '999'. We also need to understand the reason behind missing data. Is it a genuine missing because it is not applicable to some customers? Or customers or respondents did not provide the information? Missing values can be treated by replacing them with mean, median or mode value. We can also remove missing values if it constitutes a very small percentage of data. Moving average is also a common technique to impute missing values when data is in time series format.`Outlier Values`

are referred to as extreme values (in plain English). For example, customer having 150 years age is an outlier. Sometimes data are entered incorrectly into the system. We can handle outliers by percentile capping at 99th or 95th percentile depending on the distribution. Outliers can also be removed if percentage of these kind of observations in the dataset is very small.`Univariate and Bivariate Analysis`

are ways to perform data exploration. Univariate analysis refers to examine the distribution of a single variable. It can be done by calculating mean, median, mode, standard deviation and percentiles. Bivariate analysis refers to checking the association between two variables. For example, inspecting relationship between TV GRPs and Sales.

## Adstock

The concept behind Adstock is that advertising has been shown to have an effect extending several periods after you see it first time. In other words, an advertisement from a previous period may have some effect of an advertisement in the current period. It is also called advertising carryover. It's a very old concept (first used in 1980s) when TV was the main medium of advertisement. It is not restricted to TV only and can be applied to online, radio and print media as well. Let's understand through example.Suppose you are watching your favorite TV show. During commercial breaks, you see an ad of perfume brand 'X'. You would not buy this perfume immediately after the commercial break. Let's say you see the ad of the same brand 'X' a couple of times in next few days. It would increase awareness to a new level and there is a high chance that you would purchase perfume of this brand (if you need it). If you would not have seen the advertisement again after first time, you would not be able to recall the brand easily. This is the decay effect of Adstock. This decay is reduced by new advertising exposure.

Ads shown on TV are generally remembered for longer than online ads

AWhere A_{t}= T_{t}+ λA_{t −1}+ λA_{t −2}Here t =1,...,n

_{t}is the Adstock at time t. T

_{t}is the value of the advertising variable at time t and λ is the 'retention' or lag weight parameter and can be interpreted as percentage of effectively remembered ad contact from previous week plus contacts from current week. λ lies between 0 and 1. Let's learn via example.

Adstock rate = 0.5 TV GRP in week 1 = 115 TV GRP in week 2 = 120 Adstock in week 2 = 120 + 0.5*115 = 177.5

## Techniques used in Marketing Mix Modeling

The common statistics techniques used in MMM are linear and non-linear regression techniques.Multiple Linear RegressionSales = β0 + β1*X1 + β2*X2 + .... + βn*Xn Xi are independent variables (or predictors), βi can be interpreted as change in sales corresponding to unit change in predictor.

Log-Linear RegressionLN(Sales) = β0 + β1*X1 + β2*X2 + .... + βn*Xn LN is natural log. Xi are independent variables (or predictors), βi can be interpreted as % change in sales corresponding to unit change in predictor.

Log-Log RegressionLN(Sales) = β0 + β1*LN(X1) + β2*LN(X2) + .... + βn*LN(Xn) βi can be interpreted as % change in sales corresponding to 1% change in predictor.

## Elasticity

Elasticity answer this question "What marketing activities to pull or push and the corresponding impact on sales". For example a price elasticity of -1.9 means that when price is increased by 1%, sales will be reduced by 1.9 percent keeping all other factors being constant. Similarly we can calculate elasticity of TV, radio and online advertisement.When you have simple linear regression model, you can calculate elasticity using the formula below -

In the case of non-linear regression models, the above defined elasticity formula needs to be tweaked according to the equation. Refer the table below.

## Model Performance

Model performance of MMM can be checked like we do in any linear regression model. Common model performance metrics are R-Squared, Adjusted R-Squared, Mean Absolute Percentage Error (MAPE) etc.## Marketing Mix Modeling using SAS, Python and R

In the program below, I have shown how to implement basic MMM model using SAS, R and Python. This model takes takes into account three variables which are price and exposure on two different advertising mediums (let's say TV and online). You can download dataset from this linkR Code

# Read CSV File df <- read.csv("Furniture.csv") adstock <- function(data, rate) { return(as.numeric(filter(x=data, filter=rate, method="recursive"))) } x1 <- adstock(df$Advt1, 0.5) x2 <- adstock(df$Advt2, 0.5) # Model mod <- lm(formula=Sales ~ x1 + x2 + Price.Furniture,data=df) summary(mod) # Price Elasticity (PE <-as.numeric(mod$coefficients["Price.Furniture"] * mean(df$Price.Furniture)/mean(df$Sales)))

Python Code

import pandas as pd df = pd.read_csv("C:/Users/DELL/Documents/Furniture.csv") #Adstock import numpy as np def adstock(data, rate): tt = np.empty(len(data)) tt[0] = data[0] for i in range(1, len(data)): tt[i] = data[i] + tt[i-1] * rate return tt x1 = adstock(df.Advt1, 0.5) x2 = adstock(df.Advt2, 0.5) #Linear Regression Model from sklearn.linear_model import LinearRegression x = pd.DataFrame({"x1":x1, "x2":x2, "Price": df["Price.Furniture"]}) y = df.iloc[:,0] lm = LinearRegression() lm = lm.fit(x,y) coefficients = pd.concat([pd.DataFrame(x.columns),pd.DataFrame(np.transpose(lm.coef_))], axis = 1) lm.intercept_ #Elasticity coefficients.iloc[2,1] * np.mean(x.Price) / np.mean(y)

SAS Code

FILENAME PROBLY TEMP; PROC HTTP URL="https://raw.githubusercontent.com/deepanshu88/Datasets/master/Furniture.csv" METHOD="GET" OUT=PROBLY; RUN; OPTIONS VALIDVARNAME=ANY; PROC IMPORT FILE=PROBLY OUT=WORK.MMM (rename='Price.Furniture'n=Price) REPLACE DBMS=CSV; RUN; %let adstock = 0.5; data Adstock; set MMM; format x1 x2 10.2; if _N_ = 1 then do; x1 = Advt1; x2 = Advt2; retain x1 x2; end; else do; x1 = sum(Advt1, x1*&adstock.); x2 = sum(Advt2, x2*&adstock.); end; run; ods output ParameterEstimates = coefficients ; proc reg data=adstock; model sales = x1 x2 Price; run; proc sql; select estimate into: beta from coefficients where Variable = 'Price'; select mean(sales) into: meansales from MMM; select mean(Price) into: meanprice from MMM; quit; %let elasticity = %sysevalf(&beta. *(&meanprice./&meansales.)); %put &elasticity.;

The above code returns elasticity value of -1.796211 which means increase in price of furniture by 1% will decrease the sales of furniture by 1.79%.

Love your work. Do you have SAS code for MMM?

ReplyDeleteThanks.

Ethan

Yes I have added in the article. Thanks!

DeleteCould you also address how to use "proc mixed" and "proc GAM" in MMM given that the lower geographic data are available?

ReplyDeleteThanks.

Good question.. But, more interesting how to get positive coefficients for all marketing drivers?

ReplyDelete@Deepanshu- Nice post, very concise. I dont understand one thing though- Mix models differ from simple Linear models in that they allow both fixed and random effects. Where do you capture these in your equation.

ReplyDeleteThose are mixed models - a statistical term. The post is about marketing mix models, that is modeling of how marketing mix impacts sales

DeleteExcellent job done. Thanks.

ReplyDeleteWhat a marketing touch you have put in this long detailed article! Marketing mix modeling concept and features you disclosed were really a knowledgeable and helpful read. The Introduction, use of it, big marketing world, effective chart, types, required, date preparation and transformation, used techniques and using SAS- every points was incredibly rich with high valued data and analysis. It's indeed a cool stuff. Love your super talented job and presentation. Much respect. Keep it up please.

ReplyDeletegood post share it

ReplyDeletenice article

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteHey let me say thanks for this brilliant quality concept. I just landed on this recourse world and amazed with what has been explained about MMM. You really spread huge effective data and analysis which must be able to draw anybody's prompt attention. Every item was full of authentic and realistic information and you can't imagine how much beneficial read it is! I appreciate your time and thoughts. Finally this guidelines can be really very useful for sure.

ReplyDeleteThis comment has been removed by a blog administrator.

ReplyDeleteThis comment has been removed by a blog administrator.

ReplyDeleteJust would like to say that it was a fascinating read for me and perhaps one of the best experience regarding the marketing mix modeling concepts. The steps and thoughts you revealed pointless to state that were the adoring and age-worth thinking. Including the brilliant introduction all the data and clarification were proving the thoughtful and genius expression to me. This type of quality work always be able to draw the rapt attention of any audience.

ReplyDeleteHi!

ReplyDeleteHow do you calculate the variable contributions in log-linear and log-log models? It's easy in case of additive one.

Excellent post , does any body have data needed for Marketing mix model?

ReplyDeleteHi! Fascinating article! Wonder why did you multiply the coefficient with np.mean(x.Price) / np.mean(y). I assume this way you can predict the percentage change right? Without it, it's if you decrease price by 1 USD Sales would increase by 10.0713 units.

ReplyDeletehey thank you for this post. it is really helpfull to me bu ı have question. ı am using adaboostregressior which function ı should be using for elasticity ?

ReplyDeleteThanks. So wonderful job done.

ReplyDeleteHi Deepanshu, you really gifted a analytical and resourceful content regarding marketing development issue. This was definitely a cool read and experience to have me into your long details. The mix modeling data you delivered step by step like the uses of MMM, types, required data, data sources, exploration and transformation- evey point was so effective and thoughtful. I like the way you discovered all the module. Anyway, my browsing story today was to hunt some data related to saavi ordering system an AU based leading company that performs as a wholesale business solution for the business owners to manage their orders from the customers and fortunately it also ensures mobile ordering apps that helps the customers to connect the account instantly. Hope this information will be helpful to many business guys here.

ReplyDeleteVery useful information shared. Thank you.

ReplyDeleteHi, can anyone elaborate on calculating saturation curves?

ReplyDeleteHi Deepanshu,

ReplyDeleteGreat article for starting and having an idea of the MMM. I have a question on Media Mix Model. I have weekly level data on affiliates,banner, social media ( lets see they are called vehicles ), most of them have a granularity for example under banner we have display, video etc. Many times they are higly correlated . How do we keep variables which have high VIF or do we really keep them in Model ?Is Transformation of those a best practise ? Business every quater would look for planning and they may want to plan all those vehicles .Let me know your thoughts.

NICE POST, THANKS

ReplyDeletehi Deepanshu,

ReplyDeleteI do not have TV GRP as all my advertising channels have spend in $.How do I still find the important channels that affect the sales.

A quick response would be apprecaited.

thanks