Coefficient of Variation in R (with Examples)

Deepanshu Bhalla

This tutorial explains how to calculate the coefficient of variation (CV) in R, along with examples.

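The coefficient of variation (CV) measures how spread out the data is relative to its mean. It is calculated by dividing the standard deviation by the mean and is usually expressed as a percentage. Because it is unitless, it is useful for comparing the variation of datasets that have different units or scales.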
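The definition above can be written as a formula. This is a sketch that assumes the sample standard deviation (the one R's sd() computes, with n - 1 in the denominator); here s denotes that standard deviation and \bar{x} the mean:

```latex
\mathrm{CV} = \frac{s}{\bar{x}} \times 100\%,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
```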

The coefficient of variation can be calculated in R as follows:

# Sample Data
data <- c(77, NA, 63, 50, 41, 24, 56, 29, 36, 74, 23)

# Coefficient of Variation
cv_value <- (sd(data, na.rm = TRUE) / mean(data, na.rm = TRUE)) * 100
print(cv_value)

# Output
# 42.04833

In the code above, the sd() function calculates the standard deviation of the data, while the mean() function calculates the mean. Setting the na.rm argument to TRUE (T is shorthand for TRUE) tells R to ignore missing values when calculating the mean and the standard deviation. Multiplying by 100 expresses the result as a percentage.
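The calculation above can also be wrapped in a small helper so it is easy to reuse. The sketch below is one possible way to do that; the function name cv and its as_percent argument are hypothetical names chosen for illustration and are not part of base R.

```r
# Hypothetical helper (not part of base R):
# coefficient of variation of a numeric vector
cv <- function(x, na.rm = TRUE, as_percent = TRUE) {
  ratio <- sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
  if (as_percent) ratio * 100 else ratio
}

# Same sample data as above
data <- c(77, NA, 63, 50, 41, 24, 56, 29, 36, 74, 23)

cv(data)                 # same calculation as above
cv(data, na.rm = FALSE)  # NA, because the missing value is not dropped
```

Leaving na.rm as an argument keeps the handling of missing values explicit: with na.rm = FALSE the helper returns NA instead of silently dropping values.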

Calculating Coefficient of Variation on Multiple Columns

Suppose you have historical monthly returns for three stocks: Apple, Google and Microsoft. You want to compare them on the variation in their returns to see which stock offers the best balance between risk and return.

# Sample dataframe with historical monthly returns data
df <- data.frame(
  Date = seq(as.Date("2024-01-01"), by = "month", length.out = 12),
  AAPL_Return = c(0.05, 0.03, -0.02, 0.04, 0.06, -0.01, 0.02, 0.03, -0.05, 0.07, 0.01, 0.02),
  GOOG_Return = c(0.04, 0.02, 0.01, 0.03, -0.03, 0.05, -0.02, 0.03, 0.06, 0.02, 0.04, 0.01),
  MSFT_Return = c(0.03, 0.01, 0.02, 0.04, 0.05, -0.02, 0.03, -0.01, 0.02, 0.06, 0.04, 0.03)
)

# Select Columns
df_cv <- df[c("AAPL_Return", "GOOG_Return","MSFT_Return")]

# Calculating CV for each column
cv_df <- sapply(df_cv, function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE) * 100)
print(cv_df)

The sapply() function works like a loop: it applies the calculation to each column of the dataframe and returns the results as a named vector.
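As a variation on the sapply() call above, the sketch below uses vapply(), which additionally checks that each column produces exactly one numeric value, and then rounds and sorts the results so the stocks can be ranked. The names return_cols, cv_pct and cv_ranked are illustrative only.

```r
# Same sample data as above
df <- data.frame(
  Date = seq(as.Date("2024-01-01"), by = "month", length.out = 12),
  AAPL_Return = c(0.05, 0.03, -0.02, 0.04, 0.06, -0.01, 0.02, 0.03, -0.05, 0.07, 0.01, 0.02),
  GOOG_Return = c(0.04, 0.02, 0.01, 0.03, -0.03, 0.05, -0.02, 0.03, 0.06, 0.02, 0.04, 0.01),
  MSFT_Return = c(0.03, 0.01, 0.02, 0.04, 0.05, -0.02, 0.03, -0.01, 0.02, 0.06, 0.04, 0.03)
)

return_cols <- c("AAPL_Return", "GOOG_Return", "MSFT_Return")

# vapply() behaves like sapply() but requires each result to be one numeric value
cv_pct <- vapply(
  df[return_cols],
  function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE) * 100,
  numeric(1)
)

# Round for display and sort from most to least variable
cv_ranked <- sort(round(cv_pct, 2), decreasing = TRUE)
print(cv_ranked)
```

Sorting in decreasing order puts the stock with the most variation relative to its mean return first.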

[Image: Coefficient of Variation in R]

As per the output shown above, Apple's stock has the highest CV (about 166%), which means its returns are the most volatile relative to its average return, so it carries the highest risk of the three stocks. Microsoft's stock has the lowest CV (about 93%), which means its returns are the most stable relative to its average return, so it carries the lowest risk of the three.
