R Function : Outlier Treatment



To treat outliers, we can winsorize extreme values. Winsorizing at the 1st and 99th percentiles means that values below the 1st percentile are replaced by the 1st-percentile value, and values above the 99th percentile are replaced by the 99th-percentile value. The function below caps at the 5th and 95th percentiles instead; change the probabilities passed to quantile() to use other cutoffs.
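To make the definition concrete, here is a minimal sketch that winsorizes a single vector at the 1st and 99th percentiles. The sample vector and the names `v`, `limits` and `v_wins` are made up for illustration, and the cutoffs rely on R's default quantile type.

```r
# Hypothetical sample: the integers 1..100 plus one extreme value
v <- c(1:100, 1000)

# Values at the 1st and 99th percentiles (R's default quantile type)
limits <- quantile(v, c(0.01, 0.99), na.rm = TRUE, names = FALSE)

# pmax() raises anything below the lower limit,
# pmin() lowers anything above the upper limit
v_wins <- pmin(pmax(v, limits[1]), limits[2])

range(v)       # original range, including the extreme value 1000
range(v_wins)  # range after winsorizing: 2 to 100 for this sample
```

Only the values outside the two limits change; everything in between is left as it was.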
########################################################
# R Function for Outlier Treatment : Percentile Capping
########################################################

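pcap <- function(x) {
  # Cap every numeric column of the data frame x at its 5th and 95th percentiles
  for (i in which(sapply(x, is.numeric))) {
    quantiles <- quantile(x[[i]], c(0.05, 0.95), na.rm = TRUE)
    # Values below the 5th percentile are raised to it
    x[[i]] <- ifelse(x[[i]] < quantiles[1], quantiles[1], x[[i]])
    # Values above the 95th percentile are lowered to it
    x[[i]] <- ifelse(x[[i]] > quantiles[2], quantiles[2], x[[i]])
  }
  x
}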

# Replacing extreme values with percentiles
abcd = pcap(mydata)
  
# Checking Percentile values of 7th variable
quantile(abcd[,7], c(0.25,0.5,.95, .99, 1), na.rm = TRUE)
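
The cutoffs in pcap are fixed inside the function. As a rough sketch, a variant could take the lower and upper probabilities as arguments. The name `pcap_at`, its arguments `lower` and `upper`, and the toy data frame are hypothetical and not part of the original function.

```r
# Hypothetical variant of pcap with configurable cutoffs (names are illustrative)
pcap_at <- function(x, lower = 0.05, upper = 0.95) {
  stopifnot(is.data.frame(x), lower < upper)
  # Names of the numeric columns only
  num_cols <- names(x)[vapply(x, is.numeric, logical(1))]
  for (col in num_cols) {
    limits <- quantile(x[[col]], c(lower, upper), na.rm = TRUE, names = FALSE)
    # Clamp the column into [limits[1], limits[2]]; NA values stay NA
    x[[col]] <- pmin(pmax(x[[col]], limits[1]), limits[2])
  }
  x
}

# Toy data frame standing in for mydata
toy <- data.frame(id = letters[1:11],
                  value = c(1:10, 500),
                  stringsAsFactors = FALSE)

capped <- pcap_at(toy, lower = 0.10, upper = 0.90)
capped$value  # the extreme value 500 is pulled down to the 90th percentile
```

Non-numeric columns such as `id` pass through unchanged, as in the original function.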