How to Extract Numeric Variables from Dataframe in R

In R, you can extract numeric columns from a data frame using various methods.

Let's create a sample data frame called mydata having 3 variables (name, age, height).

# Create a sample data frame
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)

How to Extract all Numeric Variables in R

The data frame mydata has two numeric columns, age and height. In a data frame with many variables, you often don't know the names of the numeric columns in advance, so you need to select them by type rather than by name.

Base R

numeric_columns <- mydata[sapply(mydata, is.numeric)]
print(numeric_columns)

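In base R, you can extract all numeric columns (variables) with the sapply() function. sapply() belongs to the apply family of functions, which call a function on each element of an object without an explicit loop. Here it applies is.numeric() to every column of mydata and returns a logical vector, which is then used to subset the data frame.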
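To make the subsetting step more concrete, here is a minimal sketch that prints the intermediate logical vector before using it. The variable names is_num, is_num_strict and numeric_columns_alt are only illustrative. The vapply() and Filter() lines are optional base R alternatives and are not part of the method above.

```r
# Same sample data frame as above
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)

# One TRUE/FALSE value per column, named after the column
is_num <- sapply(mydata, is.numeric)
print(is_num)

# Logical subsetting keeps only the columns marked TRUE
numeric_columns <- mydata[is_num]

# Alternative: vapply() declares the expected return type (one logical
# value per column), so it always returns a logical vector
is_num_strict <- vapply(mydata, is.numeric, logical(1))

# Alternative: Filter() applies the predicate and subsets in one call
numeric_columns_alt <- Filter(is.numeric, mydata)
```

vapply() can be the safer choice inside functions, because sapply() returns an empty list rather than a logical vector when the data frame has no columns.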

dplyr

In the dplyr package, the select_if() function selects columns that satisfy a condition. Here, is.numeric keeps only the numeric columns. Since dplyr 1.0.0, select_if() is superseded by select(where(is.numeric)), though it still works.

library(dplyr)

# Select numeric columns using select_if()
numeric_columns <- mydata %>% select_if(is.numeric)
print(numeric_columns)
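The same selection can also be written with select() and where(). This is a minimal sketch, assuming dplyr 1.0.0 or later is installed. It should return the same two columns as the select_if() version.

```r
library(dplyr)

# Same sample data frame as above
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)

# where() wraps a predicate that is evaluated once per column
numeric_columns <- mydata %>% select(where(is.numeric))
print(numeric_columns)
```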


Extracting Numeric Variables with No Missing Values in R

Let's say you want to keep numeric columns that have no missing values in R.

# Create a sample data frame
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)

Base R

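# Flag the numeric columns
numeric_cols <- sapply(mydata, is.numeric)
# Among the numeric columns, flag those with zero missing values
numeric_no_missing <- colSums(is.na(mydata[numeric_cols])) == 0
# Keep the numeric columns that have no missing values
numeric_no_missing_cols <- mydata[numeric_cols][numeric_no_missing]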

Let's see how the code works:

numeric_cols <- sapply(mydata, is.numeric) returns a logical vector that is TRUE for each numeric column in the data frame and FALSE otherwise.
numeric_no_missing <- colSums(is.na(mydata[numeric_cols])) == 0 counts the missing values in each numeric column and returns TRUE for the columns that have none.
numeric_no_missing_cols <- mydata[numeric_cols][numeric_no_missing] selects the numeric columns with no missing values into a new data frame.
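The three steps can also be folded into a single predicate. The sketch below is one possible way to do that. The helper name is_complete_numeric is hypothetical and not part of base R. On the sample data, age has one NA and weight has two, so only height is kept.

```r
# Same sample data frame as above
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)

# Hypothetical helper: TRUE only for a numeric column with no NA values
is_complete_numeric <- function(col) is.numeric(col) && !anyNA(col)

# One logical value per column, then subset the data frame with it
keep <- vapply(mydata, is_complete_numeric, logical(1))
numeric_no_missing_cols <- mydata[keep]
print(names(numeric_no_missing_cols))
```

Checking the type and the missing values in the same predicate avoids the intermediate subset mydata[numeric_cols].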

dplyr

library(dplyr)

numeric_no_missing_cols <- mydata %>%
  select(where(is.numeric)) %>%
  select(where(~ all(!is.na(.))))

To keep only the numeric columns that have no missing values in dplyr, use the select() function together with where(). select(where(is.numeric)) keeps the numeric columns, and select(where(~ all(!is.na(.)))) then keeps the columns in which no value is missing (NA). On the sample data, only height remains.
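The two select() calls can also be merged into one by passing where() a function that checks both conditions. This is a minimal sketch, assuming dplyr 1.0.0 or later. The argument name col is arbitrary.

```r
library(dplyr)

# Same sample data frame as above
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)

# A single predicate: the column must be numeric and contain no NA
numeric_no_missing_cols <- mydata %>%
  select(where(function(col) is.numeric(col) && !anyNA(col)))

print(names(numeric_no_missing_cols))
```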


About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
