# How to Extract Numeric Variables from a Data Frame in R

In R, you can extract numeric columns from a data frame using various methods.

Let's create a sample data frame called `mydata` with three variables (`name`, `age`, `height`).

```r
# Create a sample data frame
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)
```

## How to Extract all Numeric Variables in R

The data frame `mydata` has two numeric columns, `age` and `height`. In a data frame with many variables, you often don't know the names of the numeric columns in advance, so it is easier to select them by type than by name.

Base R

```r
numeric_columns <- mydata[sapply(mydata, is.numeric)]
print(numeric_columns)
```

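In base R, you can extract the numeric columns (variables) using the `sapply` function. `sapply` belongs to the apply family of functions, which apply a function to each element of an object without an explicit loop. Here it calls `is.numeric` on every column of `mydata` and returns a logical vector, which is then used to subset the columns.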
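To see what `sapply` hands back before the subsetting step, the sketch below prints the intermediate logical vector. The names `is_num` and `is_num_strict` are illustrative only, and `vapply` is shown as an optional stricter variant rather than something this post relies on.

```r
# Same sample data frame as above
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)

# One TRUE/FALSE per column, named after the column
is_num <- sapply(mydata, is.numeric)
print(is_num)  # name is FALSE; age and height are TRUE

# Names of the numeric columns
numeric_names <- names(mydata)[is_num]
print(numeric_names)  # "age" "height"

# vapply() does the same job but fails loudly if a result
# is not a single logical value
is_num_strict <- vapply(mydata, is.numeric, logical(1))
```

Indexing `mydata` with this vector, as in `mydata[is_num]`, keeps only the columns marked TRUE.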

dplyr

In the dplyr package, the `select_if()` function selects columns that satisfy a condition. Here the condition is `is.numeric`, so only the numeric columns are kept.

```r
library(dplyr)

# Select numeric columns using select_if()
numeric_columns <- mydata %>% select_if(is.numeric)
print(numeric_columns)
```
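In dplyr 1.0.0 and later, `select_if()` is marked as superseded in favor of `select()` combined with `where()`, the same form used in the next section. A minimal sketch of the equivalent call, assuming dplyr 1.0.0 or later is installed:

```r
library(dplyr)

mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(25, 30, 28),
  height = c(165.5, 180.0, 172.3)
)

# where() wraps a predicate function; select() keeps the
# columns for which the predicate returns TRUE
numeric_columns <- mydata %>% select(where(is.numeric))
print(numeric_columns)
```

Both calls should return the `age` and `height` columns for this data frame.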

## Extracting Numeric Variables with No Missing Values in R

Let's say you want to keep numeric columns that have no missing values in R.

```r
# Create a sample data frame
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)
```

Base R

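```r
# TRUE for numeric columns, FALSE otherwise
numeric_cols <- sapply(mydata, is.numeric)
# TRUE for numeric columns that contain no NA
numeric_no_missing <- colSums(is.na(mydata[numeric_cols])) == 0
# Keep only the numeric columns without missing values
numeric_no_missing_cols <- mydata[numeric_cols][numeric_no_missing]
```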

Let's see how the code works:

1. `numeric_cols <- sapply(mydata, is.numeric)` returns a logical vector that is TRUE for each numeric column of the data frame and FALSE otherwise.
2. `numeric_no_missing <- colSums(is.na(mydata[numeric_cols])) == 0` counts the missing values in each numeric column and returns TRUE for the columns that have none.
3. `numeric_no_missing_cols <- mydata[numeric_cols][numeric_no_missing]` stores the numeric columns with no missing values in a new data frame.
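The three steps can be traced on the sample data by inspecting each intermediate object. In this sketch, `na_counts` is an extra illustrative variable introduced only to show the per-column NA counts.

```r
mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)

# Step 1: flag the numeric columns
numeric_cols <- sapply(mydata, is.numeric)
print(numeric_cols)  # name is FALSE; age, height, weight are TRUE

# Step 2: count NAs per numeric column, then flag the columns with zero
na_counts <- colSums(is.na(mydata[numeric_cols]))
print(na_counts)  # age has 1, height has 0, weight has 2
numeric_no_missing <- na_counts == 0

# Step 3: keep the numeric columns that passed step 2
numeric_no_missing_cols <- mydata[numeric_cols][numeric_no_missing]
print(names(numeric_no_missing_cols))  # "height"
```

Only `height` survives, because `age` and `weight` each contain at least one NA.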

dplyr

```r
library(dplyr)

numeric_no_missing_cols <- mydata %>%
  select(where(is.numeric)) %>%
  select(where(~ all(!is.na(.))))
```

If you want to keep columns that have no missing values, you can use the `select()` function with `where()` in dplyr. `select(where(is.numeric))` selects only the numeric columns. `select(where(~ all(!is.na(.))))` then keeps the columns in which no value is missing (NA). With the sample data, only `height` remains, because `age` and `weight` each contain an NA.
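The two `select()` calls can also be folded into one by passing `where()` a single predicate that checks both conditions. The sketch below assumes dplyr 1.0.0 or later; `is_complete_numeric` is a hypothetical helper written for this example, not a dplyr function.

```r
library(dplyr)

mydata <- data.frame(
  name = c("Alice", "Bob", "Charlie", "Dave"),
  age = c(25, 30, 28, NA),
  height = c(165.5, 180.0, 172.3, 189),
  weight = c(NA, NA, 72, 74)
)

# Hypothetical helper: TRUE only for numeric columns without any NA
is_complete_numeric <- function(x) {
  is.numeric(x) && !anyNA(x)
}

numeric_no_missing_cols <- mydata %>%
  select(where(is_complete_numeric))

print(names(numeric_no_missing_cols))  # "height"
```

Because `where()` accepts any function that returns a single TRUE or FALSE per column, the same helper also works in base R with `mydata[sapply(mydata, is_complete_numeric)]`.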
