In this tutorial, you will learn how to split a sample into training and test data sets with R.

The following code randomly selects 70% of the rows for the training set and assigns the remaining 30% to the test set.

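data <- read.csv("c:/datafile.csv")  # load the data set

dt <- sort(sample(nrow(data), floor(nrow(data) * 0.7)))  # row indices of a random 70% sample

train <- data[dt, ]  # training set: the sampled rows

test <- data[-dt, ]  # test set: the remaining 30% of rows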
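The same split can be tried without an external file. The sketch below is only an illustration: it assumes the built-in `mtcars` data set as a stand-in for `datafile.csv`, and the seed value `123` and the name `n_train` are arbitrary choices, not part of the original code.

```r
# Reproducible 70/30 split on a built-in data set (mtcars stands in for datafile.csv)
set.seed(123)                        # arbitrary seed, used only to make the sketch repeatable
data <- mtcars                       # 32 rows

n_train <- floor(nrow(data) * 0.7)   # whole number of training rows
dt <- sort(sample(nrow(data), n_train))

train <- data[dt, ]                  # sampled rows
test  <- data[-dt, ]                 # all remaining rows

# Sanity checks: the two sets cover the data and share no rows
nrow(train) + nrow(test) == nrow(data)
length(intersect(rownames(train), rownames(test))) == 0
```

Setting a seed before calling `sample()` makes the split repeatable, which helps when results need to be compared across runs.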

The **sample()** function randomly picks 70% of the rows from the data set. It samples without replacement, so no row is selected twice.

**Method 2:** To maintain the same percentage of event rate in both the training and validation data sets.

library(caret)

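set.seed(3456)  # fix the random seed so the split is reproducible

trainIndex <- createDataPartition(data$FD, p = 0.7, list = FALSE, times = 1)  # row indices of a 70% sample

Train <- data[trainIndex, ]  # training set

Valid <- data[-trainIndex, ]  # validation set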

**FD** is the dependent variable, which takes two values, 1 and 0. **Make sure it is defined as a factor** so that createDataPartition() treats it as a class label and samples within each class.
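To see what keeping the same event rate means in practice, here is a minimal base R sketch of a stratified split. It is not caret's implementation, only the same idea written by hand. The toy data frame, the column `x`, the 20% event rate, and the helper name `idx_by_class` are all assumptions made for illustration.

```r
# Base R sketch of a stratified 70/30 split (not caret's implementation)
set.seed(3456)

# Toy data: FD is a hypothetical binary outcome with 40 events out of 200 rows
data <- data.frame(x = rnorm(200), FD = rep(c(1, 0), times = c(40, 160)))
data$FD <- factor(data$FD)           # store the outcome as a factor

# Group row indices by class, then sample 70% of the indices within each class
idx_by_class <- split(seq_len(nrow(data)), data$FD)
trainIndex <- sort(unlist(lapply(idx_by_class, function(idx) {
  idx[sample.int(length(idx), round(length(idx) * 0.7))]
}), use.names = FALSE))

Train <- data[trainIndex, ]
Valid <- data[-trainIndex, ]

# Event rate (share of rows with FD equal to 1) in each set
mean(Train$FD == "1")
mean(Valid$FD == "1")
```

Because 70% of each class is drawn separately, both sets keep the 20% event rate of the toy data. A plain `sample()` over all rows would only match that rate on average.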
