Comments on ListenData: Ensemble Methods in R : Practical Guide
Blog author: Deepanshu Bhalla. Feed updated: 2024-03-28 07:44. 12 comments.

Seba (2019-06-23 14:58):
good post!, thanks

Abhay (2017-12-29 06:14):
Many thanks Deepanshu.
I have one question though. How can we use Grid (expand.grid) values for different algorithms?

Regards
Abhay

Anonymous (2017-10-11 09:25):
Hi. First, I'd like to thank you for the great and easy-to-follow walkthrough. I have been able to adapt this to several other algorithms and it works fine, with the expected results. However, I have trouble understanding what exactly training$OOF_dt and testing$dt are.

My understanding is that one is just a sorted form of the other, and I know they form new columns in the training and testing sets. The confusing part is that you referred to "training$OOF_dt" as another prediction.

Thank you for responding.

Anonymous (2017-03-09 12:58):
Hi,

I copied and pasted the code above and ran it on my computer. I ran into a problem and couldn't figure out what went wrong. Attached is the code; the last line is where the problem occurs.

    # Loading Required Packages
    library(caret)
    library(caTools)
    library(RCurl)
    library(caretEnsemble)
    library(pROC)

    # Reading data file
    urlfile <- 'https://raw.githubusercontent.com/hadley/fueleconomy/master/data-raw/vehicles.csv'
    x <- getURL(urlfile, ssl.verifypeer = FALSE)
    vehicles <- read.csv(textConnection(x))

    # Cleaning up the data and only use the first 24 columns
    vehicles <- vehicles[names(vehicles)[1:24]]
    vehicles <- data.frame(lapply(vehicles, as.character), stringsAsFactors=FALSE)
    vehicles <- data.frame(lapply(vehicles, as.numeric))
    vehicles[is.na(vehicles)] <- 0
    vehicles$cylinders <- ifelse(vehicles$cylinders == 6, 1,0)

    # Making dependent variable factor and label values
    vehicles$cylinders <- as.factor(vehicles$cylinders)
    vehicles$cylinders <- factor(vehicles$cylinders,
                                 levels = c(0,1),
                                 labels = c("level1", "level2"))

    # Split data into two sets - Training and Testing
    set.seed(107)
    inTrain <- createDataPartition(y = vehicles$cylinders, p = .7, list = FALSE)
    training <- vehicles[ inTrain,]
    testing <- vehicles[-inTrain,]

    # Setting Control
    ctrl <- trainControl(
      method='cv',
      number= 3,
      savePredictions=TRUE,
      classProbs=TRUE,
      index=createResample(training$cylinders, 10),
      summaryFunction=twoClassSummary
    )

    # Train Models
    model_list <- caretList(
      cylinders~., data=training,
      trControl = ctrl,
      metric='ROC',
      tuneList=list(
        rf1=caretModelSpec(method='rpart', tuneLength = 10),
        gbm1=caretModelSpec(method='gbm', distribution = "bernoulli",
                            bag.fraction = 0.5, tuneGrid=data.frame(n.trees = 50,
                                                                    interaction.depth = 2,
                                                                    shrinkage = 0.1,
                                                                    n.minobsinnode = 10))
      )
    )

    # Check AUC of Individual Models
    model_list$rf1
    model_list$gbm1

    #Check the 2 model’s correlation
    #Good candidate for an ensemble: their predictions are fairly uncorrelated,
    #but their overall accuracy is similar
    modelCor(resamples(model_list))


    #################################################################
    # Technique I : Stacking / Blending with GLM
    #################################################################

    glm_ensemble <- caretStack(
      model_list,
      method='glm',
      metric='ROC',
      trControl=trainControl(
        method='cv',
        number=3,
        savePredictions=TRUE,
        classProbs=TRUE,
        summaryFunction=twoClassSummary
      )
    )

    # Check Results
    glm_ensemble


    ########################################################
    # Validation on Testing Sample
    ########################################################

    ensemble <- predict(glm_ensemble, newdata=testing, type='prob')$level2

When I ran the line above, I got the following message:
"Error in eval(expr, envir, enclos) : object 'rf1' not found"

Please help. Thanks in advance.

Anonymous (2016-12-20 20:36):
Please send the data

Anonymous (2016-10-23 02:31):
thanks for sharing.. was very helpful..

Deepanshu Bhalla (2015-09-21 23:54):
Cool. Glad you found it useful.

cosmos (2015-09-21 23:51):
ok.
I was using the older version. By the way, very good post.

Deepanshu Bhalla (2015-09-21 23:50):
It works in the latest version of caret. Check out this link: http://topepo.github.io/caret/training.html

cosmos (2015-09-21 23:42):
This comment has been removed by the author.

Deepanshu Bhalla (2015-09-14 02:39):
Why should I remove n.minobsinnode = 10? It is one of the tuning parameters of GBM.

cosmos (2015-09-14 01:52):
Hi...
You need to remove "n.minobsinnode = 10" from tuneList=list(....)

It was a great help in understanding the blending.

Thanks
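
Editor's note on the tuning-grid questions raised in this thread (expand.grid for different algorithms, and whether n.minobsinnode belongs in the gbm grid): the following is a minimal sketch, not the post author's code. It assumes a caret version recent enough to treat n.minobsinnode as a gbm tuning parameter, as the reply in this thread indicates. Each algorithm gets its own grid, whose column names match that method's tuning parameters; cp for rpart and all grid values here are assumptions chosen for illustration. The grids are built with base R only, and the commented lines show where they might be passed.

```r
# One tuning grid per algorithm; column names must match the tuning
# parameters that caret defines for that method.

# rpart: a single tuning parameter, cp (complexity parameter).
rpart_grid <- expand.grid(cp = c(0.001, 0.01, 0.1))

# gbm: four tuning parameters in recent caret versions.
# expand.grid() returns every combination of the supplied values.
gbm_grid <- expand.grid(
  n.trees = c(50, 100),
  interaction.depth = c(1, 2),
  shrinkage = 0.1,
  n.minobsinnode = 10
)

nrow(rpart_grid)  # 3 candidate settings
nrow(gbm_grid)    # 2 * 2 * 1 * 1 = 4 candidate settings

# Hypothetical usage, mirroring the caretList() call posted in this thread
# (not run here; requires caret and caretEnsemble):
# tuneList = list(
#   rf1  = caretModelSpec(method = 'rpart', tuneGrid = rpart_grid),
#   gbm1 = caretModelSpec(method = 'gbm',   tuneGrid = gbm_grid)
# )
```

With an older caret that does not list n.minobsinnode among the gbm tuning parameters, that column would have to be dropped from gbm_grid, which matches the version difference discussed in the thread.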