Comments on ListenData: Dimensionality Reduction with R
Blog author: Deepanshu Bhalla. 11 comments, newest first. Feed last updated 2024-03-28.

Emmanuel (2021-01-06 08:34):
With this I know I'm not alone in data science.

Anonymous (2017-01-09 18:46):
What an informative post! It's really helpful.
But I have a question. The findCorrelation() function searches for columns to remove in order to reduce pair-wise correlations. Is that different from the vif() function? vif() measures multicollinearity. What's the difference?

Deepanshu Bhalla (2016-10-16 05:36):
I have added more description in the article. Hope it helps.

Anonymous (2016-10-11 10:59):
What do you mean by the mean absolute correlation of each variable? How can we compute the correlation of a single variable?
Thanks.

Varun (2015-07-21 01:30):
Really like the way you have created a whole correlation module. Thanks. Have you written any other articles on pre-analytics visualization in R?

Deepanshu Bhalla (2015-07-01 10:48):
It is because the OOB error is very small (0% in your output), so there is almost no scope for improvement. Hence, improve = 0.01 fails. You can skip this line of code and go straight to the next step, setting mtry = 4 instead of mtry = best.m. See the code below:
    rf <- randomForest(classe ~ ., data = dat3, mtry = 4, importance = TRUE, ntree = 1000)

siddu (2015-07-01 00:08):
Hi Deepanshu,
Thanks for your very useful code. I am getting an error in the mtry command, as shown below. Any clue how to solve it?
    mtry <- tuneRF(dat3[, -36], dat3[, 36], ntreeTry = 1000, stepFactor = 1.5, improve = 0.01, trace = TRUE, plot = TRUE)
    mtry = 5 OOB error = 0%
    Searching left ...
    mtry = 4 OOB error = 0%
    NaN 0.01
    Error in if (Improve > improve) { : missing value where TRUE/FALSE needed

Deepanshu Bhalla (2015-06-30 23:29):
No, it keeps one of the two highly correlated variables. If two variables have a high correlation, the function looks at the mean absolute correlation of each variable and removes only the one with the larger mean absolute correlation.

Anonymous (2015-06-30 14:29):
Hi Manish, nice article.
One question: if we are removing highly correlated variables, do we keep one variable from the highly correlated set (cluster) to represent the removed variables, or do we remove all of them?

Deepanshu Bhalla (2015-06-27 08:15):
Glad you liked it. Stay tuned :-)

Manish Mahajan (2015-06-27 07:54):
Great article. I am learning R and this was very helpful indeed. Please keep writing more hands-on "how to do this in R" posts :-)
Cheers,
Manish
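
Note on the "mean absolute correlation" question above: a variable's mean absolute correlation is the average of the absolute values of its correlations with all the other variables, so it is a property of one variable relative to the rest. The base-R sketch below illustrates the rule described in the thread on a made-up 3-variable correlation matrix. The matrix values, the names `a`, `b`, `c`, and the single-pass loop are assumptions for illustration only; this is not caret's actual findCorrelation() code, which may handle details (such as re-checking after each removal) differently.

```r
# Hypothetical 3-variable correlation matrix (values made up for illustration).
vars <- c("a", "b", "c")
cm <- matrix(c(1.00, 0.95, 0.30,
               0.95, 1.00, 0.10,
               0.30, 0.10, 1.00),
             nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

abs_cm <- abs(cm)
diag(abs_cm) <- NA  # a variable's correlation with itself is not informative

# Mean absolute correlation of each variable with every other variable.
mean_abs_cor <- rowMeans(abs_cm, na.rm = TRUE)

# Pairs whose absolute correlation exceeds the cutoff (upper triangle only,
# so each pair is listed once).
cutoff <- 0.9
high <- which(abs_cm > cutoff & upper.tri(abs_cm), arr.ind = TRUE)

# From each such pair, flag the member with the larger mean absolute correlation.
to_drop <- character(0)
for (k in seq_len(nrow(high))) {
  i <- vars[high[k, "row"]]
  j <- vars[high[k, "col"]]
  to_drop <- union(to_drop, if (mean_abs_cor[i] > mean_abs_cor[j]) i else j)
}

print(round(mean_abs_cor, 3))
print(to_drop)
```

Here `a` and `b` are the only pair above the 0.9 cutoff. `a` has a mean absolute correlation of (0.95 + 0.30) / 2 = 0.625 against (0.95 + 0.10) / 2 = 0.525 for `b`, so only `a` is flagged and `b` stays in the data to represent the pair.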
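
Note on the tuneRF error above: the `NaN 0.01` line in the trace suggests the improvement value itself is undefined when both OOB errors are 0. The sketch below reproduces that failure in base R, assuming the relative improvement is computed as 1 minus (new error / old error). That formula and the variable names are assumptions made for illustration, not a quote from the randomForest source.

```r
# OOB error rates taken from the trace above: both were 0.
error_old <- 0
error_cur <- 0

# Assumed form of the relative-improvement calculation; 0 / 0 is NaN in R.
improvement <- 1 - error_cur / error_old
print(improvement)  # NaN

# NaN > 0.01 evaluates to NA, and if() cannot branch on NA, so it raises an error.
msg <- tryCatch(
  {
    if (improvement > 0.01) "keep searching" else "stop"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# A guarded comparison treats an undefined improvement as "no improvement".
improved <- isTRUE(improvement > 0.01)
print(improved)
```

This is consistent with the advice in the thread: when the OOB error is already 0 there is nothing left to tune, so skipping the tuneRF step and fixing mtry by hand is a reasonable workaround.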