In linear regression analysis, an important assumption is that the relationship between each independent variable and the dependent variable is linear. Logistic regression, by contrast, assumes a linear relationship between each independent variable and the logit (log-odds) of the outcome.
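For a single predictor $x$, the two assumptions can be written as follows. The symbols $\beta_0$, $\beta_1$, $\varepsilon$ and $p$ are notation chosen here for illustration, with $p$ the probability of the event:

$$
y = \beta_0 + \beta_1 x + \varepsilon
$$

$$
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 x
$$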

**How to check non-linearity**

**Pearson correlation** is a measure of linear relationship. The variables must be measured on an interval scale. It is sensitive to outliers. If the Pearson correlation coefficient of a variable is close to 0, there is no linear relationship between the variables.

**Spearman's correlation** is a measure of monotonic relationship. It can be used for ordinal variables. It is less sensitive to outliers. If the Spearman correlation coefficient of a variable is close to 0, there is no monotonic relationship between the variables.
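The difference between the two coefficients can be seen on a relationship that is monotonic but not linear. The following is a minimal pure-Python sketch, not a reference implementation: the helper names (`average_ranks`, `pearson`, `spearman`) and the toy data `y = exp(x)` are assumptions made for illustration. Spearman is computed here as the Pearson correlation of the ranks.

```python
from math import exp, sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (illustrative only): y always increases with x, but not linearly.
x = list(range(1, 11))
y = [exp(v) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson  = {r_pearson:.3f}")
print(f"Spearman = {r_spearman:.3f}")
```

Because `y` always increases with `x`, the ranks agree perfectly and Spearman reaches 1, while Pearson stays noticeably below 1 because the points do not lie on a straight line.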

**Hoeffding's D correlation** is a measure of linear, monotonic and **non-monotonic relationship**. Its values range from –0.5 to 1. The sign of the Hoeffding coefficient has no interpretation.

A variable with a very low rank for Spearman (coefficient close to 0) and a very high rank for Hoeffding indicates a non-monotonic relationship.

A variable with a very low rank for Pearson (coefficient close to 0) and a very high rank for Hoeffding indicates a non-linear relationship.
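To make the contrast concrete, here is a minimal pure-Python sketch of Hoeffding's D next to Spearman on a U-shaped relationship. It assumes the usual rank-based formula with the factor of 30 (the scaling that gives the –0.5 to 1 range) and the common tie convention of counting 1/2 or 1/4. The function names and the toy data are illustrative, and the O(n²) loop is meant for small samples only.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def hoeffding_d(x, y):
    """Hoeffding's D, scaled by 30 so that it lies between -0.5 and 1."""
    n = len(x)
    if n < 5:
        raise ValueError("Hoeffding's D needs at least 5 observations")
    r = average_ranks(x)
    s = average_ranks(y)
    # Bivariate rank Q_i: 1 + number of points below point i in both x and y.
    q = []
    for i in range(n):
        qi = 1.0
        for j in range(n):
            if j == i:
                continue
            if x[j] < x[i] and y[j] < y[i]:
                qi += 1.0
            elif x[j] == x[i] and y[j] == y[i]:
                qi += 0.25   # tied on both coordinates
            elif (x[j] == x[i] and y[j] < y[i]) or (x[j] < x[i] and y[j] == y[i]):
                qi += 0.5    # tied on one coordinate, lower on the other
        q.append(qi)
    d1 = sum((qi - 1) * (qi - 2) for qi in q)
    d2 = sum((ri - 1) * (ri - 2) * (si - 1) * (si - 2) for ri, si in zip(r, s))
    d3 = sum((ri - 2) * (si - 2) * (qi - 1) for ri, si, qi in zip(r, s, q))
    numerator = (n - 2) * (n - 3) * d1 + d2 - 2 * (n - 2) * d3
    return 30 * numerator / (n * (n - 1) * (n - 2) * (n - 3) * (n - 4))

# Toy U-shaped data (illustrative only); the 0.3 shift avoids ties in y.
x = list(range(-10, 11))
y = [(v - 0.3) ** 2 for v in x]

rho = spearman(x, y)
d = hoeffding_d(x, y)
print(f"Spearman      = {rho:.3f}")
print(f"Hoeffding's D = {d:.3f}")
```

On this toy data Spearman comes out close to 0 while Hoeffding's D is clearly positive, which is the pattern described above for a non-monotonic relationship.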

**Criterion to eliminate irrelevant variables**

If a variable ranks poorly on both the Spearman and Hoeffding correlation metrics, the relationship between the variables appears to be random, so the variable is a candidate for elimination.
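The ranking rules can be sketched as a small screening step. This is only an illustration under stated assumptions: the coefficients are taken as already computed, the variable names and values in `coeffs` are made up, and the `keep_top` cutoff for what counts as a "poor" rank is a hypothetical parameter that would need to be chosen for the data at hand.

```python
def rank_descending(scores):
    """Map each variable name to its rank, where 1 is the largest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

def screen_variables(coeffs, keep_top):
    """Label each variable from its Spearman and Hoeffding ranks.

    coeffs maps a variable name to (spearman_coefficient, hoeffding_d).
    A rank worse than keep_top is treated as "poor" (assumed cutoff).
    """
    # Spearman is ranked on its absolute value; Hoeffding's sign is not interpreted.
    spearman_rank = rank_descending({k: abs(v[0]) for k, v in coeffs.items()})
    hoeffding_rank = rank_descending({k: v[1] for k, v in coeffs.items()})
    labels = {}
    for name in coeffs:
        poor_spearman = spearman_rank[name] > keep_top
        poor_hoeffding = hoeffding_rank[name] > keep_top
        if poor_spearman and poor_hoeffding:
            labels[name] = "drop candidate"
        elif poor_spearman:
            labels[name] = "possibly non-monotonic"
        else:
            labels[name] = "keep"
    return labels

# Hypothetical coefficients, for illustration only.
coeffs = {
    "age": (0.62, 0.21),
    "income": (-0.48, 0.15),
    "tenure": (0.03, 0.19),
    "zip_digit": (0.01, -0.01),
}

labels = screen_variables(coeffs, keep_top=2)
for name, label in labels.items():
    print(f"{name}: {label}")
```

Variables labelled "possibly non-monotonic" are the ones worth plotting or transforming before deciding whether to keep them.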

**SAS Macro to detect non-monotonic relationship**
