# AnalyticBridge

A Data Science Central Community

Can we find out which variables are important for logistic regression before carrying out the logistic regression?

Comment

Comment by RockyRambo on March 8, 2012 at 11:26am

Variable transformations are usually applied if the relationship between the independent and dependent variables is not linear.

Comment by Minethedata on November 1, 2010 at 7:20am
Tom, thanks for the input. Yes, I was about to ask when to use decision trees, logistic regression, clustering, or neural nets when the dependent variable is dichotomous. Can you please elaborate on this, as well as on why to use other data mining techniques when the relationship is non-linear? Is this rule applicable even when there is an exponential relationship between the variables and logistic regression seems apt for modeling? Does it refer to a non-linear relationship between the dependent variable and any one independent variable, even if the other independent variables are not non-linearly related? Also, please let me know if you have reading material on this.

Comment by Minethedata on October 29, 2010 at 6:00am
Yes, I was about to ask when to use decision trees, logistic regression, clustering, or neural nets when the dependent variable is dichotomous. Can you please elaborate on this, as well as on why to use other data mining techniques when the relationship is non-linear? Is this rule applicable even when there is an exponential relationship between the variables and logistic regression seems apt for modeling? Does it refer to a non-linear relationship between the dependent variable and any one independent variable, even if the other independent variables are not non-linearly related? Also, please let me know if you have reading material on this.

Comment by Minethedata on October 28, 2010 at 4:41am
Thanks, Tom, for the answer on concatenating 2 variables. A question that comes to mind: we consider a variable fit for use in logistic regression if it has a high correlation with the dependent variable. A high correlation coefficient also tells us that there is a linear relationship between the variables. Suppose 2 variables have a non-linear relationship; then the correlation coefficient may not capture this, and we may end up neglecting the variable for logistic regression. Or do we have to check other correlation coefficients, and if so, which one should be used for numeric variables?

Comment by Ralph Winters on October 25, 2010 at 2:33pm
To properly do correlation (and not association) you need the number of rows to be equal to the number of columns. Then, assuming the data is at least ordinal, I would perform a Spearman rank correlation test. If you are talking about simple association, then you could perform a test like Cramér's V.
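
As a rough illustration of why a rank-based measure such as Spearman's can help, here is a minimal sketch in plain Python. The data is made up (an exponential, strictly increasing relationship), and the helper names `pearson`, `ranks`, and `spearman` are hypothetical, not taken from any library. It only shows that Pearson's r can understate a monotone but non-linear relationship that Spearman's rho picks up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: y grows exponentially with x (monotone, but not linear).
x = list(range(1, 11))
y = [math.exp(v) for v in x]

pearson_r = pearson(x, y)
spearman_rho = spearman(x, y)
print("Pearson r:   ", round(pearson_r, 3))
print("Spearman rho:", round(spearman_rho, 3))
```

Because the ranks of `x` and `y` are identical here, Spearman's rho is 1 (up to floating-point rounding), while Pearson's r is noticeably lower.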

-Ralph Winters

Comment by Minethedata on October 25, 2010 at 2:45am
Tom, using crosstabs I get the frequencies of the data. Suppose there are 3 columns and 2 rows (this is all multinomial data); what formula is used to get the phi coefficient? (I assume you get the correlation coefficient from the phi coefficient only.) Also, in general, if the contingency table is not 2×2, what formula is used?
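
For a table that is not 2×2, one commonly used measure is Cramér's V, computed from the chi-square statistic as V = sqrt(chi² / (n · (min(rows, columns) − 1))). For a 2×2 table it equals the absolute value of the phi coefficient. The sketch below is only an illustration in plain Python: the counts are made up, the function names `chi_square` and `cramers_v` are hypothetical, and it assumes every row and column total is non-zero.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and total count for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def cramers_v(table):
    """Cramér's V for an r x c contingency table (list of rows)."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

# Made-up counts: 2 rows (dependent variable) by 3 columns (a categorical predictor).
table_2x3 = [[30, 20, 10],
             [10, 20, 30]]
v_2x3 = cramers_v(table_2x3)

# Made-up 2x2 table, where V coincides with |phi|.
table_2x2 = [[40, 10],
             [10, 40]]
v_2x2 = cramers_v(table_2x2)

print("Cramér's V (2x3):", round(v_2x3, 4))
print("Cramér's V (2x2):", round(v_2x2, 4))
```

V ranges from 0 (no association) to 1, and unlike a correlation coefficient it carries no sign.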

Also, please elaborate on how to concatenate 2 variables.

Comment by Minethedata on October 22, 2010 at 11:19am
Thanks, Idielle. What happens if the independent variable is categorical or multinomial? I mean, how do we get the correlation between such a non-numeric variable and the dependent variable?

Comment by Minethedata on October 22, 2010 at 5:55am
What happens if VAR_A and VAR_B are highly correlated with each other? In that case, do we consider only the variable that has the higher correlation with the dependent variable? And what happens if VAR_A and VAR_B are not correlated at all?

Comment by Minethedata on October 21, 2010 at 1:26am
Thanks, Tom, but my question is more about the coefficients of logistic regression. Suppose a variable has a higher correlation with the dependent variable than the other independent variables do. Will this be reflected in a higher beta (regression coefficient) value compared with the coefficients of the other independent variables?

Comment by Minethedata on October 20, 2010 at 5:42am
Thanks, Tom. Suppose a variable has a higher correlation with the dependent variable than the other independent variables do. Will this be reflected in a higher beta (regression coefficient) value compared with the coefficients of the other independent variables?
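
One point that bears on this question is that a logistic regression coefficient depends on the units of its variable, whereas a correlation coefficient does not, so the raw size of beta is not a direct readout of correlation. The sketch below is a minimal illustration under stated assumptions: the data is made up, the names `fit_logistic` and `pearson` are hypothetical helpers, and the model has a single predictor fitted by Newton's method. It fits the same predictor in two different units and compares the slopes.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton's method; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood w.r.t. b0
            g1 += (yi - p) * xi       # gradient of the log-likelihood w.r.t. b1
            h00 += w                  # entries of the 2x2 matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up, non-separable data: the same predictor in two different units.
x = [1, 2, 3, 4, 5, 6, 7, 8]
x_scaled = [100 * v for v in x]
y = [0, 0, 1, 0, 1, 0, 1, 1]

r, r_scaled = pearson(x, y), pearson(x_scaled, y)
_, slope = fit_logistic(x, y)
_, slope_scaled = fit_logistic(x_scaled, y)

print("correlation:", round(r, 4), "vs", round(r_scaled, 4))
print("slope:      ", round(slope, 4), "vs", round(slope_scaled, 6))
```

The correlation with the dependent variable is the same in both cases, while the slope for the rescaled predictor is 100 times smaller. Standardizing predictors before fitting makes coefficients easier to compare, although with correlated predictors the ordering of coefficients still need not follow the ordering of the individual correlations.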