A Data Science Central Community

Started Nov 16, 2011 0 Replies 0 Likes

Hi! I am working on a project that involves investigating the deterioration in R squared over a period of time for an OLS model. I am wondering how to approach this problem considering the fact that…

Started this discussion. Last reply by kiran chapidi Dec 11, 2012. 3 Replies 0 Likes

Hi All! I want to understand the ways in which Weight of Evidence (WoE) is computed or adjusted in the following scenarios: 1. When the number of goods in a class of a variable is 0. 2. When the number of bads…

Started this discussion. Last reply by Sandeep Sunkara Mar 24, 2012. 4 Replies 0 Likes

Hi! I need inputs on the pros and cons of building a log-reg model using dummy variables instead of the Weight of Evidence approach for categorical variables. Some of the cons that I can think of…

Started Apr 5, 2011 0 Replies 0 Likes

Hi! Can anyone suggest titles of books or reference material found on the web for FRAUD & RISK ANALYTICS? Regards, Sharath

Michael Akinwumi replied to Sharath Dandamudi's discussion Difference between Prediction and Forecasting

"They are two different concepts.
Prediction uses explanatory variables to characterize expected outcome or expected response.
On the other hand, forecasting uses trends in observation to characterize expected outcome or expected response.
In…"

May 31, 2017

MIKHIL NAGARALE replied to Sharath Dandamudi's discussion Difference between Prediction and Forecasting

"Prediction is the general term and is independent of time. Forecasting is prediction with time as one of the dependent variables. E.g.:
Prediction: predicting the amount spent by a user in a certain case. It happens over the…"

May 31, 2016

R.Venkat replied to Sharath Dandamudi's discussion Degrees of freedom and Standard deviation

"Imagine I ask you to choose 5 numbers that sum to 100.
For simplicity's sake, you tell me the 5 numbers are 20, 20, 20, 20, 20.
But when you utter 20 the fourth time, I tell you to stop and ask you what our goal was.
You tell me that the goal was to choose 5 no.…"

Jan 23, 2016
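
The reply above is cut off, but the usual conclusion of this example is that once four of the five numbers are fixed, the sum constraint leaves no freedom for the fifth. A minimal Python sketch of that idea follows. The helper name `last_value` and the small `data` list are made up for illustration, and the link to the n - 1 divisor in the sample standard deviation is the standard textbook argument, not something stated in the truncated reply.

```python
import statistics

def last_value(free_values, total):
    """Return the only value that makes free_values plus one more number sum to total."""
    return total - sum(free_values)

# Four numbers can be chosen freely; the fifth is forced by the sum constraint.
chosen = [20, 20, 20, 20]
fifth = last_value(chosen, 100)

# Same idea for deviations from the sample mean: they always sum to zero,
# so only n - 1 of them are free. statistics.stdev divides by n - 1.
data = [12, 15, 9, 24]  # hypothetical sample
mean = statistics.mean(data)
deviations = [x - mean for x in data]
sample_sd = statistics.stdev(data)
```

Any other four starting values work the same way: `last_value([10, 30, 5, 40], 100)` leaves exactly one admissible fifth number.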

Clifford Long replied to Sharath Dandamudi's discussion Linearity assumption in Linear Regression

"The 'linearity' assumption in linear regression means that the expected value of the response is a linear function of the parameters. "Linear in the betas." Compared to "linear in the predictor…"

Sep 15, 2015
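
To make "linear in the betas" concrete, here is a small sketch in plain Python. It assumes a hypothetical relationship y = 2 + 3x², which is curved in the predictor x but still linear in the two coefficients, so ordinary least squares applies after transforming x. The function name `fit_simple_ols` and the data are invented for illustration.

```python
def fit_simple_ols(z, y):
    """Closed-form OLS for y = b0 + b1 * z; returns (b0, b1)."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    b1 = sxy / sxx
    b0 = y_bar - b1 * z_bar
    return b0, b1

# Hypothetical data generated exactly from y = 2 + 3 * x**2 (curved in x).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 3.0 * xi ** 2 for xi in x]

# Transform the predictor, then fit a model that is linear in b0 and b1.
z = [xi ** 2 for xi in x]
b0, b1 = fit_simple_ols(z, y)
```

Because the data are noise-free, the fit recovers the generating coefficients up to floating-point error. A model such as y = b0 * exp(b1 * x) would not qualify, since it is not linear in b1.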

kiran chapidi replied to Sharath Dandamudi's discussion Computation of Weight of Evidence when either the number of bads or goods in a class of a variable is 0

"I have tried to use the WOE = ln(bad_distribution/good_distribution)
when the age variable
age band bads goods
19-25 2388 2019 8
26-30 1920 1716 24
31-35 1399 1377 53
36-40 1097 1157 73
41-45 934 1126 113
46-50 628 948 180
>50 527 876 209
The…"

Dec 11, 2012

Branko Mlikota replied to Sharath Dandamudi's discussion Computation of Weight of Evidence when either the number of bads or goods in a class of a variable is 0

Oct 23, 2012

RockyRambo replied to Sharath Dandamudi's discussion Data points for sensitivity and 1-specificity in ROC

"Hi Sharath,
Since sensitivity and (1 - specificity) are determined from the confusion matrix at different cutoff points, you can get both of them by varying the cutoff point. Let's say I have scores for 1 million observations in my model. If…"

Mar 7, 2012
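
The idea of varying the cutoff can be sketched in a few lines of Python. This is an illustration under assumptions: the scores, labels and cutoffs are made up, the function name `roc_points` is hypothetical, and an observation is treated as positive when its score is at or above the cutoff.

```python
def roc_points(scores, labels, cutoffs):
    """Return (cutoff, sensitivity, one_minus_specificity) for each cutoff.

    An observation is classified positive when its score >= cutoff.
    labels use 1 for the positive class and 0 for the negative class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for c in cutoffs:
        tp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 0)
        points.append((c, tp / positives, fp / negatives))
    return points

# Hypothetical model scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
points = roc_points(scores, labels, [0.0, 0.25, 0.5, 0.75, 1.01])
```

Each cutoff yields one confusion matrix and therefore one point on the ROC curve. The lowest cutoff classifies everything as positive, and a cutoff above the maximum score classifies nothing as positive.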

Sharath Dandamudi posted a discussion

### Reasons behind deterioration of R squared for an OLS model over a period of time

Hi! I am working on a project that involves investigating the deterioration in R squared over a period of time for an OLS model. I am wondering how to approach this problem, considering that there are a few dummy independent variables along with a few continuous ones. Do I have to check whether there is any shift at the overall model level and at the characteristic level? If there are any other approaches adopted in the industry, please let me know. Any help on this would be…

Nov 16, 2011
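
One possible starting point for the question above is to keep the original coefficients frozen, score each period separately, and compare R squared period by period. The sketch below assumes a hypothetical model with one continuous predictor `x` and one 0/1 dummy `d`. The coefficients, period names and data are all invented, and this is only one way to frame the check, not an established industry procedure.

```python
def r_squared(actual, predicted):
    """R squared = 1 - SS_res / SS_tot for one scoring sample."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def predict(rows, intercept, coefs):
    """Apply frozen OLS coefficients to rows of predictor values."""
    return [intercept + sum(c * v for c, v in zip(coefs, row)) for row in rows]

# Hypothetical frozen model: y = 1 + 2 * x + 5 * d, where d is a 0/1 dummy.
intercept, coefs = 1.0, [2.0, 5.0]

# Hypothetical data: (x, d) rows and observed y for two periods.
periods = {
    "2010": ([(1, 0), (2, 1), (3, 0), (4, 1)], [3.1, 9.8, 7.2, 13.9]),
    "2011": ([(1, 0), (2, 1), (3, 0), (4, 1)], [4.0, 8.0, 9.0, 12.0]),
}

# Overall model-level check: R squared of the frozen model in each period.
r2_by_period = {
    name: r_squared(y, predict(rows, intercept, coefs))
    for name, (rows, y) in periods.items()
}
```

A drop in `r2_by_period` from one period to the next signals a shift at the overall model level. A characteristic-level check would compare the distribution of each predictor, including the share of 1s in each dummy, across the same periods.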

Jozo Kovac replied to Sharath Dandamudi's discussion Computation of Weight of Evidence when either the number of bads or goods in a class of a variable is 0

"Exactly as you've written: it's undefined for some categories.
Such categories can't be used by logistic regression either.
You have several options:
- discard attributes having such categories
- merge categories so none of them…"

Oct 18, 2011

Sharath Dandamudi posted a discussion

### Computation of Weight of Evidence when either the number of bads or goods in a class of a variable is 0

Hi All! I want to understand the ways in which Weight of Evidence (WoE) is computed or adjusted in the following scenarios:
1. When the number of goods in a class of a variable is 0
2. When the number of bads in a class of a variable is 0

WoE = ln(distribution of goods / distribution of bads)

Scenario 1: WoE = ln(0) ?? when the number of goods in a class = 0.
Scenario 2: WoE = ln(distribution of goods / 0) = ln(infinity) ?? when the number of bads in a class = 0.

Regards, Sharath

Oct 17, 2011
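
Both scenarios in the question make the raw formula undefined. Besides discarding or merging classes, one commonly used workaround is to add a small constant to every count before taking the ratio. The sketch below illustrates that adjustment with the WoE = ln(goods share / bads share) convention from the question. The function name `woe_table`, the constant 0.5 and the counts are assumptions chosen for illustration, not something prescribed in this discussion.

```python
import math

def woe_table(counts, adjustment=0.5):
    """Compute WoE = ln(share of goods / share of bads) per class.

    counts maps class name -> (goods, bads). If any class has a zero count,
    `adjustment` is added to every cell before computing the shares; this
    smoothing is an illustrative choice, not part of the WoE definition.
    """
    has_zero = any(g == 0 or b == 0 for g, b in counts.values())
    add = adjustment if has_zero else 0.0
    adj = {k: (g + add, b + add) for k, (g, b) in counts.items()}
    total_goods = sum(g for g, _ in adj.values())
    total_bads = sum(b for _, b in adj.values())
    return {
        k: math.log((g / total_goods) / (b / total_bads))
        for k, (g, b) in adj.items()
    }

# Hypothetical counts: class "C" has no bads, so the raw WoE would be ln(x / 0).
counts = {"A": (80, 20), "B": (60, 40), "C": (30, 0)}
woe = woe_table(counts)
```

With the adjustment, class "C" gets a large but finite WoE instead of an undefined value. The size of the constant affects the result for sparse classes, so it is worth checking how sensitive the final model is to that choice.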

Jozo Kovac replied to Sharath Dandamudi's discussion Pros and cons of Dummy variable vs WoE approach for variables in Model building

"I understand well. It's about terminology.
WoE (Weight of Evidence) is a metric and has its own formula.
A dummy variable is a binary flag created from a categorical variable with more than 2 categories.
And again, a simpler model is better. If…"

Oct 11, 2011

Sharath Dandamudi replied to Sharath Dandamudi's discussion Pros and cons of Dummy variable vs WoE approach for variables in Model building

"Hi Jozo,
Thanks for the reply. What I meant by WoE vs Dummy is this: say there is a categorical variable (an independent variable, of course) with 4 levels:
Colour: Blue, Green, Red and White
The two ways I mentioned are computing WoE for…"

Oct 11, 2011
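
The two encodings being compared can be sketched for the Colour example as follows. The good/bad counts per colour are invented purely for illustration, the function names are hypothetical, and the WoE convention ln(goods share / bads share) is the one used elsewhere on this page.

```python
import math

colours = ["Blue", "Green", "Red", "White"]

def dummy_encode(value, levels=colours):
    """One 0/1 flag per level except the first, which acts as the baseline."""
    return [1 if value == level else 0 for level in levels[1:]]

def woe_encode(value, woe_by_level):
    """Replace the level with its single WoE number."""
    return woe_by_level[value]

# Hypothetical (goods, bads) counts per colour, invented for illustration.
counts = {"Blue": (40, 10), "Green": (30, 20), "Red": (20, 30), "White": (10, 40)}
total_goods = sum(g for g, _ in counts.values())
total_bads = sum(b for _, b in counts.values())
woe_by_level = {
    level: math.log((g / total_goods) / (b / total_bads))
    for level, (g, b) in counts.items()
}

red_dummies = dummy_encode("Red")          # three columns for four levels
red_woe = woe_encode("Red", woe_by_level)  # one numeric column
```

The dummy approach gives the model one coefficient per non-baseline level, while the WoE approach feeds a single numeric column whose values are fixed from the observed good/bad shares.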

Jozo Kovac replied to Sharath Dandamudi's discussion Pros and cons of Dummy variable vs WoE approach for variables in Model building

"First, you can compute WoE for both dummy and categorical variables; they aren't competitors.
Second, dummies lower degrees of freedom and produce simpler models, and simpler is better according to Occam's razor. And maybe also less sensitive to…"

Oct 10, 2011

Sharath Dandamudi's discussion was featured

### Pros and cons of Dummy variable vs WoE approach for variables in Model building

Hi! I need inputs on the pros and cons of building a log-reg model using dummy variables instead of the Weight of Evidence approach for categorical variables. Some of the cons that I can think of when using the dummy variable approach are:
1. Overfitting
2. Interpretation of output

I know one of the things that needs to be looked at is the number of unique levels within a categorical variable. But, making reasonable assumptions, in a generic sense I would like to know if there are any pros and few other…

Oct 5, 2011

Daniel I. Shostak replied to Sharath Dandamudi's discussion Difference between Prediction and Forecasting

"Hi Sharath:
I'm president of Strategic Affairs Forecasting LLC and am a futurist who has made very careful distinctions between prediction and forecast for many years. Here are the key elements in my opinion:
-For a number of…"

Apr 25, 2011

- Short Bio: Data Mining analyst
- Field of Expertise: Predictive Modeling, Data Mining, Statistical Programming
- Years of Experience in Analytical Role: 4
- Professional Status: Technical
- Interests: Networking, Other

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
