A Data Science Central Community

Hi,

Someone asked me about building a credit scoring model using only WOE/IV (without logistic regression, OLS regression, or CHAID).

Is it possible to assign points to a variable based on its IV, and to the individual categories of that variable based on their WOE?

Suppose the table at the end of this post is the output from the IV calculation. How would you use the WOE to assign scores to the categories?

Assumptions: the total score is 100 and there are 4 significant variables. Age is the most significant, with IV = 0.46; the other 3 variables each have an IV below 0.46 but above 0.25.

I thought we could weight each variable by its IV. Here, Age's weight (its share of the score) would be 100*0.46/(sum of the IVs of all 4 variables), which comes to, say, 35.
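A minimal sketch of that IV weighting, in Python. Only Age's IV of 0.46 comes from the post; the other three IVs (`var_b`, `var_c`, `var_d`) are hypothetical values chosen between 0.25 and 0.46 purely for illustration, and the function name is my own.

```python
# Split a total score across variables in proportion to their IV.
# Only "age": 0.46 is from the post; the other IVs are HYPOTHETICAL placeholders.
ivs = {"age": 0.46, "var_b": 0.32, "var_c": 0.28, "var_d": 0.26}
TOTAL_SCORE = 100


def iv_weighted_points(ivs, total_score):
    """Return each variable's share of total_score, proportional to its IV."""
    total_iv = sum(ivs.values())
    return {name: total_score * iv / total_iv for name, iv in ivs.items()}


points = iv_weighted_points(ivs, TOTAL_SCORE)
for name, pts in points.items():
    print(f"{name}: {pts:.1f}")
```

With these assumed IVs, Age receives roughly 35 of the 100 points, in line with the figure above. Different IVs for the other three variables would change that share.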

In this way we can get an IV-weighted score for each variable. But the categories within a variable can't all be assigned the same score, so how should a variable's score be distributed across its categories? The best category could be given a score equal to that of the variable, and the others lower scores.

Do you think this is a correct approach? If so, how can I get the scores for each category (1, 2, 3, 4, 5) of the age groups below? For example, group 5 gets 35, group 4 gets 25, group 3 gets 20, and so on.

| Age group | Good | Bad | Total | %Good | %Bad | %Good-%Bad | WOE = Ln(%Good/%Bad) | Marginal IV |
|---|---|---|---|---|---|---|---|---|
| 1 | 50 | 120 | 170 | 8.00% | 25.50% | -17.50% | -1.1604 | 0.20308535 |
| 2 | 75 | 110 | 185 | 12.00% | 23.40% | -11.40% | -0.6680 | 0.07615328 |
| 3 | 125 | 90 | 215 | 20.00% | 19.10% | 0.90% | 0.04348 | 0.00039137 |
| 4 | 175 | 80 | 255 | 28.00% | 17.00% | 11.00% | 0.49774 | 0.05475144 |
| 5 | 200 | 70 | 270 | 32.00% | 14.90% | 17.10% | 0.76480 | 0.13078134 |
| IV | | | | | | | | 0.4651628 |
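One possible way to turn the WOE values into category scores is sketched below: recompute WOE and IV from the Good/Bad counts in the table, then rescale the WOE linearly so that the best category receives the variable's full points. The linear (min-max) rescaling and the 35-point cap are assumptions for illustration, not an established scorecard method, and the function names are my own.

```python
import math

# Good/Bad counts per age group, taken from the table above.
good = [50, 75, 125, 175, 200]
bad = [120, 110, 90, 80, 70]


def woe_iv(good, bad):
    """Return (list of WOE per category, total IV) from good/bad counts."""
    total_good, total_bad = sum(good), sum(bad)
    woe, iv = [], 0.0
    for g, b in zip(good, bad):
        pct_good, pct_bad = g / total_good, b / total_bad
        w = math.log(pct_good / pct_bad)   # WOE = ln(%Good / %Bad)
        woe.append(w)
        iv += (pct_good - pct_bad) * w     # marginal IV of this category
    return woe, iv


def scale_woe(woe, max_points):
    """ASSUMED scheme: min-max rescale WOE to the range [0, max_points]."""
    lo, hi = min(woe), max(woe)
    return [max_points * (w - lo) / (hi - lo) for w in woe]


woe, iv = woe_iv(good, bad)
scores = scale_woe(woe, 35)  # 35 = Age's assumed IV-weighted points
for group, (w, s) in enumerate(zip(woe, scores), start=1):
    print(f"group {group}: WOE={w:.4f}, score={s:.1f}")
print(f"IV={iv:.4f}")
```

Under this scheme group 5 gets the full 35 points and group 1 gets 0, with the others spaced according to their WOE rather than evenly. The IV computed here can differ slightly from the table's total because the table works from rounded percentages.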

Thanks,

Nitin

