<p><b>Ian Morton's Posts - AnalyticBridge</b> (feed updated 2020-10-27; author profile: <a href="https://www.analyticbridge.datasciencecentral.com/profile/IanMorton">Ian Morton</a>)</p>
<p><b>The application of Propensity Score Matching</b> (Ian Morton, 2013-05-16)</p>
<p><a href="http://en.wikipedia.org/wiki/Propensity_score_matching">Propensity Score Matching</a> is a statistical matching technique that attempts to estimate the effect of a treatment, policy or other intervention by accounting for the covariates that predict receiving the treatment. It helps to reduce bias due to confounding and can be used to estimate the counterfactual outcome.</p>
<p>For example, many of you will have attended a particular university or school and achieved a certain result. But have you ever wondered what your result would have been had you attended somewhere else (the counterfactual outcome)? To estimate this, you would need to account for the covariates, using information on people like yourself who studied the same course. You could then estimate the counterfactual outcome using Propensity Score Matching.</p>
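<p>To make the idea concrete, here is a minimal sketch in Python. It is not the SAS code mentioned in this post. It assumes a single covariate, a logistic model for the propensity score fitted by plain gradient ascent, and one-to-one nearest-neighbour matching with replacement and no caliper. The function names (<code>fit_logistic</code>, <code>att_by_matching</code>) and the data are hypothetical: the data are synthetic and built so that the true treatment effect is 5.</p>

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ts, lr=0.1, steps=2000):
    """Fit P(treated | x) = sigmoid(a + b*x) by plain gradient ascent."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            p = sigmoid(a + b * x)
            grad_a += t - p
            grad_b += (t - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def att_by_matching(xs, ts, ys):
    """Average effect on the treated: match each treated unit to the
    control unit with the closest propensity score (with replacement)."""
    a, b = fit_logistic(xs, ts)
    scores = [sigmoid(a + b * x) for x in xs]
    controls = [i for i, t in enumerate(ts) if t == 0]
    diffs = []
    for i, t in enumerate(ts):
        if t == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(ys[i] - ys[j])
    return sum(diffs) / len(diffs)

# Synthetic illustration (assumed data, not from the post):
# x = prior attainment, t = 1 if the student attended "school A",
# y = final result, built as 2*x + 5*t so the true effect is 5.
xs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
ts = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
ys = [2 * x + 5 * t for x, t in zip(xs, ts)]

treated = [y for y, t in zip(ys, ts) if t == 1]
control = [y for y, t in zip(ys, ts) if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
matched = att_by_matching(xs, ts, ys)
```

<p>In this toy data the treated students tend to have higher prior attainment, so the naive difference in means overstates the effect, while the matched estimate lands closer to the value built into the data. Real applications would use many covariates and would check covariate balance after matching.</p>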
<p>I have put various resources (including SAS code) on my blog; these have allowed me to carry out Propensity Score Matching. See the blog post: <a href="http://www.analysisandstatistics.blogspot.co.uk/2013/05/what-could-propensity-score-matching-do.html">What could propensity score matching do for you ? (with examples from justice, medicine, education and finance)</a>.</p>
<p> </p>
<p>Ian Morton has built propensity scoring models for the financial services sector, for a utility company, and for the public sector. He has given a number of presentations on the technique of propensity score matching, and has also co-authored a forthcoming peer-reviewed journal article.</p>
<p><b>Building a good predictive model for credit risk</b> (Ian Morton, 2013-05-09)</p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">A colleague of mine wanted to understand how to build predictive models, and asked if I had a strategy for building them. I thought it would be useful to share this. For more details about each stage see my personal blog (<a href="http://bit.ly/10uyAVu" target="_self">My suggested strategy for building a “good” predictive model</a>).</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"> </span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">Stage 1 - Perform initial investigations</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">Stage 2 - Getting the data ready</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">Stage 3 - Modelling</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">Stage 4 - Check the model</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">Stage 5 - Start again</span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"> </span></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2">The bottom line: It’s an iterative process and it might take some time to get a model that’s acceptable in terms of fit, and acceptable to business users. Always, always, always, and at each stage, consult with the business to check on ethical issues, the applicability of the model, and whether the model can be implemented.</span></p>
<p></p>
<p><span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"><b>Ian Morton worked in credit risk for big banks for a number of years. He learnt about how to (and how not to) build “good” statistical models in the form of scorecards using the SAS Language.</b></span></p>
<p> </p>
<p><b>George E. P. Box: “Essentially, all models are wrong, but some are useful”</b></p>
<p></p>
<p><b>Round and round at sea with circular statistics</b> (Ian Morton, 2013-02-28)</p>
<p><i>I was recently reminded of some work I had completed for the oil and gas industry many years ago and thought it would be useful to share with other analysts/statisticians.</i></p>
<p><i>For more general information see a case study I have put onto my personal blog (<a href="http://analysisandstatistics.blogspot.co.uk/2012/12/offshore-storms-statistics-and-oil-rigs.html" target="_self">Offshore Storms Statistics and Oil Rigs</a>).</i></p>
<p> </p>
<p> </p>
<p>It was important for the oil and gas engineers to gain a better understanding of the offshore environment during storm conditions, and in particular whether there would be an impact upon the mooring of their semi-submersible drilling rigs. (See for example this paper: Bowers J, Morton I, Mould G 1997. Weathering the storm – how OR steered a course between extreme statistics &amp; offshore design. OR Insight; 10(3): 16-21).</p>
<p> </p>
<p>I needed to use circular statistics to analyse the wind and wave directions and to estimate the mean directions, each with a 95% confidence interval.</p>
<p> </p>
<p>Linear statistics aren’t appropriate because there is a crossover problem where the scale wraps round from 360 back to 0 degrees. Consider three wind directions of 358, 0 and 2 degrees. Using linear statistics to find the mean, we would add them together and divide by three, arriving at a mean wind direction of 120 degrees, when the answer should be 0 degrees.</p>
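<p>The crossover problem is easy to reproduce. The short Python sketch below (the function names are my own, for illustration only) compares the linear mean with the usual circular mean, which averages the sines and cosines of the angles and converts back with <code>atan2</code>.</p>

```python
import math

def linear_mean_deg(angles_deg):
    """Ordinary arithmetic mean, which ignores the wrap-around at 360."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean_deg(angles_deg):
    """Mean direction: average the unit vectors, then take the angle
    of the resultant, reported in the range [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

winds = [358, 0, 2]
print(linear_mean_deg(winds))    # 120.0, pointing the wrong way
print(circular_mean_deg(winds))  # 0 up to rounding (may print as a value just under 360)
```

<p>Because of floating-point rounding, the circular mean may come out as a tiny positive number or as a value a hair under 360; both represent the same direction.</p>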
<p> </p>
<p>I started to read about how to calculate basic circular statistics in a very useful book by NI Fisher, <i>Statistical analysis of circular data</i>, and soon realised that I would have to revise my school maths on trigonometric functions. The mean direction is found from equation 1. The calculation of a 95% confidence interval for the sample mean of circular data turns out to involve equations that are mathematically intractable, and approximations that are unwieldy. One way forward is to assume that the data fit the “von Mises” distribution (it’s a wild assumption; other distributions might be more appropriate). Given this leap of faith, a reasonable approximation for the concentration is given by equation 2, and from this an estimate of the 95% confidence interval follows from equation 3.</p>
<p>Maybe I have sown the seeds for you to think about using circular statistics in your work, or alternatively you didn’t get this far because you were put off by the equations. Fisher provides other practical examples, two of which are the arrival times at an intensive care unit and the vanishing directions of homing pigeons.</p>
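<p>For readers who prefer code to equations, the von Mises route described above can be sketched in Python as follows. This is my reading of the standard textbook approximations (a piecewise estimate of the concentration from the mean resultant length, then a large-sample interval of the form mean ± arcsin(z × standard error)); it is an assumption that these match equations 1 to 3 exactly, and the function names are hypothetical. The interval is generally considered reasonable only when the data are fairly concentrated.</p>

```python
import math

def mean_direction_and_resultant(angles_deg):
    """Return (mean direction in degrees, mean resultant length R-bar)."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    return math.degrees(math.atan2(s, c)) % 360.0, math.hypot(c, s)

def kappa_estimate(r_bar):
    """Piecewise approximation to the von Mises concentration."""
    if r_bar >= 1.0:
        return math.inf  # all directions identical
    if r_bar < 0.53:
        return 2 * r_bar + r_bar**3 + 5 * r_bar**5 / 6
    if r_bar < 0.85:
        return -0.4 + 1.39 * r_bar + 0.43 / (1 - r_bar)
    return 1 / (r_bar**3 - 4 * r_bar**2 + 3 * r_bar)

def von_mises_ci_deg(angles_deg, z=1.96):
    """Return (mean direction, half-width) of an approximate CI in degrees.
    z = 1.96 corresponds to a 95% interval."""
    n = len(angles_deg)
    mean, r_bar = mean_direction_and_resultant(angles_deg)
    kappa = kappa_estimate(r_bar)
    se = 1.0 / math.sqrt(n * r_bar * kappa)  # circular standard error
    if z * se > 1.0:
        raise ValueError("data too dispersed for this approximation")
    return mean, math.degrees(math.asin(z * se))

# Made-up wind directions straddling north, for illustration only.
mean, half_width = von_mises_ci_deg([350, 355, 0, 5, 10, 15, 20])
print(f"mean direction {mean:.1f} deg, 95% CI +/- {half_width:.1f} deg")
```

<p>The interval is reported as a half-width either side of the mean direction, so an interval that straddles north does not suffer from the crossover problem.</p>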
<p> </p>
<p>Please let me know if you are using circular statistics in your work, or you have any other comments.</p>
<p><a href="http://storage.ning.com/topology/rest/1.0/file/get/2220279516?profile=original" target="_self"><img src="http://storage.ning.com/topology/rest/1.0/file/get/2220279516?profile=original" width="491" class="align-full"/></a></p>
<p> </p>
<p><b>Further reading</b></p>
<p>Bowers JA, Morton I, Mould GI 2000. <u>Directional statistics of the wind and waves</u>. Applied Ocean Research; 22: 13-30.</p>