
Featured Blog Posts (1,509)

Source code for Robust Ridge and Linear Regression with Bootstrap

Works well with non-Gaussian data or outliers. Allows you to set bounds on the regression parameters (similar to ridge regression). Does not use matrix inversion, and is thus numerically stable. Robust parameter estimation based on Monte Carlo simulations and re-sampling. The source code can easily be modified to perform logistic regression. This package can be used by scientists, programmers, analysts or engineers with limited statistical knowledge. Works on Unix, Linux or Windows. We will help… Continue

Added by Vincent Granville on March 1, 2008 at 3:00pm — No Comments
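
The re-sampling idea mentioned in this teaser can be illustrated with a minimal sketch. This is not the package's source code: the function names `ols_line` and `bootstrap_slope` are hypothetical, the example is limited to one predictor, and taking the median of the bootstrap slopes is an assumption made here for illustration. The slope uses the closed-form formula for simple regression, so no matrix inversion is involved.

```python
import random
import statistics


def ols_line(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_slope(xs, ys, n_boot=500, seed=0):
    """Median slope and a 95% percentile interval over bootstrap resamples."""
    ols_line(xs, ys)  # fail early if the full data set is degenerate
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        # Resample (x, y) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            slope, _ = ols_line([xs[i] for i in idx], [ys[i] for i in idx])
        except ValueError:
            continue  # degenerate resample (one repeated x); draw again
        slopes.append(slope)
    slopes.sort()
    lo = slopes[int(0.025 * n_boot)]
    hi = slopes[int(0.975 * n_boot) - 1]
    return statistics.median(slopes), (lo, hi)
```

Calling `bootstrap_slope(xs, ys)` returns the median slope together with a percentile interval, which gives a rough sense of how stable the estimate is under re-sampling.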

What is Six Sigma?

The concepts surrounding the drive to Six Sigma quality are essentially those of statistics and probability. In simple language, these concepts boil down to, “How confident can I be that what I planned to happen actually will happen?” Basically, the concept of Six Sigma deals with measuring and improving how close we come to delivering on what we planned to do.



Anything we do varies, even if only slightly, from the plan. Since no result can exactly match our intention, we usually… Continue

Added by Vincent Granville on March 15, 2008 at 9:09am — No Comments

Salary survey for Data Warehouse and Business Intelligence Professionals

The purpose of this report is to gain a better sense of the people and teams who built and maintained business intelligence (BI) and data warehousing (DW) solutions during the 2007 calendar year. This report uses the term “BI” to refer to both business intelligence and data warehousing initiatives, and the term “BI professionals” to the individuals who deliver these initiatives. Specifically, the report looks at…

Continue

Added by Vincent Granville on March 21, 2008 at 2:00am — No Comments

Double Sampling and Calibrating a visitor usage meter

I've been consulting with a National Park on how to improve estimates of visitor usage. They have door-mounted meters, but these are not considered very accurate. Here is a summary of my sample design:





SUMMARY: National Park Administrators are required to file monthly reports of visitor totals at various park facilities such as visitor centers. Although door-mounted meters can be, and are, used to estimate these totals, there are issues with the quality of meter counts. Meters are… Continue

Added by J. Liddle on March 26, 2008 at 5:11am — No Comments

Nonparametric regression: the LOESS procedure

PROC LOESS implements a nonparametric method for estimating local regression surfaces pioneered by Cleveland (1979); also refer to Cleveland et al. (1988) and Cleveland and Grosse (1991). This method is commonly referred to as loess, which is short for local regression.



PROC LOESS allows greater flexibility than traditional modeling tools because you can use it for situations in which you do not know a suitable parametric form of the regression surface. Furthermore, PROC LOESS is… Continue

Added by Vincent Granville on March 31, 2008 at 11:00pm — No Comments
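
As a rough illustration of what "local regression" means, here is a minimal sketch of a locally weighted linear fit at a single query point. It is not PROC LOESS and not SAS syntax: the function name `loess_point`, the `frac` parameter, and the tricube weighting over the nearest neighbours are assumptions chosen for this example, following the general idea of fitting a weighted line to the points nearest the query.

```python
def loess_point(xs, ys, x0, frac=0.5):
    """Estimate y at x0 from a weighted linear fit to the nearest points."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours in the window
    # Keep the k points closest to x0, as (distance, x, y) triples.
    near = sorted((abs(x - x0), x, y) for x, y in zip(xs, ys))[:k]
    h = near[-1][0] or 1.0  # bandwidth: distance to the farthest neighbour
    # Tricube weights: close to 1 near x0, falling to 0 at distance h.
    w = [(1 - (d / h) ** 3) ** 3 for d, _, _ in near]
    if sum(w) == 0:
        w = [1.0] * len(near)  # all neighbours sit on the window edge
    sw = sum(w)
    mx = sum(wi * x for wi, (_, x, _) in zip(w, near)) / sw
    my = sum(wi * y for wi, (_, _, y) in zip(w, near)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, (_, x, _) in zip(w, near))
    if sxx == 0:
        return my  # no spread in x: fall back to the weighted mean
    sxy = sum(wi * (x - mx) * (y - my) for wi, (_, x, y) in zip(w, near))
    slope = sxy / sxx
    return my + slope * (x0 - mx)
```

Evaluating `loess_point` over a grid of `x0` values traces out a smooth curve without assuming any global parametric form, which is the flexibility the teaser describes.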

Predictive Model Markup Language - PMML

The Predictive Model Markup Language (PMML) is a markup language for statistical and data mining models. Think of it as something like HTML, but for describing statistical tasks such as regression and decision trees.



PMML is an XML-based language which provides a way for applications to define statistical and data mining models and to share models between PMML-compliant applications.



PMML provides applications a vendor-independent method of defining models so that proprietary issues and… Continue

Added by Vincent Granville on April 5, 2008 at 1:30am — No Comments

Data Mining Definition

Having read great authors and researchers, I had always taken the definition of data mining for granted. Following the traditions of Occam's Razor, I tried to summarize one myself, but found it difficult to cover the entire, ever-evolving spectrum of data mining. Anyway, here is my try...



Data Mining is a process of extraction of non-trivial patterns from massive datasets which either provides descriptive insights of the data (not perceived without this… Continue

Added by Atif Abdul-Rahman on February 27, 2008 at 1:14pm — 2 Comments

Post-graduate degree in Business Intelligence

Hi all,



Nice to find out about this forum of quant people. Hope to be of help to anyone of you.



I have just been through a career change and am going to start working within the telecommunications industry. Having a good background in statistics and an interest in data mining, I am now looking for a distance learning degree to become more knowledgeable in Knowledge Management, DM, and Data Warehousing.



So far I have found only an MSc in Business Intelligence at… Continue

Added by Ivan on February 25, 2008 at 2:52am — 1 Comment

2008 Statistical Resolutions

The biggest challenge of 2007 for statisticians was the pressing demand for real-world explanations and applications of statistical concepts. On one hand, statisticians found relief in not having to focus so much of their time on inspiring trust in the number-crunching methods. A techno-driven quasi-dependence on quantitative intel set such concerns on the back-burner. On the other hand, businesses and consumers united in expressing a supposedly simple request -- "Ok, so show me how I can USE… Continue

Added by Vincent Granville on February 17, 2008 at 12:53am — No Comments


© 2020 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
