A Data Science Central Community

PROC LOESS implements a nonparametric method for estimating local regression surfaces pioneered by Cleveland (1979); also refer to Cleveland et al. (1988) and Cleveland and Grosse (1991). This method is commonly referred to as loess, which is short for local regression.

PROC LOESS allows greater flexibility than traditional modeling tools because you can use it for situations in which you do not know a suitable parametric form of the regression surface. Furthermore, PROC LOESS is suitable when there are outliers in the data and a robust fitting method is necessary.

The main features of PROC LOESS are as follows:

fits nonparametric models

supports the use of multidimensional predictors

supports multiple dependent variables

supports both direct and interpolated fitting using kd trees

computes confidence limits for predictions

performs iterative reweighting to provide robust fitting when there are outliers in the data

supports scoring for multiple data sets
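As a rough sketch of how two of these features map onto syntax, the step below requests confidence limits for the predictions and scores a second data set in the same run. It assumes the ExperimentA data set used in Example 1, plus a hypothetical data set NewRuns that holds Temperature and Catalyst values to be scored. The smoothing parameter of 0.5 and the output data set names are arbitrary choices for illustration.

```sas
/* Sketch only: ExperimentA comes from Example 1; NewRuns, FitStats, and
   NewPred are hypothetical names introduced for this illustration. */
ods output OutputStatistics=FitStats   /* fitted values with confidence limits */
           ScoreResults=NewPred;       /* predictions at the NewRuns points    */
proc loess data=ExperimentA;
    model Yield = Temperature Catalyst /
          smooth=0.5        /* arbitrary span chosen for illustration    */
          degree=2          /* local quadratic fits                      */
          clm alpha=0.05;   /* 95% confidence limits for the predictions */
    score data=NewRuns;     /* apply the fitted surface to the new points */
run;
```

The SCORE statement only needs the predictor variables in NewRuns; the response does not have to be present there.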

Local Regression and the Loess Method

Assume that for i = 1 to n, the ith measurement y_i of the response y and the corresponding measurement x_i of the vector x of p predictors are related by

y_i = g(x_i) + e_i

where g is the regression function and e_i is a random error. The idea of local regression is that near x = x_0, the regression function g(x) can be locally approximated by the value of a function in some specified parametric class. Such a local approximation is obtained by fitting a regression surface to the data points within a chosen neighborhood of the point x_0.

In the loess method, weighted least squares is used to fit linear or quadratic functions of the predictors at the centers of neighborhoods. The radius of each neighborhood is chosen so that the neighborhood contains a specified percentage of the data points. The fraction of the data in each local neighborhood, called the smoothing parameter, controls the smoothness of the estimated surface. Data points in a given local neighborhood are weighted by a smooth decreasing function of their distance from the center of the neighborhood.
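For the local linear case, the fit at a point x_0 can be written out roughly as follows. The tricube weight shown here is the usual loess choice from Cleveland's work; the symbols s, q, d_q, W, and the beta coefficients are notation introduced for this sketch and do not appear in the text above.

```latex
\begin{align*}
(\hat\beta_0, \hat\beta_1)
  &= \arg\min_{\beta_0,\,\beta_1} \sum_{i=1}^{n}
     w_i(x_0)\,\bigl(y_i - \beta_0 - \beta_1^{\top}(x_i - x_0)\bigr)^2,
  \qquad \hat g(x_0) = \hat\beta_0, \\
w_i(x_0)
  &= W\!\left(\frac{\lVert x_i - x_0 \rVert}{d_q(x_0)}\right),
  \qquad
  W(u) =
  \begin{cases}
    (1 - u^3)^3, & 0 \le u < 1, \\
    0,           & u \ge 1.
  \end{cases}
\end{align*}
```

Here d_q(x_0) is the distance from x_0 to its qth nearest data point, and q is approximately s times n, where s is the smoothing parameter. A larger s widens every neighborhood, so more points receive nonzero weight and the estimated surface becomes smoother. A local quadratic fit adds second-order terms in (x_i - x_0) to the same criterion.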

Example 1.

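    /* Fit a local quadratic loess surface, with the smoothing parameter
       selected by generalized cross validation, and save the output
       statistics in the data set PredLOESS. */
    ods output OutputStatistics=PredLOESS;
    proc loess data=ExperimentA;
        model Yield = Temperature Catalyst / scale=sd degree=2 select=gcv;
    run;
    ods output close;

    /* Fit an additive model with a separate loess smoother for each
       predictor and save the predictions in the data set PredGAM. */
    proc gam data=ExperimentA;
        model Yield = loess(Temperature) loess(Catalyst) / method=gcv;
        output out=PredGAM;
    run;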

Although LOESS provides a model of the response surface, it does not provide an equation stating the dependence, and it does not provide information about interactions and nonlinearities. If the span for the preferred LOESS fit is small, it is unlikely that a single common function can be found for all the data. If the span is large, then it is quite likely that a common function can be found.
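One way to explore the effect of the span is to fit the same model at several smoothing parameters and compare the results. The following is a minimal sketch that reuses the ExperimentA data set from Example 1; the three span values and the output data set name SpanFits are arbitrary illustrations, not recommendations.

```sas
/* Sketch only: refit the Example 1 model at three arbitrary spans. */
ods output OutputStatistics=SpanFits;   /* fitted values for every span */
proc loess data=ExperimentA;
    model Yield = Temperature Catalyst /
          scale=sd
          degree=2
          smooth=0.2 0.5 0.8     /* small, medium, and large neighborhoods */
          residual;              /* include residuals in the output table  */
run;
```

The output table identifies each fit by its smoothing parameter, so the residuals can be examined span by span to judge whether a small span is really needed.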

If we have more than one or two outliers or points of influence (leverage points), we can't just drop one point, recompute the hat matrix, drop another point, and recompute the hat matrix again. We need a more comprehensive approach such as LOESS, M-estimation (which was introduced by Huber in 1973), S-estimation, LTS-estimation, or MM-estimation. All of these other than LOESS are available in PROC ROBUSTREG.
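As a hedged sketch of both routes, the first step below asks PROC LOESS for iterative reweighting and the second fits an MM-estimate with PROC ROBUSTREG. ExperimentA is the data set from Example 1. The iteration count and the choice of a linear model in the second step are assumptions made for illustration only.

```sas
/* Route 1: robust loess. ITERATIONS= counts the initial fit, so a value
   of 4 means one ordinary fit followed by three reweighting passes. */
proc loess data=ExperimentA;
    model Yield = Temperature Catalyst / scale=sd degree=2 iterations=4;
run;

/* Route 2: parametric robust regression with MM estimation.
   Assumes a model that is linear in both predictors is adequate. */
proc robustreg data=ExperimentA method=mm;
    model Yield = Temperature Catalyst / diagnostics leverage;
run;
```

The DIAGNOSTICS and LEVERAGE options ask PROC ROBUSTREG to report outliers and leverage points, which is the information the drop-one-point approach tries to recover by hand.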

Full article at: www.gotstat.com/post/The-LOESS-procedure.aspx
