# AnalyticBridge

A Data Science Central Community

# Weather normalization

I am working with demand (MW) data and want to remove the weather component from it. Any suggestions for a relatively simple way to do this?

On a side note, I need this so I can figure out when weekend (Saturday, Sunday) demand (MW) will overtake weekday (Monday-Friday) demand (MW).

If anyone has experience with this, I would appreciate the help; it has been challenging to model.


### Replies to This Discussion

Do you need an hourly model? If not, use a simpler daily model:

Step 1: Aggregate hourly MW data --> daily MWh, and fit a simple model of DailyMWh = f(Intercept, Day of Week, Month of Year, CDD, HDD, Trend)
Step 2: Construct normal daily CDD/HDD (see below)
Step 3: Feed the normal CDD/HDD to the model (overwriting the historical daily CDD/HDD) and use the predicted values as your weather-normalized daily energy.
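The three steps can be sketched as below. This is only an illustration under stated assumptions: the reply does not say what form f takes, so the sketch assumes a linear model fitted by ordinary least squares with dummy variables for day of week and month. The function names (`design_matrix`, `fit_daily_model`, `weather_normalize`) are hypothetical, and the data are synthetic and noise-free, generated only so the example runs end to end.

```python
import datetime as dt
import math

import numpy as np


def design_matrix(dow, month, cdd, hdd, trend):
    """Columns: intercept, 6 day-of-week dummies, 11 month dummies, CDD, HDD, trend."""
    dow, month = np.asarray(dow), np.asarray(month)
    cols = [np.ones(len(dow))]
    cols += [(dow == d).astype(float) for d in range(1, 7)]      # Monday (0) is the baseline
    cols += [(month == m).astype(float) for m in range(2, 13)]   # January is the baseline
    cols += [np.asarray(cdd, float), np.asarray(hdd, float), np.asarray(trend, float)]
    return np.column_stack(cols)


def fit_daily_model(daily_mwh, dow, month, cdd, hdd, trend):
    """Step 1: least-squares fit of daily MWh (assumed linear form of f)."""
    X = design_matrix(dow, month, cdd, hdd, trend)
    beta, *_ = np.linalg.lstsq(X, np.asarray(daily_mwh, float), rcond=None)
    return beta


def weather_normalize(beta, dow, month, normal_cdd, normal_hdd, trend):
    """Step 3: predict with normal CDD/HDD in place of the historical values."""
    return design_matrix(dow, month, normal_cdd, normal_hdd, trend) @ beta


# Synthetic, noise-free data for illustration only (two years of days).
n_days = 730
dates = [dt.date(2021, 1, 1) + dt.timedelta(days=i) for i in range(n_days)]
dow = np.array([d.weekday() for d in dates])
month = np.array([d.month for d in dates])
trend = np.arange(n_days, dtype=float)
seasonal = np.array([60 + 25 * math.sin(2 * math.pi * (i - 100) / 365) for i in range(n_days)])
temp = seasonal + np.array([5 * math.sin(1.3 * i) for i in range(n_days)])
hdd, cdd = np.maximum(55 - temp, 0), np.maximum(temp - 65, 0)
# Stand-in for Step 2: degree days of the smooth seasonal curve play the role of normals.
normal_hdd, normal_cdd = np.maximum(55 - seasonal, 0), np.maximum(seasonal - 65, 0)

# Made-up coefficients used only to generate the synthetic load series.
true_beta = np.concatenate(
    ([1000.0], np.linspace(-30, 30, 6), np.linspace(-20, 20, 11), [12.0, 8.0, 0.05])
)
daily_mwh = design_matrix(dow, month, cdd, hdd, trend) @ true_beta

beta = fit_daily_model(daily_mwh, dow, month, cdd, hdd, trend)
normalized_mwh = weather_normalize(beta, dow, month, normal_cdd, normal_hdd, trend)
```

In this linear form, the gap between actual and normalized energy on any day is just the CDD and HDD coefficients times the departure of that day's degree days from normal; the calendar and trend terms cancel.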

To construct normal CDD/HDD:
2. Construct HDD = max(55 - avgtemp, 0) and CDD = max(avgtemp - 65, 0) by day over the historical period
3. Average by day of year to get 365 normalCDD, normalHDD values
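A minimal sketch of the degree-day construction above, using the base temperatures from the reply (55 for HDD, 65 for CDD). The function names and the input layout (a list of date and average-temperature pairs) are assumptions made for illustration. The sketch keys the normals by calendar (month, day) rather than by day-of-year number, so that leap years do not shift the dates; February 29 gets its own entry if it appears in the history.

```python
from collections import defaultdict
from datetime import date

HDD_BASE = 55.0  # base temperatures as given in the reply
CDD_BASE = 65.0


def degree_days(avg_temp):
    """Return (hdd, cdd) for one day's average temperature."""
    return max(HDD_BASE - avg_temp, 0.0), max(avg_temp - CDD_BASE, 0.0)


def normal_degree_days(daily_temps):
    """Average HDD and CDD by calendar day over the historical period.

    daily_temps: iterable of (datetime.date, avg_temp) pairs.
    Returns {(month, day): (normal_hdd, normal_cdd)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running HDD sum, CDD sum, count
    for day, temp in daily_temps:
        hdd, cdd = degree_days(temp)
        acc = sums[(day.month, day.day)]
        acc[0] += hdd
        acc[1] += cdd
        acc[2] += 1
    return {key: (h / n, c / n) for key, (h, c, n) in sums.items()}


# Tiny made-up history: two years of observations for two calendar days.
history = [
    (date(2020, 1, 1), 40.0),
    (date(2021, 1, 1), 50.0),
    (date(2020, 7, 1), 80.0),
    (date(2021, 7, 1), 60.0),
]
normals = normal_degree_days(history)
```

Note that the degree days are averaged, not computed from the average temperature. The two differ whenever some years fall on either side of a base temperature, as in the July 1 entries above.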

I ended up using a daily regression model when I did this a few weeks ago. I will try to redo it your way. I appreciate your time and input.

I have another question for you, if you don't mind. This is an econometrics question.

I am using only the slope from a simple regression model whose residuals are serially correlated. The slope feeds my marginal cost calculations. Since the slope estimate remains unbiased with or without serial correlation, I can use it for those calculations. Once again, I am only using the coefficient (slope) value.
Question: I don't need to adjust my slope for serial correlation, right?

If you are worried about serial correlation affecting your coefficient estimates, first check your residuals for it by plotting their ACF. A slow decay in the ACF of the residual series suggests a serial correlation problem. According to standard theory, this does not bias the coefficient estimates (provided the model has no lagged dependent variable), but the estimates are less efficient and the usual standard errors are no longer reliable. Since most of the time we are not too concerned with inference on the parameters per se, this rarely matters in practice: we usually care most about the point forecast from the model. I would say that if you have a decent MAPE or R-squared, you can safely ignore the serial correlation in the residuals.
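As a rough sketch of the residual check described above, the sample autocorrelation function can be computed directly. The function names are hypothetical, and the ±1.96/√n band is the usual large-sample approximation for white noise, not something stated in the reply.

```python
import math


def acf(residuals, max_lag=10):
    """Sample autocorrelations of a residual series for lags 0..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def white_noise_band(n, z=1.96):
    """Approximate 95% band for the ACF of white noise (large-sample rule of thumb)."""
    return z / math.sqrt(n)


# Made-up residuals that alternate in sign, so adjacent values are negatively correlated.
example_residuals = [1.0, -1.0] * 5
example_acf = acf(example_residuals, max_lag=3)
```

Autocorrelations that stay outside the band for many lags and shrink only slowly are the "slow decay" pattern to look for; a few isolated spikes are less of a concern.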

If you still want to correct for serial correlation, the easiest fix is to include new terms in the model that help explain the structure in the residuals. First I would try intercept shifts and trend adjustments. As a last resort I would include an AR(1) term in your model. However, my experience has been that including time series errors in a load forecasting model tends to dampen the impact of the weather and seasonal variables, since we are essentially moving most of the explanatory power out of these variables and into the error term, and then assigning this error term a time series structure. If the AR term ends up swamping all the other terms in your model, you are essentially giving up on a structural explanation for variation in load. The model might fit very nicely, but it won't forecast well.

Another paper you might find useful can be found here:
http://www.itron.com/pages/resources_white_paper.asp?id=itr_016784.xml

Hi Scott,

Once again, thanks for your insight into this.
The simple regression model I am using has serial correlation, and I am only using the slope. I was not sure whether my slope (coefficient) would be affected by the serial correlation, but after your input and some research (I found that it remains unbiased), I can confidently use my slope for marginal cost calculations.

Thanks,
Frank

Hello Scott,

I've been doing weather normalization of gas consumption for about two years now, and I wonder whether there is a way to provide a confidence interval for weather-corrected consumption.

This is my concern.

Take a simple equation.

Estimated consumption:

Chat = a + b · HDD

Reference consumption:

Cref = a + b · HDDref

Climate correction (CC):

CC = Cref - Chat

Corrected consumption:

Ccorr = Cobserved + CC = Cobserved + Cref - Chat = Cref + error

where error = Cobserved - Chat.

How can I provide a confidence interval for a quantity that contains an error term? I've looked through some studies, but I have not found confidence intervals for weather correction. Have you done this before?

Thank you.

Sincerely,

Irina
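
One possible way to look at the question above, offered as a sketch and not as an established method from this thread: substituting the definitions gives Ccorr = Cobserved + b · (HDDref - HDD), because the intercept a cancels. If the observed consumption is treated as fixed, the only estimated quantity left is the slope b, so an approximate interval can be built from its standard error. The code below assumes a simple OLS fit with independent errors and uses the normal quantile 1.96; the function names and the small data set are made up for illustration.

```python
import math


def fit_line(hdd, consumption):
    """Simple OLS of consumption on HDD. Returns (a, b, se_b)."""
    n = len(hdd)
    mean_x = sum(hdd) / n
    mean_y = sum(consumption) / n
    sxx = sum((x - mean_x) ** 2 for x in hdd)
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(hdd, consumption)) / sxx
    a = mean_y - b * mean_x
    sse = sum((y - a - b * x) ** 2 for x, y in zip(hdd, consumption))
    se_b = math.sqrt(sse / (n - 2) / sxx)  # usual OLS standard error of the slope
    return a, b, se_b


def corrected_consumption(c_obs, hdd_obs, hdd_ref, b, se_b, z=1.96):
    """Ccorr = Cobserved + b * (HDDref - HDD), with an approximate interval.

    Assumption: Cobserved is treated as known, so only the slope is uncertain.
    """
    shift = hdd_ref - hdd_obs
    c_corr = c_obs + b * shift
    half_width = z * abs(shift) * se_b
    return c_corr, c_corr - half_width, c_corr + half_width


# Made-up observations, for illustration only.
hdd_history = [0.0, 10.0, 20.0, 30.0]
consumption_history = [100.0, 121.0, 139.0, 160.0]
a, b, se_b = fit_line(hdd_history, consumption_history)
c_corr, lower, upper = corrected_consumption(121.0, 10.0, 15.0, b, se_b)
```

Under these assumptions the interval widens with the distance between actual and reference HDD and collapses to a point when the two are equal. With few observations a t quantile would be more appropriate than 1.96, and with serially correlated residuals the slope's standard error would need adjusting before the interval could be trusted.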