# AnalyticBridge

A Data Science Central Community

Some of my old articles, published in 2001 but still worth reading. Lots of good data science advice in them.

Pitfalls in Optimizing Statistical Trading Strategies. Part I: Over-Parametrization.

One of the common mistakes in optimizing statistical trading strategies consists of over-parametrizing the problem and then computing a global optimum. It is well known that this technique provides extremely high returns on historical data but does not work in practice. We shall investigate this problem and see how it can be side-stepped. We will explain how to build a very efficient 6-parameter strategy.

This issue is actually relevant to many real-life statistical and mathematical situations. The problem itself can be referred to as over-parametrization or over-fitting. The explanation as to why this approach fails can be illustrated by a simple example. Let's imagine you fit data with a 30-parameter model. If you have 30 data points (that is, the number of parameters is equal to the number of observations), then you can have a perfect, fully optimized fit with your data set. However, any future data point (e.g. tomorrow's stock prices) might have a very bad fit with the model, resulting in huge losses. Why? We have the same number of parameters as data points. Thus, on average each estimated parameter of the model is worth no more than one data point.
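The effect can be reproduced on a small scale. The sketch below is only an illustration, not the model discussed in this article: it uses 10 made-up data points instead of 30, and takes a polynomial with one coefficient per data point as a stand-in for an over-parametrized model. The names `lagrange_predict` and `line_predict` and the data are assumptions chosen for the example.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate at x the unique polynomial through all points (one parameter per point)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def line_predict(xs, ys, x):
    """Evaluate at x the least-squares straight line (two parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / sum(
        (a - mean_x) ** 2 for a in xs
    )
    return mean_y + slope * (x - mean_x)


# Made-up data: a steady upward trend (y = x) plus small fixed "noise".
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.2]
xs = [float(k) for k in range(10)]
ys = [x + e for x, e in zip(xs, noise)]

# In-sample: the 10-parameter model reproduces every observation.
in_sample_error = max(abs(lagrange_predict(xs, ys, x) - y) for x, y in zip(xs, ys))

# Out-of-sample: predict the next point, whose noise-free value is 10.
overfit_error = abs(lagrange_predict(xs, ys, 10.0) - 10.0)
simple_error = abs(line_predict(xs, ys, 10.0) - 10.0)

print(f"in-sample error of 10-parameter fit: {in_sample_error:.2e}")
print(f"next-point error, 10-parameter fit:  {overfit_error:.1f}")
print(f"next-point error, 2-parameter line:  {simple_error:.3f}")
```

On this toy data the 10-parameter fit matches the history perfectly, yet its prediction for the very next point is off by a wide margin, while the 2-parameter line stays close to the trend.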

From a statistical viewpoint, you are in the same situation as if you were estimating the median US salary by interviewing only one person. Chances are your estimate will be very poor, even though the fit with your one-person sample is perfect. In fact, you run a 50% chance that the salary of the interviewee will be either very low or very high.

Roughly speaking, this is what happens when over-parametrizing a model. You obviously gain by reducing the number of parameters. However, if handled correctly, the drawback can be turned into an advantage. You can actually build a model with many parameters that is more robust and more efficient (in terms of return rate) than a simplistic model with fewer parameters. How is this possible? The answer lies in the way you test the strategy. When you use a model with more than three parameters, the strategy that provides the highest return on historical data will not be the best. You need to use more sophisticated optimization criteria.

One solution is to add boundaries to the problem, thus performing constrained optimization. Look for strategies that meet one fundamental constraint: reliability. That is, you want to eliminate all strategies that are too sensitive to small variations. Thus, you focus on that tiny part of the parameter space that shows robustness against all kinds of noise. Noise, in this case, can be trading errors, spread, small variations in the historical stock prices or in the parameter set.

From a practical viewpoint, the solution consists in testing millions of strategies to find those that work well under many different market conditions. Usually, it requires several months' worth of data to have various market patterns and some statistical significance. Then for each of these strategies, you must introduce noise in millions of different ways and look at the impact. You then discard all strategies that can be badly impacted by noise and retain the tiny fraction that are robust.
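A minimal sketch of this filtering step follows, under strong simplifying assumptions: noise is applied only to the parameters (not to prices or spreads), the trial count is in the hundreds rather than the millions, and `toy_return` is an invented return surface with one sharp peak and one broad hill. The function names and the 5% noise scale are hypothetical, chosen only for illustration.

```python
import math
import random


def worst_case_return(evaluate, params, noise_scale, n_trials, rng):
    """Lowest return seen when each parameter is randomly perturbed by up to noise_scale."""
    worst = evaluate(params)
    for _ in range(n_trials):
        noisy = tuple(p * (1.0 + rng.uniform(-noise_scale, noise_scale)) for p in params)
        worst = min(worst, evaluate(noisy))
    return worst


def select_robust(evaluate, candidates, min_return, noise_scale=0.05, n_trials=200, seed=0):
    """Keep only the candidates whose return stays above min_return under noise."""
    rng = random.Random(seed)
    return [
        c for c in candidates
        if worst_case_return(evaluate, c, noise_scale, n_trials, rng) >= min_return
    ]


def toy_return(params):
    """Invented return surface: a very sharp peak at (1, 1) and a broad hill at (3, 3)."""
    a, b = params
    sharp = 10.0 * math.exp(-((a - 1.0) ** 2 + (b - 1.0) ** 2) / 1e-6)
    broad = 2.0 * math.exp(-((a - 3.0) ** 2 + (b - 3.0) ** 2) / 4.0)
    return sharp + broad


candidates = [(1.0, 1.0), (3.0, 3.0)]
print("best on history:", max(candidates, key=toy_return))
print("robust survivors:", select_robust(toy_return, candidates, min_return=1.0))
```

The global optimum at (1, 1) has the highest historical return but collapses under the slightest perturbation, so it is discarded. The lower but stable candidate at (3, 3) survives.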

The computational problem is complex, since it is in fact equivalent to testing millions of millions of strategies. But it is worth the effort. The end result is a reliable strategy that can be adjusted over time by slightly varying the parameters. Data Shaping's strategies are actually designed this way. They are associated with 6 parameters:

• Four parameters are used to track how the stock is moving (up, neutral, or down)
• One parameter is used to set the buy price
• One parameter is used to set the sell price

The details will appear in the Professional issue of the newsletter.

It would have been possible to reduce the dimensionality of the problem by imposing symmetry in the parameters (e.g. parameters being identical for buy and sell price). Instead, our approach combines the advantage of low dimensionality (reliability) with returns appreciably higher than you would normally expect when being conservative.

A final note of advice. When you backtest a trading system, optimize the strategy using historical data that are more than one month old. Then check if the real-life return obtained during the last month (outside the historical data time window) is satisfactory. If your system passes this test, then optimize the strategy using the most recent data, and use it. Otherwise, do not use your trading system in real life. More on backtesting in the next issue.

Part II: Improving long-term return on short-term strategies.

In this article, we describe how to backtest a short-term strategy to assess its long-term return distribution. We will focus on strategies that require frequent updates. They are also called adaptive strategies. We examine an undesirable long-term characteristic shared by many of these systems: long-term oscillations with zero return on average. We propose a solution that takes advantage of the periodic nature of the return function, to design a truly profitable system.

When a strategy relies on parameters requiring frequent updates, one has to design appropriate backtesting tools. From Part I, we know that we should limit the number of parameters to six. We have also learned how to improve backtesting techniques, using robust statistical methods and constrained optimization. For the sake of simplicity, we assume that the system to be tested provides daily signals and needs monthly updates. The correct way to test such an adaptive system is to backtest it one month at a time, on historical data, as follows.

Algorithm

For each month in the test period, do:

• Step 1: Backtesting

Collect the last six months' worth of historical data prior to the month of interest. Backtest the system on these six months to estimate the parameters of the model.

• Step 2: Walk forward

Apply the trading system with the parameters obtained in step 1 to the month of interest. Compute the daily gains.
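The two steps can be sketched as a short loop. This is an illustration only: `fit` and `trade` are placeholders for whatever parameter estimation and trading rule the tested system uses, and the toy data and toy rule below are invented for the example.

```python
def walk_forward(months, fit, trade, train_months=6):
    """Return the out-of-sample result for every month that has enough history.

    months : list of per-month data sets, oldest first
    fit    : function(list of months) -> parameters (step 1)
    trade  : function(parameters, month) -> return for that month (step 2)
    """
    results = []
    for i in range(train_months, len(months)):
        history = months[i - train_months:i]      # the six months before month i
        params = fit(history)                     # step 1: backtesting
        results.append(trade(params, months[i]))  # step 2: walk forward
    return results


# Toy example: each month is a list of 20 made-up daily gains.
months = [[0.001 * ((m % 5) - 2)] * 20 for m in range(24)]


def fit(history):
    """Toy rule: go long (+1) if the training window gained overall, else short (-1)."""
    total = sum(sum(month) for month in history)
    return 1 if total > 0 else -1


def trade(direction, month):
    """Toy rule: monthly result is the sum of daily gains in the chosen direction."""
    return direction * sum(month)


out_of_sample = walk_forward(months, fit, trade)
print(len(out_of_sample), "out-of-sample monthly returns")
```

With 24 months of data and a six-month training window, the loop yields 18 monthly returns, each computed on data the parameters never saw.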

The whole test period should be at least 18 months long. Thus we need to gather and process 24 months' worth of historical data (18 months, plus 6 extra months for backtesting). Monthly returns obtained sequentially OUT OF SAMPLE (one month at a time) in step 2 should be recorded for further investigation. You are likely to observe the following patterns:

• many months perform very well
• many months perform very badly
• on average the return is zero
• good months are often followed by good months

We now have all the ingredients to build a reliable long-term system, which we will call a metastrategy, since it is built on top of the original system. It works as follows:

Metastrategy

If last month's return (as obtained in step 2) is positive, use the strategy this month; otherwise use the reverse strategy, obtained by swapping buy and sell prices.
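A minimal sketch of this rule, applied to a series of out-of-sample monthly returns. It assumes, as a simplification, that the reverse strategy earns exactly the opposite of the original strategy's return (trading costs and spread are ignored). The function name and the sample figures are invented for illustration.

```python
def metastrategy_returns(monthly_returns):
    """Apply the rule month by month, starting with the second month.

    If the previous month's return was positive, keep the strategy;
    otherwise use the reverse strategy (assumed to earn the opposite return).
    """
    results = []
    for previous, current in zip(monthly_returns, monthly_returns[1:]):
        if previous > 0:
            results.append(current)
        else:
            results.append(-current)
    return results


# Made-up oscillating series: runs of good months, runs of bad months, zero on average.
original = [4.0, 3.0, 5.0, -4.0, -3.0, -5.0, 4.0, 3.0, 5.0, -4.0, -3.0, -5.0]
meta = metastrategy_returns(original)

print("original total:", sum(original))
print("metastrategy total:", sum(meta))
```

On this invented series the metastrategy loses only in the month where a run changes direction, which is how it exploits the tendency of good months to follow good months.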

This feature will be introduced in some of our selected trading keys, resulting in a lower but more consistent long-term return. A special symbol will be used to identify these high-tech keys.

A Simple, Efficient and Convenient Universal System

We propose an original system that provides reliable daily index and stock trending signals. The non-parametric statistical techniques described in this article have several advantages: simplicity, efficiency, convenience and universality.

• Simplicity:
No advanced mathematics is involved, only basic algebra. The algorithms do not require sophisticated programming techniques. They rely on data that is easy to obtain.
• Efficiency:
Daily predictions were correct 60% of the time in our tests. This good performance can be improved using techniques described in this article.
• Convenience:
The non-parametric system does not require parameter estimation. It automatically adapts to new market conditions. Additionally, the algorithms are very light in terms of computation, providing forecasts in a snap even on very slow machines.
• Universality:
The system works with any stock or index with a large enough volume, at any given time, in the absence of major events impacting the price. The same algorithm applies to all stocks and indices.

Algorithm

The algorithm computes the probability, for a particular stock or index, that tomorrow's close will be higher than tomorrow's open by at least a specified percentage. The algorithm can easily be adapted to compare today's close with tomorrow's close instead. The estimated probabilities are based on at most the last 100 days of historical data for the stock (or index) in question.

The first step consists of selecting a few price cross-ratios that have an average value of 1. The variables in the ratios can be selected so as to optimize the forecasts. In one of our applications, we have chosen the following three cross-ratios:

1. Ratio A = ( today's high / today's low ) / ( yesterday's high / yesterday's low )
2. Ratio B = ( today's close / today's open ) / ( yesterday's close / yesterday's open )
3. Ratio C = today's volume / yesterday's volume

Then each day in the historical data set is assigned to one of 8 possible price configurations. The configurations are defined as follows:

1. Ratio A > 1, Ratio B > 1, Ratio C > 1
2. Ratio A > 1, Ratio B > 1, Ratio C <= 1
3. Ratio A > 1, Ratio B <= 1, Ratio C > 1
4. Ratio A > 1, Ratio B <= 1, Ratio C <= 1
5. Ratio A <= 1, Ratio B > 1, Ratio C > 1
6. Ratio A <= 1, Ratio B > 1, Ratio C <= 1
7. Ratio A <= 1, Ratio B <= 1, Ratio C > 1
8. Ratio A <= 1, Ratio B <= 1, Ratio C <= 1
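The three cross-ratios and the eight configurations can be computed as follows. This is a sketch: the `Bar` record and the function names are assumptions made for the example, and the prices are invented.

```python
from collections import namedtuple

# One day of price data for a stock or index.
Bar = namedtuple("Bar", "open high low close volume")


def cross_ratios(today, yesterday):
    """Return (Ratio A, Ratio B, Ratio C) as defined above."""
    ratio_a = (today.high / today.low) / (yesterday.high / yesterday.low)
    ratio_b = (today.close / today.open) / (yesterday.close / yesterday.open)
    ratio_c = today.volume / yesterday.volume
    return ratio_a, ratio_b, ratio_c


def configuration(today, yesterday):
    """Return the configuration number (1 to 8) in the order listed above."""
    a, b, c = cross_ratios(today, yesterday)
    return 1 + 4 * (a <= 1) + 2 * (b <= 1) + (c <= 1)


yesterday = Bar(open=10.0, high=10.5, low=10.0, close=10.2, volume=1000)
today = Bar(open=10.0, high=11.0, low=10.0, close=10.8, volume=1200)
print(configuration(today, yesterday))
```

In this example all three ratios exceed 1, so the day falls into configuration 1.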

Now, to compute the probability that tomorrow's close will be at least 1.25% higher than tomorrow's open, we first compute today's price configuration. Then we check all past days in our historical data set that have that configuration. We count these days. Let N be the number of such days. Then, let M be the number of such days further satisfying the following:

The next day's close is at least 1.25% higher than the next day's open.

The probability that we want to compute is simply M/N. This is the probability, based on past data, that tomorrow's close will be at least 1.25% higher than tomorrow's open. Of course, the 1.25 figure can be replaced by any arbitrary percentage.
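The M/N computation can be sketched as below. The function assumes the daily configurations (numbers 1 to 8) have already been computed, with the last entry being today. Its name, its arguments and the sample data are assumptions made for the example.

```python
def up_probability(configs, opens, closes, threshold=0.0125, lookback=100):
    """Estimate M/N for today's configuration.

    configs, opens, closes : one entry per day, oldest first; the last entry is today
    threshold              : required gain from open to close (0.0125 means 1.25%)
    lookback               : use at most this many past days
    Returns None when no past day shares today's configuration (N = 0).
    """
    today = len(configs) - 1
    first = max(0, today - lookback)
    n = 0
    m = 0
    for t in range(first, today):  # past days only, so day t + 1 is always known
        if configs[t] == configs[today]:
            n += 1
            if closes[t + 1] >= opens[t + 1] * (1.0 + threshold):
                m += 1
    return m / n if n else None


# Made-up history of six days; today's configuration is 1.
configs = [1, 2, 1, 1, 2, 1]
opens = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
closes = [10.0, 10.2, 10.0, 10.3, 10.0, 10.1]
print(up_probability(configs, opens, closes))
```

Here three past days share today's configuration (N = 3), and two of them were followed by a day closing at least 1.25% above its open (M = 2), giving an estimate of 2/3.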

Performance

There are different ways of assessing the performance of our stock trend predictor. We have investigated two approaches:

1. computing the proportion of successful daily predictions, using a threshold of 0% instead of 1.25%, over a period of at least 200 trading days
2. using the predicted trends (with threshold set to 0% as above) in a strategy: buy at open, sell at close or the other way around based on the prediction
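Both measures can be computed in one pass, as in this sketch. It ignores trading costs, and it assumes that "the other way around" means selling short at the open and buying back at the close. The function name and the sample figures are invented for illustration.

```python
def evaluate_predictions(predicted_up, opens, closes):
    """Return (success rate, non-compounded total return before costs).

    predicted_up : one boolean per day, True if the close was predicted
                   to be at or above the open (threshold of 0%)
    """
    hits = 0
    total_return = 0.0
    for up, day_open, day_close in zip(predicted_up, opens, closes):
        day_return = (day_close - day_open) / day_open
        if up == (day_close >= day_open):
            hits += 1
        # Long from open to close on an "up" forecast, short otherwise.
        total_return += day_return if up else -day_return
    return hits / len(predicted_up), total_return


success_rate, total_return = evaluate_predictions(
    predicted_up=[True, False, True],
    opens=[10.0, 10.0, 10.0],
    closes=[10.1, 9.9, 9.8],
)
print(f"success rate: {success_rate:.0%}, total return: {total_return:+.2%}")
```

In this invented example two of three forecasts are correct, yet the single miss cancels both gains. The success rate and the return therefore need to be assessed separately.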

Our tests showed a success rate between 54% and 65% in predicting the Nasdaq trend. The strategy associated with the forecaster has been analysed on our web site.

Even with a 56% success rate in predicting the trend, the long-term (non-compounded) yearly return before costs is above 40% in many instances. Note that we provide similar strategies that do not rely on the open price to interested clients. As with many trading strategies, the system sometimes exhibits oscillations in performance. It is possible to substantially attenuate these oscillations, using a technique described above.

In its simplest form, the technique consists of using the same system tomorrow if it worked today. If the system fails to correctly predict today's trend, then use the reverse system for tomorrow.

Universal Forecaster

Universal Trend Forecaster is the full name of our implementation of this system.

You can check out the real past performance (last 365 days) online, for any stock or index, by entering the stock symbol in the trading box and clicking on the submit button. Additionally, we provide an Excel template, Trender.xls, containing all the formulas to perform the required computations. You can assess the impact of trading fees on return by downloading this spreadsheet.
