Volatility Autocorrelation in Different Markets

For an introduction to autocorrelation see this previous post: First Order Autocorrelation as a Moderator of Daily MR.

In the previous post, I looked at how autocorrelation moderated MR performance; in this post, I look at the autocorrelation of another moderator of daily follow-through: volatility. Intuitively, it makes sense for volatility to be highly autocorrelated; VIX spikes are a good example. Let’s look at how the autocorrelation of volatility behaves during bull and bear markets, as defined by the 50/200-day moving average crossover.

For this test, I looked at the first- through fifth-order autocorrelation of the 21-day rolling standard deviation of SPY returns, dividing the values into percentiles (using the percent rank function with a 252-day lookback period), and looked at the average and standard deviation of the percentiles.
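For readers who want to replicate the general setup, here is a rough sketch in Python with pandas. The data loading, column names, the percent-rank implementation, and the exact ordering of the percentile and autocorrelation steps are my assumptions, not the original code.

```python
import numpy as np
import pandas as pd

def pct_rank(series, lookback=252):
    """Fraction of the trailing `lookback` window at or below the current value."""
    return series.rolling(lookback).apply(
        lambda w: (w[:-1] <= w[-1]).mean(), raw=True)

# Hypothetical input: daily SPY data with a 'Close' column
spy = pd.read_csv("SPY.csv", index_col=0, parse_dates=True)
returns = np.log(spy["Close"]).diff()

vol = returns.rolling(21).std()           # 21-day rolling standard deviation
vol_pct = pct_rank(vol, 252).dropna()     # 252-day percent rank of volatility

# Bull/bear regimes from the 50/200-day moving average crossover
bull = (spy["Close"].rolling(50).mean() >
        spy["Close"].rolling(200).mean()).reindex(vol_pct.index)

# Mean/std of the volatility percentile and its lag 1-5 autocorrelations,
# per regime (gaps between regime switches are ignored in this rough cut)
for name, mask in [("bull", bull), ("bear", ~bull)]:
    sub = vol_pct[mask]
    acfs = [round(sub.autocorr(lag=k), 3) for k in range(1, 6)]
    print(name, "mean=%.2f sd=%.2f" % (sub.mean(), sub.std()), "acf 1-5:", acfs)
```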

As I suspected before beginning the test, the volatility of returns is highly autocorrelated. However, I expected a bigger difference between bull and bear markets. One thing to notice in the table below is that the autocorrelation is, on average, greater when the 50-day MA is above the 200-day MA, for all orders. The same observation holds for the standard deviation.

Trading-wise, this information has no direct value by itself; however, it puts numbers on a well-known market mechanic. We know that volatility is easier to predict than returns, and its high autocorrelation is largely responsible for that fact. We also know that daily mean-reversion strategies perform better during high-volatility regimes. Thus we can use volatility autocorrelation as an indicator of the suitability of our strategy: during a high-volatility regime with high autocorrelation, for example, we can expect a daily MR strategy to perform well, and vice versa. For a very interesting read about the predictability of volatility and its application to trading, I recommend Tony Cooper’s paper, winner of the NAAIM 2010 Wagner Award: http://naaim.org/files/2010/1st_Place_Tony_Cooper_abstract.pdf

QF

Drivers of MR Performance

A lot of the daily mean-reversion strategies discussed in the blogosphere are designed for equity index ETFs or equity index mutual funds. Since an index is a group of individual stocks, if we could determine which stocks drive the index’s returns, or in this case its behavior, we could adapt a strategy to profit from the prevailing market paradigm as indicated by those drivers.

For example, if the stocks affecting the index the most exhibit mostly mean-reverting behavior, we can expect the index to exhibit the same behavior. The same conclusion holds for trending behavior. The important point here is that indices are not entities in themselves; they are built to reflect a certain sector, type of stock, region, etc. To illustrate the concept, I performed a simple test using the popular unbounded DV2.

I used the Nasdaq 100 and its related ETF, QQQQ, to perform this test. First I needed a way to quantitatively identify the drivers of the market. For this purpose, I used a weighted r-squared. Many of you may know the r-squared under its other name, the coefficient of determination. The statistic indicates the degree to which one series is predicted by another (i.e. the goodness of fit); to obtain it, you simply square the correlation coefficient. For this test I computed the r-squared over the previous 21 days on a rolling basis. I then looked at the volume of each individual component of the index relative to the total volume of all the components, and weighted each stock’s r-squared by its share of that volume.

In a more rigorous form:
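Roughly (restating the description above in symbols; the exact volume aggregation window is my assumption):

$$ wR^2_{i,n} = R^2_{i,n} \times \frac{V_{i,n}}{\sum_{j=1}^{N} V_{j,n}} $$

where $R^2_{i,n}$ is the 21-day rolling r-squared between stock $i$ and the index at period $n$, $V_{i,n}$ is that stock’s volume, and $N$ is the number of index components.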

After computing this for every stock at every period, I take the top 10 stocks by weighted r-squared, compute their DV2 values, average them, and trade QQQQ using the signal given by that average DV2 (long/short from 0). The results below compare this strategy, buy & hold QQQQ, and DV2 on the QQQQ data itself.
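A rough Python sketch of the whole pipeline is below. Note that the DV2 definition used here is the commonly cited one (close relative to the day’s high/low midpoint, averaged over two days); it may not match the exact unbounded DV2 used in the original test, and the data loading and the long-below-zero/short-above-zero reading of the signal are assumptions.

```python
import numpy as np
import pandas as pd

def unbounded_dv2(ohlc):
    """Commonly cited unbounded DV2: 2-day average of close vs. (high+low)/2."""
    midpoint = (ohlc["High"] + ohlc["Low"]) / 2.0
    return (ohlc["Close"] / midpoint - 1.0).rolling(2).mean()

# Hypothetical inputs: one OHLCV DataFrame per NDX component, plus QQQQ itself
tickers = ["AAPL", "MSFT", "INTC"]                 # ... all 100 names
components = {t: pd.read_csv(t + ".csv", index_col=0, parse_dates=True)
              for t in tickers}
qqqq = pd.read_csv("QQQQ.csv", index_col=0, parse_dates=True)

idx_ret = qqqq["Close"].pct_change()
rets = pd.DataFrame({t: d["Close"].pct_change() for t, d in components.items()})
vols = pd.DataFrame({t: d["Volume"] for t, d in components.items()})

# 21-day rolling r-squared of each stock vs. the index, weighted by volume share
r2 = rets.rolling(21).corr(idx_ret) ** 2
w_r2 = r2 * vols.div(vols.sum(axis=1), axis=0)

dv2 = pd.DataFrame({t: unbounded_dv2(d) for t, d in components.items()})

# Each day: average DV2 of the top-10 weighted-r-squared stocks -> QQQQ signal
signal = pd.Series(index=qqqq.index, dtype=float)
for day in w_r2.dropna(how="all").index:
    top = w_r2.loc[day].nlargest(10).index
    signal[day] = dv2.loc[day, top].mean()

# Mean-reversion reading: long QQQQ when the averaged DV2 < 0, short when > 0,
# applied to the next day's return
position = np.sign(-signal)
strat_ret = position.shift(1) * idx_ret
```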

As you can see, the results are quite similar and, as expected, highly correlated. It shows that in an index composed of 100 stocks, looking at only a select group with a big influence on the returns can help determine the MR performance of the index itself.

QF

Introduction of SVM Learning in Strategies

This post is derived from a comment I received on my last post, on using the probability density function for an adaptive RSI strategy. pinner made the following observation:

“Alternatively you could regress the returns against the set of 6mo & 1yr RSI points as a means to determine the best decision. While this approach probably requires more historical data, it also affords more detailed metrics from which to make each decision.”

I am personally not a big fan of simple regression models for my trading decisions. However, I think there is something to the concept. Such a setup is also a good place to use a classic machine learning technique: the support vector machine (SVM).

From Wikipedia: given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

In concrete terms, the training algorithm separates the data into up days and down days and then looks at the predictor values. It then creates a set of rules dividing the data; these rules minimize the classification error while also maximizing the margin of safety, giving the model more room and (hopefully) resulting in greater accuracy. Based on this set of rules, the algorithm classifies new data into one category or the other. Note that this is not a numeric prediction (i.e. next-day return should be x%); it is a binary class prediction, which also allows us to derive a probability along with the prediction. That comes in handy when we want a confidence-based system.
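As an illustration only (scikit-learn in Python; the predictors follow pinner’s 6-month and 1-year RSI idea, but the data handling, RSI formula, and train/test split are all mine), an up/down-day classifier that also outputs class probabilities might look like this:

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def rsi(close, period):
    """Simple SMA-based RSI; good enough for an illustration."""
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(period).mean()
    loss = (-delta.clip(upper=0)).rolling(period).mean()
    return 100 - 100 / (1 + gain / loss)

# Hypothetical input: daily closes in a 'Close' column
px = pd.read_csv("SPY.csv", index_col=0, parse_dates=True)["Close"]
X = pd.DataFrame({"rsi_6m": rsi(px, 126), "rsi_1y": rsi(px, 252)})
y = np.where(px.pct_change().shift(-1) > 0, 1, -1)   # next day up (+1) / down (-1)

data = X.assign(target=y).dropna()
train, test = data.iloc[:-500], data.iloc[-500:]      # naive chronological split

model = make_pipeline(StandardScaler(),
                      SVC(kernel="rbf", C=1.0, gamma="scale", probability=True))
model.fit(train[["rsi_6m", "rsi_1y"]], train["target"])

# Column order follows model.classes_, so column 1 is the probability of class +1
proba_up = model.predict_proba(test[["rsi_6m", "rsi_1y"]])[:, 1]
# A confidence-based rule could then trade only when, say, P(up) > 0.55
```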

This is a nice technique, but it has its drawbacks. First of all, it is parameter dependent: the effectiveness of the SVM is mostly determined by two parameters (with the common RBF kernel, the cost C and the kernel width gamma). It is usually recommended to test different values for the pair and to retain the pair that performs best in cross-validation. This can become computationally expensive. Without getting too technical (or into details), if we want a non-linear classification algorithm, we also have to choose the type of kernel function we want; there are several.
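For the parameter pair mentioned above, a cross-validated grid search is the usual way to pick values. A minimal, self-contained sketch on toy data (the parameter ranges are arbitrary):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy stand-in data: two predictors and a noisy up/down label
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = np.where(X[:, 0] + 0.5 * rng.normal(size=1000) > 0, 1, -1)

pipe = Pipeline([("scale", StandardScaler()),
                 ("svm", SVC(kernel="rbf"))])

# Search over (C, gamma) pairs; time-ordered folds avoid look-ahead on real data
grid = GridSearchCV(pipe,
                    param_grid={"svm__C": [0.1, 1, 10, 100],
                                "svm__gamma": [1e-3, 1e-2, 1e-1, 1]},
                    cv=TimeSeriesSplit(n_splits=5),
                    scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```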

I just wanted to throw the idea out there, since pinner’s suggestion was a good one. If readers are interested in trying it out, I encourage you to send me the results; I would be happy to post them. Alternatively, I might post some results myself depending on the interest. For now, I just wanted to introduce the support vector machine and its potential application when developing strategies.

QF

Using Probability Density as an Adaptive Mechanism

I will take a pause from the Time Machine series for now while I work on it some more and prepare future posts. Today I will follow up on my post on return distributions and show a simplistic way to include them in an adaptive strategy.

It has been discussed quite a lot in the blogosphere that strategies with fixed parameters are inferior to adaptive strategies. For example, the simplest daily MR strategy I can think of is probably RSI 2 50/50, but this strategy has not always worked and I certainly don’t expect it to keep working forever. Furthermore, the most profitable lookback parameter for RSI also varies over time. This is where the return distribution is useful: from it, we can derive the probability density function and use that to create an adaptive mechanism.

Just a little background on the probability density function; from wiki: “the density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point in the observation space.” In plain language: the probability of a certain event happening. I recommend using your favorite statistical software to estimate it, unless you want to be doing integrals for a long time!

For this test, I took SPY data, computed RSI values for different lookback periods (2 to 30), and then looked at the returns of each resulting strategy. Over rolling periods of 1 year and 6 months, I estimated the probability density of returns for every strategy, looking specifically at the probability of returns greater than zero (this threshold can be raised; I just want to keep it basic here). I then traded the strategy with the highest combination of the 1-year and 6-month values. That way, the capital is allocated to the parameter generating the highest probability of positive returns as measured by the probability density function. I compared this strategy, RSI 2 50/50, and buy and hold.
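A rough sketch of the selection step is below, using scipy’s gaussian_kde in Python. The RSI formula, the 50/50 trading rule, the data handling, and the way the 1-year and 6-month probabilities are combined (a simple average here) are my assumptions rather than the exact original setup.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

def rsi(close, period):
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(period).mean()
    loss = (-delta.clip(upper=0)).rolling(period).mean()
    return 100 - 100 / (1 + gain / loss)

def prob_positive(returns):
    """P(return > 0) from a Gaussian kernel density estimate of the sample."""
    r = returns.dropna().values
    if len(r) < 30 or r.std() == 0:
        return np.nan
    return gaussian_kde(r).integrate_box_1d(0.0, np.inf)

px = pd.read_csv("SPY.csv", index_col=0, parse_dates=True)["Close"]
ret = px.pct_change()

# Daily returns of each RSI(n) 50/50 strategy: long below 50, short above
strat_rets = pd.DataFrame({
    n: np.sign(50 - rsi(px, n)).shift(1) * ret for n in range(2, 31)})

# Each day, score every lookback by its KDE-estimated P(positive return)
# over the past year and past six months (slow but hopefully clear)
scores = {}
for n in strat_rets.columns:
    p_1y = strat_rets[n].rolling(252).apply(prob_positive, raw=False)
    p_6m = strat_rets[n].rolling(126).apply(prob_positive, raw=False)
    scores[n] = (p_1y + p_6m) / 2
scores = pd.DataFrame(scores).dropna(how="all")

best_n = scores.idxmax(axis=1)                     # chosen RSI lookback per day
# Trade yesterday's pick today
adaptive_ret = pd.Series({d: strat_rets.loc[d, int(k)]
                          for d, k in best_n.shift(1).dropna().items()})
```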

Results

The results are not particularly impressive, but the point of the article was to illustrate the concept as simply as possible. I believe there are ways to make this particular strategy more robust: for a start, using a shorter lookback period to make it more sensitive to recent market data, or introducing a weighting scheme that gives more weight to recent data. I will let the reader experiment with it; I would be happy to post results if you care to share. Even though the results are not spectacular, the strategy seems to adapt to the different waves in the market and allocate the capital to a more appropriate RSI lookback depending on the current market paradigm.

QF

(Part 4) Time Machine Test – Commodities

Results on a commodities basket.

WTI Crude

Natural Gas

Gold

Silver

Results

As you can see, the algorithm adapts to different classes of commodities and outperforms most of the buy-and-hold returns. But I want to emphasize that this concept alone is not something I would trade as is (it is not robust enough). The results are in no way good enough to rely on out-of-sample.

I found that it is much harder for the algorithm to find strategies significant enough to trade for long periods on these assets. For equities, there are on average 7 different active strategies (i.e. strategies whose level of significance is high enough) at any given time. The number is much lower for commodities and currencies, and the lack of diversification between strategies certainly hurts performance when compared to equity indices. It also adds more exposure to any given strategy, adding a lot of volatility to the returns, as shown by the numbers above.

QF

(Part 3) Time Machine Test – Currencies

Edit: This is a repost of the previous version of the post for currencies and commodities. When adapting my code for Datastream data, I made an error: the code was peeking, which explains the ridiculously straight equity curve. Here is the corrected version of the post. I sincerely apologize to readers for the inconvenience. I also want to assure you that the previous results on equity indices are correct; I had a couple of colleagues take a look to confirm that there were no bugs left.

CAD

EUR

GBP

Results

The first thing I notice in these results is the difference in how useful run analysis is on currencies. It makes sense, since currencies are driven in good part by macroeconomic factors. It is also much harder for the algorithm to find significant strategies; the strategy spends much less time in the market than with equities. Because of this, it becomes harder to squeeze alpha out of the strategy with a high confidence level in the analysis.

QF

Blog Beauty Pageant

Just a quick post to tell you not to be surprised if I change the theme of my blog, or at the very least the font size, during the day; some readers have brought to my attention that the posts are difficult to read. I will be experimenting with WordPress themes and HTML today to find a solution. It is made harder by the fact that I don’t really want to take a blog theme already used by the quant blogosphere’s big players (a real shame, since they are pretty nice). Regardless, expect a post testing the time machine algorithm on different asset classes (commodities, currencies) later today.
QF

(Part 2) Time Machine Test – Non-parametric Statistical Filter

As promised yesterday, I tried a small change to the original “time machine” strategy first introduced by CSS Analytics. If you have not already, please go read these background articles on statistical filters and their importance in a trading system:

- The Adaptive Time Machine: The Importance of Statistical Filters – CSS Analytics

- Transactional vs Confidence-based Trading Strategies – MarketSci

In yesterday’s post, I used the Student’s t-test approach to filter the significance of each of the 50 strategies the algorithm can choose from. As you may know, the Student’s t-distribution is used to estimate the mean of a normally distributed population. Such an assumption on the distribution contradicts the kind of fat-tailed returns the market throws at us. To relax the normality assumption, one can use a non-parametric statistical test. Non-parametric statistics make fewer assumptions about the distribution of the underlying data and can therefore be more robust, making them a prime choice for the “Time Machine” algorithm. More reading on the Wilcoxon signed-rank test can be found here: http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test.

For this test, I used the Wilcoxon signed-rank test instead of the Student’s t-test to establish the significance of strategies. The results below are for the strategy using a 95% significance filter on the S&P 500.
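For reference, a minimal Python sketch of the two filters side by side is below; `daily_strategy_returns` is a placeholder for one of the 50 strategies’ daily return streams, and the one-sided test against zero is a simplification of whatever the actual filter does.

```python
import numpy as np
from scipy import stats

def is_significant(daily_strategy_returns, alpha=0.05, test="wilcoxon"):
    """Keep a strategy only if its daily returns are significantly above zero."""
    r = np.asarray(daily_strategy_returns, dtype=float)
    r = r[~np.isnan(r)]
    if test == "ttest":
        # Parametric filter: assumes (roughly) normally distributed returns
        t_stat, p_two_sided = stats.ttest_1samp(r, 0.0)
        p = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
    else:
        # Non-parametric filter: Wilcoxon signed-rank test against a zero median
        _, p = stats.wilcoxon(r[r != 0], alternative="greater")
    return p < alpha

# Example on a toy fat-tailed return stream with a slight positive drift
rng = np.random.default_rng(1)
toy = rng.standard_t(df=4, size=500) * 0.01 + 0.0005
print(is_significant(toy, test="ttest"), is_significant(toy, test="wilcoxon"))
```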

The results obtained are not very different from the previous ones. It is interesting to see that the maximum drawdown is smaller when using the Wilcoxon test; this is probably due to the increased robustness of the statistical test. For the time being I will keep testing the algorithm on different equity indices and asset classes. Stay tuned.

QF

Time Machine Test (Part 1)

This series of posts is based on CSS Analytics’ Time Machine post series. I recommend you read it, since this is not my original idea. However, I do think that my implementation differs slightly from CSS Analytics’.

The following results represent backtests using my version of the algorithm on different equity indices, using all freely available historical data for each ticker. I used index data since the available data points go much further back in history. I did test the algorithm on the available ETFs, with similar results. All results presented below are frictionless.

S&P 500

RUSSELL 2000

NASDAQ 100

S&P/TSX Comp. (Can)

FTSE 100 (U.K.)

NIKKEI 225 (JAP)

HANG SENG (CHN)

Summary

As the results above show, the algorithm is quite robust and adapts fairly well to different market regimes, as well as to the differences in market behavior across all these indices. Regardless, I think this is a very nice and simple concept that can still be improved.

I will try several modifications to the algorithm over the next few posts. For now you can expect tests on different asset classes: commodities, futures, currencies, etc. I also want to try replacing the t-test with a non-parametric Wilcoxon signed-rank test and see how the strategy performs when we drop the normality assumption when testing for significance. I also have other ideas in mind to improve the algorithm; stay tuned!

QF

First Order Autocorrelation as a Moderator of Daily MR

In the same line of thought as my previous post on volatility as a moderator of daily MR, this post looks at first-order autocorrelation. From wiki: “Autocorrelation is the cross-correlation of a signal with itself. Informally, it is the similarity between observations as a function of the time separation between them.” Basically, it is the extent to which a series’ values are correlated with previous (i.e. lagged) values of the same series. For example, the first-order autocorrelation of a daily logarithmic return series is the correlation between two subsets of the return series: the series over a lookback period and the same subset lagged by one period.

From a trading perspective, autocorrelation is a very simple tool to incorporate in a market regime indicator, or more broadly in a trading system. It is also interesting to see the evolution of autocorrelation in a given asset. The figure below shows the equity curve of the S&P 500 since 1957, the rolling Sharpe ratio of a daily MR strategy (RSI 2 50/50), and the first-order autocorrelation of S&P 500 logarithmic returns using a rolling 2-year lookback period.
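For readers who want to reproduce the bottom pane, a quick pandas sketch of the rolling first-order autocorrelation (the data file and the 504-trading-day window, roughly two years, are my assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical input: daily S&P 500 closes in a 'Close' column
spx = pd.read_csv("GSPC.csv", index_col=0, parse_dates=True)["Close"]
log_ret = np.log(spx).diff()

# Rolling ~2-year (504 trading days) first-order autocorrelation:
# correlation of the log return series with itself lagged by one day
ac1 = log_ret.rolling(504).corr(log_ret.shift(1))

# Positive values -> follow-through regime; negative -> mean-reversion regime
ac1.plot(title="S&P 500: rolling first-order autocorrelation of daily log returns")
```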

As one would expect, first-order autocorrelation can help moderate daily MR performance. When autocorrelation is positive, daily follow-through is more profitable than mean reversion; around the turn of the century, however, autocorrelation switched into negative territory, which is consistent with the predominance of MR among profitable directional swing strategies, as very well explained in the blogs on my blogroll. I plan to post a more number-intensive note soon; this was just a post to introduce autocorrelation as a valuable moderator of daily MR.

QF