Volatility Autocorrelation in Different Markets

For an introduction to autocorrelation see this previous post: First Order Autocorrelation as a Moderator of Daily MR.

In the previous post I looked at how autocorrelation moderated MR performance; in this post I observe the autocorrelation of another moderator of daily follow-through: volatility. Intuitively, it makes sense for volatility to be highly autocorrelated; VIX spikes are a good example. Let’s look at how the autocorrelation of volatility behaves during bull/bear markets as defined by the 50/200-day moving average crossover.

For this test, I looked at the first- through fifth-order autocorrelation of the 21-day rolling standard deviation of SPY returns, divided the values into percentiles (using the percent rank function with a 252-day lookback period), and looked at the average and the standard deviation of the percentiles in each regime.
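For readers who want to reproduce the setup, here is a minimal sketch in Python, assuming daily SPY closes in a pandas Series. The 252-day window for the rolling autocorrelation itself is my assumption; the post only specifies the 21-day volatility and the 252-day percent rank.

```python
import numpy as np
import pandas as pd

def vol_autocorr_stats(price, vol_window=21, rank_window=252, max_lag=5):
    """Rolling lag-k autocorrelation of 21-day realized volatility,
    percent-ranked over 252 days, then averaged by bull/bear regime
    (bull = 50-day MA above 200-day MA)."""
    ret = price.pct_change()
    vol = ret.rolling(vol_window).std()
    bull = price.rolling(50).mean() > price.rolling(200).mean()
    rows = {}
    for lag in range(1, max_lag + 1):
        # rolling autocorrelation of the volatility series at this lag
        ac = vol.rolling(rank_window).apply(
            lambda w: pd.Series(w).autocorr(lag=lag), raw=True)
        # percent rank of today's autocorrelation vs the trailing 252 values
        pct = ac.rolling(rank_window).apply(
            lambda w: (w < w[-1]).mean(), raw=True)
        rows[lag] = {"bull_mean": pct[bull].mean(), "bull_std": pct[bull].std(),
                     "bear_mean": pct[~bull].mean(), "bear_std": pct[~bull].std()}
    return pd.DataFrame(rows).T
```

The output is one row per autocorrelation order, mirroring the layout of the table discussed below.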

As I expected before beginning the test, the volatility of returns is highly autocorrelated. However, I thought a bigger difference would exist between bull and bear markets. One thing to notice in the table below is that the autocorrelation is on average greater when the 50-day MA is above the 200-day MA, for all orders. The same observation holds for the standard deviation.

Trading-wise, this information has no direct value by itself, but it puts numbers on a well-defined market mechanic. We know that volatility is easier to predict than returns, and its level of autocorrelation is largely responsible for this fact. We also know that daily mean-reversion strategies perform better during high-volatility regimes. Thus we can use volatility autocorrelation as an indicator of the suitability of our strategy: during a high-volatility regime with high autocorrelation we can expect the strategy to perform well, and vice versa in the opposite conditions. For a very interesting read on the predictability of volatility and its application to trading, I recommend Tony Cooper’s recent article, winner of the NAAIM 2010 Wagner award: http://naaim.org/files/2010/1st_Place_Tony_Cooper_abstract.pdf



Drivers of MR Performance

A lot of the daily mean-reversion strategies discussed in the blogosphere are designed for equity index ETFs or equity index mutual funds. Since an index is a group of individual stocks, if we could determine which stocks drive the index’s returns, or in this case its behavior, we could adapt a strategy to profit from the prevailing market paradigm as indicated by those drivers.

For example, if the stocks affecting the index the most exhibit mostly mean-reverting behavior, we can expect the index to exhibit the same behavior. The same conclusion holds for trending behavior. The important point here is that indices are not entities in themselves; they are made to reflect a certain sector, type of stock, region, etc. To illustrate the concept, I performed a simple test using the popular unbounded DV2.

I used the Nasdaq 100 and its related ETF, QQQQ, to perform this test. First I needed a way to quantitatively identify the drivers of the market. For this purpose, I used a weighted r-squared. Many of you know the r-squared under its other name: the coefficient of determination. The statistic indicates the degree to which one series is predicted by another (i.e. the goodness of fit); to obtain it you can simply square the correlation coefficient. For this test I computed the r-squared between each component’s returns and the index’s returns over the previous 21 days on a rolling basis. I then took the volume of each individual component relative to the total volume of all the components and weighted each stock’s r-squared by its share of that volume.

In a more rigorous form:
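Putting the verbal description above in symbols (my reconstruction; the notation itself is an assumption), the weighted r-squared of component $i$ at time $t$ is

$$wR^2_{i,t} = R^2_{i,t} \cdot \frac{V_{i,t}}{\sum_{j=1}^{N} V_{j,t}}$$

where $R^2_{i,t}$ is the coefficient of determination between stock $i$'s returns and the index's returns over the previous 21 days, $V_{i,t}$ is the stock's volume, and $N$ is the number of index components.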

After performing this computation for every stock at every period, I took the 10 stocks with the highest weighted r-squared, computed their DV2 values, averaged them, and traded the QQQQ using the signal given by the average DV2 (long/short from 0). The results below compare this strategy, buy & hold QQQQ, and DV2 applied directly to the QQQQ data.
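As a sketch of the mechanics, using the commonly published definition of the unbounded DV2 (close relative to the day's high/low midpoint, smoothed over two days; the exact variant used here may differ) and hypothetical input layouts:

```python
import numpy as np
import pandas as pd

def dv2_unbounded(high, low, close, smooth=2):
    # Unbounded DV2: close relative to the day's high/low midpoint,
    # averaged over the last `smooth` days; negative = short-term oversold.
    raw = close / ((high + low) / 2.0) - 1.0
    return raw.rolling(smooth).mean()

def driver_dv2_signal(components, wr2, n_top=10):
    # components: dict of name -> DataFrame with High/Low/Close columns
    # wr2: DataFrame (dates x names) of weighted r-squared values
    # Both layouts are assumptions for illustration.
    dv2 = pd.DataFrame({k: dv2_unbounded(df["High"], df["Low"], df["Close"])
                        for k, df in components.items()})
    rank = wr2.rank(axis=1, ascending=False)
    avg = dv2.where(rank <= n_top).mean(axis=1)  # mean DV2 of the top drivers
    return -np.sign(avg)  # long (+1) when the drivers look oversold (DV2 < 0)
```

The returned series of +1/-1 positions would then be applied to the next day's QQQQ return.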

As you can see, the results are quite similar and, as expected, highly correlated. This shows that in an index formed of 100 stocks, looking only at a select group with a big influence on the returns can help determine the MR performance of the index itself.


Introduction of SVM Learning in Strategies

This post is derived from a comment I received on my last post on using the probability density function for an adaptive RSI strategy. pinner made the following observation:

“Alternatively you could regress the returns against the set of 6mo & 1yr RSI points as a means to determine the best decision. While this approach probably requires more historical data, it also affords more detailed metrics from which to make each decision.”

I am personally not a big fan of simple regression models for my trading decisions. However, I think there is something to the concept. Such a setup is also a good place to use a classic machine learning technique: the support vector machine (SVM).

From Wikipedia: given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. Intuitively, an SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

In concrete terms, the training algorithm separates the data into up days and down days and looks at the predictors’ values. It then creates a set of rules dividing the data; these rules minimize the classification error while also maximizing the margin of safety, giving the model more room and (hopefully) greater accuracy. Based on this set of rules, the algorithm classifies new data into a category. Note that this is not a numeric prediction (i.e. next-day return should be x%); it is a binary class prediction, which allows us to derive a probability along with the prediction. That comes in handy when we want a confidence-based system.
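A minimal sketch with scikit-learn; the two random "predictors" here merely stand in for whatever features one would actually use (e.g. the 6-month and 1-year RSI values from pinner's comment), so this is illustrative, not a strategy:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))                                # stand-in predictors
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)   # 1 = up day, 0 = down day

# probability=True fits a secondary calibration so predict_proba is available
model = SVC(kernel="rbf", probability=True, random_state=0)
model.fit(X[:400], y[:400])

pred = model.predict(X[400:])         # binary up/down classification
proba = model.predict_proba(X[400:])  # confidence attached to each prediction
```

The second column of `proba` is the estimated probability of an up day, which is what a confidence-based system would filter on.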

This is a nice technique, but it has its drawbacks. First of all, it is a parameter-dependent technique: the effectiveness of the SVM is mostly determined by two parameters. It is usually recommended to test different values for the pair and retain the pair that performs best in cross-validation. This can become quite computationally expensive. Without getting too technical (or into details), if we want a non-linear classification algorithm, we also have to choose the type of kernel function we want; there are several.
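For an RBF kernel, the two parameters in question are usually the cost C and the kernel width gamma (the post does not name them, so take this as the standard setup). A cross-validated grid search with scikit-learn looks like the sketch below; note the time-ordered splits, since shuffled folds would leak future market data into training.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = (X[:, 0] > 0).astype(int)  # toy up/down labels

# Try each (C, gamma) pair; TimeSeriesSplit keeps every validation fold
# strictly after its training data, which matters for ordered market data.
search = GridSearchCV(SVC(kernel="rbf"),
                      param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
                      cv=TimeSeriesSplit(n_splits=5))
search.fit(X, y)
best = search.best_params_  # the pair retained for the final model
```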

I just wanted to throw the idea out there, since pinner’s suggestion was a good one. If readers are interested in trying it out, I encourage you to send me the results; I would be happy to post them. Alternatively, I might post some results depending on the interest. For now, I only wanted to introduce the support vector machine and its potential application when developing strategies.


Using Probability Density as an Adaptive Mechanism

I will take a pause from the Time Machine series for now while I work on it some more and prepare future posts. Today I will follow up on my post on return distributions and show a simplistic way to include them in an adaptive strategy.

It has been discussed quite a lot in the blogosphere that strategies with fixed parameters are inferior to adaptive strategies. For example, the simplest daily MR strategy I can think of is probably RSI 2 50/50, but this strategy has not always worked and I certainly don’t expect it to keep working forever. Furthermore, the most profitable lookback parameter for the RSI also varies in time. This is where the return distribution is useful: from it, we can derive the probability density function and use that to create an adaptive mechanism.

Just a little background on the probability density function; from Wikipedia: “the density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point in the observation space.” In plain language: how likely a certain outcome is. I recommend using your favorite statistical software to estimate it, unless you want to be doing integrals for a long time!

For this test, I took SPY data, computed RSI values for different lookback periods (2 to 30), and then looked at the results for each strategy. Over rolling 1-year and 6-month periods I estimated the probability density of returns for every strategy, looking specifically at the probability of returns greater than zero (this threshold can be raised; I just want to keep it basic here). I then traded the strategy with the highest combination of the 1-year and 6-month values. That way, capital is allocated to the parameter generating the highest probability of positive returns as measured by the probability density function. I compared this strategy with RSI 2 50/50 and buy & hold.
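A minimal sketch of the selection step, using a Gaussian kernel density estimate from SciPy. The post does not say how the 1-year and 6-month probabilities are combined, so the simple sum below is an assumption.

```python
import numpy as np
from scipy.stats import gaussian_kde

def prob_positive(returns):
    # P(return > 0) under a KDE fit of the strategy's daily returns;
    # integrate_box_1d does the integral so we don't have to.
    kde = gaussian_kde(returns)
    return kde.integrate_box_1d(0.0, np.inf)

def pick_lookback(strategy_returns, win_long=252, win_short=126):
    # strategy_returns: dict of RSI lookback -> 1-D array of daily returns
    def score(r):
        return prob_positive(r[-win_long:]) + prob_positive(r[-win_short:])
    return max(strategy_returns, key=lambda k: score(strategy_returns[k]))
```

Each day, capital goes to the RSI lookback returned by `pick_lookback`, re-evaluated as the rolling windows move forward.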


The results are not particularly impressive; the point of the article is to illustrate the concept as simply as possible. I believe there are ways to make this particular strategy more robust: for a start, taking a shorter lookback period to make it more sensitive to recent market data, or introducing a weighting scheme that emphasizes more recent data. I will let the reader experiment with it; I would be happy to post results if you care to share. Even though the results are not spectacular, the strategy seems to adapt to the different waves in the market and allocate capital to a more appropriate RSI lookback depending on the current market paradigm.


(Part 4) Time Machine Test – Commodities

Results on a commodities basket.

WTI Crude

Natural Gas




As you can see, the algorithm adapts to different classes of commodities and outperforms most of the buy-and-hold returns. But I want to emphasize that this concept alone is not something I would trade as is (it is not robust enough); the results are in no way good enough to rely on out-of-sample.

I found that it is much harder for the algorithm to find strategies significant enough to trade for long periods on these assets. For equities, there are on average seven different active strategies (i.e. with a high enough level of significance) at any given time. The number is much lower with commodities and currencies, and the lack of diversification between strategies certainly hurts performance compared to equity indices. It also concentrates exposure in fewer strategies, adding a lot of volatility to the returns, as shown by the numbers above.


(Part 3) Time Machine Test – Currencies

Edit: This is a repost of the previous version of the post for currencies and commodities. When adapting my code for Datastream data, I made an error: the code was peeking ahead, which explains the ridiculously straight equity curve. Here is the corrected version of the post. I sincerely apologize to readers for the inconvenience. I also want to assure you that the previous results on equity indices are correct; I had a couple of colleagues take a look to confirm that there were no bugs left.





The first thing I notice looking at these results is the reduced usefulness of run analysis on currencies. It makes sense, since currencies are driven in good part by macroeconomic factors. It is also much harder for the algorithm to find significant strategies; the time in the market is much lower than with equities. Because of this, it becomes harder to squeeze alpha out of the strategy with a high confidence level in the analysis.


Blog Beauty Pageant

Just a quick post to tell you not to be surprised if I change the theme of my blog, or at the very least the font size, during the day; some readers have brought to my attention the difficulty of reading the posts. I will be experimenting with WordPress themes and HTML today to find a solution. This is made harder by the fact that I don’t really want to take a blog theme already used by the quant blogosphere’s big players (a real shame, since they are pretty nice). Regardless, expect a post testing the Time Machine algorithm on different asset classes (commodities and currencies) later today.