I created a new Monte Carlo analysis tool. This tool allows you to simulate the next 40 years of market returns, month by month, using the same probability distribution encountered historically for monthly returns. The goal is to see how much variation is possible in actual market returns compared to the expected market returns often assumed in financial planning. You may be surprised by the answer.

The data I used for S&P monthly returns is from Robert Shiller of “Irrational Exuberance” fame. (You can download this historical data here.) Starting in January of 1871 and running to November 2010, there are 1678 different monthly periods in the dataset, encompassing nearly 140 years. The data has nominal prices as well as real prices adjusted by the CPI. Since the CPI is a woefully inadequate descriptor of inflation with the government games as of the last couple of decades, I chose to use the nominal data. Just remember that inflation/deflation and taxes will affect your final outcome in reality.

An important concept for this analysis is a cumulative probability distribution. A regular probability distribution looks like the bell curve you may be familiar with. The x-axis is the value, and the y-axis is the number of items occurring at that value describe the probability density, which for a normal distribution will look something like this:

This curve shows where the most probable numbers lie in the distribution. The more space under the curve at a given x value, the more numbers existed at that point. A cumulative probability distribution maps this same data onto a single-valued function in y. This is done by calculating the probability that any given number is higher than the x value chosen, like so:

That is the same data as shown in the first bell curve, just mapped to a function we can use by choosing a y value and reading the one corresponding x value. This is the key enabler for the Monte Carlo Analysis.

I began the analysis by mapping the Shiller monthly returns to a cumulative probability function that gave a value between 0 and 1. I started with the lowest monthly return (-26.47%) and noted that all returns in the set were equal or greater to that return (1678/1678 = 1). Then for the next return (-23.97%), only 1677/1678 returns were equal or greater than that amount (0.999404052), and so forth to the last value, which I capped at 0. I used linear interpolation between the points to create a continuous function. Once finished, you then have a function that quantifies the relative probability of encountering each given return with a proportionally-probable number between 0 and 1. This cumulative probability distribution function for the Shiller data is plotted below:

The way to interpret this is if you looked at a cumulative probability on the y-axis of 0.1 (or 10%), then reading the curve, that means that only one out of ten months had returns that were greater than or equal to +5% for the month. Nine out of ten months returned +5% or less. Note that this distribution is representative only of what we have seen so far. As more years of data are created, this distribution will most probably change. It’s also worth noting that the distribution is not a “normal” distribution like the classical bell curve in statistics. Here’s the overall cumulative probability plot of a normal distribution compared to our market returns distribution:

Ours distribution has “fat tails”, meaning the probabilities of getting results at the outlying edges are much higher than you would expect, both for large gains and for large losses. Zooming in, this becomes clear.

Gains Fat Tail (returns greater than about 12% happen more than predicted):

Losses Fat Tail (returns less than about -7% happen more than predicted):

The fat tail losses are huge. If you used a normal distribution, you would expect a month of -12% to only happen 1 in 1000 times (0.999), but the actual frequency is more like 1 in 100 (0.99), a factor of 10 difference. This should show that using normally distributed statistical models for financial market analysis is inaccurate at best and nearly blows up the world Long-Term-Capital-Management-Style at worst. Probabilistically speaking, we are going to see more losses than we expect, and it shows up in the analysis. It turns out that we got pretty lucky over the last 40 years in reality!

To perform the Monte Carlo analysis, we simulate the flip of a special virtual coin. This “coin” is just a random number between 0 and 1. Any outcome in that range is equally as likely as any other. We take that random number and read it against our cumulative probability distribution to get the corresponding monthly return. The new account equity is calculated using that return for the month. Then we repeat the process for 480 months (40 years) and record the final equity, as well as the peak and minimum equity during that time. Since the average working lifetime of a person is about 40 years, and 40 years would probably be classified as “the long run” by most people, I went with that duration for the MC sim. Doing this once represents one simulation. We can repeat it many times to get a better idea of how the final outcomes may vary depending on how each month played out.

The chart below shows the results for running 32,000 simulations of 40 years of returns (32,000 because that’s the most we can fit on one chart in Excel). Each one started with $100,000 of equity:

You immediately see that there are a few really large outliers to the upside. There were 26 instances of finishing with $10,000,000 (100x the original account value) out of 32000. Seems like a lot, but remember that is only 0.081% of the simulations.

I went on to run 201765 simulations over the course of several hours, again with $100,000 starting equity. From that dataset, here are some stats:

The median ending value was $539,134, or only about 5X for the 40 years. That means that 50% of the simulations had final equity more than this, but 50% of them had final equity less than $539,134. If you ended up exactly at the median value after 40 years, the simple annual rate of return was 10.98%, and the compounded rate of return was only 4.30%. The simple return is analogous to buying a $100,000 40 year bond paying 10.98% a year for 40 years. At the end you’d have $539,134. The compounded return is like putting $100,000 in a savings account paying 4.30% per year on the balance, and you re-invest the interest every year. You’d have the $539,134 at the end of 40 years.

The mean (average) ending value was $806,008. The average is so much higher than the median because of those few very large outliers mentioned earlier. If you ended up exactly at the mean value after 40 years, the simple annual rate of return was 17.65%, and the compounded rate of return was only 5.36%. The simple return is analogous to buying a $100,000 40 year bond paying 17.65% a year for 40 years. At the end you’d have $806,008. The compounded return is like putting $100,000 in a savings account paying 5.36% per year on the balance, and you re-invest the interest every year. You’d have the $806,008 at the end of 40 years.

The smallest final equity over 40 years was $11,818.25, almost a 90% LOSS(!) The highest was $54,361,024. Both of these outcomes are highly improbable, but technically possible.

Well, how does that compare to the actual S&P results over the last 40 years? If you started with $100,000 on Dec 1, 1970 you’d have $1,421,155.67 as of Dec 1, 2010. The simple annual rate of return was 33.03%, and the compounded rate of return was only 6.86% (a little compounded interest goes a long way, no?). That is a very large increase over the median result of our Monte Carlo analysis, and even nearly a full double over the much higher average value! It would seem that if the historical probabilities continue to match the future outcomes, that we have been highly favored by the stock gods over the past 40 years. I would not expect the next 40 to be as kind.

Well, then what can we expect? To get an idea of a the typical range of values, I took the mean of our results +/- one standard deviation. I subtracted the initial account value to get a chart of profit/loss instead of final account value. This gives the levels seen in the following plot:

“Standard deviation” is a concept that applies only to the theoretical normal distribution, but in this case it’s the best I could think of for a metric for “typical” values. In a normal distribution, 68% of the values would fall within this +/- one SD band. In our case, we don’t know if our future values will fall there or not. Probably not exactly. But if someone told you there was roughly a 2/3 chance of something happening, you’d call that “probable” I imagine.

From that chart, you can see that the actual PnL for the S&P over the last 40 years was below the upper “probable” band, but still well above the average. Basically, we had a really good 4 decades, and as we all know, most of it was the 90’s. So if I were to be making retirement planning, I would not be assuming anything close to the last 40 years. I’d be using the average value, and possibly the median value if I was feeling rather conservative. Because that’s what you are likely to get. It’s even possible (and probable) that we could see a worse return than the median. But you won’t see that on CNBC.

Here are a few plots showing some typical results for different simulations. Each of these has the large dataset mean, median, SD bands and 40 year S&P actuals on it. The “Net PnL” line is the PnL graph of the current simulation month by month over the 40 years. As you can see, there can be considerable variation in the outcomes! It’s not as simple as assuming 8% a year for 40 years:

I’m releasing the tool for free. When you open it, make sure that your security settings allow macros and that you enable them if prompted. To use the tool, you go to the Analysis tab, enter your starting equity in the orange cell, enter the number of simulations you want to run, and hit the “RUN” button. (If you enter a huge number and you’re tired of waiting, hit your ESC key and hit end on the pop-up. Change your value to a smaller number and hit run again.) When complete, it moves to the MonteCarloPlots tab and you can see your chart. You can refresh simulations one at a time by hitting F9, which will refresh the random numbers and plot a new “Net PnL” curve. This gives a good visual understanding of the stats and I recommend doing it. Finally, You can change the cumulative probability data in the CumProb sheet.

Download the Excel file “GeneralMonteCarloAnalysis.xlsm” tool from “Released Tools” on my Google Site. (I also have provided a version for pre-2007 Excel, the one with the “.xls” file extension, and remember that you can’t do more than 64,000 points or so because of the restriction on number of rows.) You can investigate the analysis I did, or you can substitute data of your own to create a different cumulative probability distribution to perform other analysis. If you use the tool and it’s valuable to you, I encourage you to make a donation to support future work like this. Thanks!

Tags: market returns, monte carlo, S&P

December 19, 2010 at 12:29 pm

Nice work, Quinn.

December 22, 2010 at 1:05 am

i agree, this really puts things into perspective, thanks for compiling this data, keep up the good work, hope you find more time from your busy schedule to work on this blog because its there have been a lot of great points, and i feel many yet to still be uncovered

February 3, 2011 at 9:27 am

You are forgetting the fact that this financial data is strongly autocorrelated from month to month, so sampling from the same distribution each month is problematic as the conditional distribution given the outcome from the previous month is different, month to month returns are definitely not independent

February 3, 2011 at 9:40 am

You are right about that. A fully detailed analysis would have to take broad market conditions into account as well as month to month correlations. However, this should be a good study of “random walk” conditions if the market were truly random (I don’t think it is). At lease it points out that returns can vary more than we commonly think they can over the long term.