Backtesting Strategies In-Depth

AnalyticsForTraders

In a previous article, Understanding the Reward/Risk and Win/Loss Ratios of a Trading System, the discussion focused chiefly on:

  • Generalized Reward/Risk + Win/Loss Calibration

This article extends that original piece and reinforces its ideas with a tangible example.

Let’s start with a system:

The system’s characteristics for the parameter set on which it was run are summarized in the following table:

While the PnL figure certainly appears lucrative, the underlying Reward/Risk (R/R) and associated Win/Loss characteristics raise an interesting question: is the accompanying performance sustainable at a win rate of 45% with an R/R of 1.07?
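One quick way to frame that question is per-trade expectancy. The sketch below assumes, purely for illustration, that R/R means average win divided by average loss and that every win pays exactly that multiple of a fixed loss. The function names `expectancy` and `breakeven_win_rate` are hypothetical and are not taken from the original system.

```python
def expectancy(win_rate: float, reward_risk: float) -> float:
    """Expected result per trade, measured in units of the average loss.

    Assumption: every win pays `reward_risk` units, every loss costs 1 unit.
    """
    return win_rate * reward_risk - (1.0 - win_rate)


def breakeven_win_rate(reward_risk: float) -> float:
    """Win rate at which the expectancy above is exactly zero."""
    return 1.0 / (1.0 + reward_risk)


if __name__ == "__main__":
    rr, p = 1.07, 0.45  # figures quoted in the article
    print(f"expectancy per trade: {expectancy(p, rr):+.4f} loss units")
    print(f"breakeven win rate:   {breakeven_win_rate(rr):.1%}")
```

Under these simplifying assumptions, a 45% win rate sits below the roughly 48.3% needed to break even at an R/R of 1.07, so each trade loses about 0.07 average-loss units in expectation.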

That question can be tackled by taking the system’s empirically determined characteristics and running them through an iterative (Monte Carlo) simulation to estimate how likely the observed real-life outcome is. Let’s begin by recreating the exact characteristics of the system as observed:

Running 1,000 iterations of that characteristics set at a trade depth of 55 (25 wins + 30 losses; set aside sample size concerns for the time being) shows that only 33.8% of the iterations ended with an account value greater than the initial value.
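A simulation of this kind can be sketched in a few lines. This is a minimal illustration, not the tool used for the article: it assumes each trade is an independent draw that either wins a fixed amount or loses a fixed amount, and the `avg_loss` of 1,000 is a placeholder, since the article's per-trade dollar figures are not reproduced here. All function names are hypothetical.

```python
import random
import statistics


def simulate_paths(n_paths, n_trades, win_rate, reward_risk,
                   avg_loss, initial_capital, seed=None):
    """Simulate equity paths of independent fixed-size wins and losses."""
    rng = random.Random(seed)
    avg_win = reward_risk * avg_loss  # assumption: R/R = avg win / avg loss
    paths = []
    for _ in range(n_paths):
        equity = initial_capital
        path = [equity]
        for _ in range(n_trades):
            equity += avg_win if rng.random() < win_rate else -avg_loss
            path.append(equity)
        paths.append(path)
    return paths


def summarize(paths, initial_capital):
    """Return (share of paths ending above start, median path, max ending)."""
    endings = [path[-1] for path in paths]
    share_positive = sum(e > initial_capital for e in endings) / len(endings)
    # Median across all paths at each trade step.
    median_path = [statistics.median(step) for step in zip(*paths)]
    return share_positive, median_path, max(endings)


if __name__ == "__main__":
    paths = simulate_paths(
        n_paths=1000, n_trades=55, win_rate=25 / 55, reward_risk=1.07,
        avg_loss=1000.0,  # placeholder value, not from the article
        initial_capital=50_000.0, seed=42,
    )
    share, median_path, best = summarize(paths, 50_000.0)
    print(f"share ending above initial capital: {share:.1%}")
    print(f"median ending value: {median_path[-1]:,.2f}")
```

The share of paths ending positive does not depend on the placeholder `avg_loss`, only on the win rate, the R/R, and the trade depth. The dollar values of the paths do depend on it.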

Furthermore, the median of paths (thick orange curve) appears to be monotonically decreasing.

The observed ending value of 115,504.40 from the real-life system (50,000 initial capital + 65,504.40 Net PnL) sits near the upper end of the modeled paths, whose maximum was 137,956.52. Only 14 of the 1,000 iterations ended at or above the empirically observed 115,504.40. That’s 1.4%.

It would appear then that the system has produced somewhat of an anomalous outcome… but has it?

Those who gave the original table a thorough look-through probably have a good understanding of why these figures are so skewed… this point will be addressed shortly. Before doing so, however, let’s return to the model and run the simulation out to 10x its empirical depth (550 trades):

The observation of monotonically decreasing behavior is confirmed, and in a big way. The share of accounts ending above their initial value has dropped from 33.8% to 4.7%.
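The effect of depth can also be checked without simulation. The sketch below assumes the same simplified model as before (independent trades, fixed win of `reward_risk` units, fixed loss of 1 unit) and computes the exact binomial probability that a run of trades ends net positive. The function names are hypothetical, and because the article's exact simulation settings are not given, the results need not match its simulated percentages.

```python
import math


def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of exactly k wins in n trades."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1.0 - p)


def prob_net_positive(n_trades: int, win_rate: float, reward_risk: float) -> float:
    """P(total PnL > 0) when wins pay `reward_risk` units and losses cost 1."""
    # Smallest number of wins that leaves the run strictly profitable.
    min_wins = next(
        (w for w in range(n_trades + 1)
         if w * reward_risk - (n_trades - w) > 0),
        None,
    )
    if min_wins is None:
        return 0.0
    return sum(
        math.exp(log_binom_pmf(k, n_trades, win_rate))
        for k in range(min_wins, n_trades + 1)
    )


if __name__ == "__main__":
    for depth in (55, 550):
        p = prob_net_positive(depth, 25 / 55, 1.07)
        print(f"{depth:>4} trades: P(net positive) = {p:.1%}")
```

With a negative per-trade expectancy, the expected shortfall grows in proportion to the number of trades while the spread of outcomes grows only with its square root, which is why the probability of finishing positive shrinks as depth increases.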

While a handful of iterations do go on to generate positive PnL, it is rather evident that the UNDERLYING characteristics of this specific configuration are not in and of themselves reliably profitable.

Returning to the question of why that is, let’s revisit the characteristics table:

There is a prominent dispersion in the distribution of the largest gain/loss compared to the respective averages of gains/losses. So prominent, in fact, that the single largest winning trade accounts for over 43% of the cumulative Net PnL.

Removing this occurrence and focusing on the underlying process itself strips away the impact of such “outliers” to provide insight into the characteristics that better describe the system on average.
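A check of this kind is easy to run on any trade list. The sketch below measures how much of the net PnL comes from the single largest win and recomputes R/R with that trade removed. The helper names are hypothetical and the sample trade list is invented for illustration; it is not the article's data.

```python
def largest_win_share(trade_pnls):
    """Fraction of net PnL contributed by the single largest winning trade."""
    net = sum(trade_pnls)
    if net <= 0:
        raise ValueError("share is only meaningful for a positive net PnL")
    return max(trade_pnls) / net


def without_largest_win(trade_pnls):
    """Copy of the trade list with one instance of the largest win removed."""
    trimmed = list(trade_pnls)
    trimmed.remove(max(trimmed))
    return trimmed


def reward_risk(trade_pnls):
    """Average win divided by average loss (both as positive numbers)."""
    wins = [x for x in trade_pnls if x > 0]
    losses = [-x for x in trade_pnls if x < 0]
    if not wins or not losses:
        raise ValueError("need at least one win and one loss")
    return (sum(wins) / len(wins)) / (sum(losses) / len(losses))


if __name__ == "__main__":
    # Invented sample trades, NOT the article's data.
    sample = [4300.0, 2500.0, 2200.0, 3000.0, -800.0, -1200.0]
    print(f"largest win share of net PnL: {largest_win_share(sample):.0%}")
    print(f"R/R, all trades:         {reward_risk(sample):.2f}")
    print(f"R/R, largest win removed: {reward_risk(without_largest_win(sample)):.2f}")
```

Comparing the statistics with and without the largest trade gives a rough sense of how much the headline figures lean on a single event.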

Does this mean that the system is worthless? Not necessarily; it did manage to produce a 131% return trading single lots, after all.

What it does mean is that the system is dependent on outsized events (trades). For some individuals that is completely fine, business as usual. For others it can present real challenges (e.g. cash flow management, drawdown tolerance). Ultimately it is a matter of preference and suitability.

The point is to recognize that a clean backtest may or may not mean what it implies at first glance.
