[WITH CODE] Evaluation: Metrics for your systems
Metrics will help you understand how good your trading strategy is
Table of contents:
Introduction.
Basic statistical functions.
Performance metrics.
Risk-adjusted metrics.
Statistical properties of returns.
Risk measures.
Benchmark comparison metrics.
Drawdown metrics.
Trade-based metrics.
Introduction
Returns are just vanity metrics until you dissect the risk engine that drives them. The Sharpe ratio isn't an academic trophy; it's the blood pressure gauge for volatility-adjusted returns. Maximum drawdown? That's your strategy's breaking point when markets crash like they did in 2008.
And friends, if your backtest collapses under the 10-sigma volatility spikes of March 2020 or a liquidity crunch from a hawkish Fed pivot, you're not trading, you're donating.
Quantitative analysts obsess over left-tail events hidden in the noise. Their alpha is just beta in disguise if it can't survive rate hikes, flash crashes, or a Black Swan that chokes the repo market. Remember: markets don't care about your elegant code. They break strategies that confuse backtest luck with structural advantage.
Today we begin with a basic overview of the metrics used to evaluate any trading system. Without them, you won't succeed.
Basic statistical functions
Before diving into individual metrics, we need to establish the statistical building blocks that we will use to compute averages, dispersion, and percentiles.
Mean
The arithmetic mean of a dataset {x1, x2, …, xn} is defined as:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
The mean is the central value of the data. It gives a first-order summary and is used in nearly every metric.
def my_mean(data):
    """Calculate the mean of a list of numbers."""
    if not data:
        return 0
    return sum(data) / len(data)
Standard deviation
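The population standard deviation measures the dispersion of data around the mean. Its formula is:

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where $\bar{x}$ is the mean of the data.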
A higher standard deviation indicates that data points are spread out over a wider range of values, which is particularly important in risk assessment.
import math

def my_std(data):
    """Calculate the population standard deviation of a list of numbers."""
    if not data:
        return 0
    m = my_mean(data)
    # Population variance: average squared deviation from the mean.
    variance = sum((x - m) ** 2 for x in data) / len(data)
    return math.sqrt(variance)
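As a quick cross-check, the results of the two helpers above can be compared against Python's standard `statistics` module. This is only a minimal sketch: the `daily_returns` values are made up for illustration, and the sample standard deviation (dividing by n - 1) is shown purely for contrast with the population version used in this article.

```python
import math
import statistics

# Hypothetical daily returns, for illustration only.
daily_returns = [0.01, -0.02, 0.015, 0.003, -0.007]

mean_r = statistics.fmean(daily_returns)      # arithmetic mean
pop_std = statistics.pstdev(daily_returns)    # population std: divides by n
sample_std = statistics.stdev(daily_returns)  # sample std: divides by n - 1

print(f"mean={mean_r:.6f} pop_std={pop_std:.6f} sample_std={sample_std:.6f}")
```

With only a handful of observations the two standard deviations differ noticeably; with years of daily data the gap becomes negligible.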
Percentiles
Percentiles determine the relative standing of a value in a dataset. The p-th percentile is computed by sorting the data and interpolating between the nearest ranks.
For example, the 5th percentile, which is used in VaR calculations, tells us the value below which 5% of the data fall.
def my_percentile(data, percentile):
    """
    Compute the percentile of a list of numbers using linear interpolation.
    """
    if not data:
        return None
    sorted_data = sorted(data)
    n = len(sorted_data)
    # Fractional position of the requested percentile in the sorted data.
    pos = (percentile / 100) * (n - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    if lower == upper:
        return sorted_data[int(pos)]
    lower_value = sorted_data[lower]
    upper_value = sorted_data[upper]
    # Interpolate linearly between the two neighboring values.
    weight = pos - lower
    return lower_value + weight * (upper_value - lower_value)
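To make the interpolation step concrete, here is a small worked example. It is a sketch under stated assumptions: `percentile_linear` is a hypothetical, condensed stand-in for the function above (it assumes a non-empty list), and the return values are invented so that the 5th percentile lands exactly halfway between the two worst observations.

```python
import math

def percentile_linear(data, percentile):
    """Percentile via linear interpolation between the two nearest ranks."""
    ordered = sorted(data)
    pos = (percentile / 100) * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    weight = pos - lower  # zero when pos is a whole number
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])

# Hypothetical daily returns (11 observations), for illustration only.
returns = [-0.05, -0.03, -0.02, -0.01, 0.0, 0.005, 0.01, 0.015, 0.02, 0.03, 0.04]

# pos = 0.05 * (11 - 1) = 0.5, so the result lies halfway between
# the two worst returns, -0.05 and -0.03.
p5 = percentile_linear(returns, 5)
print(f"5th percentile: {p5:.4f}")
```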
Perfect! Now that we have these concepts clear, let's move on to something a little more interesting.
Performance metrics
Performance metrics help you understand how well your trading strategy grows your capital.
Cumulative return
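The cumulative return is defined as:

$$\text{Cumulative return} = \left(\frac{E_{\text{final}}}{E_{\text{initial}}} - 1\right) \times 100\%$$

where $E_{\text{initial}}$ and $E_{\text{final}}$ are the starting and ending equity.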
This metric measures the overall growth of your portfolio over the period. If you start with $100,000 and finish with $150,000, your cumulative return is 50%.
def cumulative_return(equity_initial, equity_final):
    """Calculate the cumulative return as a percentage."""
    return ((equity_final / equity_initial) - 1) * 100
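If you only have a series of periodic returns rather than starting and ending equity, the same quantity can be obtained by compounding. The sketch below assumes returns are expressed as decimals; `cumulative_return_from_returns` is a hypothetical helper name, and the yearly figures are invented.

```python
import math

def cumulative_return_from_returns(period_returns):
    """Compound periodic returns (as decimals) into a total return in percent."""
    growth = math.prod(1 + r for r in period_returns)
    return (growth - 1) * 100

# Hypothetical yearly returns: +10%, -5%, +20%.
yearly = [0.10, -0.05, 0.20]
total = cumulative_return_from_returns(yearly)
print(f"cumulative return: {total:.2f}%")
```

Note that compounding differs from simply adding the returns: here the sum would be 25%, while the compounded figure is slightly higher.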
Compound annual growth rate (CAGR)
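CAGR is defined as:

$$\text{CAGR} = \left(\frac{E_{\text{final}}}{E_{\text{initial}}}\right)^{1/N} - 1$$

where $E_{\text{initial}}$ and $E_{\text{final}}$ are the starting and ending equity and N is the number of years.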
CAGR provides an annualized growth rate that smooths out fluctuations, giving you a single number to compare across different time frames. It’s like finding the average speed of a car over a long journey, regardless of the stops and starts.
def calculate_cagr(equity_initial, equity_final, years):
    """Calculate the Compound Annual Growth Rate (CAGR) as a decimal."""
    if equity_initial <= 0 or years <= 0:
        return None
    return (equity_final / equity_initial) ** (1 / years) - 1
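Backtests usually report their length in trading days rather than years, so a conversion step is often needed. The following sketch assumes 252 trading days per year, a common convention but still an assumption, and uses a hypothetical helper name, `cagr_from_trading_days`.

```python
TRADING_DAYS_PER_YEAR = 252  # assumption: common convention for equity markets

def cagr_from_trading_days(equity_initial, equity_final, trading_days):
    """Annualized growth rate when the period length is given in trading days."""
    if equity_initial <= 0 or trading_days <= 0:
        return None
    years = trading_days / TRADING_DAYS_PER_YEAR
    return (equity_final / equity_initial) ** (1 / years) - 1

# $100,000 growing to $150,000 over 756 trading days (about 3 years).
cagr = cagr_from_trading_days(100_000, 150_000, 756)
print(f"CAGR: {cagr:.2%}")
```

The 50% cumulative return from the earlier example therefore corresponds to a bit under 15% per year, not 50% / 3.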
Risk-adjusted metrics
These metrics incorporate both returns and the risk taken to achieve them. They help answer the question: Is the extra return worth the extra risk?
Sharpe ratio
It computes the excess return for each day, averages them, and divides by the volatility of these returns:

$$\text{Sharpe} = \frac{E(R) - R_f}{\sigma}$$

where:
E(R) is the expected return.
Rf is the risk-free rate.
σ is the standard deviation (volatility) of returns.
The Sharpe Ratio measures how much excess return you are receiving for the extra volatility that you endure for holding a riskier asset.
def sharpe_ratio(returns, risk_free_rate):
    """
    Calculate the Sharpe Ratio using excess returns.
    """
    # Compute excess returns for each period.
    excess_returns = [r - risk_free_rate for r in returns]
    # Compute the mean of the excess returns.
    mean_excess = my_mean(excess_returns)
    # Compute the standard deviation of the excess returns.
    std_excess = my_std(excess_returns)
    # Avoid division by zero.
    if std_excess == 0:
        return float('inf')
    return mean_excess / std_excess
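The function above returns a per-period Sharpe ratio, so with daily data the result is a daily figure. A common way to make it comparable across strategies is to annualize it by multiplying by the square root of the number of periods per year. The sketch below is one way to do that, not the only one: it assumes 252 trading days, roughly independent daily returns, and a simple division of the annual risk-free rate into a daily rate. `annualized_sharpe` is a hypothetical name.

```python
import math
import statistics

TRADING_DAYS_PER_YEAR = 252  # assumption: common convention for daily equity data

def annualized_sharpe(daily_returns, annual_risk_free_rate=0.0):
    """Daily Sharpe ratio scaled by sqrt(252); assumes roughly i.i.d. daily returns."""
    # Convert the annual risk-free rate to a simple per-day rate.
    daily_rf = annual_risk_free_rate / TRADING_DAYS_PER_YEAR
    excess = [r - daily_rf for r in daily_returns]
    std_excess = statistics.pstdev(excess)  # population std, as in the article
    if std_excess == 0:
        return float('inf')
    daily_sharpe = statistics.fmean(excess) / std_excess
    return daily_sharpe * math.sqrt(TRADING_DAYS_PER_YEAR)

# Hypothetical daily returns, for illustration only.
daily = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002]
print(f"annualized Sharpe: {annualized_sharpe(daily, 0.02):.2f}")
```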
Sortino ratio
In this ratio, we isolate only the days with losses, compute their variability, and then see how the overall excess return compares to that risk:

$$\text{Sortino} = \frac{E(R) - R_f}{\sigma_d}$$

where σd is the standard deviation of the negative returns (the downside deviation).
Unlike the Sharpe Ratio, which penalizes both upside and downside volatility, the Sortino Ratio focuses only on harmful volatility—the volatility caused by negative returns.
def sortino_ratio(returns, risk_free_rate):
    """
    Calculate the Sortino Ratio considering only negative deviations.
    """
    # Compute excess returns.
    excess_returns = [r - risk_free_rate for r in returns]
    # Filter for negative excess returns (downside risk).
    downside_returns = [x for x in excess_returns if x < 0]
    # Compute the downside standard deviation.
    downside_std = my_std(downside_returns)
    # Fall back to a tiny value when there are no losses, or when all
    # losses are identical (e.g. a single losing day), to avoid dividing by zero.
    if downside_std == 0:
        downside_std = 1e-10
    # Return the ratio.
    return my_mean(excess_returns) / downside_std
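For comparison, a widely used alternative definition of downside deviation takes the root mean square of the negative excess returns over all periods, counting non-losing periods as zero. This is a different convention from the function above, not a correction of it, and `sortino_ratio_rms` is a hypothetical name. A minimal sketch:

```python
import math

def sortino_ratio_rms(returns, risk_free_rate=0.0):
    """Sortino variant: downside deviation as the RMS of negative excess returns over all periods."""
    if not returns:
        return None
    excess = [r - risk_free_rate for r in returns]
    # Periods with non-negative excess return contribute zero to downside risk.
    downside_sq = [min(x, 0.0) ** 2 for x in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    mean_excess = sum(excess) / len(excess)
    if downside_dev == 0:
        return float('inf')
    return mean_excess / downside_dev

# Hypothetical per-period returns, for illustration only.
returns = [0.02, -0.01, 0.03, -0.03]
print(f"Sortino (RMS downside): {sortino_ratio_rms(returns):.4f}")
```

One practical difference: with this convention a single losing day still produces a meaningful downside deviation, whereas the standard deviation of one observation is zero.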
Omega ratio
I like this one, not only for its name but also for its elegance. With a threshold of zero, it is:

$$\Omega = \frac{\sum_{r_i > 0} r_i}{\sum_{r_i < 0} |r_i|}$$

Beautiful! It compares the total gains to the total losses, giving an idea of whether your winning days are compensating for your losing days.
def omega_ratio(returns):
    """
    Calculate the Omega Ratio: ratio of sum of positive returns to sum of absolute negative returns.
    """
    sum_positive = sum(r for r in returns if r > 0)
    sum_negative = sum(abs(r) for r in returns if r < 0)
    if sum_negative == 0:
        return float('inf')
    return sum_positive / sum_negative
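The version above measures gains and losses relative to zero. The Omega ratio is often computed relative to a minimum acceptable return instead; the sketch below adds that as a `threshold` parameter. Both the parameter and the name `omega_ratio_threshold` are assumptions for illustration, and with `threshold=0.0` the result matches the function above.

```python
def omega_ratio_threshold(returns, threshold=0.0):
    """Omega ratio relative to a per-period threshold return."""
    # Gains: how far each winning period exceeds the threshold.
    gains = sum(r - threshold for r in returns if r > threshold)
    # Losses: how far each losing period falls short of the threshold.
    losses = sum(threshold - r for r in returns if r < threshold)
    if losses == 0:
        return float('inf')
    return gains / losses

# Hypothetical per-period returns, for illustration only.
returns = [0.02, -0.01, 0.03, -0.02]
print(f"Omega at 0%: {omega_ratio_threshold(returns):.3f}")
print(f"Omega at 1%: {omega_ratio_threshold(returns, 0.01):.3f}")
```

Raising the threshold makes the ratio stricter: a strategy that looks comfortable against zero can fall below 1 once it has to beat a required return.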