Trading Strategy Backtesting Platform Development
A backtesting platform is infrastructure for testing trading strategies on historical data. A good platform does more than replay a strategy on past data: it makes the replay as realistic as possible by accounting for commissions, slippage, and liquidity, and by guarding against data leakage.
Platform Architecture
Data Layer — historical data storage: OHLCV candles, tick data, order book snapshots. ClickHouse or Arctic (Python) for efficient access.
Backtest Engine — core simulation engine. Event-driven or vectorized architecture.
Strategy Runtime — user strategy execution environment with isolation.
Results & Reporting — metrics calculation, P&L visualization, strategy comparison.
Optimization Engine — parameter exploration (grid search, genetic algorithm, Bayesian).
Event-Driven vs Vectorized
The vectorized approach applies the entire strategy logic to a pandas DataFrame at once. It is fast (NumPy operations) but makes realistic order behavior difficult to model:
# Vectorized approach
import pandas as pd
import numpy as np

def backtest_ma_crossover(df: pd.DataFrame, fast: int, slow: int) -> pd.Series:
    fast_ma = df['close'].rolling(fast).mean()
    slow_ma = df['close'].rolling(slow).mean()
    # Signals: +1 long, -1 short, 0 (flat) until both averages are defined
    signal = np.where(fast_ma > slow_ma, 1, -1)
    warmed_up = fast_ma.notna() & slow_ma.notna()
    signal = pd.Series(signal, index=df.index).where(warmed_up, 0)
    # Returns
    returns = df['close'].pct_change()
    strategy_returns = signal.shift(1) * returns  # shift(1) = no look-ahead
    # Compound per-bar returns into a cumulative return curve
    return (1 + strategy_returns.fillna(0)).cumprod() - 1
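A vectorized backtest can still approximate trading costs by charging a fee whenever the position changes. The sketch below assumes a flat proportional cost per unit of turnover; the function name `apply_costs` and the `cost_pct` parameter are illustrative and not part of any specific platform.

```python
import pandas as pd

def apply_costs(signal: pd.Series, returns: pd.Series, cost_pct: float = 0.0005) -> pd.Series:
    """Per-bar strategy returns net of a proportional cost on position changes."""
    # Position held during each bar is the signal from the previous bar
    position = signal.shift(1).fillna(0)
    # Turnover is the absolute change in position (a flip from +1 to -1 counts as 2)
    turnover = position.diff().abs().fillna(0)
    gross = position * returns.fillna(0)
    return gross - turnover * cost_pct
```

A flip from long to short is charged twice the cost of opening a position, which is why strategies that reverse often are the most sensitive to the `cost_pct` assumption.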
The event-driven approach is a more realistic simulation. Each market event (candle, tick) is processed sequentially, which makes it possible to model partial fills, slippage, and margin calls:
class EventDrivenBacktester:
    def run(self, strategy: Strategy, data_feed: DataFeed) -> BacktestResult:
        portfolio = Portfolio(initial_cash=100_000)
        broker = SimulatedBroker(slippage_pct=0.001, commission_pct=0.0005)
        # data_feed is assumed to yield market, signal, and fill events in time order
        for event in data_feed:
            if isinstance(event, MarketEvent):
                strategy.on_market_data(event)
            elif isinstance(event, SignalEvent):
                order = strategy.generate_order(event)
                broker.submit_order(order)
            elif isinstance(event, FillEvent):
                portfolio.update(event)
                strategy.on_fill(event)
        return BacktestResult(portfolio.equity_curve, portfolio.trades)
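The loop above relies on market, signal, and fill events all arriving through one stream. One common way to arrange that is a FIFO queue that is drained after every bar. The following is a toy sketch under that assumption; the event classes are minimal stand-ins, and the rule "buy when price is below a threshold" exists only to generate events.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class MarketEvent:
    price: float

@dataclass
class SignalEvent:
    side: str  # "BUY" or "SELL"

@dataclass
class FillEvent:
    side: str
    price: float

def run_event_loop(prices: list[float], threshold: float) -> list[FillEvent]:
    """Process bars one by one; signals raised on a bar are filled on the next bar."""
    queue: deque = deque()
    pending: list[SignalEvent] = []
    fills: list[FillEvent] = []
    for price in prices:
        # Orders from the previous bar fill at this bar's price (no same-bar fills)
        for sig in pending:
            queue.append(FillEvent(sig.side, price))
        pending.clear()
        queue.append(MarketEvent(price))
        # Drain everything this bar triggered before moving to the next bar
        while queue:
            event = queue.popleft()
            if isinstance(event, MarketEvent):
                if event.price < threshold:
                    queue.append(SignalEvent("BUY"))
            elif isinstance(event, SignalEvent):
                pending.append(event)
            elif isinstance(event, FillEvent):
                fills.append(event)
    return fills
```

Deferring fills to the next bar keeps the toy loop consistent with the no-look-ahead rule: a signal computed from a bar cannot trade at that same bar's price.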
Order Execution Simulation
Realistic simulation is the key distinction between a good backtester and a poor one:
class SimulatedBroker:
    def __init__(self, slippage_pct: float = 0.001, commission_pct: float = 0.0005):
        self.slippage = slippage_pct
        self.commission = commission_pct
        self.pending_orders: list[Order] = []

    def simulate_fill(self, order: Order, bar: OHLCV) -> FillEvent | None:
        # `bar` is the first bar after the order was submitted
        if order.type == "MARKET":
            # Market order executes at the next open, with slippage against the trader
            direction = 1 if order.side == "BUY" else -1
            fill_price = bar.open * (1 + direction * self.slippage)
        elif order.type == "LIMIT":
            # Limit order executes only if the bar's range reached the limit price
            if order.side == "BUY" and bar.low <= order.price:
                fill_price = min(order.price, bar.open)  # a gap below the limit fills at the open
            elif order.side == "SELL" and bar.high >= order.price:
                fill_price = max(order.price, bar.open)  # a gap above the limit fills at the open
            else:
                return None  # not executed
        else:
            raise ValueError(f"Unsupported order type: {order.type}")
        commission = fill_price * order.quantity * self.commission
        return FillEvent(
            order_id=order.id,
            fill_price=fill_price,
            quantity=order.quantity,
            commission=commission,
            timestamp=bar.timestamp,
        )
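The broker above always fills the full order quantity, which ignores liquidity. A simple way to approximate partial fills is to cap each fill at a fraction of the bar's traded volume. This is a sketch under that assumption; `volume_capped_fill` and `max_participation` are hypothetical names, and the cap value is a modeling choice rather than a market fact.

```python
def volume_capped_fill(
    order_qty: float,
    bar_volume: float,
    max_participation: float = 0.1,
) -> tuple[float, float]:
    """Return (filled_qty, remaining_qty) when fills are capped by bar volume."""
    if not 0 < max_participation <= 1:
        raise ValueError("max_participation must be in (0, 1]")
    # At most this much of the bar's volume is assumed to be available to us
    fillable = bar_volume * max_participation
    filled = min(order_qty, fillable)
    return filled, order_qty - filled
```

The remaining quantity would stay in the broker's pending orders and be retried on the following bar.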
Backtest Metrics
def calculate_metrics(equity_curve: pd.Series, trades: list[Trade]) -> BacktestMetrics:
    returns = equity_curve.pct_change().dropna()
    annual_factor = 252  # trading days per year (assumes daily bars)
    # Basic metrics
    total_return = (equity_curve.iloc[-1] / equity_curve.iloc[0]) - 1
    annual_return = (1 + total_return) ** (annual_factor / len(returns)) - 1
    # Risk-adjusted
    sharpe = returns.mean() / returns.std() * np.sqrt(annual_factor) if returns.std() > 0 else 0
    # Guard against an empty or constant set of negative returns
    downside_std = returns[returns < 0].std()
    sortino = returns.mean() / downside_std * np.sqrt(annual_factor) if downside_std > 0 else 0
    # Drawdown
    rolling_max = equity_curve.cummax()
    drawdown = (equity_curve - rolling_max) / rolling_max
    max_drawdown = drawdown.min()
    calmar = annual_return / abs(max_drawdown) if max_drawdown != 0 else 0
    # Trade-level metrics
    winning_trades = [t for t in trades if t.pnl > 0]
    losing_trades = [t for t in trades if t.pnl < 0]
    win_rate = len(winning_trades) / len(trades) if trades else 0
    gross_profit = sum(t.pnl for t in winning_trades)
    gross_loss = abs(sum(t.pnl for t in losing_trades))
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else float('inf')
    return BacktestMetrics(
        total_return=total_return,
        annual_return=annual_return,
        sharpe_ratio=sharpe,
        sortino_ratio=sortino,
        max_drawdown=max_drawdown,
        calmar_ratio=calmar,
        win_rate=win_rate,
        profit_factor=profit_factor,
        total_trades=len(trades),
        avg_trade_pnl=sum(t.pnl for t in trades) / len(trades) if trades else 0,
    )
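The Sortino ratio above uses the standard deviation of the negative returns as its denominator. Another widely used definition is the downside deviation: the root mean square of shortfalls below a target, averaged over all periods rather than only the losing ones. A minimal sketch, with `downside_deviation` and its `target` parameter as illustrative names:

```python
import numpy as np

def downside_deviation(returns: np.ndarray, target: float = 0.0) -> float:
    """Root mean square of returns below `target`, averaged over all periods."""
    # Periods at or above the target contribute zero shortfall
    shortfall = np.minimum(np.asarray(returns, dtype=float) - target, 0.0)
    return float(np.sqrt(np.mean(shortfall ** 2)))
```

The two definitions can give noticeably different ratios when losing periods are rare, so it is worth stating which one a report uses.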
Data Leakage Protection
The most common backtesting mistake is data leakage: using future data when making decisions in the past.
Look-ahead bias occurs when a strategy uses data from the current period (e.g., the close price) to make a decision that would have had to be made before the bar closed.
# Wrong: signal and entry both use the close of the same candle
signal = df['close'].rolling(20).mean()  # value at row t includes the close of bar t
entry_price = df['close']  # entry at that same close
# Right: decide on completed bars, enter on the next bar
signal = df['close'].rolling(20).mean().shift(1)  # value at row t uses closes up to bar t-1
entry_price = df['open']  # entry at the open of bar t, the first price after the signal
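Look-ahead bias can also be checked mechanically: a signal that depends only on the past must not change when later rows are removed from the input. The helper below is a sketch of that idea; `has_lookahead` is a hypothetical name, and the check covers a single row, so in practice it would be run at several positions.

```python
import math
import pandas as pd

def has_lookahead(signal_func, df: pd.DataFrame, check_at: int) -> bool:
    """True if the signal at row `check_at` changes once later rows are dropped."""
    full = signal_func(df).iloc[check_at]
    truncated = signal_func(df.iloc[: check_at + 1]).iloc[check_at]
    # A value that exists only when future rows are present is a leak
    if pd.isna(full) or pd.isna(truncated):
        return bool(pd.isna(full) != pd.isna(truncated))
    return not math.isclose(full, truncated)

df = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
safe = lambda d: d["close"].rolling(2).mean().shift(1)   # uses completed bars only
leaky = lambda d: d["close"].shift(-1)                    # peeks at the next close
```

Here `has_lookahead(safe, df, 3)` is `False` and `has_lookahead(leaky, df, 3)` is `True`.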
Survivorship bias arises when the test universe contains only currently active symbols and excludes delisted ones, which inflates results. Use historical, point-in-time index constituent lists instead.
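A point-in-time universe can be sketched as a filter over listing records. The symbols and dates below are invented for illustration, and `universe_on` is a hypothetical helper; a real platform would load these records from its data layer.

```python
from datetime import date

# Hypothetical records: (symbol, listed_on, delisted_on or None if still trading)
LISTINGS = [
    ("AAA", date(2010, 1, 4), None),
    ("BBB", date(2012, 6, 1), date(2019, 3, 15)),
    ("CCC", date(2020, 2, 3), None),
]

def universe_on(as_of: date, listings=LISTINGS) -> list[str]:
    """Symbols that were tradable on `as_of`, including ones delisted later."""
    return [
        symbol
        for symbol, listed, delisted in listings
        if listed <= as_of and (delisted is None or as_of < delisted)
    ]
```

With these records, `universe_on(date(2015, 1, 1))` returns `["AAA", "BBB"]`: the later-delisted symbol is still part of the 2015 universe, while the symbol listed in 2020 is not.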
Parameter optimization leakage occurs when parameters are optimized and evaluated on the same data. That is curve fitting, not a test. Always reserve an out-of-sample period.
Walk-Forward Validation
def walk_forward_backtest(
    strategy_class,
    data: pd.DataFrame,
    train_period: int,  # number of rows (days for daily bars)
    test_period: int,  # number of rows (days for daily bars)
    optimization_func,
) -> list[BacktestResult]:
    results = []
    start_idx = 0
    while start_idx + train_period + test_period <= len(data):
        train_data = data.iloc[start_idx:start_idx + train_period]
        test_data = data.iloc[start_idx + train_period:start_idx + train_period + test_period]
        # Optimize parameters on train data
        best_params = optimization_func(strategy_class, train_data)
        # Test on out-of-sample
        strategy = strategy_class(**best_params)
        result = run_backtest(strategy, test_data)
        results.append(result)
        # Slide forward by one test window so test segments do not overlap
        start_idx += test_period
    return results
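To see which rows each iteration uses, the same windowing logic can be isolated into a small helper that returns index ranges only. This is an illustrative sketch; `walk_forward_windows` is not part of the function above.

```python
def walk_forward_windows(
    n_rows: int, train_period: int, test_period: int
) -> list[tuple[tuple[int, int], tuple[int, int]]]:
    """Return ((train_start, train_end), (test_start, test_end)) half-open ranges."""
    windows = []
    start = 0
    while start + train_period + test_period <= n_rows:
        train = (start, start + train_period)
        test = (start + train_period, start + train_period + test_period)
        windows.append((train, test))
        start += test_period  # slide by one test window
    return windows
```

For 10 rows with `train_period=4` and `test_period=2`, this yields three windows: train `(0, 4)` with test `(4, 6)`, train `(2, 6)` with test `(6, 8)`, and train `(4, 8)` with test `(8, 10)`. The test ranges are contiguous and never overlap, while the training ranges do.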
Parameter Optimization
Grid search, genetic algorithms, and Bayesian optimization are different approaches to finding good parameters. The key principle is the same for all of them: optimize on the training set, evaluate on the test set, never the reverse.
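Grid search is the simplest of the three and can be sketched with the standard library alone. The example assumes a caller-supplied `score_func` that runs a backtest on the training data only and returns a number to maximize; both that function and the `grid_search` helper are hypothetical.

```python
from itertools import product

def grid_search(score_func, param_grid: dict[str, list]) -> tuple[dict, float]:
    """Evaluate every parameter combination and return the best one with its score."""
    names = list(param_grid)
    best_params: dict = {}
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_func(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The number of evaluations is the product of the list lengths, so the grid grows quickly with each added parameter. That growth is the usual reason to move to genetic or Bayesian search.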
For a platform that runs user strategies, use a task queue (Celery, RQ) to distribute backtest execution across multiple workers. A single complex backtest (1 year of data, thousands of parameter combinations) can take hours, so asynchronous execution with completion notification is mandatory.
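Celery and RQ both need a message broker and separate worker processes, so the sketch below uses the standard library's `concurrent.futures` as a local stand-in for the same submit-then-notify pattern. `run_one` is a placeholder that returns a fake score instead of running a backtest, and `on_done` is a hypothetical notification hook.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_one(params: dict) -> dict:
    """Placeholder for a single backtest run; the score is fake."""
    return {"params": params, "score": params["fast"] / params["slow"]}

def run_batch(param_sets: list[dict], max_workers: int = 4, on_done=None) -> list[dict]:
    """Submit all runs, then collect results in completion order."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_one, params) for params in param_sets]
        for future in as_completed(futures):
            result = future.result()
            if on_done is not None:
                on_done(result)  # e.g. push a notification to the user
            results.append(result)
    return results
```

Threads do not speed up CPU-bound Python code, so a real deployment would run each backtest in a separate process or on a queue worker; only the submit, collect, and notify flow carries over.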







