MTUM/USMV Momentum vs Low-Volatility Factor Strategy

Row-Level Dual-Model with Bank-Size, Gold, and Financials Signals, 10-Day Hold

Author: Rusty Conover
Affiliation: Query.Farm
Published: April 16, 2026

Show code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

import sys
sys.path.insert(0, '/Users/rusty/Development/trading')
from farm_theme import apply as apply_farm_theme, palette
apply_farm_theme()

df = pd.read_csv('strategy_data.csv', parse_dates=['dt'])
df = df.sort_values('dt').reset_index(drop=True)

capital = 10000
ret_col = 'daily_ret_unscaled'
df['cum_pnl'] = (df[ret_col] * capital).cumsum()
df['drawdown'] = df['cum_pnl'] - df['cum_pnl'].cummax()
df['year'] = df['dt'].dt.year

mtum = pd.read_csv('MTUM.csv', parse_dates=['Date'])
usmv = pd.read_csv('USMV.csv', parse_dates=['Date'])
prices = mtum[['Date','close']].rename(columns={'close':'mtum_close'}).merge(
    usmv[['Date','close']].rename(columns={'close':'usmv_close'}), on='Date')
prices = prices.sort_values('Date').reset_index(drop=True)
prices['spread_ratio'] = prices['mtum_close'] / prices['usmv_close']
prices = prices[prices['Date'] >= '2020-01-01']

Executive Summary

This document presents a systematic pairs trading strategy on MTUM (iShares MSCI USA Momentum Factor ETF) vs USMV (iShares MSCI USA Min Vol Factor ETF). The strategy predicts 10-day forward returns of the MTUM-USMV spread using bank-size, gold, and financials signals.

This is the second-highest-PnL strategy in the portfolio and is profitable in every calendar year tested.

Key Metrics (2020–2026)
Metric Value
Sharpe Ratio 5.63
Ann. Return 374.0%
Total P&L $37,398 on $10K
Direction Accuracy 65.7%
Max Drawdown -$3,458
Years Profitable 7 / 7
Post-10bps Sharpe 5.08

1. Strategy Overview

1.1 Economic Rationale

MTUM and USMV represent opposite ends of the equity factor spectrum. MTUM picks the highest-trailing-momentum stocks (procyclical, beta > 1, often growth/tech-heavy); USMV picks the lowest-realized-volatility stocks (defensive, beta < 1, utilities/staples-heavy). The factor rotation between them is a slow-moving, multi-week cycle driven by:

  1. Bank-size spread (KRE-XLF): Regional banks (KRE) vs broad financials (XLF) is a clean read on credit-cycle stress. When KRE underperforms XLF, regional-bank stress is rising and markets shift toward a growth bias; the momentum factor (often growth/tech-heavy) then tends to outperform low-vol (utilities/staples-heavy).

  2. Gold daily return: Gold is a clean inflation/real-rate proxy. Gold rallies usually coincide with falling real rates, which boost momentum stocks (long-duration cash flows) over low-vol stocks (short-duration utility/staples).

  3. Financials (XLF) daily return: XLF is a yield-curve and credit-cycle proxy. Financials outperforming usually means rates rising and credit healthy — a regime where momentum factor outperforms low-vol.

  4. 10-day holding period: Factor rotation is fundamentally a multi-week phenomenon. Daily holds capture pure noise (Sharpe -0.34 net), while 10-day holds capture the actual factor cycle (Sharpe 5.08 net).

1.2 Why Long Holds Are Critical

The horizon dependence is dramatic — daily and 2-day holds are unprofitable after costs:

Hold Period Net Sharpe (post-10bps)
10-day 5.08
5-day 3.55
3-day 0.66
2-day 0.28
Daily -0.34

This is why earlier factor-rotation pair-trading attempts often fail — they use the wrong holding period. The factor cycle takes 1-2 weeks to play out.
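The horizon dependence follows largely from cost amortization: each entry pays one round trip of costs regardless of hold length, so the per-day cost drag shrinks as the hold extends. A minimal sketch (the 10 bps round-trip figure comes from the tables above; the helper function is illustrative):

```python
# Illustrative cost amortization: why longer holds survive a fixed
# round-trip cost. The helper is a sketch, not backtest code.
round_trip_cost = 0.0010   # 10 bps per entry/exit pair

def daily_cost_drag(hold_days: int, cost: float = round_trip_cost) -> float:
    """Transaction cost amortized over each day the position is held."""
    return cost / hold_days

for hold in [1, 2, 3, 5, 10]:
    print(f"{hold:>2}-day hold: {daily_cost_drag(hold) * 1e4:.1f} bps/day cost drag")
```

A daily hold pays the full 10 bps against one day of expected edge; a 10-day hold pays 1 bp per day of exposure, which is why the net Sharpe survives only at longer horizons.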

1.3 Features (4 inputs)

Feature Rationale
spread MTUM - USMV daily log return
bank_size_spread KRE - XLF – regional bank stress indicator
gld_ret Gold daily return – inflation/real-rate proxy
xlf_ret XLF daily return – yield-curve / credit-cycle proxy
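As a sketch, the four features can be derived from daily closes like this (the input column names mtum, usmv, kre, xlf, gld are assumptions for illustration, not the production schema):

```python
import numpy as np
import pandas as pd

def build_features(px: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the four model inputs from a DataFrame of daily close
    prices with (assumed) columns mtum, usmv, kre, xlf, gld."""
    logret = np.log(px).diff()                                 # daily log returns
    feats = pd.DataFrame(index=px.index)
    feats['spread'] = logret['mtum'] - logret['usmv']          # MTUM - USMV
    feats['bank_size_spread'] = logret['kre'] - logret['xlf']  # KRE - XLF
    feats['gld_ret'] = logret['gld']                           # inflation/real-rate proxy
    feats['xlf_ret'] = logret['xlf']                           # credit-cycle proxy
    return feats.dropna()                                      # first row has no return
```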

1.4 Position Sizing and Holding

Enter when the model predicts a 10-day spread move exceeding 0.8%. Hold for 10 trading days. Each entry incurs one round-trip of transaction costs for 10 days of exposure.
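The entry-and-hold mechanics described above can be sketched as a small state machine; this is an illustrative reconstruction of the rules, not the production backtest code:

```python
def positions_from_predictions(preds, min_move=0.008, hold=10):
    """Turn daily 10-day-ahead spread predictions into a position series.

    Enter +1/-1 when |prediction| exceeds min_move and no position is open;
    hold the position for `hold` trading days; otherwise stay flat.
    """
    pos, open_until = [], -1
    for t, p in enumerate(preds):
        if t < open_until:
            pos.append(pos[-1])              # still inside the hold window
        elif abs(p) > min_move:
            pos.append(1 if p > 0 else -1)   # new entry, signed by prediction
            open_until = t + hold            # lock the position for `hold` days
        else:
            pos.append(0)                    # no trade: predicted move too small
    return pos
```

Binary sizing (100% of capital per position) matches the parameters table in Section 8.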

2. Performance Analysis

2.1 P&L and Spread

Show code
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(10, 9), sharex=True,
                                     gridspec_kw={'height_ratios': [2, 1.5, 1.5]})

ax1.plot(df['dt'], df['cum_pnl'], color='#1565C0', linewidth=1.5)
ax1.fill_between(df['dt'], 0, df['cum_pnl'], alpha=0.1, color='#1565C0')
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.set_ylabel('Cumulative P&L ($)')
ax1.set_title('Cumulative P&L ($10K Capital)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

ax2.plot(prices['Date'], prices['mtum_close'], color='#1565C0', linewidth=1, label='MTUM')
ax2.plot(prices['Date'], prices['usmv_close'], color='#E65100', linewidth=1, label='USMV')
ax2.set_ylabel('Price ($)')
ax2.set_title('MTUM and USMV Prices')
ax2.legend(loc='upper left', fontsize=9)

ax3.plot(prices['Date'], prices['spread_ratio'], color='#2E7D32', linewidth=1)
ax3.axhline(y=prices['spread_ratio'].mean(), color='gray', linewidth=0.5, linestyle='--',
            label=f'Mean: {prices["spread_ratio"].mean():.3f}')
ax3.set_ylabel('MTUM / USMV')
ax3.set_title('Spread Ratio')
ax3.legend(loc='upper left', fontsize=9)

for ax in [ax1, ax2, ax3]:
    for yr in range(df['dt'].dt.year.min(), df['dt'].dt.year.max() + 2):
        ax.axvline(x=pd.Timestamp(f'{yr}-01-01'), color='gray', linewidth=0.3, linestyle=':')

ax3.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 1: Cumulative P&L (top), MTUM and USMV prices (middle), and spread ratio (bottom)

2.2 Drawdown

Show code
fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
ax.fill_between(df['dt'], df['drawdown'], 0, color='#E53935', alpha=0.4)
ax.set_ylabel('Drawdown ($)')
ax.set_title(f'Drawdown — Max: ${df["drawdown"].min():,.0f}')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))
ax.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 2: Underwater equity curve

2.3 Yearly Performance

Show code
df2020 = df[df['dt'] >= '2020-01-01']
yearly = df2020.groupby('year').agg(
    traded=('active', 'sum'),
    pnl=(ret_col, lambda x: (x * capital).sum()),
    ret_mean=(ret_col, lambda x: x[x != 0].mean() if (x != 0).any() else 0),
    ret_std=(ret_col, lambda x: x[x != 0].std() if (x != 0).sum() > 1 else 1),
).reset_index()
yearly['sharpe'] = yearly['ret_mean'] / yearly['ret_std'] * np.sqrt(252)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)
colors = ['#E53935' if p < 0 else '#43A047' for p in yearly['pnl']]
ax1.bar(yearly['year'], yearly['pnl'], color=colors, alpha=0.7)
ax1.axhline(y=0, color='gray', linewidth=0.5)
ax1.set_title('Yearly P&L')
ax1.set_ylabel('P&L ($)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

colors_s = ['#E53935' if s < 0 else '#43A047' for s in yearly['sharpe']]
ax2.bar(yearly['year'], yearly['sharpe'], color=colors_s, alpha=0.7)
ax2.axhline(y=0, color='gray', linewidth=0.5)
ax2.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax2.set_title('Yearly Sharpe Ratio')
ax2.set_ylabel('Sharpe')
plt.show()
Figure 3: Yearly P&L and Sharpe ratios – profitable in every year tested

2.4 Monthly Returns Heatmap

Show code
df2020 = df[df['dt'] >= '2020-01-01'].copy()
df2020['month'] = df2020['dt'].dt.month
df2020['yr'] = df2020['dt'].dt.year
monthly = df2020.groupby(['yr', 'month']).agg(pnl=(ret_col, lambda x: (x * capital).sum())).reset_index()
pivot = monthly.pivot(index='yr', columns='month', values='pnl').fillna(0)

fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
im = ax.imshow(pivot.values, cmap='RdYlGn', aspect='auto', vmin=-2500, vmax=2500)
ax.set_xticks(range(12))
ax.set_xticklabels(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])
ax.set_yticks(range(len(pivot.index)))
ax.set_yticklabels(pivot.index)
ax.set_title('Monthly P&L Heatmap')
for i in range(len(pivot.index)):
    for j in range(12):
        val = pivot.values[i, j]
        if abs(val) > 10:
            color = 'white' if abs(val) > 1200 else 'black'
            ax.text(j, i, f'${val:.0f}', ha='center', va='center', fontsize=8, color=color)
plt.colorbar(im, ax=ax, label='P&L ($)', shrink=0.8)
plt.show()
Figure 4: Monthly P&L heatmap

3. Risk Analysis

3.1 Return Distribution

Show code
from scipy import stats

traded = df2020[df2020['active'] == 1]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)
rets = traded[ret_col] * 100
ax1.hist(rets, bins=50, color='#1565C0', alpha=0.7, edgecolor='white', linewidth=0.3)
ax1.axvline(x=rets.mean(), color='red', linewidth=1, linestyle='--', label=f'Mean: {rets.mean():.3f}%')
ax1.set_title('Return Distribution (10-day holds)')
ax1.set_xlabel('Return (%)')
ax1.legend()
stats.probplot(rets.dropna(), dist="norm", plot=ax2)
ax2.set_title('Q-Q Plot vs Normal')
ax2.get_lines()[0].set_markerfacecolor('#1565C0')
ax2.get_lines()[0].set_markersize(3)
plt.show()
Figure 5: Return distribution (10-day holds)

3.2 Rolling Metrics

Show code
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6), constrained_layout=True, sharex=True)
roll_mean = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].mean() if (x!=0).any() else 0)
roll_std = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].std() if (x!=0).sum() > 5 else np.nan)
rolling_sharpe = roll_mean / roll_std * np.sqrt(252)
ax1.plot(df2020['dt'], rolling_sharpe, color='#43A047', linewidth=1)
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax1.set_title('Rolling 63-day Sharpe Ratio')
ax1.set_ylabel('Sharpe')
ax1.set_ylim(-10, 25)

df2020_copy = df2020.copy()
df2020_copy['correct'] = (df2020_copy['active'] == 1) & (np.sign(df2020_copy['pred']) == np.sign(df2020_copy['spread_ret']))
rolling_acc = df2020_copy['correct'].rolling(63).mean() * 100
ax2.plot(df2020['dt'], rolling_acc, color='#FF8F00', linewidth=1)
ax2.axhline(y=50, color='gray', linewidth=0.5, linestyle='--')
ax2.set_title('Rolling 63-day Direction Accuracy')
ax2.set_ylabel('Accuracy (%)')
ax2.set_xlim(df2020['dt'].min(), df2020['dt'].max())
plt.show()
Figure 6: Rolling Sharpe and accuracy

4. Detailed Statistics

Show code
traded = df2020[df2020['active'] == 1]
total_pnl = (df2020[ret_col] * capital).sum()
sharpe = traded[ret_col].mean() / traded[ret_col].std() * np.sqrt(252)
downside = traded.loc[traded[ret_col] < 0, ret_col]
sortino = traded[ret_col].mean() / np.sqrt((downside**2).mean()) * np.sqrt(252)
max_dd = df2020['drawdown'].min()
wins = traded[traded[ret_col] > 0][ret_col]
losses = traded[traded[ret_col] < 0][ret_col]

stats_dict = {
    'Period': f'{df2020["dt"].min().strftime("%Y-%m-%d")} to {df2020["dt"].max().strftime("%Y-%m-%d")}',
    'Traded Periods': len(traded),
    'Total P&L': f'${total_pnl:,.0f}',
    'Sharpe Ratio': f'{sharpe:.2f}',
    'Sortino Ratio': f'{sortino:.2f}',
    'Max Drawdown': f'${max_dd:,.0f}',
    'MAR Ratio': f'{traded[ret_col].mean() * 252 / abs(max_dd / capital):.2f}',
    'Direction Accuracy': f'{(np.sign(traded["pred"]) == np.sign(traded["spread_ret"])).mean()*100:.1f}%',
    'Win/Loss Ratio': f'{abs(wins.mean()/losses.mean()):.2f}',
    'Holding Period': '10 trading days',
    'p/n Ratio': '0.02 (4 dims / 190 samples)',
}
pd.DataFrame(list(stats_dict.items()), columns=['Metric', 'Value']).style.hide(axis='index')
Table 1
Metric Value
Period 2020-01-02 to 2026-04-07
Traded Periods 364
Total P&L $37,398
Sharpe Ratio 5.63
Sortino Ratio 6.68
Max Drawdown $-3,458
MAR Ratio 7.49
Direction Accuracy 65.7%
Win/Loss Ratio 1.39
Holding Period 10 trading days
p/n Ratio 0.02 (4 dims / 190 samples)
Show code
yearly_data = []
for yr in sorted(df2020['year'].unique()):
    ydf = df2020[df2020['year'] == yr]
    yt = ydf[ydf['active'] == 1]
    if len(yt) == 0:
        continue
    pnl = (ydf[ret_col] * capital).sum()
    s = yt[ret_col].mean() / yt[ret_col].std() * np.sqrt(252) if yt[ret_col].std() > 0 else 0
    ds = yt.loc[yt[ret_col] < 0, ret_col]
    so = yt[ret_col].mean() / np.sqrt((ds**2).mean()) * np.sqrt(252) if len(ds) > 0 else 0
    acc = (np.sign(yt['pred']) == np.sign(yt['spread_ret'])).mean() * 100
    yearly_data.append({
        'Year': yr, 'Traded': len(yt), 'Sat Out': len(ydf) - len(yt),
        'Accuracy': f'{acc:.1f}%', 'P&L': f'${pnl:,.0f}',
        'Sharpe': f'{s:.2f}', 'Sortino': f'{so:.2f}'
    })
pd.DataFrame(yearly_data).style.hide(axis='index')
Table 2
Year Traded Sat Out Accuracy P&L Sharpe Sortino
2020 48 62 68.8% $3,332 4.47 4.22
2021 70 42 64.3% $8,270 7.02 12.22
2022 63 48 60.3% $2,300 2.52 2.45
2023 48 58 54.2% $579 0.97 0.93
2024 66 46 72.7% $9,170 6.56 6.47
2025 52 59 63.5% $6,507 6.54 7.55
2026 17 12 94.1% $7,240 15.71 20.34

5. Strategy Construction

5.1 Model Architecture

The strategy is a row-level dual-model. Each day, a logistic regression classifier estimates the probability that the 10-day forward spread return is positive, while a ridge regression estimates its magnitude. A position is taken only when the classifier's confidence exceeds 0.60 and the predicted magnitude exceeds 0.8%; otherwise the strategy sits out. Both models are refit on a rolling 200-day window with standardized inputs.

5.2 Model Code

class Aggregate:
    @staticmethod
    def finalize(table, params):
        """VGI aggregate finalizer: receives the rolling training window as an
        Arrow table, fits a direction classifier and a magnitude regressor on it,
        and returns the signed predicted 10-day move (0.0 / None means sit out)."""
        if table.num_rows < 2:
            return None
        data = table.to_pandas().values.astype(np.float64)
        n, nc = data.shape
        seed = int(params.get('seed', 42))
        conf_thresh = params.get('conf', 0.60)
        min_move = params.get('min_move', 0.008)
        fc = int(params.get('fwd_col', nc - 1))
        hold = int(params.get('hold', 10))

        if n < 10 + hold:
            return None

        X = data[:-(hold), :fc]
        y_ret = data[hold:, fc]

        if np.any(np.isnan(X)) or np.any(np.isnan(y_ret)):
            return 0.0

        y_dir = (y_ret > 0).astype(int)
        last = data[-1:, :fc]

        from sklearn.linear_model import LogisticRegression, Ridge
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        if len(set(y_dir)) < 2:
            return 0.0

        clf = make_pipeline(
            StandardScaler(),
            LogisticRegression(C=0.1, max_iter=1000, random_state=seed)
        )
        clf.fit(X, y_dir)
        prob_up = clf.predict_proba(last)[0][1]

        reg = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        reg.fit(X, y_ret)
        pred_mag = abs(float(reg.predict(last)[0]))

        if pred_mag < min_move:
            return 0.0
        if prob_up > conf_thresh:
            return pred_mag
        elif prob_up < (1.0 - conf_thresh):
            return -pred_mag
        else:
            return 0.0
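The final gating step of finalize reduces to a pure function of the classifier's confidence and the regressor's magnitude, which makes it easy to unit-test in isolation. A minimal restatement (defaults taken from the parameters in Section 8):

```python
def gate_signal(prob_up: float, pred_mag: float,
                conf_thresh: float = 0.60, min_move: float = 0.008) -> float:
    """Combine direction confidence and predicted magnitude, mirroring the
    tail of Aggregate.finalize: trade only when the predicted move is large
    enough AND the direction call is confident; otherwise sit out."""
    if pred_mag < min_move:
        return 0.0                  # move too small to cover round-trip costs
    if prob_up > conf_thresh:
        return pred_mag             # confident long
    if prob_up < 1.0 - conf_thresh:
        return -pred_mag            # confident short
    return 0.0                      # direction too uncertain
```

Both gates must pass: a large predicted move with a 50/50 direction call is skipped, as is a confident call on a sub-threshold move.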

6. Portfolio Role: Factor Rotation

MTUM/USMV adds a US equity factor-rotation signal that’s mostly uncorrelated with the other 7 sleeves. Drawdown correlations:

Pair Drawdown Corr
vs XME/DBB (metals) +0.31
vs GDX/GLD (gold) -0.00
vs XLE/USO (energy) +0.05
vs EFA/SPY (intl equity) +0.09
vs XLF/XLY (sector rotation) +0.07
vs LMT/RTX (defense) +0.11
vs TLT/HYG (rates/credit) -0.07

The largest correlation is +0.31 with XME/DBB — both have macro/momentum sensitivity. All other correlations are within ±0.11. The standalone Sharpe of 5.08 is so strong that even this modest diversification adds substantial portfolio value.

7. Limitations and Risks

  1. 2023 was a near-miss year ($579 P&L, Sharpe 0.97). The model can have lean years; these are tolerable when other sleeves are paying.

  2. High dollar P&L but the largest open exposure in the portfolio: the 10-day hold means the strategy is in a position roughly 80% of the time, so capital is consistently deployed (versus daily-hold sleeves that are flat on more than half of days).

  3. 364 trades over 6.3 years (~58/year). With 364 trades, the 95% confidence interval around the 65.7% direction accuracy is roughly 61-71%.

  4. The in-sample Sharpe of 5.63 is suspiciously high; the configuration was selected via grid search. True out-of-sample Sharpe is likely lower; expect 3-4 in production.

  5. Factor regime risk: The 2020-2025 backtest captures one major tech-led momentum cycle. A prolonged value-led or defensive-led market could weaken the bank_size_spread and gld_ret signals.

  6. MTUM rebalance cadence: MTUM rebalances semi-annually based on 6-12 month momentum; large rebalance turnover can create temporary spread shocks not captured by daily features.

  7. Seed sensitivity: none; the pipeline is deterministic with fixed seeds.
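The ~61-71% interval quoted in item 3 is a standard normal-approximation binomial confidence interval; a quick check:

```python
import math

def accuracy_ci(p_hat: float, n: int, z: float = 1.96):
    """95% normal-approximation confidence interval for a hit rate
    p_hat observed over n independent trades."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of a proportion
    return p_hat - z * se, p_hat + z * se

lo, hi = accuracy_ci(0.657, 364)
print(f"95% CI: {lo:.1%} to {hi:.1%}")   # roughly 61% to 71%, matching item 3
```

The lower bound stays comfortably above 50%, so the directional edge is statistically distinguishable from coin-flipping even at this trade count.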

8. Reproducibility

bash scripts/run_backtest.sh
bash tests/test_backtest.sh

Parameters

Parameter Value
Training window 200 days
Confidence threshold 0.60
Min predicted move 0.008 (0.8% over 10 days)
Holding period 10 trading days
Position sizing Binary (100%)
Gates None
LogReg C 0.1
Ridge alpha 1.0
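For reference, these parameters map directly onto the keys read via params.get(...) in Aggregate.finalize; a caller might assemble them like this (the runner that consumes the dict is not shown in this document):

```python
# Keys match those read in Aggregate.finalize; values come from the
# parameters table above. The surrounding VGI runner is not shown here.
params = {
    'seed': 42,          # random_state for the logistic regression
    'conf': 0.60,        # direction-confidence threshold
    'min_move': 0.008,   # minimum predicted 10-day move (0.8%)
    'hold': 10,          # holding period in trading days
}
```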

This research was created with DuckDB and VGI, an upcoming DuckDB extension from Query.Farm that allows custom aggregate functions to be written in any language with an Apache Arrow implementation.