TLT/HYG Long Treasury vs High-Yield Strategy

Row-Level Dual-Model with Credit, Mean-Reversion, and Yield-Curve Signals, 10-Day Hold

Author: Rusty Conover
Affiliation: Query.Farm
Published: April 16, 2026

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

import sys
sys.path.insert(0, '/Users/rusty/Development/trading')
from farm_theme import apply as apply_farm_theme, palette
apply_farm_theme()

df = pd.read_csv('strategy_data.csv', parse_dates=['dt'])
df = df.sort_values('dt').reset_index(drop=True)

capital = 10000
ret_col = 'daily_ret_unscaled'
df['cum_pnl'] = (df[ret_col] * capital).cumsum()
df['drawdown'] = df['cum_pnl'] - df['cum_pnl'].cummax()
df['year'] = df['dt'].dt.year

tlt = pd.read_csv('TLT.csv', parse_dates=['Date'])
hyg = pd.read_csv('HYG.csv', parse_dates=['Date'])
prices = tlt[['Date','close']].rename(columns={'close':'tlt_close'}).merge(
    hyg[['Date','close']].rename(columns={'close':'hyg_close'}), on='Date')
prices = prices.sort_values('Date').reset_index(drop=True)
prices['spread_ratio'] = prices['tlt_close'] / prices['hyg_close']
prices = prices[prices['Date'] >= '2020-01-01']

Executive Summary

This document presents a systematic pairs trading strategy on TLT (iShares 20+ Year Treasury Bond ETF) vs HYG (iShares iBoxx High-Yield Corporate Bond ETF). The strategy predicts 10-day forward returns of the TLT-HYG spread using credit positioning, mean-reversion, and yield-curve cycle signals.

This is the highest-PnL strategy in the portfolio and is profitable in every calendar year tested (2020 through early April 2026; 2026 is a partial year).

Key Metrics (2020–2026)

| Metric | Value |
|---|---|
| Sharpe Ratio | 6.49 |
| Total Return | 429.9% |
| Total P&L | $42,987 on $10K |
| Direction Accuracy | 68.0% |
| Max Drawdown | -$3,103 |
| Years Profitable | 7 / 7 |
| Post-10bps Sharpe | 5.86 |

1. Strategy Overview

1.1 Economic Rationale

TLT and HYG are both fixed-income but represent opposite ends of the credit/duration spectrum:

  • TLT = pure duration risk, zero credit risk (long-duration US Treasuries)
  • HYG = high credit risk, moderate duration (high-yield corporate bonds)

They diverge when the market shifts between flight-to-safety (TLT outperforms — duration rallies, credit widens) and risk-on (HYG outperforms — credit tightens, rates back up). The pair captures the credit-cycle vs rate-cycle dynamic that drives most of fixed-income macro.

The strategy uses three signals:

  1. Credit positioning (PFF-HYG): Preferred shares (PFF) sit between investment-grade and high-yield in the capital stack. The PFF-HYG spread is a clean read on credit-cycle positioning that leads broader credit moves by 1-2 weeks. When PFF outperforms HYG, credit risk-off is starting (TLT will rally vs HYG).

  2. Mean reversion (60d cumulative-ratio z-score): TLT-HYG cumulative spreads have strong mean-reverting properties on multi-week scales. When the cumulative spread is multiple standard deviations from its 60-day mean, reversion typically pays.

  3. Yield-curve cycle (120d TLT-IEF momentum): The yield-curve regime (long-end vs intermediate Treasuries) drives the broader bond market. The 120-day curve trend captures the slow-moving rate-cycle dynamic that affects TLT directly and HYG indirectly through corporate refinancing.

1.2 Why 10-Day Holds

Fixed-income macro signals operate on weekly to monthly cycles, not daily. The horizon dependence is dramatic:

| Hold Period | Net Sharpe (post-10bps) |
|---|---|
| 10-day | 5.86 |
| 5-day | 3.52 |
| 3-day | 3.30 |
| 2-day | 0.87 |
| Daily | 0.40 |

Daily and 2-day holds mostly capture noise; 10-day holds capture the credit-cycle signal. This may be why pair-trading attempts on fixed income often fail: the holding period is too short for the signal being traded.

1.3 Features (4 inputs)

| Feature | Rationale |
|---|---|
| spread | TLT - HYG daily log return |
| pff_vs_hyg | PFF return - HYG return – credit positioning leader |
| ratio_zscore60 | 60d z-score of cumulative spread – mean reversion |
| curve_mom120 | 120d cumulative TLT-IEF return – yield-curve cycle |
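
The four inputs above can be sketched in pandas as follows. This is a minimal illustration, not the author's pipeline: the input layout (a DataFrame of daily closes with columns `TLT`, `HYG`, `PFF`, `IEF`), the use of log returns for every leg, and the helper name `build_features` are all assumptions.

```python
import numpy as np
import pandas as pd

def build_features(close: pd.DataFrame) -> pd.DataFrame:
    """Build the four model inputs from daily closes.

    `close` is assumed to have columns 'TLT', 'HYG', 'PFF', 'IEF'.
    """
    logret = np.log(close).diff()  # daily log returns per ticker
    out = pd.DataFrame(index=close.index)

    # TLT - HYG daily log return
    out['spread'] = logret['TLT'] - logret['HYG']

    # PFF return minus HYG return (credit positioning)
    out['pff_vs_hyg'] = logret['PFF'] - logret['HYG']

    # 60-day z-score of the cumulative spread (mean reversion)
    cum = out['spread'].cumsum()
    roll = cum.rolling(60)
    out['ratio_zscore60'] = (cum - roll.mean()) / roll.std()

    # 120-day cumulative TLT - IEF return (yield-curve cycle)
    out['curve_mom120'] = (logret['TLT'] - logret['IEF']).rolling(120).sum()
    return out
```

The first 120 rows contain NaNs because the longest lookback needs 120 return observations; they would have to be dropped before model fitting.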

1.4 Position Sizing and Holding

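Enter when the regression model predicts a 10-day spread move of at least 0.8% in absolute terms and the classifier's directional probability clears the 0.60 confidence threshold. Hold for 10 trading days. Each entry incurs one round trip of transaction costs for 10 days of exposure, a 70%+ reduction in cost drag vs a daily strategy.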
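
To make the 10-day target and the 0.8% entry gate concrete, here is a small sketch. The helper names `forward_return` and `passes_min_move` are hypothetical, and the assumption that the target is the sum of the next 10 daily log spread returns is mine; the source only says the model predicts 10-day forward spread returns.

```python
import numpy as np
import pandas as pd

def forward_return(spread: pd.Series, hold: int = 10) -> pd.Series:
    """Sum of the next `hold` daily log spread returns (rows t+1 .. t+hold)."""
    trailing = spread.rolling(hold).sum()  # return realized over the last `hold` rows
    return trailing.shift(-hold)           # align it to the row where the trade is entered

def passes_min_move(pred_move: float, min_move: float = 0.008) -> bool:
    """Magnitude gate: trade only if the predicted 10-day move is at least `min_move`."""
    return abs(pred_move) >= min_move
```

The last `hold` rows of the forward-return series are NaN because their outcome is not yet known, which is why those rows can be predicted but not used for training.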

2. Performance Analysis

2.1 P&L and Spread

fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(10, 9), sharex=True,
                                     gridspec_kw={'height_ratios': [2, 1.5, 1.5]})

ax1.plot(df['dt'], df['cum_pnl'], color='#1565C0', linewidth=1.5)
ax1.fill_between(df['dt'], 0, df['cum_pnl'], alpha=0.1, color='#1565C0')
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.set_ylabel('Cumulative P&L ($)')
ax1.set_title('Cumulative P&L ($10K Capital)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

ax2.plot(prices['Date'], prices['tlt_close'], color='#1565C0', linewidth=1, label='TLT')
ax2.plot(prices['Date'], prices['hyg_close'], color='#E65100', linewidth=1, label='HYG')
ax2.set_ylabel('Price ($)')
ax2.set_title('TLT and HYG Prices')
ax2.legend(loc='upper left', fontsize=9)

ax3.plot(prices['Date'], prices['spread_ratio'], color='#2E7D32', linewidth=1)
ax3.axhline(y=prices['spread_ratio'].mean(), color='gray', linewidth=0.5, linestyle='--',
            label=f'Mean: {prices["spread_ratio"].mean():.3f}')
ax3.set_ylabel('TLT / HYG')
ax3.set_title('Spread Ratio')
ax3.legend(loc='upper left', fontsize=9)

for ax in [ax1, ax2, ax3]:
    for yr in range(df['dt'].dt.year.min(), df['dt'].dt.year.max() + 2):
        ax.axvline(x=pd.Timestamp(f'{yr}-01-01'), color='gray', linewidth=0.3, linestyle=':')

ax3.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 1: Cumulative P&L (top), TLT and HYG prices (middle), and spread ratio (bottom)

2.2 Drawdown

fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
ax.fill_between(df['dt'], df['drawdown'], 0, color='#E53935', alpha=0.4)
ax.set_ylabel('Drawdown ($)')
ax.set_title(f'Drawdown — Max: ${df["drawdown"].min():,.0f}')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))
ax.set_xlim(df['dt'].min(), df['dt'].max())
plt.show()
Figure 2: Underwater equity curve

2.3 Yearly Performance

df2020 = df[df['dt'] >= '2020-01-01']

yearly = df2020.groupby('year').agg(
    traded=('active', 'sum'),
    pnl=(ret_col, lambda x: (x * capital).sum()),
    ret_mean=(ret_col, lambda x: x[x != 0].mean() if (x != 0).any() else 0),
    ret_std=(ret_col, lambda x: x[x != 0].std() if (x != 0).sum() > 1 else 1),
).reset_index()
yearly['sharpe'] = yearly['ret_mean'] / yearly['ret_std'] * np.sqrt(252)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)
colors = ['#E53935' if p < 0 else '#43A047' for p in yearly['pnl']]
ax1.bar(yearly['year'], yearly['pnl'], color=colors, alpha=0.7)
ax1.axhline(y=0, color='gray', linewidth=0.5)
ax1.set_title('Yearly P&L')
ax1.set_ylabel('P&L ($)')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))

colors_s = ['#E53935' if s < 0 else '#43A047' for s in yearly['sharpe']]
ax2.bar(yearly['year'], yearly['sharpe'], color=colors_s, alpha=0.7)
ax2.axhline(y=0, color='gray', linewidth=0.5)
ax2.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax2.set_title('Yearly Sharpe Ratio')
ax2.set_ylabel('Sharpe')
plt.show()
Figure 3: Yearly P&L and Sharpe ratios – profitable in every year tested

2.4 Monthly Returns Heatmap

df2020 = df[df['dt'] >= '2020-01-01'].copy()
df2020['month'] = df2020['dt'].dt.month
df2020['yr'] = df2020['dt'].dt.year
monthly = df2020.groupby(['yr', 'month']).agg(pnl=(ret_col, lambda x: (x * capital).sum())).reset_index()
pivot = monthly.pivot(index='yr', columns='month', values='pnl').fillna(0)

fig, ax = plt.subplots(figsize=(10, 4), constrained_layout=True)
im = ax.imshow(pivot.values, cmap='RdYlGn', aspect='auto', vmin=-2500, vmax=2500)
ax.set_xticks(range(12))
ax.set_xticklabels(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])
ax.set_yticks(range(len(pivot.index)))
ax.set_yticklabels(pivot.index)
ax.set_title('Monthly P&L Heatmap')
for i in range(len(pivot.index)):
    for j in range(12):
        val = pivot.values[i, j]
        if abs(val) > 10:
            color = 'white' if abs(val) > 1200 else 'black'
            ax.text(j, i, f'${val:.0f}', ha='center', va='center', fontsize=8, color=color)
plt.colorbar(im, ax=ax, label='P&L ($)', shrink=0.8)
plt.show()
Figure 4: Monthly P&L heatmap

3. Risk Analysis

3.1 Return Distribution

traded = df2020[df2020['active'] == 1]
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), constrained_layout=True)
rets = traded[ret_col] * 100
ax1.hist(rets, bins=50, color='#1565C0', alpha=0.7, edgecolor='white', linewidth=0.3)
ax1.axvline(x=rets.mean(), color='red', linewidth=1, linestyle='--', label=f'Mean: {rets.mean():.3f}%')
ax1.set_title('Return Distribution (10-day holds)')
ax1.set_xlabel('Return (%)')
ax1.legend()
from scipy import stats
stats.probplot(rets.dropna(), dist="norm", plot=ax2)
ax2.set_title('Q-Q Plot vs Normal')
ax2.get_lines()[0].set_markerfacecolor('#1565C0')
ax2.get_lines()[0].set_markersize(3)
plt.show()
Figure 5: Return distribution (10-day holds)

3.2 Rolling Metrics

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6), constrained_layout=True, sharex=True)
roll_mean = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].mean() if (x!=0).any() else 0)
roll_std = df2020[ret_col].rolling(63).apply(lambda x: x[x!=0].std() if (x!=0).sum() > 5 else np.nan)
rolling_sharpe = roll_mean / roll_std * np.sqrt(252)
ax1.plot(df2020['dt'], rolling_sharpe, color='#43A047', linewidth=1)
ax1.axhline(y=0, color='gray', linewidth=0.5, linestyle='--')
ax1.axhline(y=1, color='green', linewidth=0.5, linestyle='--', alpha=0.5)
ax1.set_title('Rolling 63-day Sharpe Ratio')
ax1.set_ylabel('Sharpe')
ax1.set_ylim(-10, 25)

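df2020_copy = df2020.copy()
# Score only traded rows: inactive rows become NaN so they are not counted as misses
hit = (np.sign(df2020_copy['pred']) == np.sign(df2020_copy['spread_ret'])).astype(float)
df2020_copy['correct'] = hit.where(df2020_copy['active'] == 1)
# min_periods=10 requires at least 10 traded rows in the window before plotting a value
rolling_acc = df2020_copy['correct'].rolling(63, min_periods=10).mean() * 100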
ax2.plot(df2020['dt'], rolling_acc, color='#FF8F00', linewidth=1)
ax2.axhline(y=50, color='gray', linewidth=0.5, linestyle='--')
ax2.set_title('Rolling 63-day Direction Accuracy')
ax2.set_ylabel('Accuracy (%)')
ax2.set_xlim(df2020['dt'].min(), df2020['dt'].max())
plt.show()
Figure 6: Rolling Sharpe and accuracy

4. Detailed Statistics

traded = df2020[df2020['active'] == 1]
total_pnl = (df2020[ret_col] * capital).sum()
sharpe = traded[ret_col].mean() / traded[ret_col].std() * np.sqrt(252)
downside = traded.loc[traded[ret_col] < 0, ret_col]
sortino = traded[ret_col].mean() / np.sqrt((downside**2).mean()) * np.sqrt(252)
max_dd = df2020['drawdown'].min()
wins = traded[traded[ret_col] > 0][ret_col]
losses = traded[traded[ret_col] < 0][ret_col]

stats_dict = {
    'Period': f'{df2020["dt"].min().strftime("%Y-%m-%d")} to {df2020["dt"].max().strftime("%Y-%m-%d")}',
    'Traded Periods': len(traded),
    'Total P&L': f'${total_pnl:,.0f}',
    'Sharpe Ratio': f'{sharpe:.2f}',
    'Sortino Ratio': f'{sortino:.2f}',
    'Max Drawdown': f'${max_dd:,.0f}',
    'MAR Ratio': f'{traded[ret_col].mean() * 252 / abs(max_dd / capital):.2f}',
    'Direction Accuracy': f'{(np.sign(traded["pred"]) == np.sign(traded["spread_ret"])).mean()*100:.1f}%',
    'Win/Loss Ratio': f'{abs(wins.mean()/losses.mean()):.2f}',
    'Holding Period': '10 trading days',
    'p/n Ratio': '0.02 (4 dims / 190 samples)',
}
pd.DataFrame(list(stats_dict.items()), columns=['Metric', 'Value']).style.hide(axis='index')
Table 1
| Metric | Value |
|---|---|
| Period | 2020-01-02 to 2026-04-08 |
| Traded Periods | 413 |
| Total P&L | $42,987 |
| Sharpe Ratio | 6.49 |
| Sortino Ratio | 7.47 |
| Max Drawdown | $-3,103 |
| MAR Ratio | 8.45 |
| Direction Accuracy | 68.0% |
| Win/Loss Ratio | 1.44 |
| Holding Period | 10 trading days |
| p/n Ratio | 0.02 (4 dims / 190 samples) |
yearly_data = []
for yr in sorted(df2020['year'].unique()):
    ydf = df2020[df2020['year'] == yr]
    yt = ydf[ydf['active'] == 1]
    if len(yt) == 0:
        continue
    pnl = (ydf[ret_col] * capital).sum()
    s = yt[ret_col].mean() / yt[ret_col].std() * np.sqrt(252) if yt[ret_col].std() > 0 else 0
    ds = yt.loc[yt[ret_col] < 0, ret_col]
    so = yt[ret_col].mean() / np.sqrt((ds**2).mean()) * np.sqrt(252) if len(ds) > 0 else 0
    acc = (np.sign(yt['pred']) == np.sign(yt['spread_ret'])).mean() * 100
    yearly_data.append({
        'Year': yr, 'Traded': len(yt), 'Sat Out': len(ydf) - len(yt),
        'Accuracy': f'{acc:.1f}%', 'P&L': f'${pnl:,.0f}',
        'Sharpe': f'{s:.2f}', 'Sortino': f'{so:.2f}'
    })
pd.DataFrame(yearly_data).style.hide(axis='index')
Table 2
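| Year | Traded | Sat Out | Accuracy | P&L | Sharpe | Sortino |
|---|---|---|---|---|---|---|
| 2020 | 75 | 36 | 70.7% | $12,054 | 6.34 | 7.72 |
| 2021 | 68 | 44 | 67.6% | $7,667 | 8.62 | 12.07 |
| 2022 | 79 | 31 | 53.2% | $5,137 | 3.73 | 4.19 |
| 2023 | 63 | 44 | 69.8% | $6,003 | 7.93 | 8.51 |
| 2024 | 59 | 53 | 78.0% | $7,719 | 11.20 | 10.68 |
| 2025 | 57 | 53 | 75.4% | $3,919 | 8.67 | 10.26 |
| 2026 | 12 | 18 | 58.3% | $488 | 4.68 | 5.43 |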
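
The Sharpe figures in Tables 1 and 2 are annualized by applying sqrt(252) to the per-row returns of traded rows. Table 2 shows roughly 110 rows per full calendar year (Traded + Sat Out), so the annualization factor is itself an assumption worth stress-testing. The sketch below uses a hypothetical helper, `rescale_sharpe`, which is not part of the original code, to re-annualize a reported Sharpe under a different periods-per-year assumption. Which scaling is appropriate depends on how rows are sampled, which the source does not state.

```python
import math

def rescale_sharpe(sharpe: float, periods_used: float, periods_actual: float) -> float:
    """Re-annualize a Sharpe ratio that was scaled with sqrt(periods_used)."""
    per_period = sharpe / math.sqrt(periods_used)   # undo the original scaling
    return per_period * math.sqrt(periods_actual)   # apply the alternative scaling

reported = 6.49  # Table 1, scaled with sqrt(252)

# Assumption: ~110 rows per year, read off Table 2 (Traded + Sat Out)
print(round(rescale_sharpe(reported, 252, 110), 2))  # ≈ 4.29

# Assumption: ~65 traded rows per year (413 trades over 6.3 years)
print(round(rescale_sharpe(reported, 252, 65), 2))   # ≈ 3.30
```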

5. Strategy Construction

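5.1 Model Architecture

The model is a dual-model pipeline fitted on each row's trailing window. A standardized logistic regression (C=0.1) estimates the probability that the 10-day spread return is positive, and a standardized ridge regression (alpha=1.0) estimates its magnitude. A position is taken only when the predicted magnitude reaches the minimum move and the classifier's probability clears the confidence threshold in either direction; otherwise the strategy stays flat.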

5.2 Model Code

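import numpy as np


class Aggregate:
    @staticmethod
    def finalize(table, params):
        # `table` holds the trailing window of rows; None means "no output"
        if table.num_rows < 2:
            return None
        data = table.to_pandas().values.astype(np.float64)
        n, nc = data.shape
        seed = int(params.get('seed', 42))
        conf_thresh = params.get('conf', 0.60)      # classifier confidence gate
        min_move = params.get('min_move', 0.008)    # minimum predicted 10-day move
        fc = int(params.get('fwd_col', nc - 1))     # target column; columns before it are features
        hold = int(params.get('hold', 10))

        if n < 10 + hold:
            return None

        X = data[:-hold, :fc]      # features at row t
        y_ret = data[hold:, fc]    # target column observed `hold` rows later
        last = data[-1:, :fc]      # most recent feature row, used for prediction

        # Stay flat if any input is missing, including the row being predicted
        if np.any(np.isnan(X)) or np.any(np.isnan(y_ret)) or np.any(np.isnan(last)):
            return 0.0

        y_dir = (y_ret > 0).astype(int)

        from sklearn.linear_model import LogisticRegression, Ridge
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # The classifier needs both up and down examples in the window
        if len(set(y_dir)) < 2:
            return 0.0

        # Direction model: probability that the forward return is positive
        clf = make_pipeline(
            StandardScaler(),
            LogisticRegression(C=0.1, max_iter=1000, random_state=seed)
        )
        clf.fit(X, y_dir)
        prob_up = clf.predict_proba(last)[0][1]

        # Magnitude model: size of the forward return (sign is discarded)
        reg = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
        reg.fit(X, y_ret)
        pred_mag = abs(float(reg.predict(last)[0]))

        # Gate on magnitude first, then on directional confidence
        if pred_mag < min_move:
            return 0.0
        if prob_up > conf_thresh:
            return pred_mag
        elif prob_up < (1.0 - conf_thresh):
            return -pred_mag
        else:
            return 0.0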
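
The Parameters table lists a 200-day training window, which suggests `finalize` is evaluated on a trailing window ending at each row. The actual driver is DuckDB with the VGI extension and is not shown in this document, so the loop below is only a plain-NumPy stand-in for that windowing. The function name `walk_forward` and the generic `signal_fn` callable (taking a 2-D array rather than an Arrow table) are assumptions made for illustration.

```python
import numpy as np

def walk_forward(data: np.ndarray, window: int, signal_fn) -> np.ndarray:
    """Apply `signal_fn` to every trailing block of `window` rows.

    The result is aligned to the last row of each block; rows without
    enough history (or where `signal_fn` returns None) are left as NaN.
    """
    n = data.shape[0]
    out = np.full(n, np.nan)
    for end in range(window, n + 1):
        sig = signal_fn(data[end - window:end])  # only rows up to `end - 1` are visible
        if sig is not None:
            out[end - 1] = sig
    return out

# Toy usage: the "signal" is just the window mean of the first column
demo = np.arange(10, dtype=float).reshape(-1, 1)
signals = walk_forward(demo, window=3, signal_fn=lambda w: float(w[:, 0].mean()))
```

Because each block ends at the row being scored, the signal at a given row never sees later data, which is the property a walk-forward backtest relies on.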

6. Portfolio Role: Credit-Cycle Hedge

TLT/HYG is a hedge sleeve for the commodity-cluster strategies. Drawdown correlations with the other 7 sleeves:

| Pair | Drawdown Corr |
|---|---|
| vs XME/DBB (metals) | -0.14 |
| vs GDX/GLD (gold) | -0.21 |
| vs XLE/USO (energy) | -0.22 |
| vs EFA/SPY (intl equity) | -0.02 |
| vs XLF/XLY (sector rotation) | -0.06 |
| vs LMT/RTX (defense) | +0.03 |
| vs MTUM/USMV (factor rotation) | -0.07 |

Consistently negative or near-zero across all sleeves. When commodity strategies struggle (typically during deflationary or strong-USD episodes when credit rallies and Treasuries underperform), TLT/HYG tends to thrive.
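
The document does not say how the drawdown correlations were computed. One plausible reading, sketched below, is the Pearson correlation of two sleeves' drawdown series on their shared dates, using the same drawdown definition as the setup code (cumulative P&L minus its running maximum). The helper names are hypothetical.

```python
import pandas as pd

def drawdown(pnl: pd.Series) -> pd.Series:
    """Drawdown of a per-period P&L series: cumulative P&L minus its running max."""
    cum = pnl.cumsum()
    return cum - cum.cummax()

def drawdown_corr(pnl_a: pd.Series, pnl_b: pd.Series) -> float:
    """Pearson correlation of two drawdown series over their common index."""
    dd = pd.concat([drawdown(pnl_a), drawdown(pnl_b)], axis=1, join='inner')
    return dd.iloc[:, 0].corr(dd.iloc[:, 1])
```

A negative value means one sleeve tends to be near its high-water mark while the other is underwater, which is the hedging behavior this section describes.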

7. Limitations and Risks

  1. High dollar PnL but largest open exposure: 10-day hold periods mean the strategy is in position ~80% of the time. Capital is consistently deployed (vs daily-hold sleeves that are flat 50%+ of days).

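  2. 413 trades over 6.3 years (~65/year). With 413 trades, the 95% confidence interval on the 68.0% accuracy is roughly 63-72%. Because ~65 ten-day holds per year necessarily overlap, the trades are not independent and the true interval is wider.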

  3. A Sharpe of 6.49 is suspiciously high. The configuration was selected via grid search, so the figure carries selection bias. True out-of-sample Sharpe is likely lower; expect 3-4 in production.

  4. Bond market regime risk: The 2020-2026 backtest captures one rate cycle (Fed cutting → Fed hiking → Fed cutting). A prolonged sideways or trendless rate regime could weaken the curve_mom120 signal.

  5. HYG liquidity events: HYG saw extreme dislocations in March 2020 and September 2022. The strategy navigated both, but future credit events could behave differently.

  6. Seed sensitivity: Zero – deterministic.
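
Limitation 2 quotes a 95% interval of roughly 63-72% for the 68.0% hit rate. One way to arrive at an interval of that size is the Wilson score interval sketched below. Treating the 413 trades as independent is an assumption (overlapping 10-day holds would widen the true interval), and `wilson_interval` is an illustrative helper rather than part of the original code.

```python
import math

def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

lower, upper = wilson_interval(0.68, 413)
print(f"95% CI: {lower:.1%} to {upper:.1%}")
```

With these inputs the interval comes out at roughly 63% to 72%, in line with the figure quoted above.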

8. Reproducibility

bash scripts/run_backtest.sh
bash tests/test_backtest.sh

Parameters

| Parameter | Value |
|---|---|
| Training window | 200 days |
| Confidence threshold | 0.60 |
| Min predicted move | 0.008 (0.8% over 10 days) |
| Holding period | 10 trading days |
| Position sizing | Binary (100%) |
| Gates | None |
| LogReg C | 0.1 |
| Ridge alpha | 1.0 |

This research was created with DuckDB and VGI, an upcoming DuckDB extension from Query.Farm that allows custom aggregate functions to be written in any language with an Apache Arrow implementation.