[idea] Trying to upgrade my bot with three HMM regime-detection papers

  • Deep Reinforcement Learning for Financial Trading Enhanced by Cluster Embedding and Zero-Shot Prediction
  • Markov and Hidden Markov Models for Regime Detection in Cryptocurrency Markets: Evidence from Bitcoin (2024–2026)
  • Regime-Aware LightGBM for Stock Market Forecasting: A Validated Walk-Forward Framework with Statistical Rigor and Explainable AI Analysis

I read three interesting papers and tested whether bolting their ideas onto the crypto bot I’m running right now would actually make it better. The punchline first: I ran 26 variants and not a single one passed.

 

1. What the papers said

 

All three shared the same theme: “Use a model to figure out what state (regime) the market is in, then adjust your trade sizing accordingly.”

  • Pagliaro 2026: Don’t throttle everything at once; pick out only the strategies that genuinely struggle in that regime and throttle just those
  • Markov HMM BTC: Use external information like volume to catch state transitions faster (a non-homogeneous HMM, NHHMM)
  • DRL + Cluster Embedding: Combine reinforcement learning with future prediction for a smarter state representation
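
To make the selective idea concrete, here is a minimal sketch of how the three sizing behaviors differ. It is an illustration only, not code from any of the papers or from the bot: the regime labels, the `weak_in` map, the `risk_regimes` set, and the 0.5 multiplier are all hypothetical placeholders.

```python
from typing import Dict, List, Set

def size_multipliers(
    regime: str,
    strategies: List[str],
    weak_in: Dict[str, Set[str]],
    risk_regimes: Set[str],
    mode: str = "selective",
    throttle: float = 0.5,
) -> Dict[str, float]:
    """Return a position-size multiplier per strategy for the current regime.

    mode="none":      never throttle.
    mode="blanket":   throttle every strategy when the regime is in risk_regimes.
    mode="selective": throttle only the strategies marked weak in this regime.
    """
    if mode == "none":
        return {s: 1.0 for s in strategies}
    if mode == "blanket":
        m = throttle if regime in risk_regimes else 1.0
        return {s: m for s in strategies}
    if mode == "selective":
        return {
            s: (throttle if regime in weak_in.get(s, set()) else 1.0)
            for s in strategies
        }
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical example: only "trend" is assumed to struggle in high volatility.
strategies = ["trend", "pairs", "funding"]
weak_in = {"trend": {"high_vol"}}
print(size_multipliers("high_vol", strategies, weak_in, {"high_vol"}))
# {'trend': 0.5, 'pairs': 1.0, 'funding': 1.0}
```

The only difference between blanket and selective is who pays for the throttle: everyone, or just the strategies flagged as weak.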

 

2. How I tested it

I simulated layering a regime throttle on top of the portfolio (10 strategies) I’m currently running live.

 

  • 6 timeframes: 5m / 15m / 1h / 4h / 12h / 1d
  • 2 HMM training approaches: offline (one-shot training) / rolling (periodic retraining using only past data)
  • 3 throttle modes: none / blanket half-size / selective
  • 3 periods: old OOS2 (2021–22) / IS (2023) / the real holdout OOS (2024–26)
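
The rolling setup is mostly a question of which bars the model may see before it labels a given bar. A rough sketch of such a schedule, with no actual HMM fitting and with hypothetical window sizes (`train_len`, `refit_every` are names invented for this example):

```python
from typing import Iterator, Tuple

def rolling_schedule(
    n_bars: int, train_len: int, refit_every: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, predict_idx) index ranges for periodic retraining.

    Every training window ends exactly where its prediction window starts,
    so a bar is never labeled by a model that was fit on that bar or later.
    """
    if train_len <= 0 or refit_every <= 0:
        raise ValueError("train_len and refit_every must be positive")
    start = train_len
    while start < n_bars:
        end = min(start + refit_every, n_bars)
        yield range(start - train_len, start), range(start, end)
        start = end

# Example: 10 bars, fit on the last 4 bars, refit every 3 bars.
for train_idx, predict_idx in rolling_schedule(10, 4, 3):
    print(list(train_idx), "->", list(predict_idx))
```

An offline variant would fit once on a fixed window and reuse that model for every later bar.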

 

The non-negotiable rule: parameter selection happens only on OOS2 + IS, and OOS stays untouched as “a future I’ve never seen.” Break this and you’re just memorizing past patterns.
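
In code, the rule looks roughly like this. The variant names and Calmar numbers are made up for illustration, and ranking by the worse of the two selection-period scores is an assumption of this sketch, not something taken from the papers or from the bot.

```python
from typing import Dict

Scores = Dict[str, Dict[str, float]]  # variant -> period -> Calmar

def pick_variant(scores: Scores, baseline: str = "none") -> str:
    """Choose a variant using only the selection periods (OOS2 and IS).

    The holdout "OOS" score is deliberately never read here.
    """
    candidates = [v for v in scores if v != baseline]
    return max(candidates, key=lambda v: min(scores[v]["OOS2"], scores[v]["IS"]))

def passes_holdout(scores: Scores, chosen: str, baseline: str = "none") -> bool:
    """One-time check: does the chosen variant beat doing nothing on OOS?"""
    return scores[chosen]["OOS"] > scores[baseline]["OOS"]

# Made-up numbers, purely for illustration.
toy = {
    "none":      {"OOS2": 3.0, "IS": 4.0, "OOS": 5.0},
    "blanket":   {"OOS2": 3.5, "IS": 3.0, "OOS": 4.0},
    "selective": {"OOS2": 6.0, "IS": 5.0, "OOS": 3.0},
}
chosen = pick_variant(toy)
print(chosen, passes_holdout(toy, chosen))  # selective False
```

The point of splitting it into two functions is that the holdout is consulted exactly once, after the choice is already locked in.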

 

3. The result — 0/26

| Direction | OOS Calmar (selective) | OOS Calmar (do nothing) | Diff |
|---|---|---|---|
| A. Pagliaro selective | 4.57 | 7.37 | -2.80 |
| B. NHHMM | 4.57 | 7.37 | -2.80 |
| C. enriched 7-feature | 4.06 | 7.37 | -3.31 |

  • Calmar = CAGR / |max drawdown|. Higher is better.
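
As a worked example of that formula, here is a small sketch that computes Calmar from an equity curve. The function name, the `years` argument, and the sample equity values are assumptions for illustration; the bot's own calculation may differ in details such as annualization.

```python
from typing import Sequence

def calmar(equity: Sequence[float], years: float) -> float:
    """Calmar = CAGR / |max drawdown| for an equity curve spanning `years`."""
    if len(equity) < 2 or years <= 0:
        raise ValueError("need at least two equity points and years > 0")
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    peak = equity[0]
    max_dd = 0.0  # most negative peak-to-trough return seen so far
    for value in equity:
        peak = max(peak, value)
        max_dd = min(max_dd, value / peak - 1.0)
    if max_dd == 0.0:
        # The ratio is undefined without a drawdown; this sketch returns inf.
        return float("inf")
    return cagr / abs(max_dd)

# Made-up curve: +50% over one year, worst drop from 120 to 90 (-25%).
print(calmar([100, 120, 90, 150], years=1.0))  # 2.0
```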

 

The most promising-looking candidate was 1h rolling, which posted a Calmar of 22.5 on the older data (2021–23). But when I measured it on the unseen future (2024–26), it dropped to 4.57. It had learned the noise in the past, not a real signal.

 

4. Why it didn’t work

When I sat with it, the reasons were clear.

  • The bot is already too well-diversified. Six strategies, each with a different mechanism — pairs / funding / trend / intraday RSI / breakout. There’s no shared vulnerability that a single volatility regime can throttle in one stroke.
  • The cost of throttling > the protection it buys. Take “cut size in half when volatility is high”: the bad stretches are softened, but the good stretches get cut in half too, and cumulative returns end up worse. Even a flat 0.5× throttle lowered Calmar by 1.66.
  • The post-2024 market is a different animal (overfitting). ETF approval, the halving, AI-coin rotation — the “this strategy is weak in this regime” patterns learned from 2021–23 have shifted into different patterns since 2024.
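
The throttle-cost point can be seen in a toy calculation. The returns and regime flags below are made up, not taken from the backtest; the sketch only shows the mechanism, where halving size in flagged periods halves the gains there as well as the losses.

```python
from typing import Sequence

def compound(
    returns: Sequence[float], high_vol: Sequence[bool], throttle: float = 1.0
) -> float:
    """Growth of 1 unit of capital, scaling size by `throttle` on high-vol periods."""
    equity = 1.0
    for r, hv in zip(returns, high_vol):
        equity *= 1.0 + r * (throttle if hv else 1.0)
    return equity

# Made-up per-period returns; True marks periods flagged as high volatility.
returns = [0.02, 0.10, -0.04, 0.08, 0.01]
high_vol = [False, True, True, True, False]

full = compound(returns, high_vol)                # no throttle
half = compound(returns, high_vol, throttle=0.5)  # half size in high vol
print(f"full size: {full:.4f}, throttled: {half:.4f}")
```

In this toy series the worst single loss shrinks from 4% to 2%, but the two big winning periods shrink too, so the throttled curve ends lower.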

 

5. Takeaways

Two things got reconfirmed.

  1. The OOS holdout is genuinely sacred. “It looked good on the old data” means almost nothing. It has to survive on a future you’ve never peeked at — that’s the only test that counts.
  2. The hardest thing is adding something to a baseline that’s already strong. Slap a filter on a weak strategy and it’s easy to improve; add a throttle to a system that’s already running well and you almost always just accumulate cost.

 

6. So What?

I’m closing the regime-throttle direction. If I were to try again, it would have to be through a different mechanism — for example, using signals from another asset class (outside crypto) to modulate sizing, or just adding an entirely new sleeve. That would actually be worth doing.

The infrastructure I built (the 4-hour capital base engine, the 6-timeframe HMM code) can be reused as-is for the next hypothesis, which is a small consolation — the time wasn’t a total write-off.
