[idea] Improving My Bot with 3 Papers on HMM Regime Detection

Deep Reinforcement Learning for Financial Trading Enhanced by Cluster Embedding and Zero-Shot Prediction
Markov and Hidden Markov Models for Regime Detection in Cryptocurrency Markets: Evidence from Bitcoin (2024–2026)
Regime-Aware LightGBM for Stock Market Forecasting: A Validated Walk-Forward Framework with Statistical Rigor and Explainable AI Analysis

I read three intriguing papers and set out to test whether bolting their ideas onto the crypto bot strategy I’m currently running would actually make it better. The short version: I ran 26 variations, and not a single one passed.

1. What the papers were about

All three shared a common theme: “use a model to figure out what state (regime) the market is in, and adjust your position sizing accordingly.”

Pagliaro 2026: Don’t cut everything across the board; pick out only the strategies that genuinely underperform in that state and trim those.
Markov HMM BTC: Catch state transitions faster by letting external information such as trading volume drive the transition probabilities (a non-homogeneous HMM, NHHMM).
DRL + Cluster Embedding: Combine reinforcement learning with future prediction for a smarter state representation.

2. How I tested it

I ran simulations by adding a regime throttle layer on top of my live portfolio (10 strategies).

6 timeframes: 5m / 15m / 1h / 4h / 12h / 1d
2 HMM training methods: offline (train once) / rolling (retrain periodically, looking only at the past)
3 throttle modes: none / blanket halving / selective
3 periods: old OOS2 (2021–22) / IS (2023) / the real validation OOS (2024–26)
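
To make the three throttle modes concrete, here is a minimal sketch rather than the bot's actual code. It assumes an equal-weight portfolio, a boolean high-risk flag per bar (for example from an HMM state), and a hand-picked set of strategies considered weak in that state. The function name, the strategy names, and the numbers are all hypothetical.

```python
from typing import Dict, List, Optional, Set


def throttled_portfolio_returns(
    strategy_returns: Dict[str, List[float]],
    high_risk: List[bool],
    mode: str = "none",
    weak_in_high_risk: Optional[Set[str]] = None,
    factor: float = 0.5,
) -> List[float]:
    """Equal-weight portfolio return per bar under one throttle mode.

    Assumption: high_risk[t] was computed from data available before bar t,
    otherwise the throttle would be looking into the future.
    """
    if mode not in {"none", "blanket", "selective"}:
        raise ValueError(f"unknown throttle mode: {mode}")
    weak = weak_in_high_risk or set()
    names = list(strategy_returns)
    portfolio = []
    for t, risky in enumerate(high_risk):
        total = 0.0
        for name in names:
            scale = 1.0
            if risky and mode == "blanket":
                scale = factor  # cut every strategy
            elif risky and mode == "selective" and name in weak:
                scale = factor  # cut only the flagged strategies
            total += scale * strategy_returns[name][t]
        portfolio.append(total / len(names))
    return portfolio


# Toy data: two strategies, two bars, the second bar flagged as high risk.
toy_returns = {"trend": [0.02, -0.04], "pairs": [0.01, 0.02]}
toy_flags = [False, True]

none_r = throttled_portfolio_returns(toy_returns, toy_flags, "none")
blanket_r = throttled_portfolio_returns(toy_returns, toy_flags, "blanket")
selective_r = throttled_portfolio_returns(
    toy_returns, toy_flags, "selective", weak_in_high_risk={"trend"}
)
```

In the toy data the blanket mode also halves the winning "pairs" leg on the flagged bar, while the selective mode leaves it alone. That is the distinction the selective idea relies on.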

The key rule: parameter selection happens only on OOS2+IS, and OOS is set aside as “a future I’ve never once looked at.” Break that rule and all you’ve done is memorize past patterns.
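
The rule can be enforced in code by making the selection step structurally unable to read the hold-out score. The sketch below is one way to do it under stated assumptions: the scoring rule (best worst-case Calmar across OOS2 and IS) and the pass criterion (beat the unthrottled baseline on OOS) are my illustrative choices, and every name and number is hypothetical.

```python
from typing import Dict, Iterable

SELECTION_PERIODS = ("OOS2", "IS")  # 2021-22 and 2023: allowed for tuning
HOLDOUT_PERIOD = "OOS"              # 2024-26: read once, after selection


def select_config(
    calmar: Dict[str, Dict[str, float]], candidates: Iterable[str]
) -> str:
    """Pick the candidate with the best worst-case Calmar on selection periods.

    Only SELECTION_PERIODS are read here; the hold-out key is never touched.
    """
    def selection_score(name: str) -> float:
        return min(calmar[name][p] for p in SELECTION_PERIODS)

    return max(candidates, key=selection_score)


def passes_holdout(
    calmar: Dict[str, Dict[str, float]], chosen: str, baseline: str
) -> bool:
    """Assumed pass criterion: the chosen config beats the baseline on hold-out."""
    return calmar[chosen][HOLDOUT_PERIOD] > calmar[baseline][HOLDOUT_PERIOD]


# Toy scores, not results from the experiment.
toy_scores = {
    "baseline": {"OOS2": 5.0, "IS": 6.0, "OOS": 7.0},
    "cand_a": {"OOS2": 20.0, "IS": 9.0, "OOS": 4.0},
    "cand_b": {"OOS2": 8.0, "IS": 3.0, "OOS": 12.0},
}

chosen = select_config(toy_scores, ["cand_a", "cand_b"])
passed = passes_holdout(toy_scores, chosen, "baseline")
```

With these toy numbers the selection picks cand_a, which then fails on the hold-out. Switching to cand_b afterwards because its hold-out score is higher would be exactly the peeking the rule forbids.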

3. The result — 0/26

Direction	OOS Calmar (selective)	OOS Calmar (do nothing)	Difference
A. Pagliaro selective	4.57	7.37	-2.80
B. NHHMM	4.57	7.37	-2.80
C. enriched 7-feature	4.06	7.37	-3.31

Calmar = CAGR / |max drawdown|. Higher is better.
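
For reference, the definition above can be computed from an equity curve as in the sketch below. It assumes the curve is a plain list of positive portfolio values and that the elapsed time in years is supplied by the caller; the function names and the sample curve are illustrative only.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline as a fraction (zero or negative)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def calmar(equity: Sequence[float], years: float) -> float:
    """Calmar = CAGR / |max drawdown| for an equity curve spanning `years`."""
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    mdd = abs(max_drawdown(equity))
    if mdd == 0.0:
        # No drawdown at all: the ratio is undefined, so report infinity
        # for a rising curve and 0.0 for a flat one.
        return float("inf") if cagr > 0 else 0.0
    return cagr / mdd


# Toy curve over one year: up to 120, down to 90, then up to 150.
toy_equity = [100.0, 120.0, 90.0, 150.0]
toy_mdd = max_drawdown(toy_equity)
toy_calmar = calmar(toy_equity, years=1.0)
```

Here CAGR is 50% and the worst drawdown is 25% (120 down to 90), so the ratio is 2.0.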

The most promising-looking candidate was 1h rolling, which showed a Calmar of 22.5 on the old data (2021–23). But when I re-measured it on the future I’d never looked at (2024–26), it dropped to 4.57. It had merely learned the noise of the past — it wasn’t a real signal.

4. Why didn’t it work

Once I looked closely, the reasons were clear.

The bot is already too well diversified. Six strategies, each a different mechanism — pairs / funding / trend / RSI intraday / breakout. There’s no single weak spot you can patch across the board with one volatility regime.
The cost of cutting > the protection it buys. “Halve your size when volatility is high” does protect you in the bad stretches, but it also halves the good stretches, so CAGR falls proportionally more than max drawdown does and Calmar drops. Even a blanket 0.5× throttle on its own cost me 1.66 points of Calmar.
The market changed after 2024 (overfitting). The ETF approval, the halving, the AI-coin rotation — the pattern I’d learned in 2021–23 (“this strategy is weak in this regime”) had morphed into a different pattern after 2024.

5. Lessons

Two things got confirmed once again.

OOS really is sacred and inviolable. “It looked good on the old data” means almost nothing. It’s only real if it survives in a future you’ve never once seen.
The hardest thing is adding something to an already-strong baseline. Slap a filter onto a weak strategy and it’s easy to improve, but add a throttle to a system that’s already running well and you almost always just pile up costs.

6. So What?

I’m closing the door on the regime-throttle direction. If I were to try again, it would only be worth it through a different mechanism — say, adjusting sizing using signals from another asset (outside crypto), or adding an entirely new sleeve.

The infrastructure I built (the 4-hour capital-base engine, the 6-timeframe HMM code) can be reused as-is for testing the next hypothesis, so the small consolation is that the time wasn’t completely wasted.