Building Machine B: From Research Model to Production System

By this point, Machine B existed as a research model.

It had labels.
It understood regular trading hours (RTH).
It survived contract rollovers.

But research models don’t trade.
Production systems do.

That forced a new question:

How do you turn a machine-learned trailing stop into something you actually trust with real capital?

Research vs Production

In research, you care about:

Accuracy
Cross-validation scores
Feature importance
Backtests

In production, you care about:

Stability
Failure modes
Retraining cadence
Safety constraints

Machine B had to graduate from a notebook to an operational system.

Rolling Retraining

Markets change.
Models decay.

So I adopted a rolling retraining window:

Train on the last 8–16 weeks of RTH data
Hold out the most recent 1–2 weeks for sanity checks
Retrain every weekend

This gave me:

Adaptation to regime shifts
Enough data for statistical stability
A clean walk-forward evaluation loop

No random shuffles.
No hindsight leakage.
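
A minimal sketch of how the rolling window could be cut by date. The function names, the half-open date ranges, and the choice to place the holdout after the training weeks (rather than inside them) are assumptions for illustration, not the exact production code.

```python
from datetime import date, timedelta

def rolling_window(retrain_day, train_weeks=8, holdout_weeks=1):
    """Return (train_start, train_end, holdout_start, holdout_end).

    Ranges are half-open [start, end). Assumption: the holdout is the
    most recent `holdout_weeks` before `retrain_day`, and training
    covers the `train_weeks` immediately before the holdout.
    """
    holdout_end = retrain_day
    holdout_start = holdout_end - timedelta(weeks=holdout_weeks)
    train_end = holdout_start
    train_start = train_end - timedelta(weeks=train_weeks)
    return train_start, train_end, holdout_start, holdout_end

def split_rows(rows, window):
    """Split (day, payload) rows by date only. No shuffling."""
    train_start, train_end, holdout_start, holdout_end = window
    train = [r for r in rows if train_start <= r[0] < train_end]
    holdout = [r for r in rows if holdout_start <= r[0] < holdout_end]
    return train, holdout

# Example: a weekend retrain on Saturday 2024-06-08.
window = rolling_window(date(2024, 6, 8), train_weeks=8, holdout_weeks=1)
```

Because the split depends only on timestamps, nothing from the holdout weeks can reach the training set.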

Walk-Forward Validation

Backtests lie easily.

So I simulated production:

Train on weeks N-8 through N-2
Validate on week N-1
Slide the window forward
Repeat

This produced a time-forward equity curve that approximated live behavior.

Not perfect.
But honest.
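
The loop above can be sketched roughly as follows. The `fit` and `evaluate` callables are hypothetical placeholders for the real training and scoring code, and the seven-week training length simply mirrors the N-8 through N-2 example.

```python
def walk_forward_folds(n_weeks, train_len=7, val_len=1):
    """Yield (train_idx, val_idx) lists of week indices, oldest first.

    Validation weeks always come strictly after the training weeks.
    """
    start = 0
    while start + train_len + val_len <= n_weeks:
        train = list(range(start, start + train_len))
        val = list(range(start + train_len, start + train_len + val_len))
        yield train, val
        start += val_len  # slide the window forward

def walk_forward_equity(weekly_data, fit, evaluate, train_len=7):
    """Stitch out-of-sample weekly results into one equity curve.

    `fit(train_weeks)` returns a model and `evaluate(model, week)`
    returns that week's P&L; both are assumptions for this sketch.
    """
    equity, curve = 0.0, []
    for train, val in walk_forward_folds(len(weekly_data), train_len):
        model = fit([weekly_data[i] for i in train])
        for i in val:
            equity += evaluate(model, weekly_data[i])
            curve.append(equity)
    return curve
```

Every point on the resulting curve comes from a model that never saw that week during training.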

Confidence Gating

Machine learning models are probabilistic.
Trading systems should be conservative.

So Machine B does not act on every signal.

It only tightens when:

P(good tighten) > threshold

For example:

Tighten if probability > 0.65
Hold otherwise

This reduced overtrading and prevented noise-driven stop tightening.
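
As a sketch, the gate is a single comparison. The function name and the input check are assumptions; 0.65 is the example threshold from above, not a recommended value.

```python
def gate(prob_good_tighten, threshold=0.65):
    """Return "tighten" only when the model is confident enough."""
    # Reject anything that is not a valid probability (including NaN).
    if not 0.0 <= prob_good_tighten <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    return "tighten" if prob_good_tighten > threshold else "hold"
```

The comparison is strict, so a probability exactly at the threshold still holds.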

Safety Rules (Non-Negotiable)

Machine B is an advisor, not a dictator.

Hard constraints:

1) Never loosen a stop
2) Cap tightening frequency (e.g., at most once every N bars)
3) Respect structural stops (swing highs/lows, emergency stops)
4) Fall back to deterministic rules if ML fails

ML adds intelligence.
Rules provide guardrails.
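
One way the four constraints could be layered on top of the model's suggestion, shown for a long position only (a higher stop is a tighter stop). All names and the five-bar cooldown are hypothetical. Two assumptions are worth flagging: "respect structural stops" is interpreted here as "the final stop is never looser than the structural level", and a failed ML path is represented by `ml_stop=None`.

```python
from typing import Optional

def apply_safety_rules(
    current_stop: float,
    ml_stop: Optional[float],   # None means the ML path failed
    fallback_stop: float,       # from a deterministic rule (hypothetical)
    structural_stop: float,     # e.g. swing low or emergency stop
    bars_since_tighten: int,
    min_bars_between: int = 5,  # illustrative value for N
) -> float:
    """Return the stop to use for a LONG position."""
    # Rule 4: fall back to the deterministic rule if ML produced nothing.
    proposed = ml_stop if ml_stop is not None else fallback_stop
    # Rule 2: inside the cooldown, ignore the new suggestion.
    if bars_since_tighten < min_bars_between:
        proposed = current_stop
    # Rule 3 (as interpreted here): never sit looser than the structural stop.
    proposed = max(proposed, structural_stop)
    # Rule 1: never loosen an existing stop.
    return max(proposed, current_stop)
```

A short position would mirror this with `min` in place of `max`.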

Contract Awareness

Every quarter:

Export new front contract data
Retrain Machine B
Validate on new contract
Switch production model

No silent model carryover across contracts.
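
A small guard can make that rule explicit. This is a sketch under assumptions: the `ModelRecord` fields and the contract labels are hypothetical, and the real registry may look quite different.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelRecord:
    contract: str      # contract the model was trained on
    trained_on: str    # ISO date string, so plain sorting works
    validated: bool    # passed validation on that contract

def select_production_model(records, front_contract):
    """Return the newest validated model for the front contract.

    Raises instead of quietly reusing a model from another contract.
    """
    candidates = [
        r for r in records
        if r.contract == front_contract and r.validated
    ]
    if not candidates:
        raise LookupError(f"no validated model for {front_contract}")
    return max(candidates, key=lambda r: r.trained_on)
```

Failing loudly here is the point: a missing model should trigger the deterministic fallback, not a stale one from last quarter.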

Observability

I log every Machine B decision:

Timestamp
Features
Predicted probability
Action (tighten/hold)
Outcome (forward window result)

This turns live trading into a growing training dataset.

Machine B learns.
I audit.
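
A minimal sketch of such a decision log, written as one JSON object per line. The field names follow the list above. The file format, the function names, and the choice to leave `outcome` empty until the forward window has closed are assumptions.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Decision:
    timestamp: str                    # ISO 8601 string
    features: dict                    # feature name -> value
    probability: float                # predicted P(good tighten)
    action: str                       # "tighten" or "hold"
    outcome: Optional[float] = None   # filled in after the forward window

def append_decision(path, decision):
    """Append one decision as a JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(decision)) + "\n")

def load_decisions(path):
    """Read the log back, e.g. for auditing or as future training rows."""
    with open(path, encoding="utf-8") as fh:
        return [Decision(**json.loads(line)) for line in fh if line.strip()]
```

An append-only file keeps the audit trail simple: rows are never rewritten, and labeled outcomes can be joined later by timestamp.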

A Philosophical Shift

Machine B changed how I think about trading.

Entries predict direction.
Exits shape distribution.

Machine B is not about predicting markets.
It is about engineering risk.

The Engineering Lesson

Trading systems fail quietly.

A model that is slightly wrong doesn’t crash.
It bleeds.

Production engineering is about detecting and preventing silent decay.

Rolling retraining.
Confidence gating.
Rule-based fallbacks.

None of that is glamorous.
All of it matters.

Final Thought

Machine B is not a holy grail.

It is a second brain: systematic, consistent, and adaptive.

That was the goal from the beginning.

If trailing stops are systematic, consistent, and adaptive, the equity curve will take care of itself.

This series is the blueprint of how I tried to make that statement true.
