TradeReady.io

Gymnasium Overview

Train RL trading agents with tradeready-gym, a fully Gymnasium-compliant package


tradeready-gym is a Python package that wraps the TradeReady backtesting engine in a standard Gymnasium interface. You can plug it into any RL framework that speaks Gymnasium — Stable-Baselines3, RLlib, CleanRL, and others.


What Is It?

The package gives you training environments where your model controls a virtual trading agent. At each step, the model receives market observations and account state, chooses an action (buy, sell, hold, or a continuous position size), and receives a reward signal.

The environments are backed by the same sandbox used for manual backtesting: real Binance candle data, 0.1% trading fees, and realistic slippage. There is no look-ahead bias — the DataReplayer enforces a strict WHERE bucket <= virtual_clock constraint.
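The constraint can be pictured as a simple time filter over candle buckets. A minimal sketch (the candle records and the visible_candles helper are illustrative only, not the actual DataReplayer internals):

```python
from datetime import datetime, timezone

# Illustrative hourly candles, keyed by their bucket (open time).
candles = [
    {"bucket": datetime(2025, 1, 1, h, tzinfo=timezone.utc), "close": 42000.0 + h}
    for h in range(6)
]

def visible_candles(candles, virtual_clock):
    """Return only candles at or before the virtual clock,
    mirroring the WHERE bucket <= virtual_clock constraint."""
    return [c for c in candles if c["bucket"] <= virtual_clock]

now = datetime(2025, 1, 1, 3, tzinfo=timezone.utc)
seen = visible_candles(candles, now)
# Only the candles for hours 0-3 are visible; future candles stay hidden.
```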

What you get:

  • 7 pre-registered environments for single-asset, multi-asset, and live trading
  • 5 reward functions (PnL, log return, Sharpe, Sortino, drawdown penalty)
  • 3 environment wrappers (feature engineering, normalization, batch stepping)
  • Automatic training progress reporting to the platform dashboard via TrainingTracker
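For a flavor of what a reward function computes, here is a minimal sketch of a log-return reward derived from successive equity values — the function name and signature are assumptions for illustration, not the package's actual API:

```python
import math

def log_return_reward(prev_equity: float, equity: float) -> float:
    """Reward one step as the log return of account equity.

    Log returns are additive across steps, so the cumulative episode
    reward equals log(final_equity / starting_equity).
    """
    return math.log(equity / prev_equity)

# Equity path 10000 -> 10100 -> 10050: two step rewards that sum
# to the log of the overall episode return.
r1 = log_return_reward(10000.0, 10100.0)
r2 = log_return_reward(10100.0, 10050.0)
total = r1 + r2
```

Additivity is the usual reason to prefer log returns over raw PnL for per-step rewards.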

Installation

pip install tradeready-gym

# With RL frameworks
pip install tradeready-gym stable-baselines3 torch

Requirements: Python 3.12+, Gymnasium >= 0.29, numpy >= 1.26, httpx >= 0.28, and a running TradeReady platform instance.


Quick Start

import gymnasium as gym
import tradeready_gym  # registers all environments

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-03-01T00:00:00Z",
)

obs, info = env.reset()

for _ in range(1000):
    action = env.action_space.sample()  # replace with your model
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()

The 5-Tuple Step API

Every call to env.step(action) returns:

| Value | Type | Description |
|---|---|---|
| obs | np.ndarray | Market observations and account state |
| reward | float | Signal from the reward function |
| terminated | bool | True when the episode ends at the final candle |
| truncated | bool | True if ended early (e.g. account blown out) |
| info | dict | Step details: equity, filled orders, virtual time |
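To make the terminated/truncated distinction concrete, here is a tiny stub that follows the same 5-tuple contract — illustrative only; real environments come from gym.make:

```python
class StubTradingEnv:
    """Minimal stand-in for the 5-tuple contract: terminated=True at
    the final candle, truncated=True if the account blows out early."""

    def __init__(self, total_steps=5, blow_out_at=None):
        self.total_steps = total_steps
        self.blow_out_at = blow_out_at
        self.step_count = 0

    def reset(self):
        self.step_count = 0
        return [0.0], {"step": 0}

    def step(self, action):
        self.step_count += 1
        terminated = self.step_count >= self.total_steps
        truncated = (
            self.blow_out_at is not None and self.step_count >= self.blow_out_at
        )
        return [0.0], 0.0, terminated, truncated, {"step": self.step_count}

env = StubTradingEnv(total_steps=5)
obs, info = env.reset()
terminated = truncated = False
while not (terminated or truncated):
    obs, reward, terminated, truncated, info = env.step(0)
# The loop runs all 5 candles and exits with terminated=True.
```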

The info dictionary includes:

{
    "equity": 10245.30,
    "available_cash": 8000.00,
    "position_value": 2245.30,
    "unrealized_pnl": 245.30,
    "virtual_time": "2025-01-15T14:00:00Z",
    "step": 336,
    "total_steps": 1464,
    "filled_orders": [],
    "progress_pct": 22.95
}
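The fields are internally consistent: equity decomposes into available cash plus position value, and progress_pct follows from the step counters. A quick sanity check on the sample above (assuming those relationships, which the numbers bear out):

```python
info = {
    "equity": 10245.30,
    "available_cash": 8000.00,
    "position_value": 2245.30,
    "step": 336,
    "total_steps": 1464,
}

# Equity is cash plus the marked-to-market value of open positions.
assert abs(info["equity"] - (info["available_cash"] + info["position_value"])) < 1e-9

# Progress through the episode, as a percentage of total candles.
progress = round(100 * info["step"] / info["total_steps"], 2)
```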

Gymnasium Compliance

tradeready-gym implements the full Gymnasium Env interface:

env.reset(seed=42)             # reproducible episode starts
env.step(action)               # 5-tuple return
env.action_space               # gymnasium.spaces.Space subclass
env.observation_space          # gymnasium.spaces.Box
env.render()                   # optional — not implemented (trading has no visual render)
env.close()                    # completes the training run and cleans up the backtest session

All action and observation spaces are properly defined as gymnasium.spaces objects, so env.action_space.sample() always returns a valid action and env.observation_space.contains(obs) always returns True.

Always call env.close() when training is finished. This marks the training run as complete in the platform dashboard and releases the backtest session.
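Because close() must run even when the training loop raises, wrapping it in try/finally (or contextlib.closing) is a safe habit. A sketch with a stand-in object in place of a real environment:

```python
class FakeEnv:
    """Stand-in with the same close() obligation as a real environment."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

env = FakeEnv()
try:
    try:
        raise RuntimeError("training crashed")  # simulate a mid-run failure
    finally:
        env.close()  # still runs, so the run is marked complete
except RuntimeError:
    pass
# env.closed is True even though the training loop raised.
```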


Three Workflow Types

The gym supports three distinct workflows:

| Workflow | Use Case | Environment |
|---|---|---|
| Historical training | Train on past data, evaluate on held-out periods | TradeReady-BTC-v0, TradeReady-BTC-Continuous-v0, TradeReady-Portfolio-v0 |
| Live paper trading | Deploy a trained model to trade in real time | TradeReady-Live-v0 |
| RL training pipeline | Full cycle: train → track → compare → deploy best | Any environment with track_training=True |

Training Tracking

Set track_training=True (the default) to automatically sync training progress with the platform dashboard:

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    track_training=True,
    strategy_label="ppo_v1",
)

The tracker:

  1. Registers a training run on the first reset() call
  2. Reports episode metrics (ROI, Sharpe, drawdown, trades, cumulative reward) after each completed episode
  3. Marks the run as complete on env.close()

Training runs are visible at /training in the dashboard. See Training Tracking for full details.
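For intuition, two of the reported metrics can be computed from an episode's equity curve like so — these are the common textbook definitions, and the platform's exact formulas may differ:

```python
def episode_metrics(equity_curve, starting_balance):
    """ROI and maximum drawdown from a per-step equity curve.

    ROI is the percentage change from the starting balance; max
    drawdown is the largest peak-to-trough decline, as a percentage
    of the peak.
    """
    roi_pct = 100 * (equity_curve[-1] / starting_balance - 1)
    peak = equity_curve[0]
    max_dd = 0.0
    for eq in equity_curve:
        peak = max(peak, eq)
        max_dd = max(max_dd, (peak - eq) / peak)
    return roi_pct, 100 * max_dd

roi, dd = episode_metrics([10000, 10500, 10200, 11000], 10000)
# ROI is 10%; max drawdown is the 10500 -> 10200 dip (~2.86%).
```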


Next Steps

  • Environments — all 7 registered environments, action spaces, observation space
  • Rewards — 5 reward functions and how to write custom ones
  • Training Tracking — syncing runs to the dashboard
  • Examples — complete working scripts
