Gymnasium Overview
Train RL trading agents with tradeready-gym, a fully Gymnasium-compliant package
tradeready-gym is a Python package that wraps the TradeReady backtesting engine in a standard Gymnasium interface. You can plug it into any RL framework that speaks Gymnasium — Stable-Baselines3, RLlib, CleanRL, and others.
What Is It?
The package gives you training environments where your model controls a virtual trading agent. At each step, the model receives market observations and account state, chooses an action (buy, sell, hold, or a continuous position size), and receives a reward signal.
The environments are backed by the same sandbox used for manual backtesting: real Binance candle data, 0.1% trading fees, and realistic slippage. There is no look-ahead bias — the DataReplayer enforces a strict WHERE bucket <= virtual_clock constraint.
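To make the no-look-ahead guarantee concrete, here is a minimal sketch of the filtering rule. The function name and candle layout are illustrative, not part of the DataReplayer's actual API:

```python
from datetime import datetime, timezone

def visible_candles(candles, virtual_clock):
    """Return only candles whose bucket (open time) is at or before the
    virtual clock -- the same rule the WHERE bucket <= virtual_clock
    query enforces on the server side."""
    return [c for c in candles if c["bucket"] <= virtual_clock]

# Six hourly candles; the virtual clock sits at 03:00.
candles = [
    {"bucket": datetime(2025, 1, 1, h, tzinfo=timezone.utc), "close": 42000 + h}
    for h in range(6)
]
clock = datetime(2025, 1, 1, 3, tzinfo=timezone.utc)

window = visible_candles(candles, clock)
# The agent sees the four candles from 00:00 through 03:00; the 04:00 and
# 05:00 candles are invisible until the clock advances past them.
```

Because each step only ever advances the virtual clock, the observation at step t can never include a candle from step t+1.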
What you get:
- 7 pre-registered environments for single-asset, multi-asset, and live trading
- 5 reward functions (PnL, log return, Sharpe, Sortino, drawdown penalty)
- 3 environment wrappers (feature engineering, normalization, batch stepping)
- Automatic training progress reporting to the platform dashboard via TrainingTracker
Installation
```shell
pip install tradeready-gym

# With RL frameworks
pip install tradeready-gym stable-baselines3 torch
```
Requirements: Python 3.12+, Gymnasium >= 0.29, numpy >= 1.26, httpx >= 0.28, and a running TradeReady platform instance.
Quick Start
```python
import gymnasium as gym
import tradeready_gym  # registers all environments

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-03-01T00:00:00Z",
)

obs, info = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # replace with your model
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```
The 5-Tuple Step API
Every call to env.step(action) returns:
| Value | Type | Description |
|---|---|---|
| obs | np.ndarray | Market observations and account state |
| reward | float | Signal from the reward function |
| terminated | bool | True when the episode ends at the final candle |
| truncated | bool | True if the episode ended early (e.g. account blown out) |
| info | dict | Step details: equity, filled orders, virtual time |
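The terminated/truncated distinction can be illustrated with a toy stand-in. This class is purely illustrative and shares nothing with the real environment beyond the 5-tuple contract:

```python
class ToyTradingEnv:
    """Toy stand-in (not the real environment) that mimics the 5-tuple
    contract: terminated at the final candle, truncated on blow-out."""

    def __init__(self, total_steps=5, starting_equity=100.0):
        self.total_steps = total_steps
        self.starting_equity = starting_equity

    def reset(self):
        self.step_n = 0
        self.equity = self.starting_equity
        return [self.equity], {"step": 0}

    def step(self, action):
        self.step_n += 1
        self.equity += action  # toy PnL: action is the equity change
        reward = float(action)
        terminated = self.step_n >= self.total_steps  # final candle reached
        truncated = self.equity <= 0.0                # account blown out
        info = {"equity": self.equity, "step": self.step_n}
        return [self.equity], reward, terminated, truncated, info

env = ToyTradingEnv()
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(-100.0)
# truncated is True here: equity hit zero before the final candle,
# while terminated stays False.
```

In a training loop you treat both flags as episode boundaries, but they let logging and evaluation distinguish a full run from an early blow-out.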
The info dictionary includes:
```json
{
  "equity": 10245.30,
  "available_cash": 8000.00,
  "position_value": 2245.30,
  "unrealized_pnl": 245.30,
  "virtual_time": "2025-01-15T14:00:00Z",
  "step": 336,
  "total_steps": 1464,
  "filled_orders": [],
  "progress_pct": 22.95
}
```
Gymnasium Compliance
tradeready-gym implements the full Gymnasium Env interface:
```python
env.reset(seed=42)     # reproducible episode starts
env.step(action)       # 5-tuple return
env.action_space       # gymnasium.spaces.Space subclass
env.observation_space  # gymnasium.spaces.Box
env.render()           # optional — not implemented (trading has no visual render)
env.close()            # completes the training run and cleans up the backtest session
```
All action and observation spaces are properly defined as gymnasium.spaces objects, so env.action_space.sample() always returns a valid action and env.observation_space.contains(obs) always returns True.
Always call env.close() when training is finished. This marks the training run as complete in the platform dashboard and releases the backtest session.
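One way to guarantee that call even when training code raises is contextlib.closing, shown here with a minimal stand-in object. DummyEnv is illustrative; any object with a close() method works the same way:

```python
from contextlib import closing

class DummyEnv:
    """Minimal stand-in that records whether close() was called."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

env = DummyEnv()
with closing(env):  # close() runs even if the body raises
    pass            # ... training loop would go here ...

# env.closed is now True, so the run would be marked complete
# on the platform even after an exception mid-training.
```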
Three Workflow Types
The gym supports three distinct workflows:
| Workflow | Use Case | Environment |
|---|---|---|
| Historical training | Train on past data, evaluate on held-out periods | TradeReady-BTC-v0, TradeReady-BTC-Continuous-v0, TradeReady-Portfolio-v0 |
| Live paper trading | Deploy a trained model to trade in real time | TradeReady-Live-v0 |
| RL training pipeline | Full cycle: train → track → compare → deploy best | Any environment with track_training=True |
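For the historical-training workflow, a common pattern is to carve one date range into a training window and a held-out evaluation window, then pass each to its own environment via start_time/end_time. A sketch under assumptions — split_period and the 80/20 split are illustrative, not part of tradeready-gym:

```python
from datetime import datetime, timezone

def split_period(start, end, train_frac=0.8):
    """Split a historical window into contiguous train and held-out
    evaluation periods (an illustrative partitioning helper)."""
    cut = start + (end - start) * train_frac
    return (start, cut), (cut, end)

train, test = split_period(
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 3, 1, tzinfo=timezone.utc),
)
# Pass train[0]/train[1] as start_time/end_time to the training env,
# and test[0]/test[1] to a separate evaluation env.
```

Keeping the evaluation window strictly after the training window avoids leaking future market regimes into training.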
Training Tracking
Set track_training=True (the default) to automatically sync training progress with the platform dashboard:
```python
env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    track_training=True,
    strategy_label="ppo_v1",
)
```
The tracker:
- Registers a training run on the first reset() call
- Reports episode metrics (ROI, Sharpe, drawdown, trades, cumulative reward) after each completed episode
- Marks the run as complete on env.close()
Training runs are visible at /training in the dashboard. See Training Tracking for full details.
Next Steps
- Environments — all 7 registered environments, action spaces, observation space
- Rewards — 5 reward functions and how to write custom ones
- Training Tracking — syncing runs to the dashboard
- Examples — complete working scripts