---
title: Gymnasium Overview
description: Train RL trading agents with tradeready-gym, a fully Gymnasium-compliant package
---

`tradeready-gym` is a Python package that wraps the TradeReady backtesting engine in a standard [Gymnasium](https://gymnasium.farama.org/) interface. You can plug it into any RL framework that speaks Gymnasium — Stable-Baselines3, RLlib, CleanRL, and others.

---

## What Is It?

The package gives you training environments where your model controls a virtual trading agent. At each step, the model receives market observations and account state, chooses an action (buy, sell, hold, or a continuous position size), and receives a reward signal.

The environments are backed by the same sandbox used for manual backtesting: real Binance candle data, 0.1% trading fees, and realistic slippage. There is no look-ahead bias — the `DataReplayer` enforces a strict `WHERE bucket <= virtual_clock` constraint.
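The constraint is easy to picture with a minimal replay loop. The sketch below is pure Python with made-up candle data, not the actual `DataReplayer` implementation; it only illustrates the idea that candles past the virtual clock are never visible:

```python
from datetime import datetime

# Hypothetical hourly candles keyed by bucket start time (illustrative data).
candles = {
    datetime(2025, 1, 1, h): {"close": 42000.0 + 10 * h} for h in range(6)
}

def visible_candles(virtual_clock: datetime) -> dict:
    """Return only candles satisfying bucket <= virtual_clock,
    mirroring the replayer's WHERE constraint."""
    return {b: c for b, c in candles.items() if b <= virtual_clock}

clock = datetime(2025, 1, 1, 2)
# At 02:00 only the 00:00, 01:00, and 02:00 buckets are visible;
# later candles do not exist from the agent's point of view.
assert all(b <= clock for b in visible_candles(clock))
```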

**What you get:**

- 7 pre-registered environments for single-asset, multi-asset, and live trading
- 5 reward functions (PnL, log return, Sharpe, Sortino, drawdown penalty)
- 3 environment wrappers (feature engineering, normalization, batch stepping)
- Automatic training progress reporting to the platform dashboard via `TrainingTracker`
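To give a feel for what a per-step reward can look like, here is a minimal log-return reward computed from consecutive equity values. This is an illustrative sketch only; the package's actual reward functions and their exact definitions are documented on the Rewards page:

```python
import math

def log_return_reward(prev_equity: float, equity: float) -> float:
    """Per-step log return: log(equity_t / equity_{t-1})."""
    return math.log(equity / prev_equity)

# Log returns compose additively: two steps sum to the
# log return over the whole interval.
r1 = log_return_reward(10000.0, 10100.0)
r2 = log_return_reward(10100.0, 10201.0)
assert abs((r1 + r2) - math.log(10201.0 / 10000.0)) < 1e-12
```

The additivity shown in the final assertion is why log returns are a popular reward shape: the episode return is just the sum of step rewards.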

---

## Installation

```bash
pip install tradeready-gym

# With RL frameworks
pip install tradeready-gym stable-baselines3 torch
```

**Requirements:** Python 3.12+, Gymnasium >= 0.29, numpy >= 1.26, httpx >= 0.28, and a running TradeReady platform instance.

---

## Quick Start

```python
import gymnasium as gym
import tradeready_gym  # registers all environments

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-03-01T00:00:00Z",
)

obs, info = env.reset()

for _ in range(1000):
    action = env.action_space.sample()  # replace with your model
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```

---

## The 5-Tuple Step API

Every call to `env.step(action)` returns:

| Value | Type | Description |
|-------|------|-------------|
| `obs` | `np.ndarray` | Market observations and account state |
| `reward` | `float` | Signal from the reward function |
| `terminated` | `bool` | `True` when the episode ends at the final candle |
| `truncated` | `bool` | `True` if ended early (e.g. account blown out) |
| `info` | `dict` | Step details: equity, filled orders, virtual time |

The `info` dictionary includes:

```python
{
    "equity": 10245.30,
    "available_cash": 8000.00,
    "position_value": 2245.30,
    "unrealized_pnl": 245.30,
    "virtual_time": "2025-01-15T14:00:00Z",
    "step": 336,
    "total_steps": 1464,
    "filled_orders": [],
    "progress_pct": 22.95
}
```
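The `info` dict is useful for custom logging or early stopping outside the reward function. The sketch below tracks peak equity and maximum drawdown from a sequence of step infos; it runs against synthetic dicts here, but the same function works on the `info` values returned by `env.step()`:

```python
def track_drawdown(infos) -> float:
    """Compute max drawdown (as a fraction of peak equity)
    from a sequence of step info dicts."""
    peak = float("-inf")
    max_dd = 0.0
    for info in infos:
        equity = info["equity"]
        peak = max(peak, equity)
        max_dd = max(max_dd, (peak - equity) / peak)
    return max_dd

# Synthetic equity path: rises to 11000, dips to 9900, recovers.
infos = [{"equity": e} for e in (10000.0, 11000.0, 9900.0, 10500.0)]
# Worst drawdown is (11000 - 9900) / 11000 = 10%.
assert abs(track_drawdown(infos) - 0.1) < 1e-12
```

In a training loop you would append each step's `info` and, for example, truncate the episode yourself once drawdown exceeds a threshold.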

---

## Gymnasium Compliance

`tradeready-gym` implements the full Gymnasium `Env` interface:

```python
env.reset(seed=42)             # reproducible episode starts
env.step(action)               # 5-tuple return
env.action_space               # gymnasium.spaces.Space subclass
env.observation_space          # gymnasium.spaces.Box
env.render()                   # optional — not implemented (trading has no visual render)
env.close()                    # completes the training run and cleans up the backtest session
```

All action and observation spaces are properly defined as `gymnasium.spaces` objects, so `env.action_space.sample()` always returns a valid action and `env.observation_space.contains(obs)` always returns `True`.

> **Info:**
> Always call `env.close()` when training is finished. This marks the training run as complete in the platform dashboard and releases the backtest session.

---

## Three Workflow Types

The gym supports three distinct workflows:

| Workflow | Use Case | Environment |
|----------|----------|-------------|
| **Historical training** | Train on past data, evaluate on held-out periods | `TradeReady-BTC-v0`, `TradeReady-BTC-Continuous-v0`, `TradeReady-Portfolio-v0` |
| **Live paper trading** | Deploy a trained model to trade in real time | `TradeReady-Live-v0` |
| **RL training pipeline** | Full cycle: train → track → compare → deploy best | Any environment with `track_training=True` |
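For historical training, the usual pattern is to train and evaluate on non-overlapping date ranges passed as `start_time`/`end_time`. A small chronological-split helper (an illustrative utility, not part of the package):

```python
from datetime import datetime

def split_range(start: datetime, end: datetime, train_frac: float = 0.8):
    """Split [start, end) chronologically into train and eval windows."""
    cut = start + (end - start) * train_frac
    return (start, cut), (cut, end)

train, evaluation = split_range(
    datetime(2025, 1, 1), datetime(2025, 3, 1), train_frac=0.8
)
# Pass train[0]/train[1] as start_time/end_time to the training env,
# and evaluation[0]/evaluation[1] to a separate held-out evaluation env.
```

Splitting chronologically (rather than randomly) matters for market data: shuffling would leak future regimes into the training window.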

---

## Training Tracking

Set `track_training=True` (the default) to automatically sync training progress with the platform dashboard:

```python
env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    track_training=True,
    strategy_label="ppo_v1",
)
```

The tracker:
1. Registers a training run on the first `reset()` call
2. Reports episode metrics (ROI, Sharpe, drawdown, trades, cumulative reward) after each completed episode
3. Marks the run as complete on `env.close()`

Training runs are visible at `/training` in the dashboard. See [Training Tracking](/docs/gym/training-tracking) for full details.
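The per-episode metrics can also be reproduced locally from an equity curve. The sketch below computes ROI and a simple per-step Sharpe ratio; these are illustrative formulas, and the platform's exact definitions (e.g. annualization, risk-free rate) may differ:

```python
import math

def episode_metrics(equity_curve):
    """ROI and a simple (non-annualized) Sharpe from an equity curve."""
    roi = equity_curve[-1] / equity_curve[0] - 1.0
    returns = [b / a - 1.0 for a, b in zip(equity_curve, equity_curve[1:])]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / len(returns)
    sharpe = mean / math.sqrt(var) if var > 0 else 0.0
    return {"roi": roi, "sharpe": sharpe}

metrics = episode_metrics([10000.0, 10100.0, 10050.0, 10300.0])
# roi ≈ 0.03 for this curve; sharpe is positive since mean return is positive.
```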

---

## Next Steps

- [Environments](/docs/gym/environments) — all 7 registered environments, action spaces, observation space
- [Rewards](/docs/gym/rewards) — 5 reward functions and how to write custom ones
- [Training Tracking](/docs/gym/training-tracking) — syncing runs to the dashboard
- [Examples](/docs/gym/examples) — complete working scripts
