
Environments

All 7 registered Gymnasium environments — action spaces, observation spaces, and configuration


Import tradeready_gym to register all environments before calling gym.make():

import gymnasium as gym
import tradeready_gym  # registers all 7 environments

Registered Environments

| Environment ID | Action Space | Assets | Mode |
|---|---|---|---|
| TradeReady-BTC-v0 | Discrete(3) | BTC | Historical |
| TradeReady-ETH-v0 | Discrete(3) | ETH | Historical |
| TradeReady-SOL-v0 | Discrete(3) | SOL | Historical |
| TradeReady-BTC-Continuous-v0 | Box(-1, 1, (1,)) | BTC | Historical |
| TradeReady-ETH-Continuous-v0 | Box(-1, 1, (1,)) | ETH | Historical |
| TradeReady-Portfolio-v0 | Box(0, 1, (N,)) | Any N pairs | Historical |
| TradeReady-Live-v0 | Discrete(3) | Any pairs | Live paper trading |

SingleAssetTradingEnv — Discrete

The three discrete environments (TradeReady-BTC-v0, TradeReady-ETH-v0, TradeReady-SOL-v0) use a three-action space:

| Action | Value | Effect |
|---|---|---|
| Hold | 0 | Do nothing |
| Buy | 1 | Buy with 10% of current equity |
| Sell | 2 | Close the current position |

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    lookback_window=30,
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-03-01T00:00:00Z",
)

# action_space = Discrete(3)
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(1)  # Buy

Configuration parameters:

| Parameter | Default | Description |
|---|---|---|
| api_key | required | Your agent's API key |
| starting_balance | 10000 | Starting USDT balance |
| timeframe | "1h" | Candle interval for indicators |
| lookback_window | 30 | Number of past candles in each observation |
| start_time | required | ISO timestamp for the episode start |
| end_time | required | ISO timestamp for the episode end |
| observation_features | ["ohlcv"] | Which features to include in observations |
| reward_function | PnLReward() | Reward function instance |
| track_training | True | Auto-report to the training dashboard |
| strategy_label | None | Label for the training run |
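The discrete action semantics above (Buy uses 10% of current equity; Sell closes the position) can be sketched as a standalone function. This is an illustration of the documented behavior, not the library's internal implementation, and `apply_discrete_action` is a hypothetical helper name:

```python
def apply_discrete_action(action: int, equity: float,
                          position_qty: float, price: float) -> float:
    """Map a Discrete(3) action to a signed order quantity (sketch).

    Returns the base-asset quantity to trade: positive = buy, negative = sell.
    """
    if action == 0:                       # Hold: do nothing
        return 0.0
    if action == 1:                       # Buy with 10% of current equity
        return (0.10 * equity) / price
    if action == 2:                       # Sell: close the current position
        return -position_qty
    raise ValueError(f"invalid action: {action}")
```

For example, with 10,000 USDT of equity and BTC at 50,000, action 1 buys roughly 0.02 BTC.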

SingleAssetTradingEnv — Continuous

The continuous environments (TradeReady-BTC-Continuous-v0, TradeReady-ETH-Continuous-v0) use a Box action space where the value represents both direction and magnitude:

| Signal range | Interpretation |
|---|---|
| -0.05 to 0.05 | Dead zone: Hold |
| > 0.05 | Buy: quantity = signal * position_size_pct * equity / price |
| < -0.05 | Sell: same formula with the absolute value of the signal |

env = gym.make(
    "TradeReady-BTC-Continuous-v0",
    api_key="ak_live_...",
    starting_balance=10000,
)

# action_space = Box(-1.0, 1.0, shape=(1,), dtype=float32)
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step([0.7])  # Buy 70% of position size

Continuous environments are preferred for algorithms such as PPO, SAC, and TD3, where the policy can learn nuanced position sizing rather than fixed-size entries.
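The signal interpretation above can be sketched in isolation. This is an illustrative reading of the documented dead zone and sizing formula, not library code; the assumption that sells are capped at the currently open position is flagged in the comments:

```python
DEAD_ZONE = 0.05

def interpret_signal(signal: float, position_size_pct: float, equity: float,
                     price: float, position_qty: float) -> tuple[str, float]:
    """Sketch of the continuous-action semantics (hypothetical helper)."""
    if -DEAD_ZONE <= signal <= DEAD_ZONE:
        return ("hold", 0.0)
    # quantity = |signal| * position_size_pct * equity / price
    qty = abs(signal) * position_size_pct * equity / price
    if signal > DEAD_ZONE:
        return ("buy", qty)
    # Assumption: a sell cannot exceed the currently open position.
    return ("sell", min(qty, position_qty))
```

With a signal of 0.7, a 10% position size, 10,000 USDT of equity, and a price of 50,000, this buys about 0.014 BTC.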


MultiAssetTradingEnv — Portfolio

TradeReady-Portfolio-v0 takes target portfolio weights as actions. The environment rebalances to match the targets on each step:

env = gym.make(
    "TradeReady-Portfolio-v0",
    api_key="ak_live_...",
    pairs=["BTCUSDT", "ETHUSDT", "SOLUSDT"],
    starting_balance=50000,
)

# action_space = Box(0.0, 1.0, shape=(3,), dtype=float32)
obs, info = env.reset()

# Allocate 50% BTC, 30% ETH, 20% SOL
obs, reward, terminated, truncated, info = env.step([0.5, 0.3, 0.2])

If the weights sum to more than 1.0, they are normalized. The remainder of equity stays as cash (USDT).
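The normalization rule can be sketched as follows. This is an illustration of the documented behavior (`normalize_weights` is a hypothetical helper, not a tradeready_gym function):

```python
def normalize_weights(weights: list[float]) -> tuple[list[float], float]:
    """Return (asset weights, cash fraction) per the documented rule (sketch).

    Weights summing to more than 1.0 are scaled down proportionally;
    any shortfall below 1.0 is left as cash (USDT).
    """
    total = sum(weights)
    if total > 1.0:
        weights = [w / total for w in weights]
        total = 1.0
    return weights, 1.0 - total
```

For example, [0.5, 0.3] leaves 20% of equity in USDT, while [0.8, 0.6, 0.6] is scaled down to [0.4, 0.3, 0.3] with no cash remainder.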


LiveTradingEnv

TradeReady-Live-v0 connects to real-time Binance prices instead of historical data. It never terminates on its own — it runs until env.close() is called.

env = gym.make(
    "TradeReady-Live-v0",
    api_key="ak_live_...",
    pairs=["BTCUSDT"],
    step_interval_sec=60,  # Wait 60 seconds between steps
)

obs, info = env.reset()
while True:
    action, _ = model.predict(obs)
    obs, reward, _, _, info = env.step(action)
    # Loop forever — never sets terminated=True

The live environment uses your agent's actual virtual balance. Unlike the historical environments, there is no isolated sandbox — trades affect your real agent account.
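Because the live environment only stops when env.close() is called, it is worth wrapping the loop so the close always happens, even on Ctrl-C. The pattern below is a sketch; StubLiveEnv is a stand-in with the Gymnasium step/reset/close surface so the shutdown logic can be shown without a live connection, and run_live is a hypothetical helper:

```python
class StubLiveEnv:
    """Stand-in for TradeReady-Live-v0 (illustration only)."""
    def __init__(self):
        self.closed = False
    def reset(self):
        return [0.0], {}
    def step(self, action):
        # Live env never sets terminated/truncated on its own
        return [0.0], 0.0, False, False, {}
    def close(self):
        self.closed = True

def run_live(env, policy, max_steps=None):
    """Run the live loop, always closing the env on exit or interrupt."""
    obs, info = env.reset()
    steps = 0
    try:
        while max_steps is None or steps < max_steps:
            obs, reward, _, _, info = env.step(policy(obs))
            steps += 1
    except KeyboardInterrupt:
        pass  # graceful shutdown on Ctrl-C
    finally:
        env.close()
    return steps

env = StubLiveEnv()
n = run_live(env, policy=lambda obs: 0, max_steps=5)
```

With a real model, policy would be `lambda obs: model.predict(obs)[0]` and max_steps would be omitted.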


Observation Space

All environments share the same observation builder. Configure what your model sees via observation_features:

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    lookback_window=30,
    observation_features=[
        "ohlcv",           # Open, High, Low, Close, Volume — 5 dims per candle
        "rsi_14",          # RSI normalized to [0, 1] — 1 dim per candle
        "macd",            # MACD line, signal, histogram — 3 dims per candle
        "bollinger",       # Upper, middle, lower bands — 3 dims per candle
        "volume",          # Raw volume — 1 dim per candle
        "adx",             # Trend strength — 1 dim per candle
        "atr",             # Average True Range — 1 dim per candle
        "balance",         # Cash / starting_balance — 1 scalar
        "position",        # Position value / equity — 1 scalar
        "unrealized_pnl",  # Unrealized PnL / equity — 1 scalar
    ]
)

Feature Dimensions

| Feature | Dims per candle | Type |
|---|---|---|
| ohlcv | 5 | Windowed (repeated for each candle in lookback_window) |
| rsi_14 | 1 | Windowed |
| macd | 3 | Windowed |
| bollinger | 3 | Windowed |
| volume | 1 | Windowed |
| adx | 1 | Windowed |
| atr | 1 | Windowed |
| balance | 1 | Scalar (appended once at the end) |
| position | 1 | Scalar |
| unrealized_pnl | 1 | Scalar |

Observation shape formula:

obs_size = (lookback_window × windowed_dims × n_assets) + scalar_dims

Example (BTC only, all features, window=30):
  = (30 × 15 × 1) + 3 = 453

The observation space is always a Box(shape=(obs_size,), dtype=float32) with range [-inf, inf]. Apply NormalizationWrapper to bring values into [-1, 1] before feeding to a neural network.
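The shape formula can be computed directly from the feature table. The sketch below encodes the documented per-feature dimensions (`obs_size` is a hypothetical helper, useful for sizing a network's input layer before constructing the env):

```python
# Per-candle dimensions for windowed features, per the table above
WINDOWED_DIMS = {
    "ohlcv": 5, "rsi_14": 1, "macd": 3, "bollinger": 3,
    "volume": 1, "adx": 1, "atr": 1,
}
# Scalar features appended once at the end of the observation
SCALAR_DIMS = {"balance": 1, "position": 1, "unrealized_pnl": 1}

def obs_size(features: list[str], lookback_window: int, n_assets: int = 1) -> int:
    """obs_size = (lookback_window * windowed_dims * n_assets) + scalar_dims"""
    windowed = sum(WINDOWED_DIMS.get(f, 0) for f in features)
    scalars = sum(SCALAR_DIMS.get(f, 0) for f in features)
    return lookback_window * windowed * n_assets + scalars
```

With all ten features, a 30-candle window, and one asset, this reproduces the 453-dimension example above.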


Wrappers

Three wrappers are available to enhance environments:

from tradeready_gym.wrappers import (
    FeatureEngineeringWrapper,
    NormalizationWrapper,
    BatchStepWrapper,
)

env = gym.make("TradeReady-BTC-v0", api_key="ak_live_...")

# Add SMA ratios and momentum to observations
env = FeatureEngineeringWrapper(env, periods=[5, 10, 20])

# Normalize observations to [-1, 1] using online z-score
env = NormalizationWrapper(env, clip=1.0)

# Execute 5 underlying steps per action (reduces HTTP overhead)
env = BatchStepWrapper(env, n_steps=5)

| Wrapper | Effect | When to use |
|---|---|---|
| FeatureEngineeringWrapper | Adds SMA ratios and price momentum to the observation | When you want derived features without a custom observation space |
| NormalizationWrapper | Online z-score normalization, clipped to [-1, 1] | Always recommended for neural network training |
| BatchStepWrapper | Executes N underlying steps per agent action, summing the rewards | Reducing API call overhead during training |
