# Environments

Reference for all seven registered Gymnasium environments: action spaces, observation spaces, and configuration parameters.
Import `tradeready_gym` to register all environments before calling `gym.make()`:

```python
import gymnasium as gym
import tradeready_gym  # registers all 7 environments
```
## Registered Environments
| Environment ID | Action Space | Assets | Mode |
|---|---|---|---|
| `TradeReady-BTC-v0` | `Discrete(3)` | BTC | Historical |
| `TradeReady-ETH-v0` | `Discrete(3)` | ETH | Historical |
| `TradeReady-SOL-v0` | `Discrete(3)` | SOL | Historical |
| `TradeReady-BTC-Continuous-v0` | `Box(-1, 1, (1,))` | BTC | Historical |
| `TradeReady-ETH-Continuous-v0` | `Box(-1, 1, (1,))` | ETH | Historical |
| `TradeReady-Portfolio-v0` | `Box(0, 1, (N,))` | Any N pairs | Historical |
| `TradeReady-Live-v0` | `Discrete(3)` | Any pairs | Live paper trading |
## SingleAssetTradingEnv — Discrete

The three discrete environments (`TradeReady-BTC-v0`, `TradeReady-ETH-v0`, `TradeReady-SOL-v0`) share a three-action space:
| Action | Value | Effect |
|---|---|---|
| Hold | 0 | Do nothing |
| Buy | 1 | Buy with 10% of current equity |
| Sell | 2 | Close the current position |
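As a plain-Python illustration of the table above (a sketch, not the environment's internal code), the three actions act on a toy account like this. The 10% sizing comes from the documented Buy behavior; treating equity as cash plus position value is an assumption here:

```python
def apply_discrete_action(action: int, cash: float, position_qty: float, price: float):
    """Toy model of the 3-action discrete space: 0 = Hold, 1 = Buy, 2 = Sell."""
    if action == 1:  # Buy with 10% of current equity
        equity = cash + position_qty * price
        spend = 0.10 * equity
        position_qty += spend / price
        cash -= spend
    elif action == 2:  # Sell: close the current position
        cash += position_qty * price
        position_qty = 0.0
    return cash, position_qty  # action 0 (Hold) changes nothing

cash, qty = apply_discrete_action(1, cash=10_000.0, position_qty=0.0, price=50_000.0)
# Buys with 10% of 10,000 USDT equity: 1,000 USDT -> 0.02 BTC
```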
```python
env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    lookback_window=30,
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-03-01T00:00:00Z",
)
# action_space = Discrete(3)

obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(1)  # Buy
```
Configuration parameters:
| Parameter | Default | Description |
|---|---|---|
| `api_key` | required | Your agent's API key |
| `starting_balance` | `10000` | Starting USDT balance |
| `timeframe` | `"1h"` | Candle interval for indicators |
| `lookback_window` | `30` | Number of past candles in each observation |
| `start_time` | required | ISO timestamp for episode start |
| `end_time` | required | ISO timestamp for episode end |
| `observation_features` | `["ohlcv"]` | Features to include in observations |
| `reward_function` | `PnLReward()` | Reward function instance |
| `track_training` | `True` | Auto-report to the training dashboard |
| `strategy_label` | `None` | Label for the training run |
## SingleAssetTradingEnv — Continuous

The continuous environments (`TradeReady-BTC-Continuous-v0`, `TradeReady-ETH-Continuous-v0`) use a `Box` action space whose single value encodes both direction and magnitude:
| Signal range | Interpretation |
|---|---|
| -0.05 to 0.05 | Dead zone — Hold |
| > 0.05 | Buy — quantity = `signal * position_size_pct * equity / price` |
| < -0.05 | Sell — same formula using the absolute value of the signal |
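The mapping in the table can be paraphrased in plain Python (an illustrative sketch, not the library's code; the 0.10 default for `position_size_pct` is an assumption for this example):

```python
DEAD_ZONE = 0.05

def signal_to_order(signal: float, equity: float, price: float,
                    position_size_pct: float = 0.10):
    """Translate a continuous signal in [-1, 1] into a (side, quantity) pair."""
    if -DEAD_ZONE <= signal <= DEAD_ZONE:
        return ("hold", 0.0)  # dead zone: do nothing
    qty = abs(signal) * position_size_pct * equity / price
    return ("buy", qty) if signal > 0 else ("sell", qty)

signal_to_order(0.7, equity=10_000.0, price=50_000.0)
# signal 0.7 -> buy 0.7 * 0.10 * 10000 / 50000 = 0.014 units
```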
```python
env = gym.make(
    "TradeReady-BTC-Continuous-v0",
    api_key="ak_live_...",
    starting_balance=10000,
)
# action_space = Box(-1.0, 1.0, shape=(1,), dtype=float32)

obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step([0.7])  # Buy 70% of position size
```
Continuous environments are preferred for PPO, SAC, and TD3 where the model can learn nuanced position sizing.
## MultiAssetTradingEnv — Portfolio

`TradeReady-Portfolio-v0` takes target portfolio weights as actions. The environment rebalances to match the targets on each step:
```python
env = gym.make(
    "TradeReady-Portfolio-v0",
    api_key="ak_live_...",
    pairs=["BTCUSDT", "ETHUSDT", "SOLUSDT"],
    starting_balance=50000,
)
# action_space = Box(0.0, 1.0, shape=(3,), dtype=float32)

obs, info = env.reset()

# Allocate 50% BTC, 30% ETH, 20% SOL
obs, reward, terminated, truncated, info = env.step([0.5, 0.3, 0.2])
```
If the weights sum to more than 1.0, they are normalized. The remainder of equity stays as cash (USDT).
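The normalization and cash-remainder rule can be sketched in plain Python (an illustration of the documented behavior, not the environment's actual code):

```python
def target_allocations(weights, equity):
    """Normalize weights if they sum past 1.0; leftover equity stays in cash."""
    total = sum(weights)
    if total > 1.0:
        weights = [w / total for w in weights]
        total = 1.0
    cash = equity * (1.0 - total)
    return [w * equity for w in weights], cash

allocs, cash = target_allocations([0.5, 0.3, 0.2], equity=50_000.0)
# allocates about 25000 / 15000 / 10000 USDT with no cash remainder
```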
## LiveTradingEnv

`TradeReady-Live-v0` connects to real-time Binance prices instead of historical data. It never terminates on its own — it runs until `env.close()` is called.
```python
env = gym.make(
    "TradeReady-Live-v0",
    api_key="ak_live_...",
    pairs=["BTCUSDT"],
    step_interval_sec=60,  # Wait 60 seconds between steps
)

obs, info = env.reset()
while True:
    action, _ = model.predict(obs)
    obs, reward, _, _, info = env.step(action)
    # Loop forever — never sets terminated=True
```
The live environment uses your agent's actual virtual balance. Unlike the historical environments, there is no isolated sandbox — trades affect your real agent account.
## Observation Space

All environments share the same observation builder. Configure what your model sees via `observation_features`:
```python
env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    lookback_window=30,
    observation_features=[
        "ohlcv",           # Open, High, Low, Close, Volume — 5 dims per candle
        "rsi_14",          # RSI normalized to [0, 1] — 1 dim per candle
        "macd",            # MACD line, signal, histogram — 3 dims per candle
        "bollinger",       # Upper, middle, lower bands — 3 dims per candle
        "volume",          # Raw volume — 1 dim per candle
        "adx",             # Trend strength — 1 dim per candle
        "atr",             # Average True Range — 1 dim per candle
        "balance",         # Cash / starting_balance — 1 scalar
        "position",        # Position value / equity — 1 scalar
        "unrealized_pnl",  # Unrealized PnL / equity — 1 scalar
    ],
)
```
### Feature Dimensions
| Feature | Dims per candle | Type |
|---|---|---|
| `ohlcv` | 5 | Windowed (repeated for each candle in `lookback_window`) |
| `rsi_14` | 1 | Windowed |
| `macd` | 3 | Windowed |
| `bollinger` | 3 | Windowed |
| `volume` | 1 | Windowed |
| `adx` | 1 | Windowed |
| `atr` | 1 | Windowed |
| `balance` | 1 | Scalar (appended once at the end) |
| `position` | 1 | Scalar |
| `unrealized_pnl` | 1 | Scalar |
Observation shape formula:

```
obs_size = (lookback_window × windowed_dims × n_assets) + scalar_dims
```

Example (BTC only, all features, window = 30): `(30 × 15 × 1) + 3 = 453`.

The observation space is always a `Box(shape=(obs_size,), dtype=float32)` with range `[-inf, inf]`. Apply `NormalizationWrapper` to bring values into `[-1, 1]` before feeding them to a neural network.
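The shape formula can be written as a small helper for sanity-checking observation sizes before building a network (plain Python; the per-feature dimensions follow the Feature Dimensions table above):

```python
WINDOWED_DIMS = {"ohlcv": 5, "rsi_14": 1, "macd": 3, "bollinger": 3,
                 "volume": 1, "adx": 1, "atr": 1}
SCALAR_DIMS = {"balance": 1, "position": 1, "unrealized_pnl": 1}

def obs_size(features, lookback_window, n_assets=1):
    """Compute (lookback_window x windowed_dims x n_assets) + scalar_dims."""
    windowed = sum(WINDOWED_DIMS[f] for f in features if f in WINDOWED_DIMS)
    scalars = sum(SCALAR_DIMS[f] for f in features if f in SCALAR_DIMS)
    return lookback_window * windowed * n_assets + scalars

all_features = list(WINDOWED_DIMS) + list(SCALAR_DIMS)
obs_size(all_features, lookback_window=30)  # -> 453, matching the example above
```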
## Wrappers

Three wrappers are available to enhance environments:
```python
from tradeready_gym.wrappers import (
    FeatureEngineeringWrapper,
    NormalizationWrapper,
    BatchStepWrapper,
)

env = gym.make("TradeReady-BTC-v0", api_key="ak_live_...")

# Add SMA ratios and momentum to observations
env = FeatureEngineeringWrapper(env, periods=[5, 10, 20])

# Normalize observations to [-1, 1] using online z-score
env = NormalizationWrapper(env, clip=1.0)

# Execute 5 underlying steps per action (reduces HTTP overhead)
env = BatchStepWrapper(env, n_steps=5)
```
| Wrapper | Effect | When to use |
|---|---|---|
| `FeatureEngineeringWrapper` | Adds SMA ratios and price momentum to the observation | When you want derived features without a custom observation space |
| `NormalizationWrapper` | Online z-score normalization, clipped to [-1, 1] | Always recommended for neural network training |
| `BatchStepWrapper` | N underlying steps per agent action, rewards summed | Reduce API call overhead during training |
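To make the "online z-score" behavior concrete, here is a self-contained sketch using Welford-style running statistics. This is only an illustration of the idea the table describes, not `NormalizationWrapper`'s actual implementation:

```python
import numpy as np

class RunningZScore:
    """Per-dimension online z-score with clipping, in the spirit of NormalizationWrapper."""

    def __init__(self, size: int, clip: float = 1.0):
        self.count = 0
        self.mean = np.zeros(size)
        self.m2 = np.zeros(size)  # running sum of squared deviations (Welford)
        self.clip = clip

    def __call__(self, obs: np.ndarray) -> np.ndarray:
        # Update running mean and variance with the new observation
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)
        std = np.sqrt(self.m2 / max(self.count - 1, 1)) + 1e-8
        # Standardize, then clip into [-clip, clip]
        return np.clip((obs - self.mean) / std, -self.clip, self.clip)

norm = RunningZScore(size=3, clip=1.0)
for obs in (np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])):
    out = norm(obs)  # every component stays within [-1, 1]
```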
## Next Steps
- Rewards — choosing and customizing reward functions
- Training Tracking — visualizing training runs
- Examples — complete training scripts