
Training Tracking

Automatic progress reporting to the platform dashboard and programmatic access to learning curves


When track_training=True (the default), the gym package automatically reports training progress to the platform. Every episode's metrics are stored, aggregated into a learning curve, and visible in the dashboard at /training.


How It Works

The TrainingTracker inside the gym environment reports three lifecycle events:

  1. First reset() call — registers a new training run and returns a run_id
  2. End of each episode — reports metrics: ROI, Sharpe ratio, max drawdown, total trades, cumulative reward sum
  3. env.close() — marks the run as complete with final aggregate statistics

All communication is over the same REST API your agent uses for trading. The tracker uses the api_key you passed to gym.make().
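The sequence above can be sketched with a stub environment that records when each tracker call would fire. Everything here is illustrative — `StubTrackedEnv` is not part of the package — but the event order matches the lifecycle described above:

```python
# Schematic sketch of when the tracker reports (illustrative only;
# the real TrainingTracker lives inside the gym environment).
class StubTrackedEnv:
    def __init__(self):
        self.events = []
        self.run_id = None

    def reset(self):
        if self.run_id is None:              # first reset() only
            self.run_id = "run_abc123"       # POST /api/v1/training/runs
            self.events.append("register_run")
        return "obs"

    def step(self, action):
        done = True                          # one-step episodes for brevity
        if done:                             # POST /runs/{id}/episodes
            self.events.append("report_episode")
        return "obs", 0.0, done, False, {}

    def close(self):                         # POST /runs/{id}/complete
        self.events.append("complete_run")

env = StubTrackedEnv()
for _ in range(2):                           # two short episodes
    env.reset()
    env.step(0)
env.close()
print(env.events)
# ['register_run', 'report_episode', 'report_episode', 'complete_run']
```

Note that registration happens lazily on the first reset(), so a run only appears in the dashboard once training actually starts.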


Enabling Tracking

Tracking is on by default. Use strategy_label to group related runs:

env = gym.make(
    "TradeReady-BTC-Continuous-v0",
    api_key="ak_live_...",
    track_training=True,      # default
    strategy_label="ppo_v1",  # appears in the dashboard filter
)

To disable tracking entirely:

env = gym.make(
    "TradeReady-BTC-v0",
    api_key="ak_live_...",
    track_training=False,
)

Gym-created training runs use the strategy_label as-is. The UI can filter out training sessions from the regular backtest list using the label prefix.
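As a sketch of what that filter does, assuming run records shaped like the /api/v1/training/runs response (the run data below is invented):

```python
# Group runs by strategy_label prefix, mirroring the dashboard filter.
runs = [
    {"run_id": "run_abc123", "strategy_label": "ppo_v1"},
    {"run_id": "run_def456", "strategy_label": "ppo_v2"},
    {"run_id": "run_ghi789", "strategy_label": "dqn_v1"},
]

def filter_by_label_prefix(runs, prefix):
    """Keep only runs whose strategy_label starts with the given prefix."""
    return [r for r in runs if r["strategy_label"].startswith(prefix)]

ppo_runs = filter_by_label_prefix(runs, "ppo_")
print([r["run_id"] for r in ppo_runs])  # ['run_abc123', 'run_def456']
```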


Dashboard View

Navigate to /training in the platform UI to see:

  • Active training card — live episode count and progress bar for running trainers
  • Learning curves — smoothed charts of ROI, Sharpe ratio, and reward over episodes
  • Episode table — individual episode metrics with timestamps
  • Run comparison — select multiple runs to overlay their learning curves

Querying Training Data via API

You can also query training data programmatically:

List training runs

GET /api/v1/training/runs?status=completed&limit=20
{
  "runs": [
    {
      "run_id": "run_abc123",
      "strategy_label": "ppo_v1",
      "status": "completed",
      "total_episodes": 247,
      "avg_roi_pct": 3.8,
      "avg_sharpe": 1.2,
      "started_at": "2026-03-01T10:00:00Z",
      "completed_at": "2026-03-01T11:42:00Z"
    }
  ]
}

Get full run detail

GET /api/v1/training/runs/{run_id}

Returns run metadata, per-episode results, and a smoothed learning_curve array ready for charting.

Get learning curve data

GET /api/v1/training/runs/{run_id}/learning-curve?metric=roi_pct&window=10
Query param   Options                                                             Description
metric        roi_pct, sharpe_ratio, max_drawdown_pct, total_trades, reward_sum   Which metric to return
window        integer                                                             Rolling mean window for smoothing

Response:

{
  "metric": "roi_pct",
  "window": 10,
  "curve": [
    {"episode": 1, "value": -1.2, "smoothed": -1.2},
    {"episode": 2, "value": 0.4, "smoothed": -0.4},
    {"episode": 10, "value": 2.1, "smoothed": 1.8}
  ]
}
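The smoothed column is consistent with a trailing rolling mean; a minimal sketch under that assumption (the platform's exact smoothing may differ):

```python
def rolling_mean(values, window):
    """Trailing rolling mean: each point averages up to `window` prior values."""
    out = []
    for i in range(len(values)):
        tail = values[max(0, i - window + 1): i + 1]
        out.append(sum(tail) / len(tail))
    return out

roi = [-1.2, 0.4]
print(rolling_mean(roi, window=10))  # ≈ [-1.2, -0.4], matching the response above
```

Early episodes average over fewer points than `window`, which is why the first smoothed value equals the raw value.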

Compare multiple runs

GET /api/v1/training/compare?run_ids=run_abc123,run_def456,run_ghi789

Returns side-by-side aggregate statistics for all specified runs — useful for hyperparameter comparison.
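As an illustration of what side-by-side aggregates look like, here is a sketch that computes them locally from per-episode ROI lists (field names follow the run listing above; the episode data is invented):

```python
from statistics import mean

# Made-up per-episode ROI values for two runs.
episode_roi = {
    "run_ppo_v1": [1.0, 2.0, 3.0],
    "run_dqn_v1": [0.5, 1.5],
}

# Aggregate each run into the kind of summary the compare view shows.
comparison = {
    run_id: {"total_episodes": len(rois), "avg_roi_pct": mean(rois)}
    for run_id, rois in episode_roi.items()
}
print(comparison["run_ppo_v1"])  # {'total_episodes': 3, 'avg_roi_pct': 2.0}
```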


Python SDK Access

from agentexchange import AgentExchangeClient

client = AgentExchangeClient(api_key="ak_live_...")

# List runs
runs = client.get_training_runs(status="completed", limit=10)

# Get detail for a specific run
run = client.get_training_run(run_id="run_abc123")
print(f"Episodes: {run['total_episodes']}")
print(f"Avg ROI: {run['avg_roi_pct']}%")
print(f"Avg Sharpe: {run['avg_sharpe']}")

# Compare PPO vs DQN
comparison = client.compare_training_runs(
    run_ids=["run_ppo_v1", "run_dqn_v1"]
)

Training API Endpoint Reference

Method   Path                                        Description
POST     /api/v1/training/runs                       Register a new training run (called by gym)
POST     /api/v1/training/runs/{id}/episodes         Report episode result (called by gym)
POST     /api/v1/training/runs/{id}/complete         Mark run as complete (called by env.close())
GET      /api/v1/training/runs                       List runs with optional status filter
GET      /api/v1/training/runs/{id}                  Full detail, learning curve, and episode list
GET      /api/v1/training/runs/{id}/learning-curve   Learning curve with smoothing
GET      /api/v1/training/compare                    Compare multiple runs side-by-side

The POST /runs, POST /runs/{id}/episodes, and POST /runs/{id}/complete endpoints are called automatically by the gym. You only need the GET endpoints if you are querying training data programmatically.
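If you are not using the SDK, the GET endpoints can be called over plain HTTP. The sketch below only constructs the request without sending it; the base URL and the Bearer auth scheme are assumptions — check your API credentials documentation for the exact header:

```python
import urllib.parse
import urllib.request

BASE = "https://api.tradeready.io"   # assumed base URL

def build_training_request(path, api_key, **params):
    """Build (but do not send) an authenticated GET request for a training endpoint."""
    query = urllib.parse.urlencode(params)
    url = f"{BASE}/api/v1/training{path}" + (f"?{query}" if query else "")
    # Bearer auth is an assumption here.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})

req = build_training_request("/runs", "ak_live_...", status="completed", limit=20)
print(req.get_full_url())
# https://api.tradeready.io/api/v1/training/runs?status=completed&limit=20
```

Send the request with `urllib.request.urlopen(req)` (or any HTTP client) once the auth details are confirmed.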


Full Training Loop Example

import gymnasium as gym
import tradeready_gym  # import registers the TradeReady environment IDs with gymnasium
from tradeready_gym.rewards import SharpeReward
from tradeready_gym.wrappers import NormalizationWrapper
from stable_baselines3 import PPO

# Create environment with tracking
env = gym.make(
    "TradeReady-BTC-Continuous-v0",
    api_key="ak_live_...",
    starting_balance=10000,
    timeframe="1h",
    lookback_window=50,
    observation_features=["ohlcv", "rsi_14", "macd", "bollinger", "balance", "position"],
    reward_function=SharpeReward(window=50),
    start_time="2025-01-01T00:00:00Z",
    end_time="2025-07-01T00:00:00Z",
    track_training=True,
    strategy_label="ppo_sharpe_v1",
)

# Normalize observations
env = NormalizationWrapper(env)

# Train with PPO
model = PPO(
    "MlpPolicy",
    env,
    verbose=1,
    learning_rate=3e-4,
    n_steps=2048,
    batch_size=64,
    n_epochs=10,
)
model.learn(total_timesteps=100_000)
model.save("ppo_btc_sharpe")

# env.close() marks the training run complete in the dashboard
env.close()

After training, go to /training in the dashboard to see the learning curve and episode-by-episode metrics.


Next Steps

  • Examples — complete working scripts including PPO and custom rewards
  • Strategy Testing — compare rule-based strategy test results against trained models
