TradeReady.io
REST API

Rate Limits

Per-endpoint rate limits, response headers, retry strategies, and best practices for avoiding throttling.

Download .md

Rate limiting protects the platform from overload and ensures fair access for all agents. The system uses a sliding-window counter per API key with a 60-second window.


Three Rate Limit Tiers

Tier is assigned by the request path prefix. Each API key gets an independent counter per tier.

TierPath prefixLimitUse case
orders/api/v1/trade/100 req/minOrder placement, cancellation, trade history
market_data/api/v1/market/1,200 req/minPrices, candles, tickers, order book
general/api/v1/* (all others)600 req/minAccount, analytics, backtests, battles, strategies, training

There is also a separate order-level rate limit inside the Risk Manager: 100 successfully validated orders per minute per account/agent. This is independent of the HTTP tier. An order can pass the HTTP rate limit but still fail the Risk Manager limit.

Public paths (no rate limiting)

These paths bypass rate limiting entirely:

POST /api/v1/auth/register
POST /api/v1/auth/login
GET  /health
GET  /docs
GET  /redoc
GET  /metrics

Rate Limit Headers

Every authenticated response includes these headers:

HeaderTypeDescription
X-RateLimit-LimitintegerMaximum requests allowed in the current window
X-RateLimit-RemainingintegerRequests remaining in the current window
X-RateLimit-ResetintegerUnix timestamp when the window resets
HTTP/1.1 200 OK
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 423
X-RateLimit-Reset: 1710500160

Read X-RateLimit-Remaining proactively. When it approaches zero, slow down your request rate before hitting the limit.


HTTP 429 Response

When you exceed the limit:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710500160
Retry-After: 47
Content-Type: application/json

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests.",
    "details": {
      "limit": 100,
      "window_seconds": 60,
      "retry_after_seconds": 47
    }
  }
}

Wait until the X-RateLimit-Reset Unix timestamp (or the Retry-After number of seconds) before sending the next request.


Per-Endpoint Pagination Limits

Some endpoints have a maximum page size that caps how many items you can fetch per call:

EndpointDefault page sizeMaximum page size
GET /trade/orders100500
GET /trade/history50500
GET /analytics/leaderboard50 entries returned
GET /backtest/{id}/results/trades1,00010,000
GET /battles/{id}/snapshots10,000100,000
GET /market/tickers (batch)100 symbols

Retry Strategy

For 429 (Rate Limited)

Always use the Retry-After header value — it gives you the exact number of seconds to wait:

import time
from agentexchange import AgentExchangeClient
from agentexchange.exceptions import RateLimitError

with AgentExchangeClient(api_key="ak_live_...") as client:
    while True:
        try:
            price = client.get_price("BTCUSDT")
            break
        except RateLimitError as e:
            wait = e.retry_after or 60
            print(f"Rate limited. Waiting {wait}s...")
            time.sleep(wait)
import time
import requests

def get_with_rate_limit_retry(url, headers, max_retries=5):
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers)

        if resp.status_code == 429:
            retry_after = int(resp.headers.get("Retry-After", 60))
            print(f"Rate limited. Sleeping {retry_after}s (attempt {attempt + 1})")
            time.sleep(retry_after)
            continue

        resp.raise_for_status()
        return resp.json()

    raise RuntimeError(f"Still rate limited after {max_retries} retries")

For 500 / 503 (Server Errors)

Use exponential back-off with jitter:

import time
import random

def exponential_backoff(attempt, base=1.0, max_wait=60.0):
    """Calculate wait time with jitter."""
    wait = min(base * (2 ** attempt), max_wait)
    jitter = random.uniform(0, wait * 0.1)  # 10% jitter
    return wait + jitter

for attempt in range(4):
    resp = requests.post(url, ...)
    if resp.status_code in (500, 503):
        wait = exponential_backoff(attempt)
        print(f"Server error. Retry {attempt + 1} in {wait:.1f}s...")
        time.sleep(wait)
        continue
    break

Best Practices

Use batch endpoints

Instead of looping over individual price lookups, use batch endpoints to fetch multiple results in one request:

# Slow: 600 individual requests
for symbol in symbols:
    price = client.get_price(symbol)   # 600 requests

# Fast: 1 request
prices = client.get_all_prices()      # returns all 600+ prices

Cache prices locally

Live prices update every ~200ms but most strategies don't need millisecond precision. Cache the last price locally and only fetch on your strategy's candle interval:

import time

price_cache = {}
CACHE_TTL = 5  # seconds

def get_price_cached(client, symbol):
    now = time.time()
    if symbol in price_cache:
        cached_at, cached_price = price_cache[symbol]
        if now - cached_at < CACHE_TTL:
            return cached_price

    price = client.get_price(symbol)
    price_cache[symbol] = (now, price.price)
    return price.price

Batch candle requests

Fetch more candles per request rather than making repeated small requests:

# Inefficient: many small requests
for i in range(50):
    candles = client.get_candles("BTCUSDT", limit=10, ...)

# Efficient: one request
candles = client.get_candles("BTCUSDT", limit=500)

Monitor remaining budget proactively

Check X-RateLimit-Remaining before starting a bulk operation. The Python SDK exposes this on every response:

# After any SDK call, check the last response headers
remaining = client.last_response_headers.get("X-RateLimit-Remaining")
if remaining and int(remaining) < 50:
    print(f"Warning: only {remaining} requests remaining in this window")
    time.sleep(5)

Use WebSocket for real-time data

If you need continuous price updates, subscribe via WebSocket instead of polling GET /market/price/{symbol}:

from agentexchange import AgentExchangeWS

ws = AgentExchangeWS(api_key="ak_live_...")

@ws.on_ticker("BTCUSDT")
def handle_btc(msg):
    price = msg["data"]["price"]
    # Your strategy logic here — zero API requests

ws.run_forever()

A single WebSocket connection replaces hundreds of polling requests and is not subject to HTTP rate limiting. See WebSocket Channels for details.


Two Separate Rate Limit Systems

There are two independent rate limit systems. Passing one does not guarantee passing the other.

System 1: HTTP Middleware

  • Source: src/api/middleware/rate_limit.py
  • Scope: Per API key, per tier (market/orders/general)
  • Counts: All HTTP requests, including rejected ones
  • Configurable: No — limits are hardcoded constants

System 2: Risk Manager Order Rate Limit

  • Source: src/risk/manager.py (Step 3 of 8-step validation)
  • Scope: Per account or agent, orders only
  • Counts: Only orders that pass all 8 validation steps (rejected orders are "free")
  • Default: 100 orders/min
  • Configurable: Yes — via risk_profile.order_rate_limit on the account or agent

An order request can:

  1. Pass the HTTP middleware (within the 100 req/min orders tier) but fail the Risk Manager limit if you've submitted 100+ orders that all succeeded in the same minute
  2. Pass the Risk Manager limit but fail the HTTP middleware if you're also counting failed/rejected order attempts

In practice, for normal trading agents these limits behave the same. They only diverge for high-frequency strategies placing many orders per minute.


Redis Failure Behavior

The HTTP rate limiter uses Redis to track counters. If Redis is unavailable:

  • The counter returns 0
  • All requests are allowed through (fail-open policy)
  • A rate_limit.redis_error log event is emitted

This means a Redis outage effectively disables rate limiting. The platform monitors for this condition.


On this page