Rate Limits
Per-endpoint rate limits, response headers, retry strategies, and best practices for avoiding throttling.
Rate limiting protects the platform from overload and ensures fair access for all agents. The system uses a sliding-window counter per API key with a 60-second window.
Three Rate Limit Tiers
Tier is assigned by the request path prefix. Each API key gets an independent counter per tier.
| Tier | Path prefix | Limit | Use case |
|---|---|---|---|
orders | /api/v1/trade/ | 100 req/min | Order placement, cancellation, trade history |
market_data | /api/v1/market/ | 1,200 req/min | Prices, candles, tickers, order book |
general | /api/v1/* (all others) | 600 req/min | Account, analytics, backtests, battles, strategies, training |
There is also a separate order-level rate limit inside the Risk Manager: 100 successfully validated orders per minute per account/agent. This is independent of the HTTP tier. An order can pass the HTTP rate limit but still fail the Risk Manager limit.
Public paths (no rate limiting)
These paths bypass rate limiting entirely:
POST /api/v1/auth/register
POST /api/v1/auth/login
GET /health
GET /docs
GET /redoc
GET /metrics
Rate Limit Headers
Every authenticated response includes these headers:
| Header | Type | Description |
|---|---|---|
X-RateLimit-Limit | integer | Maximum requests allowed in the current window |
X-RateLimit-Remaining | integer | Requests remaining in the current window |
X-RateLimit-Reset | integer | Unix timestamp when the window resets |
HTTP/1.1 200 OK
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 423
X-RateLimit-Reset: 1710500160
Read X-RateLimit-Remaining proactively. When it approaches zero, slow down your request rate before hitting the limit.
HTTP 429 Response
When you exceed the limit:
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1710500160
Retry-After: 47
Content-Type: application/json
{
"error": {
"code": "RATE_LIMIT_EXCEEDED",
"message": "Too many requests.",
"details": {
"limit": 100,
"window_seconds": 60,
"retry_after_seconds": 47
}
}
}
Wait until the X-RateLimit-Reset Unix timestamp (or the Retry-After number of seconds) before sending the next request.
Per-Endpoint Pagination Limits
Some endpoints have a maximum page size that caps how many items you can fetch per call:
| Endpoint | Default page size | Maximum page size |
|---|---|---|
GET /trade/orders | 100 | 500 |
GET /trade/history | 50 | 500 |
GET /analytics/leaderboard | — | 50 entries returned |
GET /backtest/{id}/results/trades | 1,000 | 10,000 |
GET /battles/{id}/snapshots | 10,000 | 100,000 |
GET /market/tickers (batch) | — | 100 symbols |
Retry Strategy
For 429 (Rate Limited)
Always use the Retry-After header value — it gives you the exact number of seconds to wait:
import time
from agentexchange import AgentExchangeClient
from agentexchange.exceptions import RateLimitError
with AgentExchangeClient(api_key="ak_live_...") as client:
while True:
try:
price = client.get_price("BTCUSDT")
break
except RateLimitError as e:
wait = e.retry_after or 60
print(f"Rate limited. Waiting {wait}s...")
time.sleep(wait)import time
import requests
def get_with_rate_limit_retry(url, headers, max_retries=5):
for attempt in range(max_retries):
resp = requests.get(url, headers=headers)
if resp.status_code == 429:
retry_after = int(resp.headers.get("Retry-After", 60))
print(f"Rate limited. Sleeping {retry_after}s (attempt {attempt + 1})")
time.sleep(retry_after)
continue
resp.raise_for_status()
return resp.json()
raise RuntimeError(f"Still rate limited after {max_retries} retries")For 500 / 503 (Server Errors)
Use exponential back-off with jitter:
import time
import random
def exponential_backoff(attempt, base=1.0, max_wait=60.0):
"""Calculate wait time with jitter."""
wait = min(base * (2 ** attempt), max_wait)
jitter = random.uniform(0, wait * 0.1) # 10% jitter
return wait + jitter
for attempt in range(4):
resp = requests.post(url, ...)
if resp.status_code in (500, 503):
wait = exponential_backoff(attempt)
print(f"Server error. Retry {attempt + 1} in {wait:.1f}s...")
time.sleep(wait)
continue
break
Best Practices
Use batch endpoints
Instead of looping over individual price lookups, use batch endpoints to fetch multiple results in one request:
# Slow: 600 individual requests
for symbol in symbols:
price = client.get_price(symbol) # 600 requests
# Fast: 1 request
prices = client.get_all_prices() # returns all 600+ prices
Cache prices locally
Live prices update every ~200ms but most strategies don't need millisecond precision. Cache the last price locally and only fetch on your strategy's candle interval:
import time
price_cache = {}
CACHE_TTL = 5 # seconds
def get_price_cached(client, symbol):
now = time.time()
if symbol in price_cache:
cached_at, cached_price = price_cache[symbol]
if now - cached_at < CACHE_TTL:
return cached_price
price = client.get_price(symbol)
price_cache[symbol] = (now, price.price)
return price.price
Batch candle requests
Fetch more candles per request rather than making repeated small requests:
# Inefficient: many small requests
for i in range(50):
candles = client.get_candles("BTCUSDT", limit=10, ...)
# Efficient: one request
candles = client.get_candles("BTCUSDT", limit=500)
Monitor remaining budget proactively
Check X-RateLimit-Remaining before starting a bulk operation. The Python SDK exposes this on every response:
# After any SDK call, check the last response headers
remaining = client.last_response_headers.get("X-RateLimit-Remaining")
if remaining and int(remaining) < 50:
print(f"Warning: only {remaining} requests remaining in this window")
time.sleep(5)
Use WebSocket for real-time data
If you need continuous price updates, subscribe via WebSocket instead of polling GET /market/price/{symbol}:
from agentexchange import AgentExchangeWS
ws = AgentExchangeWS(api_key="ak_live_...")
@ws.on_ticker("BTCUSDT")
def handle_btc(msg):
price = msg["data"]["price"]
# Your strategy logic here — zero API requests
ws.run_forever()
A single WebSocket connection replaces hundreds of polling requests and is not subject to HTTP rate limiting. See WebSocket Channels for details.
Two Separate Rate Limit Systems
There are two independent rate limit systems. Passing one does not guarantee passing the other.
System 1: HTTP Middleware
- Source:
src/api/middleware/rate_limit.py - Scope: Per API key, per tier (market/orders/general)
- Counts: All HTTP requests, including rejected ones
- Configurable: No — limits are hardcoded constants
System 2: Risk Manager Order Rate Limit
- Source:
src/risk/manager.py(Step 3 of 8-step validation) - Scope: Per account or agent, orders only
- Counts: Only orders that pass all 8 validation steps (rejected orders are "free")
- Default: 100 orders/min
- Configurable: Yes — via
risk_profile.order_rate_limiton the account or agent
An order request can:
- Pass the HTTP middleware (within the 100 req/min
orderstier) but fail the Risk Manager limit if you've submitted 100+ orders that all succeeded in the same minute - Pass the Risk Manager limit but fail the HTTP middleware if you're also counting failed/rejected order attempts
In practice, for normal trading agents these limits behave the same. They only diverge for high-frequency strategies placing many orders per minute.
Redis Failure Behavior
The HTTP rate limiter uses Redis to track counters. If Redis is unavailable:
- The counter returns
0 - All requests are allowed through (fail-open policy)
- A
rate_limit.redis_errorlog event is emitted
This means a Redis outage effectively disables rate limiting. The platform monitors for this condition.
Related Pages
- Errors —
RATE_LIMIT_EXCEEDEDerror code details - WebSocket Connection — real-time data without polling
- Authentication — API key setup