Track usage with enforcement
All-in-one endpoint for tracking usage with quota and rate limit enforcement.
This is the recommended endpoint for most use cases. It performs three operations in sequence:
- Rate Limit Check: Validates against configured rate limits (fast, Redis-based)
- Quota Check: Validates against configured quotas (database-based)
- Event Ingestion: Records the event if all checks pass
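The ordering above implies that a failed check stops processing before the event is recorded. The following is a minimal sketch of that sequencing only, not the service's implementation: `track_with_enforcement`, the `Check` type, and the callable-based design are hypothetical stand-ins for the Redis-based and database-based lookups.

```python
from typing import Callable, Optional

# Hypothetical check signature: return an error message when a limit is
# exceeded, or None when the event is within bounds.
Check = Callable[[dict], Optional[str]]


def track_with_enforcement(
    event: dict,
    rate_limit_check: Check,
    quota_check: Check,
    ingest: Callable[[dict], str],
) -> dict:
    """Run the rate limit check, then the quota check, then ingest the event."""
    for check in (rate_limit_check, quota_check):
        error = check(event)
        if error is not None:
            # A failed check short-circuits: later steps never run.
            return {"status": 429, "allowed": False, "error": error}
    # Both checks passed, so the event is recorded.
    return {"status": 201, "allowed": True, "id": ingest(event)}
```

In this sketch a rate limit failure means the quota check and the ingestion step are never called, which mirrors the "records the event if all checks pass" behavior described above.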
Response Codes:
- 201 Created: Event tracked successfully, all limits within bounds
- 429 Too Many Requests: Rate limit or quota exceeded (check response body for details)
Rate Limit Headers: The response includes standard rate limit headers:
- X-RateLimit-Limit: Maximum requests allowed
- X-RateLimit-Remaining: Requests remaining in current window
- X-RateLimit-Reset: Unix timestamp when window resets
- Retry-After: Seconds to wait before retrying (on 429 responses)
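A client can read these headers to decide how long to back off. The sketch below is one possible way to do that, assuming the headers arrive as a plain string mapping; `RateLimitInfo`, `parse_rate_limit_headers`, and `seconds_to_wait` are illustrative names, not part of any official SDK.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]
    remaining: Optional[int]
    reset: Optional[int]        # Unix timestamp when the window resets
    retry_after: Optional[int]  # seconds; only expected on 429 responses


def _to_int(value: Optional[str]) -> Optional[int]:
    # Missing or malformed header values become None instead of raising.
    try:
        return int(value) if value is not None else None
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return RateLimitInfo(
        limit=_to_int(lowered.get("x-ratelimit-limit")),
        remaining=_to_int(lowered.get("x-ratelimit-remaining")),
        reset=_to_int(lowered.get("x-ratelimit-reset")),
        retry_after=_to_int(lowered.get("retry-after")),
    )


def seconds_to_wait(info: RateLimitInfo, now: Optional[float] = None) -> float:
    # Prefer Retry-After; otherwise fall back to the reset timestamp.
    if info.retry_after is not None:
        return float(info.retry_after)
    if info.reset is not None:
        current = time.time() if now is None else now
        return max(0.0, info.reset - current)
    return 0.0
```

Preferring `Retry-After` over the reset timestamp is a design choice in this sketch: it avoids depending on the client's clock being in sync with the server.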
Quota Headers: When a quota is exceeded, the response includes:
- Retry-After: Seconds to wait before retrying
- X-Quota-Reset: Unix timestamp when the quota period resets
- X-Quota-Period: Quota period (hour, day, week, month)
- X-Quota-Metric: Quota metric (total_tokens, total_events, total_cost_cents)
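The quota headers can be turned into a small, log-friendly summary. This is a hedged sketch: `describe_quota_headers` and the shape of its return value are assumptions made for illustration, and the sets of known periods and metrics simply copy the values listed above.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional

# Values listed in the header descriptions above.
KNOWN_PERIODS = {"hour", "day", "week", "month"}
KNOWN_METRICS = {"total_tokens", "total_events", "total_cost_cents"}


def describe_quota_headers(headers: Mapping[str, str]) -> Optional[dict]:
    """Return a summary of the quota headers, or None if X-Quota-Reset is absent."""
    lowered = {k.lower(): v for k, v in headers.items()}
    reset_raw = lowered.get("x-quota-reset")
    if reset_raw is None:
        return None
    # X-Quota-Reset is a Unix timestamp; int() raises ValueError if malformed.
    reset_at = datetime.fromtimestamp(int(reset_raw), tz=timezone.utc)
    period = lowered.get("x-quota-period")
    metric = lowered.get("x-quota-metric")
    retry_raw = lowered.get("retry-after")
    return {
        "reset_at": reset_at.isoformat(),
        "period": period,
        "metric": metric,
        "retry_after": int(retry_raw) if retry_raw is not None else None,
        # False when the server sends a period or metric this client predates.
        "recognized": period in KNOWN_PERIODS and metric in KNOWN_METRICS,
    }
```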
Dimension Matching: Quotas and rate limits are matched based on dimensions extracted from:
- customerId → customer_id
- eventType → event_type
- model → model
- provider → provider
- properties → All string values are included as dimensions
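To see which dimensions a given event would expose, the mapping above can be reproduced client-side. The sketch below is an approximation under stated assumptions: `extract_dimensions` is a hypothetical helper, absent fields are assumed to produce no dimension, and the precedence on a name clash between a property and a top-level field is a guess rather than documented behavior.

```python
from typing import Any, Dict, Mapping

# Request field -> dimension name, as listed above.
FIELD_TO_DIMENSION = {
    "customerId": "customer_id",
    "eventType": "event_type",
    "model": "model",
    "provider": "provider",
}


def extract_dimensions(event: Mapping[str, Any]) -> Dict[str, str]:
    dimensions: Dict[str, str] = {}
    # Only string-valued properties become dimensions.
    for key, value in (event.get("properties") or {}).items():
        if isinstance(value, str):
            dimensions[key] = value
    # Assumption: top-level fields win if a property uses the same name.
    for field, dimension in FIELD_TO_DIMENSION.items():
        value = event.get(field)
        if isinstance(value, str) and value:
            dimensions[dimension] = value
    return dimensions
```

For example, a property such as `{"retries": 3}` would not appear in the result because its value is not a string.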
Authorization
Bearer token: your project API key, sent in the Authorization header.
Request Body
application/json
- customerId: Your customer's identifier (e.g., user ID, tenant ID). Length: at least 1.
- eventType: Type of event (e.g., "model_call", "embedding", "tool_call"). Length: at least 1.
- model: Model used (e.g., "gpt-4", "claude-3-opus", "gpt-3.5-turbo")
- provider: Provider name (e.g., "openai", "anthropic", "google")
- inputTokens: Number of input tokens consumed. Value: at least 0.
- outputTokens: Number of output tokens generated. Value: at least 0.
- totalTokens: Total tokens (auto-calculated if not provided). Value: at least 0.
- latencyMs: Request latency in milliseconds. Value: at least 0.
- costCents: Cost in cents (e.g., 25 = $0.25). Value: at least 0.
- properties: Custom dimensions for quota/rate limit matching (e.g., user_id, team_id, feature)
- idempotencyKey: Unique key for deduplication (prevents duplicate events)
- timestamp: ISO 8601 timestamp in date-time format (defaults to current time if not provided)
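The field constraints above can be checked before a request leaves the client. The following sketch builds (but does not send) a request with Python's standard library. `build_track_request` is a hypothetical helper, it covers only a subset of the fields, and treating customerId and eventType as always supplied is an assumption. The URL is the one used in the curl example in this document.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

TRACK_URL = "https://api.limitry.com/v1/track"


def build_track_request(
    api_key: str,
    customer_id: str,
    event_type: str,
    *,
    model: Optional[str] = None,
    provider: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    properties: Optional[Dict[str, Any]] = None,
    idempotency_key: Optional[str] = None,
) -> urllib.request.Request:
    # Mirror the "length at least 1" constraint on the two identifiers.
    if not customer_id or not event_type:
        raise ValueError("customerId and eventType must be non-empty")
    payload: Dict[str, Any] = {"customerId": customer_id, "eventType": event_type}
    optional = {
        "model": model,
        "provider": provider,
        "inputTokens": input_tokens,
        "outputTokens": output_tokens,
        "properties": properties,
        "idempotencyKey": idempotency_key,
    }
    for name, value in optional.items():
        if value is None:
            continue  # omit unset fields instead of sending null
        if name in ("inputTokens", "outputTokens") and value < 0:
            raise ValueError(f"{name} must be >= 0")
        payload[name] = value
    # totalTokens is left out on purpose: it is auto-calculated if not provided.
    return urllib.request.Request(
        TRACK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Note that `urlopen` raises `urllib.error.HTTPError` for a 429 response, so the rate limit and quota headers would be read from that exception's `headers` attribute.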
Response Body
application/json
curl -X POST "https://api.limitry.com/v1/track" \
  -H "Authorization: Bearer YOUR_PROJECT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "customerId": "cust_123",
    "eventType": "model_call",
    "model": "gpt-4",
    "provider": "openai",
    "inputTokens": 150,
    "outputTokens": 50,
    "totalTokens": 200,
    "latencyMs": 1200,
    "costCents": 25,
    "properties": {
      "user_id": "user_456",
      "team_id": "team_eng",
      "feature": "chat"
    },
    "idempotencyKey": "req_abc123",
    "timestamp": "2024-01-15T10:30:00Z"
  }'

Success response (201 Created):
{
"id": "evt_abc123",
"allowed": true,
"rateLimits": [
{
"id": "rl_xyz789",
"name": "API Rate Limit",
"window": "1h",
"limit": 1000,
"remaining": 847,
"reset": 1705312800,
"exceeded": false
}
],
"quotas": [
{
"id": "qta_def456",
"name": "Daily Token Limit",
"metric": "total_tokens",
"period": "day",
"limit": 100000,
"used": 45230,
"remaining": 54770,
"exceeded": false
}
]
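}

Error response (invalid request):
{
"error": "Invalid request"
}

Error response (invalid or missing API key):
{
"error": "Invalid or missing API key"
}

Limit exceeded response (429 Too Many Requests):
{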
"allowed": false,
"error": "Rate limit exceeded",
"rateLimits": [
{
"id": "rl_xyz789",
"name": "API Rate Limit",
"window": "1h",
"limit": 1000,
"remaining": 0,
"reset": 1705312800,
"exceeded": true
}
],
"quotas": []
}
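
Both the success and the 429 bodies shown above carry a per-limit `exceeded` flag, so a client can report which limit was hit. The sketch below assumes the body has already been decoded from JSON into a dictionary; `exceeded_limits` is an illustrative helper name, and the field names are taken from the example responses.

```python
from typing import Any, List, Mapping


def exceeded_limits(body: Mapping[str, Any]) -> List[str]:
    """List the rate limits and quotas flagged as exceeded in a response body."""
    names: List[str] = []
    for kind, key in (("rate limit", "rateLimits"), ("quota", "quotas")):
        # Error bodies such as {"error": "Invalid request"} have neither list.
        for entry in body.get(key) or []:
            if entry.get("exceeded"):
                label = entry.get("name", entry.get("id", "unknown"))
                names.append(f"{kind}: {label}")
    return names
```

Applied to the 429 example above, this returns a single entry for the "API Rate Limit" rate limit; applied to the 201 example, it returns an empty list.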