Track usage with enforcement

All-in-one endpoint for tracking usage with quota and rate limit enforcement.

This is the recommended endpoint for most use cases. It performs three operations in sequence:

  1. Rate Limit Check: Validates against configured rate limits (fast, Redis-based)
  2. Quota Check: Validates against configured quotas (database-based)
  3. Event Ingestion: Records the event if all checks pass
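The three-step sequence above can be modeled in miniature. This is a toy Python sketch of the enforcement order only; the limit values, state shape, and function names here are illustrative stand-ins, not Limitry internals:

```python
# Checks run in sequence, and the event is recorded only if every check
# passes. A blocked event is never ingested.
def track(event, state, rate_limit=2, quota_tokens=300):
    if state["window_count"] >= rate_limit:          # 1. rate limit check (fast)
        return 429
    if state["period_tokens"] + event["totalTokens"] > quota_tokens:  # 2. quota check
        return 429
    state["window_count"] += 1                       # 3. ingest the event
    state["period_tokens"] += event["totalTokens"]
    state["events"].append(event)
    return 201

state = {"window_count": 0, "period_tokens": 0, "events": []}
print(track({"totalTokens": 200}, state))  # fits within both limits -> 201
print(track({"totalTokens": 200}, state))  # would exceed the token quota -> 429
```

Note that the second event is rejected before ingestion, so it never counts toward usage.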

Response Codes:

  • 201 Created: Event tracked successfully, all limits within bounds
  • 400 Bad Request: Invalid request body
  • 401 Unauthorized: Invalid or missing API key
  • 429 Too Many Requests: Rate limit or quota exceeded (check response body for details)

Rate Limit Headers: The response includes standard rate limit headers:

  • X-RateLimit-Limit: Maximum requests allowed
  • X-RateLimit-Remaining: Requests remaining in current window
  • X-RateLimit-Reset: Unix timestamp when window resets
  • Retry-After: Seconds to wait before retrying (on 429 responses)
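A client can use these headers to decide how long to back off before retrying. A minimal Python sketch; the header names come from the list above, while the fallback-to-reset logic is an assumption about reasonable client behavior, not prescribed by the API:

```python
import time

def backoff_seconds(status, headers, now=None):
    """Seconds to wait before retrying, from standard rate-limit headers."""
    if status != 429:
        return 0.0
    if "Retry-After" in headers:                  # present on 429 responses
        return float(headers["Retry-After"])
    now = time.time() if now is None else now
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)                  # wait until the window resets

print(backoff_seconds(429, {"Retry-After": "30"}))  # → 30.0
```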

Quota Headers (on quota exceeded): When a quota is exceeded, the response includes:

  • Retry-After: Seconds to wait before retrying
  • X-Quota-Reset: Unix timestamp when the quota period resets
  • X-Quota-Period: Quota period (hour, day, week, month)
  • X-Quota-Metric: Quota metric (total_tokens, total_events, total_cost_cents)
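Since both rate limits and quotas answer with 429, the presence of the X-Quota-* headers is one way for a client to tell which check failed. An illustrative sketch, not an official client:

```python
def describe_429(headers):
    """Classify a 429 response by its headers."""
    if "X-Quota-Metric" in headers:               # quota headers -> quota exceeded
        return (f"quota exceeded: {headers['X-Quota-Metric']} per "
                f"{headers['X-Quota-Period']}, retry in {headers['Retry-After']}s")
    return f"rate limit exceeded, retry in {headers.get('Retry-After', '?')}s"

print(describe_429({
    "Retry-After": "3600",
    "X-Quota-Reset": "1705312800",
    "X-Quota-Period": "day",
    "X-Quota-Metric": "total_tokens",
}))  # → quota exceeded: total_tokens per day, retry in 3600s
```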

Dimension Matching: Quotas and rate limits are matched based on dimensions extracted from:

  • customerId → customer_id
  • eventType → event_type
  • model → model
  • provider → provider
  • properties → All string values are included as dimensions
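Put together, the matching dimensions for an event look like this (a Python sketch of the documented mapping; exact server-side behavior may differ):

```python
def extract_dimensions(event):
    """Flatten a /track payload into the dimensions used for matching."""
    mapping = {"customerId": "customer_id", "eventType": "event_type",
               "model": "model", "provider": "provider"}
    dims = {dim: event[field] for field, dim in mapping.items() if field in event}
    # Only string-valued properties become dimensions, per the note above.
    for key, value in event.get("properties", {}).items():
        if isinstance(value, str):
            dims[key] = value
    return dims

print(extract_dimensions({
    "customerId": "cust_123",
    "eventType": "model_call",
    "model": "gpt-4",
    "properties": {"team_id": "team_eng", "retry_count": 3},
}))
```

Note that the non-string property `retry_count` is dropped, while `team_id` becomes a dimension.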
POST /track

Authorization: Bearer <token> (Project API key, sent in the request header)
Request Body

application/json

customerId* string

Your customer's identifier (e.g., user ID, tenant ID)

Length: 1 <= length

eventType* string

Type of event (e.g., "model_call", "embedding", "tool_call")

Length: 1 <= length

model? string

Model used (e.g., "gpt-4", "claude-3-opus", "gpt-3.5-turbo")

provider? string

Provider name (e.g., "openai", "anthropic", "google")

inputTokens? integer

Number of input tokens consumed

Range: 0 <= value

outputTokens? integer

Number of output tokens generated

Range: 0 <= value

totalTokens? integer

Total tokens (auto-calculated if not provided)

Range: 0 <= value

latencyMs? integer

Request latency in milliseconds

Range: 0 <= value

costCents? integer

Cost in cents (e.g., 25 = $0.25)

Range: 0 <= value

properties? object

Custom dimensions for quota/rate limit matching (e.g., user_id, team_id, feature)

idempotencyKey? string

Unique key for deduplication (prevents duplicate events)

timestamp? string

ISO 8601 timestamp (defaults to current time if not provided)

Format: date-time
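The field constraints above can be checked client-side before sending. An illustrative helper (the function name is ours, and the server's validation remains authoritative):

```python
def build_track_event(customer_id, event_type, **optional):
    """Assemble a /track payload, enforcing the documented field constraints."""
    if len(customer_id) < 1 or len(event_type) < 1:
        raise ValueError("customerId and eventType must be non-empty")
    for key in ("inputTokens", "outputTokens", "totalTokens",
                "latencyMs", "costCents"):
        if key in optional and optional[key] < 0:
            raise ValueError(f"{key} must be >= 0")
    # totalTokens may be omitted; the server auto-calculates it.
    return {"customerId": customer_id, "eventType": event_type, **optional}

evt = build_track_event("cust_123", "model_call",
                        inputTokens=150, outputTokens=50, costCents=25)
print(evt["customerId"])  # → cust_123
```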

Response Body

application/json


curl -X POST "https://api.limitry.com/v1/track" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "customerId": "cust_123",
    "eventType": "model_call",
    "model": "gpt-4",
    "provider": "openai",
    "inputTokens": 150,
    "outputTokens": 50,
    "totalTokens": 200,
    "latencyMs": 1200,
    "costCents": 25,
    "properties": {
      "user_id": "user_456",
      "team_id": "team_eng",
      "feature": "chat"
    },
    "idempotencyKey": "req_abc123",
    "timestamp": "2024-01-15T10:30:00Z"
  }'
201 Created:

{
  "id": "evt_abc123",
  "allowed": true,
  "rateLimits": [
    {
      "id": "rl_xyz789",
      "name": "API Rate Limit",
      "window": "1h",
      "limit": 1000,
      "remaining": 847,
      "reset": 1705312800,
      "exceeded": false
    }
  ],
  "quotas": [
    {
      "id": "qta_def456",
      "name": "Daily Token Limit",
      "metric": "total_tokens",
      "period": "day",
      "limit": 100000,
      "used": 45230,
      "remaining": 54770,
      "exceeded": false
    }
  ]
}
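One way a client might read the success body is to surface the tightest remaining budget across all matched limits. A sketch using a trimmed copy of the example above:

```python
import json

body = json.loads("""{
  "id": "evt_abc123",
  "allowed": true,
  "rateLimits": [{"name": "API Rate Limit", "limit": 1000, "remaining": 847,
                  "exceeded": false}],
  "quotas": [{"name": "Daily Token Limit", "limit": 100000, "remaining": 54770,
              "exceeded": false}]
}""")

assert body["allowed"]
# Smallest remaining count across every matched rate limit and quota.
tightest = min(body["rateLimits"] + body["quotas"], key=lambda l: l["remaining"])
print(tightest["name"], tightest["remaining"])  # → API Rate Limit 847
```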
400 Bad Request:

{
  "error": "Invalid request"
}
401 Unauthorized:

{
  "error": "Invalid or missing API key"
}

429 Too Many Requests:

{
  "allowed": false,
  "error": "Rate limit exceeded",
  "rateLimits": [
    {
      "id": "rl_xyz789",
      "name": "API Rate Limit",
      "window": "1h",
      "limit": 1000,
      "remaining": 0,
      "reset": 1705312800,
      "exceeded": true
    }
  ],
  "quotas": []
}
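When `allowed` is false, the body identifies which specific limits tripped via their `exceeded` flags. A small illustrative helper to collect their names:

```python
def exceeded_limits(body):
    """Names of the rate limits and quotas that blocked the event."""
    return [l["name"]
            for l in body.get("rateLimits", []) + body.get("quotas", [])
            if l["exceeded"]]

body = {
    "allowed": False,
    "error": "Rate limit exceeded",
    "rateLimits": [{"name": "API Rate Limit", "remaining": 0, "exceeded": True}],
    "quotas": [],
}
print(exceeded_limits(body))  # → ['API Rate Limit']
```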