Rate Limits & Timeouts (Non-Contract Guidance)

This page is non-contract guidance. The public contract does not specify quotas or rate-limit values, even where limits are enforced in practice. Treat this as operational advice for clients, not a guarantee.

What the contract does (and does not) guarantee

  • Per-org rate limiting is enforced: application-level rate limiting is active across 16 services (CRM, ICS, SCM, PPM, PCM, Influencer, Accounting, IPM, RBS, UCP, UTL, PVM, PMC, MRS, OFM, SLC). Exceeding the per-org request window returns 429 throttled with retryable: true. Service accounts (x-api-key with service_account principal) bypass the limiter. OPS is exempt (operator-only).
  • SLOs are guidance: business-level targets are published in /common/performance-slos.html, but remain non-contract.
  • Infrastructure limits also apply: API Gateway, Lambda, and storage layers may enforce platform-level throttles in addition to the application-level limiter.
  • AWS WAF rate limit is enforced: all API endpoints are protected by a per-IP rate limit of 2,000 requests per 5-minute window, evaluated by AWS WAF before the request reaches the application. Exceeding it returns a 403 with an HTML body (no JSON). CloudFront distributions have a separate per-IP limit of 5,000 requests per 5-minute window. WAF rate limits are independent of, and in addition to, the per-org application-level limiter.
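
One way a client might stay under the per-IP WAF window is a small client-side sliding-window guard. The sketch below is illustrative only: the class name, the injected clock, and the 1,500-request headroom figure are assumptions, not part of the platform.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard that keeps request volume under a per-window cap."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self):
        """Return True and record the request if there is room, else False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) >= self.max_requests:
            return False
        self._sent.append(now)
        return True


# Assumed headroom: stay well below the documented 2,000 requests / 5 minutes.
waf_guard = SlidingWindowLimiter(max_requests=1500, window_seconds=300)
```

A caller would check `waf_guard.try_acquire()` before each request and delay or queue the request when it returns False. This guard only sees traffic from one process; several clients sharing an egress IP would need a shared counter.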

Suggested client posture

  • Timeouts: set explicit request timeouts and fail fast on hung connections.
  • Retries: retry only idempotent calls (GET, safe reads) or endpoints that support idempotency keys.
  • Backoff: use exponential backoff with jitter on retryable failures (5xx, 429, transient network errors).
  • Concurrency: cap parallel requests per tenant and per host; prefer small bursts over unbounded fan-out.
  • Budgeting: apply a per-request time budget and a per-workflow budget to avoid cascading retries.
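
The retry and backoff advice above can be sketched as two small helpers. This is a minimal illustration, assuming "full jitter" backoff and hypothetical default values (0.2 s base, 5 s cap); neither the function names nor the numbers come from the platform.

```python
import random

IDEMPOTENT_METHODS = {"GET", "HEAD"}


def should_retry(method, status, has_idempotency_key=False):
    """Retry only calls that are safe to repeat and failed in a retryable way."""
    safe_to_repeat = method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
    retryable_status = status == 429 or 500 <= status <= 599
    return safe_to_repeat and retryable_status


def backoff_delay(attempt, base=0.2, cap=5.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)) seconds."""
    return rng() * min(cap, base * (2 ** attempt))
```

A retry loop would call `should_retry(...)` after each failure, sleep for `backoff_delay(attempt)`, and stop once the per-workflow time budget is spent. Randomizing the whole delay (rather than adding a small jitter to a fixed delay) spreads out clients that failed at the same moment.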

Dependency time budgets (hot-path guidance)

For Tier C/D workflows that read multiple services (SCM → PPM/ICS/PMC/USM/OFM):

  • PPM price/resolve: 100–200 ms budget.
  • ICS availability/reserve: 150–250 ms budget.
  • PMC catalog lookup: 100–200 ms budget.
  • USM session validate: 50–100 ms budget (cacheable).
  • OFM member resolve: 50–120 ms budget (cacheable).
  • Total workflow budget: 300–600 ms for Tier C reads; Tier D writes may allow 1–3 s (see /common/performance-slos.html).
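
A workflow budget can be enforced by deriving each call's timeout from whatever time remains. The following is a minimal sketch; the `Budget` class and the `PER_CALL` table (upper ends of the ranges above, in seconds) are illustrative assumptions.

```python
import time


class Budget:
    """Tracks a total time budget and clips per-call timeouts to what is left."""

    def __init__(self, total_seconds, clock=time.monotonic):
        self._clock = clock
        self._deadline = clock() + total_seconds

    def remaining(self):
        return max(0.0, self._deadline - self._clock())

    def timeout_for(self, per_call_seconds):
        """Timeout for the next call: the per-call budget, clipped to the remainder."""
        left = self.remaining()
        if left <= 0:
            raise TimeoutError("workflow budget exhausted")
        return min(per_call_seconds, left)


# Upper ends of the per-dependency ranges listed above, in seconds.
PER_CALL = {"ppm": 0.200, "ics": 0.250, "pmc": 0.200, "usm": 0.100, "ofm": 0.120}
```

Note that the upper ends of the per-dependency ranges sum to 870 ms, which exceeds the 600 ms ceiling for Tier C reads. A workflow that touches all five dependencies would therefore need cached USM/OFM results, parallel calls, or both to fit.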

Circuit breaker and fallback posture

Cross-service calls use an in-Lambda circuit breaker (packages/circuit-breaker). State machine: CLOSED → (N failures in window) → OPEN → (cooldown) → HALF_OPEN → (1 success) → CLOSED. When open, calls fail fast with 502 dependency-unavailable without attempting the downstream call.
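
The state machine above can be sketched in a few lines. This is not the code in packages/circuit-breaker; the class, thresholds, and clock injection are assumptions for illustration, and the behavior on a failed HALF_OPEN probe (re-open immediately) is also an assumption because the text does not specify it.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling downstream while the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, window_seconds=30.0,
                 cooldown_seconds=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.window_seconds = window_seconds
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self.state = "CLOSED"
        self._failures = []   # timestamps of failures inside the window
        self._opened_at = 0.0

    def call(self, fn):
        now = self._clock()
        if self.state == "OPEN":
            if now - self._opened_at < self.cooldown_seconds:
                # Fail fast without attempting the downstream call.
                raise CircuitOpenError("dependency-unavailable")
            self.state = "HALF_OPEN"  # cooldown elapsed: allow a probe call
        try:
            result = fn()
        except Exception:
            self._on_failure(now)
            raise
        if self.state == "HALF_OPEN":
            # One successful probe closes the breaker.
            self.state = "CLOSED"
            self._failures.clear()
        return result

    def _on_failure(self, now):
        if self.state == "HALF_OPEN":
            self._trip(now)  # assumed: a failed probe re-opens the breaker
            return
        self._failures = [t for t in self._failures
                          if now - t < self.window_seconds]
        self._failures.append(now)
        if len(self._failures) >= self.failure_threshold:
            self._trip(now)

    def _trip(self, now):
        self.state = "OPEN"
        self._opened_at = now
        self._failures.clear()
```

A caller would wrap each downstream request as `breaker.call(lambda: fetch_price(...))`, where `fetch_price` is a hypothetical client function, and map `CircuitOpenError` to the 502 dependency-unavailable response.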

  • PPM slow/unavailable: allow cached price snapshot if policy permits; otherwise fail closed (price required).
  • ICS slow/unavailable: fail closed for commits; allow read-only availability fallback when policy allows.
  • PMC slow/unavailable: allow cached online catalog snapshot for read-only lookups; writes/publish fail closed.
  • USM/OFM slow/unavailable: fail closed for authenticated calls; use short-lived cache to avoid repeated lookups.
  • Maintenance mode: each service checks the OPS maintenance state and returns 503 maintenance-active while maintenance is active. The check uses a 20-second cache TTL and fails open on errors, so an OPS outage does not block other services.
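
The maintenance check's caching and fail-open behavior can be illustrated as follows. This is a sketch under assumptions: `fetch_state` is a hypothetical callable that returns True while maintenance is active, and caching the fail-open result for the full TTL is a design choice made here, not something the text confirms.

```python
import time


class MaintenanceCheck:
    """Caches the maintenance flag for a short TTL and fails open on errors."""

    def __init__(self, fetch_state, ttl_seconds=20.0, clock=time.monotonic):
        self._fetch_state = fetch_state  # hypothetical lookup against OPS
        self._ttl = ttl_seconds
        self._clock = clock
        self._cached = False
        self._expires_at = None

    def is_active(self):
        now = self._clock()
        if self._expires_at is not None and now < self._expires_at:
            return self._cached
        try:
            self._cached = bool(self._fetch_state())
        except Exception:
            # Fail open: an OPS outage must not block this service.
            self._cached = False
        self._expires_at = now + self._ttl
        return self._cached
```

A request handler would return 503 maintenance-active when `is_active()` is True and proceed normally otherwise.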

Cache TTL guidance (hot path)

  • USM validate: 30–60 seconds with jitter; immediate invalidate on explicit logout/revoke if possible.
  • OFM resolve: 60–120 seconds with jitter; reduce TTL for high-risk roles or recent membership changes.
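
A jittered TTL can be implemented by drawing each entry's lifetime from the recommended range, so entries written at the same moment do not all expire together. The cache class below is a minimal, single-process sketch; its name and interface are assumptions.

```python
import random
import time


class JitteredTTLCache:
    """In-memory cache whose entries expire after a random TTL in [min_ttl, max_ttl]."""

    def __init__(self, min_ttl, max_ttl, clock=time.monotonic, rng=random.uniform):
        self._min_ttl = min_ttl
        self._max_ttl = max_ttl
        self._clock = clock
        self._rng = rng
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return value

    def put(self, key, value):
        ttl = self._rng(self._min_ttl, self._max_ttl)
        self._entries[key] = (value, self._clock() + ttl)

    def invalidate(self, key):
        """Drop an entry immediately, e.g. on explicit logout or revoke."""
        self._entries.pop(key, None)


usm_cache = JitteredTTLCache(min_ttl=30, max_ttl=60)    # USM validate
ofm_cache = JitteredTTLCache(min_ttl=60, max_ttl=120)   # OFM resolve
```

Because `get` returns None for a miss, this sketch cannot cache a value that is itself None; a sentinel object would be needed for that case.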

Error handling hints

  • 404 not-found is ambiguous under anti-enumeration; treat it as “missing or not associated.”
  • 409 conflict and 428 expected-revision-required mean your local state is stale; re-fetch before retrying.
  • 403 org-write-blocked means the org is not verified (or is parked/suspended/doomed); do not retry.
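  • 403 with no JSON body means the request was blocked by WAF (see /common/environments.html), either because the per-IP rate limit was exceeded or because the payload matched a blocking rule. Check the request payload for patterns that resemble injection attacks and fix the request content; if you are sending close to the per-IP limit, reduce your request rate. Do not retry the request unchanged.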
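
The hints above, together with the retryable statuses mentioned earlier on this page, can be folded into a single dispatch function. This sketch is illustrative: the function name, the returned labels, and the idea that the error code arrives as a separate `error_code` value are assumptions, since the JSON error body shape is not described here.

```python
def classify(status, content_type, error_code=None):
    """Map a response to a client action label.

    Returns one of: 'waf_blocked', 'do_not_retry', 'refetch', 'retry_with_backoff'.
    """
    is_json = "json" in (content_type or "").lower()
    if status == 403 and not is_json:
        # HTML 403: blocked by WAF before reaching the application.
        return "waf_blocked"
    if status == 403 and error_code == "org-write-blocked":
        return "do_not_retry"
    if status in (409, 428):
        # Local state is stale: re-fetch, then retry the write.
        return "refetch"
    if status == 429 or 500 <= status <= 599:
        return "retry_with_backoff"
    # Includes 404, which may mean "missing or not associated".
    return "do_not_retry"
```

For example, `classify(403, "text/html")` yields 'waf_blocked', while `classify(429, "application/json")` yields 'retry_with_backoff'.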

References