Performance and SLO Targets (Business-Level)

These targets describe required business behavior. The values below are the approved baseline and can be revised only by policy change.

SLO tiers (conceptual)

  • Tier 0 (hot path): customer-facing or scan-based workflows that must feel near-instant.
  • Tier 1 (operational): critical back-office workflows that must be fast and predictable.
  • Tier 2 (batch): large imports/exports and analytics pipelines that can be asynchronous.

Route-class taxonomy (OpenAPI-encoded)

Route classes are embedded in OpenAPI operations using vendor extensions so each endpoint has explicit targets.

OpenAPI fields (per operation):

  • x-route-class: Tier A | Tier B | Tier C | Tier D | Tier E
  • x-qps-target: peak QPS target (per org, per route)
  • x-concurrency-target: peak concurrency target (per org, per route)
  • x-latency-p95-ms: p95 latency target
  • x-latency-p99-ms: p99 latency target (Tier C/D only; others omit)

Tier meanings (used for route-class assignment):

  • Tier A (control plane): low-QPS configuration/approval/admin writes.
  • Tier B (reference reads): lookup/list/reference reads (non-hot path).
  • Tier C (hot operational reads): cart/checkout/availability/pricing reads.
  • Tier D (hot operational writes): orders, reservations, receiving, fulfillment.
  • Tier E (async/background): exports, reconciliation, analytics rollups.

Tier defaults (when a route does not override with a specific target):

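| Tier | Peak QPS | Peak concurrency | P95 | P99 |
|---|---|---|---|---|
| Tier A | 50 | 200 | 500 ms | - |
| Tier B | 300 | 500 | 300 ms | - |
| Tier C | 3000 | 4000 | 150 ms | 400 ms |
| Tier D | 1000 | 2000 | 300 ms | 600 ms |
| Tier E | - | - | - | - |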
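The fallback from per-operation targets to tier defaults can be sketched as follows. This is a minimal illustration, not the actual tooling: the `resolve_targets` helper and the dict-shaped operation are assumptions, and `None` stands for a cell the table leaves empty.

```python
from typing import Optional

# Tier defaults copied from the table above; None means no default is stated.
TIER_DEFAULTS: dict[str, dict[str, Optional[int]]] = {
    "Tier A": {"x-qps-target": 50, "x-concurrency-target": 200,
               "x-latency-p95-ms": 500, "x-latency-p99-ms": None},
    "Tier B": {"x-qps-target": 300, "x-concurrency-target": 500,
               "x-latency-p95-ms": 300, "x-latency-p99-ms": None},
    "Tier C": {"x-qps-target": 3000, "x-concurrency-target": 4000,
               "x-latency-p95-ms": 150, "x-latency-p99-ms": 400},
    "Tier D": {"x-qps-target": 1000, "x-concurrency-target": 2000,
               "x-latency-p95-ms": 300, "x-latency-p99-ms": 600},
    "Tier E": {"x-qps-target": None, "x-concurrency-target": None,
               "x-latency-p95-ms": None, "x-latency-p99-ms": None},
}


def resolve_targets(operation: dict) -> dict:
    """Return effective targets: explicit vendor extensions win over tier defaults.

    `operation` is a hypothetical dict view of one OpenAPI operation object.
    """
    defaults = TIER_DEFAULTS[operation["x-route-class"]]
    return {field: operation.get(field, default) for field, default in defaults.items()}
```

For example, an operation that sets only `x-route-class: Tier C` and `x-latency-p95-ms: 120` would keep its own p95 and inherit the remaining Tier C defaults.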

Mapping to conceptual tiers:

  • Tier C/D → Tier 0 (hot path)
  • Tier A/B → Tier 1 (operational)
  • Tier E → Tier 2 (batch/async)

Exception: USM validate and OFM resolve are treated as Tier 0 for availability/error budgeting even if assigned Tier B targets.
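A small sketch of this mapping, including the budgeting exception, might look like the following. The operation identifiers `usm.validate` and `ofm.resolve` are hypothetical names chosen for illustration; the real operation IDs are not given in this document.

```python
# Route class -> conceptual tier, as listed above.
CONCEPTUAL_TIER = {
    "Tier A": 1,
    "Tier B": 1,
    "Tier C": 0,
    "Tier D": 0,
    "Tier E": 2,
}

# Hypothetical operation identifiers for the two documented exceptions.
TIER0_BUDGET_EXCEPTIONS = {"usm.validate", "ofm.resolve"}


def budgeting_tier(route_class: str, operation_id: str) -> int:
    """Tier used for availability/error budgeting.

    Latency targets still follow the route class; only budgeting is promoted.
    """
    if operation_id in TIER0_BUDGET_EXCEPTIONS:
        return 0
    return CONCEPTUAL_TIER[route_class]
```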

Client timeouts (baseline)

| Tier | Recommended client timeout |
|---|---|
| Tier A | 5–10s |
| Tier B | 3–5s |
| Tier C | 1–2s |
| Tier D | 2–4s |
| Tier E | 30s+ (async jobs) |

If a service documents a different timeout, follow that guidance.
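One way a client could apply these baselines is sketched below. Choosing the upper end of each range is an assumption made here for illustration, as is the `client_timeout_s` helper; a service-documented timeout always takes precedence.

```python
from typing import Optional

# (low, high) in seconds from the table above; Tier E states only a lower bound.
CLIENT_TIMEOUT_RANGE_S: dict[str, tuple[float, Optional[float]]] = {
    "Tier A": (5.0, 10.0),
    "Tier B": (3.0, 5.0),
    "Tier C": (1.0, 2.0),
    "Tier D": (2.0, 4.0),
    "Tier E": (30.0, None),
}


def client_timeout_s(route_class: str, service_timeout_s: Optional[float] = None) -> float:
    """Pick a client timeout, preferring service-specific guidance when present."""
    if service_timeout_s is not None:
        return service_timeout_s
    low, high = CLIENT_TIMEOUT_RANGE_S[route_class]
    # Assumption: use the upper end of the range, or the floor when no upper bound exists.
    return high if high is not None else low
```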

Caching rules (baseline)

  • Safe to cache: reference data (catalog lookups, read-only lists) when explicitly documented.
  • Do not cache: session validation, authorization decisions, mutable operational records.
  • Short-lived cache: member/resolve and api_key/validate can be cached for seconds to reduce load (per service guidance).
  • Invalidate on write: any write must invalidate related cached reads for that entity scope.
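The short-lived cache and invalidate-on-write rules can be illustrated with a small in-memory sketch. The `ScopedTTLCache` class, its method names, and the idea of keying entries by an entity scope are assumptions for this example, not a documented component.

```python
import time
from typing import Any, Callable, Hashable, Optional


class ScopedTTLCache:
    """In-memory cache with a fixed TTL and per-scope invalidation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # (scope, key) -> (expires_at, value)
        self._entries: dict[tuple[Hashable, Hashable], tuple[float, Any]] = {}

    def get(self, scope: Hashable, key: Hashable) -> Optional[Any]:
        entry = self._entries.get((scope, key))
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped so stale data is never served.
            del self._entries[(scope, key)]
            return None
        return value

    def put(self, scope: Hashable, key: Hashable, value: Any) -> None:
        self._entries[(scope, key)] = (self._clock() + self._ttl, value)

    def invalidate_scope(self, scope: Hashable) -> None:
        """Call after any write to drop every cached read for that entity scope."""
        for entry_key in [k for k in self._entries if k[0] == scope]:
            del self._entries[entry_key]
```

A writer would call `invalidate_scope` with the affected entity scope immediately after its commit, so subsequent reads repopulate the cache from the owning service.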

Approved latency targets

| Workflow | p95 target | p99 target | Notes |
|---|---|---|---|
| Stock availability check (ICS) | 200 ms | 750 ms | Facility-scoped; used by checkout and pick. |
| Price resolution (PPM) | 150 ms | 500 ms | Must be fast enough for cart/checkout. |
| Reserve/allocate/commit (ICS+SCM) | 500 ms | 1500 ms | Includes channel context. |
| Checkout / order place (SCM) | 1 s | 3 s | Includes price snapshot + reserve/commit. |
| Receiving scan / putaway (ICS) | 300 ms | 1 s | Scanner workflows must feel instant. |
| Transfer submit / approval (ICS) | 1 s | 3 s | Excludes long-running shipment steps. |
| Return intake (SCM+ICS) | 1 s | 3 s | Includes disposition capture. |
| Procurement submission (PCM) | 2 s | 5 s | Approval step may be async. |
| Analytics aggregation (BI/KPI) | 5 min | 30 min | Asynchronous; no operational blocking. |
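To check measured latencies against a p95/p99 pair from the table, a nearest-rank percentile is one simple option. This is a sketch under that assumption; production monitoring may use histograms or a different percentile definition, and the function names are illustrative.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_latency_targets(samples_ms: list[float], p95_target_ms: float,
                          p99_target_ms: float) -> bool:
    """True when both the p95 and p99 of the samples are within target."""
    return (percentile(samples_ms, 95) <= p95_target_ms
            and percentile(samples_ms, 99) <= p99_target_ms)
```

For price resolution (PPM), the call would be `meets_latency_targets(samples, 150, 500)`.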

Performance dependency diagrams (cross-service)

Checkout / order placement (hot path):

```mermaid
flowchart LR
  UI[Sales Channel Cart/Checkout] --> SCM[SCM Order Place]
  SCM -->|price resolution| PPM[PPM Price/Resolve]
  SCM -->|availability| ICS[ICS Reserve/Allocate]
  SCM -->|catalog lookup| PMC[PMC Product Lookup]
  SCM -->|auth| USM[USM Validate]
  SCM -->|roles| OFM[OFM Member Resolve]
  PMC -->|published snapshot| SCM
```

Receiving / returns (inventory updates):

```mermaid
flowchart LR
  PCM[PCM Receipt Record] --> ICS[ICS Receiving/Putaway]
  SCM[SCM Return Receive] --> ICS
  ICS --> Events[Eventing/Event Changelog]
```

Publishing (catalog control plane):

```mermaid
flowchart LR
  PMC[PMC Publish Run] --> PVM[PVM Snapshot Read]
  PMC --> Search[Search Plane Index]
  PMC --> Events[Changelog/Usage Store]
```

Pricing + promotions + tax resolution (hot path + policy dependencies):

```mermaid
flowchart LR
  Cart[Sales Channel Cart] --> SCM[SCM Pricing Context]
  SCM --> PPM[PPM Price/Resolve]
  PPM --> PVM[PVM Product Attributes]
  PPM --> CRM[CRM Customer Tier/Exemptions]
  PPM --> OFM[OFM Facility/Jurisdiction Context]
  PPM --> Tax[Tax Policy Plane]
  PPM --> SCM
```

Inventory rebalancing / transfer planning (suggestions → execution):

```mermaid
flowchart LR
  Planner[ICS Rebalance Engine] --> Stock[ICS Stock + History]
  Planner --> SCM[SCM Sales Velocity/Forecasts]
  Planner --> PPM[PPM Price/Promo Calendar]
  Planner --> PCM[PCM Inbound Pipeline]
  Planner --> OFM[OFM Facility Constraints]
  Planner --> PVM[PVM Product Constraints]
  Planner --> Suggest[Transfer/Allocation Suggestions]
  Suggest --> Request[ICS Transfer Request Create]
  Request --> Approvals[Approval Gate]
  Approvals --> Ship[Transfer Shipment/Receive]
```

Notes:

  • Hot-path calls should parallelize PPM/ICS/PMC reads where possible.
  • USM validation and OFM resolve results should be held in short-lived caches (seconds) so hot-path calls can meet Tier C/D latency.
  • Cross-service dependencies are read-only on hot paths; writes are localized to the owning service.
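The first note, issuing the PPM, ICS, and PMC reads concurrently, can be sketched with `asyncio.gather`. The three coroutines below are stand-in stubs with made-up return values; real clients, signatures, and payloads are not specified in this document.

```python
import asyncio


async def resolve_price(sku: str) -> dict:
    """Stub for a PPM price-resolve read (hypothetical payload)."""
    await asyncio.sleep(0.01)
    return {"sku": sku, "price_cents": 1999}


async def check_availability(sku: str, facility_id: str) -> dict:
    """Stub for an ICS availability read (hypothetical payload)."""
    await asyncio.sleep(0.01)
    return {"sku": sku, "facility_id": facility_id, "available": 7}


async def lookup_product(sku: str) -> dict:
    """Stub for a PMC product lookup (hypothetical payload)."""
    await asyncio.sleep(0.01)
    return {"sku": sku, "title": "Example product"}


async def load_checkout_context(sku: str, facility_id: str) -> dict:
    # The three reads are independent, so they run concurrently;
    # total latency tracks the slowest call rather than the sum.
    price, availability, product = await asyncio.gather(
        resolve_price(sku),
        check_availability(sku, facility_id),
        lookup_product(sku),
    )
    return {"price": price, "availability": availability, "product": product}
```

A caller would run this with `asyncio.run(load_checkout_context("sku-1", "fac-1"))` or await it from an existing event loop.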

Approved availability targets

  • Tier 0: 99.95% monthly availability.
  • Tier 1: 99.9% monthly availability.
  • Tier 2: 99.5% monthly availability.
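As a worked example, the availability targets translate into a monthly downtime allowance. The sketch assumes a 30-day month; the function name is illustrative.

```python
def allowed_downtime_minutes(availability_pct: float, days_in_month: int = 30) -> float:
    """Minutes of unavailability permitted per month at the given target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - availability_pct) / 100
```

Under that assumption, Tier 0 (99.95%) allows about 21.6 minutes per month, Tier 1 (99.9%) about 43.2 minutes, and Tier 2 (99.5%) about 216 minutes.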

Approved error-rate targets (5xx/timeout)

  • Tier 0 (Tier C/D): ≤ 0.10% error rate per 5‑minute window; ≤ 0.15% monthly.
  • Tier 1 (Tier A/B): ≤ 0.25% error rate per 5‑minute window; ≤ 0.35% monthly.
  • Tier 2 (Tier E async): ≤ 0.50% error rate per 5‑minute window; ≤ 0.75% monthly.
  • 4xx client errors are excluded unless they indicate systemic misconfiguration.
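A per-window check against these limits might look like the sketch below. It assumes that 4xx responses still count as requests in the denominator but not as errors, which the text does not state explicitly; the helper names are also illustrative.

```python
# 5-minute window limits by conceptual tier, expressed as fractions.
WINDOW_ERROR_LIMIT = {0: 0.0010, 1: 0.0025, 2: 0.0050}


def window_error_rate(status_codes: list[int], timeouts: int = 0) -> float:
    """Share of requests in one window that ended in a 5xx or a timeout."""
    total = len(status_codes) + timeouts
    if total == 0:
        return 0.0
    # 4xx responses are not errors here, but they remain in the denominator (assumption).
    server_errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return (server_errors + timeouts) / total


def within_window_budget(tier: int, status_codes: list[int], timeouts: int = 0) -> bool:
    return window_error_rate(status_codes, timeouts) <= WINDOW_ERROR_LIMIT[tier]
```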

Eventing and reconciliation SLOs

  • Outbox/event publish latency: p95 ≤ 2 minutes, p99 ≤ 10 minutes (operational events).
  • Search-plane indexing latency: p95 ≤ 30 seconds, p99 ≤ 2 minutes (operational search).
  • Reconciliation lag: p95 ≤ 15 minutes, p99 ≤ 60 minutes for derived views.
  • Event pipelines and reconciliation jobs are included in availability/error-rate budgeting.

Approved throughput expectations

  • Support high-volume warehouse scanning and concurrent checkout bursts without degradation.
  • Support multiple facilities per org and many orgs concurrently without global bottlenecks.
  • Sustain event ingestion for analytics without delaying operational writes.
  • Provide predictable backpressure behavior during peak periods (no silent data loss).
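The backpressure expectation (reject explicitly instead of dropping silently) can be illustrated with a bounded queue. `BoundedIngest` is a hypothetical in-process sketch; a real pipeline would apply the same idea at its broker or API boundary.

```python
import queue
from typing import Any


class BoundedIngest:
    """Accepts events up to a fixed backlog and rejects the rest explicitly."""

    def __init__(self, max_pending: int):
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_pending)
        self.rejected = 0  # counted so overload is observable, never silent

    def offer(self, event: Any) -> bool:
        """Return True if accepted; False tells the producer to retry later."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def drain_one(self) -> Any:
        """Remove and return the oldest pending event."""
        return self._queue.get_nowait()
```

Because `offer` returns a result rather than discarding the event, the producer keeps ownership of anything that was not accepted and can retry or surface the failure.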

Volume assumptions (baseline)

  • Large retailers may emit ~3M sales events/day per org with ~50% seasonal increases.
  • Burst handling should tolerate 10-20x average event rates without pre-allocation.
  • Capacity planning must assume multiple concurrent hot paths per org (checkout, availability, price resolution).
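The volume assumptions can be turned into per-second rates with simple arithmetic. The sketch assumes events are averaged evenly over a 24-hour day and that the burst multiplier applies to the non-seasonal average; neither is stated in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def event_rates(events_per_day: float, seasonal_increase: float = 0.5,
                burst_multipliers: tuple[int, int] = (10, 20)) -> dict:
    """Average, seasonal, and burst event rates in events per second."""
    average = events_per_day / SECONDS_PER_DAY
    return {
        "average_per_s": average,
        "seasonal_per_s": average * (1 + seasonal_increase),
        "burst_per_s": tuple(average * m for m in burst_multipliers),
    }
```

For 3M events/day this gives roughly 35 events/s on average, about 52 events/s at seasonal peak, and bursts of roughly 347 to 694 events/s per org.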

Approved data freshness targets

  • Operational records are visible immediately after commit (read-your-writes within the same workflow).
  • Event and analytics pipelines publish within minutes; longer latency is acceptable only for deep historical aggregates.

Operational vs analytics boundaries

  • Operational queries must be facility-scoped and optimized for low latency.
  • Org-wide aggregations and historical analytics can be asynchronous.
  • Projections are permitted to be slower than current/historical reads.

Guardrails

  • No N+1 patterns in operational queries.
  • Storage capacity is on-demand; designs must avoid hot partitions and not assume provisioned throughput.
  • Idempotent, retry-safe writes for all state transitions.
  • When a dependency is degraded, policy enforcement must fail closed; read-only access may continue where it is safe.
  • Approval checks must not add meaningful latency to operational calls.
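The idempotent, retry-safe write guardrail is commonly implemented with an idempotency key. The sketch below uses an in-memory store and a hypothetical `TransferStore` class; a real service would persist the key and the state change in the same transaction.

```python
class TransferStore:
    """In-memory stand-in for a service that owns transfer state."""

    def __init__(self) -> None:
        self._state: dict[str, str] = {}
        self._applied: dict[str, dict] = {}  # idempotency key -> recorded result

    def apply_transition(self, idempotency_key: str, transfer_id: str,
                         new_state: str) -> dict:
        # A retry with a known key returns the recorded result and changes nothing.
        if idempotency_key in self._applied:
            return self._applied[idempotency_key]
        self._state[transfer_id] = new_state
        result = {"transfer_id": transfer_id, "state": new_state}
        self._applied[idempotency_key] = result
        return result

    def current_state(self, transfer_id: str) -> str:
        return self._state[transfer_id]
```

A late retry of an earlier request therefore cannot roll a transfer back to a previous state, because the stored result is replayed instead of the write.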