Performance and SLO Targets (Business-Level)

These targets describe required business behavior. The values below are the approved baseline and can be revised only by policy change.

SLO tiers (conceptual)

Tier 0 (hot path): customer-facing or scan-based workflows that must feel near-instant.
Tier 1 (operational): critical back-office workflows that must be fast and predictable.
Tier 2 (batch): large imports/exports and analytics pipelines that can be asynchronous.

Route-class taxonomy (OpenAPI-encoded)

Route classes are embedded in OpenAPI operations using vendor extensions so each endpoint has explicit targets.

OpenAPI fields (per operation):

x-route-class: Tier A | Tier B | Tier C | Tier D | Tier E
x-qps-target: peak QPS target (per org, per route)
x-concurrency-target: peak concurrency target (per org, per route)
x-latency-p95-ms: p95 latency target
x-latency-p99-ms: p99 latency target (Tier C/D only; others omit)

Tier meanings (used for route-class assignment):

Tier A (control plane): low-QPS configuration/approval/admin writes.
Tier B (reference reads): lookup/list/reference reads (non-hot path).
Tier C (hot operational reads): cart/checkout/availability/pricing reads.
Tier D (hot operational writes): orders, reservations, receiving, fulfillment.
Tier E (async/background): exports, reconciliation, analytics rollups.

Tier defaults (when a route does not override with a specific target):

Tier	Peak QPS	Peak concurrency	P95	P99
Tier A	50	200	500 ms	—
Tier B	300	500	300 ms	—
Tier C	3000	4000	150 ms	400 ms
Tier D	1000	2000	300 ms	600 ms
Tier E	—	—	—	—
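
As an illustration of how tier defaults and per-route overrides might combine, here is a minimal Python sketch. The vendor extension names and default values come from this section; the function `resolve_targets`, the dictionary layout, and the use of `None` for "no default" are assumptions made for the example, not part of any documented tooling.

```python
from typing import Dict, Optional

Targets = Dict[str, Optional[int]]

# Tier defaults copied from the table above; None means "no default".
TIER_DEFAULTS: Dict[str, Targets] = {
    "Tier A": {"qps": 50, "concurrency": 200, "p95_ms": 500, "p99_ms": None},
    "Tier B": {"qps": 300, "concurrency": 500, "p95_ms": 300, "p99_ms": None},
    "Tier C": {"qps": 3000, "concurrency": 4000, "p95_ms": 150, "p99_ms": 400},
    "Tier D": {"qps": 1000, "concurrency": 2000, "p95_ms": 300, "p99_ms": 600},
    "Tier E": {"qps": None, "concurrency": None, "p95_ms": None, "p99_ms": None},
}

# Mapping from the OpenAPI vendor extensions to the keys used above.
# The key names on the right are illustrative only.
EXTENSION_KEYS = {
    "x-qps-target": "qps",
    "x-concurrency-target": "concurrency",
    "x-latency-p95-ms": "p95_ms",
    "x-latency-p99-ms": "p99_ms",
}


def resolve_targets(operation: dict) -> Targets:
    """Start from the tier defaults, then apply any per-route overrides."""
    targets = dict(TIER_DEFAULTS[operation["x-route-class"]])
    for extension, key in EXTENSION_KEYS.items():
        if extension in operation:
            targets[key] = operation[extension]
    return targets
```

For example, an operation declaring `x-route-class: Tier C` and `x-latency-p95-ms: 120` would keep the Tier C QPS, concurrency, and p99 defaults while tightening only the p95 target.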

Mapping to conceptual tiers:

Tier C/D → Tier 0 (hot path)
Tier A/B → Tier 1 (operational)
Tier E → Tier 2 (batch/async)

Exception: USM validate and OFM resolve are treated as Tier 0 for availability/error budgeting even if assigned Tier B targets.
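
The mapping and its exception can be expressed as a small lookup. This is a sketch only: the operation identifiers `usm.validate` and `ofm.resolve` are hypothetical names standing in for however those routes are actually identified.

```python
# Route class -> conceptual tier, as listed above.
CONCEPTUAL_TIER = {
    "Tier A": 1,
    "Tier B": 1,
    "Tier C": 0,
    "Tier D": 0,
    "Tier E": 2,
}

# Hypothetical operation identifiers for the two documented exceptions.
TIER0_BUDGET_EXCEPTIONS = {"usm.validate", "ofm.resolve"}


def budgeting_tier(route_class: str, operation_id: str) -> int:
    """Return the conceptual tier used for availability/error budgeting."""
    if operation_id in TIER0_BUDGET_EXCEPTIONS:
        return 0  # budgeted as hot path regardless of latency targets
    return CONCEPTUAL_TIER[route_class]
```

Note that the exception affects only budgeting: latency targets for those routes would still come from their assigned route class.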

Client timeouts (baseline)

Tier	Recommended client timeout
Tier A	5–10s
Tier B	3–5s
Tier C	1–2s
Tier D	2–4s
Tier E	30s+ (async jobs)

If a service documents a different timeout, follow that guidance.
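
A client library could pick its timeout along the following lines. This is a minimal sketch: the function name is hypothetical, and defaulting to the upper end of each recommended range is an assumption, since the table gives ranges rather than single values.

```python
from typing import Dict, Optional, Tuple

# Recommended ranges in seconds, from the table above.
# Tier E has no upper bound ("30s+"), represented here as None.
CLIENT_TIMEOUT_RANGE_S: Dict[str, Tuple[float, Optional[float]]] = {
    "Tier A": (5.0, 10.0),
    "Tier B": (3.0, 5.0),
    "Tier C": (1.0, 2.0),
    "Tier D": (2.0, 4.0),
    "Tier E": (30.0, None),
}


def client_timeout_s(route_class: str,
                     service_timeout_s: Optional[float] = None) -> float:
    """Prefer a service-documented timeout; otherwise use the tier baseline."""
    if service_timeout_s is not None:
        return service_timeout_s
    low, high = CLIENT_TIMEOUT_RANGE_S[route_class]
    # Assumption: use the upper bound when one exists, else the lower bound.
    return high if high is not None else low
```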

Caching rules (baseline)

Safe to cache: reference data (catalog lookups, read-only lists) when explicitly documented.
Do not cache: session validation, authorization decisions, mutable operational records.
Short-lived cache: member/resolve and api_key/validate can be cached for seconds to reduce load (per service guidance).
Invalidate on write: any write must invalidate related cached reads for that entity scope.
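
The short-lived cache and invalidate-on-write rules could be combined as in the sketch below. The class `ScopedTTLCache`, its method names, and the idea of keying entries by an entity scope are assumptions for illustration; actual TTLs should follow per-service guidance.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ScopedTTLCache:
    """In-memory cache with a short TTL and per-scope invalidation."""

    def __init__(self, ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        # (scope, key) -> (expiry time, cached value)
        self._entries: Dict[Tuple[Hashable, Hashable], Tuple[float, Any]] = {}

    def get(self, scope: Hashable, key: Hashable,
            loader: Callable[[], Any]) -> Any:
        """Return a fresh cached value, or call loader and cache its result."""
        now = self._clock()
        hit = self._entries.get((scope, key))
        if hit is not None and hit[0] > now:
            return hit[1]
        value = loader()
        self._entries[(scope, key)] = (now + self._ttl_s, value)
        return value

    def invalidate_scope(self, scope: Hashable) -> None:
        """Drop every cached read for an entity scope; call after any write."""
        for entry_key in [k for k in self._entries if k[0] == scope]:
            del self._entries[entry_key]
```

A write handler would call `invalidate_scope` for the affected entity scope after committing, so the next read reloads from the source of record.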

Approved latency targets

Workflow	p95 target	p99 target	Notes
Stock availability check (ICS)	200 ms	750 ms	Facility-scoped; used by checkout and pick.
Price resolution (PPM)	150 ms	500 ms	Must be fast enough for cart/checkout.
Reserve/allocate/commit (ICS+SCM)	500 ms	1500 ms	Includes channel context.
Checkout / order place (SCM)	1 s	3 s	Includes price snapshot + reserve/commit.
Receiving scan / putaway (ICS)	300 ms	1 s	Scanner workflows must feel instant.
Transfer submit / approval (ICS)	1 s	3 s	Excludes long-running shipment steps.
Return intake (SCM+ICS)	1 s	3 s	Includes disposition capture.
Procurement submission (PCM)	2 s	5 s	Approval step may be async.
Analytics aggregation (BI/KPI)	5 min	30 min	Asynchronous; no operational blocking.
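
To check measured latencies against a row of this table, one option is a nearest-rank percentile, sketched below. The source does not specify a percentile method or measurement window, so both the nearest-rank choice and the function names are assumptions.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(samples_ms: Sequence[float],
                  p95_target_ms: float, p99_target_ms: float) -> bool:
    """True when both the p95 and p99 of the samples are within target."""
    return (percentile(samples_ms, 95) <= p95_target_ms
            and percentile(samples_ms, 99) <= p99_target_ms)
```

For price resolution (PPM), the call would be `meets_targets(samples, 150, 500)`, with `samples` holding per-request latencies in milliseconds.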

Performance dependency diagrams (cross-service)

Checkout / order placement (hot path):

mermaid

flowchart LR
  UI[Sales Channel Cart/Checkout] --> SCM[SCM Order Place]
  SCM -->|price resolution| PPM[PPM Price/Resolve]
  SCM -->|availability| ICS[ICS Reserve/Allocate]
  SCM -->|catalog lookup| PMC[PMC Product Lookup]
  SCM -->|auth| USM[USM Validate]
  SCM -->|roles| OFM[OFM Member Resolve]
  PMC -->|published snapshot| SCM

Receiving / returns (inventory updates):

mermaid

flowchart LR
  PCM[PCM Receipt Record] --> ICS[ICS Receiving/Putaway]
  SCM[SCM Return Receive] --> ICS
  ICS --> Events[Eventing/Event Changelog]

Publishing (catalog control plane):

mermaid

flowchart LR
  PMC[PMC Publish Run] --> PVM[PVM Snapshot Read]
  PMC --> Search[Search Plane Index]
  PMC --> Events[Changelog/Usage Store]

Pricing + promotions + tax resolution (hot path + policy dependencies):

mermaid

flowchart LR
  Cart[Sales Channel Cart] --> SCM[SCM Pricing Context]
  SCM --> PPM[PPM Price/Resolve]
  PPM --> PVM[PVM Product Attributes]
  PPM --> CRM[CRM Customer Tier/Exemptions]
  PPM --> OFM[OFM Facility/Jurisdiction Context]
  PPM --> Tax[Tax Policy Plane]
  PPM --> SCM

Inventory rebalancing / transfer planning (suggestions → execution):

mermaid

flowchart LR
  Planner[ICS Rebalance Engine] --> Stock[ICS Stock + History]
  Planner --> SCM[SCM Sales Velocity/Forecasts]
  Planner --> PPM[PPM Price/Promo Calendar]
  Planner --> PCM[PCM Inbound Pipeline]
  Planner --> OFM[OFM Facility Constraints]
  Planner --> PVM[PVM Product Constraints]
  Planner --> Suggest[Transfer/Allocation Suggestions]
  Suggest --> Request[ICS Transfer Request Create]
  Request --> Approvals[Approval Gate]
  Approvals --> Ship[Transfer Shipment/Receive]

Notes:

Hot-path calls should parallelize PPM/ICS/PMC reads where possible.
USM validation and OFM resolve results should use short-lived caching to meet Tier C/D latency.
Cross-service dependencies are read-only on hot paths; writes are localized to the owning service.
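
The first note, parallelizing the PPM/ICS/PMC reads, might look like the following `asyncio` sketch. The `read_stub` coroutine and its delays stand in for real downstream calls and are purely hypothetical; only the fan-out pattern is the point.

```python
import asyncio
from typing import Dict


async def read_stub(name: str, delay_s: float) -> str:
    """Stand-in for a downstream read; a real client call would go here."""
    await asyncio.sleep(delay_s)
    return f"{name}-ok"


async def hot_path_reads(budget_s: float) -> Dict[str, str]:
    """Issue the three independent reads concurrently under one time budget."""
    price, availability, product = await asyncio.wait_for(
        asyncio.gather(
            read_stub("ppm", 0.03),   # price resolution
            read_stub("ics", 0.05),   # availability
            read_stub("pmc", 0.02),   # catalog lookup
        ),
        timeout=budget_s,
    )
    return {"price": price, "availability": availability, "product": product}
```

Run concurrently, the overall wait is governed by the slowest read rather than the sum of all three, and exceeding the budget raises a timeout error instead of stalling the caller.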

Approved availability targets

Tier 0: 99.95% monthly availability.
Tier 1: 99.9% monthly availability.
Tier 2: 99.5% monthly availability.
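
These percentages translate into a monthly downtime budget. The sketch below does the arithmetic, assuming a 30-day month (the source does not define the month length); the function name is illustrative.

```python
def monthly_downtime_budget_minutes(availability_pct: float,
                                    days_in_month: int = 30) -> float:
    """Minutes of allowed unavailability for a monthly availability target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - availability_pct / 100)
```

Under the 30-day assumption, Tier 0 allows about 21.6 minutes per month, Tier 1 about 43.2 minutes, and Tier 2 about 216 minutes.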

Approved error-rate targets (5xx/timeout)

Tier 0 (Tier C/D): ≤ 0.10% error rate per 5‑minute window; ≤ 0.15% monthly.
Tier 1 (Tier A/B): ≤ 0.25% error rate per 5‑minute window; ≤ 0.35% monthly.
Tier 2 (Tier E async): ≤ 0.50% error rate per 5‑minute window; ≤ 0.75% monthly.
4xx client errors are excluded unless they indicate systemic misconfiguration.
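
A per-window check against these limits could be sketched as follows. The request representation (status code plus a timeout flag) is an assumption, as is counting 4xx responses in the denominator while excluding them from errors; the "systemic misconfiguration" case is not modeled here.

```python
from typing import Dict, Iterable, Tuple

# Per-5-minute-window limits by conceptual tier, as fractions.
WINDOW_LIMIT: Dict[int, float] = {0: 0.0010, 1: 0.0025, 2: 0.0050}


def counts_as_error(status_code: int, timed_out: bool) -> bool:
    """Only 5xx responses and timeouts count; 4xx client errors do not."""
    return timed_out or 500 <= status_code <= 599


def error_rate(requests: Iterable[Tuple[int, bool]]) -> float:
    """Fraction of requests in the window that count as errors."""
    observed = list(requests)
    if not observed:
        return 0.0
    errors = sum(1 for status, timed_out in observed
                 if counts_as_error(status, timed_out))
    return errors / len(observed)


def window_within_budget(tier: int,
                         requests: Iterable[Tuple[int, bool]]) -> bool:
    """True when the window's error rate is at or below the tier limit."""
    return error_rate(requests) <= WINDOW_LIMIT[tier]
```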

Eventing and reconciliation SLOs

Outbox/event publish latency: p95 ≤ 2 minutes, p99 ≤ 10 minutes (operational events).
Search-plane indexing latency: p95 ≤ 30 seconds, p99 ≤ 2 minutes (operational search).
Reconciliation lag: p95 ≤ 15 minutes, p99 ≤ 60 minutes for derived views.
Event pipelines and reconciliation jobs are included in availability/error-rate budgeting.

Approved throughput expectations

Support high-volume warehouse scanning and concurrent checkout bursts without degradation.
Support multiple facilities per org and many orgs concurrently without global bottlenecks.
Sustain event ingestion for analytics without delaying operational writes.
Provide predictable backpressure behavior during peak periods (no silent data loss).
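
One way to make backpressure predictable is a bounded buffer that rejects explicitly when full, so the producer always learns whether an event was accepted. The sketch below is an assumption about how that might be done; `BoundedIngest` and `offer` are hypothetical names, and a real system would pair rejection with retry or durable spill.

```python
import queue
from typing import Any


class BoundedIngest:
    """Bounded event buffer that signals overload instead of dropping silently."""

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)

    def offer(self, event: Any) -> bool:
        """Return True if accepted, False if the caller must back off and retry."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def take(self) -> Any:
        """Remove and return the oldest buffered event (raises queue.Empty if none)."""
        return self._queue.get_nowait()
```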

Volume assumptions (baseline)

Large retailers may emit ~3M sales events/day per org with ~50% seasonal increases.
Burst handling should tolerate 10–20x average event rates without pre-allocation.
Capacity planning must assume multiple concurrent hot paths per org (checkout, availability, price resolution).
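
The figures above imply rough per-second rates, worked through in the sketch below. Applying the burst multiplier on top of the seasonally increased average is an assumption; the source does not say whether bursts are measured against the base or the seasonal average.

```python
from typing import Tuple

SECONDS_PER_DAY = 86_400


def average_rate_per_s(events_per_day: float) -> float:
    """Average events per second for a daily volume."""
    return events_per_day / SECONDS_PER_DAY


def burst_range_per_s(events_per_day: float,
                      seasonal_uplift: float = 0.5,
                      low_multiplier: float = 10,
                      high_multiplier: float = 20) -> Tuple[float, float]:
    """Burst rate range, assuming bursts stack on the seasonal average."""
    seasonal_avg = average_rate_per_s(events_per_day) * (1 + seasonal_uplift)
    return seasonal_avg * low_multiplier, seasonal_avg * high_multiplier
```

For 3M events/day this gives an average of roughly 35 events/s, about 52 events/s with the seasonal increase, and a burst range of roughly 520 to 1,040 events/s per org under the stated assumption.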

Approved data freshness targets

Operational records are visible immediately after commit (read-your-writes within the same workflow).
Event and analytics pipelines publish within minutes; longer latency is acceptable only for deep historical aggregates.

Operational vs analytics boundaries

Operational queries must be facility-scoped and optimized for low latency.
Org-wide aggregations and historical analytics can be asynchronous.
Projections are permitted to be slower than current/historical reads.

Guardrails

No N+1 patterns in operational queries.
Storage capacity is on-demand; designs must avoid hot partitions and must not assume provisioned throughput.
Idempotent, retry-safe writes for all state transitions.
When a dependency is degraded, policy enforcement must fail closed, but read-only access may continue where safe.
Approval checks must not add meaningful latency to operational calls.
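
The idempotent, retry-safe write guardrail can be illustrated with an idempotency key. This is a minimal in-memory sketch; the `ReservationStore` class and its fields are hypothetical, and a real service would persist the key and the result in the same transaction as the state change.

```python
from typing import Dict


class ReservationStore:
    """Toy store where retrying a write with the same key has no extra effect."""

    def __init__(self) -> None:
        self.reserved_total = 0
        self._results: Dict[str, dict] = {}  # idempotency key -> first result

    def reserve(self, idempotency_key: str, quantity: int) -> dict:
        """Apply the reservation once; replay the stored result on retries."""
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.reserved_total += quantity
        result = {"quantity": quantity, "reserved_total": self.reserved_total}
        self._results[idempotency_key] = result
        return result
```

A client that times out and retries with the same key receives the original result, and the reserved total changes only once.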
