Integration Plane Management (IPM)

Status: implemented (core integration plane).

Purpose

Provide the external integration plane for downstream systems:

  • Event catalog for stack-module events.
  • Export contracts aligned to event streams.
  • Webhooks/subscriptions with retry, signatures, and scoped delivery (org/channel/facility).
  • CDC feeds (org-scoped with service/channel/facility/time filters).
  • Bulk import/export jobs for event streams with continuation support.
  • Contract-maturity surfaces (lifecycle/state machines, specimen payloads, pagination conventions).
  • KPI snapshots and alerts (org/facility buckets) with optional inbox delivery.

System-of-record boundaries

  • IPM does not own domain records. It mirrors events emitted by domain systems.
  • CDC and bulk exports are derived from ingested event streams.
  • Webhook delivery failures are tracked independently from upstream events.

Core workflows

  • Catalog + lifecycle discovery: list event catalog, lifecycle/state-machine summaries, and specimen payloads.
  • Export contract: list per-service export contract metadata aligned to event streams.
  • Webhook subscription: create a webhook with filters, deliver events with HMAC signatures, retry on failure.
  • Webhook replay: re-deliver matching events from the CDC store with a cursor for continuation.
  • CDC feed: list ingested events by org, with optional service/channel/facility/time filters and pagination.
  • Bulk export: create an export job that generates NDJSON files for event streams; continue with cursors.
  • Bulk import: upload NDJSON events and commit for ingestion; continuations resume from line offsets.
  • KPI snapshots + alerts: record KPI snapshots, query by time bucket, and emit threshold alerts to inbox.
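
The bulk import continuation described above (resuming from a line offset) can be sketched as follows. This is a minimal illustration, not the IPM implementation: the function name import_ndjson, the max_lines batch size, and the shape of the returned continuation record (next_line, done, errors) are all assumptions made for the example.

```python
import json
from typing import Any, Callable, TextIO


def import_ndjson(stream: TextIO, ingest: Callable[[Any], None],
                  start_line: int = 0, max_lines: int = 1000) -> dict:
    """Ingest at most max_lines NDJSON lines, beginning at start_line.

    Returns a hypothetical continuation record: next_line is the 0-based
    line offset to resume from, done is True once the stream is exhausted.
    """
    errors = []
    next_line = start_line
    for line_no, line in enumerate(stream):
        if line_no < start_line:
            continue  # already handled by an earlier batch
        if line_no >= start_line + max_lines:
            # Batch is full; the caller resumes from this offset.
            return {"next_line": line_no, "done": False, "errors": errors}
        next_line = line_no + 1
        text = line.strip()
        if not text:
            continue  # blank lines still count toward the offset
        try:
            ingest(json.loads(text))
        except json.JSONDecodeError as exc:
            # Record the bad line and keep going (partial-failure report).
            errors.append({"line": line_no, "error": str(exc)})
    return {"next_line": next_line, "done": True, "errors": errors}
```

A caller would pass the returned next_line back as start_line on the following request until done is True. Counting every physical line, including blank and malformed ones, keeps the offset stable across retries.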

Data contracts

  • Request context is required for write operations and is stored with job/subscription metadata.
  • Events carry request_context snapshots, reason codes, and source_refs.
  • CDC feeds return the event envelope as stored; redaction metadata is published in the catalog.
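
To make the envelope shape concrete, here is a rough sketch of what an event carrying these fields might look like. Only event_id, request_context, reason codes, and source_refs come from this document; every other field name (event_type, org_id, occurred_at, payload) and the EventEnvelope class itself are hypothetical placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class EventEnvelope:
    """Illustrative envelope; not the actual IPM schema."""
    event_id: str
    event_type: str                      # hypothetical field
    org_id: str                          # hypothetical field
    occurred_at: str                     # hypothetical: ISO 8601 timestamp
    request_context: dict[str, Any]      # snapshot taken at write time
    reason_codes: list[str] = field(default_factory=list)
    source_refs: list[str] = field(default_factory=list)
    payload: dict[str, Any] = field(default_factory=dict)  # hypothetical

    def to_ndjson_line(self) -> str:
        # One JSON object per line, as NDJSON requires.
        return json.dumps(asdict(self), sort_keys=True)
```

Serializing one envelope per line matches the NDJSON format used by the bulk export and import jobs.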

Performance posture

  • Read/list operations are Tier B/C (low latency with pagination).
  • Write operations (webhook and bulk job mutations) are Tier D.

Failure posture

  • Webhook delivery failures are recorded and retried with backoff.
  • Webhooks auto-pause after repeated delivery failures; operators can re-enable them, which resets the failure counters.
  • CDC ingestion is idempotent, keyed on event_id and timestamp; duplicate events are tolerated.
  • Bulk jobs are queued and executed asynchronously; jobs that partially fail include a failure report.
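
The retry, auto-pause, and idempotent-ingestion behaviors above can be sketched as follows. This is an illustration under stated assumptions, not the IPM implementation: the exponential backoff schedule, the threshold of 5 failures, counting consecutive (rather than total) failures, and the names backoff_delay, WebhookState, and CdcStore are all invented for the example.

```python
from dataclasses import dataclass


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped."""
    return min(cap, base * (2 ** attempt))


@dataclass
class WebhookState:
    max_failures: int = 5          # assumed auto-pause threshold
    consecutive_failures: int = 0
    paused: bool = False

    def record_result(self, ok: bool) -> None:
        if ok:
            # Assumption: a successful delivery clears the failure streak.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_failures:
            self.paused = True

    def re_enable(self) -> None:
        # Operator action: resume delivery and reset the failure counter.
        self.paused = False
        self.consecutive_failures = 0


class CdcStore:
    """In-memory stand-in for a store keyed on (event_id, timestamp)."""

    def __init__(self) -> None:
        self._events: dict[tuple[str, str], dict] = {}

    def ingest(self, event: dict) -> bool:
        """Store the event; return False if it was already ingested."""
        key = (event["event_id"], event["timestamp"])
        if key in self._events:
            return False  # duplicate: tolerated, nothing changes
        self._events[key] = event
        return True
```

Because a repeated ingest of the same key is a no-op rather than an error, upstream producers can safely re-send events after a timeout.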

Webhook secret posture

  • signing_secret is returned once on create; get/list return only secret_last4.
  • Rotate by creating a new webhook, verifying delivery, then revoking the old webhook.
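
On the receiving side, verifying a delivery with the signing_secret might look like the sketch below. The document states only that deliveries carry HMAC signatures; the choice of SHA-256, hex encoding, signing the raw request body, and the helper names (sign_payload, verify_signature, verify_any, secret_last4 as a function) are assumptions for illustration.

```python
import hashlib
import hmac
from typing import Iterable


def sign_payload(signing_secret: str, body: bytes) -> str:
    """Hex HMAC of the raw body (SHA-256 is assumed, not confirmed)."""
    return hmac.new(signing_secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(signing_secret: str, body: bytes, received: str) -> bool:
    expected = sign_payload(signing_secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))


def verify_any(secrets: Iterable[str], body: bytes, received: str) -> bool:
    """Accept a signature from any of several secrets.

    Useful while rotating: the old and new webhooks may both deliver to
    the same receiver until the old one is revoked.
    """
    return any(verify_signature(s, body, received) for s in secrets)


def secret_last4(signing_secret: str) -> str:
    """Masked form suitable for display, mirroring the secret_last4 field."""
    return signing_secret[-4:]
```

During a rotation, a receiver could call verify_any with both the old and new secrets, then drop the old secret once the old webhook has been revoked.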