SRE Agents
Adaptive lets SRE agents diagnose and remediate production incidents — pulling traces, querying clusters, and running runbooks — without standing credentials. JIT access, scoped to the alert, with every action recorded. You write the prompts and workflows; Adaptive provides the harness, tools, MCP registry, networking, and guardrails.
SRE agents need real access to production to be useful — Kubernetes clusters, observability stacks, databases, and infrastructure APIs. Granting that access through static service accounts gives every agent the same broad blast radius an on-call engineer has, with none of the human judgement. When an agent restarts the wrong pod or rolls a bad config, there is no scoped credential to revoke and no session to replay.
Production reliability work happens in the highest-privilege, highest-pressure environment in the company. Agents that help here have to be fast, scoped, and reviewable. Most setups deliver on at most one of the three.
Scoped, auditable runbook execution for SRE agents
Adaptive provides the harness, tools, MCP registry, networking, and guardrails: JIT credentials per incident, kube/SSH/cloud access bound to the affected service, and guardrails that block destructive actions until a reviewer signs off. You provide the prompts and workflows. The SRE agent runs your runbooks inside the Exo policy envelope, with full session capture for the postmortem.
How Adaptive helps
Alert-Scoped Access
Each SRE agent session is scoped to the alert it was paged on — the affected service, namespace, cluster, and time window. Agents cannot wander into unrelated systems while triaging.
Wire alerts from your monitoring stack into Adaptive. Exo issues a session bound to the alert's labels and revokes it when the incident closes.
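To make the idea of a session bound to an alert's labels concrete, here is a minimal sketch in Python. It is an illustration only and not Adaptive's or Exo's API: `SessionScope`, `scope_from_alert`, `in_scope`, the label keys, and the one-hour window are all hypothetical. The sketch also models the session lifetime as a fixed time window, whereas the text above describes revocation when the incident closes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class SessionScope:
    """What one agent session may touch, derived from a single alert."""
    service: str
    namespace: str
    cluster: str
    not_before: datetime
    not_after: datetime


def scope_from_alert(labels, fired_at, window=timedelta(hours=1)):
    # Hypothetical label names; real alert payloads may use different keys.
    return SessionScope(
        service=labels["service"],
        namespace=labels["namespace"],
        cluster=labels["cluster"],
        not_before=fired_at,
        not_after=fired_at + window,
    )


def in_scope(scope, cluster, namespace, service, at):
    """True only if the target matches the alert and the window is open."""
    return (
        (cluster, namespace, service)
        == (scope.cluster, scope.namespace, scope.service)
        and scope.not_before <= at <= scope.not_after
    )


fired = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
scope = scope_from_alert(
    {"service": "checkout", "namespace": "payments", "cluster": "prod-eu-1"},
    fired,
)
```

With this scope, a request against `checkout` five minutes after the alert passes `in_scope`, while a request against a neighbouring service in the same namespace, or any request after the window closes, does not.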
JIT Kube, SSH & Cloud Credentials
Generate short-lived kubeconfigs, SSH certificates, and cloud roles per session. No static admin tokens on the agent — credentials expire when the runbook completes.
Onboard clusters and cloud accounts in the control plane once. Agents request credentials per session, and operators manage rotation from one place.
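The per-session credential lifecycle described above can be sketched roughly as follows. This is a toy in-memory model under stated assumptions, not Adaptive's implementation: `CredentialBroker`, `SessionCredential`, the 15-minute TTL, and the opaque token standing in for a kubeconfig, SSH certificate, or cloud role are all hypothetical.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class SessionCredential:
    session_id: str
    kind: str  # e.g. "kubeconfig", "ssh-cert", "cloud-role"
    expires_at: datetime
    # Opaque random token standing in for the real credential material.
    token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    revoked: bool = False

    def is_valid(self, now):
        return not self.revoked and now < self.expires_at


class CredentialBroker:
    """Hypothetical in-memory broker; illustrates the lifecycle only."""

    def __init__(self, ttl=timedelta(minutes=15)):
        self.ttl = ttl
        self._issued = {}

    def issue(self, session_id, kind, now):
        cred = SessionCredential(session_id, kind, expires_at=now + self.ttl)
        self._issued.setdefault(session_id, []).append(cred)
        return cred

    def complete_runbook(self, session_id):
        # Revoke everything issued to the session as soon as the runbook ends.
        for cred in self._issued.pop(session_id, []):
            cred.revoked = True


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
broker = CredentialBroker()
kube = broker.issue("sess-1", "kubeconfig", now)
```

The point of the design is that a credential stops working in two independent ways: its TTL lapses, or the runbook finishes and the broker revokes it early. Neither path leaves a long-lived secret on the agent.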
Guardrails for Destructive Actions
Block or require human sign-off on irreversible operations — pod deletes outside the affected service, config rollbacks across environments, schema changes, mass restarts. Safe diagnostics pass through automatically.
Define guardrails per resource and operation. High-risk actions route to the on-call reviewer; reads and known-safe runbooks execute without delay.
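One way to picture guardrails defined per resource and operation is a first-match rule table. The sketch below is an assumption about shape, not Adaptive's policy format: the `Decision` names, the rule entries, and the default of blocking anything unmatched are all hypothetical and purely illustrative.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"    # safe diagnostics pass through
    REVIEW = "review"  # route to the on-call reviewer
    BLOCK = "block"    # refuse outright


# (resource pattern, operation pattern, decision); first match wins.
# Rule contents are illustrative, not a recommended policy.
RULES = [
    ("*", "get", Decision.ALLOW),
    ("*", "list", Decision.ALLOW),
    ("*", "logs", Decision.ALLOW),
    ("pod", "delete", Decision.REVIEW),
    ("deployment", "restart", Decision.REVIEW),
    ("schema", "*", Decision.REVIEW),
]


def evaluate(resource, operation, rules=RULES, default=Decision.BLOCK):
    """Return the decision of the first rule matching resource and operation."""
    for res_pat, op_pat, decision in rules:
        if fnmatchcase(resource, res_pat) and fnmatchcase(operation, op_pat):
            return decision
    return default
```

Under these rules, `evaluate("pod", "get")` is allowed without delay, `evaluate("pod", "delete")` is routed to review, and an operation no rule mentions, such as `evaluate("node", "drain")`, falls through to the blocking default.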
Session Replay for Postmortems
Every command, tool call, and MCP invocation an SRE agent makes during an incident is recorded against the alert. Replay the session in the postmortem instead of reconstructing it from chat scrollback.
Stream session events into your SIEM and incident tool. Attach the replay link to the incident record so the timeline writes itself.
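As a rough sketch of what recording events against the alert could look like, the example below serializes session events as JSON Lines (a common ingestion format for log pipelines) and rebuilds an ordered timeline for one alert. The `SessionEvent` fields, the event kinds, and the helper names are hypothetical; the actual event schema is not described in this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SessionEvent:
    alert_id: str
    seq: int     # order of the event within the session
    kind: str    # e.g. "command", "tool_call", "mcp_invocation"
    detail: str


def to_jsonl(events):
    """Serialize events one JSON object per line for a log pipeline."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in events)


def timeline(events, alert_id):
    """Ordered, human-readable timeline of one alert's events."""
    mine = sorted((e for e in events if e.alert_id == alert_id), key=lambda e: e.seq)
    return [f"{e.seq:03d} {e.kind}: {e.detail}" for e in mine]


events = [
    SessionEvent("alert-42", 2, "tool_call", "query traces"),
    SessionEvent("alert-42", 1, "command", "kubectl get pods -n payments"),
    SessionEvent("alert-7", 1, "command", "kubectl get nodes"),
]
```

Because every event carries the alert identifier, the postmortem timeline is a filter and a sort rather than a reconstruction: `timeline(events, "alert-42")` returns the two `alert-42` events in sequence order and ignores the unrelated session.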
SOC2 Type II