💡 New Finds
SRE Agent Becomes a Virtual Responder
PagerDuty's SRE Agent has evolved from an assistant into a first-line virtual responder. It now handles detection, triage, and initial diagnostics autonomously, escalating to humans only when it reaches its confidence boundary. Among major ops platforms, this is arguably the closest yet to fully autonomous incident response.
- Handles detection → triage → initial diagnostics without human intervention
- Escalates with full context package when it hits a wall
- Positioned as shift-left prevention, not just response
Multi-Agent Ecosystem via MCP
PagerDuty opened its agent platform through the Model Context Protocol (MCP), enabling third-party AI agents to plug into the Operations Cloud as coordinated responders. Think: Datadog agent detects anomaly → PagerDuty agent triages → Slack agent notifies → human validates. All coordinated, not siloed.
- MCP connectors for agent-to-agent handoffs across vendors
- Operations Cloud acts as the coordination layer, not just a ticketing system
- Partners include observability, DevOps, and coding tools
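The detect → triage → notify → validate chain can be pictured as one shared incident context passed from agent to agent. The following is a minimal sketch in plain Python, not PagerDuty's implementation: every agent function, field name, and rule here is a hypothetical stand-in, and it does not use any real MCP SDK or PagerDuty API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IncidentContext:
    """Shared context handed from one agent to the next (hypothetical shape)."""
    alert: str
    severity: str = "unknown"
    notified_channel: str = ""
    needs_human: bool = False
    trail: List[str] = field(default_factory=list)


Agent = Callable[[IncidentContext], IncidentContext]


def detect_agent(ctx: IncidentContext) -> IncidentContext:
    # Stand-in for an observability agent flagging an anomaly.
    ctx.trail.append("detect")
    return ctx


def triage_agent(ctx: IncidentContext) -> IncidentContext:
    # Toy rule for illustration: alerts mentioning "latency" are high severity.
    ctx.severity = "high" if "latency" in ctx.alert else "low"
    ctx.trail.append("triage")
    return ctx


def notify_agent(ctx: IncidentContext) -> IncidentContext:
    # Stand-in for a chat agent posting to a channel and requesting validation.
    ctx.notified_channel = "#incidents"
    ctx.needs_human = ctx.severity == "high"
    ctx.trail.append("notify")
    return ctx


def coordinate(ctx: IncidentContext, agents: List[Agent]) -> IncidentContext:
    """Run agents in order; each receives the context the previous one produced."""
    for agent in agents:
        ctx = agent(ctx)
    return ctx


result = coordinate(
    IncidentContext(alert="p99 latency spike on checkout"),
    [detect_agent, triage_agent, notify_agent],
)
print(result.trail, result.severity, result.needs_human)
```

The point of the sketch is the coordination layer: no agent calls another directly, so agents from different vendors only need to agree on the context they hand over.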
5-Step Agentic Workflow Builder
PagerDuty published its framework for automating critical workflows with AI agents: Identify → Design → Build → Test → Deploy. This is more than marketing: it is a structured onboarding path with real hooks into the platform. It targets the "islands of automation" problem, where teams have tools that don't talk to each other.
- Framework guides teams from manual chaos to autonomous operations
- Each step maps to PagerDuty product capabilities
- Designed for non-AI-engineers to configure agents
Slack-Native Incident Management
Deeper Slack integration means incident response happens where engineers already are. AI agents run diagnostics, pull runbooks, and update status, all within Slack threads. No context-switching to a separate dashboard.
- Full incident lifecycle inside Slack threads
- AI agents respond directly in-channel
- Reduces mean-time-to-engage by eliminating tool switching
Agentic Incident Lifecycle
┌───────────────────────────────────────────────────────────────────────┐
│                      AGENTIC INCIDENT LIFECYCLE                       │
│                                                                       │
│  TRIGGERS            AI CLASSIFY           AI TRIAGE                  │
│  ────────            ───────────           ─────────                  │
│  • Alert fires       • PagerDuty           • SRE Agent                │
│  • Metric breach       Advance reads         assesses severity        │
│  • Log anomaly         alert payload       • Pulls similar            │
│  • User report       • ML noise              past incidents           │
│  • API webhook         reduction           • Checks affected          │
│                      • Groups related        services                 │
│                        alerts                                         │
│       │                   │                    │                      │
│       ▼                   ▼                    ▼                      │
│  ┌─────────┐         ┌─────────┐         ┌───────────┐                │
│  │ INGEST  │────────▶│CLASSIFY │────────▶│  TRIAGE   │                │
│  └─────────┘         └─────────┘         └───────────┘                │
│                                                │                      │
│                          ┌─────────────────────┼──────────────┐       │
│                          │                     │              │       │
│                          ▼                     ▼              ▼       │
│                    ┌───────────┐         ┌───────────┐   ┌─────────┐  │
│                    │ AUTO-HEAL │         │ ESCALATE  │   │SUPPRESS │  │
│                    │ (Agent    │         │ to Human  │   │(Noise)  │  │
│                    │ executes  │         │ w/Context │   │         │  │
│                    │ runbook)  │         │ Package   │   │         │  │
│                    └───────────┘         └───────────┘   └─────────┘  │
│                          │                     │                      │
│                          ▼                     ▼                      │
│                    ┌───────────┐         ┌───────────┐                │
│                    │ VERIFY &  │         │   HUMAN   │                │
│                    │ CLOSE     │         │ RESOLVES  │                │
│                    │ (Agent    │         │           │                │
│                    │ confirms  │         │           │                │
│                    │ fix)      │         │           │                │
│                    └───────────┘         └───────────┘                │
│                          │                     │                      │
│                          └──────────┬──────────┘                      │
│                                     ▼                                 │
│                             ┌───────────────┐                         │
│                             │ FEEDBACK LOOP │                         │
│                             │ • Update KB   │                         │
│                             │ • Retrain ML  │                         │
│                             │ • Refine      │                         │
│                             │   runbooks    │                         │
│                             └───────────────┘                         │
└───────────────────────────────────────────────────────────────────────┘
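The lifecycle in the diagram can also be read as a small transition graph. The sketch below transcribes the diagram's edges into a Python dictionary and checks whether a sequence of stages is a valid path. The stage identifiers and the `is_valid_path` helper are illustrative names, not PagerDuty terminology or APIs.

```python
# Allowed transitions, transcribed from the lifecycle diagram.
TRANSITIONS = {
    "INGEST": {"CLASSIFY"},
    "CLASSIFY": {"TRIAGE"},
    "TRIAGE": {"AUTO_HEAL", "ESCALATE", "SUPPRESS"},
    "AUTO_HEAL": {"VERIFY_AND_CLOSE"},
    "ESCALATE": {"HUMAN_RESOLVES"},
    "VERIFY_AND_CLOSE": {"FEEDBACK_LOOP"},
    "HUMAN_RESOLVES": {"FEEDBACK_LOOP"},
    "SUPPRESS": set(),       # the diagram shows no outgoing edge for suppressed noise
    "FEEDBACK_LOOP": set(),  # terminal stage
}


def is_valid_path(path):
    """Return True if the path starts at INGEST and every step is an allowed transition."""
    if not path or path[0] != "INGEST":
        return False
    return all(nxt in TRANSITIONS.get(cur, set()) for cur, nxt in zip(path, path[1:]))


auto_heal_path = ["INGEST", "CLASSIFY", "TRIAGE", "AUTO_HEAL", "VERIFY_AND_CLOSE", "FEEDBACK_LOOP"]
skipped_triage = ["INGEST", "CLASSIFY", "AUTO_HEAL"]
print(is_valid_path(auto_heal_path))   # True
print(is_valid_path(skipped_triage))   # False
```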
Where AI Intervenes vs. Where Humans Take Over
| Stage | AI Agent | Human | Boundary Condition |
|---|---|---|---|
| Ingest | ✅ Noise reduction, grouping | None | Fully automated |
| Classify | ✅ ML + LLM classification | None | Uncertain → flag for review |
| Triage | ✅ Severity, impact, similar past | Review if novel | Confidence < threshold |
| Auto-Heal | ✅ Runbook execution | Approve if destructive | Destructive actions gated |
| Escalate | ✅ Context package assembly | ✅ Resolution | Handoff with full context |
| Verify | ✅ Health check, automated tests | Spot-check | Sampling for QA |
| Close | ✅ Auto-close with summary | Review summary | Post-mortem triggers |
| Learn | ✅ KB update, model feedback | ✅ Post-mortem writeup | Human writes, AI enriches |
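The boundary conditions in the table amount to a routing policy. Here is a minimal sketch of how such a policy could look, assuming a hypothetical `TriageResult` shape, a made-up threshold of 0.8 (the source only says "confidence < threshold"), and an illustrative `AUTO_HEAL_PENDING_APPROVAL` outcome for gated destructive actions. None of these names come from PagerDuty.

```python
from dataclasses import dataclass

# Assumed value for illustration; the actual threshold is not published in the source.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class TriageResult:
    confidence: float   # agent's confidence in its diagnosis, 0.0 to 1.0
    is_noise: bool      # alert was classified as noise
    has_runbook: bool   # a matching runbook exists
    destructive: bool   # the runbook would take a destructive action


def route(result: TriageResult) -> str:
    """Pick the next stage according to the boundary conditions in the table."""
    if result.is_noise:
        return "SUPPRESS"
    # Low confidence or no runbook: hand off to a human with the context package.
    if result.confidence < CONFIDENCE_THRESHOLD or not result.has_runbook:
        return "ESCALATE"
    # Destructive actions are gated behind human approval.
    if result.destructive:
        return "AUTO_HEAL_PENDING_APPROVAL"
    return "AUTO_HEAL"


print(route(TriageResult(confidence=0.95, is_noise=False, has_runbook=True, destructive=False)))
print(route(TriageResult(confidence=0.55, is_noise=False, has_runbook=True, destructive=False)))
```

Checking noise first and confidence second mirrors the diagram: suppression happens before any remediation decision, and the human only sees what the agent could not confidently handle.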
💰 Pricing Intelligence
PagerDuty AIOps + Advance (as of June 2026)
| Product | Starting Price | Model | What You Get |
|---|---|---|---|
| AIOps | $699/mo (annual), $799/mo (monthly) | Consumption-based | ML noise reduction, event grouping, real-time context, event-driven automation |
| Advance (Gen AI Add-on) | $415/mo (annual only) | Per-seat add-on | AI Agents, generative runbooks, intelligent triage, Slack-native AI, post-incident summaries |
| Operations Cloud (Bundle) | Custom pricing | Platform + consumption | Incident Mgmt + AIOps + Automation + Customer Service Ops |
Key takeaway: Advance requires an annual commitment and is add-on only, so you can't buy it standalone. The AI agents sit behind a paywall that assumes you're already deep in the PagerDuty ecosystem. There is no transparent per-resolution pricing (unlike Atlassian Rovo's $1/resolution model).
How This Compares to Last Week's Pricing
| Vendor | AI Pricing Unit | Entry Price | Transparency |
|---|---|---|---|
| Atlassian Rovo (JSM) | Per resolution ($1.00) | $0 + consumption | ✅ Transparent |
| ServiceNow Now Assist | Per user/month (tiered) | Custom quote | ❌ Opaque |
| PagerDuty Advance | Flat add-on ($415/mo) | $415/mo + AIOps base | ⚠️ Partial |
| Freshworks Freddy AI | Per agent/month | $49/agent/mo | ✅ Transparent |
PagerDuty's flat add-on model is simpler to budget for than consumption pricing, but it also means you pay the same whether your AI agents resolve 10 incidents or 10,000. There is no incentive alignment on resolution volume.
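The flat-versus-consumption trade-off reduces to a break-even calculation. The sketch below uses only the list prices quoted in the tables above and makes two simplifying assumptions: Advance is treated as a single $415/mo charge (seat counts are ignored), and the AIOps figure is treated as a fixed $699/mo floor even though that product is consumption-based. The function names are illustrative.

```python
ADVANCE_FLAT_USD = 415.0     # PagerDuty Advance add-on per month (assumed single charge)
AIOPS_BASE_USD = 699.0       # AIOps starting price per month, annual billing (treated as a floor)
PER_RESOLUTION_USD = 1.0     # Atlassian Rovo per-resolution price


def monthly_flat_cost(include_aiops_base: bool = False) -> float:
    """Monthly PagerDuty spend under the assumptions stated above."""
    return ADVANCE_FLAT_USD + (AIOPS_BASE_USD if include_aiops_base else 0.0)


def flat_cost_per_resolution(resolutions: int, include_aiops_base: bool = False) -> float:
    """Effective cost per resolution when the monthly fee is flat."""
    if resolutions <= 0:
        raise ValueError("resolutions must be positive")
    return monthly_flat_cost(include_aiops_base) / resolutions


def break_even_resolutions(include_aiops_base: bool = False) -> float:
    """Monthly resolutions at which flat pricing matches $1/resolution pricing."""
    return monthly_flat_cost(include_aiops_base) / PER_RESOLUTION_USD


print(break_even_resolutions())                         # 415.0
print(break_even_resolutions(include_aiops_base=True))  # 1114.0
print(flat_cost_per_resolution(10))                     # 41.5
```

Under these assumptions, a team resolving fewer than roughly 415 incidents a month through AI pays more per resolution on the flat model (about 1,114 if the AIOps base is counted), and less above that.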
Worth Watching
Competitive Positioning: PagerDuty vs. The Field
| Dimension | PagerDuty | ServiceNow | Atlassian (JSM+Rovo) |
|---|---|---|---|
| Agent Autonomy | ★★★★ SRE Agent as first responder | ★★★ Now Assist aids humans | ★★ Rovo agents, early stage |
| Multi-Agent Protocol | ★★★★ MCP, open and cross-vendor | ★★ Proprietary only | ★★ MCP connectors, but ecosystem-locked |
| ITSM Depth | ★★ Incident-only, no ITSM | ★★★★★ Full ITIL suite | ★★★★ JSM covers ITSM well |
| MSP Readiness | ★★ Multi-tenant, but no PSA | ★★★ MSP partner program | ★ No native MSP features |
| Pricing Clarity | ★★★ Flat add-on, partial | ★ Opaque enterprise | ★★★★ Per-resolution, transparent |
| Slack/DevOps Native | ★★★★★ Best in class | ★★ Separate module | ★★★ Good, not ops-first |
PagerDuty's Architecture: Operations Cloud as Coordination Layer
┌──────────────────────────────────────────────────────────┐
│                     OPERATIONS CLOUD                     │
│  ┌─────────┐  ┌─────────┐  ┌──────────┐  ┌───────────┐   │
│  │Incident │  │ AIOps   │  │Automation│  │Customer   │   │
│  │Mgmt     │  │         │  │          │  │Service Ops│   │
│  └────┬────┘  └────┬────┘  └────┬─────┘  └─────┬─────┘   │
│       │            │            │              │         │
│       └────────────┼────────────┼──────────────┘         │
│                    │            │                        │
│            ┌───────┴────────────┴───────┐                │
│            │     PAGERDUTY ADVANCE      │                │
│            │  ┌──────────────────────┐  │                │
│            │  │ AI AGENTS            │  │                │
│            │  │ • SRE Agent          │  │                │
│            │  │ • Triage Agent       │  │                │
│            │  │ • Diagnostics Agent  │  │                │
│            │  │ • Runbook Agent      │  │                │
│            │  └──────────┬───────────┘  │                │
│            └─────────────┼──────────────┘                │
│                          │                               │
│            ┌─────────────▼──────────────┐                │
│            │    MCP CONNECTOR LAYER     │                │
│            │ ┌──────┐ ┌──────┐ ┌──────┐ │                │
│            │ │Slack │ │GitHub│ │Data- │ │  ... more      │
│            │ │Agent │ │Agent │ │dog   │ │  partners      │
│            │ └──────┘ └──────┘ └──────┘ │                │
│            └────────────────────────────┘                │
└──────────────────────────────────────────────────────────┘
THIRD-PARTY AGENTS (MCP)
┌────────┐ ┌────────┐ ┌────────┐
│Datadog │ │  New   │ │Custom  │
│  AI    │ │ Relic  │ │ Agent  │
└────┬───┘ └───┬────┘ └───┬────┘
     │         │          │
     └─────────┼──────────┘
               │
        [MCP Protocol]
               │
 Operations Cloud Coordination
The MCP layer is the big bet: PagerDuty doesn't need to build every agent. They want to be the coordination fabric that routes work between specialized agents from any vendor. If this takes off, they become the "Kubernetes of operational AI."
🛠️ MSP Relevance Assessment
Where PagerDuty Fits (and Doesn't) for MSPs
| MSP Need | PagerDuty Fit | Gap |
|---|---|---|
| Multi-tenant alert routing | ✅ Teams + escalation policies | No native client-separation billing |
| Automated NOC | ✅ SRE Agent as first responder | Requires runbook investment |
| Client-facing AI | ⚠️ Customer Service Ops exists | Not MSP-grade yet |
| PSA integration | ❌ None native | ConnectWise/Autotask missing |
| RMM alert ingestion | ✅ Via webhooks + 700+ integrations | Requires custom mapping |
| Per-client billing | ❌ Not designed for MSP model | Flat per-seat pricing doesn't align |
| Co-managed IT | ⚠️ Possible via teams | No co-managed workflow templates |
Bottom line: PagerDuty is ops-native, not MSP-native. The SRE Agent is genuinely impressive for internal DevOps teams, but MSPs need PSA integration, client billing, and co-managed workflows, none of which PagerDuty has prioritized. It is not a drop-in MSP solution.
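To make the "requires custom mapping" gap for RMM alert ingestion concrete, here is a sketch of the translation an MSP would have to write. The output follows the PagerDuty Events API v2 event body (`routing_key`, `event_action`, `dedup_key`, and a `payload` with `summary`, `source`, and `severity`). The input alert shape, the severity labels, and the choice to use `group` for the client name are all assumptions for illustration, and the sketch only builds the body without sending it.

```python
import json

# Hypothetical RMM severity labels mapped onto the four Events API v2 severities.
SEVERITY_MAP = {"critical": "critical", "major": "error", "minor": "warning", "info": "info"}


def rmm_alert_to_pagerduty_event(alert: dict, routing_key: str) -> dict:
    """Build an Events API v2 'trigger' body from a hypothetical RMM alert dict."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Same client + device + check dedups repeat alerts into one incident.
        "dedup_key": f"{alert['client_id']}:{alert['device']}:{alert['check']}",
        "payload": {
            # The API caps summary length at 1024 characters.
            "summary": f"[{alert['client_name']}] {alert['message']}"[:1024],
            "source": alert["device"],
            # Unknown labels fall back to "warning" rather than failing.
            "severity": SEVERITY_MAP.get(alert.get("severity", ""), "warning"),
            "group": alert["client_name"],  # assumption: one group per MSP client
            "custom_details": {"client_id": alert["client_id"], "check": alert["check"]},
        },
    }


example_alert = {
    "client_id": "acme-01",
    "client_name": "Acme Dental",
    "device": "web-01",
    "check": "disk_space",
    "severity": "major",
    "message": "Disk C: at 95% capacity",
}
event = rmm_alert_to_pagerduty_event(example_alert, routing_key="YOUR_INTEGRATION_KEY")
print(json.dumps(event, indent=2))
```

Even in this small form, the per-client concerns (client ID in the dedup key, client name in the summary) are carried by convention rather than by a native tenant model, which is the gap the table points to.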
Action Items
Next Research Candidates
- Issue #5 (June 8): ConnectWise + RMM AI, the MSP-native angle. How does the dominant PSA/RMM vendor approach agentic AI?
- Issue #6 (June 15): Datadog AIOps/Watchdog, observability-first agentic AI and a natural complement to PagerDuty
- Backlog candidates: Atomicwork (emerging ITSM AI), NinjaOne (AI Co-Pilot for MSPs), HaloPSA (AI features)
Design Takeaways for SuperOps
- PagerDuty's MCP-based multi-agent protocol is the standard to watch: if you're building an AI service desk, MCP compatibility should be on the roadmap
- The "virtual responder" pattern (agent as first line, human as escalation) is the UX model to steal; it's cleaner than chatbot-in-a-widget
- PagerDuty's flat add-on pricing ($415/mo) vs. Atlassian's consumption pricing ($1/resolution): two competing models, worth tracking which wins
- Slack-native incident management is the bar now: if your AI doesn't work where engineers live, it's friction