DevOps/SRE MCP Pack (2025) — Kubernetes, AWS, Prometheus: Triage and Recover Fast

Published 10/2/2025 • Updated 10/2/2025 • By RouterMCP Team

Operate services with cluster queries, runbooks, and real-time metrics via MCP. Includes setup and an incident drill.

Incident view connecting K8s logs, Prometheus metrics, and AWS rollback.

TL;DR: Ask for failing pods, recent errors, and a rollback in plain language — then execute with approvals.

Servers

Kubernetes MCP (community). https://github.com/SedulousSuchcha/mcp-kubernetes
AWS MCP (community). https://github.com/isaacwasserman/mcp-aws
Prometheus MCP (community). https://github.com/realrasengan/mcp-prometheus

Incident drill

“Show pods with restartCount>3 in checkout ns.”
“Graph 5m error rate for api_gateway.”
“Roll back to previous task definition; confirm before apply.”
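The first two drill prompts map onto simple, checkable operations. As a minimal sketch (not the MCP servers' actual implementation), the Python below filters a pod list shaped like `kubectl get pods -o json` output and builds a PromQL error-rate expression. The function names, the choice to sum restarts across a pod's containers, and the `http_requests_total` metric with `job` and `status` labels are assumptions for illustration; substitute whatever your services actually export.

```python
from typing import Any


def pods_over_restart_threshold(
    pod_list: dict[str, Any], namespace: str, threshold: int = 3
) -> list[tuple[str, int]]:
    """Return (pod name, total restarts) for pods in `namespace` whose
    restart count, summed over containers (an assumption), exceeds `threshold`."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        if meta.get("namespace") != namespace:
            continue
        # containerStatuses can be absent or null for pods that never started.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        restarts = sum(s.get("restartCount", 0) for s in statuses)
        if restarts > threshold:
            hits.append((meta.get("name", "<unnamed>"), restarts))
    # Worst offenders first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def error_rate_query(job: str, window: str = "5m") -> str:
    """Build a PromQL ratio of 5xx requests to all requests for one job.
    The metric and label names are hypothetical."""
    return (
        f'sum(rate(http_requests_total{{job="{job}",status=~"5.."}}[{window}]))'
        f' / sum(rate(http_requests_total{{job="{job}"}}[{window}]))'
    )


if __name__ == "__main__":
    # Hand-written sample data, not real cluster output.
    sample = {
        "items": [
            {
                "metadata": {"name": "cart-7d9f", "namespace": "checkout"},
                "status": {"containerStatuses": [{"restartCount": 5}]},
            },
            {
                "metadata": {"name": "pay-1a2b", "namespace": "checkout"},
                "status": {"containerStatuses": [{"restartCount": 1}]},
            },
        ]
    }
    print(pods_over_restart_threshold(sample, "checkout"))
    print(error_rate_query("api_gateway"))
```

The third prompt, the rollback, changes state, so it belongs behind an explicit confirmation step rather than in a read-only helper like these.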

Internal links

Pack docs: /packs/devops-sre
Related posts: Observability (10), Security (01)

FAQ

Q: How do we prevent destructive commands?
A: Require confirmations and role‑based policies per tool.
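One way to picture that answer is a small default-deny gate that every tool call passes through. The sketch below is illustrative only: the tool names, role names, and the `authorize` function are hypothetical and do not reflect the pack's actual policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed_roles: frozenset[str]
    needs_confirmation: bool


# Hypothetical per-tool policies: read-only tools are open to viewers,
# while the rollback tool is restricted and always requires confirmation.
POLICIES = {
    "k8s.get_pods": ToolPolicy(frozenset({"viewer", "oncall", "admin"}), False),
    "prometheus.query": ToolPolicy(frozenset({"viewer", "oncall", "admin"}), False),
    "aws.ecs_rollback": ToolPolicy(frozenset({"oncall", "admin"}), True),
}


def authorize(tool: str, role: str, confirmed: bool = False) -> str:
    """Return "allow", "needs_confirmation", or "deny" for one tool call."""
    policy = POLICIES.get(tool)
    if policy is None:
        return "deny"  # unknown tools are denied by default
    if role not in policy.allowed_roles:
        return "deny"
    if policy.needs_confirmation and not confirmed:
        return "needs_confirmation"
    return "allow"


if __name__ == "__main__":
    print(authorize("k8s.get_pods", "viewer"))
    print(authorize("aws.ecs_rollback", "viewer"))
    print(authorize("aws.ecs_rollback", "oncall"))
    print(authorize("aws.ecs_rollback", "oncall", confirmed=True))
```

Checking the role before the confirmation flag means a caller without permission is never prompted to confirm an action they could not run anyway.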

Checklist (fast)

1) Intent. 2) Title/meta. 3) Slug. 4) TL;DR. 5) Drill. 6) FAQ. 7) Links. 8) Images/alt. 9) Edit. 10) CTA.

CTA

Start from the template at examples/packs/devops-sre.mcp.json, together with the “safe ops” policy examples and the limiter configs.