Product Brief v3

MSPclaw: L1 Orchestrator for MSP Service Desks

An open-source orchestration layer that sits on top of existing PSA and RMM tools, which stay as the systems of record. MSPclaw matches community playbooks to incoming tickets and alerts, executes each step via the connected tool APIs, and writes the results back. The tech controls the logic; AI assists only in authoring playbooks.

Repository: github.com/thisisprabha/MSPclaw
Research: 108 agents, adversarially verified, 2026-06-03
v3: full L1 coverage, light theme


The Orchestrator Model

MSPclaw never stores ticket or asset data. It receives events from connected tools, identifies the matching community playbook, executes each step through the appropriate tool API, and writes the result back to the PSA. All existing tools retain their role as systems of record.

Event flow: source tools → orchestrator → write-back

Event sources (received via webhook or poll): NinjaRMM alert, Zabbix threshold, SuperOps ticket created, ConnectWise ticket created, Datto BCDR alert

↓

MSPclaw orchestrator

↓

Write-back targets (via API call): PSA ticket updated, worklog added, alert resolved, user notified, escalated to L2
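The event flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the Event and Playbook classes, the handle_event function, and the "escalate on no match" fallback are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    source: str      # e.g. "rmm.alert" or "psa.ticket_created"
    payload: dict


@dataclass
class Playbook:
    name: str
    version: str
    matches: Callable[[Event], bool]
    steps: list[Callable[[Event], str]] = field(default_factory=list)


def handle_event(event, playbooks, write_back):
    """Match one playbook, run its steps in order, write the results back."""
    for playbook in playbooks:
        if playbook.matches(event):
            results = [step(event) for step in playbook.steps]
            # Results go straight back to the PSA; nothing is stored here.
            write_back(f"{playbook.name}@{playbook.version}", results)
            return playbook.name
    # Assumption for this sketch: an unmatched event is escalated.
    write_back("no-match", ["escalated to L2"])
    return None


disk_full = Playbook(
    name="disk-full-windows",
    version="1.0.0",
    matches=lambda e: e.source == "rmm.alert"
    and e.payload.get("disk_percent", 0) > 90,
    steps=[lambda e: f"disk breakdown requested for {e.payload['asset_id']}"],
)

log = []
matched = handle_event(
    Event("rmm.alert", {"asset_id": "srv-01", "disk_percent": 94}),
    [disk_full],
    lambda playbook_ref, results: log.append((playbook_ref, results)),
)
```

Writing the playbook name and version into every write-back is one way to keep the audit trail tied to a specific, versioned playbook.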

Core principle: MSPs must explain to clients exactly what ran on their infrastructure. "The AI did something" is not an acceptable incident report. A versioned, auditable playbook is. AI assists only in authoring new playbooks, not in executing them.

The Gap

Tool	Layer it covers	PSA integration	Orchestrates across tools	Community playbooks
TacticalRMM	RMM scripting only	None	No	Scripts only
Tracecat	SOC workflow engine	No MSP connectors	SOC tools only	No
n8n / Zapier	Generic workflow	Via generic HTTP	Generic	Not MSP-specific
ConnectWise Automate	RMM + scripting	CW Manage only	Single vendor	Locked
MSPclaw (proposed)	Orchestration layer	Multi-PSA	Any connected tool	Community marketplace

Two Categories of L1 Tickets

L1 covers both system-generated and human-generated tickets. The orchestrator handles both using different trigger types but the same playbook execution model.

Trigger types: Alert-driven L1 is triggered by a monitoring tool webhook or poll. User-reported L1 is triggered by a new ticket event with keyword or category matching on the subject and description. Both map to the same playbook step format.

Category	Trigger	Volume (typical MSP)	Auto-resolve potential
Alert-driven L1	Monitoring webhook or poll. Alert fields matched against playbook conditions.	30-60% of total ticket volume	HIGH - system state is measurable
User-reported L1	Ticket created event. Keyword or category match on subject and description.	40-70% of total ticket volume	MEDIUM - user confirmation often needed
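The two trigger types could be matched along the following lines. This is a sketch only: the condition dictionary shape (field, op, value), the ticket field names, and both function names are assumptions made for illustration, not a defined MSPclaw schema.

```python
import operator

OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}


def alert_matches(alert: dict, condition: dict) -> bool:
    """condition looks like {"field": "disk_percent", "op": ">", "value": 90}."""
    if condition["field"] not in alert:
        return False
    compare = OPERATORS[condition["op"]]
    return compare(alert[condition["field"]], condition["value"])


def ticket_matches(ticket: dict, keywords, categories=()) -> bool:
    """Category match, or case-insensitive keyword match on subject + description."""
    if ticket.get("category") in categories:
        return True
    text = f"{ticket.get('subject', '')} {ticket.get('description', '')}".lower()
    return any(keyword.lower() in text for keyword in keywords)
```

Plain substring matching is the simplest possible form of keyword matching; broad keywords such as "slow" would match many unrelated tickets, so keyword groups need curation.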

Alert-Driven L1 Coverage

Triggered by monitoring systems. MSPclaw receives the alert, executes the matched playbook, and updates the PSA ticket with results.

Scenario	Playbook steps (abbreviated)	Resolution
Storage and performance
Disk full (disk_percent > 90)	Get disk breakdown via RMM. Clear temp/IIS/CBS logs. Re-check. Escalate if still >85%.	Auto-resolve
High CPU (cpu_avg > 85% for 10 min)	Get top processes via RMM. Kill known runaway processes (configurable list). Report to ticket.	Semi-auto
High memory (mem_used > 90%)	Get memory consumers via RMM. Clear standby list. Report. Escalate if VM or server.	Semi-auto
PC running slow (perf_score < 40)	Disk cleanup, defrag check, startup items report, resource snapshot. Report findings.	Diagnostic
Services and connectivity
Service down (service_status = stopped)	Check if in restart-safe list. Attempt restart via RMM. Verify running. Update ticket.	Auto-resolve
Device offline (ping_fail > 5 min)	Check network path via RMM agent on same segment. Attempt WOL if supported. Escalate.	Semi-auto
Network slow (latency > threshold)	Run traceroute + speed test via RMM. Check device count on segment. Report findings.	Diagnostic
Backup and security
Backup failed (backup_status = failed)	Get failure reason via Datto/Veeam API. Retry if transient error. Escalate with log.	Semi-auto
AV detection (threat_detected)	Get threat details. Confirm quarantine. Isolate asset if severity is high. Alert L2.	Semi-auto
Failed logins (failed_logins > 10/hr)	Check source IPs. Lock account if brute-force pattern. Report to ticket. Alert security.	Semi-auto
SSL cert expiry (days_remaining < 14)	Confirm cert details. Notify account team. Create renewal task in PSA.	Diagnostic
Infrastructure
UPS / power alert (ups_status = battery)	Check runtime remaining. Alert on-call. Initiate graceful shutdown if runtime < 5 min.	Semi-auto
Patch failed (patch_status = failed)	Get error code via RMM. Retry for known transient codes. Log failure details to ticket.	Semi-auto
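As one worked example from the table, the "Service down" row could be handled roughly as follows. This is a hedged sketch: RESTART_SAFE, FakeRmm, and handle_service_down are hypothetical names, and FakeRmm is an in-memory stand-in rather than any real RMM vendor API.

```python
RESTART_SAFE = {"Spooler", "W32Time"}  # hypothetical, configurable per MSP


class FakeRmm:
    """In-memory stand-in for an RMM connector (not a real vendor API)."""

    def __init__(self, restart_works=True):
        self.status = "stopped"
        self.restart_works = restart_works

    def restart_service(self, asset_id, service):
        if self.restart_works:
            self.status = "running"

    def service_status(self, asset_id, service):
        return self.status


def handle_service_down(rmm, asset_id, service):
    """Return (outcome, note) to be written to the PSA ticket."""
    # Step 1: never touch a service that is not explicitly restart-safe.
    if service not in RESTART_SAFE:
        return "escalated", f"{service} is not on the restart-safe list"
    # Step 2: attempt the restart, then verify instead of assuming success.
    rmm.restart_service(asset_id, service)
    if rmm.service_status(asset_id, service) == "running":
        return "resolved", f"{service} restarted and verified running"
    return "escalated", f"{service} still stopped after restart attempt"
```

The verification step is what separates an auto-resolve from a blind retry: the ticket is only closed when the measured state confirms the fix.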

User-Reported L1 Coverage

Triggered by a ticket created event. MSPclaw matches against the ticket subject and description using keyword groups and category codes from the PSA. Write-back includes a user-facing reply in addition to the internal worklog.

Key difference from alert-driven: User-reported playbooks must also send a communication to the end user, not just update the internal ticket. The playbook write-back step includes a templated user-facing message with what was found and what was done.
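A templated user-facing message could be rendered with Python's standard string.Template, as in the sketch below. The template wording, the placeholder names, and the render_reply helper are assumptions for illustration; only the template name password_reset_done comes from this brief.

```python
from string import Template

REPLY_TEMPLATES = {
    "password_reset_done": Template(
        "Hi $first_name,\n\n"
        "We found: $finding.\n"
        "We did: $action.\n\n"
        "Next step: $next_step"
    ),
}


def render_reply(template_name: str, **fields: str) -> str:
    # substitute() raises KeyError when a placeholder is missing, so a
    # half-filled message is never sent to an end user.
    return REPLY_TEMPLATES[template_name].substitute(**fields)


message = render_reply(
    "password_reset_done",
    first_name="Sam",
    finding="your account was locked",
    action="unlocked it and required a new password at next login",
    next_step="sign in and choose a new password.",
)
```

Whether such templates live in the playbook definition or in per-MSP configuration is left open here; the sketch works the same either way.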

Scenario	Trigger keywords / category	Playbook steps (abbreviated)	Resolution
Account and access
Password reset	"password", "locked", "can't log in", "forgot"	Check AD/Entra account status via RMM. Trigger self-service reset or script reset. Reply to user with instructions.	Auto-resolve
Account locked	"locked out", "account locked"	Check lockout reason in AD (bad password source). Unlock if clean. Alert if repeated or unusual source.	Semi-auto
MFA / 2FA issue	"MFA", "authenticator", "2FA", "code not working"	Check user's MFA registration. Re-enrol if needed. Provide self-service link or reset via admin API.	Semi-auto
VPN not connecting	"VPN", "remote access", "can't connect from home"	Check VPN service status via RMM. Check user certificate. Run connectivity test. Report findings.	Diagnostic
Network and connectivity
No WiFi / WiFi issue	"wifi", "wireless", "no internet", "can't connect"	RMM: check adapter status, run netsh reset, get WiFi diagnostics. Reply with steps taken.	Semi-auto
WiFi password request	"wifi password", "wireless password"	Verify user identity against PSA contact. Retrieve SSID credentials from vault. Reply securely.	Auto-resolve
Internet slow	"internet slow", "network slow", "browsing slow"	Run speed test via RMM. Check DNS, clear browser cache via script. Report findings.	Diagnostic
Shared drive / file share	"can't access drive", "mapped drive", "shared folder"	Check share availability via RMM. Re-map drive via script. Verify permissions. Report.	Semi-auto
Devices and peripherals
Printer not working	"printer", "can't print", "printing", "print queue"	RMM: check print spooler, restart if stopped, clear stuck jobs, check driver status. Reply with result.	Semi-auto
Keyboard / mouse	"keyboard", "mouse", "not typing", "cursor"	RMM: check device manager for errors, run HID reset script, check for driver updates. Report findings.	Diagnostic
Monitor / display	"monitor", "screen", "no display", "second screen"	RMM: check display adapters, run display detect script, check cable type. Provide steps to user.	Diagnostic
PC performance and software
PC running slow	"slow", "computer slow", "freezing", "taking long"	Run disk cleanup, check startup items, get top CPU/memory consumers, defrag check. Report.	Semi-auto
App not loading / crash	"not opening", "crashing", "keeps closing", app name	Check event logs via RMM for app errors. Repair/reinstall via RMM script. Reply with result.	Semi-auto
Email not working	"email", "outlook", "can't send", "not receiving"	Check Exchange/M365 mailbox status, run Outlook diagnostic script, check autodiscover. Report.	Diagnostic
Software install	"install", "need software", "can you install"	Match against approved software list. Deploy via RMM if approved. Escalate if not on list.	Semi-auto
Onboarding and offboarding
New user setup	Category = "Onboarding" or "New starter"	Create AD account, assign license, map drives, install standard apps, add to groups. Update ticket.	Auto-resolve
User offboarding	Category = "Offboarding" or "Leaver"	Disable AD account, revoke M365 license, forward email, remove from groups. Log all steps.	Auto-resolve

Playbook Examples

password-reset-ad.playbook.yaml (Auto-resolve)

Trigger: ticket created, subject contains "password" or "locked"

Matches on PSA ticket create webhook. Extracts requester email and device from ticket contact fields.

source: psa.ticket_created

Check AD account status

Look up account in Active Directory or Entra ID. Check if enabled, locked, password expired.

via: rmm.run_script(check_ad_account.ps1, user=requester_email)

Branch: account locked vs password expired vs unknown

locked: step 2 / expired: step 2 / unknown: escalate

Reset or unlock account

Unlock account and force password reset at next login. Or trigger self-service reset email via Entra SSPR.

via: rmm.run_script(reset_ad_account.ps1)

Write back: update ticket + reply to user

Close ticket with internal note (what was done + AD log). Send user-facing reply with next steps.

via: psa.update_ticket(status=resolved) + psa.reply_to_user(template=password_reset_done)
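The password-reset playbook above can be expressed as plain data plus a small routing function. This is a sketch under assumptions: the dictionary keys, the step ids (1 = check, 2 = reset or unlock, 3 = write back), and the plan function are hypothetical and do not represent a settled playbook schema.

```python
PASSWORD_RESET = {
    "name": "password-reset-ad",
    "trigger": {
        "source": "psa.ticket_created",
        "subject_contains": ["password", "locked"],
    },
    "steps": [
        {"id": 1, "via": "rmm.run_script", "script": "check_ad_account.ps1"},
        {"id": 2, "via": "rmm.run_script", "script": "reset_ad_account.ps1"},
        {"id": 3, "via": "psa.update_ticket + psa.reply_to_user",
         "template": "password_reset_done"},
    ],
    # Outcome of step 1 -> id of the next step, or "escalate".
    "branch_after_1": {"locked": 2, "expired": 2, "unknown": "escalate"},
}


def plan(playbook: dict, account_state: str) -> list:
    """Return the ordered step ids that would run for a given step-1 outcome."""
    # Any state the branch table does not list is treated as "escalate".
    target = playbook["branch_after_1"].get(account_state, "escalate")
    if target == "escalate":
        return [1, "escalate"]
    return [1] + [s["id"] for s in playbook["steps"] if s["id"] >= target]
```

Computing the plan before executing anything gives the tech an exact list of what would run, which suits the auditable-playbook principle.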

disk-full-windows.playbook.yaml (Auto-resolve)

Trigger: alert.disk_percent > 90

Matches on RMM or Zabbix alert webhook. Extracts asset ID and disk path.

source: rmm.alert or monitoring.threshold

Get disk breakdown (top directories by size)

via: rmm.run_script(get_disk_breakdown.ps1, asset=asset_id)

Clear temp files, IIS logs, CBS logs

via: rmm.run_script(clear_temp_and_logs.ps1)

Branch: disk still >85% after cleanup?

yes: escalate to L2 / no: resolve

Write back: update PSA ticket + worklog

Update ticket with disk before/after, bytes freed, steps taken. Add worklog entry. Close or assign to L2.

via: psa.update_ticket() + psa.add_worklog(auto_generated=true)
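The branch and write-back steps of the disk-full playbook reduce to a little arithmetic, sketched below. The function name, the argument names, and the returned dictionary shape are assumptions for illustration; the 85% escalation threshold is the one stated in the playbook.

```python
def disk_cleanup_outcome(total_bytes, used_before, used_after, escalate_above=85.0):
    """Decide resolve vs. escalate and build the worklog text."""
    pct_before = 100 * used_before / total_bytes
    pct_after = 100 * used_after / total_bytes
    # Branch: still above the threshold after cleanup means L2 takes over.
    status = "escalate_to_l2" if pct_after > escalate_above else "resolved"
    worklog = (
        f"Disk usage {pct_before:.1f}% -> {pct_after:.1f}%, "
        f"freed {used_before - used_after} bytes. Status: {status}."
    )
    return {"status": status, "worklog": worklog, "auto_generated": True}


GIB = 1024 ** 3
outcome = disk_cleanup_outcome(100 * GIB, 93 * GIB, 80 * GIB)
```

In this sketch a result of exactly 85% counts as resolved, because the playbook escalates only when usage is still above 85%.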

Playbook Builder Tool

The builder is what makes the marketplace sustainable. Without a low-friction authoring tool, only developers contribute. With one, L1 techs contribute from the operational knowledge they actually have.

Key design decision: AI can suggest a full playbook draft from a description ("user forgot password, we use AD on-prem"). But the tech reviews every step before saving. AI does not push to staging without tech sign-off. The tech is accountable for what runs, so the tech must approve it.

Feature	What it does
Step sequencer	Drag-and-drop action blocks: run_script, check_condition, branch, update_ticket, reply_to_user. No YAML required to build.
Trigger builder	Form UI for alert conditions (field, operator, value) and keyword groups for user-reported tickets. No query language needed.
Test mode (dry-run)	Run against a real asset with all write operations in preview mode. Shows exact API call, expected output, and what would be written back to the PSA before committing.
AI suggest	Describe the scenario in plain text. AI generates a draft playbook with steps and trigger conditions. Tech reviews, edits each step, and approves before saving.
Publish to staging	One-click publish to community staging tier. Shows diff vs previous version. Requires test mode to have been run at least once.
Version history	Every change tracked with author and timestamp. Rollback to previous version in one click. Forking from any community playbook creates a versioned copy.
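Test mode and the publish gate could work roughly as follows. This is an illustrative sketch, not the builder's design: DryRunPsa and Draft are hypothetical classes, and recording write operations in a preview list is just one possible way to implement "preview mode".

```python
class DryRunPsa:
    """Records write operations instead of sending them to the PSA."""

    def __init__(self):
        self.preview = []

    def update_ticket(self, ticket_id, **fields):
        self.preview.append(("update_ticket", ticket_id, fields))

    def add_worklog(self, ticket_id, text):
        self.preview.append(("add_worklog", ticket_id, text))


class Draft:
    """A playbook draft that cannot reach staging until it has been dry-run."""

    def __init__(self, name):
        self.name = name
        self.dry_runs = 0

    def test(self, run, psa):
        run(psa)              # the playbook's write-back logic
        self.dry_runs += 1
        return psa.preview    # exactly what would have been written

    def publish_to_staging(self):
        if self.dry_runs == 0:
            raise RuntimeError("run test mode at least once before publishing")
        return f"{self.name} published to staging"


draft = Draft("disk-full-windows")
preview = draft.test(
    lambda psa: psa.update_ticket("T-1", status="resolved"), DryRunPsa()
)
```

Because the dry-run object exposes the same method names as a real PSA connector would, the playbook logic does not need to know which one it is talking to.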

Marketplace Model

Three-tier contribution model based on TacticalRMM community-scripts (verified working pattern). License: AGPL v3.0. Self-hosted: unlimited playbook executions, no caps. Revenue from managed cloud hosting, not the software.

Community (free)

WIP and staging playbooks. Published via builder tool. Not bundled with installs. Anyone can publish; no review required to reach staging.

Verified (free)

Maintainer-reviewed playbooks. Must pass a test matrix across declared PSA and RMM support. Ships bundled with every MSPclaw install.

Partner Cloud (paid hosting)

Managed cloud hosting and SLAs for MSPs who don't want to self-host. Includes support. Revenue source for the project.
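The test matrix required for the Verified tier can be read as the cross product of the PSAs and RMMs a playbook declares support for. The sketch below assumes that reading; build_matrix and passes_matrix are hypothetical helper names, and how each pairing is actually tested is out of scope.

```python
from itertools import product


def build_matrix(declared_psa, declared_rmm):
    """Every PSA x RMM pairing the playbook claims to support."""
    return list(product(declared_psa, declared_rmm))


def passes_matrix(results, declared_psa, declared_rmm):
    """results maps (psa, rmm) -> bool; a missing pairing counts as a failure."""
    matrix = build_matrix(declared_psa, declared_rmm)
    # An empty declaration cannot be verified, so it does not pass.
    return bool(matrix) and all(results.get(pair, False) for pair in matrix)


matrix = build_matrix(["SuperOps", "Halo PSA"], ["NinjaRMM"])
```

Treating an untested pairing as a failure keeps a playbook from reaching Verified on the strength of a partial run.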

Integration Priority

Minimum viable surface per connector: receive webhooks, plus either run a script on an asset (for an RMM) or create/update/close a ticket, add a worklog, and reply to the user (for a PSA).

#	Platform	Role	Rationale
1	SuperOps	PSA + RMM	Your domain. Modern REST API and clean webhooks. Ship first. Covers both layers in one connector.
2	ConnectWise Manage	PSA	Largest PSA install base. Webhooks unreliable - use polling fallback. Required for credibility.
3	NinjaRMM	RMM	Primary RMM for script execution. Clean API for alert webhooks and remote script runs.
4	Halo PSA	PSA	Clean REST API. Fast-growing in UK and EU markets. Good early-adopter profile.
5	Datto Autotask	PSA	Large install base. Older API but broad coverage justifies it.
6	Atera	RMM + PSA	All-in-one for smaller MSPs. Single connector covers both layers.
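The minimum viable surface can be written down as a checklist that each connector is validated against. This is a sketch with assumed names: the method names in RMM_SURFACE and PSA_SURFACE, the missing_methods helper, and the ExampleCombinedConnector class are hypothetical and do not correspond to any vendor's API.

```python
RMM_SURFACE = ("parse_webhook", "run_script")
PSA_SURFACE = (
    "parse_webhook",
    "create_ticket",
    "update_ticket",
    "close_ticket",
    "add_worklog",
    "reply_to_user",
)


def missing_methods(connector, surface):
    """List the required operations a connector does not implement."""
    return [
        name for name in surface
        if not callable(getattr(connector, name, None))
    ]


class ExampleCombinedConnector:
    """Hypothetical connector covering both layers, as a PSA + RMM platform would."""

    def parse_webhook(self, body): return body
    def run_script(self, asset_id, script, **params): return "queued"
    def create_ticket(self, subject, description): return "T-1"
    def update_ticket(self, ticket_id, **fields): return None
    def close_ticket(self, ticket_id): return None
    def add_worklog(self, ticket_id, text): return None
    def reply_to_user(self, ticket_id, message): return None


class ExampleRmmOnlyConnector:
    def parse_webhook(self, body): return body
    def run_script(self, asset_id, script, **params): return "queued"
```

A single-vendor platform that covers both layers simply satisfies both checklists with one class, which is why such connectors are attractive to ship early.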

Open Questions

1. Is there a playbook or runbook-sharing community in MSPGeek Slack or GitHub that can seed the marketplace at launch? Empty marketplace on day one kills adoption.

2. For user-reported L1, who supplies the user-facing reply templates? Are they part of the playbook definition or configured per MSP? This affects the playbook schema design.

3. AGPL vs MIT: protect against hosted resale (AGPL, slower growth) or maximise adoption speed (MIT, SaaS forks possible)?

4. How does the playbook builder test mode access real assets safely? Opt-in dry-run per step, a dedicated test asset per customer, or a sandbox RMM environment?

5. ConnectWise webhook reliability: is polling acceptable for v1, or should the CW connector wait for v2 while SuperOps + NinjaRMM + Halo ships first?

6. For the keyword matching on user-reported tickets: is NLP intent detection needed for v1, or is a well-curated keyword group list good enough to start?
