docs: update refactoring summary and mastery plan to reflect completion of all 11 atomic skills
Some checks failed
- Mark Phase 2 as completed with all 11/11 atomic skills created
- Update skill counts: AITBC skills (6/6), OpenClaw skills (5/5)
- Move aitbc-node-coordinator and aitbc-analytics-analyzer from remaining to completed
- Update Phase 3 status from PLANNED to IN PROGRESS
- Add Gitea-based node synchronization documentation (replaces SCP)
- Clarify two-node architecture with same port (8006) on different I
151
.windsurf/skills/openclaw-error-handler.md
Normal file
@@ -0,0 +1,151 @@
---
description: Atomic OpenClaw error detection and recovery procedures with deterministic outputs
title: openclaw-error-handler
version: 1.0
---

# OpenClaw Error Handler

## Purpose
Detect, diagnose, and recover from errors in OpenClaw agent operations with systematic error handling and recovery procedures.

## Activation
Trigger when user requests error handling: error diagnosis, recovery procedures, error analysis, or system health checks.

## Input
```json
{
  "operation": "detect|diagnose|recover|analyze",
  "agent": "agent_name",
  "error_type": "execution|communication|configuration|timeout|unknown",
  "error_context": "string (optional)",
  "recovery_strategy": "auto|manual|rollback|retry"
}
```
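
A request in this shape can be checked before any work starts. The following is a minimal sketch, not part of the skill itself: the function name `validate_request` is hypothetical, and it assumes every field except `error_context` is required, which the schema implies but does not state.

```python
# Allowed values copied from the Input schema above.
ALLOWED = {
    "operation": {"detect", "diagnose", "recover", "analyze"},
    "error_type": {"execution", "communication", "configuration", "timeout", "unknown"},
    "recovery_strategy": {"auto", "manual", "rollback", "retry"},
}


def validate_request(request: dict) -> list[str]:
    """Return a list of problems; an empty list means the request is well-formed."""
    problems = []

    agent = request.get("agent")
    if not isinstance(agent, str) or not agent:
        problems.append("agent must be a non-empty string")

    # Assumption: all enumerated fields are required.
    for field, allowed in ALLOWED.items():
        value = request.get(field)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}, got {value!r}")

    # error_context is the only field marked optional in the schema.
    context = request.get("error_context")
    if context is not None and not isinstance(context, str):
        problems.append("error_context must be a string when present")

    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every invalid field at once, for example in the `issues` array of the output.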

## Output
```json
{
  "summary": "Error handling operation completed successfully",
  "operation": "detect|diagnose|recover|analyze",
  "agent": "agent_name",
  "error_detected": {
    "type": "string",
    "severity": "critical|high|medium|low",
    "timestamp": "number",
    "context": "string"
  },
  "diagnosis": {
    "root_cause": "string",
    "affected_components": ["component1", "component2"],
    "impact_assessment": "string"
  },
  "recovery_applied": {
    "strategy": "string",
    "actions_taken": ["action1", "action2"],
    "success": "boolean"
  },
  "issues": [],
  "recommendations": [],
  "confidence": 1.0,
  "execution_time": "number",
  "validation_status": "success|partial|failed"
}
```

## Process

### 1. Analyze
- Scan agent logs for errors
- Identify error patterns
- Assess error severity
- Determine error scope
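
The Analyze step can be illustrated with a small log scanner. This is a sketch under stated assumptions: the log line format, the level-to-severity mapping, and the names `scan_log` and `summarize` are all hypothetical, since the skill does not specify how OpenClaw agents format their logs.

```python
import re
from collections import Counter

# Assumed log format: "<timestamp> <LEVEL> <component>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<component>[\w.-]+):\s+(?P<msg>.*)$"
)
# Assumed mapping from log level to the severity scale used in the output schema.
LEVEL_TO_SEVERITY = {"CRITICAL": "critical", "ERROR": "high", "WARNING": "medium"}
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def scan_log(lines):
    """Collect one finding per log line whose level maps to a severity."""
    findings = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # line does not follow the assumed format
        severity = LEVEL_TO_SEVERITY.get(match["level"])
        if severity is None:
            continue  # INFO, DEBUG, and similar levels are not errors
        findings.append({
            "timestamp": match["ts"],
            "severity": severity,
            "component": match["component"],
            "message": match["msg"],
        })
    return findings


def summarize(findings):
    """Report the worst severity seen and which components are affected (the scope)."""
    counts = Counter(f["component"] for f in findings)
    worst = min(
        (f["severity"] for f in findings), key=SEVERITY_ORDER.index, default=None
    )
    return {"worst_severity": worst, "scope": sorted(counts), "counts": dict(counts)}
```

The per-component counts give a rough view of error patterns, and the sorted component list can feed `affected_components` in the diagnosis.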

### 2. Diagnose
- Analyze root cause
- Trace error propagation
- Identify affected components
- Assess impact

### 3. Execute Recovery
- Select recovery strategy
- Apply recovery actions
- Monitor recovery progress
- Validate recovery success

### 4. Validate
- Verify error resolution
- Check system stability
- Validate agent functionality
- Confirm no side effects

## Constraints
- **MUST NOT** modify critical system files
- **MUST NOT** exceed 60 seconds for error diagnosis
- **MUST** preserve error logs for analysis
- **MUST** validate recovery before applying
- **MUST** rollback on recovery failure
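
Two of these constraints, the 60-second diagnosis limit and rollback on recovery failure, can be sketched in code. The helper names below are hypothetical, and agent state is assumed to be a plain dictionary for illustration. The time budget is cooperative: it is checked between diagnostic steps and cannot interrupt a single step that runs long.

```python
import copy
import time

DIAGNOSIS_BUDGET_SECONDS = 60  # taken from the constraint above


def diagnose_with_budget(checks, budget=DIAGNOSIS_BUDGET_SECONDS, clock=time.monotonic):
    """Run diagnostic callables in order; stop once the time budget is spent.

    Returns (results, completed), where completed is False if checks were skipped.
    """
    deadline = clock() + budget
    results = []
    for check in checks:
        if clock() >= deadline:
            return results, False
        results.append(check())
    return results, True


def recover_with_rollback(state, action, is_healthy):
    """Apply a recovery action to state; restore the snapshot if it fails.

    Returns True when the action ran and the health check passed.
    """
    snapshot = copy.deepcopy(state)
    try:
        action(state)
        if is_healthy(state):
            return True
    except Exception:
        pass  # a raised error is treated the same as a failed health check
    # Roll back in place so callers holding a reference see the restored state.
    state.clear()
    state.update(snapshot)
    return False
```

Passing the clock as a parameter keeps the budget logic testable without waiting in real time.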

## Environment Assumptions
- Agent logs accessible at `/var/log/aitbc/`
- Error tracking system functional
- Recovery procedures documented
- Agent state persistence available
- System monitoring active

## Error Handling
- Recovery failure → Attempt alternative recovery strategy
- Multiple errors → Prioritize by severity
- Unknown error type → Apply generic recovery procedure
- System instability → Emergency rollback
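
The first two rules can be sketched as follows. This is an illustration only: the fallback order `retry`, `rollback`, `manual` is an assumption (the skill does not define one), and `prioritize` and `recover_with_fallback` are hypothetical names. Each handler is assumed to be a callable that returns `True` on success.

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(errors):
    """Order errors from most to least severe; unknown severities sort last."""
    return sorted(
        errors, key=lambda e: SEVERITY_RANK.get(e.get("severity"), len(SEVERITY_RANK))
    )


def recover_with_fallback(error, handlers, order=("retry", "rollback", "manual")):
    """Try each available strategy in order until one reports success."""
    attempted = []
    for strategy in order:
        handler = handlers.get(strategy)
        if handler is None:
            continue  # no handler registered for this strategy
        attempted.append(strategy)
        if handler(error):
            return {"strategy": strategy, "attempted": attempted, "success": True}
    last = attempted[-1] if attempted else None
    return {"strategy": last, "attempted": attempted, "success": False}
```

Because `sorted` is stable, errors of equal severity keep their original order, so earlier errors are handled first within a severity level.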

## Example Usage Prompt

```
Diagnose and recover from execution errors in main agent
```

## Expected Output Example

```json
{
  "summary": "Error diagnosed and recovered successfully in main agent",
  "operation": "recover",
  "agent": "main",
  "error_detected": {
    "type": "execution",
    "severity": "high",
    "timestamp": 1775811500,
    "context": "Transaction processing timeout during blockchain sync"
  },
  "diagnosis": {
    "root_cause": "Network latency causing P2P sync timeout",
    "affected_components": ["p2p_network", "transaction_processor"],
    "impact_assessment": "Delayed transaction processing, no data loss"
  },
  "recovery_applied": {
    "strategy": "retry",
    "actions_taken": ["Increased timeout threshold", "Retried transaction processing"],
    "success": true
  },
  "issues": [],
  "recommendations": ["Monitor network latency for future occurrences", "Consider implementing adaptive timeout"],
  "confidence": 1.0,
  "execution_time": 18.3,
  "validation_status": "success"
}
```
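
A result like this one can be checked against the Output schema before it is returned. The sketch below is an assumption about how such a check might look, not part of the skill: `check_output` is a hypothetical name, and only the top-level keys and the enumerated fields are verified.

```python
import json

# Top-level keys and enumerated values taken from the Output schema.
REQUIRED_KEYS = {
    "summary", "operation", "agent", "error_detected", "diagnosis",
    "recovery_applied", "issues", "recommendations", "confidence",
    "execution_time", "validation_status",
}
OPERATIONS = {"detect", "diagnose", "recover", "analyze"}
SEVERITIES = {"critical", "high", "medium", "low"}
STATUSES = {"success", "partial", "failed"}


def _is_one_of(value, allowed):
    """True when value is a string listed in allowed."""
    return isinstance(value, str) and value in allowed


def check_output(raw: str) -> list[str]:
    """Return structural problems found in a JSON result; empty means it passed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]

    if not _is_one_of(data.get("operation"), OPERATIONS):
        problems.append("operation is not one of the documented values")

    detected = data.get("error_detected")
    severity = detected.get("severity") if isinstance(detected, dict) else None
    if not _is_one_of(severity, SEVERITIES):
        problems.append("error_detected.severity is not one of the documented values")

    if not _is_one_of(data.get("validation_status"), STATUSES):
        problems.append("validation_status is not one of the documented values")

    return problems
```

Applied to the example above, the function should return an empty list.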

## Model Routing Suggestion

**Reasoning Model** (Claude Sonnet, GPT-4)
- Complex error diagnosis
- Root cause analysis
- Recovery strategy selection
- Impact assessment

**Performance Notes**
- **Execution Time**: 5-30 seconds for detection, 15-45 seconds for diagnosis, 10-60 seconds for recovery
- **Memory Usage**: <150MB for error handling operations
- **Network Requirements**: Agent communication for error context
- **Concurrency**: Safe for sequential error handling on different agents