docs: update refactoring summary and mastery plan to reflect completion of all 11 atomic skills

- Mark Phase 2 as completed with all 11/11 atomic skills created
- Update skill counts: AITBC skills (6/6), OpenClaw skills (5/5)
- Move aitbc-node-coordinator and aitbc-analytics-analyzer from remaining to completed
- Update Phase 3 status from PLANNED to IN PROGRESS
- Add Gitea-based node synchronization documentation (replaces SCP)
- Clarify two-node architecture with same port (8006) on different I
2026-04-10 12:46:09 +02:00
parent 6bfd78743d
commit 084dcdef31
15 changed files with 2400 additions and 240 deletions
--- a/.windsurf/skills/openclaw-error-handler.md
+++ b/.windsurf/skills/openclaw-error-handler.md
@@ -0,0 +1,151 @@
+---
+description: Atomic OpenClaw error detection and recovery procedures with deterministic outputs
+title: openclaw-error-handler
+version: 1.0
+---
+
+# OpenClaw Error Handler
+
+## Purpose
+Detect, diagnose, and recover from errors in OpenClaw agent operations with systematic error handling and recovery procedures.
+
+## Activation
+Trigger when the user requests error handling: error diagnosis, recovery procedures, error analysis, or system health checks.
+
+## Input
+```json
+{
+  "operation": "detect|diagnose|recover|analyze",
+  "agent": "agent_name",
+  "error_type": "execution|communication|configuration|timeout|unknown",
+  "error_context": "string (optional)",
+  "recovery_strategy": "auto|manual|rollback|retry"
+}
+```
+
+## Output
+```json
+{
+  "summary": "Error handling operation completed successfully",
+  "operation": "detect|diagnose|recover|analyze",
+  "agent": "agent_name",
+  "error_detected": {
+    "type": "string",
+    "severity": "critical|high|medium|low",
+    "timestamp": "number",
+    "context": "string"
+  },
+  "diagnosis": {
+    "root_cause": "string",
+    "affected_components": ["component1", "component2"],
+    "impact_assessment": "string"
+  },
+  "recovery_applied": {
+    "strategy": "string",
+    "actions_taken": ["action1", "action2"],
+    "success": "boolean"
+  },
+  "issues": [],
+  "recommendations": [],
+  "confidence": 1.0,
+  "execution_time": "number",
+  "validation_status": "success|partial|failed"
+}
+```
+
+## Process
+
+### 1. Analyze
+- Scan agent logs for errors
+- Identify error patterns
+- Assess error severity
+- Determine error scope
+
+### 2. Diagnose
+- Analyze root cause
+- Trace error propagation
+- Identify affected components
+- Assess impact
+
+### 3. Execute Recovery
+- Select recovery strategy
+- Apply recovery actions
+- Monitor recovery progress
+- Validate recovery success
+
+### 4. Validate
+- Verify error resolution
+- Check system stability
+- Validate agent functionality
+- Confirm no side effects
+
+## Constraints
+- **MUST NOT** modify critical system files
+- **MUST NOT** exceed 60 seconds for error diagnosis
+- **MUST** preserve error logs for analysis
+- **MUST** validate the recovery strategy before applying it
+- **MUST** roll back on recovery failure
+
+## Environment Assumptions
+- Agent logs accessible at `/var/log/aitbc/`
+- Error tracking system functional
+- Recovery procedures documented
+- Agent state persistence available
+- System monitoring active
+
+## Error Handling
+- Recovery failure → Attempt alternative recovery strategy
+- Multiple errors → Prioritize by severity
+- Unknown error type → Apply generic recovery procedure
+- System instability → Emergency rollback
+
+## Example Usage Prompt
+
+```
+Diagnose and recover from execution errors in main agent
+```
+
+## Expected Output Example
+
+```json
+{
+  "summary": "Error diagnosed and recovered successfully in main agent",
+  "operation": "recover",
+  "agent": "main",
+  "error_detected": {
+    "type": "execution",
+    "severity": "high",
+    "timestamp": 1775811500,
+    "context": "Transaction processing timeout during blockchain sync"
+  },
+  "diagnosis": {
+    "root_cause": "Network latency causing P2P sync timeout",
+    "affected_components": ["p2p_network", "transaction_processor"],
+    "impact_assessment": "Delayed transaction processing, no data loss"
+  },
+  "recovery_applied": {
+    "strategy": "retry",
+    "actions_taken": ["Increased timeout threshold", "Retried transaction processing"],
+    "success": true
+  },
+  "issues": [],
+  "recommendations": ["Monitor network latency for future occurrences", "Consider implementing adaptive timeout"],
+  "confidence": 1.0,
+  "execution_time": 18.3,
+  "validation_status": "success"
+}
+```
+
+## Model Routing Suggestion
+
+**Reasoning Model** (Claude Sonnet, GPT-4)
+- Complex error diagnosis
+- Root cause analysis
+- Recovery strategy selection
+- Impact assessment
+
+**Performance Notes**
+- **Execution Time**: 5-30 seconds for detection, 15-45 seconds for diagnosis, 10-60 seconds for recovery
+- **Memory Usage**: <150MB for error handling operations
+- **Network Requirements**: Agent communication for error context
+- **Concurrency**: Safe for sequential error handling on different agents
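
The Input schema and the "Multiple errors → Prioritize by severity" rule in the skill file above could be checked by a caller roughly as follows. This is a minimal Python sketch, not part of the commit: the function names `validate_request` and `prioritize` are hypothetical, and treating every field except `error_context` as required is an assumption drawn from the `(optional)` marker in the schema.

```python
# Allowed values copied from the skill's Input schema.
OPERATIONS = ("detect", "diagnose", "recover", "analyze")
ERROR_TYPES = ("execution", "communication", "configuration", "timeout", "unknown")
STRATEGIES = ("auto", "manual", "rollback", "retry")
# Severity levels from the Output schema, most urgent first.
SEVERITY_ORDER = ("critical", "high", "medium", "low")


def validate_request(request: dict) -> list:
    """Return a list of problems; an empty list means the request fits the schema.

    Assumption: all fields except error_context are required.
    """
    issues = []
    if request.get("operation") not in OPERATIONS:
        issues.append(f"invalid operation: {request.get('operation')!r}")
    agent = request.get("agent")
    if not isinstance(agent, str) or not agent:
        issues.append("agent must be a non-empty string")
    if request.get("error_type") not in ERROR_TYPES:
        issues.append(f"invalid error_type: {request.get('error_type')!r}")
    if request.get("recovery_strategy") not in STRATEGIES:
        issues.append(f"invalid recovery_strategy: {request.get('recovery_strategy')!r}")
    # error_context is optional, but must be a string when present.
    if "error_context" in request and not isinstance(request["error_context"], str):
        issues.append("error_context must be a string")
    return issues


def prioritize(errors: list) -> list:
    """Order detected errors by severity, most urgent first (stable for ties)."""
    rank = {level: i for i, level in enumerate(SEVERITY_ORDER)}
    # Hypothetical choice: unrecognized severities sort last instead of raising.
    return sorted(errors, key=lambda e: rank.get(e.get("severity"), len(rank)))


if __name__ == "__main__":
    request = {
        "operation": "recover",
        "agent": "main",
        "error_type": "execution",
        "recovery_strategy": "retry",
    }
    print(validate_request(request))
    print(prioritize([{"severity": "low"}, {"severity": "critical"}]))
```

Returning a list of issues instead of raising mirrors the `issues` array in the skill's Output schema, so a caller could copy the messages into that field directly.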