| description | title | version |
|---|---|---|
| Atomic Hermes agent testing with deterministic communication validation and performance metrics | hermes-agent-testing-skill | 1.1 |
Hermes Agent Testing Skill
Purpose
Test and validate Hermes agent functionality, communication patterns, session management, and performance with deterministic validation metrics.
Activation
Trigger when user requests Hermes agent testing: agent functionality validation, communication testing, session management testing, or agent performance analysis.
Input
{
"operation": "test-agent-communication|test-session-management|test-agent-performance|test-multi-agent|comprehensive",
"agent": "main|specific_agent_name (default: main)",
"test_message": "string (optional for communication testing)",
"session_id": "string (optional for session testing)",
"thinking_level": "off|minimal|low|medium|high|xhigh",
"test_duration": "number (optional, default: 60 seconds)",
"message_count": "number (optional, default: 5)",
"concurrent_agents": "number (optional, default: 2)"
}
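As an illustration of how a caller might check a request against the input schema above, here is a minimal sketch. The helper name `normalize_request` is hypothetical, and treating `operation` as required is an assumption; the schema does not say which fields are mandatory. The defaults are the ones listed in the schema; `thinking_level` has no documented default, so it is left unset.

```python
from typing import Any, Dict

# Allowed values copied from the Input schema.
OPERATIONS = {
    "test-agent-communication",
    "test-session-management",
    "test-agent-performance",
    "test-multi-agent",
    "comprehensive",
}
THINKING_LEVELS = {"off", "minimal", "low", "medium", "high", "xhigh"}

# Defaults stated in the schema (test_duration is in seconds).
DEFAULTS = {
    "agent": "main",
    "test_duration": 60,
    "message_count": 5,
    "concurrent_agents": 2,
}


def normalize_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: validate a request and fill in documented defaults."""
    operation = request.get("operation")
    if operation not in OPERATIONS:  # assumption: operation is required
        raise ValueError(f"unknown operation: {operation!r}")
    level = request.get("thinking_level")
    if level is not None and level not in THINKING_LEVELS:
        raise ValueError(f"unknown thinking_level: {level!r}")
    # Values supplied by the caller override the defaults.
    return {**DEFAULTS, **request}
```

Calling `normalize_request({"operation": "comprehensive"})` would return the request with `agent`, `test_duration`, `message_count`, and `concurrent_agents` filled in from the defaults.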
Output
{
"summary": "Hermes agent testing completed successfully",
"operation": "test-agent-communication|test-session-management|test-agent-performance|test-multi-agent|comprehensive",
"test_results": {
"agent_communication": "boolean",
"session_management": "boolean",
"agent_performance": "boolean",
"multi_agent_coordination": "boolean"
},
"agent_details": {
"agent_name": "string",
"agent_status": "online|offline|error",
"response_time": "number",
"message_success_rate": "number"
},
"communication_metrics": {
"messages_sent": "number",
"messages_received": "number",
"average_response_time": "number",
"communication_success_rate": "number"
},
"session_metrics": {
"sessions_created": "number",
"session_preservation": "boolean",
"context_maintenance": "boolean",
"session_duration": "number"
},
"performance_metrics": {
"cpu_usage": "number",
"memory_usage": "number",
"response_latency": "number",
"throughput": "number"
},
"issues": [],
"recommendations": [],
"confidence": 1.0,
"execution_time": "number",
"validation_status": "success|partial|failed"
}
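The `communication_metrics` block could be derived from per-message timings roughly as follows. This is a sketch under stated assumptions: the skill does not define how the rate is computed, so it is taken here to be the percentage of sent messages that received a reply, and response times are assumed to be in seconds. The helper name is hypothetical.

```python
from typing import List, Optional


def communication_metrics(response_times: List[Optional[float]]) -> dict:
    """Hypothetical helper.

    `response_times` holds one entry per message sent; None means no reply arrived.
    """
    received = [t for t in response_times if t is not None]
    sent = len(response_times)
    return {
        "messages_sent": sent,
        "messages_received": len(received),
        # Average over replies only; 0.0 when nothing came back.
        "average_response_time": round(sum(received) / len(received), 2) if received else 0.0,
        # Assumption: success rate is a percentage of sent messages answered.
        "communication_success_rate": round(100.0 * len(received) / sent, 1) if sent else 0.0,
    }
```

Keeping the calculation in one pure function means the same inputs always produce the same metrics, which fits the requirement for deterministic reporting.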
Process
1. Analyze
- Validate agent testing parameters and operation type
- Check Hermes service availability and health
- Verify agent availability and status
- Assess testing scope and requirements
2. Plan
- Prepare agent communication test scenarios
- Define session management testing strategy
- Set performance monitoring and validation criteria
- Configure multi-agent coordination tests
3. Execute
- Test agent communication with various thinking levels
- Validate session creation and context preservation
- Monitor agent performance and resource utilization
- Test multi-agent coordination and communication patterns
4. Validate
- Verify agent communication success and response quality
- Check session management effectiveness and context preservation
- Validate agent performance metrics and resource usage
- Confirm multi-agent coordination and communication patterns
Constraints
- MUST NOT test unavailable agents unless explicitly requested
- MUST NOT exceed message length limits (4000 characters)
- MUST validate thinking level compatibility
- MUST handle communication timeouts gracefully
- MUST preserve session context during testing
- MUST provide deterministic performance metrics
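Two of the constraints above, the 4000-character message limit and graceful timeout handling, can be sketched as below. The `send` argument is a hypothetical async callable standing in for whatever actually delivers a message to an agent, and the 30-second default timeout is an assumption; neither comes from Hermes itself.

```python
import asyncio

MAX_MESSAGE_LENGTH = 4000  # limit stated in the constraints


def check_message(message: str) -> str:
    """Reject messages that exceed the documented length limit."""
    if len(message) > MAX_MESSAGE_LENGTH:
        raise ValueError(
            f"message is {len(message)} characters; limit is {MAX_MESSAGE_LENGTH}"
        )
    return message


async def send_with_timeout(send, message: str, timeout_s: float = 30.0) -> dict:
    """Send one message and report a timeout as data instead of raising.

    `send` is a hypothetical coroutine function: await send(message) -> reply.
    """
    check_message(message)
    try:
        reply = await asyncio.wait_for(send(message), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Graceful handling: return timeout details the caller can report.
        return {"ok": False, "error": "timeout", "timeout_s": timeout_s}
    return {"ok": True, "reply": reply}
```

Returning a result dictionary on timeout, rather than letting the exception propagate, lets a test run continue with the remaining messages and record the failure in its metrics.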
Environment Assumptions
- Hermes 2026.3.24+ installed and gateway running
- Agent workspace configured at ~/.hermes/workspace/
- Network connectivity for agent communication
- Default agent available: "main"
- Session management functional
Error Handling
- Agent unavailable → Return agent status and availability recommendations
- Communication timeout → Return timeout details and retry suggestions
- Session management failures → Return session diagnostics and recovery steps
- Performance issues → Return performance metrics and optimization recommendations
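One way to turn the error cases above into the `issues` and `recommendations` fields of the output is a simple lookup table. The error-kind keys, the helper name, and the choice to mark every error as `"failed"` are all assumptions made for illustration; the skill does not define them.

```python
from typing import Dict, Tuple

# Keys are hypothetical error kinds; values paraphrase the Error Handling list.
ERROR_GUIDANCE: Dict[str, Tuple[str, str]] = {
    "agent_unavailable": ("Agent unavailable", "Check agent status and availability"),
    "communication_timeout": ("Communication timeout", "Review timeout details and retry"),
    "session_failure": ("Session management failure", "Review session diagnostics and follow recovery steps"),
    "performance_issue": ("Performance issue", "Review performance metrics and apply optimizations"),
}


def error_result(kind: str, details: str) -> dict:
    """Hypothetical helper: build the error-related fields of the output."""
    issue, recommendation = ERROR_GUIDANCE.get(
        kind, ("Unknown error", "No documented recovery path")
    )
    return {
        "issues": [f"{issue}: {details}"],
        "recommendations": [recommendation],
        # Assumption: any handled error marks the run as failed.
        "validation_status": "failed",
    }
```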
Example Usage Prompt
Run comprehensive Hermes agent testing including communication, session management, performance, and multi-agent coordination validation
Expected Output Example
{
"summary": "Comprehensive Hermes agent testing completed with all systems operational",
"operation": "comprehensive",
"test_results": {
"agent_communication": true,
"session_management": true,
"agent_performance": true,
"multi_agent_coordination": true
},
"agent_details": {
"agent_name": "main",
"agent_status": "online",
"response_time": 2.3,
"message_success_rate": 100.0
},
"communication_metrics": {
"messages_sent": 5,
"messages_received": 5,
"average_response_time": 2.1,
"communication_success_rate": 100.0
},
"session_metrics": {
"sessions_created": 3,
"session_preservation": true,
"context_maintenance": true,
"session_duration": 45.2
},
"performance_metrics": {
"cpu_usage": 15.3,
"memory_usage": 85.2,
"response_latency": 2.1,
"throughput": 2.4
},
"issues": [],
"recommendations": ["All agents operational", "Communication latency optimal", "Session management effective"],
"confidence": 1.0,
"execution_time": 67.3,
"validation_status": "success"
}
Model Routing Suggestion
Fast Model (Claude Haiku, GPT-3.5-turbo)
- Simple agent availability checking
- Basic communication testing with low thinking
- Quick agent status validation
Reasoning Model (Claude Sonnet, GPT-4)
- Comprehensive agent communication testing
- Session management validation and optimization
- Multi-agent coordination testing and analysis
- Complex agent performance diagnostics
Coding Model (Claude Sonnet, GPT-4)
- Agent performance optimization algorithms
- Communication pattern analysis and improvement
- Session management enhancement strategies
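The routing suggestion above could be expressed as a small function. This is only a sketch: the mapping from `operation` and `thinking_level` to a tier is an interpretation of the lists above, not a documented rule, and the function name is hypothetical. The coding tier is not returned here because the tasks listed for it (optimization and improvement work) do not correspond to any test operation in the input schema.

```python
LOW_EFFORT_LEVELS = {"off", "minimal", "low"}


def suggest_model_tier(operation: str, thinking_level: str = "low") -> str:
    """Hypothetical helper: return "fast" or "reasoning" for a test operation."""
    # Basic communication tests at low thinking levels fit the fast tier.
    if operation == "test-agent-communication" and thinking_level in LOW_EFFORT_LEVELS:
        return "fast"
    # Session, multi-agent, performance, and comprehensive runs go to the reasoning tier.
    return "reasoning"
```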
Performance Notes
- Execution Time: 5-15 seconds for basic tests, 30-90 seconds for comprehensive testing
- Memory Usage: <150MB for agent testing operations
- Network Requirements: Hermes gateway connectivity
- Concurrency: Safe to run multiple agent tests simultaneously when each test targets a different agent
- Session Management: Automatic session creation and context preservation testing