docs: add YAML frontmatter to skill files to make them proper skills
Some checks failed
Cross-Node Transaction Testing / transaction-test (push) Successful in 2s
Deploy to Testnet / deploy-testnet (push) Has been cancelled
Documentation Validation / validate-docs (push) Has been cancelled
Documentation Validation / validate-policies-strict (push) Has been cancelled
Multi-Node Stress Testing / stress-test (push) Has been cancelled
Node Failover Simulation / failover-test (push) Successful in 3s
- Add YAML frontmatter to aitbc-ai-operations.md
- Add YAML frontmatter to aitbc-basic-operations.md
- Add YAML frontmatter to aitbc-blockchain-troubleshooting.md
- Add YAML frontmatter to aitbc-marketplace.md
- Add YAML frontmatter to aitbc-multi-node-operations.md
- Add YAML frontmatter to aitbc-node-coordination.md
- aitbc-cli.md and aitbc.md already had YAML frontmatter
- All 8 skill files now have proper YAML frontmatter for Hermes agents to load
docs/skills/aitbc-blockchain-troubleshooting.md
@@ -0,0 +1,158 @@
---
name: aitbc-blockchain-troubleshooting
description: Blockchain troubleshooting including sync issues, P2P problems, service failures, data corruption, and blockchain recovery operations
category: troubleshooting
---

# AITBC Blockchain Troubleshooting Skill

## Trigger Conditions
Activate when the user requests blockchain troubleshooting: sync issues, P2P problems, service failures, data corruption, or blockchain recovery operations.

## Purpose
Diagnose and troubleshoot AITBC blockchain issues, including synchronization failures, P2P network problems, service failures, and data corruption.

## Prerequisites
- SSH access to all nodes (aitbc, aitbc1, gitea-runner)
- Systemd services operational or accessible for debugging
- Log access at `/var/log/aitbc/`
- Data directory at `/var/lib/aitbc/`
- CLI accessible at `/opt/aitbc/aitbc-cli`

## Operations

### 1. Initial Diagnosis
```bash
# Check service status on all nodes
systemctl status aitbc-blockchain-node.service
ssh aitbc1 'systemctl status aitbc-blockchain-node.service'
ssh gitea-runner 'systemctl status aitbc-blockchain-node.service'

# Check blockchain RPC health
curl http://localhost:8006/health
curl http://10.1.223.40:8006/health
curl http://aitbc1:8006/health

# Check P2P network status
netstat -an | grep 7070
ssh aitbc1 'netstat -an | grep 7070'
```

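The three RPC health probes above can be wrapped in a small loop so that one command reports every node. This is a minimal sketch, not part of the skill file: the function names `check_health` and `check_all_rpc` are hypothetical, and it assumes `/health` answers with a 2xx status when the node is healthy.

```bash
#!/usr/bin/env bash
# Hosts and port are copied from the curl commands above.
RPC_HOSTS=(localhost 10.1.223.40 aitbc1)
RPC_PORT=8006

# check_health URL: print "OK URL" or "FAIL URL"; non-zero exit on failure.
check_health() {
    local url="$1"
    # -f: treat HTTP error statuses as failures
    # -sS: quiet, but still report errors
    # --max-time: do not hang on an unreachable node
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
        echo "OK $url"
    else
        echo "FAIL $url"
        return 1
    fi
}

# check_all_rpc: probe every host; succeed only if all of them respond.
check_all_rpc() {
    local host rc=0
    for host in "${RPC_HOSTS[@]}"; do
        check_health "http://${host}:${RPC_PORT}/health" || rc=1
    done
    return "$rc"
}
```

After sourcing the snippet, `check_all_rpc || echo "at least one node is unhealthy"` gives a single pass/fail signal for the whole cluster.
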
### 2. Blockchain Sync Issues
```bash
# Check blockchain height on all nodes
# (local commands assume the current directory is /opt/aitbc)
./aitbc-cli chain
ssh aitbc1 'cd /opt/aitbc && ./aitbc-cli chain'
ssh gitea-runner 'cd /opt/aitbc && ./aitbc-cli chain'

# Check mempool status
./aitbc-cli mempool status
ssh aitbc1 'cd /opt/aitbc && ./aitbc-cli mempool status'

# Check P2P connections
./aitbc-cli network
ssh aitbc1 'cd /opt/aitbc && ./aitbc-cli network'
```

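Once the heights have been read from each node, comparing them is simple arithmetic. The sketch below uses a hypothetical helper `max_height_gap` that takes the heights as plain integers. It deliberately does not parse `aitbc-cli chain` output, because that output format is not shown here.

```bash
#!/usr/bin/env bash
# max_height_gap H1 [H2 ...]: print the difference between the highest
# and the lowest block height. A result of 0 means all nodes agree.
max_height_gap() {
    if [ "$#" -eq 0 ]; then
        echo "usage: max_height_gap HEIGHT..." >&2
        return 2
    fi
    local min="$1" max="$1" h
    for h in "$@"; do
        if (( h < min )); then min="$h"; fi
        if (( h > max )); then max="$h"; fi
    done
    echo $(( max - min ))
}
```

For example, `max_height_gap 120 118 125` prints `7`, which shows that the slowest node is seven blocks behind the fastest one.
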
### 3. P2P Network Problems
```bash
# Check P2P node IDs
grep p2p_node_id /etc/aitbc/.env
ssh aitbc1 'grep p2p_node_id /etc/aitbc/.env'

# Generate unique node IDs if duplicates found
/opt/aitbc/scripts/utils/generate_unique_node_ids.py

# Restart P2P services
systemctl restart aitbc-blockchain-p2p.service
ssh aitbc1 'systemctl restart aitbc-blockchain-p2p.service'
```

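Spotting a duplicate by eye gets harder as nodes are added, so the comparison can be left to `sort` and `uniq`. This sketch assumes the `.env` entries have the form `p2p_node_id=VALUE`, which the source does not confirm. `find_duplicate_ids` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# find_duplicate_ids: read "p2p_node_id=VALUE" lines on stdin and print
# every VALUE that appears more than once. Prints nothing if all are unique.
find_duplicate_ids() {
    # Keep everything after the first "=", then let uniq -d report repeats.
    cut -d= -f2- | sort | uniq -d
}
```

A possible invocation that feeds the helper from two nodes: `{ grep p2p_node_id /etc/aitbc/.env; ssh aitbc1 'grep p2p_node_id /etc/aitbc/.env'; } | find_duplicate_ids`. Empty output means no duplicates were found.
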
### 4. Service Failures
```bash
# Check service logs
journalctl -u aitbc-blockchain-node.service -n 100
journalctl -u aitbc-blockchain-p2p.service -n 100
ssh aitbc1 'journalctl -u aitbc-blockchain-node.service -n 100'

# Check application logs (tail -f keeps following the file; press Ctrl-C to stop)
tail -f /var/log/aitbc/blockchain-node.log
tail -f /var/log/aitbc/blockchain-p2p.log
ssh aitbc1 'tail -f /var/log/aitbc/blockchain-node.log'
```

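When a log is long, a quick count of suspicious lines helps decide which file to read first. The sketch below is illustrative only: `count_log_errors` is a hypothetical helper, and the default pattern `error|critical|traceback` is an assumption about what the AITBC logs contain.

```bash
#!/usr/bin/env bash
# count_log_errors FILE [PATTERN]: print the number of lines in FILE that
# match PATTERN (extended regex, case-insensitive).
count_log_errors() {
    local file="$1" pattern="${2:-error|critical|traceback}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps that case from looking like a failure.
    grep -ciE -- "$pattern" "$file" || true
}
```

For example, `count_log_errors /var/log/aitbc/blockchain-node.log` prints a single number that can be compared across nodes.
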
### 5. Data Corruption
```bash
# Check database integrity
sqlite3 /var/lib/aitbc/data/blockchain.db "PRAGMA integrity_check;"

# Check CoW status (Btrfs); -d reports the directory itself,
# and a "C" flag in the output means CoW is already disabled
lsattr -d /var/lib/aitbc

# Disable CoW if enabled (on a directory, +C only affects files created afterwards)
chattr +C /var/lib/aitbc

# Enable WAL mode
sqlite3 /var/lib/aitbc/data/blockchain.db "PRAGMA journal_mode=WAL;"
```

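The two SQLite checks above can be turned into pass/fail tests for scripting. This is a sketch under the usual SQLite behaviour: `PRAGMA integrity_check;` prints the single word `ok` for a sound database, and `PRAGMA journal_mode;` prints `wal` when WAL is active. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# db_is_healthy DB: succeed only if the integrity check reports exactly "ok".
db_is_healthy() {
    local db="$1" result
    # sqlite3 would silently create a missing file, so check for it first.
    if [ ! -f "$db" ]; then
        echo "no such database: $db" >&2
        return 2
    fi
    result="$(sqlite3 "$db" 'PRAGMA integrity_check;')" || return 1
    [ "$result" = "ok" ]
}

# db_uses_wal DB: succeed only if the journal mode is WAL.
db_uses_wal() {
    local db="$1" mode
    mode="$(sqlite3 "$db" 'PRAGMA journal_mode;')" || return 1
    [ "$mode" = "wal" ]
}
```

A possible use is `db_is_healthy /var/lib/aitbc/data/blockchain.db || echo "integrity check failed"`.
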
### 6. Recovery Operations
```bash
# Stop blockchain services
systemctl stop aitbc-blockchain-node.service aitbc-blockchain-p2p.service
ssh aitbc1 'systemctl stop aitbc-blockchain-node.service aitbc-blockchain-p2p.service'

# Backup current data (-a preserves ownership, permissions, and timestamps;
# if data.backup already exists, cp nests the copy inside it)
cp -a /var/lib/aitbc/data /var/lib/aitbc/data.backup

# Restore from backup if needed
systemctl stop aitbc-blockchain-node.service
rm -rf /var/lib/aitbc/data/*
cp -a /var/lib/aitbc/data.backup/. /var/lib/aitbc/data/
systemctl start aitbc-blockchain-node.service

# Restart services
systemctl start aitbc-blockchain-node.service aitbc-blockchain-p2p.service
ssh aitbc1 'systemctl start aitbc-blockchain-node.service aitbc-blockchain-p2p.service'
```

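A fixed backup name such as `data.backup` is easy to overwrite or nest by accident on a second run. One way to avoid that is a timestamped destination, sketched below. `backup_data_dir` and the `.backup-YYYYMMDD-HHMMSS` suffix are hypothetical conventions, not something the skill file prescribes. As in the steps above, the services should be stopped before it is run.

```bash
#!/usr/bin/env bash
# backup_data_dir SRC: copy directory SRC to SRC.backup-<timestamp> and
# print the destination path. Refuses to touch an existing destination.
backup_data_dir() {
    local src="${1%/}" dest
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 2
    fi
    dest="${src}.backup-$(date +%Y%m%d-%H%M%S)"
    if [ -e "$dest" ]; then
        echo "already exists: $dest" >&2
        return 1
    fi
    # -a keeps ownership, permissions, and timestamps intact.
    cp -a -- "$src" "$dest" && echo "$dest"
}
```

Calling `backup_data_dir /var/lib/aitbc/data` prints the path of the new copy, which can be recorded for a later restore.
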
### 7. Communication Test
```bash
# Run full communication test
./scripts/blockchain-communication-test.sh --full --debug

# Verify all services are healthy
curl http://localhost:8006/health
curl http://aitbc1:8006/health
curl http://10.1.223.40:8001/health
curl http://10.1.223.40:8011/health

# Check blockchain sync
NODE_URL=http://10.1.223.40:8006 ./aitbc-cli blockchain height
NODE_URL=http://aitbc1:8006 ./aitbc-cli blockchain height
```

## Common Pitfalls

1. **Duplicate P2P Node IDs:** Check for duplicate `p2p_node_id` values in `/etc/aitbc/.env` and generate unique IDs if any are found
2. **Btrfs CoW Corruption:** Disable CoW on `/var/lib/aitbc` with `chattr +C`
3. **SQLite Corruption:** Enable WAL mode and check database integrity
4. **Port Mismatches:** Coordinator API is on port 8011 (not 8000)
5. **Service Start Order:** Ensure the P2P service starts before the blockchain-node service
6. **Network Connectivity:** Verify P2P port 7070 is open and accessible
7. **Data Directory Permissions:** Ensure proper permissions on `/var/lib/aitbc/data`

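The start-order pitfall can be enforced by systemd itself with a drop-in that adds `After=` and `Wants=` to the node unit. The sketch below is one possible approach, assuming the shipped unit files do not already declare this ordering. `write_order_dropin` and the file name `10-after-p2p.conf` are hypothetical.

```bash
#!/usr/bin/env bash
# write_order_dropin [DIR]: write a drop-in that makes the blockchain-node
# unit start after (and pull in) the P2P unit. DIR defaults to the standard
# drop-in directory for aitbc-blockchain-node.service.
write_order_dropin() {
    local dir="${1:-/etc/systemd/system/aitbc-blockchain-node.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-after-p2p.conf" <<'EOF'
[Unit]
Wants=aitbc-blockchain-p2p.service
After=aitbc-blockchain-p2p.service
EOF
}
```

After running it as root, `systemctl daemon-reload` makes systemd pick up the new ordering.
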
## Verification Checklist
- [ ] All blockchain services running
- [ ] Blockchain heights match across nodes
- [ ] P2P connections established (port 7070)
- [ ] RPC endpoints responding (port 8006)
- [ ] No duplicate P2P node IDs
- [ ] Database integrity check passes
- [ ] CoW disabled on data directory
- [ ] WAL mode enabled for database

## CLI Tool Preference
- **Primary CLI:** `/opt/aitbc/aitbc-cli` is the single CLI entry point
- **RPC URL:** Default is `http://localhost:8006`
- **Coordinator API:** Port 8011 (not 8000)