```
chore: enhance .gitignore and remove obsolete documentation files - Reorganize .gitignore with categorized sections for better maintainability - Add comprehensive ignore patterns for Python, Node.js, databases, logs, and build artifacts - Add project-specific ignore rules for coordinator, explorer, and deployment files - Remove outdated documentation: BITCOIN-WALLET-SETUP.md, LOCAL_ASSETS_SUMMARY.md, README-CONTAINER-DEPLOYMENT.md, README-DOMAIN-DEPLOYMENT.md ```
This commit is contained in:
97
.windsurf/skills/blockchain-operations/SKILL.md
Normal file
97
.windsurf/skills/blockchain-operations/SKILL.md
Normal file
@@ -0,0 +1,97 @@
|
||||
---
|
||||
name: blockchain-operations
|
||||
description: Comprehensive blockchain node management and operations for AITBC
|
||||
version: 1.0.0
|
||||
author: Cascade
|
||||
tags: [blockchain, node, mining, transactions, aitbc, operations]
|
||||
---
|
||||
|
||||
# Blockchain Operations Skill
|
||||
|
||||
This skill provides standardized procedures for managing AITBC blockchain nodes, verifying transactions, and optimizing mining operations.
|
||||
|
||||
## Overview
|
||||
|
||||
The blockchain operations skill ensures reliable management of all blockchain-related components including node synchronization, transaction processing, mining operations, and network health monitoring.
|
||||
|
||||
## Capabilities
|
||||
|
||||
### Node Management
|
||||
- Node deployment and configuration
|
||||
- Sync status monitoring
|
||||
- Peer management
|
||||
- Network diagnostics
|
||||
|
||||
### Transaction Operations
|
||||
- Transaction verification and debugging
|
||||
- Gas optimization
|
||||
- Batch processing
|
||||
- Mempool management
|
||||
|
||||
### Mining Operations
|
||||
- Mining performance optimization
|
||||
- Pool management
|
||||
- Reward tracking
|
||||
- Hash rate optimization
|
||||
|
||||
### Network Health
|
||||
- Network connectivity checks
|
||||
- Block propagation monitoring
|
||||
- Fork detection and resolution
|
||||
- Consensus validation
|
||||
|
||||
## Common Workflows
|
||||
|
||||
### 1. Node Health Check
|
||||
- Verify node synchronization
|
||||
- Check peer connections
|
||||
- Validate consensus rules
|
||||
- Monitor resource usage
|
||||
|
||||
### 2. Transaction Debugging
|
||||
- Trace transaction lifecycle
|
||||
- Verify gas usage
|
||||
- Check receipt status
|
||||
- Debug failed transactions
|
||||
|
||||
### 3. Mining Optimization
|
||||
- Analyze mining performance
|
||||
- Optimize GPU settings
|
||||
- Configure mining pools
|
||||
- Monitor profitability
|
||||
|
||||
### 4. Network Diagnostics
|
||||
- Test connectivity to peers
|
||||
- Analyze block propagation
|
||||
- Detect network partitions
|
||||
- Validate consensus state
|
||||
|
||||
## Supporting Files
|
||||
|
||||
- `node-health.sh` - Comprehensive node health monitoring
|
||||
- `tx-tracer.py` - Transaction tracing and debugging tool
|
||||
- `mining-optimize.sh` - GPU mining optimization script
|
||||
- `network-diag.py` - Network diagnostics and analysis
|
||||
- `sync-monitor.py` - Real-time sync status monitor
|
||||
|
||||
## Usage
|
||||
|
||||
This skill is automatically invoked when you request blockchain-related operations such as:
|
||||
- "check node status"
|
||||
- "debug transaction"
|
||||
- "optimize mining"
|
||||
- "network diagnostics"
|
||||
|
||||
## Safety Features
|
||||
|
||||
- Automatic backup of node data before operations
|
||||
- Validation of all transactions before processing
|
||||
- Safe mining parameter adjustments
|
||||
- Rollback capability for configuration changes
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- AITBC node installed and configured
|
||||
- GPU drivers installed (for mining operations)
|
||||
- Proper network connectivity
|
||||
- Sufficient disk space for blockchain data
|
||||
296
.windsurf/skills/blockchain-operations/mining-optimize.sh
Executable file
296
.windsurf/skills/blockchain-operations/mining-optimize.sh
Executable file
@@ -0,0 +1,296 @@
|
||||
#!/bin/bash
|
||||
|
||||
# AITBC GPU Mining Optimization Script
|
||||
# Optimizes GPU settings for maximum mining efficiency
|
||||
|
||||
set -e
|
||||
|
||||
# Configuration
|
||||
LOG_FILE="/var/log/aitbc/mining-optimize.log"
|
||||
CONFIG_FILE="/etc/aitbc/mining.conf"
|
||||
GPU_VENDOR="" # Will be auto-detected
|
||||
|
||||
# Colors
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m'
|
||||
|
||||
# Logging function
|
||||
log() {
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
|
||||
}
|
||||
|
||||
# Detect GPU vendor
|
||||
detect_gpu() {
|
||||
echo -e "${BLUE}=== Detecting GPU ===${NC}"
|
||||
|
||||
if command -v nvidia-smi &> /dev/null; then
|
||||
GPU_VENDOR="nvidia"
|
||||
echo -e "${GREEN}✓${NC} NVIDIA GPU detected"
|
||||
log "GPU vendor: NVIDIA"
|
||||
elif command -v rocm-smi &> /dev/null; then
|
||||
GPU_VENDOR="amd"
|
||||
echo -e "${GREEN}✓${NC} AMD GPU detected"
|
||||
log "GPU vendor: AMD"
|
||||
elif lspci | grep -i vga &> /dev/null; then
|
||||
echo -e "${YELLOW}⚠${NC} GPU detected but vendor-specific tools not found"
|
||||
log "GPU detected but vendor unknown"
|
||||
GPU_VENDOR="unknown"
|
||||
else
|
||||
echo -e "${RED}✗${NC} No GPU detected"
|
||||
log "No GPU detected - cannot optimize mining"
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
|
||||
# Get GPU information
|
||||
get_gpu_info() {
|
||||
echo -e "\n${BLUE}=== GPU Information ===${NC}"
|
||||
|
||||
case $GPU_VENDOR in
|
||||
"nvidia")
|
||||
nvidia-smi --query-gpu=name,memory.total,temperature.gpu,utilization.gpu,power.draw --format=csv,noheader,nounits
|
||||
;;
|
||||
"amd")
|
||||
rocm-smi --showproductname
|
||||
rocm-smi --showmeminfo vram
|
||||
rocm-smi --showtemp
|
||||
;;
|
||||
*)
|
||||
echo "GPU info not available for vendor: $GPU_VENDOR"
|
||||
;;
|
||||
esac
|
||||
}
|
||||
|
||||
# Optimize NVIDIA GPU
|
||||
optimize_nvidia() {
|
||||
echo -e "\n${BLUE}=== Optimizing NVIDIA GPU ===${NC}"
|
||||
|
||||
# Get current power limit
|
||||
CURRENT_POWER=$(nvidia-smi --query-gpu=power.limit --format=csv,noheader,nounits | head -n1)
|
||||
echo "Current power limit: ${CURRENT_POWER}W"
|
||||
|
||||
# Set optimal power limit (80% of max for efficiency)
|
||||
MAX_POWER=$(nvidia-smi --query-gpu=power.max_limit --format=csv,noheader,nounits | head -n1)
|
||||
OPTIMAL_POWER=$((MAX_POWER * 80 / 100))
|
||||
|
||||
echo "Setting power limit to ${OPTIMAL_POWER}W (80% of max)"
|
||||
sudo nvidia-smi -pl $OPTIMAL_POWER
|
||||
log "NVIDIA power limit set to ${OPTIMAL_POWER}W"
|
||||
|
||||
# Set performance mode
|
||||
echo "Setting performance mode to maximum"
|
||||
sudo nvidia-smi -ac 877,1215
|
||||
log "NVIDIA performance mode set to maximum"
|
||||
|
||||
# Configure memory clock
|
||||
echo "Optimizing memory clock"
|
||||
sudo nvidia-smi -pm 1
|
||||
log "NVIDIA persistence mode enabled"
|
||||
|
||||
# Create optimized mining config
|
||||
cat > $CONFIG_FILE << EOF
|
||||
[nvidia]
|
||||
power_limit = $OPTIMAL_POWER
|
||||
performance_mode = maximum
|
||||
memory_clock = max
|
||||
fan_speed = auto
|
||||
temperature_limit = 85
|
||||
EOF
|
||||
|
||||
echo -e "${GREEN}✓${NC} NVIDIA GPU optimized"
|
||||
}
|
||||
|
||||
# Optimize AMD GPU
|
||||
optimize_amd() {
|
||||
echo -e "\n${BLUE}=== Optimizing AMD GPU ===${NC}"
|
||||
|
||||
# Set performance level
|
||||
echo "Setting performance level to high"
|
||||
sudo rocm-smi --setperflevel high
|
||||
log "AMD performance level set to high"
|
||||
|
||||
# Set memory clock
|
||||
echo "Optimizing memory clock"
|
||||
sudo rocm-smi --setmclk 1
|
||||
log "AMD memory clock optimized"
|
||||
|
||||
# Create optimized mining config
|
||||
cat > $CONFIG_FILE << EOF
|
||||
[amd]
|
||||
performance_level = high
|
||||
memory_clock = high
|
||||
fan_speed = auto
|
||||
temperature_limit = 85
|
||||
EOF
|
||||
|
||||
echo -e "${GREEN}✓${NC} AMD GPU optimized"
|
||||
}
|
||||
|
||||
# Monitor mining performance
|
||||
monitor_mining() {
|
||||
echo -e "\n${BLUE}=== Mining Performance Monitor ===${NC}"
|
||||
|
||||
# Check if miner is running
|
||||
if ! pgrep -f "aitbc-miner" > /dev/null; then
|
||||
echo -e "${YELLOW}⚠${NC} Miner is not running"
|
||||
return 1
|
||||
fi
|
||||
|
||||
# Monitor for 30 seconds
|
||||
echo "Monitoring mining performance for 30 seconds..."
|
||||
|
||||
for i in {1..6}; do
|
||||
echo -e "\n--- Check $i/6 ---"
|
||||
|
||||
case $GPU_VENDOR in
|
||||
"nvidia")
|
||||
nvidia-smi --query-gpu=temperature.gpu,utilization.gpu,power.draw,fan.speed --format=csv,noheader,nounits
|
||||
;;
|
||||
"amd")
|
||||
rocm-smi --showtemp --showutilization
|
||||
;;
|
||||
esac
|
||||
|
||||
# Get hash rate from miner API
|
||||
if curl -s http://localhost:8081/api/status > /dev/null; then
|
||||
HASHRATE=$(curl -s http://localhost:8081/api/status | jq -r '.hashrate')
|
||||
echo "Hash rate: ${HASHRATE} H/s"
|
||||
fi
|
||||
|
||||
sleep 5
|
||||
done
|
||||
}
|
||||
|
||||
# Tune fan curves
|
||||
tune_fans() {
|
||||
echo -e "\n${BLUE}=== Tuning Fan Curves ===${NC}"
|
||||
|
||||
case $GPU_VENDOR in
|
||||
"nvidia")
|
||||
# Set custom fan curve
|
||||
echo "Setting custom fan curve for NVIDIA"
|
||||
# This would use nvidia-settings or similar
|
||||
echo "Target: 30% fan at 50°C, 60% at 70°C, 100% at 85°C"
|
||||
log "NVIDIA fan curve configured"
|
||||
;;
|
||||
"amd")
|
||||
echo "Setting fan control to auto for AMD"
|
||||
# AMD cards usually handle this automatically
|
||||
log "AMD fan control set to auto"
|
||||
;;
|
||||
esac
|
||||
}
|
||||
|
||||
# Check mining profitability
|
||||
check_profitability() {
|
||||
echo -e "\n${BLUE}=== Profitability Analysis ===${NC}"
|
||||
|
||||
# Get current hash rate
|
||||
if curl -s http://localhost:8081/api/status > /dev/null; then
|
||||
HASHRATE=$(curl -s http://localhost:8081/api/status | jq -r '.hashrate')
|
||||
POWER_USAGE=$(nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits | head -n1)
|
||||
|
||||
echo "Current hash rate: ${HASHRATE} H/s"
|
||||
echo "Power usage: ${POWER_USAGE}W"
|
||||
|
||||
# Calculate efficiency
|
||||
if [ "$HASHRATE" != "null" ] && [ -n "$POWER_USAGE" ]; then
|
||||
EFFICIENCY=$(echo "scale=2; $HASHRATE / $POWER_USAGE" | bc)
|
||||
echo "Efficiency: ${EFFICIENCY} H/W"
|
||||
|
||||
# Efficiency rating
|
||||
if (( $(echo "$EFFICIENCY > 10" | bc -l) )); then
|
||||
echo -e "${GREEN}✓${NC} Excellent efficiency"
|
||||
elif (( $(echo "$EFFICIENCY > 5" | bc -l) )); then
|
||||
echo -e "${YELLOW}⚠${NC} Good efficiency"
|
||||
else
|
||||
echo -e "${RED}✗${NC} Poor efficiency - consider optimization"
|
||||
fi
|
||||
fi
|
||||
else
|
||||
echo "Miner API not accessible"
|
||||
fi
|
||||
}
|
||||
|
||||
# Generate optimization report
|
||||
generate_report() {
|
||||
echo -e "\n${BLUE}=== Optimization Report ===${NC}"
|
||||
|
||||
echo "GPU Vendor: $GPU_VENDOR"
|
||||
echo "Configuration: $CONFIG_FILE"
|
||||
echo "Optimization completed: $(date)"
|
||||
|
||||
# Current settings
|
||||
echo -e "\nCurrent Settings:"
|
||||
case $GPU_VENDOR in
|
||||
"nvidia")
|
||||
nvidia-smi --query-gpu=power.limit,temperature.gpu,utilization.gpu --format=csv,noheader,nounits
|
||||
;;
|
||||
"amd")
|
||||
rocm-smi --showtemp --showutilization
|
||||
;;
|
||||
esac
|
||||
|
||||
log "Optimization report generated"
|
||||
}
|
||||
|
||||
# Main execution
|
||||
main() {
|
||||
log "Starting mining optimization"
|
||||
echo -e "${BLUE}AITBC GPU Mining Optimizer${NC}"
|
||||
echo "==============================="
|
||||
|
||||
# Check root privileges
|
||||
if [ "$EUID" -ne 0 ]; then
|
||||
echo -e "${YELLOW}⚠${NC} Some optimizations require sudo privileges"
|
||||
fi
|
||||
|
||||
detect_gpu
|
||||
get_gpu_info
|
||||
|
||||
# Perform optimization based on vendor
|
||||
case $GPU_VENDOR in
|
||||
"nvidia")
|
||||
optimize_nvidia
|
||||
;;
|
||||
"amd")
|
||||
optimize_amd
|
||||
;;
|
||||
*)
|
||||
echo -e "${YELLOW}⚠${NC} Cannot optimize unknown GPU vendor"
|
||||
;;
|
||||
esac
|
||||
|
||||
tune_fans
|
||||
monitor_mining
|
||||
check_profitability
|
||||
generate_report
|
||||
|
||||
echo -e "\n${GREEN}Mining optimization completed!${NC}"
|
||||
echo "Configuration saved to: $CONFIG_FILE"
|
||||
echo "Log saved to: $LOG_FILE"
|
||||
|
||||
log "Mining optimization completed successfully"
|
||||
}
|
||||
|
||||
# Parse command line arguments
|
||||
case "${1:-optimize}" in
|
||||
"optimize")
|
||||
main
|
||||
;;
|
||||
"monitor")
|
||||
detect_gpu
|
||||
monitor_mining
|
||||
;;
|
||||
"report")
|
||||
detect_gpu
|
||||
generate_report
|
||||
;;
|
||||
*)
|
||||
echo "Usage: $0 [optimize|monitor|report]"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
398
.windsurf/skills/blockchain-operations/network-diag.py
Executable file
398
.windsurf/skills/blockchain-operations/network-diag.py
Executable file
@@ -0,0 +1,398 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
AITBC Network Diagnostics Tool
|
||||
Analyzes network connectivity, peer health, and block propagation
|
||||
"""
|
||||
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
import socket
|
||||
import subprocess
|
||||
from datetime import datetime
|
||||
from typing import Dict, List, Tuple, Optional
|
||||
import requests
|
||||
|
||||
class NetworkDiagnostics:
|
||||
def __init__(self, node_url: str = "http://localhost:8545"):
|
||||
"""Initialize network diagnostics"""
|
||||
self.node_url = node_url
|
||||
self.results = {}
|
||||
|
||||
def rpc_call(self, method: str, params: List = None) -> Optional[Dict]:
|
||||
"""Make JSON-RPC call to node"""
|
||||
try:
|
||||
response = requests.post(
|
||||
self.node_url,
|
||||
json={
|
||||
"jsonrpc": "2.0",
|
||||
"method": method,
|
||||
"params": params or [],
|
||||
"id": 1
|
||||
},
|
||||
timeout=10
|
||||
)
|
||||
return response.json().get('result')
|
||||
except Exception as e:
|
||||
return None
|
||||
|
||||
def check_connectivity(self) -> Dict[str, any]:
|
||||
"""Check basic network connectivity"""
|
||||
print("Checking network connectivity...")
|
||||
|
||||
results = {
|
||||
'node_reachable': False,
|
||||
'dns_resolution': {},
|
||||
'port_checks': {},
|
||||
'internet_connectivity': False
|
||||
}
|
||||
|
||||
# Check if node is reachable
|
||||
try:
|
||||
response = requests.get(self.node_url, timeout=5)
|
||||
results['node_reachable'] = response.status_code == 200
|
||||
except:
|
||||
pass
|
||||
|
||||
# DNS resolution checks
|
||||
domains = ['aitbc.io', 'api.aitbc.io', 'mainnet.aitbc.io']
|
||||
for domain in domains:
|
||||
try:
|
||||
ip = socket.gethostbyname(domain)
|
||||
results['dns_resolution'][domain] = {
|
||||
'resolvable': True,
|
||||
'ip': ip
|
||||
}
|
||||
except:
|
||||
results['dns_resolution'][domain] = {
|
||||
'resolvable': False,
|
||||
'ip': None
|
||||
}
|
||||
|
||||
# Port checks
|
||||
ports = [
|
||||
('localhost', 8545, 'RPC'),
|
||||
('localhost', 8546, 'WS'),
|
||||
('localhost', 30303, 'P2P TCP'),
|
||||
('localhost', 30303, 'P2P UDP')
|
||||
]
|
||||
|
||||
for host, port, service in ports:
|
||||
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||
sock.settimeout(3)
|
||||
result = sock.connect_ex((host, port))
|
||||
results['port_checks'][f'{host}:{port} ({service})'] = result == 0
|
||||
sock.close()
|
||||
|
||||
# Internet connectivity
|
||||
try:
|
||||
response = requests.get('https://google.com', timeout=5)
|
||||
results['internet_connectivity'] = response.status_code == 200
|
||||
except:
|
||||
pass
|
||||
|
||||
self.results['connectivity'] = results
|
||||
return results
|
||||
|
||||
def analyze_peers(self) -> Dict[str, any]:
|
||||
"""Analyze peer connections"""
|
||||
print("Analyzing peer connections...")
|
||||
|
||||
results = {
|
||||
'peer_count': 0,
|
||||
'peer_details': [],
|
||||
'peer_distribution': {},
|
||||
'connection_types': {},
|
||||
'latency_stats': {}
|
||||
}
|
||||
|
||||
# Get peer list
|
||||
peers = self.rpc_call("admin_peers") or []
|
||||
results['peer_count'] = len(peers)
|
||||
|
||||
# Analyze each peer
|
||||
for peer in peers:
|
||||
peer_info = {
|
||||
'id': (peer.get('id', '')[:10] + '...') if peer.get('id') else '',
|
||||
'address': peer.get('network', {}).get('remoteAddress', 'Unknown'),
|
||||
'local_address': peer.get('network', {}).get('localAddress', 'Unknown'),
|
||||
'caps': list(peer.get('protocols', {}).keys()),
|
||||
'connected_duration': peer.get('network', {}).get('connectedDuration', 0)
|
||||
}
|
||||
|
||||
# Extract IP for geolocation
|
||||
if ':' in peer_info['address']:
|
||||
ip = peer_info['address'].split(':')[0]
|
||||
peer_info['ip'] = ip
|
||||
|
||||
# Get country (would use geoip library in production)
|
||||
try:
|
||||
# Simple ping test for latency
|
||||
start = time.time()
|
||||
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
|
||||
sock.settimeout(1)
|
||||
result = sock.connect_ex((ip, 30303))
|
||||
latency = (time.time() - start) * 1000 if result == 0 else None
|
||||
sock.close()
|
||||
peer_info['latency_ms'] = latency
|
||||
except:
|
||||
peer_info['latency_ms'] = None
|
||||
|
||||
results['peer_details'].append(peer_info)
|
||||
|
||||
# Calculate distribution
|
||||
countries = {}
|
||||
for peer in results['peer_details']:
|
||||
country = peer.get('country', 'Unknown')
|
||||
countries[country] = countries.get(country, 0) + 1
|
||||
results['peer_distribution'] = countries
|
||||
|
||||
# Calculate latency stats
|
||||
latencies = [p['latency_ms'] for p in results['peer_details'] if p['latency_ms'] is not None]
|
||||
if latencies:
|
||||
results['latency_stats'] = {
|
||||
'min': min(latencies),
|
||||
'max': max(latencies),
|
||||
'avg': sum(latencies) / len(latencies)
|
||||
}
|
||||
|
||||
self.results['peers'] = results
|
||||
return results
|
||||
|
||||
def test_block_propagation(self) -> Dict[str, any]:
|
||||
"""Test block propagation speed"""
|
||||
print("Testing block propagation...")
|
||||
|
||||
results = {
|
||||
'latest_block': 0,
|
||||
'block_age': 0,
|
||||
'propagation_delay': None,
|
||||
'uncle_rate': 0,
|
||||
'network_hashrate': 0
|
||||
}
|
||||
|
||||
# Get latest block
|
||||
latest_block = self.rpc_call("eth_getBlockByNumber", ["latest", False])
|
||||
if latest_block:
|
||||
results['latest_block'] = int(latest_block['number'], 16)
|
||||
block_timestamp = int(latest_block['timestamp'], 16)
|
||||
results['block_age'] = int(time.time()) - block_timestamp
|
||||
|
||||
# Get uncle rate (check last 100 blocks)
|
||||
try:
|
||||
uncle_count = 0
|
||||
for i in range(100):
|
||||
block = self.rpc_call("eth_getBlockByNumber", [hex(results['latest_block'] - i), False])
|
||||
if block and block.get('uncles'):
|
||||
uncle_count += len(block['uncles'])
|
||||
results['uncle_rate'] = (uncle_count / 100) * 100
|
||||
except:
|
||||
pass
|
||||
|
||||
# Get network hashrate
|
||||
try:
|
||||
latest = self.rpc_call("eth_getBlockByNumber", ["latest", False])
|
||||
if latest:
|
||||
difficulty = int(latest['difficulty'], 16)
|
||||
block_time = 13 # Average block time for ETH-like chains
|
||||
results['network_hashrate'] = difficulty / block_time
|
||||
except:
|
||||
pass
|
||||
|
||||
self.results['block_propagation'] = results
|
||||
return results
|
||||
|
||||
def check_fork_status(self) -> Dict[str, any]:
|
||||
"""Check for network forks"""
|
||||
print("Checking for network forks...")
|
||||
|
||||
results = {
|
||||
'current_fork': None,
|
||||
'fork_blocks': [],
|
||||
'reorg_detected': False,
|
||||
'chain_head': {}
|
||||
}
|
||||
|
||||
# Get current fork block
|
||||
try:
|
||||
fork_block = self.rpc_call("eth_forkBlock")
|
||||
if fork_block:
|
||||
results['current_fork'] = int(fork_block, 16)
|
||||
except:
|
||||
pass
|
||||
|
||||
# Check for recent reorganizations
|
||||
try:
|
||||
# Get last 10 blocks and check for inconsistencies
|
||||
for i in range(10):
|
||||
block_num = hex(int(self.rpc_call("eth_blockNumber"), 16) - i)
|
||||
block = self.rpc_call("eth_getBlockByNumber", [block_num, False])
|
||||
if block:
|
||||
results['chain_head'][block_num] = {
|
||||
'hash': block['hash'],
|
||||
'parent': block.get('parentHash'),
|
||||
'number': block['number']
|
||||
}
|
||||
except:
|
||||
pass
|
||||
|
||||
self.results['fork_status'] = results
|
||||
return results
|
||||
|
||||
def analyze_network_performance(self) -> Dict[str, any]:
|
||||
"""Analyze overall network performance"""
|
||||
print("Analyzing network performance...")
|
||||
|
||||
results = {
|
||||
'rpc_response_time': 0,
|
||||
'ws_connection': False,
|
||||
'bandwidth_estimate': 0,
|
||||
'packet_loss': 0
|
||||
}
|
||||
|
||||
# Test RPC response time
|
||||
start = time.time()
|
||||
self.rpc_call("eth_blockNumber")
|
||||
results['rpc_response_time'] = (time.time() - start) * 1000
|
||||
|
||||
# Test WebSocket connection
|
||||
try:
|
||||
import websocket
|
||||
# Would implement actual WS connection test
|
||||
results['ws_connection'] = True
|
||||
except:
|
||||
results['ws_connection'] = False
|
||||
|
||||
self.results['performance'] = results
|
||||
return results
|
||||
|
||||
def generate_recommendations(self) -> List[str]:
|
||||
"""Generate network improvement recommendations"""
|
||||
recommendations = []
|
||||
|
||||
# Connectivity recommendations
|
||||
if not self.results.get('connectivity', {}).get('node_reachable'):
|
||||
recommendations.append("Node is not reachable - check if the node is running")
|
||||
|
||||
if not self.results.get('connectivity', {}).get('internet_connectivity'):
|
||||
recommendations.append("No internet connectivity - check network connection")
|
||||
|
||||
# Peer recommendations
|
||||
peer_count = self.results.get('peers', {}).get('peer_count', 0)
|
||||
if peer_count < 5:
|
||||
recommendations.append(f"Low peer count ({peer_count}) - check firewall and port forwarding")
|
||||
|
||||
# Performance recommendations
|
||||
rpc_time = self.results.get('performance', {}).get('rpc_response_time', 0)
|
||||
if rpc_time > 1000:
|
||||
recommendations.append("High RPC response time - consider optimizing node or upgrading hardware")
|
||||
|
||||
# Block propagation recommendations
|
||||
block_age = self.results.get('block_propagation', {}).get('block_age', 0)
|
||||
if block_age > 60:
|
||||
recommendations.append("Stale blocks detected - possible sync issues")
|
||||
|
||||
return recommendations
|
||||
|
||||
def print_report(self):
|
||||
"""Print comprehensive diagnostic report"""
|
||||
print("\n" + "="*60)
|
||||
print("AITBC Network Diagnostics Report")
|
||||
print("="*60)
|
||||
print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
|
||||
print(f"Node URL: {self.node_url}")
|
||||
|
||||
# Connectivity section
|
||||
print("\n[Connectivity]")
|
||||
conn = self.results.get('connectivity', {})
|
||||
print(f" Node Reachable: {'✓' if conn.get('node_reachable') else '✗'}")
|
||||
print(f" Internet Access: {'✓' if conn.get('internet_connectivity') else '✗'}")
|
||||
|
||||
for domain, info in conn.get('dns_resolution', {}).items():
|
||||
status = '✓' if info['resolvable'] else '✗'
|
||||
print(f" DNS {domain}: {status}")
|
||||
|
||||
# Peers section
|
||||
print("\n[Peer Analysis]")
|
||||
peers = self.results.get('peers', {})
|
||||
print(f" Connected Peers: {peers.get('peer_count', 0)}")
|
||||
|
||||
if peers.get('peer_distribution'):
|
||||
print(" Geographic Distribution:")
|
||||
for country, count in list(peers['peer_distribution'].items())[:5]:
|
||||
print(f" {country}: {count} peers")
|
||||
|
||||
if peers.get('latency_stats'):
|
||||
stats = peers['latency_stats']
|
||||
print(f" Latency: {stats['avg']:.0f}ms avg (min: {stats['min']:.0f}ms, max: {stats['max']:.0f}ms)")
|
||||
|
||||
# Block propagation section
|
||||
print("\n[Block Propagation]")
|
||||
prop = self.results.get('block_propagation', {})
|
||||
print(f" Latest Block: {prop.get('latest_block', 0):,}")
|
||||
print(f" Block Age: {prop.get('block_age', 0)} seconds")
|
||||
print(f" Uncle Rate: {prop.get('uncle_rate', 0):.2f}%")
|
||||
|
||||
# Performance section
|
||||
print("\n[Performance]")
|
||||
perf = self.results.get('performance', {})
|
||||
print(f" RPC Response Time: {perf.get('rpc_response_time', 0):.0f}ms")
|
||||
print(f" WebSocket: {'✓' if perf.get('ws_connection') else '✗'}")
|
||||
|
||||
# Recommendations
|
||||
recommendations = self.generate_recommendations()
|
||||
if recommendations:
|
||||
print("\n[Recommendations]")
|
||||
for i, rec in enumerate(recommendations, 1):
|
||||
print(f" {i}. {rec}")
|
||||
|
||||
print("\n" + "="*60)
|
||||
|
||||
def save_report(self, filename: str):
|
||||
"""Save detailed report to file"""
|
||||
report = {
|
||||
'timestamp': datetime.now().isoformat(),
|
||||
'node_url': self.node_url,
|
||||
'results': self.results,
|
||||
'recommendations': self.generate_recommendations()
|
||||
}
|
||||
|
||||
with open(filename, 'w') as f:
|
||||
json.dump(report, f, indent=2)
|
||||
|
||||
print(f"\nDetailed report saved to: {filename}")
|
||||
|
||||
def main():
|
||||
import argparse
|
||||
|
||||
parser = argparse.ArgumentParser(description='AITBC Network Diagnostics')
|
||||
parser.add_argument('--node', default='http://localhost:8545', help='Node URL')
|
||||
parser.add_argument('--output', help='Save report to file')
|
||||
parser.add_argument('--quick', action='store_true', help='Quick diagnostics')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Run diagnostics
|
||||
diag = NetworkDiagnostics(args.node)
|
||||
|
||||
print("Running AITBC network diagnostics...")
|
||||
print("-" * 40)
|
||||
|
||||
# Run all tests
|
||||
diag.check_connectivity()
|
||||
|
||||
if not args.quick:
|
||||
diag.analyze_peers()
|
||||
diag.test_block_propagation()
|
||||
diag.check_fork_status()
|
||||
diag.analyze_network_performance()
|
||||
|
||||
# Print report
|
||||
diag.print_report()
|
||||
|
||||
# Save if requested
|
||||
if args.output:
|
||||
diag.save_report(args.output)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
248
.windsurf/skills/blockchain-operations/node-health.sh
Executable file
248
.windsurf/skills/blockchain-operations/node-health.sh
Executable file
@@ -0,0 +1,248 @@
|
||||
#!/bin/bash
|
||||
|
||||
# AITBC Node Health Check Script
|
||||
# Monitors and reports on blockchain node health
|
||||
|
||||
set -e
|
||||
|
||||
# Configuration
|
||||
NODE_URL="http://localhost:8545"
|
||||
LOG_FILE="/var/log/aitbc/node-health.log"
|
||||
ALERT_THRESHOLD=90 # Sync threshold percentage
|
||||
|
||||
# Colors
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
BLUE='\033[0;34m'
|
||||
NC='\033[0m'
|
||||
|
||||
# Logging function
|
||||
log() {
|
||||
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
|
||||
}
|
||||
|
||||
# JSON RPC call function
|
||||
rpc_call() {
|
||||
local method=$1
|
||||
local params=$2
|
||||
curl -s -X POST $NODE_URL \
|
||||
-H "Content-Type: application/json" \
|
||||
-d "{\"jsonrpc\":\"2.0\",\"method\":\"$method\",\"params\":$params,\"id\":1}" \
|
||||
| jq -r '.result'
|
||||
}
|
||||
|
||||
# Check if node is running
|
||||
check_node_running() {
|
||||
echo -e "\n${BLUE}=== Checking Node Status ===${NC}"
|
||||
|
||||
if pgrep -f "aitbc-node" > /dev/null; then
|
||||
echo -e "${GREEN}✓${NC} AITBC node process is running"
|
||||
log "Node process: RUNNING"
|
||||
else
|
||||
echo -e "${RED}✗${NC} AITBC node is not running"
|
||||
log "Node process: NOT RUNNING"
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
# Check sync status
|
||||
check_sync_status() {
|
||||
echo -e "\n${BLUE}=== Checking Sync Status ===${NC}"
|
||||
|
||||
local sync_result=$(rpc_call "eth_syncing" "[]")
|
||||
|
||||
if [ "$sync_result" = "false" ]; then
|
||||
echo -e "${GREEN}✓${NC} Node is fully synchronized"
|
||||
log "Sync status: FULLY SYNCED"
|
||||
else
|
||||
local current_block=$(echo $sync_result | jq -r '.currentBlock')
|
||||
local highest_block=$(echo $sync_result | jq -r '.highestBlock')
|
||||
local sync_percent=$(echo "scale=2; $current_block * 100 / $highest_block" | bc)
|
||||
|
||||
if (( $(echo "$sync_percent > $ALERT_THRESHOLD" | bc -l) )); then
|
||||
echo -e "${YELLOW}⚠${NC} Node syncing: ${sync_percent}% (Block $current_block / $highest_block)"
|
||||
log "Sync status: SYNCING at ${sync_percent}%"
|
||||
else
|
||||
echo -e "${RED}✗${NC} Node far behind: ${sync_percent}% (Block $current_block / $highest_block)"
|
||||
log "Sync status: FAR BEHIND at ${sync_percent}%"
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
# Check peer connections
|
||||
check_peers() {
|
||||
echo -e "\n${BLUE}=== Checking Peer Connections ===${NC}"
|
||||
|
||||
local peer_count=$(rpc_call "net_peerCount" "[]")
|
||||
local peer_count_dec=$((peer_count))
|
||||
|
||||
if [ $peer_count_dec -gt 0 ]; then
|
||||
echo -e "${GREEN}✓${NC} Connected to $peer_count_dec peers"
|
||||
log "Peer count: $peer_count_dec"
|
||||
|
||||
# Get detailed peer info
|
||||
local peers=$(rpc_call "admin_peers" "[]")
|
||||
local active_peers=$(echo $peers | jq '. | length')
|
||||
echo -e " Active peers: $active_peers"
|
||||
|
||||
# Show peer countries
|
||||
echo -e "\n Peer Distribution:"
|
||||
echo $peers | jq -r '.[].network.remoteAddress' | cut -d: -f1 | sort | uniq -c | sort -nr | head -5 | while read count ip; do
|
||||
country=$(geoiplookup $ip 2>/dev/null | awk -F': ' '{print $2}' | awk -F',' '{print $1}' || echo "Unknown")
|
||||
echo " $country: $count peers"
|
||||
done
|
||||
else
|
||||
echo -e "${RED}✗${NC} No peer connections"
|
||||
log "Peer count: 0 - CRITICAL"
|
||||
fi
|
||||
}
|
||||
|
||||
# Check block propagation
|
||||
check_block_propagation() {
|
||||
echo -e "\n${BLUE}=== Checking Block Propagation ===${NC}"
|
||||
|
||||
local latest_block=$(rpc_call "eth_getBlockByNumber" '["latest", false]')
|
||||
local block_number=$(echo $latest_block | jq -r '.number')
|
||||
local block_timestamp=$(echo $latest_block | jq -r '.timestamp')
|
||||
local current_time=$(date +%s)
|
||||
local block_age=$((current_time - block_timestamp))
|
||||
|
||||
if [ $block_age -lt 30 ]; then
|
||||
echo -e "${GREEN}✓${NC} Latest block received ${block_age} seconds ago"
|
||||
log "Block propagation: ${block_age}s ago - GOOD"
|
||||
elif [ $block_age -lt 120 ]; then
|
||||
echo -e "${YELLOW}⚠${NC} Latest block received ${block_age} seconds ago"
|
||||
log "Block propagation: ${block_age}s ago - SLOW"
|
||||
else
|
||||
echo -e "${RED}✗${NC} Stale block (${block_age} seconds old)"
|
||||
log "Block propagation: ${block_age}s ago - CRITICAL"
|
||||
fi
|
||||
|
||||
# Show block details
|
||||
local gas_limit=$(echo $latest_block | jq -r '.gasLimit')
|
||||
local gas_used=$(echo $latest_block | jq -r '.gasUsed')
|
||||
local utilization=$(echo "scale=2; $gas_used * 100 / $gas_limit" | bc)
|
||||
echo -e " Block #$(($block_number)) - Gas utilization: ${utilization}%"
|
||||
}
|
||||
|
||||
# Check resource usage
|
||||
check_resources() {
|
||||
echo -e "\n${BLUE}=== Checking Resource Usage ===${NC}"
|
||||
|
||||
# Memory usage
|
||||
local node_pid=$(pgrep -f "aitbc-node")
|
||||
if [ -n "$node_pid" ]; then
|
||||
local memory=$(ps -p $node_pid -o rss= | awk '{print $1/1024 " MB"}')
|
||||
local cpu=$(ps -p $node_pid -o %cpu= | awk '{print $1 "%"}')
|
||||
|
||||
echo -e " Memory usage: $memory"
|
||||
echo -e " CPU usage: $cpu"
|
||||
log "Resource usage - Memory: $memory, CPU: $cpu"
|
||||
|
||||
# Check if memory usage is high
|
||||
local memory_mb=$(ps -p $node_pid -o rss= | awk '{print $1}')
|
||||
if [ $memory_mb -gt 8388608 ]; then # 8GB
|
||||
echo -e "${YELLOW}⚠${NC} High memory usage detected"
|
||||
fi
|
||||
fi
|
||||
|
||||
# Disk usage for blockchain data
|
||||
local blockchain_dir="/var/lib/aitbc/blockchain"
|
||||
if [ -d "$blockchain_dir" ]; then
|
||||
local disk_usage=$(du -sh $blockchain_dir | awk '{print $1}')
|
||||
echo -e " Blockchain data size: $disk_usage"
|
||||
fi
|
||||
}
|
||||
|
||||
# Check consensus status
|
||||
check_consensus() {
|
||||
echo -e "\n${BLUE}=== Checking Consensus Status ===${NC}"
|
||||
|
||||
# Get latest block and verify consensus
|
||||
local latest_block=$(rpc_call "eth_getBlockByNumber" '["latest", false]')
|
||||
local block_hash=$(echo $latest_block | jq -r '.hash')
|
||||
local difficulty=$(echo $latest_block | jq -r '.difficulty')
|
||||
|
||||
echo -e " Latest block hash: ${block_hash:0:10}..."
|
||||
echo -e " Difficulty: $difficulty"
|
||||
|
||||
# Check for consensus alerts
|
||||
local chain_id=$(rpc_call "eth_chainId" "[]")
|
||||
echo -e " Chain ID: $chain_id"
|
||||
|
||||
log "Consensus check - Block: ${block_hash:0:10}..., Chain: $chain_id"
|
||||
}
|
||||
|
||||
# Generate health report
|
||||
generate_report() {
|
||||
echo -e "\n${BLUE}=== Health Report Summary ===${NC}"
|
||||
|
||||
# Overall status
|
||||
local score=0
|
||||
local total=5
|
||||
|
||||
# Node running
|
||||
if pgrep -f "aitbc-node" > /dev/null; then
|
||||
((score++))
|
||||
fi
|
||||
|
||||
# Sync status
|
||||
local sync_result=$(rpc_call "eth_syncing" "[]")
|
||||
if [ "$sync_result" = "false" ]; then
|
||||
((score++))
|
||||
fi
|
||||
|
||||
# Peers
|
||||
local peer_count=$(rpc_call "net_peerCount" "[]")
|
||||
if [ $((peer_count)) -gt 0 ]; then
|
||||
((score++))
|
||||
fi
|
||||
|
||||
# Block propagation
|
||||
local latest_block=$(rpc_call "eth_getBlockByNumber" '["latest", false]')
|
||||
local block_timestamp=$(echo $latest_block | jq -r '.timestamp')
|
||||
local current_time=$(date +%s)
|
||||
local block_age=$((current_time - block_timestamp))
|
||||
if [ $block_age -lt 30 ]; then
|
||||
((score++))
|
||||
fi
|
||||
|
||||
# Resources
|
||||
local node_pid=$(pgrep -f "aitbc-node")
|
||||
if [ -n "$node_pid" ]; then
|
||||
((score++))
|
||||
fi
|
||||
|
||||
local health_percent=$((score * 100 / total))
|
||||
|
||||
if [ $health_percent -eq 100 ]; then
|
||||
echo -e "${GREEN}Overall Health: EXCELLENT (${health_percent}%)${NC}"
|
||||
elif [ $health_percent -ge 80 ]; then
|
||||
echo -e "${YELLOW}Overall Health: GOOD (${health_percent}%)${NC}"
|
||||
else
|
||||
echo -e "${RED}Overall Health: POOR (${health_percent}%)${NC}"
|
||||
fi
|
||||
|
||||
log "Health check completed - Score: ${score}/${total} (${health_percent}%)"
|
||||
}
|
||||
|
||||
# Main execution
|
||||
main() {
|
||||
log "Starting node health check"
|
||||
echo -e "${BLUE}AITBC Node Health Check${NC}"
|
||||
echo "============================"
|
||||
|
||||
check_node_running
|
||||
check_sync_status
|
||||
check_peers
|
||||
check_block_propagation
|
||||
check_resources
|
||||
check_consensus
|
||||
generate_report
|
||||
|
||||
echo -e "\n${BLUE}Health check completed. Log saved to: $LOG_FILE${NC}"
|
||||
}
|
||||
|
||||
# Run main function
|
||||
main "$@"
|
||||
329
.windsurf/skills/blockchain-operations/ollama-test-scenario.md
Normal file
329
.windsurf/skills/blockchain-operations/ollama-test-scenario.md
Normal file
@@ -0,0 +1,329 @@
|
||||
# Ollama GPU Inference Testing Scenario
|
||||
|
||||
## Overview
|
||||
|
||||
This document describes the complete end-to-end testing workflow for Ollama GPU inference jobs on the AITBC platform, from job submission to receipt generation.
|
||||
|
||||
## Test Architecture
|
||||
|
||||
```
|
||||
Client (CLI) → Coordinator API → GPU Miner (Host) → Ollama → Receipt → Blockchain
|
||||
↓ ↓ ↓ ↓ ↓ ↓
|
||||
Submit Job Queue Job Process Job Run Model Generate Record Tx
|
||||
Check Status Assign Miner Submit Result Metrics Receipt with Payment
|
||||
```
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### System Setup
|
||||
```bash
|
||||
# Repository location
|
||||
cd /home/oib/windsurf/aitbc
|
||||
|
||||
# Virtual environment
|
||||
source .venv/bin/activate
|
||||
|
||||
# Ensure services are running
|
||||
./scripts/aitbc-cli.sh health
|
||||
```
|
||||
|
||||
### Required Services
|
||||
- Coordinator API: http://127.0.0.1:18000
|
||||
- Ollama API: http://localhost:11434
|
||||
- GPU Miner Service: systemd service
|
||||
- Blockchain Node: http://127.0.0.1:19000
|
||||
|
||||
## Test Scenarios
|
||||
|
||||
### Scenario 1: Basic Inference Job
|
||||
|
||||
#### Step 1: Submit Job
|
||||
```bash
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "What is artificial intelligence?" \
|
||||
--model llama3.2:latest \
|
||||
--ttl 900
|
||||
|
||||
# Expected output:
|
||||
# ✅ Job submitted successfully!
|
||||
# Job ID: abc123def456...
|
||||
```
|
||||
|
||||
#### Step 2: Monitor Job Status
|
||||
```bash
|
||||
# Check status immediately
|
||||
./scripts/aitbc-cli.sh status abc123def456
|
||||
|
||||
# Expected: State = RUNNING
|
||||
|
||||
# Monitor until completion
|
||||
watch -n 2 "./scripts/aitbc-cli.sh status abc123def456"
|
||||
```
|
||||
|
||||
#### Step 3: Verify Completion
|
||||
```bash
|
||||
# Once completed, check receipt
|
||||
./scripts/aitbc-cli.sh receipts --job-id abc123def456
|
||||
|
||||
# Expected: Receipt with price > 0
|
||||
```
|
||||
|
||||
#### Step 4: Blockchain Verification
|
||||
```bash
|
||||
# View on blockchain explorer
|
||||
./scripts/aitbc-cli.sh browser --receipt-limit 1
|
||||
|
||||
# Expected: Transaction showing payment amount
|
||||
```
|
||||
|
||||
### Scenario 2: Concurrent Jobs Test
|
||||
|
||||
#### Submit Multiple Jobs
|
||||
```bash
|
||||
# Submit 5 jobs concurrently
|
||||
for i in {1..5}; do
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Explain topic $i in detail" \
|
||||
--model mistral:latest &
|
||||
done
|
||||
|
||||
# Wait for all to submit
|
||||
wait
|
||||
```
|
||||
|
||||
#### Monitor All Jobs
|
||||
```bash
|
||||
# Check all active jobs
|
||||
./scripts/aitbc-cli.sh admin-jobs
|
||||
|
||||
# Expected: Multiple RUNNING jobs, then COMPLETED
|
||||
```
|
||||
|
||||
#### Verify All Receipts
|
||||
```bash
|
||||
# List recent receipts
|
||||
./scripts/aitbc-cli.sh receipts --limit 5
|
||||
|
||||
# Expected: 5 receipts with different payment amounts
|
||||
```
|
||||
|
||||
### Scenario 3: Model Performance Test
|
||||
|
||||
#### Test Different Models
|
||||
```bash
|
||||
# Test with various models
|
||||
models=("llama3.2:latest" "mistral:latest" "deepseek-coder:6.7b-base" "qwen2.5:1.5b")
|
||||
|
||||
for model in "${models[@]}"; do
|
||||
echo "Testing model: $model"
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Write a Python hello world" \
|
||||
--model "$model" \
|
||||
--ttl 900
|
||||
done
|
||||
```
|
||||
|
||||
#### Compare Performance
|
||||
```bash
|
||||
# Check receipts for performance metrics
|
||||
./scripts/aitbc-cli.sh receipts --limit 10
|
||||
|
||||
# Note: Different models have different processing times and costs
|
||||
```
|
||||
|
||||
### Scenario 4: Error Handling Test
|
||||
|
||||
#### Test Job Expiration
|
||||
```bash
|
||||
# Submit job with very short TTL
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "This should expire" \
|
||||
--model llama3.2:latest \
|
||||
--ttl 5
|
||||
|
||||
# Wait for expiration
|
||||
sleep 10
|
||||
|
||||
# Check status
|
||||
./scripts/aitbc-cli.sh status <job_id>
|
||||
|
||||
# Expected: State = EXPIRED
|
||||
```
|
||||
|
||||
#### Test Job Cancellation
|
||||
```bash
|
||||
# Submit job
|
||||
job_id=$(./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Cancel me" \
|
||||
--model llama3.2:latest | grep "Job ID" | awk '{print $3}')
|
||||
|
||||
# Cancel immediately
|
||||
./scripts/aitbc-cli.sh cancel "$job_id"
|
||||
|
||||
# Verify cancellation
|
||||
./scripts/aitbc-cli.sh status "$job_id"
|
||||
|
||||
# Expected: State = CANCELED
|
||||
```
|
||||
|
||||
## Monitoring and Debugging
|
||||
|
||||
### Check Miner Health
|
||||
```bash
|
||||
# Systemd service status
|
||||
sudo systemctl status aitbc-host-gpu-miner.service
|
||||
|
||||
# Real-time logs
|
||||
sudo journalctl -u aitbc-host-gpu-miner.service -f
|
||||
|
||||
# Manual run for debugging
|
||||
python3 scripts/gpu/gpu_miner_host.py
|
||||
```
|
||||
|
||||
### Verify Ollama Integration
|
||||
```bash
|
||||
# Check Ollama status
|
||||
curl http://localhost:11434/api/tags
|
||||
|
||||
# Test Ollama directly
|
||||
curl -X POST http://localhost:11434/api/generate \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"model": "llama3.2:latest", "prompt": "Hello", "stream": false}'
|
||||
```
|
||||
|
||||
### Check Coordinator API
|
||||
```bash
|
||||
# Health check
|
||||
curl http://127.0.0.1:18000/v1/health
|
||||
|
||||
# List registered miners
|
||||
curl -H "X-Api-Key: REDACTED_ADMIN_KEY" \
|
||||
http://127.0.0.1:18000/v1/admin/miners
|
||||
|
||||
# List all jobs
|
||||
curl -H "X-Api-Key: REDACTED_ADMIN_KEY" \
|
||||
http://127.0.0.1:18000/v1/admin/jobs
|
||||
```
|
||||
|
||||
## Expected Results
|
||||
|
||||
### Successful Job Flow
|
||||
1. **Submission**: Job ID returned, state = QUEUED
|
||||
2. **Acquisition**: Miner picks up job, state = RUNNING
|
||||
3. **Processing**: Ollama runs inference (visible in logs)
|
||||
4. **Completion**: Miner submits result, state = COMPLETED
|
||||
5. **Receipt**: Generated with:
|
||||
- units: Processing time in seconds
|
||||
- unit_price: 0.02 AITBC/second (default)
|
||||
- price: Total payment (units × unit_price)
|
||||
6. **Blockchain**: Transaction recorded with payment
|
||||
|
||||
### Sample Receipt
|
||||
```json
|
||||
{
|
||||
"receipt_id": "abc123...",
|
||||
"job_id": "def456...",
|
||||
"provider": "REDACTED_MINER_KEY",
|
||||
"client": "REDACTED_CLIENT_KEY",
|
||||
"status": "completed",
|
||||
"units": 2.5,
|
||||
"unit_type": "gpu_seconds",
|
||||
"unit_price": 0.02,
|
||||
"price": 0.05,
|
||||
"signature": "0x..."
|
||||
}
|
||||
```
|
||||
|
||||
## Common Issues and Solutions
|
||||
|
||||
### Jobs Stay RUNNING
|
||||
- **Cause**: Miner not running or not polling
|
||||
- **Solution**: Check miner service status and logs
|
||||
- **Command**: `sudo systemctl restart aitbc-host-gpu-miner.service`
|
||||
|
||||
### No Payment in Receipt
|
||||
- **Cause**: Missing metrics in job result
|
||||
- **Solution**: Ensure miner submits duration/units
|
||||
- **Check**: `./scripts/aitbc-cli.sh receipts --job-id <id>`
|
||||
|
||||
### Ollama Connection Failed
|
||||
- **Cause**: Ollama not running or wrong port
|
||||
- **Solution**: Start Ollama service
|
||||
- **Command**: `ollama serve`
|
||||
|
||||
### GPU Not Detected
|
||||
- **Cause**: NVIDIA drivers not installed
|
||||
- **Solution**: Install drivers and verify
|
||||
- **Command**: `nvidia-smi`
|
||||
|
||||
## Performance Metrics
|
||||
|
||||
### Expected Processing Times
|
||||
- llama3.2:latest: ~1-3 seconds per response
|
||||
- mistral:latest: ~1-2 seconds per response
|
||||
- deepseek-coder:6.7b-base: ~2-4 seconds per response
|
||||
- qwen2.5:1.5b: ~0.5-1 second per response
|
||||
|
||||
### Expected Costs
|
||||
- Default rate: 0.02 AITBC/second
|
||||
- Typical job cost: 0.02-0.1 AITBC
|
||||
- Minimum charge: 0.01 AITBC
|
||||
|
||||
## Automation Script
|
||||
|
||||
### End-to-End Test Script
|
||||
```bash
|
||||
#!/bin/bash
|
||||
# e2e-ollama-test.sh
|
||||
|
||||
set -e
|
||||
|
||||
echo "Starting Ollama E2E Test..."
|
||||
|
||||
# Check prerequisites
|
||||
echo "Checking services..."
|
||||
./scripts/aitbc-cli.sh health
|
||||
|
||||
# Start miner if needed
|
||||
if ! systemctl is-active --quiet aitbc-host-gpu-miner.service; then
|
||||
echo "Starting miner service..."
|
||||
sudo systemctl start aitbc-host-gpu-miner.service
|
||||
fi
|
||||
|
||||
# Submit test job
|
||||
echo "Submitting test job..."
|
||||
job_id=$(./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "E2E test: What is 2+2?" \
|
||||
--model llama3.2:latest | grep "Job ID" | awk '{print $3}')
|
||||
|
||||
echo "Job submitted: $job_id"
|
||||
|
||||
# Monitor job
|
||||
echo "Monitoring job..."
|
||||
while true; do
|
||||
status=$(./scripts/aitbc-cli.sh status "$job_id" | grep "State" | awk '{print $2}')
|
||||
echo "Status: $status"
|
||||
|
||||
if [ "$status" = "COMPLETED" ]; then
|
||||
echo "Job completed!"
|
||||
break
|
||||
elif [ "$status" = "FAILED" ] || [ "$status" = "CANCELED" ] || [ "$status" = "EXPIRED" ]; then
|
||||
echo "Job failed with status: $status"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
sleep 2
|
||||
done
|
||||
|
||||
# Verify receipt
|
||||
echo "Checking receipt..."
|
||||
./scripts/aitbc-cli.sh receipts --job-id "$job_id"
|
||||
|
||||
echo "E2E test completed successfully!"
|
||||
```
|
||||
|
||||
Run with:
|
||||
```bash
|
||||
chmod +x e2e-ollama-test.sh
|
||||
./e2e-ollama-test.sh
|
||||
```
|
||||
268
.windsurf/skills/blockchain-operations/skill.md
Normal file
268
.windsurf/skills/blockchain-operations/skill.md
Normal file
@@ -0,0 +1,268 @@
|
||||
# Blockchain Operations Skill
|
||||
|
||||
This skill provides standardized procedures for managing AITBC blockchain nodes, verifying transactions, and optimizing mining operations, including end-to-end Ollama GPU inference testing.
|
||||
|
||||
## Overview
|
||||
|
||||
The blockchain operations skill ensures reliable management of all blockchain-related components including node synchronization, transaction processing, mining operations, and network health monitoring. It also includes comprehensive testing scenarios for Ollama-based GPU inference workflows.
|
||||
|
||||
## Capabilities
|
||||
|
||||
### Node Management
|
||||
- Node deployment and configuration
|
||||
- Sync status monitoring
|
||||
- Peer management
|
||||
- Network diagnostics
|
||||
|
||||
### Transaction Operations
|
||||
- Transaction verification and debugging
|
||||
- Gas optimization
|
||||
- Batch processing
|
||||
- Mempool management
|
||||
- Receipt generation and verification
|
||||
|
||||
### Mining Operations
|
||||
- Mining performance optimization
|
||||
- Pool management
|
||||
- Reward tracking
|
||||
- Hash rate optimization
|
||||
- GPU miner service management
|
||||
|
||||
### Ollama GPU Inference Testing
|
||||
- End-to-end job submission and processing
|
||||
- Miner registration and heartbeat monitoring
|
||||
- Job lifecycle management (submit → running → completed)
|
||||
- Receipt generation with payment amounts
|
||||
- Blockchain explorer verification
|
||||
|
||||
### Network Health
|
||||
- Network connectivity checks
|
||||
- Block propagation monitoring
|
||||
- Fork detection and resolution
|
||||
- Consensus validation
|
||||
|
||||
## Common Workflows
|
||||
|
||||
### 1. Node Health Check
|
||||
- Verify node synchronization
|
||||
- Check peer connections
|
||||
- Validate consensus rules
|
||||
- Monitor resource usage
|
||||
|
||||
### 2. Transaction Debugging
|
||||
- Trace transaction lifecycle
|
||||
- Verify gas usage
|
||||
- Check receipt status
|
||||
- Debug failed transactions
|
||||
|
||||
### 3. Mining Optimization
|
||||
- Analyze mining performance
|
||||
- Optimize GPU settings
|
||||
- Configure mining pools
|
||||
- Monitor profitability
|
||||
|
||||
### 4. Network Diagnostics
|
||||
- Test connectivity to peers
|
||||
- Analyze block propagation
|
||||
- Detect network partitions
|
||||
- Validate consensus state
|
||||
|
||||
### 5. Ollama End-to-End Testing
|
||||
```bash
|
||||
# Setup environment
|
||||
cd /home/oib/windsurf/aitbc
|
||||
source .venv/bin/activate
|
||||
|
||||
# Check all services
|
||||
./scripts/aitbc-cli.sh health
|
||||
|
||||
# Start GPU miner service
|
||||
sudo systemctl restart aitbc-host-gpu-miner.service
|
||||
sudo journalctl -u aitbc-host-gpu-miner.service -f
|
||||
|
||||
# Submit inference job
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Explain quantum computing" \
|
||||
--model llama3.2:latest \
|
||||
--ttl 900
|
||||
|
||||
# Monitor job progress
|
||||
./scripts/aitbc-cli.sh status <job_id>
|
||||
|
||||
# View blockchain receipt
|
||||
./scripts/aitbc-cli.sh browser --receipt-limit 5
|
||||
|
||||
# Verify payment in receipt
|
||||
./scripts/aitbc-cli.sh receipts --job-id <job_id>
|
||||
```
|
||||
|
||||
### 6. Job Lifecycle Testing
|
||||
1. **Submission**: Client submits job via CLI
|
||||
2. **Queued**: Job enters queue, waits for miner
|
||||
3. **Acquisition**: Miner polls and acquires job
|
||||
4. **Processing**: Miner runs Ollama inference
|
||||
5. **Completion**: Miner submits result with metrics
|
||||
6. **Receipt**: System generates signed receipt with payment
|
||||
7. **Blockchain**: Transaction recorded on blockchain
|
||||
|
||||
### 7. Miner Service Management
|
||||
```bash
|
||||
# Check miner status
|
||||
sudo systemctl status aitbc-host-gpu-miner.service
|
||||
|
||||
# View miner logs
|
||||
sudo journalctl -u aitbc-host-gpu-miner.service -n 100
|
||||
|
||||
# Restart miner service
|
||||
sudo systemctl restart aitbc-host-gpu-miner.service
|
||||
|
||||
# Run miner manually for debugging
|
||||
python3 scripts/gpu/gpu_miner_host.py
|
||||
|
||||
# Check registered miners
|
||||
./scripts/aitbc-cli.sh admin-miners
|
||||
|
||||
# View active jobs
|
||||
./scripts/aitbc-cli.sh admin-jobs
|
||||
```
|
||||
|
||||
## Testing Scenarios
|
||||
|
||||
### Basic Inference Test
|
||||
```bash
|
||||
# Submit simple inference
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Hello AITBC" \
|
||||
--model llama3.2:latest
|
||||
|
||||
# Expected flow:
|
||||
# 1. Job submitted → RUNNING
|
||||
# 2. Miner picks up job
|
||||
# 3. Ollama processes inference
|
||||
# 4. Job status → COMPLETED
|
||||
# 5. Receipt generated with payment amount
|
||||
```
|
||||
|
||||
### Stress Testing Multiple Jobs
|
||||
```bash
|
||||
# Submit multiple jobs concurrently
|
||||
for i in {1..5}; do
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Test job $i: Explain AI" \
|
||||
--model mistral:latest &
|
||||
done
|
||||
|
||||
# Monitor all jobs
|
||||
./scripts/aitbc-cli.sh admin-jobs
|
||||
```
|
||||
|
||||
### Payment Verification Test
|
||||
```bash
|
||||
# Submit job with specific model
|
||||
./scripts/aitbc-cli.sh submit inference \
|
||||
--prompt "Detailed analysis" \
|
||||
--model deepseek-r1:14b
|
||||
|
||||
# After completion, check receipt
|
||||
./scripts/aitbc-cli.sh receipts --limit 1
|
||||
|
||||
# Verify transaction on blockchain
|
||||
./scripts/aitbc-cli.sh browser --receipt-limit 1
|
||||
|
||||
# Expected: Receipt shows units, unit_price, and total price
|
||||
```
|
||||
|
||||
## Supporting Files
|
||||
|
||||
- `node-health.sh` - Comprehensive node health monitoring
|
||||
- `tx-tracer.py` - Transaction tracing and debugging tool
|
||||
- `mining-optimize.sh` - GPU mining optimization script
|
||||
- `network-diag.py` - Network diagnostics and analysis
|
||||
- `sync-monitor.py` - Real-time sync status monitor
|
||||
- `scripts/gpu/gpu_miner_host.py` - Host GPU miner client with Ollama integration
|
||||
- `aitbc-cli.sh` - Bash CLI wrapper for all operations
|
||||
- `ollama-test-scenario.md` - Detailed Ollama testing documentation
|
||||
|
||||
## Usage
|
||||
|
||||
This skill is automatically invoked when you request blockchain-related operations such as:
|
||||
- "check node status"
|
||||
- "debug transaction"
|
||||
- "optimize mining"
|
||||
- "network diagnostics"
|
||||
- "test ollama inference"
|
||||
- "submit gpu job"
|
||||
- "verify payment receipt"
|
||||
|
||||
## Safety Features
|
||||
|
||||
- Automatic backup of node data before operations
|
||||
- Validation of all transactions before processing
|
||||
- Safe mining parameter adjustments
|
||||
- Rollback capability for configuration changes
|
||||
- Job expiration handling (15 minutes TTL)
|
||||
- Graceful miner shutdown and restart
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- AITBC node installed and configured
|
||||
- GPU drivers installed (for mining operations)
|
||||
- Ollama installed and running with models
|
||||
- Proper network connectivity
|
||||
- Sufficient disk space for blockchain data
|
||||
- Virtual environment with dependencies installed
|
||||
- systemd service for GPU miner
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Jobs Stuck in RUNNING
|
||||
1. Check if miner is running: `sudo systemctl status aitbc-host-gpu-miner.service`
|
||||
2. View miner logs: `sudo journalctl -u aitbc-host-gpu-miner.service -f`
|
||||
3. Verify coordinator API: `./scripts/aitbc-cli.sh health`
|
||||
4. Cancel stuck jobs: `./scripts/aitbc-cli.sh cancel <job_id>`
|
||||
|
||||
### No Payment in Receipt
|
||||
1. Check job completed successfully
|
||||
2. Verify metrics include duration or units
|
||||
3. Check receipt service logs
|
||||
4. Ensure miner submitted result with metrics
|
||||
|
||||
### Miner Not Processing Jobs
|
||||
1. Restart miner service
|
||||
2. Check Ollama is running: `curl http://localhost:11434/api/tags`
|
||||
3. Verify GPU availability: `nvidia-smi`
|
||||
4. Check miner registration: `./scripts/aitbc-cli.sh admin-miners`
|
||||
|
||||
## Key Components
|
||||
|
||||
### Coordinator API Endpoints
|
||||
- POST /v1/jobs/create - Submit new job
|
||||
- GET /v1/jobs/{id}/status - Check job status
|
||||
- POST /v1/miners/register - Register miner
|
||||
- POST /v1/miners/poll - Poll for jobs
|
||||
- POST /v1/miners/{id}/result - Submit job result
|
||||
|
||||
### CLI Commands
|
||||
- `submit` - Submit inference job
|
||||
- `status` - Check job status
|
||||
- `browser` - View blockchain state
|
||||
- `receipts` - List payment receipts
|
||||
- `admin-miners` - List registered miners
|
||||
- `admin-jobs` - List all jobs
|
||||
- `cancel` - Cancel stuck job
|
||||
|
||||
### Receipt Structure
|
||||
```json
|
||||
{
|
||||
"receipt_id": "...",
|
||||
"job_id": "...",
|
||||
"provider": "REDACTED_MINER_KEY",
|
||||
"client": "REDACTED_CLIENT_KEY",
|
||||
"status": "completed",
|
||||
"units": 1.234,
|
||||
"unit_type": "gpu_seconds",
|
||||
"unit_price": 0.02,
|
||||
"price": 0.02468,
|
||||
"signature": "..."
|
||||
}
|
||||
```
|
||||
313
.windsurf/skills/blockchain-operations/sync-monitor.py
Executable file
313
.windsurf/skills/blockchain-operations/sync-monitor.py
Executable file
@@ -0,0 +1,313 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
AITBC Blockchain Sync Monitor
|
||||
Real-time monitoring of blockchain synchronization status
|
||||
"""
|
||||
|
||||
import time
|
||||
import json
|
||||
import sys
|
||||
import requests
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Dict, List, Optional, Tuple
|
||||
import threading
|
||||
import signal
|
||||
|
||||
class SyncMonitor:
|
||||
def __init__(self, node_url: str = "http://localhost:8545"):
|
||||
"""Initialize the sync monitor"""
|
||||
self.node_url = node_url
|
||||
self.running = False
|
||||
self.start_time = None
|
||||
self.last_block = 0
|
||||
self.sync_history = []
|
||||
self.max_history = 100
|
||||
|
||||
# ANSI colors for terminal output
|
||||
self.colors = {
|
||||
'red': '\033[91m',
|
||||
'green': '\033[92m',
|
||||
'yellow': '\033[93m',
|
||||
'blue': '\033[94m',
|
||||
'magenta': '\033[95m',
|
||||
'cyan': '\033[96m',
|
||||
'white': '\033[97m',
|
||||
'end': '\033[0m'
|
||||
}
|
||||
|
||||
def rpc_call(self, method: str, params: List = None) -> Optional[Dict]:
|
||||
"""Make JSON-RPC call to node"""
|
||||
try:
|
||||
response = requests.post(
|
||||
self.node_url,
|
||||
json={
|
||||
"jsonrpc": "2.0",
|
||||
"method": method,
|
||||
"params": params or [],
|
||||
"id": 1
|
||||
},
|
||||
timeout=5
|
||||
)
|
||||
return response.json().get('result')
|
||||
except Exception as e:
|
||||
return None
|
||||
|
||||
def get_sync_status(self) -> Dict:
|
||||
"""Get current sync status"""
|
||||
sync_result = self.rpc_call("eth_syncing")
|
||||
|
||||
if sync_result is False:
|
||||
# Fully synced
|
||||
latest_block = self.rpc_call("eth_blockNumber")
|
||||
return {
|
||||
'syncing': False,
|
||||
'current_block': int(latest_block, 16) if latest_block else 0,
|
||||
'highest_block': int(latest_block, 16) if latest_block else 0,
|
||||
'sync_percent': 100.0
|
||||
}
|
||||
else:
|
||||
# Still syncing
|
||||
current = int(sync_result.get('currentBlock', '0x0'), 16)
|
||||
highest = int(sync_result.get('highestBlock', '0x0'), 16)
|
||||
percent = (current / highest * 100) if highest > 0 else 0
|
||||
|
||||
return {
|
||||
'syncing': True,
|
||||
'current_block': current,
|
||||
'highest_block': highest,
|
||||
'sync_percent': percent,
|
||||
'starting_block': int(sync_result.get('startingBlock', '0x0'), 16),
|
||||
'pulled_states': sync_result.get('pulledStates', '0x0'),
|
||||
'known_states': sync_result.get('knownStates', '0x0')
|
||||
}
|
||||
|
||||
def get_peer_count(self) -> int:
|
||||
"""Get number of connected peers"""
|
||||
result = self.rpc_call("net_peerCount")
|
||||
return int(result, 16) if result else 0
|
||||
|
||||
def get_block_time(self, block_number: int) -> Optional[datetime]:
|
||||
"""Get block timestamp"""
|
||||
block = self.rpc_call("eth_getBlockByNumber", [hex(block_number), False])
|
||||
if block and 'timestamp' in block:
|
||||
return datetime.fromtimestamp(int(block['timestamp'], 16))
|
||||
return None
|
||||
|
||||
def calculate_sync_speed(self) -> Optional[float]:
|
||||
"""Calculate current sync speed (blocks/second)"""
|
||||
if len(self.sync_history) < 2:
|
||||
return None
|
||||
|
||||
# Get last two data points
|
||||
recent = self.sync_history[-2:]
|
||||
blocks_diff = recent[1]['current_block'] - recent[0]['current_block']
|
||||
time_diff = (recent[1]['timestamp'] - recent[0]['timestamp']).total_seconds()
|
||||
|
||||
if time_diff > 0:
|
||||
return blocks_diff / time_diff
|
||||
return None
|
||||
|
||||
def estimate_time_remaining(self, current: int, target: int, speed: float) -> str:
|
||||
"""Estimate time remaining to sync"""
|
||||
if speed <= 0:
|
||||
return "Unknown"
|
||||
|
||||
blocks_remaining = target - current
|
||||
seconds_remaining = blocks_remaining / speed
|
||||
|
||||
if seconds_remaining < 60:
|
||||
return f"{int(seconds_remaining)} seconds"
|
||||
elif seconds_remaining < 3600:
|
||||
return f"{int(seconds_remaining / 60)} minutes"
|
||||
elif seconds_remaining < 86400:
|
||||
return f"{int(seconds_remaining / 3600)} hours"
|
||||
else:
|
||||
return f"{int(seconds_remaining / 86400)} days"
|
||||
|
||||
def print_status_bar(self, status: Dict):
|
||||
"""Print a visual sync status bar"""
|
||||
width = 50
|
||||
filled = int(width * status['sync_percent'] / 100)
|
||||
bar = '█' * filled + '░' * (width - filled)
|
||||
|
||||
color = self.colors['green'] if status['sync_percent'] > 90 else \
|
||||
self.colors['yellow'] if status['sync_percent'] > 50 else \
|
||||
self.colors['red']
|
||||
|
||||
print(f"\r{color}[{bar}]{self.colors['end']} {status['sync_percent']:.2f}%", end='', flush=True)
|
||||
|
||||
def print_detailed_status(self, status: Dict, speed: float, peers: int):
|
||||
"""Print detailed sync information"""
|
||||
print(f"\n{'='*60}")
|
||||
print(f"{self.colors['cyan']}AITBC Blockchain Sync Monitor{self.colors['end']}")
|
||||
print(f"{'='*60}")
|
||||
|
||||
# Sync status
|
||||
if status['syncing']:
|
||||
print(f"\n{self.colors['yellow']}Syncing...{self.colors['end']}")
|
||||
else:
|
||||
print(f"\n{self.colors['green']}Fully Synchronized!{self.colors['end']}")
|
||||
|
||||
# Block information
|
||||
print(f"\n{self.colors['blue']}Block Information:{self.colors['end']}")
|
||||
print(f" Current: {status['current_block']:,}")
|
||||
print(f" Highest: {status['highest_block']:,}")
|
||||
print(f" Progress: {status['sync_percent']:.2f}%")
|
||||
|
||||
if status['syncing'] and speed:
|
||||
eta = self.estimate_time_remaining(
|
||||
status['current_block'],
|
||||
status['highest_block'],
|
||||
speed
|
||||
)
|
||||
print(f" ETA: {eta}")
|
||||
|
||||
# Sync speed
|
||||
if speed:
|
||||
print(f"\n{self.colors['blue']}Sync Speed:{self.colors['end']}")
|
||||
print(f" {speed:.2f} blocks/second")
|
||||
|
||||
# Calculate blocks per minute/hour
|
||||
print(f" {speed * 60:.0f} blocks/minute")
|
||||
print(f" {speed * 3600:.0f} blocks/hour")
|
||||
|
||||
# Network information
|
||||
print(f"\n{self.colors['blue']}Network:{self.colors['end']}")
|
||||
print(f" Peers connected: {peers}")
|
||||
|
||||
# State sync (if available)
|
||||
if status.get('pulled_states') and status.get('known_states'):
|
||||
pulled = int(status['pulled_states'], 16)
|
||||
known = int(status['known_states'], 16)
|
||||
if known > 0:
|
||||
state_percent = (pulled / known) * 100
|
||||
print(f" State sync: {state_percent:.2f}%")
|
||||
|
||||
# Time information
|
||||
if self.start_time:
|
||||
elapsed = datetime.now() - self.start_time
|
||||
print(f"\n{self.colors['blue']}Time:{self.colors['end']}")
|
||||
print(f" Started: {self.start_time.strftime('%H:%M:%S')}")
|
||||
print(f" Elapsed: {str(elapsed).split('.')[0]}")
|
||||
|
||||
def monitor_loop(self, interval: int = 5, detailed: bool = False):
|
||||
"""Main monitoring loop"""
|
||||
self.running = True
|
||||
self.start_time = datetime.now()
|
||||
|
||||
print(f"Starting sync monitor (interval: {interval}s)")
|
||||
print("Press Ctrl+C to stop\n")
|
||||
|
||||
try:
|
||||
while self.running:
|
||||
# Get current status
|
||||
status = self.get_sync_status()
|
||||
peers = self.get_peer_count()
|
||||
|
||||
# Add to history
|
||||
status['timestamp'] = datetime.now()
|
||||
self.sync_history.append(status)
|
||||
if len(self.sync_history) > self.max_history:
|
||||
self.sync_history.pop(0)
|
||||
|
||||
# Calculate sync speed
|
||||
speed = self.calculate_sync_speed()
|
||||
|
||||
# Display
|
||||
if detailed:
|
||||
self.print_detailed_status(status, speed, peers)
|
||||
else:
|
||||
self.print_status_bar(status)
|
||||
|
||||
# Check if fully synced
|
||||
if not status['syncing']:
|
||||
if not detailed:
|
||||
print() # New line after status bar
|
||||
print(f"\n{self.colors['green']}✓ Sync completed!{self.colors['end']}")
|
||||
break
|
||||
|
||||
# Wait for next interval
|
||||
time.sleep(interval)
|
||||
|
||||
except KeyboardInterrupt:
|
||||
self.running = False
|
||||
print(f"\n\n{self.colors['yellow']}Sync monitor stopped by user{self.colors['end']}")
|
||||
|
||||
# Print final summary
|
||||
self.print_summary()
|
||||
|
||||
def print_summary(self):
|
||||
"""Print sync summary"""
|
||||
if not self.sync_history:
|
||||
return
|
||||
|
||||
print(f"\n{self.colors['cyan']}Sync Summary{self.colors['end']}")
|
||||
print("-" * 40)
|
||||
|
||||
if self.start_time:
|
||||
total_time = datetime.now() - self.start_time
|
||||
print(f"Total time: {str(total_time).split('.')[0]}")
|
||||
|
||||
if len(self.sync_history) >= 2:
|
||||
blocks_synced = self.sync_history[-1]['current_block'] - self.sync_history[0]['current_block']
|
||||
print(f"Blocks synced: {blocks_synced:,}")
|
||||
|
||||
if total_time.total_seconds() > 0:
|
||||
avg_speed = blocks_synced / total_time.total_seconds()
|
||||
print(f"Average speed: {avg_speed:.2f} blocks/second")
|
||||
|
||||
def save_report(self, filename: str):
|
||||
"""Save sync report to file"""
|
||||
report = {
|
||||
'start_time': self.start_time.isoformat() if self.start_time else None,
|
||||
'end_time': datetime.now().isoformat(),
|
||||
'sync_history': [
|
||||
{
|
||||
'timestamp': entry['timestamp'].isoformat(),
|
||||
'current_block': entry['current_block'],
|
||||
'highest_block': entry['highest_block'],
|
||||
'sync_percent': entry['sync_percent']
|
||||
}
|
||||
for entry in self.sync_history
|
||||
]
|
||||
}
|
||||
|
||||
with open(filename, 'w') as f:
|
||||
json.dump(report, f, indent=2)
|
||||
|
||||
print(f"Report saved to: {filename}")
|
||||
|
||||
def signal_handler(signum, frame):
|
||||
"""Handle Ctrl+C"""
|
||||
print("\n\nStopping sync monitor...")
|
||||
sys.exit(0)
|
||||
|
||||
def main():
|
||||
import argparse
|
||||
|
||||
parser = argparse.ArgumentParser(description='AITBC Blockchain Sync Monitor')
|
||||
parser.add_argument('--node', default='http://localhost:8545', help='Node URL')
|
||||
parser.add_argument('--interval', type=int, default=5, help='Update interval (seconds)')
|
||||
parser.add_argument('--detailed', action='store_true', help='Show detailed output')
|
||||
parser.add_argument('--report', help='Save report to file')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
# Set up signal handler
|
||||
signal.signal(signal.SIGINT, signal_handler)
|
||||
|
||||
# Create and run monitor
|
||||
monitor = SyncMonitor(args.node)
|
||||
|
||||
try:
|
||||
monitor.monitor_loop(interval=args.interval, detailed=args.detailed)
|
||||
except Exception as e:
|
||||
print(f"Error: {e}")
|
||||
sys.exit(1)
|
||||
|
||||
# Save report if requested
|
||||
if args.report:
|
||||
monitor.save_report(args.report)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
273
.windsurf/skills/blockchain-operations/tx-tracer.py
Normal file
273
.windsurf/skills/blockchain-operations/tx-tracer.py
Normal file
@@ -0,0 +1,273 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
AITBC Transaction Tracer
|
||||
Comprehensive transaction debugging and analysis tool
|
||||
"""
|
||||
|
||||
import web3
|
||||
import json
|
||||
import sys
|
||||
import argparse
|
||||
from datetime import datetime
|
||||
from typing import Dict, List, Optional, Any
|
||||
|
||||
class TransactionTracer:
|
||||
def __init__(self, node_url: str = "http://localhost:8545"):
|
||||
"""Initialize the transaction tracer"""
|
||||
self.w3 = web3.Web3(web3.HTTPProvider(node_url))
|
||||
if not self.w3.is_connected():
|
||||
raise Exception("Failed to connect to AITBC node")
|
||||
|
||||
def trace_transaction(self, tx_hash: str) -> Dict[str, Any]:
|
||||
"""Trace a transaction and return comprehensive information"""
|
||||
try:
|
||||
# Get transaction details
|
||||
tx = self.w3.eth.get_transaction(tx_hash)
|
||||
receipt = self.w3.eth.get_transaction_receipt(tx_hash)
|
||||
|
||||
# Build trace result
|
||||
trace = {
|
||||
'hash': tx_hash,
|
||||
'status': 'success' if receipt.status == 1 else 'failed',
|
||||
'block_number': tx.blockNumber,
|
||||
'block_hash': tx.blockHash.hex(),
|
||||
'transaction_index': receipt.transactionIndex,
|
||||
'from_address': tx['from'],
|
||||
'to_address': tx.get('to'),
|
||||
'value': self.w3.from_wei(tx.value, 'ether'),
|
||||
'gas_limit': tx.gas,
|
||||
'gas_used': receipt.gasUsed,
|
||||
'gas_price': self.w3.from_wei(tx.gasPrice, 'gwei'),
|
||||
'effective_gas_price': self.w3.from_wei(receipt.effectiveGasPrice, 'gwei'),
|
||||
'nonce': tx.nonce,
|
||||
'max_fee_per_gas': None,
|
||||
'max_priority_fee_per_gas': None,
|
||||
'type': tx.get('type', 0)
|
||||
}
|
||||
|
||||
# EIP-1559 transaction fields
|
||||
if tx.get('type') == 2:
|
||||
trace['max_fee_per_gas'] = self.w3.from_wei(tx.maxFeePerGas, 'gwei')
|
||||
trace['max_priority_fee_per_gas'] = self.w3.from_wei(tx.maxPriorityFeePerGas, 'gwei')
|
||||
|
||||
# Calculate gas efficiency
|
||||
trace['gas_efficiency'] = f"{(receipt.gasUsed / tx.gas * 100):.2f}%"
|
||||
|
||||
# Get logs
|
||||
trace['logs'] = self._parse_logs(receipt.logs)
|
||||
|
||||
# Get contract creation info if applicable
|
||||
if tx.get('to') is None:
|
||||
trace['contract_created'] = receipt.contractAddress
|
||||
trace['contract_code'] = self.w3.eth.get_code(receipt.contractAddress).hex()
|
||||
|
||||
# Get internal transfers (if tracing is available)
|
||||
trace['internal_transfers'] = self._get_internal_transfers(tx_hash)
|
||||
|
||||
return trace
|
||||
|
||||
except Exception as e:
|
||||
return {'error': str(e), 'hash': tx_hash}
|
||||
|
||||
def _parse_logs(self, logs: List) -> List[Dict]:
|
||||
"""Parse transaction logs"""
|
||||
parsed_logs = []
|
||||
for log in logs:
|
||||
parsed_logs.append({
|
||||
'address': log.address,
|
||||
'topics': [topic.hex() for topic in log.topics],
|
||||
'data': log.data.hex(),
|
||||
'log_index': log.logIndex,
|
||||
'decoded': self._decode_log(log)
|
||||
})
|
||||
return parsed_logs
|
||||
|
||||
def _decode_log(self, log) -> Optional[Dict]:
|
||||
"""Attempt to decode log events"""
|
||||
# This would contain ABI decoding logic
|
||||
# For now, return basic info
|
||||
return {
|
||||
'signature': log.topics[0].hex() if log.topics else None,
|
||||
'event_name': 'Unknown' # Would be decoded from ABI
|
||||
}
|
||||
|
||||
def _get_internal_transfers(self, tx_hash: str) -> List[Dict]:
|
||||
"""Get internal ETH transfers (requires tracing)"""
|
||||
try:
|
||||
# Try debug_traceTransaction if available
|
||||
trace = self.w3.provider.make_request('debug_traceTransaction', [tx_hash, {}])
|
||||
transfers = []
|
||||
|
||||
# Parse trace for transfers
|
||||
if trace and 'result' in trace:
|
||||
# Implementation would parse the trace for CALL/DELEGATECALL with value
|
||||
pass
|
||||
|
||||
return transfers
|
||||
except:
|
||||
return []
|
||||
|
||||
def analyze_gas_usage(self, tx_hash: str) -> Dict[str, Any]:
|
||||
"""Analyze gas usage and provide optimization tips"""
|
||||
trace = self.trace_transaction(tx_hash)
|
||||
|
||||
if 'error' in trace:
|
||||
return trace
|
||||
|
||||
analysis = {
|
||||
'gas_used': trace['gas_used'],
|
||||
'gas_limit': trace['gas_limit'],
|
||||
'efficiency': trace['gas_efficiency'],
|
||||
'recommendations': []
|
||||
}
|
||||
|
||||
# Gas efficiency recommendations
|
||||
if trace['gas_used'] < trace['gas_limit'] * 0.5:
|
||||
analysis['recommendations'].append(
|
||||
f"Gas limit too high. Consider reducing to ~{int(trace['gas_used'] * 1.2)}"
|
||||
)
|
||||
|
||||
# Gas price analysis
|
||||
if trace['gas_price'] > 100: # High gas price threshold
|
||||
analysis['recommendations'].append(
|
||||
"High gas price detected. Consider using EIP-1559 or waiting for lower gas"
|
||||
)
|
||||
|
||||
return analysis
|
||||
|
||||
def debug_failed_transaction(self, tx_hash: str) -> Dict[str, Any]:
|
||||
"""Debug why a transaction failed"""
|
||||
trace = self.trace_transaction(tx_hash)
|
||||
|
||||
if trace.get('status') == 'success':
|
||||
return {'error': 'Transaction was successful', 'hash': tx_hash}
|
||||
|
||||
debug_info = {
|
||||
'hash': tx_hash,
|
||||
'failure_reason': 'Unknown',
|
||||
'possible_causes': [],
|
||||
'debug_steps': []
|
||||
}
|
||||
|
||||
# Check for common failure reasons
|
||||
debug_info['debug_steps'].append("1. Checking if transaction ran out of gas...")
|
||||
if trace['gas_used'] == trace['gas_limit']:
|
||||
debug_info['failure_reason'] = 'Out of gas'
|
||||
debug_info['possible_causes'].append('Transaction required more gas than provided')
|
||||
debug_info['debug_steps'].append(" ✓ Transaction ran out of gas")
|
||||
|
||||
debug_info['debug_steps'].append("2. Checking for revert reasons...")
|
||||
# Would implement revert reason decoding here
|
||||
debug_info['debug_steps'].append(" ✗ Could not decode revert reason")
|
||||
|
||||
debug_info['debug_steps'].append("3. Checking nonce issues...")
|
||||
# Would check for nonce problems
|
||||
debug_info['debug_steps'].append(" ✓ Nonce appears correct")
|
||||
|
||||
return debug_info
|
||||
|
||||
def monitor_mempool(self, address: str = None) -> Dict[str, Any]:
|
||||
"""Monitor transaction mempool"""
|
||||
try:
|
||||
# Get pending transactions
|
||||
pending_block = self.w3.eth.get_block('pending', full_transactions=True)
|
||||
pending_txs = pending_block.transactions
|
||||
|
||||
mempool_info = {
|
||||
'pending_count': len(pending_txs),
|
||||
'pending_by_address': {},
|
||||
'high_priority_txs': [],
|
||||
'stuck_txs': []
|
||||
}
|
||||
|
||||
# Analyze pending transactions
|
||||
for tx in pending_txs:
|
||||
from_addr = str(tx['from'])
|
||||
if from_addr not in mempool_info['pending_by_address']:
|
||||
mempool_info['pending_by_address'][from_addr] = 0
|
||||
mempool_info['pending_by_address'][from_addr] += 1
|
||||
|
||||
# High priority transactions (high gas price)
|
||||
if tx.gasPrice > web3.Web3.to_wei(50, 'gwei'):
|
||||
mempool_info['high_priority_txs'].append({
|
||||
'hash': tx.hash.hex(),
|
||||
'gas_price': web3.Web3.from_wei(tx.gasPrice, 'gwei'),
|
||||
'from': from_addr
|
||||
})
|
||||
|
||||
return mempool_info
|
||||
|
||||
except Exception as e:
|
||||
return {'error': str(e)}
|
||||
|
||||
def print_trace(self, trace: Dict[str, Any]):
|
||||
"""Print formatted transaction trace"""
|
||||
if 'error' in trace:
|
||||
print(f"Error: {trace['error']}")
|
||||
return
|
||||
|
||||
print(f"\n{'='*60}")
|
||||
print(f"Transaction Trace: {trace['hash']}")
|
||||
print(f"{'='*60}")
|
||||
print(f"Status: {trace['status'].upper()}")
|
||||
print(f"Block: #{trace['block_number']} ({trace['block_hash'][:10]}...)")
|
||||
print(f"From: {trace['from_address']}")
|
||||
print(f"To: {trace['to_address'] or 'Contract Creation'}")
|
||||
print(f"Value: {trace['value']} ETH")
|
||||
print(f"Gas Used: {trace['gas_used']:,} / {trace['gas_limit']:,} ({trace['gas_efficiency']})")
|
||||
print(f"Gas Price: {trace['gas_price']} gwei")
|
||||
if trace['max_fee_per_gas']:
|
||||
print(f"Max Fee: {trace['max_fee_per_gas']} gwei")
|
||||
print(f"Priority Fee: {trace['max_priority_fee_per_gas']} gwei")
|
||||
|
||||
if trace.get('contract_created'):
|
||||
print(f"\nContract Created: {trace['contract_created']}")
|
||||
|
||||
if trace['logs']:
|
||||
print(f"\nLogs ({len(trace['logs'])}):")
|
||||
for log in trace['logs'][:5]: # Show first 5 logs
|
||||
print(f" - {log['address']}: {log['decoded']['event_name'] or 'Unknown Event'}")
|
||||
|
||||
if trace['internal_transfers']:
|
||||
print(f"\nInternal Transfers:")
|
||||
for transfer in trace['internal_transfers']:
|
||||
print(f" {transfer['from']} -> {transfer['to']}: {transfer['value']} ETH")
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description='AITBC Transaction Tracer')
|
||||
parser.add_argument('command', choices=['trace', 'analyze', 'debug', 'mempool'])
|
||||
parser.add_argument('--tx', help='Transaction hash')
|
||||
parser.add_argument('--address', help='Address for mempool monitoring')
|
||||
parser.add_argument('--node', default='http://localhost:8545', help='Node URL')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
tracer = TransactionTracer(args.node)
|
||||
|
||||
if args.command == 'trace':
|
||||
if not args.tx:
|
||||
print("Error: Transaction hash required for trace command")
|
||||
sys.exit(1)
|
||||
trace = tracer.trace_transaction(args.tx)
|
||||
tracer.print_trace(trace)
|
||||
|
||||
elif args.command == 'analyze':
|
||||
if not args.tx:
|
||||
print("Error: Transaction hash required for analyze command")
|
||||
sys.exit(1)
|
||||
analysis = tracer.analyze_gas_usage(args.tx)
|
||||
print(json.dumps(analysis, indent=2))
|
||||
|
||||
elif args.command == 'debug':
|
||||
if not args.tx:
|
||||
print("Error: Transaction hash required for debug command")
|
||||
sys.exit(1)
|
||||
debug = tracer.debug_failed_transaction(args.tx)
|
||||
print(json.dumps(debug, indent=2))
|
||||
|
||||
elif args.command == 'mempool':
|
||||
mempool = tracer.monitor_mempool(args.address)
|
||||
print(json.dumps(mempool, indent=2))
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
76
.windsurf/skills/deploy-production/SKILL.md
Normal file
76
.windsurf/skills/deploy-production/SKILL.md
Normal file
@@ -0,0 +1,76 @@
|
||||
---
|
||||
name: deploy-production
|
||||
description: Automated production deployment workflow for AITBC blockchain components
|
||||
version: 1.0.0
|
||||
author: Cascade
|
||||
tags: [deployment, production, blockchain, aitbc]
|
||||
---
|
||||
|
||||
# Production Deployment Skill
|
||||
|
||||
This skill provides a standardized workflow for deploying AITBC components to production environments.
|
||||
|
||||
## Overview
|
||||
|
||||
The production deployment skill ensures safe, consistent, and verifiable deployments of all AITBC stack components including:
|
||||
- Coordinator services
|
||||
- Blockchain node
|
||||
- Miner daemon
|
||||
- Web applications
|
||||
- Infrastructure components
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Production server access configured
|
||||
- SSL certificates installed
|
||||
- Environment variables set
|
||||
- Backup procedures in place
|
||||
- Monitoring systems active
|
||||
|
||||
## Deployment Steps
|
||||
|
||||
### 1. Pre-deployment Checks
|
||||
- Run health checks on all services
|
||||
- Verify backup integrity
|
||||
- Check disk space and resources
|
||||
- Validate configuration files
|
||||
- Review recent changes
|
||||
|
||||
### 2. Environment Preparation
|
||||
- Update dependencies
|
||||
- Build new artifacts
|
||||
- Run smoke tests
|
||||
- Prepare rollback plan
|
||||
|
||||
### 3. Deployment Execution
|
||||
- Stop services gracefully
|
||||
- Deploy new code
|
||||
- Update configurations
|
||||
- Restart services
|
||||
- Verify health status
|
||||
|
||||
### 4. Post-deployment Verification
|
||||
- Run integration tests
|
||||
- Check API endpoints
|
||||
- Verify blockchain sync
|
||||
- Monitor system metrics
|
||||
- Validate user access
|
||||
|
||||
## Supporting Files
|
||||
|
||||
- `pre-deploy-checks.sh` - Automated pre-deployment validation
|
||||
- `environment-template.env` - Production environment template
|
||||
- `rollback-steps.md` - Emergency rollback procedures
|
||||
- `health-check.py` - Service health verification script
|
||||
|
||||
## Usage
|
||||
|
||||
This skill is automatically invoked when you request production deployment. You can also manually invoke it by mentioning "deploy production" or "production deployment".
|
||||
|
||||
## Safety Features
|
||||
|
||||
- Automatic rollback on failure
|
||||
- Service health monitoring
|
||||
- Configuration validation
|
||||
- Backup verification
|
||||
- Rollback checkpoint creation
|
||||
238
.windsurf/skills/deploy-production/health-check.py
Executable file
238
.windsurf/skills/deploy-production/health-check.py
Executable file
@@ -0,0 +1,238 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
AITBC Production Health Check Script
|
||||
Verifies the health of all AITBC services after deployment
|
||||
"""
|
||||
|
||||
import requests
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
from datetime import datetime
|
||||
from typing import Dict, List, Tuple
|
||||
|
||||
# Configuration
|
||||
SERVICES = {
|
||||
"coordinator": {
|
||||
"url": "http://localhost:8080/health",
|
||||
"expected_status": 200,
|
||||
"timeout": 10
|
||||
},
|
||||
"blockchain-node": {
|
||||
"url": "http://localhost:8545",
|
||||
"method": "POST",
|
||||
"payload": {
|
||||
"jsonrpc": "2.0",
|
||||
"method": "eth_blockNumber",
|
||||
"params": [],
|
||||
"id": 1
|
||||
},
|
||||
"expected_status": 200,
|
||||
"timeout": 10
|
||||
},
|
||||
"dashboard": {
|
||||
"url": "https://aitbc.io/health",
|
||||
"expected_status": 200,
|
||||
"timeout": 10
|
||||
},
|
||||
"api": {
|
||||
"url": "https://api.aitbc.io/v1/status",
|
||||
"expected_status": 200,
|
||||
"timeout": 10
|
||||
},
|
||||
"miner": {
|
||||
"url": "http://localhost:8081/api/status",
|
||||
"expected_status": 200,
|
||||
"timeout": 10
|
||||
}
|
||||
}
|
||||
|
||||
# Colors for output
|
||||
class Colors:
|
||||
GREEN = '\033[92m'
|
||||
RED = '\033[91m'
|
||||
YELLOW = '\033[93m'
|
||||
BLUE = '\033[94m'
|
||||
ENDC = '\033[0m'
|
||||
|
||||
def print_status(message: str, status: str = "INFO"):
|
||||
"""Print colored status message"""
|
||||
timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
|
||||
if status == "SUCCESS":
|
||||
print(f"{Colors.GREEN}[✓]{Colors.ENDC} {timestamp} - {message}")
|
||||
elif status == "ERROR":
|
||||
print(f"{Colors.RED}[✗]{Colors.ENDC} {timestamp} - {message}")
|
||||
elif status == "WARNING":
|
||||
print(f"{Colors.YELLOW}[⚠]{Colors.ENDC} {timestamp} - {message}")
|
||||
else:
|
||||
print(f"{Colors.BLUE}[ℹ]{Colors.ENDC} {timestamp} - {message}")
|
||||
|
||||
def check_service(name: str, config: Dict) -> Tuple[bool, str]:
|
||||
"""Check individual service health"""
|
||||
try:
|
||||
method = config.get('method', 'GET')
|
||||
timeout = config.get('timeout', 10)
|
||||
expected_status = config.get('expected_status', 200)
|
||||
|
||||
if method == 'POST':
|
||||
response = requests.post(
|
||||
config['url'],
|
||||
json=config.get('payload', {}),
|
||||
timeout=timeout,
|
||||
headers={'Content-Type': 'application/json'}
|
||||
)
|
||||
else:
|
||||
response = requests.get(config['url'], timeout=timeout)
|
||||
|
||||
if response.status_code == expected_status:
|
||||
# Additional checks for specific services
|
||||
if name == "blockchain-node":
|
||||
data = response.json()
|
||||
if 'result' in data:
|
||||
block_number = int(data['result'], 16)
|
||||
return True, f"Block number: {block_number}"
|
||||
return False, "Invalid response format"
|
||||
|
||||
elif name == "coordinator":
|
||||
data = response.json()
|
||||
if data.get('status') == 'healthy':
|
||||
return True, f"Version: {data.get('version', 'unknown')}"
|
||||
return False, f"Status: {data.get('status')}"
|
||||
|
||||
return True, f"Status: {response.status_code}"
|
||||
else:
|
||||
return False, f"HTTP {response.status_code}"
|
||||
|
||||
except requests.exceptions.Timeout:
|
||||
return False, "Timeout"
|
||||
except requests.exceptions.ConnectionError:
|
||||
return False, "Connection refused"
|
||||
except Exception as e:
|
||||
return False, str(e)
|
||||
|
||||
def check_database() -> Tuple[bool, str]:
|
||||
"""Check database connectivity"""
|
||||
try:
|
||||
# This would use your actual database connection
|
||||
import psycopg2
|
||||
conn = psycopg2.connect(
|
||||
host="localhost",
|
||||
database="aitbc_prod",
|
||||
user="postgres",
|
||||
password="your_password"
|
||||
)
|
||||
cursor = conn.cursor()
|
||||
cursor.execute("SELECT 1")
|
||||
cursor.close()
|
||||
conn.close()
|
||||
return True, "Database connected"
|
||||
except Exception as e:
|
||||
return False, str(e)
|
||||
|
||||
def check_redis() -> Tuple[bool, str]:
|
||||
"""Check Redis connectivity"""
|
||||
try:
|
||||
import redis
|
||||
r = redis.Redis(host='localhost', port=6379, db=0)
|
||||
r.ping()
|
||||
return True, "Redis connected"
|
||||
except Exception as e:
|
||||
return False, str(e)
|
||||
|
||||
def check_disk_space() -> Tuple[bool, str]:
|
||||
"""Check disk space usage"""
|
||||
import shutil
|
||||
total, used, free = shutil.disk_usage("/")
|
||||
percent_used = (used / total) * 100
|
||||
if percent_used < 80:
|
||||
return True, f"Disk usage: {percent_used:.1f}%"
|
||||
else:
|
||||
return False, f"Disk usage critical: {percent_used:.1f}%"
|
||||
|
||||
def check_ssl_certificates() -> Tuple[bool, str]:
|
||||
"""Check SSL certificate validity"""
|
||||
import ssl
|
||||
import socket
|
||||
from datetime import datetime
|
||||
|
||||
try:
|
||||
context = ssl.create_default_context()
|
||||
with socket.create_connection(("aitbc.io", 443)) as sock:
|
||||
with context.wrap_socket(sock, server_hostname="aitbc.io") as ssock:
|
||||
cert = ssock.getpeercert()
|
||||
expiry_date = datetime.strptime(cert['notAfter'], '%b %d %H:%M:%S %Y %Z')
|
||||
days_until_expiry = (expiry_date - datetime.now()).days
|
||||
|
||||
if days_until_expiry > 7:
|
||||
return True, f"SSL valid for {days_until_expiry} days"
|
||||
else:
|
||||
return False, f"SSL expires in {days_until_expiry} days"
|
||||
except Exception as e:
|
||||
return False, str(e)
|
||||
|
||||
def main():
|
||||
"""Main health check function"""
|
||||
print_status("Starting AITBC Production Health Check", "INFO")
|
||||
print("=" * 60)
|
||||
|
||||
all_passed = True
|
||||
failed_services = []
|
||||
|
||||
# Check all services
|
||||
print_status("\n=== Service Health Checks ===")
|
||||
for name, config in SERVICES.items():
|
||||
success, message = check_service(name, config)
|
||||
if success:
|
||||
print_status(f"{name}: {message}", "SUCCESS")
|
||||
else:
|
||||
print_status(f"{name}: {message}", "ERROR")
|
||||
all_passed = False
|
||||
failed_services.append(name)
|
||||
|
||||
# Check infrastructure components
|
||||
print_status("\n=== Infrastructure Checks ===")
|
||||
|
||||
# Database
|
||||
db_success, db_message = check_database()
|
||||
if db_success:
|
||||
print_status(f"Database: {db_message}", "SUCCESS")
|
||||
else:
|
||||
print_status(f"Database: {db_message}", "ERROR")
|
||||
all_passed = False
|
||||
|
||||
# Redis
|
||||
redis_success, redis_message = check_redis()
|
||||
if redis_success:
|
||||
print_status(f"Redis: {redis_message}", "SUCCESS")
|
||||
else:
|
||||
print_status(f"Redis: {redis_message}", "ERROR")
|
||||
all_passed = False
|
||||
|
||||
# Disk space
|
||||
disk_success, disk_message = check_disk_space()
|
||||
if disk_success:
|
||||
print_status(f"Disk: {disk_message}", "SUCCESS")
|
||||
else:
|
||||
print_status(f"Disk: {disk_message}", "ERROR")
|
||||
all_passed = False
|
||||
|
||||
# SSL certificates
|
||||
ssl_success, ssl_message = check_ssl_certificates()
|
||||
if ssl_success:
|
||||
print_status(f"SSL: {ssl_message}", "SUCCESS")
|
||||
else:
|
||||
print_status(f"SSL: {ssl_message}", "ERROR")
|
||||
all_passed = False
|
||||
|
||||
# Summary
|
||||
print("\n" + "=" * 60)
|
||||
if all_passed:
|
||||
print_status("All checks passed! System is healthy.", "SUCCESS")
|
||||
sys.exit(0)
|
||||
else:
|
||||
print_status(f"Health check failed! Failed services: {', '.join(failed_services)}", "ERROR")
|
||||
print_status("Please check the logs and investigate the issues.", "WARNING")
|
||||
sys.exit(1)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
102
.windsurf/skills/deploy-production/pre-deploy-checks.sh
Executable file
102
.windsurf/skills/deploy-production/pre-deploy-checks.sh
Executable file
@@ -0,0 +1,102 @@
|
||||
#!/bin/bash
|
||||
|
||||
# Pre-deployment checks for AITBC production deployment
|
||||
# This script validates system readiness before deployment
|
||||
|
||||
set -e
|
||||
|
||||
echo "=== AITBC Production Pre-deployment Checks ==="
|
||||
|
||||
# Colors for output
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
NC='\033[0m' # No Color
|
||||
|
||||
# Function to print status
|
||||
check_status() {
|
||||
if [ $? -eq 0 ]; then
|
||||
echo -e "${GREEN}✓${NC} $1"
|
||||
else
|
||||
echo -e "${RED}✗${NC} $1"
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
|
||||
warning() {
|
||||
echo -e "${YELLOW}⚠${NC} $1"
|
||||
}
|
||||
|
||||
# 1. Check disk space
|
||||
echo -e "\n1. Checking disk space..."
|
||||
DISK_USAGE=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
|
||||
if [ $DISK_USAGE -lt 80 ]; then
|
||||
check_status "Disk space usage: ${DISK_USAGE}%"
|
||||
else
|
||||
warning "Disk space usage is high: ${DISK_USAGE}%"
|
||||
fi
|
||||
|
||||
# 2. Check memory usage
|
||||
echo -e "\n2. Checking memory usage..."
|
||||
MEM_AVAILABLE=$(free -m | awk 'NR==2{printf "%.0f", $7}')
|
||||
if [ $MEM_AVAILABLE -gt 1024 ]; then
|
||||
check_status "Available memory: ${MEM_AVAILABLE}MB"
|
||||
else
|
||||
warning "Low memory available: ${MEM_AVAILABLE}MB"
|
||||
fi
|
||||
|
||||
# 3. Check service status
|
||||
echo -e "\n3. Checking critical services..."
|
||||
services=("nginx" "docker" "postgresql")
|
||||
for service in "${services[@]}"; do
|
||||
if systemctl is-active --quiet $service; then
|
||||
check_status "$service is running"
|
||||
else
|
||||
echo -e "${RED}✗${NC} $service is not running"
|
||||
fi
|
||||
done
|
||||
|
||||
# 4. Check SSL certificates
|
||||
echo -e "\n4. Checking SSL certificates..."
|
||||
if [ -f "/etc/letsencrypt/live/$(hostname)/fullchain.pem" ]; then
|
||||
EXPIRY=$(openssl x509 -in /etc/letsencrypt/live/$(hostname)/fullchain.pem -noout -enddate | cut -d= -f2)
|
||||
check_status "SSL certificate valid until: $EXPIRY"
|
||||
else
|
||||
warning "SSL certificate not found"
|
||||
fi
|
||||
|
||||
# 5. Check backup
|
||||
echo -e "\n5. Checking recent backup..."
|
||||
BACKUP_DIR="/var/backups/aitbc"
|
||||
if [ -d "$BACKUP_DIR" ]; then
|
||||
LATEST_BACKUP=$(ls -lt $BACKUP_DIR | head -n 2 | tail -n 1 | awk '{print $9}')
|
||||
if [ -n "$LATEST_BACKUP" ]; then
|
||||
check_status "Latest backup: $LATEST_BACKUP"
|
||||
else
|
||||
warning "No recent backup found"
|
||||
fi
|
||||
else
|
||||
warning "Backup directory not found"
|
||||
fi
|
||||
|
||||
# 6. Check environment variables
|
||||
echo -e "\n6. Checking environment configuration..."
|
||||
if [ -f "/etc/environment" ] && grep -q "AITBC_ENV=production" /etc/environment; then
|
||||
check_status "Production environment configured"
|
||||
else
|
||||
warning "Production environment not set"
|
||||
fi
|
||||
|
||||
# 7. Check ports
|
||||
echo -e "\n7. Checking required ports..."
|
||||
ports=("80" "443" "8080" "8545")
|
||||
for port in "${ports[@]}"; do
|
||||
if netstat -tuln | grep -q ":$port "; then
|
||||
check_status "Port $port is listening"
|
||||
else
|
||||
warning "Port $port is not listening"
|
||||
fi
|
||||
done
|
||||
|
||||
echo -e "\n=== Pre-deployment checks completed ==="
|
||||
echo -e "${GREEN}Ready for deployment!${NC}"
|
||||
187
.windsurf/skills/deploy-production/rollback-steps.md
Normal file
187
.windsurf/skills/deploy-production/rollback-steps.md
Normal file
@@ -0,0 +1,187 @@
|
||||
# Production Rollback Procedures
|
||||
|
||||
## Emergency Rollback Guide
|
||||
|
||||
Use these procedures when a deployment causes critical issues in production.
|
||||
|
||||
### Immediate Actions (First 5 minutes)
|
||||
|
||||
1. **Assess the Impact**
|
||||
- Check monitoring dashboards
|
||||
- Review error logs
|
||||
- Identify affected services
|
||||
- Determine if rollback is necessary
|
||||
|
||||
2. **Communicate**
|
||||
- Notify team in #production-alerts
|
||||
- Post status on status page if needed
|
||||
- Document start time of incident
|
||||
|
||||
### Automated Rollback (if available)
|
||||
|
||||
```bash
|
||||
# Quick rollback to previous version
|
||||
./scripts/rollback-to-previous.sh
|
||||
|
||||
# Rollback to specific version
|
||||
./scripts/rollback-to-version.sh v1.2.3
|
||||
```
|
||||
|
||||
### Manual Rollback Steps
|
||||
|
||||
#### 1. Stop Current Services
|
||||
```bash
|
||||
# Stop all AITBC services
|
||||
sudo systemctl stop aitbc-coordinator
|
||||
sudo systemctl stop aitbc-node
|
||||
sudo systemctl stop aitbc-miner
|
||||
sudo systemctl stop aitbc-dashboard
|
||||
sudo docker-compose down
|
||||
```
|
||||
|
||||
#### 2. Restore Previous Code
|
||||
```bash
|
||||
# Get previous deployment tag
|
||||
git tag --sort=-version:refname | head -n 5
|
||||
|
||||
# Checkout previous stable version
|
||||
git checkout v1.2.3
|
||||
|
||||
# Rebuild if necessary
|
||||
docker-compose build --no-cache
|
||||
```
|
||||
|
||||
#### 3. Restore Database (if needed)
|
||||
```bash
|
||||
# List available backups
|
||||
aws s3 ls s3://aitbc-backups/database/
|
||||
|
||||
# Restore latest backup
|
||||
pg_restore -h localhost -U postgres -d aitbc_prod latest_backup.dump
|
||||
```
|
||||
|
||||
#### 4. Restore Configuration
|
||||
```bash
|
||||
# Restore from backup
|
||||
cp /etc/aitbc/backup/config.yaml /etc/aitbc/config.yaml
|
||||
cp /etc/aitbc/backup/.env /etc/aitbc/.env
|
||||
```
|
||||
|
||||
#### 5. Restart Services
|
||||
```bash
|
||||
# Start services in correct order
|
||||
sudo systemctl start aitbc-coordinator
|
||||
sleep 10
|
||||
sudo systemctl start aitbc-node
|
||||
sleep 10
|
||||
sudo systemctl start aitbc-miner
|
||||
sleep 10
|
||||
sudo systemctl start aitbc-dashboard
|
||||
```
|
||||
|
||||
#### 6. Verify Rollback
|
||||
```bash
|
||||
# Check service status
|
||||
./scripts/health-check.sh
|
||||
|
||||
# Run smoke tests
|
||||
./scripts/smoke-test.sh
|
||||
|
||||
# Verify blockchain sync
|
||||
curl -X POST http://localhost:8545 -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
|
||||
```
|
||||
|
||||
### Database-Specific Rollbacks
|
||||
|
||||
#### Partial Data Rollback
|
||||
```bash
|
||||
# Create backup before changes
|
||||
pg_dump -h localhost -U postgres aitbc_prod > pre-rollback-backup.sql
|
||||
|
||||
# Rollback specific tables
|
||||
psql -h localhost -U postgres -d aitbc_prod < rollback-tables.sql
|
||||
```
|
||||
|
||||
#### Migration Rollback
|
||||
```bash
|
||||
# Check migration status
|
||||
./scripts/migration-status.sh
|
||||
|
||||
# Rollback last migration
|
||||
./scripts/rollback-migration.sh
|
||||
```
|
||||
|
||||
### Service-Specific Rollbacks
|
||||
|
||||
#### Coordinator Service
|
||||
```bash
|
||||
# Restore coordinator state
|
||||
sudo systemctl stop aitbc-coordinator
|
||||
cp /var/lib/aitbc/coordinator/backup/state.db /var/lib/aitbc/coordinator/
|
||||
sudo systemctl start aitbc-coordinator
|
||||
```
|
||||
|
||||
#### Blockchain Node
|
||||
```bash
|
||||
# Reset to last stable block
|
||||
sudo systemctl stop aitbc-node
|
||||
aitbc-node --reset-to-block 123456
|
||||
sudo systemctl start aitbc-node
|
||||
```
|
||||
|
||||
#### Mining Operations
|
||||
```bash
|
||||
# Stop mining immediately
|
||||
curl -X POST http://localhost:8080/api/mining/stop
|
||||
|
||||
# Reset mining state
|
||||
redis-cli FLUSHDB
|
||||
```
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
- [ ] All services running
|
||||
- [ ] Database connectivity
|
||||
- [ ] API endpoints responding
|
||||
- [ ] Blockchain syncing
|
||||
- [ ] Mining operations (if applicable)
|
||||
- [ ] Dashboard accessible
|
||||
- [ ] SSL certificates valid
|
||||
- [ ] Monitoring alerts cleared
|
||||
|
||||
### Post-Rollback Actions
|
||||
|
||||
1. **Root Cause Analysis**
|
||||
- Document what went wrong
|
||||
- Identify failure point
|
||||
- Create prevention plan
|
||||
|
||||
2. **Team Communication**
|
||||
- Update incident ticket
|
||||
- Share lessons learned
|
||||
- Update runbooks
|
||||
|
||||
3. **Preventive Measures**
|
||||
- Add additional tests
|
||||
- Improve monitoring
|
||||
- Update deployment checklist
|
||||
|
||||
### Contact Information
|
||||
|
||||
- **On-call Engineer**: [Phone/Slack]
|
||||
- **Engineering Lead**: [Phone/Slack]
|
||||
- **DevOps Team**: #devops-alerts
|
||||
- **Management**: #management-alerts
|
||||
|
||||
### Escalation
|
||||
|
||||
1. **Level 1**: On-call engineer (first 15 minutes)
|
||||
2. **Level 2**: Engineering lead (after 15 minutes)
|
||||
3. **Level 3**: CTO (after 30 minutes)
|
||||
|
||||
### Notes
|
||||
|
||||
- Always create a backup before rollback
|
||||
- Document every step during rollback
|
||||
- Test in staging before production if possible
|
||||
- Keep stakeholders informed throughout process
|
||||
39
.windsurf/skills/ollama-gpu-provider/SKILL.md
Normal file
39
.windsurf/skills/ollama-gpu-provider/SKILL.md
Normal file
@@ -0,0 +1,39 @@
|
||||
---
|
||||
name: ollama-gpu-provider
|
||||
description: End-to-end Ollama prompt payment test against the GPU miner provider
|
||||
version: 1.0.0
|
||||
author: Cascade
|
||||
tags: [gpu, miner, ollama, payments, receipts, test]
|
||||
---
|
||||
|
||||
# Ollama GPU Provider Test Skill
|
||||
|
||||
This skill runs an end-to-end client → coordinator → GPU miner → receipt flow using an Ollama prompt.
|
||||
|
||||
## Overview
|
||||
|
||||
The test submits a prompt (default: "hello") to the coordinator via the host proxy, waits for completion, and verifies that the job result and signed receipt are returned.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Host GPU miner running and registered (RTX 4060 Ti + Ollama)
|
||||
- Incus proxy forwarding `127.0.0.1:18000` → container `127.0.0.1:8000`
|
||||
- Coordinator running in container (`coordinator-api.service`)
|
||||
- Receipt signing key configured in `/opt/coordinator-api/src/.env`
|
||||
|
||||
## Test Command
|
||||
|
||||
```bash
|
||||
python3 cli/test_ollama_gpu_provider.py --url http://127.0.0.1:18000 --prompt "hello"
|
||||
```
|
||||
|
||||
## Expected Outcome
|
||||
|
||||
- Job reaches `COMPLETED`
|
||||
- Output returned from Ollama
|
||||
- Receipt present with a `receipt_id`
|
||||
|
||||
## Notes
|
||||
|
||||
- Use `--timeout` to allow longer runs for large models.
|
||||
- If the receipt is missing, verify `receipt_signing_key_hex` is set and restart the coordinator.
|
||||
Reference in New Issue
Block a user