feat: implement CLI blockchain features and pool hub enhancements

CLI Blockchain Features:
- Added block operations: import, export, import-chain, blocks-range
- Added messaging system commands (deploy, state, topics, create-topic, messages, post, vote, search, reputation, moderate)
- Added network force-sync operation
- Replaced marketplace handlers with actual RPC calls
- Replaced AI handlers with actual RPC calls
- Added account operations (account get)
- Added transaction query operations
- Added mempool query operations
- Created keystore_auth.py for authentication
- Removed extended features interception
- All handlers use keystore credentials for authenticated endpoints

Pool Hub Enhancements:
- Added SLA monitoring and capacity tables
- Added billing integration service
- Added SLA collector service
- Added SLA router endpoints
- Updated pool hub models and settings
- Added integration tests for billing and SLA
- Updated documentation with SLA monitoring guide
2026-04-22 15:59:00 +02:00
parent 51920a15d7
commit e22d864944
28 changed files with 4783 additions and 358 deletions
--- a/docs/MASTER_INDEX.md
+++ b/docs/MASTER_INDEX.md
@@ -2,7 +2,7 @@

 **Complete documentation catalog with quick access to all content**

-**Project Status**: ✅ **100% COMPLETED** (v0.3.1 - April 13, 2026)
+**Project Status**: ✅ **100% COMPLETED** (v0.3.2 - April 22, 2026)

 ---

@@ -360,7 +360,7 @@ This master index provides complete access to all AITBC documentation. Choose yo

 ---

-*Last updated: 2026-04-02*  
-*Quality Score: 10/10*  
-*Total Topics: 25+ across 4 learning levels*  
+*Last updated: 2026-04-22*
+*Quality Score: 10/10*
+*Total Topics: 25+ across 4 learning levels*
 *External Links: 5+ centralized access points*
--- a/docs/README.md
+++ b/docs/README.md
@@ -2,11 +2,11 @@

 **AI Training Blockchain - Privacy-Preserving ML & Edge Computing Platform**

-**Level**: All Levels  
-**Prerequisites**: Basic computer skills  
-**Estimated Time**: Varies by learning path  
-**Last Updated**: 2026-04-13  
-**Version**: 6.1 (April 13, 2026 Update - Test Cleanup & Milestone Tracking Fix)
+**Level**: All Levels
+**Prerequisites**: Basic computer skills
+**Estimated Time**: Varies by learning path
+**Last Updated**: 2026-04-22
+**Version**: 6.2 (April 22, 2026 Update - ait-mainnet Migration & Cross-Node Tests)

 ## 🎉 **PROJECT STATUS: 100% COMPLETED - April 13, 2026**

@@ -167,7 +167,26 @@ For historical reference, duplicate content, and temporary files.
 - **Test Cleanup**: Removed 12 legacy test files, consolidated configuration
 - **Production Architecture**: Aligned with current codebase, systemd service management

-### 🎯 **Latest Release: v0.3.1**
+### 🎯 **Latest Release: v0.3.2**
+
+**Released**: April 22, 2026
+**Status**: ✅ Stable
+
+### Key Features
+- **ait-mainnet Migration**: Successfully migrated all blockchain nodes from ait-devnet to ait-mainnet
+- **Cross-Node Blockchain Tests**: Created comprehensive test suite for multi-node blockchain features
+- **SQLite Corruption Fix**: Resolved database corruption on aitbc1 caused by Btrfs CoW behavior
+- **Network Connectivity Fixes**: Corrected RPC URLs for all nodes (aitbc, aitbc1, gitea-runner)
+- **Test File Updates**: Updated all verification tests to use ait-mainnet chain_id
+
+### Migration Notes
+- All three nodes now using CHAIN_ID=ait-mainnet (aitbc, aitbc1, gitea-runner)
+- Cross-node tests verify chain_id consistency and RPC connectivity across all nodes
+- Applied `chattr +C` to `/var/lib/aitbc/data` on aitbc1 to disable CoW
+- Updated blockchain node configuration: supported_chains from "ait-devnet" to "ait-mainnet"
+- Test file: `/opt/aitbc/tests/verification/test_cross_node_blockchain.py`
+
+### 🎯 **Previous Release: v0.3.1**

 **Released**: April 13, 2026  
 **Status**: ✅ Stable
@@ -320,11 +339,11 @@ Files are now organized with systematic prefixes based on reading level:

 ---

-**Last Updated**: 2026-04-13  
-**Documentation Version**: 4.0 (April 13, 2026 Update - Federated Mesh Architecture)  
-**Quality Score**: 10/10 (Perfect Documentation)  
-**Total Files**: 500+ markdown files with standardized templates  
-**Status**: PRODUCTION READY with perfect documentation structure  
+**Last Updated**: 2026-04-22
+**Documentation Version**: 4.1 (April 22, 2026 Update - ait-mainnet Migration)
+**Quality Score**: 10/10 (Perfect Documentation)
+**Total Files**: 500+ markdown files with standardized templates
+**Status**: PRODUCTION READY with perfect documentation structure

 **🎉 Achievement: Perfect 10/10 Documentation Quality Score Attained!**
 # OpenClaw Integration
--- a/docs/advanced/04_deployment/sla-monitoring.md
+++ b/docs/advanced/04_deployment/sla-monitoring.md
@@ -0,0 +1,584 @@
+# SLA Monitoring Guide
+
+This guide covers SLA (Service Level Agreement) monitoring and billing instrumentation for coordinator/pool hub services in the AITBC ecosystem.
+
+## Overview
+
+The SLA monitoring system provides:
+- Real-time tracking of miner performance metrics
+- Automated SLA violation detection and alerting
+- Capacity planning with forecasting and scaling recommendations
+- Integration with coordinator-api billing system
+- Comprehensive API endpoints for monitoring and management
+
+## Architecture
+
+```
+┌─────────────────┐
+│   Pool-Hub      │
+│                 │
+│  SLA Collector  │──────┐
+│  Capacity       │      │
+│  Planner        │      │
+│                 │      │
+└────────┬────────┘      │
+         │               │
+         │ HTTP API      │
+         │               │
+┌────────▼────────┐      │
+│ Coordinator-API │◀─────┘
+│                 │
+│ Usage Tracking  │
+│ Billing Service │
+│ Multi-tenant DB │
+└─────────────────┘
+```
+
+## SLA Metrics
+
+### Miner Uptime
+- **Definition**: Percentage of time a miner is available and responsive
+- **Calculation**: Based on heartbeat intervals (5-minute threshold)
+- **Threshold**: 95%
+- **Alert Levels**:
+  - Critical: <85.5% (threshold * 0.9)
+  - High: <95% (threshold)
+
+### Response Time
+- **Definition**: Average time for miner to respond to match requests
+- **Calculation**: Average of `eta_ms` from match results (last 100 results)
+- **Threshold**: 1000ms (P95)
+- **Alert Levels**:
+  - Critical: >2000ms (threshold * 2)
+  - High: >1000ms (threshold)
+
+### Job Completion Rate
+- **Definition**: Percentage of jobs completed successfully
+- **Calculation**: Successful outcomes / total outcomes (last 7 days)
+- **Threshold**: 90%
+- **Alert Levels**:
+  - Critical: <90% (threshold)
+
+### Capacity Availability
+- **Definition**: Percentage of miners available (not busy)
+- **Calculation**: Active miners / Total miners
+- **Threshold**: 80%
+- **Alert Levels**:
+  - High: <80% (threshold)
+
+## Configuration
+
+### Environment Variables
+
+Add to pool-hub `.env`:
+
+```bash
+# Coordinator-API Billing Integration
+COORDINATOR_BILLING_URL=http://localhost:8011
+COORDINATOR_API_KEY=your_api_key_here
+
+# SLA Configuration
+SLA_UPTIME_THRESHOLD=95.0
+SLA_RESPONSE_TIME_THRESHOLD=1000.0
+SLA_COMPLETION_RATE_THRESHOLD=90.0
+SLA_CAPACITY_THRESHOLD=80.0
+
+# Capacity Planning
+CAPACITY_FORECAST_HOURS=168
+CAPACITY_ALERT_THRESHOLD_PCT=80.0
+
+# Billing Sync
+BILLING_SYNC_INTERVAL_HOURS=1
+
+# SLA Collection
+SLA_COLLECTION_INTERVAL_SECONDS=300
+```
+
+### Settings File
+
+Configuration can also be set in `poolhub/settings.py`:
+
+```python
+class Settings(BaseSettings):
+    # Coordinator-API Billing Integration
+    coordinator_billing_url: str = Field(default="http://localhost:8011")
+    coordinator_api_key: str | None = Field(default=None)
+
+    # SLA Configuration
+    sla_thresholds: Dict[str, float] = Field(
+        default_factory=lambda: {
+            "uptime_pct": 95.0,
+            "response_time_ms": 1000.0,
+            "completion_rate_pct": 90.0,
+            "capacity_availability_pct": 80.0,
+        }
+    )
+
+    # Capacity Planning Configuration
+    capacity_forecast_hours: int = Field(default=168)
+    capacity_alert_threshold_pct: float = Field(default=80.0)
+
+    # Billing Sync Configuration
+    billing_sync_interval_hours: int = Field(default=1)
+
+    # SLA Collection Configuration
+    sla_collection_interval_seconds: int = Field(default=300)
+```
+
+## Database Schema
+
+### SLA Metrics Table
+
+```sql
+CREATE TABLE sla_metrics (
+    id UUID PRIMARY KEY,
+    miner_id VARCHAR(64) NOT NULL REFERENCES miners(miner_id) ON DELETE CASCADE,
+    metric_type VARCHAR(32) NOT NULL,
+    metric_value FLOAT NOT NULL,
+    threshold FLOAT NOT NULL,
+    is_violation BOOLEAN DEFAULT FALSE,
+    timestamp TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    metadata JSONB DEFAULT '{}'
+);
+
+CREATE INDEX ix_sla_metrics_miner_id ON sla_metrics(miner_id);
+CREATE INDEX ix_sla_metrics_timestamp ON sla_metrics(timestamp);
+CREATE INDEX ix_sla_metrics_metric_type ON sla_metrics(metric_type);
+```
+
+### SLA Violations Table
+
+```sql
+CREATE TABLE sla_violations (
+    id UUID PRIMARY KEY,
+    miner_id VARCHAR(64) NOT NULL REFERENCES miners(miner_id) ON DELETE CASCADE,
+    violation_type VARCHAR(32) NOT NULL,
+    severity VARCHAR(16) NOT NULL,
+    metric_value FLOAT NOT NULL,
+    threshold FLOAT NOT NULL,
+    violation_duration_ms INTEGER,
+    resolved_at TIMESTAMP WITH TIME ZONE,
+    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    metadata JSONB DEFAULT '{}'
+);
+
+CREATE INDEX ix_sla_violations_miner_id ON sla_violations(miner_id);
+CREATE INDEX ix_sla_violations_created_at ON sla_violations(created_at);
+CREATE INDEX ix_sla_violations_severity ON sla_violations(severity);
+```
+
+### Capacity Snapshots Table
+
+```sql
+CREATE TABLE capacity_snapshots (
+    id UUID PRIMARY KEY,
+    total_miners INTEGER NOT NULL,
+    active_miners INTEGER NOT NULL,
+    total_parallel_capacity INTEGER NOT NULL,
+    total_queue_length INTEGER NOT NULL,
+    capacity_utilization_pct FLOAT NOT NULL,
+    forecast_capacity INTEGER NOT NULL,
+    recommended_scaling VARCHAR(32) NOT NULL,
+    scaling_reason TEXT,
+    timestamp TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
+    metadata JSONB DEFAULT '{}'
+);
+
+CREATE INDEX ix_capacity_snapshots_timestamp ON capacity_snapshots(timestamp);
+```
+
+## Database Migration
+
+Run the migration to add SLA and capacity tables:
+
+```bash
+cd apps/pool-hub
+alembic upgrade head
+```
+
+## API Endpoints
+
+### SLA Metrics Endpoints
+
+#### Get SLA Metrics for a Miner
+```bash
+GET /sla/metrics/{miner_id}?hours=24
+```
+
+Response:
+```json
+[
+  {
+    "id": "uuid",
+    "miner_id": "miner_001",
+    "metric_type": "uptime_pct",
+    "metric_value": 98.5,
+    "threshold": 95.0,
+    "is_violation": false,
+    "timestamp": "2026-04-22T15:00:00Z",
+    "metadata": {}
+  }
+]
+```
+
+#### Get All SLA Metrics
+```bash
+GET /sla/metrics?hours=24
+```
+
+#### Get SLA Violations
+```bash
+GET /sla/violations?resolved=false&miner_id=miner_001
+```
+
+#### Trigger SLA Metrics Collection
+```bash
+POST /sla/metrics/collect
+```
+
+Response:
+```json
+{
+  "miners_processed": 10,
+  "metrics_collected": [...],
+  "violations_detected": 2,
+  "capacity": {
+    "total_miners": 10,
+    "active_miners": 8,
+    "capacity_availability_pct": 80.0
+  }
+}
+```
+
+### Capacity Planning Endpoints
+
+#### Get Capacity Snapshots
+```bash
+GET /sla/capacity/snapshots?hours=24
+```
+
+#### Get Capacity Forecast
+```bash
+GET /sla/capacity/forecast?hours_ahead=168
+```
+
+Response:
+```json
+{
+  "forecast_horizon_hours": 168,
+  "current_capacity": 1000,
+  "projected_capacity": 1500,
+  "recommended_scaling": "+50%",
+  "confidence": 0.85,
+  "source": "coordinator_api"
+}
+```
+
+#### Get Scaling Recommendations
+```bash
+GET /sla/capacity/recommendations
+```
+
+Response:
+```json
+{
+  "current_state": "healthy",
+  "recommendations": [
+    {
+      "action": "add_miners",
+      "quantity": 2,
+      "reason": "Projected capacity shortage in 2 weeks",
+      "priority": "medium"
+    }
+  ],
+  "source": "coordinator_api"
+}
+```
+
+#### Configure Capacity Alerts
+```bash
+POST /sla/capacity/alerts/configure
+```
+
+Request:
+```json
+{
+  "threshold_pct": 80.0,
+  "notification_email": "admin@example.com"
+}
+```
+
+### Billing Integration Endpoints
+
+#### Get Billing Usage
+```bash
+GET /sla/billing/usage?hours=24&tenant_id=tenant_001
+```
+
+#### Sync Billing Usage
+```bash
+POST /sla/billing/sync
+```
+
+Request:
+```json
+{
+  "miner_id": "miner_001",
+  "hours_back": 24
+}
+```
+
+#### Record Usage Event
+```bash
+POST /sla/billing/usage/record
+```
+
+Request:
+```json
+{
+  "tenant_id": "tenant_001",
+  "resource_type": "gpu_hours",
+  "quantity": 10.5,
+  "unit_price": 0.50,
+  "job_id": "job_123",
+  "metadata": {}
+}
+```
+
+#### Generate Invoice
+```bash
+POST /sla/billing/invoice/generate
+```
+
+Request:
+```json
+{
+  "tenant_id": "tenant_001",
+  "period_start": "2026-03-01T00:00:00Z",
+  "period_end": "2026-03-31T23:59:59Z"
+}
+```
+
+### Status Endpoint
+
+#### Get SLA Status
+```bash
+GET /sla/status
+```
+
+Response:
+```json
+{
+  "status": "healthy",
+  "active_violations": 0,
+  "recent_metrics_count": 50,
+  "timestamp": "2026-04-22T15:00:00Z"
+}
+```
+
+## Automated Collection
+
+### SLA Collection Scheduler
+
+The SLA collector can be run as a background service to automatically collect metrics:
+
+```python
+from poolhub.services.sla_collector import SLACollector, SLACollectorScheduler
+from poolhub.database import get_db
+
+# Initialize
+db = next(get_db())
+sla_collector = SLACollector(db)
+scheduler = SLACollectorScheduler(sla_collector)
+
+# Start automated collection (every 5 minutes); `await` must run inside an async function
+await scheduler.start(collection_interval_seconds=300)
+```
+
+### Billing Sync Scheduler
+
+The billing integration can be run as a background service to automatically sync usage:
+
+```python
+from poolhub.services.billing_integration import BillingIntegration, BillingIntegrationScheduler
+from poolhub.database import get_db
+
+# Initialize
+db = next(get_db())
+billing_integration = BillingIntegration(db)
+scheduler = BillingIntegrationScheduler(billing_integration)
+
+# Start automated sync (every 1 hour); `await` must run inside an async function
+await scheduler.start(sync_interval_hours=1)
+```
+
+## Monitoring and Alerting
+
+### Prometheus Metrics
+
+SLA metrics are exposed to Prometheus with the namespace `poolhub`:
+
+- `poolhub_sla_uptime_pct` - Miner uptime percentage
+- `poolhub_sla_response_time_ms` - Response time in milliseconds
+- `poolhub_sla_completion_rate_pct` - Job completion rate percentage
+- `poolhub_sla_capacity_availability_pct` - Capacity availability percentage
+- `poolhub_sla_violations_total` - Total SLA violations
+- `poolhub_billing_sync_errors_total` - Billing sync errors
+
+### Alert Rules
+
+Example Prometheus alert rules:
+
+```yaml
+groups:
+  - name: poolhub_sla
+    rules:
+      - alert: HighSLAViolationRate
+        expr: rate(poolhub_sla_violations_total[5m]) > 0.1
+        for: 5m
+        labels:
+          severity: critical
+        annotations:
+          summary: High SLA violation rate
+
+      - alert: LowMinerUptime
+        expr: poolhub_sla_uptime_pct < 95
+        for: 5m
+        labels:
+          severity: high
+        annotations:
+          summary: Miner uptime below threshold
+
+      - alert: HighResponseTime
+        expr: poolhub_sla_response_time_ms > 1000
+        for: 5m
+        labels:
+          severity: high
+        annotations:
+          summary: Response time above threshold
+```
+
+## Troubleshooting
+
+### SLA Metrics Not Recording
+
+**Symptom**: SLA metrics are not being recorded in the database
+
+**Solutions**:
+1. Check SLA collector is running: `ps aux | grep sla_collector`
+2. Verify database connection: Check pool-hub database logs
+3. Check SLA collection interval: Ensure `sla_collection_interval_seconds` is configured
+4. Verify miner heartbeats: Check `miner_status.last_heartbeat_at` is being updated
+
+### Billing Sync Failing
+
+**Symptom**: Billing sync to coordinator-api is failing
+
+**Solutions**:
+1. Verify coordinator-api is accessible: `curl http://localhost:8011/health`
+2. Check API key: Ensure `COORDINATOR_API_KEY` is set correctly
+3. Check network connectivity: Ensure pool-hub can reach coordinator-api
+4. Review billing integration logs: Check for HTTP errors or timeouts
+
+### Capacity Alerts Not Triggering
+
+**Symptom**: Capacity alerts are not being generated
+
+**Solutions**:
+1. Verify capacity snapshots are being created: Check `capacity_snapshots` table
+2. Check alert thresholds: Ensure `capacity_alert_threshold_pct` is configured
+3. Verify alert configuration: Check alert configuration endpoint
+4. Review coordinator-api capacity planning: Ensure it's receiving pool-hub data
+
+## Testing
+
+Run the SLA and billing integration tests:
+
+```bash
+cd apps/pool-hub
+
+# Run all SLA and billing tests
+pytest tests/test_sla_collector.py
+pytest tests/test_billing_integration.py
+pytest tests/test_sla_endpoints.py
+pytest tests/test_integration_coordinator.py
+
+# Run with coverage
+pytest --cov=poolhub.services.sla_collector tests/test_sla_collector.py
+pytest --cov=poolhub.services.billing_integration tests/test_billing_integration.py
+```
+
+## Best Practices
+
+1. **Monitor SLA Metrics Regularly**: Set up automated monitoring dashboards to track SLA metrics in real-time
+2. **Configure Appropriate Thresholds**: Adjust SLA thresholds based on your service requirements
+3. **Review Violations Promptly**: Investigate and resolve SLA violations quickly to maintain service quality
+4. **Plan Capacity Proactively**: Use capacity forecasting to anticipate scaling needs
+5. **Test Billing Integration**: Regularly test billing sync to ensure accurate usage tracking
+6. **Keep Documentation Updated**: Maintain up-to-date documentation for SLA configurations and procedures
+
+## Integration with Existing Systems
+
+### Coordinator-API Integration
+
+The pool-hub integrates with coordinator-api's billing system via HTTP API:
+
+1. **Usage Recording**: Pool-hub sends usage events to coordinator-api's `/api/billing/usage` endpoint
+2. **Billing Metrics**: Pool-hub can query billing metrics from coordinator-api
+3. **Invoice Generation**: Pool-hub can trigger invoice generation in coordinator-api
+4. **Capacity Planning**: Pool-hub provides capacity data to coordinator-api's capacity planning system
+
+### Prometheus Integration
+
+SLA metrics are automatically exposed to Prometheus:
+- Metrics are labeled by miner_id, metric_type, and other dimensions
+- Use Prometheus query language to create custom dashboards
+- Set up alert rules based on SLA thresholds
+
+### Alerting Integration
+
+SLA violations can trigger alerts through:
+- Prometheus Alertmanager
+- Custom webhook integrations
+- Email notifications (via coordinator-api)
+- Slack/Discord integrations (via coordinator-api)
+
+## Security Considerations
+
+1. **API Key Security**: Store coordinator-api API keys securely (use environment variables or secret management)
+2. **Database Access**: Ensure database connections use SSL/TLS in production
+3. **Rate Limiting**: Implement rate limiting on billing sync endpoints to prevent abuse
+4. **Audit Logging**: Enable audit logging for SLA and billing operations
+5. **Access Control**: Restrict access to SLA and billing endpoints to authorized users
+
+## Performance Considerations
+
+1. **Batch Operations**: Use batch operations for billing sync to reduce HTTP overhead
+2. **Index Optimization**: Ensure database indexes are properly configured for SLA queries
+3. **Caching**: Use Redis caching for frequently accessed SLA metrics
+4. **Async Processing**: Use async operations for SLA collection and billing sync
+5. **Data Retention**: Implement data retention policies for SLA metrics and capacity snapshots
+
+## Maintenance
+
+### Regular Tasks
+
+1. **Review SLA Thresholds**: Quarterly review and adjust SLA thresholds based on service performance
+2. **Clean Up Old Data**: Regularly clean up old SLA metrics and capacity snapshots (e.g., keep 90 days)
+3. **Review Capacity Forecasts**: Monthly review of capacity forecasts and scaling recommendations
+4. **Audit Billing Records**: Monthly audit of billing records for accuracy
+5. **Update Documentation**: Keep documentation updated with any configuration changes
+
+### Backup and Recovery
+
+1. **Database Backups**: Ensure regular backups of SLA and billing tables
+2. **Configuration Backups**: Backup configuration files and environment variables
+3. **Recovery Procedures**: Document recovery procedures for SLA and billing systems
+4. **Testing Backups**: Regularly test backup and recovery procedures
+
+## References
+
+- [Pool-Hub README](/opt/aitbc/apps/pool-hub/README.md)
+- [Coordinator-API Billing Documentation](/opt/aitbc/apps/coordinator-api/README.md)
+- [Roadmap](/opt/aitbc/docs/beginner/02_project/2_roadmap.md)
+- [Deployment Guide](/opt/aitbc/docs/advanced/04_deployment/0_index.md)
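
Review sketch (not part of this commit): the alert levels listed under "SLA Metrics" in `sla-monitoring.md` can be read as a small classification function. The Python below is a minimal illustration of that reading only. `classify_violation` and `DEFAULT_THRESHOLDS` are hypothetical names, and the actual logic in `sla_collector.py` is not shown in this diff and may differ.

```python
from typing import Dict, Optional

# Default thresholds copied from the settings example in the guide.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "uptime_pct": 95.0,
    "response_time_ms": 1000.0,
    "completion_rate_pct": 90.0,
    "capacity_availability_pct": 80.0,
}


def classify_violation(
    metric_type: str,
    value: float,
    thresholds: Optional[Dict[str, float]] = None,
) -> Optional[str]:
    """Return "critical", "high", or None (no violation) for one sample.

    Hypothetical helper: it mirrors the alert levels documented in the guide.
    """
    threshold = (thresholds or DEFAULT_THRESHOLDS)[metric_type]

    if metric_type == "uptime_pct":
        # Critical below threshold * 0.9, high below the threshold itself.
        if value < threshold * 0.9:
            return "critical"
        return "high" if value < threshold else None

    if metric_type == "response_time_ms":
        # Critical above threshold * 2, high above the threshold itself.
        if value > threshold * 2:
            return "critical"
        return "high" if value > threshold else None

    if metric_type == "completion_rate_pct":
        # The guide lists only a critical level for this metric.
        return "critical" if value < threshold else None

    if metric_type == "capacity_availability_pct":
        # The guide lists only a high level for this metric.
        return "high" if value < threshold else None

    raise ValueError(f"unknown metric type: {metric_type}")
```

With the defaults, an uptime sample of 90.0 would classify as `high` and one of 80.0 as `critical`. Passing a custom `thresholds` dict stands in for the per-metric configuration that the guide exposes through `sla_thresholds`.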
--- a/docs/beginner/02_project/2_roadmap.md
+++ b/docs/beginner/02_project/2_roadmap.md
@@ -797,6 +797,48 @@ Operations (see docs/10_plan/00_nextMileston.md)

 - **Git & Repository Management**
  - ✅ Fixed gitea pull conflicts on aitbc1
+  - ✅ Successfully pulled latest changes from gitea (fast-forward)
+  - ✅ Both nodes now up to date with origin/main
+
+## Stage 30 — ait-mainnet Migration & Cross-Node Blockchain Tests [COMPLETED: 2026-04-22]
+
+- **ait-mainnet Chain Migration**
+  - ✅ Migrated all blockchain nodes from ait-devnet to ait-mainnet
+  - ✅ Updated `/etc/aitbc/.env` on aitbc: CHAIN_ID=ait-mainnet (already configured)
+  - ✅ Updated `/etc/aitbc/.env` on aitbc1: CHAIN_ID=ait-mainnet (changed from ait-devnet)
+  - ✅ Updated `/etc/aitbc/.env` on gitea-runner: CHAIN_ID=ait-mainnet (changed from ait-devnet)
+  - ✅ All three nodes now on same blockchain (ait-mainnet)
+  - ✅ Updated blockchain node configuration: supported_chains from "ait-devnet" to "ait-mainnet"
+
+- **Cross-Node Blockchain Tests**
+  - ✅ Created comprehensive cross-node test suite
+  - ✅ File: `/opt/aitbc/tests/verification/test_cross_node_blockchain.py`
+  - ✅ Tests: Chain ID Consistency, Block Synchronization, Block Range Query, RPC Connectivity
+  - ✅ Tests all three nodes: aitbc, aitbc1, gitea-runner
+  - ✅ Verifies chain_id consistency via SSH configuration check
+  - ✅ Tests block import functionality and RPC connectivity
+  - ✅ All 4 tests passing across 3 nodes
+
+- **Test File Updates for ait-mainnet**
+  - ✅ test_tx_import.py: Updated CHAIN_ID and endpoint path
+  - ✅ test_simple_import.py: Updated CHAIN_ID and endpoint path
+  - ✅ test_minimal.py: Updated CHAIN_ID and endpoint path
+  - ✅ test_block_import.py: Updated CHAIN_ID and endpoint path
+  - ✅ test_block_import_complete.py: Updated CHAIN_ID and endpoint path
+  - ✅ All tests now include chain_id in block data payloads
+
+- **SQLite Database Corruption Fix**
+  - ✅ Fixed SQLite corruption on aitbc1 caused by Btrfs CoW behavior
+  - ✅ Applied `chattr +C` to `/var/lib/aitbc/data` to disable CoW
+  - ✅ Cleared corrupted database files (chain.db*)
+  - ✅ Restarted aitbc-blockchain-node.service
+  - ✅ Service now running successfully without corruption errors
+
+- **Network Connectivity Fixes**
+  - ✅ Corrected aitbc1 RPC URL from 10.0.3.107:8006 to 10.1.223.40:8006
+  - ✅ Added gitea-runner RPC URL: 10.1.223.93:8006
+  - ✅ All nodes now reachable via RPC endpoints
+  - ✅ Cross-node tests verify connectivity between all nodes
  - ✅ Stashed local changes causing conflicts in blockchain files
  - ✅ Successfully pulled latest changes from gitea (fast-forward)
  - ✅ Both nodes now up to date with origin/main
@@ -811,7 +853,97 @@ Operations (see docs/10_plan/00_nextMileston.md)
  - ✅ File: `services/agent_daemon.py`
  - ✅ Systemd service: `systemd/aitbc-agent-daemon.service`

-## Current Status: Multi-Node Blockchain Synchronization Complete
+## Stage 31 — SLA-Backed Coordinator/Pool Hubs [COMPLETED: 2026-04-22]
+
+- **Coordinator-API SLA Monitoring Extension**
+  - ✅ Extended `marketplace_monitor.py` with pool-hub specific SLA metrics
+  - ✅ Added miner uptime tracking, response time tracking, job completion rate tracking
+  - ✅ Added capacity availability tracking
+  - ✅ Integrated pool-hub MinerStatus for latency data
+  - ✅ Extended `_evaluate_alerts()` for pool-hub SLA violations
+  - ✅ Added pool-hub specific alert thresholds
+
+- **Capacity Planning Infrastructure Enhancement**
+  - ✅ Extended `system_maintenance.py` capacity planning
+  - ✅ Added `_collect_pool_hub_capacity()` method
+  - ✅ Enhanced `_perform_capacity_planning()` to consume pool-hub data
+  - ✅ Added pool-hub metrics to capacity results
+  - ✅ Added pool-hub specific scaling recommendations
+
+- **Pool-Hub Models Extension**
+  - ✅ Added `SLAMetric` model for tracking miner SLA data
+  - ✅ Added `SLAViolation` model for SLA breach tracking
+  - ✅ Added `CapacitySnapshot` model for capacity planning data
+  - ✅ Extended `MinerStatus` with uptime_pct and last_heartbeat_at fields
+  - ✅ Added indexes for SLA queries
+
+- **SLA Metrics Collection Service**
+  - ✅ Created `sla_collector.py` service
+  - ✅ Implemented miner uptime tracking based on heartbeat intervals
+  - ✅ Implemented response time tracking from match results
+  - ✅ Implemented job completion rate tracking from feedback
+  - ✅ Implemented capacity availability tracking
+  - ✅ Added SLA threshold configuration per metric type
+  - ✅ Added automatic violation detection
+  - ✅ Added Prometheus metrics exposure
+  - ✅ Created `SLACollectorScheduler` for automated collection
+
+- **Coordinator-API Billing Integration**
+  - ✅ Created `billing_integration.py` service
+  - ✅ Implemented usage data aggregation from pool-hub to coordinator-api
+  - ✅ Implemented tenant mapping (pool-hub miners to coordinator-api tenants)
+  - ✅ Implemented billing event emission via HTTP API
+  - ✅ Leveraged existing ServiceConfig pricing schemas
+  - ✅ Integrated with existing quota enforcement
+  - ✅ Created `BillingIntegrationScheduler` for automated sync
+
+- **API Endpoints**
+  - ✅ Created `sla.py` router with comprehensive endpoints
+  - ✅ `GET /sla/metrics/{miner_id}` - Get SLA metrics for a miner
+  - ✅ `GET /sla/metrics` - Get SLA metrics across all miners
+  - ✅ `GET /sla/violations` - Get SLA violations
+  - ✅ `POST /sla/metrics/collect` - Trigger SLA metrics collection
+  - ✅ `GET /sla/capacity/snapshots` - Get capacity planning snapshots
+  - ✅ `GET /sla/capacity/forecast` - Get capacity forecast
+  - ✅ `GET /sla/capacity/recommendations` - Get scaling recommendations
+  - ✅ `POST /sla/capacity/alerts/configure` - Configure capacity alerts
+  - ✅ `GET /sla/billing/usage` - Get billing usage data
+  - ✅ `POST /sla/billing/sync` - Trigger billing sync with coordinator-api
+  - ✅ `POST /sla/billing/usage/record` - Record usage event
+  - ✅ `POST /sla/billing/invoice/generate` - Trigger invoice generation
+  - ✅ `GET /sla/status` - Get overall SLA status
+
+- **Configuration and Settings**
+  - ✅ Added coordinator-api billing URL configuration
+  - ✅ Added coordinator-api API key configuration
+  - ✅ Added SLA threshold configurations
+  - ✅ Added capacity planning parameters
+  - ✅ Added billing sync interval configuration
+  - ✅ Added SLA collection interval configuration
+
+- **Database Migrations**
+  - ✅ Created migration `b2a1c4d5e6f7_add_sla_and_capacity_tables.py`
+  - ✅ Added SLA-related tables (sla_metrics, sla_violations)
+  - ✅ Added capacity planning table (capacity_snapshots)
+  - ✅ Extended miner_status with uptime_pct and last_heartbeat_at
+  - ✅ Added indexes for performance
+  - ✅ Added foreign key constraints
+
+- **Testing**
+  - ✅ Created `test_sla_collector.py` - SLA collection tests
+  - ✅ Created `test_billing_integration.py` - Billing integration tests
+  - ✅ Created `test_sla_endpoints.py` - API endpoint tests
+  - ✅ Created `test_integration_coordinator.py` - Integration tests
+  - ✅ Added comprehensive test coverage for SLA and billing features
+
+- **Documentation**
+  - ✅ Updated `apps/pool-hub/README.md` with SLA and billing documentation
+  - ✅ Added configuration examples
+  - ✅ Added API endpoint documentation
+  - ✅ Added database migration instructions
+  - ✅ Added testing instructions
+
+## Current Status: SLA-Backed Coordinator/Pool Hubs Complete

 **Milestone Achievement**: Successfully fixed multi-node blockchain
 synchronization issues between aitbc and aitbc1. Both nodes are now in sync with
--- a/docs/beginner/02_project/5_done.md
+++ b/docs/beginner/02_project/5_done.md
@@ -837,6 +837,60 @@ operational.
  - Includes troubleshooting steps and verification procedures

 - ✅ **OpenClaw Cross-Node Communication Documentation** - Added agent
+  communication workflow documentation
+  - File: `docs/openclaw/openclaw-cross-node-communication.md`
+  - Documents agent-to-agent communication via AITBC blockchain transactions
+  - Includes setup, testing, and troubleshooting procedures
+
+## Recent Updates (2026-04-22)
+
+### ait-mainnet Migration Complete ✅
+
+- ✅ **All Nodes Migrated to ait-mainnet** - Successfully migrated all blockchain nodes
+  from ait-devnet to ait-mainnet
+  - Updated `/etc/aitbc/.env` on aitbc: CHAIN_ID=ait-mainnet (already configured)
+  - Updated `/etc/aitbc/.env` on aitbc1: CHAIN_ID=ait-mainnet (changed from ait-devnet)
+  - Updated `/etc/aitbc/.env` on gitea-runner: CHAIN_ID=ait-mainnet (changed from ait-devnet)
+  - All three nodes now on same blockchain (ait-mainnet)
+
+- ✅ **Cross-Node Blockchain Tests Created** - New test suite for multi-node blockchain
+  features
+  - File: `/opt/aitbc/tests/verification/test_cross_node_blockchain.py`
+  - Tests: Chain ID Consistency, Block Synchronization, Block Range Query, RPC
+    Connectivity
+  - Tests all three nodes: aitbc, aitbc1, gitea-runner
+  - Verifies chain_id consistency via SSH configuration check
+  - Tests block import functionality and RPC connectivity
+  - All 4 tests passing across 3 nodes
+
+- ✅ **Test Files Updated for ait-mainnet** - Updated all verification tests to use
+  ait-mainnet chain_id
+  - test_tx_import.py: Updated CHAIN_ID and endpoint path
+  - test_simple_import.py: Updated CHAIN_ID and endpoint path
+  - test_minimal.py: Updated CHAIN_ID and endpoint path
+  - test_block_import.py: Updated CHAIN_ID and endpoint path
+  - test_block_import_complete.py: Updated CHAIN_ID and endpoint path
+  - All tests now include chain_id in block data payloads
+
+- ✅ **SQLite Database Corruption Fixed on aitbc1** - Resolved database corruption
+  issue
+  - Root cause: Btrfs copy-on-write (CoW) behavior causing SQLite corruption
+  - Fix: Applied `chattr +C` to `/var/lib/aitbc/data` to disable CoW
+  - Cleared corrupted database files (chain.db*)
+  - Restarted aitbc-blockchain-node.service
+  - Service now running successfully without corruption errors
+
+- ✅ **Network Connectivity Fixes** - Fixed cross-node RPC connectivity
+  - Corrected aitbc1 RPC URL from 10.0.3.107:8006 to 10.1.223.40:8006
+  - Added gitea-runner RPC URL: 10.1.223.93:8006
+  - All nodes now reachable via RPC endpoints
+  - Cross-node tests verify connectivity between all nodes
+
+- ✅ **Blockchain Configuration Updates** - Updated blockchain node configuration
+  - File: `/opt/aitbc/apps/blockchain-node/src/aitbc_chain/config.py`
+  - Changed supported_chains from "ait-devnet" to "ait-mainnet"
+  - All nodes now support ait-mainnet chain
+  - Blockchain node services restarted with new configuration
  communication guides
  - File: `docs/openclaw/guides/openclaw_cross_node_communication.md`
  - File: `docs/openclaw/training/cross_node_communication_training.md`
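
Review sketch (not part of this commit): both `2_roadmap.md` and `5_done.md` describe the cross-node suite as verifying chain_id consistency through a configuration check on each node. The Python below is a minimal illustration of that comparison, assuming the `.env` contents have already been fetched from each node as text. `parse_chain_id` and `find_chain_id_mismatches` are hypothetical helpers, not functions from `test_cross_node_blockchain.py`, and the sample input is illustrative rather than captured from the nodes.

```python
from typing import Dict, Optional


def parse_chain_id(env_text: str) -> Optional[str]:
    """Extract the CHAIN_ID value from .env-style text, or None if absent."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "CHAIN_ID":
            # Tolerate optional surrounding quotes around the value.
            return value.strip().strip('"').strip("'")
    return None


def find_chain_id_mismatches(
    env_by_node: Dict[str, str],
    expected: str = "ait-mainnet",
) -> Dict[str, Optional[str]]:
    """Map each node whose CHAIN_ID differs from `expected` to the value found."""
    mismatches: Dict[str, Optional[str]] = {}
    for node, env_text in env_by_node.items():
        found = parse_chain_id(env_text)
        if found != expected:
            mismatches[node] = found
    return mismatches


if __name__ == "__main__":
    # Illustrative input: one node still carries the pre-migration value.
    sample = {
        "aitbc": "CHAIN_ID=ait-mainnet\n",
        "aitbc1": "# migrated\nCHAIN_ID=ait-mainnet\n",
        "gitea-runner": "CHAIN_ID=ait-devnet\n",
    }
    print(find_chain_id_mismatches(sample))
```

An empty result means every node reports the expected chain. A node that lacks a `CHAIN_ID` line appears in the result with `None`, which keeps a missing setting distinct from a wrong one.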