Production Rollback Procedures
Emergency Rollback Guide
Use these procedures when a deployment causes critical issues in production.
Immediate Actions (First 5 minutes)
- Assess the Impact
  - Check monitoring dashboards
  - Review error logs
  - Identify affected services
  - Determine if rollback is necessary
- Communicate
  - Notify team in #production-alerts
  - Post status on status page if needed
  - Document start time of incident
Automated Rollback (if available)
# Quick rollback to previous version
./scripts/rollback-to-previous.sh
# Rollback to specific version
./scripts/rollback-to-version.sh v1.2.3
Manual Rollback Steps
1. Stop Current Services
# Stop all AITBC services
sudo systemctl stop aitbc-coordinator
sudo systemctl stop aitbc-node
sudo systemctl stop aitbc-miner
sudo systemctl stop aitbc-dashboard
sudo docker-compose down
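The stop commands above do not confirm that each unit actually went down. A minimal sketch of a stop-and-verify loop follows; the function name `stop_services` and the `SYSTEMCTL` override variable are hypothetical, introduced only so the loop can be dry-run without touching real units.

```shell
# Stop each named unit, then confirm it is no longer active.
# SYSTEMCTL is a hypothetical override (defaults to "sudo systemctl")
# so the loop can be exercised against a stub.
stop_services() {
  local ctl="${SYSTEMCTL:-sudo systemctl}"
  local failed=0
  local svc
  for svc in "$@"; do
    $ctl stop "$svc"
    # "is-active --quiet" exits 0 only while the unit is still active
    if $ctl is-active --quiet "$svc"; then
      echo "still running: $svc" >&2
      failed=1
    else
      echo "stopped: $svc"
    fi
  done
  return "$failed"
}
```

A call such as `stop_services aitbc-coordinator aitbc-node aitbc-miner aitbc-dashboard` returns non-zero if any unit is still active, which makes it easy to halt the rollback before touching code or data.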
2. Restore Previous Code
# Get previous deployment tag
git tag --sort=-version:refname | head -n 5
# Check out the previous stable version (v1.2.3 is an example tag)
git checkout v1.2.3
# Rebuild if necessary
docker-compose build --no-cache
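Picking the rollback target from the tag list can be sketched as below. This assumes the previous deployment is simply the tag that sorts immediately below the current one in version order, which may not hold if some tags were never deployed; `previous_tag` is a hypothetical helper name.

```shell
# Read a newest-first tag list on stdin and print the entry that
# follows the given tag, i.e. the next older version.
previous_tag() {
  awk -v cur="$1" 'found { print; exit } $0 == cur { found = 1 }'
}
```

It could be used as `git tag --sort=-version:refname | previous_tag v1.2.3`, feeding the result to `git checkout`. Nothing is printed if the given tag is the oldest or is not in the list.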
3. Restore Database (if needed)
# List available backups
aws s3 ls s3://aitbc-backups/database/
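# Download the chosen backup first (latest_backup.dump stands in for the actual file name)
aws s3 cp s3://aitbc-backups/database/latest_backup.dump .
# Restore it into the production database
# Note: pg_restore does not drop existing objects unless --clean is given
pg_restore -h localhost -U postgres -d aitbc_prod latest_backup.dump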
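Choosing the newest backup from the bucket listing can be scripted. The sketch below assumes the usual `aws s3 ls` row layout (date, time, size, key) and object keys without spaces; `latest_backup` is a hypothetical helper name.

```shell
# Read "aws s3 ls" output on stdin and print the key of the most
# recently modified object. Prefix rows ("PRE dir/") are skipped.
latest_backup() {
  awk 'NF >= 4 && $1 ~ /^[0-9]+-[0-9]+-[0-9]+$/' \
    | sort -k1,1 -k2,2 \
    | tail -n 1 \
    | awk '{ print $4 }'
}
```

For example, `aws s3 ls s3://aitbc-backups/database/ | latest_backup` prints one key, which can then be downloaded and passed to `pg_restore`. Sorting by timestamp matters because the listing itself is ordered by key name, not by modification time.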
4. Restore Configuration
# Restore from backup
cp /etc/aitbc/backup/config.yaml /etc/aitbc/config.yaml
cp /etc/aitbc/backup/.env /etc/aitbc/.env
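Overwriting the live configuration discards the settings that were in place during the incident, which are often needed for root cause analysis. A minimal sketch that snapshots the current file before restoring follows; the function name and the `.pre-rollback.<timestamp>` suffix are assumptions, not an existing convention in this project.

```shell
# Copy the backup over the live file, keeping a timestamped copy of
# whatever was there before. Fails without changes if the backup is missing.
restore_with_snapshot() {
  local src="$1"
  local dest="$2"
  local stamp
  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  [ -f "$src" ] || { echo "missing backup: $src" >&2; return 1; }
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.pre-rollback.$stamp" || return 1
  fi
  cp -p "$src" "$dest"
}
```

Usage would mirror the commands above, for example `restore_with_snapshot /etc/aitbc/backup/config.yaml /etc/aitbc/config.yaml`.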
5. Restart Services
# Start services in correct order
sudo systemctl start aitbc-coordinator
sleep 10
sudo systemctl start aitbc-node
sleep 10
sudo systemctl start aitbc-miner
sleep 10
sudo systemctl start aitbc-dashboard
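The fixed `sleep 10` pauses assume each service is ready within ten seconds. One alternative, sketched here with a hypothetical `wait_until` helper, is to poll a readiness command until it succeeds or a timeout expires.

```shell
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds; returns 1 on timeout.
wait_until() {
  local timeout="$1"
  shift
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

A start step could then read `sudo systemctl start aitbc-coordinator && wait_until 30 systemctl is-active --quiet aitbc-coordinator`, moving on to the next service only once the previous one reports active. Note that "active" in systemd terms does not guarantee the application inside is ready to serve requests.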
6. Verify Rollback
# Check service status
./scripts/health-check.sh
# Run smoke tests
./scripts/smoke-test.sh
# Verify blockchain sync
curl -X POST http://localhost:8545 -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}'
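Interpreting the response can be sketched as follows, assuming the node follows the standard Ethereum JSON-RPC behaviour in which `eth_syncing` returns `false` when no sync is in progress and a progress object otherwise. The `sync_state` helper name and its output labels are hypothetical, and the pattern match is a stand-in for a proper JSON parser.

```shell
# Read a JSON-RPC response on stdin and classify it.
# Prints "not-syncing", "syncing", or "error" (with exit status 1).
sync_state() {
  local response
  response="$(cat)"
  if printf '%s' "$response" | grep -Eq '"result"[[:space:]]*:[[:space:]]*false'; then
    echo "not-syncing"
  elif printf '%s' "$response" | grep -q '"result"'; then
    echo "syncing"
  else
    echo "error"
    return 1
  fi
}
```

Piping the `curl` output above into `sync_state` gives a one-word answer. A result of `not-syncing` only means no sync is running; it is worth also confirming that the block height is advancing.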
Database-Specific Rollbacks
Partial Data Rollback
# Create backup before changes
pg_dump -h localhost -U postgres aitbc_prod > pre-rollback-backup.sql
# Roll back specific tables in a single transaction, stopping at the first error
psql -h localhost -U postgres -d aitbc_prod --single-transaction -v ON_ERROR_STOP=1 -f rollback-tables.sql
Migration Rollback
# Check migration status
./scripts/migration-status.sh
# Rollback last migration
./scripts/rollback-migration.sh
Service-Specific Rollbacks
Coordinator Service
# Restore coordinator state
sudo systemctl stop aitbc-coordinator
cp /var/lib/aitbc/coordinator/backup/state.db /var/lib/aitbc/coordinator/
sudo systemctl start aitbc-coordinator
Blockchain Node
# Reset to the last stable block (123456 is a placeholder for the actual block height)
sudo systemctl stop aitbc-node
aitbc-node --reset-to-block 123456
sudo systemctl start aitbc-node
Mining Operations
# Stop mining immediately
curl -X POST http://localhost:8080/api/mining/stop
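# Reset mining state
# Caution: FLUSHDB deletes every key in the currently selected Redis database, not only mining state
redis-cli FLUSHDB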
Verification Checklist
- All services running
- Database connectivity
- API endpoints responding
- Blockchain syncing
- Mining operations (if applicable)
- Dashboard accessible
- SSL certificates valid
- Monitoring alerts cleared
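The checklist above can be partly automated. The sketch below runs a list of labelled commands and prints a pass/fail line for each; the `run_checks` helper and its `label::command` argument format are assumptions made for illustration.

```shell
# Each argument has the form "label::command". The command runs via sh -c
# with its output discarded; the function fails if any check fails.
run_checks() {
  local failures=0
  local item label cmd
  for item in "$@"; do
    label="${item%%::*}"
    cmd="${item#*::}"
    if sh -c "$cmd" >/dev/null 2>&1; then
      echo "PASS  $label"
    else
      echo "FAIL  $label"
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

Possible entries include `"Coordinator running::systemctl is-active --quiet aitbc-coordinator"` and `"Database connectivity::pg_isready -h localhost"`. Items such as SSL certificate validity and cleared monitoring alerts depend on endpoints not listed in this guide, so they still need a manual check or project-specific commands.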
Post-Rollback Actions
- Root Cause Analysis
  - Document what went wrong
  - Identify failure point
  - Create prevention plan
- Team Communication
  - Update incident ticket
  - Share lessons learned
  - Update runbooks
- Preventive Measures
  - Add additional tests
  - Improve monitoring
  - Update deployment checklist
Contact Information
- On-call Engineer: [Phone/Slack]
- Engineering Lead: [Phone/Slack]
- DevOps Team: #devops-alerts
- Management: #management-alerts
Escalation
- Level 1: On-call engineer (first 15 minutes)
- Level 2: Engineering lead (after 15 minutes)
- Level 3: CTO (after 30 minutes)
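The escalation thresholds above map directly onto elapsed incident time. A small sketch, treating the 15 and 30 minute marks as belonging to the higher level (an assumption, since the list does not say which side the boundary falls on):

```shell
# Print the escalation level for a given number of minutes since incident start.
escalation_level() {
  local minutes="$1"
  if [ "$minutes" -lt 15 ]; then
    echo "Level 1: On-call engineer"
  elif [ "$minutes" -lt 30 ]; then
    echo "Level 2: Engineering lead"
  else
    echo "Level 3: CTO"
  fi
}
```

Combined with the incident start time recorded during the immediate actions, this could drive a reminder in the alert channel when the next level is due.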
Notes
- Always create a backup before starting a rollback
- Document every step during the rollback
- Test in staging before production if possible
- Keep stakeholders informed throughout the process