# AITBC Infrastructure Templates
This directory contains Terraform and Helm templates for deploying AITBC services across dev, staging, and production environments.
## Directory Structure
```
infra/
├── terraform/                 # Infrastructure as Code
│   ├── modules/               # Reusable Terraform modules
│   │   └── kubernetes/        # EKS cluster module
│   └── environments/          # Environment-specific configurations
│       ├── dev/
│       ├── staging/
│       └── prod/
└── helm/                      # Helm Charts
    ├── charts/                # Application charts
    │   ├── coordinator/       # Coordinator API chart
    │   ├── blockchain-node/   # Blockchain node chart
    │   └── monitoring/        # Monitoring stack (Prometheus, Grafana)
    └── values/                # Environment-specific values
        ├── dev.yaml
        ├── staging.yaml
        └── prod.yaml
```
## Quick Start
### Prerequisites
- Terraform >= 1.0
- Helm >= 3.0
- kubectl configured for your cluster
- AWS CLI configured (for EKS)
### Deploy Development Environment
1. **Provision Infrastructure with Terraform:**
```bash
cd infra/terraform/environments/dev
terraform init
terraform apply
```
2. **Configure kubectl:**
```bash
aws eks update-kubeconfig --name aitbc-dev --region us-west-2
```
3. **Deploy Applications with Helm:**
```bash
# Add required Helm repositories
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Paths below are relative to infra/terraform/environments/dev (the directory from step 1)
# Deploy monitoring stack
helm install monitoring ../../../helm/charts/monitoring -f ../../../helm/values/dev.yaml
# Deploy coordinator API
helm install coordinator ../../../helm/charts/coordinator -f ../../../helm/values/dev.yaml
```
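The Helm step can also be scripted so that the same commands work for any environment. The following is a minimal sketch, not part of the repository: the function names and the `INFRA_DIR` variable are hypothetical, it assumes it is run from the repository root, and it assumes each release is named after its chart directory. It only prints the commands rather than running them.

```bash
#!/usr/bin/env bash
# Sketch: resolve chart and values paths for an environment, then print
# the Helm command that would be run. Function names are hypothetical.
set -euo pipefail

INFRA_DIR="${INFRA_DIR:-infra}"   # assumption: run from the repository root

# Echo the values file for a known environment; fail on anything else.
values_file_for() {
  case "$1" in
    dev|staging|prod) echo "${INFRA_DIR}/helm/values/$1.yaml" ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Echo the helm command for a release/chart pair without executing it.
helm_command_for() {
  local release="$1" env="$2" values
  values="$(values_file_for "$env")" || return 1
  echo "helm upgrade --install ${release} ${INFRA_DIR}/helm/charts/${release} -f ${values}"
}

helm_command_for monitoring dev
helm_command_for coordinator dev
```

`helm upgrade --install` is used here instead of `helm install` so that re-running the command updates an existing release; pipe the printed lines to `bash` once they look right.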
### Environment Configurations
#### Development
- 1 replica per service
- Minimal resource allocation
- Public EKS endpoint enabled
- 7-day metrics retention
#### Staging
- 2-3 replicas per service
- Moderate resource allocation
- Autoscaling enabled
- 30-day metrics retention
- TLS with staging certificates
#### Production
- 3+ replicas per service
- High resource allocation
- Full autoscaling configuration
- 90-day metrics retention
- TLS with production certificates
- Network policies enabled
- Backup configuration enabled
## Monitoring
The monitoring stack includes:
- **Prometheus**: Metrics collection and storage
- **Grafana**: Visualization dashboards
- **Alertmanager**: Alert routing and notification
Access Grafana:
```bash
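kubectl port-forward svc/monitoring-grafana 3000:3000
# If the Service exposes a different port (the upstream Grafana chart defaults to 80),
# use 3000:<service-port> instead
# Open http://localhost:3000
# Default credentials: admin/admin (check values files for environment-specific passwords)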
```
## Scaling Guidelines
Based on benchmark results (`apps/blockchain-node/scripts/benchmark_throughput.py`):
- **Coordinator API**: Scale horizontally; plan for ~500 TPS per node
- **Blockchain Node**: Scale horizontally; plan for ~1000 TPS per node
- **Wallet Daemon**: Scale based on concurrent users
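
As a rough sizing aid, the per-node figures above translate into a replica count by ceiling division. This is a sketch under the assumption that throughput scales linearly with node count, which the benchmark may not guarantee; `replicas_needed` is a hypothetical helper.

```bash
# Sketch: replicas needed for a target throughput, using ceiling division.
# Assumes linear scaling; add headroom for real deployments.
replicas_needed() {
  local target_tps="$1" per_node_tps="$2"
  echo $(( (target_tps + per_node_tps - 1) / per_node_tps ))
}

replicas_needed 1200 500    # coordinator API at 1200 TPS
replicas_needed 2500 1000   # blockchain node at 2500 TPS
```

The result can be fed to `kubectl scale --replicas=...` or used as a starting point for the replica count in the values files.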
## Security Considerations
- Private subnets for all application workloads
- Network policies restrict traffic between services
- Secrets managed via Kubernetes Secrets
- TLS termination at ingress level
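- Pod Security Policies enforced in production (PodSecurityPolicy was removed in Kubernetes 1.25; on newer clusters use Pod Security Admission instead)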
## Backup and Recovery
- Automated daily backups of PostgreSQL databases
- EBS snapshots for persistent volumes
- Cross-region replication for production data
- Restore procedures documented in runbooks
## Cost Optimization
- Use Spot instances for non-critical workloads
- Implement cluster autoscaling
- Right-size resources based on metrics
- Schedule non-production environments to run only during business hours
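
One way to approach the business-hours schedule is a small script, run from cron or a CI schedule, that decides how many replicas a non-production workload should have right now. The sketch below is illustrative only: the function names are hypothetical, and business hours are assumed to be Monday to Friday, 08:00 to 18:00 local time.

```bash
# Sketch: decide whether a given weekday/hour is inside business hours.
# Assumption: business hours are Mon-Fri, 08:00-18:00 local time.
in_business_hours() {
  local dow="$1" hour="$2"   # dow: 1=Mon..7=Sun (date +%u); hour: 0-23
  hour=$((10#$hour))         # force base 10 so "08" is not read as octal
  [ "$dow" -le 5 ] && [ "$hour" -ge 8 ] && [ "$hour" -lt 18 ]
}

# Desired replica count for a non-production workload at that time.
desired_replicas() {
  if in_business_hours "$1" "$2"; then echo 1; else echo 0; fi
}

desired_replicas "$(date +%u)" "$(date +%H)"
```

The output could be passed to `kubectl scale deployment --all --replicas=<n> -n <namespace>`; with cluster autoscaling enabled, idle nodes are then removed once their pods are gone.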
## Troubleshooting
Common issues and solutions:
1. **Helm chart fails to install:**
   - Check if all dependencies are added
   - Verify kubectl context is correct
   - Review values files for syntax errors
2. **Prometheus not scraping metrics:**
   - Verify ServiceMonitor CRDs are installed
   - Check service annotations
   - Review network policies
3. **High memory usage:**
   - Review resource limits in values files
   - Check for memory leaks in applications
   - Consider increasing node size
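
The checks above map onto a handful of standard `helm` and `kubectl` commands. The following sketch groups them into one function per issue; the function names are hypothetical, and the ServiceMonitor CRD name assumes the Prometheus Operator is in use.

```bash
# Sketch: one diagnostic function per troubleshooting item above.
# Functions only run when called; each needs helm or kubectl on PATH.

check_helm_chart() {
  local chart="$1" values="$2"
  kubectl config current-context    # confirm the target cluster
  helm dependency list "$chart"     # missing dependencies show up here
  helm lint "$chart" -f "$values"   # catches values and template syntax errors
}

check_prometheus_scraping() {
  # Assumption: ServiceMonitors come from the Prometheus Operator CRDs.
  kubectl get crd servicemonitors.monitoring.coreos.com
  kubectl get servicemonitors --all-namespaces
  kubectl get networkpolicies --all-namespaces
}

check_memory_usage() {
  # Requires metrics-server in the cluster.
  kubectl top pods --all-namespaces --sort-by=memory
}
```

For example, `check_helm_chart infra/helm/charts/coordinator infra/helm/values/dev.yaml` run from the repository root covers the first item.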
## Contributing
When adding new services:
1. Create a new Helm chart in `helm/charts/`
2. Add environment-specific values in `helm/values/`
3. Update monitoring configuration to include new service metrics
4. Document any special requirements in this README