- Change file mode from 644 to 755 for all project files - Add chain_id parameter to get_balance RPC endpoint with default "ait-devnet" - Rename Miner.extra_meta_data to extra_metadata for consistency
322 lines
9.6 KiB
Markdown
Executable File
322 lines
9.6 KiB
Markdown
Executable File
# Backend Endpoint Implementation Roadmap - March 5, 2026
|
|
|
|
## Overview
|
|
|
|
The AITBC CLI is now fully functional with proper authentication, error handling, and command structure. However, several key backend endpoints are missing, preventing full end-to-end functionality. This roadmap outlines the required backend implementations.
|
|
|
|
## 🎯 Current Status
|
|
|
|
### ✅ CLI Status: 97% Complete
|
|
- **Authentication**: ✅ Working (API keys configured)
|
|
- **Command Structure**: ✅ Complete (all commands implemented)
|
|
- **Error Handling**: ✅ Robust (proper error messages)
|
|
- **File Operations**: ✅ Working (JSON/CSV parsing, templates)
|
|
|
|
### ⚠️ Backend Limitations: Missing Endpoints
|
|
- **Job Submission**: `/v1/jobs` endpoint not implemented
|
|
- **Agent Operations**: `/v1/agents/*` endpoints not implemented
|
|
- **Swarm Operations**: `/v1/swarm/*` endpoints not implemented
|
|
- **Various Client APIs**: History, blocks, receipts endpoints missing
|
|
|
|
## 🛠️ Required Backend Implementations
|
|
|
|
### Priority 1: Core Job Management (High Impact)
|
|
|
|
#### 1.1 Job Submission Endpoint
|
|
**Endpoint**: `POST /v1/jobs`
|
|
**Purpose**: Submit inference jobs to the coordinator
|
|
**Required Features**:
|
|
```python
|
|
@app.post("/v1/jobs", response_model=JobView, status_code=201)
|
|
async def submit_job(
|
|
req: JobCreate,
|
|
request: Request,
|
|
session: SessionDep,
|
|
client_id: str = Depends(require_client_key()),
|
|
) -> JobView:
|
|
```
|
|
|
|
**Implementation Requirements**:
|
|
- Validate job payload (type, prompt, model)
|
|
- Queue job for processing
|
|
- Return job ID and initial status
|
|
- Support TTL (time-to-live) configuration
|
|
- Rate limiting per client
|
|
|
|
#### 1.2 Job Status Endpoint
|
|
**Endpoint**: `GET /v1/jobs/{job_id}`
|
|
**Purpose**: Check job execution status
|
|
**Required Features**:
|
|
- Return current job state (queued, running, completed, failed)
|
|
- Include progress information for long-running jobs
|
|
- Support real-time status updates
|
|
|
|
#### 1.3 Job Result Endpoint
|
|
**Endpoint**: `GET /v1/jobs/{job_id}/result`
|
|
**Purpose**: Retrieve completed job results
|
|
**Required Features**:
|
|
- Return job output and metadata
|
|
- Include execution time and resource usage
|
|
- Support result caching
|
|
|
|
#### 1.4 Job History Endpoint
|
|
**Endpoint**: `GET /v1/jobs/history`
|
|
**Purpose**: List job history with filtering
|
|
**Required Features**:
|
|
- Pagination support
|
|
- Filter by status, date range, job type
|
|
- Include job metadata and results
|
|
|
|
### Priority 2: Agent Management (Medium Impact)
|
|
|
|
#### 2.1 Agent Workflow Creation
|
|
**Endpoint**: `POST /v1/agents/workflows`
|
|
**Purpose**: Create AI agent workflows
|
|
**Required Features**:
|
|
```python
|
|
@app.post("/v1/agents/workflows", response_model=AgentWorkflowView)
|
|
async def create_agent_workflow(
|
|
workflow: AgentWorkflowCreate,
|
|
session: SessionDep,
|
|
client_id: str = Depends(require_client_key()),
|
|
) -> AgentWorkflowView:
|
|
```
|
|
|
|
#### 2.2 Agent Execution
|
|
**Endpoint**: `POST /v1/agents/workflows/{agent_id}/execute`
|
|
**Purpose**: Execute agent workflows
|
|
**Required Features**:
|
|
- Workflow execution engine
|
|
- Resource allocation
|
|
- Execution monitoring
|
|
|
|
#### 2.3 Agent Status & Receipts
|
|
**Endpoints**:
|
|
- `GET /v1/agents/executions/{execution_id}`
|
|
- `GET /v1/agents/executions/{execution_id}/receipt`
|
|
**Purpose**: Monitor agent execution and get verifiable receipts
|
|
|
|
### Priority 3: Swarm Intelligence (Medium Impact)
|
|
|
|
#### 3.1 Swarm Join Endpoint
|
|
**Endpoint**: `POST /v1/swarm/join`
|
|
**Purpose**: Join agent swarms for collective optimization
|
|
**Required Features**:
|
|
```python
|
|
@app.post("/v1/swarm/join", response_model=SwarmJoinView)
|
|
async def join_swarm(
|
|
swarm_data: SwarmJoinRequest,
|
|
session: SessionDep,
|
|
client_id: str = Depends(require_client_key()),
|
|
) -> SwarmJoinView:
|
|
```
|
|
|
|
#### 3.2 Swarm Coordination
|
|
**Endpoint**: `POST /v1/swarm/coordinate`
|
|
**Purpose**: Coordinate swarm task execution
|
|
**Required Features**:
|
|
- Task distribution
|
|
- Result aggregation
|
|
- Consensus mechanisms
|
|
|
|
### Priority 4: Enhanced Client Features (Low Impact)
|
|
|
|
#### 4.1 Job Management
|
|
**Endpoints**:
|
|
- `DELETE /v1/jobs/{job_id}` (Cancel job)
|
|
- `GET /v1/jobs/{job_id}/receipt` (Job receipt)
|
|
- `GET /v1/explorer/receipts` (List receipts)
|
|
|
|
#### 4.2 Payment System
|
|
**Endpoints**:
|
|
- `POST /v1/payments` (Create payment)
|
|
- `GET /v1/payments/{payment_id}/status` (Payment status)
|
|
- `GET /v1/payments/{payment_id}/receipt` (Payment receipt)
|
|
|
|
#### 4.3 Block Integration
|
|
**Endpoint**: `GET /v1/explorer/blocks`
|
|
**Purpose**: List recent blocks for client context
|
|
|
|
## 🏗️ Implementation Strategy
|
|
|
|
### Phase 1: Core Job System (Week 1-2)
|
|
1. **Job Submission API**
|
|
- Implement basic job queue
|
|
- Add job validation and routing
|
|
- Create job status tracking
|
|
|
|
2. **Job Execution Engine**
|
|
- Connect to AI model inference
|
|
- Implement job processing pipeline
|
|
- Add result storage and retrieval
|
|
|
|
3. **Testing & Validation**
|
|
- End-to-end job submission tests
|
|
- Performance benchmarking
|
|
- Error handling validation
|
|
|
|
### Phase 2: Agent System (Week 3-4)
|
|
1. **Agent Workflow Engine**
|
|
- Workflow definition and storage
|
|
- Execution orchestration
|
|
- Resource management
|
|
|
|
2. **Agent Integration**
|
|
- Connect to AI agent frameworks
|
|
- Implement agent communication
|
|
- Add monitoring and logging
|
|
|
|
### Phase 3: Swarm Intelligence (Week 5-6)
|
|
1. **Swarm Coordination**
|
|
- Implement swarm algorithms
|
|
- Add task distribution logic
|
|
- Create result aggregation
|
|
|
|
2. **Swarm Optimization**
|
|
- Performance tuning
|
|
- Load balancing
|
|
- Fault tolerance
|
|
|
|
### Phase 4: Enhanced Features (Week 7-8)
|
|
1. **Payment Integration**
|
|
- Payment processing
|
|
- Escrow management
|
|
- Receipt generation
|
|
|
|
2. **Advanced Features**
|
|
- Batch job optimization
|
|
- Template system integration
|
|
- Advanced filtering and search
|
|
|
|
## 📊 Technical Requirements
|
|
|
|
### Database Schema Updates
|
|
```sql
|
|
-- Jobs Table
|
|
CREATE TABLE jobs (
|
|
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
|
client_id VARCHAR(255) NOT NULL,
|
|
type VARCHAR(50) NOT NULL,
|
|
payload JSONB NOT NULL,
|
|
status VARCHAR(20) DEFAULT 'queued',
|
|
result JSONB,
|
|
created_at TIMESTAMP DEFAULT NOW(),
|
|
updated_at TIMESTAMP DEFAULT NOW(),
|
|
ttl_seconds INTEGER DEFAULT 900
|
|
);
|
|
|
|
-- Agent Workflows Table
|
|
CREATE TABLE agent_workflows (
|
|
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
|
name VARCHAR(255) NOT NULL,
|
|
description TEXT,
|
|
workflow_definition JSONB NOT NULL,
|
|
client_id VARCHAR(255) NOT NULL,
|
|
created_at TIMESTAMP DEFAULT NOW()
|
|
);
|
|
|
|
-- Swarm Members Table
|
|
CREATE TABLE swarm_members (
|
|
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
|
|
swarm_id UUID NOT NULL,
|
|
agent_id VARCHAR(255) NOT NULL,
|
|
role VARCHAR(50) NOT NULL,
|
|
capability VARCHAR(100),
|
|
joined_at TIMESTAMP DEFAULT NOW()
|
|
);
|
|
```
|
|
|
|
### Service Dependencies
|
|
1. **AI Model Integration**: Connect to Ollama or other inference services
|
|
2. **Message Queue**: Redis/RabbitMQ for job queuing
|
|
3. **Storage**: Database for job and agent state
|
|
4. **Monitoring**: Metrics and logging for observability
|
|
|
|
### API Documentation
|
|
- OpenAPI/Swagger specifications
|
|
- Request/response examples
|
|
- Error code documentation
|
|
- Rate limiting information
|
|
|
|
## 🔧 Development Environment Setup
|
|
|
|
### Local Development
|
|
```bash
|
|
# Start coordinator API with job endpoints
|
|
cd /opt/aitbc/apps/coordinator-api
|
|
.venv/bin/python -m uvicorn app.main:app --reload --port 8000
|
|
|
|
# Test with CLI
|
|
aitbc client submit --prompt "test" --model gemma3:1b
|
|
```
|
|
|
|
### Testing Strategy
|
|
1. **Unit Tests**: Individual endpoint testing
|
|
2. **Integration Tests**: End-to-end workflow testing
|
|
3. **Load Tests**: Performance under load
|
|
4. **Security Tests**: Authentication and authorization
|
|
|
|
## 📈 Success Metrics
|
|
|
|
### Phase 1 Success Criteria
|
|
- [ ] Job submission working end-to-end
|
|
- [ ] 100+ concurrent job support
|
|
- [ ] <2s average job submission time
|
|
- [ ] 99.9% uptime for job APIs
|
|
|
|
### Phase 2 Success Criteria
|
|
- [ ] Agent workflow creation and execution
|
|
- [ ] Multi-agent coordination working
|
|
- [ ] Agent receipt generation
|
|
- [ ] Resource utilization optimization
|
|
|
|
### Phase 3 Success Criteria
|
|
- [ ] Swarm join and coordination
|
|
- [ ] Collective optimization results
|
|
- [ ] Swarm performance metrics
|
|
- [ ] Fault tolerance testing
|
|
|
|
### Phase 4 Success Criteria
|
|
- [ ] Payment system integration
|
|
- [ ] Advanced client features
|
|
- [ ] Full CLI functionality
|
|
- [ ] Production readiness
|
|
|
|
## 🚀 Deployment Plan
|
|
|
|
### Staging Environment
|
|
1. **Infrastructure Setup**: Deploy to staging cluster
|
|
2. **Database Migration**: Apply schema updates
|
|
3. **Service Configuration**: Configure all endpoints
|
|
4. **Integration Testing**: Full workflow testing
|
|
|
|
### Production Deployment
|
|
1. **Blue-Green Deployment**: Zero-downtime deployment
|
|
2. **Monitoring Setup**: Metrics and alerting
|
|
3. **Performance Tuning**: Optimize for production load
|
|
4. **Documentation Update**: Update API documentation
|
|
|
|
## 📝 Next Steps
|
|
|
|
### Immediate Actions (This Week)
|
|
1. **Implement Job Submission**: Start with basic `/v1/jobs` endpoint
|
|
2. **Database Setup**: Create required tables and indexes
|
|
3. **Testing Framework**: Set up automated testing
|
|
4. **CLI Integration**: Test with existing CLI commands
|
|
|
|
### Short Term (2-4 Weeks)
|
|
1. **Complete Job System**: Full job lifecycle management
|
|
2. **Agent System**: Basic agent workflow support
|
|
3. **Performance Optimization**: Optimize for production load
|
|
4. **Documentation**: Complete API documentation
|
|
|
|
### Long Term (1-2 Months)
|
|
1. **Swarm Intelligence**: Full swarm coordination
|
|
2. **Advanced Features**: Payment system, advanced filtering
|
|
3. **Production Deployment**: Full production readiness
|
|
4. **Monitoring & Analytics**: Comprehensive observability
|
|
|
|
---
|
|
|
|
**Summary**: The CLI is 97% complete and ready for production use. The main remaining work is implementing the backend endpoints to support full end-to-end functionality. This roadmap provides a clear path to 100% completion.
|