Fix edge GPU endpoints for marketplace integration
Some checks failed
Cross-Node Transaction Testing / transaction-test (push) Has been cancelled
Deploy to Testnet / deploy-testnet (push) Has been cancelled
Integration Tests / test-service-integration (push) Has been cancelled
Multi-Node Stress Testing / stress-test (push) Has been cancelled
Python Tests / test-python (push) Has been cancelled
Security Scanning / security-scan (push) Has been cancelled

- Add error handling to list_profiles, list_metrics, and seed_profiles methods
- Add logger and GPURegistry imports to edge_gpu_service.py
- Fix discover_and_register_edge_gpus to query GPURegistry instead of returning empty placeholder
- Make all EdgeGPUService methods async to match AsyncSession
- Add await to edge GPU endpoint calls in main.py
- Remove User=aitbc directive from gpu-service.service (user doesn't exist)
- Install and enable gpu-service as systemd service

All edge GPU endpoints now return proper responses instead of 500 errors:
- /v1/marketplace/edge-gpu/profiles - returns profiles or empty array
- /v1/marketplace/edge-gpu/scan/{miner_id} - returns registered GPUs from GPURegistry
- /v1/marketplace/edge-gpu/metrics/{gpu_id} - returns metrics or empty array
This commit is contained in:
aitbc
2026-05-14 18:35:03 +02:00
parent 05e2d1ec39
commit 3ac3674dc4
4 changed files with 439 additions and 38 deletions

View File

@@ -0,0 +1,351 @@
# GPU Compute Provider Skill
## Description
This skill provides a complete workflow for GPU compute providers to register, manage GPU resources, and participate in the AITBC GPU marketplace. It covers the full lifecycle from miner registration to GPU offer submission, job polling, and earnings tracking.
## Prerequisites
- GPU service running on port 8101
- Valid miner_id and node_id
- GPU specifications (model, memory, capabilities)
- Marketplace pricing information
## Endpoints
### Working Endpoints
| Endpoint | Method | Purpose | Status |
|---------|--------|---------|--------|
| `/health` | GET | Health check | ✓ Working |
| `/gpu/status` | GET | GPU service status | ✓ Working |
| `/live` | GET | Liveness check | ✓ Working |
| `/v1/miners/register` | POST | Register miner | ✓ Working |
| `/v1/transactions` | POST | Submit GPU offer | ✓ Working |
| `/v1/transactions` | GET | Query transactions | ✓ Working |
| `/v1/miners/heartbeat` | POST | Send heartbeat | ✓ Working |
| `/v1/miners/{miner_id}/gpus` | GET | Get miner GPUs | ✓ Working |
| `/v1/miners/poll` | POST | Poll for jobs | ✓ Working |
| `/v1/miners/{miner_id}/earnings` | POST | Get earnings | ✓ Working |
| `/v1/miners/{miner_id}/capabilities` | PUT | Update capabilities | ✓ Working |
| `/v1/miners/{miner_id}` | DELETE | Deregister miner | ✓ Working |
### Non-Working Endpoints
| Endpoint | Issue |
|---------|-------|
| `/ready` | Database SQL expression error |
| `/v1/marketplace/edge-gpu/profiles` | 500 error (unimplemented) |
| `/v1/marketplace/edge-gpu/scan/{miner_id}` | 404 (unimplemented) |
## Workflow Steps
### 1. Register Miner
Register a new miner in the GPU marketplace.
```bash
curl -X POST http://localhost:8101/v1/miners/register \
-H "Content-Type: application/json" \
-d '{
"miner_id": "compute_provider_001",
"node_id": "node_aitbc_genesis",
"location": "us-east"
}'
```
**Response:**
```json
{
"status": "ok",
"miner_id": "compute_provider_001",
"session_token": "token_0d1924d697bd47c3",
"gpu_count": 0
}
```
### 2. Submit GPU Offers
Create GPU offers for the marketplace.
```bash
curl -X POST http://localhost:8101/v1/transactions \
-H "Content-Type: application/json" \
-d '{
"type": "gpu_marketplace",
"action": "offer",
"offer_id": "gpu_a100_001",
"provider_node_id": "compute_provider_001",
"price_per_gpu": 35.0,
"specs": {
"model": "NVIDIA A100",
"memory_gb": 80,
"region": "us-east",
"capabilities": ["inference", "training", "fine-tuning"]
},
"status": "available"
}'
```
**Response:**
```json
{
"status": "success",
"transaction_id": "gpu_a100_001"
}
```
### 3. Query Transactions
List all GPU marketplace transactions.
```bash
curl -X GET http://localhost:8101/v1/transactions
```
**Response:**
```json
[
{
"id": "gpu_a100_001",
"action": "offer",
"model": "NVIDIA A100",
"memory_gb": 80,
"price_per_hour": 35.0,
"status": "available",
"region": "us-east",
"miner_id": "compute_provider_001",
"created_at": "2026-05-14T16:19:58.656796"
}
]
```
### 4. Get Miner GPUs
Retrieve GPUs registered by a specific miner.
```bash
curl -X GET http://localhost:8101/v1/miners/compute_provider_001/gpus
```
**Response:**
```json
[
{
"id": "gpu_a100_001",
"model": "NVIDIA A100",
"memory_gb": 80,
"status": "online",
"price_per_hour": 35.0,
"region": "us-east",
"created_at": "2026-05-14T16:19:58.656796"
}
]
```
### 5. Send Heartbeat
Send a heartbeat to keep miner status online.
```bash
curl -X POST http://localhost:8101/v1/miners/heartbeat \
-H "Content-Type: application/json" \
-d '{
"miner_id": "compute_provider_001"
}'
```
**Response:**
```json
{
"status": "ok"
}
```
### 6. Poll for Jobs
Poll for available compute jobs.
```bash
curl -X POST http://localhost:8101/v1/miners/poll \
-H "Content-Type: application/json" \
-d '{
"miner_id": "compute_provider_001",
"max_wait_seconds": 5
}'
```
**Response:**
```
null
```
*Note: Returns null when no jobs are available (placeholder implementation).*
### 7. Get Earnings
Retrieve miner earnings information.
```bash
curl -X POST http://localhost:8101/v1/miners/compute_provider_001/earnings
```
**Response:**
```json
{
"miner_id": "compute_provider_001",
"total_earnings": 0.0,
"pending_earnings": 0.0,
"currency": "AITBC"
}
```
### 8. Update Capabilities
Update miner capabilities.
```bash
curl -X PUT http://localhost:8101/v1/miners/compute_provider_001/capabilities \
-H "Content-Type: application/json" \
-d '{
"capabilities": {
"max_batch_size": 32,
"supported_models": ["llama-7b", "gpt-3.5"]
}
}'
```
**Response:**
```json
{
"status": "ok",
"miner_id": "compute_provider_001",
"capabilities": {
"max_batch_size": 32,
"supported_models": ["llama-7b", "gpt-3.5"]
}
}
```
### 9. Deregister Miner
Deregister a miner from the marketplace.
```bash
curl -X DELETE http://localhost:8101/v1/miners/compute_provider_001
```
**Response:**
```json
{
"status": "ok",
"miner_id": "compute_provider_001",
"message": "Miner deregistered"
}
```
## Complete Workflow Example
```bash
# 1. Register miner
MINER_ID="compute_provider_001"
curl -X POST http://localhost:8101/v1/miners/register \
-H "Content-Type: application/json" \
-d "{\"miner_id\": \"$MINER_ID\", \"node_id\": \"node_genesis\", \"location\": \"us-east\"}"
# 2. Submit GPU offers
curl -X POST http://localhost:8101/v1/transactions \
-H "Content-Type: application/json" \
-d '{
"type": "gpu_marketplace",
"action": "offer",
"offer_id": "gpu_a100_001",
"provider_node_id": "compute_provider_001",
"price_per_gpu": 35.0,
"specs": {
"model": "NVIDIA A100",
"memory_gb": 80,
"region": "us-east",
"capabilities": ["inference", "training"]
},
"status": "available"
}'
# 3. Query transactions
curl -X GET http://localhost:8101/v1/transactions
# 4. Get miner GPUs
curl -X GET http://localhost:8101/v1/miners/$MINER_ID/gpus
# 5. Send heartbeat
curl -X POST http://localhost:8101/v1/miners/heartbeat \
-H "Content-Type: application/json" \
-d "{\"miner_id\": \"$MINER_ID\"}"
# 6. Poll for jobs
curl -X POST http://localhost:8101/v1/miners/poll \
-H "Content-Type: application/json" \
-d "{\"miner_id\": \"$MINER_ID\", \"max_wait_seconds\": 5}"
# 7. Get earnings
curl -X POST http://localhost:8101/v1/miners/$MINER_ID/earnings
# 8. Update capabilities
curl -X PUT http://localhost:8101/v1/miners/$MINER_ID/capabilities \
-H "Content-Type: application/json" \
-d '{"capabilities": {"max_batch_size": 32, "supported_models": ["llama-7b"]}}'
# 9. Deregister miner (cleanup)
curl -X DELETE http://localhost:8101/v1/miners/$MINER_ID
```
## GPU Specifications
Common GPU models and specifications:
| Model | Memory | Typical Use Cases |
|-------|--------|-------------------|
| NVIDIA A100 | 80 GB | Training, inference, fine-tuning |
| NVIDIA H100 | 80 GB | High-performance training |
| NVIDIA RTX 4090 | 24 GB | Inference, light training |
| NVIDIA RTX 4060 Ti | 16 GB | Inference, edge computing |
## Pricing Guidelines
Recommended hourly pricing (AIT tokens):
| GPU Model | Price Range (AIT/hour) |
|-----------|----------------------|
| NVIDIA A100 | 25-40 |
| NVIDIA H100 | 35-50 |
| NVIDIA RTX 4090 | 15-25 |
| NVIDIA RTX 4060 Ti | 10-20 |
## Service Health Checks
```bash
# Health check
curl http://localhost:8101/health
# GPU status
curl http://localhost:8101/gpu/status
# Liveness check
curl http://localhost:8101/live
```
## Troubleshooting
- **/ready endpoint fails**: Database initialization issue with SQL expression. Service still operational for other endpoints.
- **Profiles endpoint 500 error**: Unimplemented feature. Not required for basic compute provider workflow.
- **Scan endpoint 404**: Unimplemented feature. Not required for basic compute provider workflow.
- **Poll returns null**: Expected behavior when no jobs are available (placeholder implementation).
- **Earnings show 0.0**: Expected for new miners with no completed jobs (placeholder implementation).
## Notes
- The `/ready` endpoint has a database SQL expression error but does not affect other functionality.
- Edge GPU profiles and scan endpoints are unimplemented and return errors.
- Job polling and earnings tracking are placeholder implementations.
- Use heartbeat regularly to keep miner status online.
- GPU offers persist in the database and can be queried via the transactions endpoint.