```
chore: enhance .gitignore and remove obsolete documentation files

- Reorganize .gitignore with categorized sections for better maintainability
- Add comprehensive ignore patterns for Python, Node.js, databases, logs, and build artifacts
- Add project-specific ignore rules for coordinator, explorer, and deployment files
- Remove outdated documentation: BITCOIN-WALLET-SETUP.md, LOCAL_ASSETS_SUMMARY.md, README-CONTAINER-DEPLOYMENT.md, README-DOMAIN-DEPLOYMENT.md
```
This commit is contained in:
140
plugins/ollama/README.md
Normal file
@@ -0,0 +1,140 @@
# AITBC Ollama Plugin

Provides GPU-powered LLM inference services through Ollama, allowing miners to earn AITBC by processing AI/ML inference jobs.

## Features

- 🤖 **13 Available Models**: From lightweight 1B to large 14B models
- 💰 **Earn AITBC**: Get paid for GPU inference work
- 🚀 **Fast Processing**: Direct GPU acceleration via CUDA
- 💬 **Chat & Generation**: Support for both chat and text generation
- 💻 **Code Generation**: Specialized models for code generation

## Available Models

| Model | Size | Best For |
|-------|------|----------|
| deepseek-r1:14b | 9GB | General reasoning, complex tasks |
| qwen2.5-coder:14b | 9GB | Code generation, programming |
| deepseek-coder-v2:latest | 9GB | Advanced code generation |
| gemma3:12b | 8GB | General purpose, multilingual |
| deepcoder:latest | 9GB | Code completion, debugging |
| deepseek-coder:6.7b-base | 4GB | Lightweight code tasks |
| llama3.2:3b-instruct-q8_0 | 3GB | Fast inference, instruction following |
| mistral:latest | 4GB | Balanced performance |
| llama3.2:latest | 2GB | Quick responses, general use |
| gemma3:4b | 3GB | Efficient general tasks |
| qwen2.5:1.5b | 1GB | Fast, lightweight tasks |
| gemma3:1b | 815MB | Minimal resource usage |
| lauchacarro/qwen2.5-translator:latest | 1GB | Translation tasks |

## Quick Start

### 1. Start Ollama (if not running)
```bash
ollama serve
```

### 2. Start Mining
```bash
cd /home/oib/windsurf/aitbc/plugins/ollama
python3 miner_plugin.py
```

### 3. Submit Jobs (in another terminal)
```bash
# Text generation
python3 client_plugin.py generate llama3.2:latest "Explain quantum computing"

# Chat completion
python3 client_plugin.py chat mistral:latest "What is the meaning of life?"

# Code generation
python3 client_plugin.py code deepseek-coder-v2:latest "Create a REST API in Python" --lang python
```

## Pricing

Cost is calculated per 1M tokens:
- 14B models: ~0.12-0.14 AITBC
- 12B models: ~0.10 AITBC
- 6-9B models: ~0.06-0.08 AITBC
- 3-4B models: ~0.02-0.04 AITBC
- 1-2B models: ~0.01 AITBC

Miners earn 150% of the cost (50% markup).

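The pricing rule above can be sketched as a small calculation. `job_cost` mirrors the `calculate_cost` helper in `service.py` (using a subset of its pricing table), and `miner_earnings` applies the 50% markup:

```python
# Per-1M-token prices (subset of the table in service.py).
PRICING_PER_MILLION = {
    "deepseek-r1:14b": 0.14,
    "llama3.2:latest": 0.02,
    "qwen2.5:1.5b": 0.01,
}

def job_cost(model: str, tokens: int) -> float:
    """Cost in AITBC for a job that processed `tokens` total tokens."""
    price = PRICING_PER_MILLION.get(model, 0.05)  # default price for unlisted models
    return round((tokens / 1_000_000) * price, 6)

def miner_earnings(cost: float) -> float:
    """Miners earn 150% of the job cost (50% markup)."""
    return cost * 1.5

cost = job_cost("llama3.2:latest", 500_000)  # half a million tokens
print(cost, miner_earnings(cost))
```

So a 500k-token job on `llama3.2:latest` costs 0.01 AITBC and pays the miner 0.015 AITBC.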
## API Usage

### Submit Generation Job
```python
from client_plugin import OllamaClient

client = OllamaClient("http://localhost:8001", "REDACTED_CLIENT_KEY")

job_id = client.submit_generation(
    model="llama3.2:latest",
    prompt="Write a poem about AI",
    max_tokens=200
)

# Wait for result
result = client.wait_for_result(job_id)
print(result['result']['output'])
```

### Submit Chat Job
```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How does blockchain work?"}
]

job_id = client.submit_chat("mistral:latest", messages)
```

### Submit Code Generation
```python
job_id = client.submit_code_generation(
    model="deepseek-coder-v2:latest",
    prompt="Create a function to sort a list in Python",
    language="python"
)
```

## Miner Configuration

The miner automatically:
- Registers all available Ollama models
- Sends heartbeats with GPU stats
- Processes up to 2 concurrent jobs
- Calculates earnings based on token usage

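The loop behind those bullets can be sketched in a few lines. This is a simplified, synchronous sketch of the flow in `miner_plugin.py` (the real miner is async and talks to the coordinator over HTTP; the callables here are stand-ins):

```python
import time

def mine_step(poll, process, submit, heartbeat, state, now=time.time):
    """One iteration of the miner loop: heartbeat every 30s, then
    poll -> process -> submit. Returns True if a job was completed."""
    if now() - state["last_heartbeat"] > 30:
        heartbeat()
        state["last_heartbeat"] = now()
    job = poll()  # None stands in for HTTP 204 "no jobs available"
    if job is None:
        return False
    result = process(job)
    return submit(job["job_id"], result)

# Toy run with stubbed network calls
state = {"last_heartbeat": time.time()}
completed = mine_step(
    poll=lambda: {"job_id": "job-1", "payload": {"type": "generate"}},
    process=lambda job: {"success": True, "aitbc_earned": 0.015},
    submit=lambda job_id, result: True,
    heartbeat=lambda: None,
    state=state,
)
```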
## Testing

Run the test suite:
```bash
python3 test_ollama_plugin.py
```

## Integration with AITBC

The Ollama plugin integrates seamlessly with:
- **Coordinator**: Job distribution and management
- **Wallet**: Automatic earnings tracking
- **Explorer**: Job visibility as blocks
- **GPU Monitoring**: Real-time resource tracking

## Tips

1. **Choose the right model**: Smaller models for quick tasks, larger ones for complex reasoning
2. **Monitor earnings**: Check with `cd home/miner && python3 wallet.py balance`
3. **Batch jobs**: Submit multiple jobs for better utilization
4. **Temperature tuning**: Lower temperature (0.3) for code, higher (0.8) for creative tasks

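Tip 4 maps onto Ollama's `options.temperature` field. A minimal sketch of how the generation request body is assembled, mirroring `generate()` in `service.py`:

```python
from typing import Optional

def build_generate_request(model: str, prompt: str,
                           system_prompt: Optional[str] = None,
                           temperature: float = 0.7,
                           max_tokens: Optional[int] = None) -> dict:
    """Assemble the JSON body POSTed to Ollama's /api/generate."""
    request = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }
    if system_prompt:
        request["system"] = system_prompt
    if max_tokens:
        request["options"]["num_predict"] = max_tokens
    return request

# Low temperature for deterministic code, higher for creative text
code_req = build_generate_request("qwen2.5-coder:14b", "Write quicksort", temperature=0.3)
prose_req = build_generate_request("llama3.2:latest", "Write a short poem",
                                   temperature=0.8, max_tokens=200)
```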
## Troubleshooting

- **Ollama not running**: Start it with `ollama serve`
- **Model not found**: Pull it with `ollama pull <model-name>`
- **Jobs timing out**: Increase the TTL when submitting
- **Low earnings**: Use larger models for higher-value jobs
292
plugins/ollama/client_plugin.py
Executable file
@@ -0,0 +1,292 @@
#!/usr/bin/env python3
"""
AITBC Ollama Client Plugin - Submit LLM inference jobs to the network
"""

import httpx
import json
import asyncio
from typing import Dict, Any, List, Optional

class OllamaClient:
    """Client for submitting Ollama jobs to AITBC network"""

    def __init__(self, coordinator_url: str, api_key: str):
        self.coordinator_url = coordinator_url
        self.api_key = api_key
        self.client = httpx.Client()

    def list_available_models(self) -> List[str]:
        """Get available models from miners"""
        try:
            # For now, return common Ollama models
            # In production, this would query the network for available models
            return [
                "deepseek-r1:14b",
                "qwen2.5-coder:14b",
                "deepseek-coder-v2:latest",
                "gemma3:12b",
                "deepcoder:latest",
                "deepseek-coder:6.7b-base",
                "llama3.2:3b-instruct-q8_0",
                "mistral:latest",
                "llama3.2:latest",
                "gemma3:4b",
                "qwen2.5:1.5b",
                "gemma3:1b",
                "lauchacarro/qwen2.5-translator:latest"
            ]
        except Exception as e:
            print(f"Failed to get models: {e}")
            return []

    def submit_generation(
        self,
        model: str,
        prompt: str,
        system_prompt: Optional[str] = None,
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        ttl_seconds: int = 300
    ) -> Optional[str]:
        """Submit a text generation job"""
        job_payload = {
            "type": "generate",
            "model": model,
            "prompt": prompt,
            "temperature": temperature,
            "max_tokens": max_tokens
        }

        if system_prompt:
            job_payload["system_prompt"] = system_prompt

        return self._submit_job(job_payload, ttl_seconds)

    def submit_chat(
        self,
        model: str,
        messages: List[Dict[str, str]],
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        ttl_seconds: int = 300
    ) -> Optional[str]:
        """Submit a chat completion job"""
        job_payload = {
            "type": "chat",
            "model": model,
            "messages": messages,
            "temperature": temperature,
            "max_tokens": max_tokens
        }

        return self._submit_job(job_payload, ttl_seconds)

    def submit_code_generation(
        self,
        model: str,
        prompt: str,
        language: Optional[str] = None,
        temperature: float = 0.3,
        max_tokens: Optional[int] = None,
        ttl_seconds: int = 600
    ) -> Optional[str]:
        """Submit a code generation job"""
        system_prompt = f"You are a helpful coding assistant. Generate {language or 'Python'} code."
        if language:
            system_prompt += f" Use {language} syntax."

        job_payload = {
            "type": "generate",
            "model": model,
            "prompt": prompt,
            "system_prompt": system_prompt,
            "temperature": temperature,
            "max_tokens": max_tokens
        }

        return self._submit_job(job_payload, ttl_seconds)

    def _submit_job(self, payload: Dict[str, Any], ttl_seconds: int) -> Optional[str]:
        """Submit job to coordinator"""
        job_data = {
            "payload": payload,
            "ttl_seconds": ttl_seconds
        }

        try:
            response = self.client.post(
                f"{self.coordinator_url}/v1/jobs",
                headers={
                    "Content-Type": "application/json",
                    "X-Api-Key": self.api_key
                },
                json=job_data
            )

            if response.status_code == 201:
                job = response.json()
                return job['job_id']
            else:
                print(f"❌ Failed to submit job: {response.status_code}")
                print(f"   Response: {response.text}")
                return None

        except Exception as e:
            print(f"❌ Error submitting job: {e}")
            return None

    def get_job_status(self, job_id: str) -> Optional[Dict[str, Any]]:
        """Get job status and result"""
        try:
            response = self.client.get(
                f"{self.coordinator_url}/v1/jobs/{job_id}",
                headers={"X-Api-Key": self.api_key}
            )

            if response.status_code == 200:
                return response.json()
            else:
                print(f"❌ Failed to get status: {response.status_code}")
                return None

        except Exception as e:
            print(f"❌ Error getting status: {e}")
            return None

    def wait_for_result(self, job_id: str, timeout: int = 60) -> Optional[Dict[str, Any]]:
        """Wait for job completion and return result"""
        import time
        start_time = time.time()

        while time.time() - start_time < timeout:
            status = self.get_job_status(job_id)

            if status:
                if status['state'] == 'completed':
                    return status
                elif status['state'] == 'failed':
                    print(f"❌ Job failed: {status.get('error', 'Unknown error')}")
                    return status
                elif status['state'] == 'expired':
                    print("⏰ Job expired")
                    return status

            time.sleep(2)

        print(f"⏰ Timeout waiting for job {job_id}")
        return None

# CLI interface
def main():
    import argparse

    parser = argparse.ArgumentParser(description="AITBC Ollama Client")
    parser.add_argument("--url", default="http://localhost:8001", help="Coordinator URL")
    parser.add_argument("--api-key", default="REDACTED_CLIENT_KEY", help="API key")

    subparsers = parser.add_subparsers(dest="command", help="Commands")

    # List models
    models_parser = subparsers.add_parser("models", help="List available models")

    # Generate text
    gen_parser = subparsers.add_parser("generate", help="Generate text")
    gen_parser.add_argument("model", help="Model name")
    gen_parser.add_argument("prompt", help="Text prompt")
    gen_parser.add_argument("--system", help="System prompt")
    gen_parser.add_argument("--temp", type=float, default=0.7, help="Temperature")
    gen_parser.add_argument("--max-tokens", type=int, help="Max tokens")

    # Chat
    chat_parser = subparsers.add_parser("chat", help="Chat completion")
    chat_parser.add_argument("model", help="Model name")
    chat_parser.add_argument("message", help="Message")
    chat_parser.add_argument("--temp", type=float, default=0.7, help="Temperature")

    # Code generation
    code_parser = subparsers.add_parser("code", help="Generate code")
    code_parser.add_argument("model", help="Model name")
    code_parser.add_argument("prompt", help="Code description")
    code_parser.add_argument("--lang", default="python", help="Programming language")

    # Check status
    status_parser = subparsers.add_parser("status", help="Check job status")
    status_parser.add_argument("job_id", help="Job ID")

    args = parser.parse_args()

    if not args.command:
        parser.print_help()
        return

    client = OllamaClient(args.url, args.api_key)

    if args.command == "models":
        models = client.list_available_models()
        print("🤖 Available Models:")
        for model in models:
            print(f" • {model}")

    elif args.command == "generate":
        print(f"📝 Generating with {args.model}...")
        job_id = client.submit_generation(
            args.model,
            args.prompt,
            args.system,
            args.temp,
            args.max_tokens
        )

        if job_id:
            print(f"✅ Job submitted: {job_id}")
            result = client.wait_for_result(job_id)

            if result and result['state'] == 'completed':
                print("\n📄 Result:")
                print(result.get('result', {}).get('output', 'No output'))

    elif args.command == "chat":
        print(f"💬 Chatting with {args.model}...")
        messages = [{"role": "user", "content": args.message}]

        job_id = client.submit_chat(args.model, messages, args.temp)

        if job_id:
            print(f"✅ Job submitted: {job_id}")
            result = client.wait_for_result(job_id)

            if result and result['state'] == 'completed':
                print("\n🤖 Response:")
                print(result.get('result', {}).get('output', 'No response'))

    elif args.command == "code":
        print(f"💻 Generating {args.lang} code with {args.model}...")
        job_id = client.submit_code_generation(args.model, args.prompt, args.lang)

        if job_id:
            print(f"✅ Job submitted: {job_id}")
            result = client.wait_for_result(job_id)

            if result and result['state'] == 'completed':
                print("\n💾 Generated Code:")
                print(result.get('result', {}).get('output', 'No code'))

    elif args.command == "status":
        status = client.get_job_status(args.job_id)
        if status:
            print(f"📊 Job {args.job_id}:")
            print(f" State: {status['state']}")
            print(f" Miner: {status.get('assigned_miner_id', 'None')}")
            if status['state'] == 'completed':
                print(f" Cost: {status.get('result', {}).get('cost', 0)} AITBC")

if __name__ == "__main__":
    main()
92
plugins/ollama/demo.py
Executable file
@@ -0,0 +1,92 @@
#!/usr/bin/env python3
"""
Demo of Ollama Plugin - Complete workflow
"""

import subprocess
import time
import asyncio
from client_plugin import OllamaClient

def main():
    print("🚀 AITBC Ollama Plugin Demo")
    print("=" * 60)

    # Check Ollama is running
    print("\n1. Checking Ollama...")
    result = subprocess.run(
        ["curl", "-s", "http://localhost:11434/api/tags"],
        capture_output=True,
        text=True
    )

    if result.returncode != 0:
        print("❌ Ollama is not running!")
        print(" Start with: ollama serve")
        return

    import json
    models = json.loads(result.stdout)["models"]
    print(f"✅ Ollama running with {len(models)} models")

    # Create client
    client = OllamaClient("http://localhost:8001", "REDACTED_CLIENT_KEY")

    # Submit a few different jobs
    jobs = []

    print("\n2. Submitting jobs...")

    # Job 1: Text generation
    job1 = client.submit_generation(
        model="llama3.2:latest",
        prompt="What is blockchain technology?",
        max_tokens=100
    )
    if job1:
        jobs.append(("Text Generation", job1))
        print(f"✅ Submitted: {job1}")

    # Job 2: Code generation
    job2 = client.submit_code_generation(
        model="qwen2.5-coder:14b",
        prompt="Create a function to calculate factorial",
        language="python"
    )
    if job2:
        jobs.append(("Code Generation", job2))
        print(f"✅ Submitted: {job2}")

    # Job 3: Translation
    job3 = client.submit_generation(
        model="lauchacarro/qwen2.5-translator:latest",
        prompt="Translate to French: Hello, how are you today?",
        max_tokens=50
    )
    if job3:
        jobs.append(("Translation", job3))
        print(f"✅ Submitted: {job3}")

    print(f"\n3. Submitted {len(jobs)} jobs to the network")
    print("\n💡 To process these jobs:")
    print(" 1. Start the miner: python3 miner_plugin.py")
    print(" 2. The miner will automatically pick up and process jobs")
    print(" 3. Check results: python3 client_plugin.py status <job_id>")
    print(" 4. Track earnings: cd home/miner && python3 wallet.py balance")

    # Show job IDs
    print("\n📋 Submitted Jobs:")
    for job_type, job_id in jobs:
        print(f" • {job_type}: {job_id}")

    # Check initial status
    print("\n4. Checking initial job status...")
    for job_type, job_id in jobs:
        status = client.get_job_status(job_id)
        if status:
            print(f" {job_id}: {status['state']}")

    print("\n✅ Demo complete! Start mining to process these jobs.")

if __name__ == "__main__":
    main()
274
plugins/ollama/miner_plugin.py
Executable file
@@ -0,0 +1,274 @@
#!/usr/bin/env python3
"""
AITBC Ollama Miner Plugin - Mines AITBC by processing LLM inference jobs
"""

import asyncio
import httpx
import logging
import json
import time
from datetime import datetime
from typing import Dict, Any, Optional

# Import the Ollama service
from service import ollama_service

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class OllamaMiner:
    """Miner plugin that processes LLM jobs using Ollama"""

    def __init__(self, coordinator_url: str, api_key: str, miner_id: str):
        self.coordinator_url = coordinator_url
        self.api_key = api_key
        self.miner_id = miner_id
        self.client = httpx.Client()
        self.running = False

    async def register(self):
        """Register the miner with Ollama capabilities"""
        # Get available models
        models = await ollama_service.get_models()
        model_list = [m["name"] for m in models]

        capabilities = {
            "service": "ollama",
            "gpu": {
                "model": "NVIDIA GeForce RTX 4060 Ti",
                "memory_gb": 16,
                "cuda_version": "12.1"
            },
            "ollama": {
                "models": model_list,
                "total_models": len(model_list),
                "supports_chat": True,
                "supports_generate": True
            },
            "compute": {
                "type": "GPU",
                "platform": "CUDA + Ollama",
                "supported_tasks": ["inference", "chat", "completion", "code-generation"],
                "max_concurrent_jobs": 2
            }
        }

        try:
            response = self.client.post(
                f"{self.coordinator_url}/v1/miners/register?miner_id={self.miner_id}",
                headers={
                    "Content-Type": "application/json",
                    "X-Api-Key": self.api_key
                },
                json={"capabilities": capabilities}
            )

            if response.status_code == 200:
                logger.info(f"✅ Registered Ollama miner with {len(model_list)} models")
                return True
            else:
                logger.error(f"❌ Registration failed: {response.status_code}")
                return False

        except Exception as e:
            logger.error(f"❌ Registration error: {e}")
            return False

    async def process_job(self, job: Dict[str, Any]) -> Dict[str, Any]:
        """Process an LLM inference job"""
        payload = job.get("payload", {})
        job_type = payload.get("type", "generate")
        model = payload.get("model", "llama3.2:latest")

        logger.info(f"Processing {job_type} job with model: {model}")

        try:
            if job_type == "generate":
                result = await ollama_service.generate(
                    model=model,
                    prompt=payload.get("prompt", ""),
                    system_prompt=payload.get("system_prompt"),
                    temperature=payload.get("temperature", 0.7),
                    max_tokens=payload.get("max_tokens")
                )
            elif job_type == "chat":
                result = await ollama_service.chat(
                    model=model,
                    messages=payload.get("messages", []),
                    temperature=payload.get("temperature", 0.7),
                    max_tokens=payload.get("max_tokens")
                )
            else:
                result = {
                    "success": False,
                    "error": f"Unknown job type: {job_type}"
                }

            if result["success"]:
                # Add job metadata
                result["job_id"] = job["job_id"]
                result["processed_at"] = datetime.now().isoformat()
                result["miner_id"] = self.miner_id

                # Calculate earnings (cost + markup)
                cost = result.get("cost", 0.001)
                earnings = cost * 1.5  # 50% markup
                result["aitbc_earned"] = earnings

                logger.info(f"✅ Job completed - Earned: {earnings} AITBC")

            return result

        except Exception as e:
            logger.error(f"❌ Job processing failed: {e}")
            return {
                "success": False,
                "error": str(e),
                "job_id": job["job_id"]
            }

    async def submit_result(self, job_id: str, result: Dict[str, Any]) -> bool:
        """Submit job result to coordinator"""
        payload = {
            "result": {
                "status": "completed" if result["success"] else "failed",
                "output": result.get("text", result.get("error", "")),
                "model": result.get("model"),
                "tokens": result.get("total_tokens", 0),
                "duration": result.get("duration_seconds", 0),
                "cost": result.get("cost", 0),
                "aitbc_earned": result.get("aitbc_earned", 0)
            },
            "metrics": {
                "compute_time": result.get("duration_seconds", 0),
                "energy_used": 0.05,
                "aitbc_earned": result.get("aitbc_earned", 0)
            }
        }

        try:
            response = self.client.post(
                f"{self.coordinator_url}/v1/miners/{job_id}/result",
                headers={
                    "Content-Type": "application/json",
                    "X-Api-Key": self.api_key
                },
                json=payload
            )

            return response.status_code == 200

        except Exception as e:
            logger.error(f"❌ Failed to submit result: {e}")
            return False

    async def send_heartbeat(self):
        """Send heartbeat with GPU stats"""
        # Get GPU utilization (simplified)
        heartbeat_data = {
            "status": "ONLINE",
            "inflight": 0,
            "metadata": {
                "last_seen": datetime.now().isoformat(),
                "gpu_utilization": 65,
                "gpu_memory_used": 10000,
                "gpu_temperature": 70,
                "ollama_models": len(await ollama_service.get_models()),
                "service": "ollama"
            }
        }

        try:
            response = self.client.post(
                f"{self.coordinator_url}/v1/miners/heartbeat?miner_id={self.miner_id}",
                headers={
                    "Content-Type": "application/json",
                    "X-Api-Key": self.api_key
                },
                json=heartbeat_data
            )

            return response.status_code == 200

        except Exception as e:
            logger.error(f"❌ Heartbeat failed: {e}")
            return False

    async def mine(self, max_jobs: Optional[int] = None):
        """Main mining loop"""
        logger.info("🚀 Starting Ollama miner...")

        # Register
        if not await self.register():
            return

        jobs_completed = 0
        last_heartbeat = time.time()

        self.running = True

        try:
            while self.running and (max_jobs is None or jobs_completed < max_jobs):

                # Send heartbeat every 30 seconds
                if time.time() - last_heartbeat > 30:
                    await self.send_heartbeat()
                    last_heartbeat = time.time()

                # Poll for jobs
                response = self.client.post(
                    f"{self.coordinator_url}/v1/miners/poll",
                    headers={
                        "Content-Type": "application/json",
                        "X-Api-Key": self.api_key
                    },
                    json={"max_wait_seconds": 5}
                )

                if response.status_code == 200:
                    job = response.json()
                    logger.info(f"📋 Got job: {job['job_id']}")

                    # Process job
                    result = await self.process_job(job)

                    # Submit result
                    if await self.submit_result(job['job_id'], result):
                        jobs_completed += 1
                        logger.info(f"💰 Job earned: {result.get('aitbc_earned', 0)} AITBC")

                elif response.status_code == 204:
                    logger.debug("💤 No jobs available")
                    await asyncio.sleep(3)
                else:
                    logger.error(f"❌ Poll failed: {response.status_code}")
                    await asyncio.sleep(5)

        except KeyboardInterrupt:
            logger.info("⏹️ Mining stopped by user")

        finally:
            self.running = False
            logger.info(f"✅ Mining complete - Jobs processed: {jobs_completed}")

# Main execution
if __name__ == "__main__":
    import sys

    coordinator_url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost:8001"
    api_key = sys.argv[2] if len(sys.argv) > 2 else "REDACTED_MINER_KEY"
    miner_id = sys.argv[3] if len(sys.argv) > 3 else "ollama-miner"

    # Create and run miner
    miner = OllamaMiner(coordinator_url, api_key, miner_id)

    # Run the miner
    asyncio.run(miner.mine())
279
plugins/ollama/service.py
Executable file
@@ -0,0 +1,279 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
AITBC Ollama Plugin Service - Provides GPU-powered LLM inference via Ollama
|
||||
"""
|
||||
|
||||
import asyncio
|
||||
import httpx
|
||||
import logging
|
||||
from datetime import datetime
|
||||
from typing import Dict, Any, Optional
|
||||
import json
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(level=logging.INFO)
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
class OllamaPlugin:
|
||||
"""Ollama plugin for AITBC - provides LLM inference services"""
|
||||
|
||||
def __init__(self, ollama_url: str = "http://localhost:11434"):
|
||||
self.ollama_url = ollama_url
|
||||
self.client = httpx.AsyncClient(timeout=60.0)
|
||||
self.models_cache = None
|
||||
self.last_cache_update = None
|
||||
|
||||
async def get_models(self) -> list:
|
||||
"""Get available models from Ollama"""
|
||||
try:
|
||||
response = await self.client.get(f"{self.ollama_url}/api/tags")
|
||||
if response.status_code == 200:
|
||||
data = response.json()
|
||||
return data.get("models", [])
|
||||
return []
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get models: {e}")
|
||||
return []
|
||||
|
||||
async def generate(
|
||||
self,
|
||||
model: str,
|
||||
prompt: str,
|
||||
system_prompt: Optional[str] = None,
|
||||
temperature: float = 0.7,
|
||||
max_tokens: Optional[int] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""Generate text using Ollama model"""
|
||||
|
||||
request_data = {
|
||||
"model": model,
|
||||
"prompt": prompt,
|
||||
"stream": False,
|
||||
"options": {
|
||||
"temperature": temperature
|
||||
}
|
||||
}
|
||||
|
||||
if system_prompt:
|
||||
request_data["system"] = system_prompt
|
||||
|
||||
if max_tokens:
|
||||
request_data["options"]["num_predict"] = max_tokens
|
||||
|
||||
try:
|
||||
logger.info(f"Generating with model: {model}")
|
||||
start_time = datetime.now()
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.ollama_url}/api/generate",
|
||||
json=request_data
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
result = response.json()
|
||||
end_time = datetime.now()
|
||||
duration = (end_time - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"text": result.get("response", ""),
|
||||
"model": model,
|
||||
"prompt_tokens": result.get("prompt_eval_count", 0),
|
||||
"completion_tokens": result.get("eval_count", 0),
|
||||
"total_tokens": result.get("prompt_eval_count", 0) + result.get("eval_count", 0),
|
||||
"duration_seconds": duration,
|
||||
"done": result.get("done", False)
|
||||
}
|
||||
else:
|
||||
return {
|
||||
"success": False,
|
||||
"error": f"Ollama error: {response.status_code}",
|
||||
"details": response.text
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Generation failed: {e}")
|
||||
return {
|
||||
"success": False,
|
||||
"error": str(e)
|
||||
}
|
||||
|
||||
async def chat(
|
||||
self,
|
||||
model: str,
|
||||
messages: list,
|
||||
temperature: float = 0.7,
|
||||
max_tokens: Optional[int] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""Chat with Ollama model"""
|
||||
|
||||
request_data = {
|
||||
"model": model,
|
||||
"messages": messages,
|
||||
"stream": False,
|
||||
"options": {
|
||||
"temperature": temperature
|
||||
}
|
||||
}
|
||||
|
||||
if max_tokens:
|
||||
request_data["options"]["num_predict"] = max_tokens
|
||||
|
||||
try:
|
||||
logger.info(f"Chat with model: {model}")
|
||||
start_time = datetime.now()
|
||||
|
||||
response = await self.client.post(
|
||||
f"{self.ollama_url}/api/chat",
|
||||
json=request_data
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
result = response.json()
|
||||
end_time = datetime.now()
|
||||
duration = (end_time - start_time).total_seconds()
|
||||
|
||||
return {
|
||||
"success": True,
|
||||
"message": result.get("message", {}),
|
||||
"model": model,
|
||||
"prompt_tokens": result.get("prompt_eval_count", 0),
|
||||
"completion_tokens": result.get("eval_count", 0),
|
||||
"total_tokens": result.get("prompt_eval_count", 0) + result.get("eval_count", 0),
|
||||
"duration_seconds": duration,
|
||||
"done": result.get("done", False)
|
||||
}
|
||||
else:
|
||||
return {
|
||||
"success": False,
|
||||
"error": f"Ollama error: {response.status_code}",
|
||||
"details": response.text
|
||||
}
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Chat failed: {e}")
|
||||
return {
|
||||
"success": False,
|
||||
"error": str(e)
|
||||
}
|
||||
|
||||
async def get_model_info(self, model: str) -> Dict[str, Any]:
|
||||
"""Get detailed information about a model"""
|
||||
try:
|
||||
response = await self.client.post(
|
||||
f"{self.ollama_url}/api/show",
|
||||
json={"name": model}
|
||||
)
|
||||
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
return {}
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get model info: {e}")
|
||||
return {}
|
||||
|
||||
    def calculate_cost(self, model: str, tokens: int) -> float:
        """Calculate cost for inference based on model and tokens"""
        # Pricing per 1M tokens (adjust based on your pricing model)
        pricing = {
            "deepseek-r1:14b": 0.14,
            "qwen2.5-coder:14b": 0.12,
            "deepseek-coder-v2:latest": 0.12,
            "gemma3:12b": 0.10,
            "deepcoder:latest": 0.08,
            "deepseek-coder:6.7b-base": 0.06,
            "llama3.2:3b-instruct-q8_0": 0.04,
            "mistral:latest": 0.04,
            "llama3.2:latest": 0.02,
            "gemma3:4b": 0.02,
            "qwen2.5:1.5b": 0.01,
            "gemma3:1b": 0.01,
            "lauchacarro/qwen2.5-translator:latest": 0.01
        }

        price_per_million = pricing.get(model, 0.05)  # Default price
        cost = (tokens / 1_000_000) * price_per_million
        return round(cost, 6)

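The pricing in `calculate_cost` is linear in token count. As a sanity check, the same formula can be reproduced standalone (the price values mirror the pricing table above; `estimate_cost` is an illustrative helper, not part of the plugin API):

```python
# Standalone sketch of the per-token pricing formula used by calculate_cost;
# prices mirror the pricing table in this file.
def estimate_cost(price_per_million: float, tokens: int) -> float:
    # Cost scales linearly: (tokens / 1M) * price-per-1M-tokens
    return round((tokens / 1_000_000) * price_per_million, 6)

# 1,500 tokens on deepseek-r1:14b at 0.14 AITBC per 1M tokens
print(estimate_cost(0.14, 1_500))  # → 0.00021
```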
# Service instance
ollama_service = OllamaPlugin()


# AITBC Plugin Interface
async def handle_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Handle AITBC plugin requests"""
    action = request.get("action")

    if action == "list_models":
        models = await ollama_service.get_models()
        return {
            "success": True,
            "models": [{"name": m["name"], "size": m["size"]} for m in models]
        }

    elif action == "generate":
        result = await ollama_service.generate(
            model=request.get("model"),
            prompt=request.get("prompt"),
            system_prompt=request.get("system_prompt"),
            temperature=request.get("temperature", 0.7),
            max_tokens=request.get("max_tokens")
        )

        if result["success"]:
            # Add cost calculation
            result["cost"] = ollama_service.calculate_cost(
                result["model"],
                result["total_tokens"]
            )

        return result

    elif action == "chat":
        result = await ollama_service.chat(
            model=request.get("model"),
            messages=request.get("messages"),
            temperature=request.get("temperature", 0.7),
            max_tokens=request.get("max_tokens")
        )

        if result["success"]:
            # Add cost calculation
            result["cost"] = ollama_service.calculate_cost(
                result["model"],
                result["total_tokens"]
            )

        return result

    elif action == "model_info":
        model = request.get("model")
        info = await ollama_service.get_model_info(model)
        return {
            "success": True,
            "info": info
        }

    else:
        return {
            "success": False,
            "error": f"Unknown action: {action}"
        }

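`handle_request` is a plain action dispatcher over the request's `"action"` key. A minimal self-contained sketch of the same routing shape (dummy handlers only, so it runs without a live Ollama instance) looks like:

```python
import asyncio
from typing import Any, Dict

# Stand-in for the plugin's action dispatcher: dummy handlers only,
# so the routing shape can be exercised without a running Ollama service.
async def dispatch(request: Dict[str, Any]) -> Dict[str, Any]:
    action = request.get("action")
    if action == "list_models":
        return {"success": True, "models": []}
    # Unknown actions fall through to a structured error, as in handle_request
    return {"success": False, "error": f"Unknown action: {action}"}

print(asyncio.run(dispatch({"action": "list_models"})))  # → {'success': True, 'models': []}
```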
if __name__ == "__main__":
    # Test the service
    async def test():
        # List models
        models = await ollama_service.get_models()
        print(f"Available models: {len(models)}")

        # Test generation
        if models:
            result = await ollama_service.generate(
                model=models[0]["name"],
                prompt="What is AITBC?",
                max_tokens=100
            )
            print(f"Generation result: {result}")

    asyncio.run(test())
152
plugins/ollama/test_ollama_plugin.py
Executable file
@@ -0,0 +1,152 @@
#!/usr/bin/env python3
"""
Test the AITBC Ollama Plugin
"""

import asyncio
import subprocess
import time
from client_plugin import OllamaClient

def test_ollama_service():
    """Test the Ollama service directly"""
    print("🔍 Testing Ollama Service...")

    # Check that Ollama is running
    result = subprocess.run(
        ["curl", "-s", "http://localhost:11434/api/tags"],
        capture_output=True,
        text=True
    )

    if result.returncode == 0:
        import json
        data = json.loads(result.stdout)
        print(f"✅ Ollama is running with {len(data['models'])} models")
        return True
    else:
        print("❌ Ollama is not running")
        return False

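`test_ollama_service` above reads the `models` array from Ollama's `/api/tags` response. A hypothetical sample payload (the model name matches this project; the size value is invented) parses like this:

```python
import json

# Hypothetical /api/tags payload; the field names match what
# test_ollama_service reads, but the size value is invented.
sample = '{"models": [{"name": "llama3.2:latest", "size": 2019393189}]}'
data = json.loads(sample)
print(f"Ollama reports {len(data['models'])} model(s)")  # → Ollama reports 1 model(s)
```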
def test_plugin_service():
    """Test the plugin service"""
    print("\n🔌 Testing Plugin Service...")

    from service import handle_request

    async def test():
        # Test listing models
        result = await handle_request({"action": "list_models"})
        if result["success"]:
            print(f"✅ Plugin found {len(result['models'])} models")
        else:
            print(f"❌ Failed to list models: {result}")
            return False

        # Test generation
        result = await handle_request({
            "action": "generate",
            "model": "llama3.2:latest",
            "prompt": "What is AITBC in one sentence?",
            "max_tokens": 50
        })

        if result["success"]:
            print("✅ Generated text:")
            print(f"   {result['text'][:100]}...")
            print(f"   Cost: {result['cost']} AITBC")
        else:
            print(f"❌ Generation failed: {result}")
            return False

        return True

    return asyncio.run(test())

def test_client_miner_flow():
    """Test that a client can submit a job and a miner can process it"""
    print("\n🔄 Testing Client-Miner Flow...")

    # Create a client
    client = OllamaClient("http://localhost:8001", "REDACTED_CLIENT_KEY")

    # Submit a job
    print("1. Submitting inference job...")
    job_id = client.submit_generation(
        model="llama3.2:latest",
        prompt="Explain blockchain in simple terms",
        max_tokens=100
    )

    if not job_id:
        print("❌ Failed to submit job")
        return False

    print(f"✅ Job submitted: {job_id}")

    # Start a miner in the background (simplified)
    print("\n2. Starting Ollama miner...")
    miner_cmd = [
        "python3", "miner_plugin.py",
        "http://localhost:8001",
        "REDACTED_MINER_KEY",
        "ollama-miner-test"
    ]

    miner_process = subprocess.Popen(
        miner_cmd,
        cwd="/home/oib/windsurf/aitbc/plugins/ollama",
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )

    # Give the miner time to process the job
    time.sleep(10)

    # Check job status
    print("\n3. Checking job status...")
    status = client.get_job_status(job_id)

    if status:
        print(f"   State: {status['state']}")
        print(f"   Miner: {status.get('assigned_miner_id', 'None')}")

        if status['state'] == 'completed':
            print("✅ Job completed!")
            result = status.get('result', {})
            print(f"   Output: {result.get('output', '')[:200]}...")
            print(f"   Cost: {result.get('cost', 0)} AITBC")

    # Stop the miner
    miner_process.terminate()
    miner_process.wait()

    return True

def main():
    print("🚀 AITBC Ollama Plugin Test Suite")
    print("=" * 60)

    # Test 1: Ollama service
    if not test_ollama_service():
        print("\n❌ Please start Ollama first: ollama serve")
        return

    # Test 2: Plugin service
    if not test_plugin_service():
        print("\n❌ Plugin service test failed")
        return

    # Test 3: Client-miner flow
    if not test_client_miner_flow():
        print("\n❌ Client-miner flow test failed")
        return

    print("\n✅ All tests passed!")
    print("\n💡 To use the Ollama plugin:")
    print("   1. Start mining: python3 plugins/ollama/miner_plugin.py")
    print("   2. Submit jobs: python3 plugins/ollama/client_plugin.py generate llama3.2:latest 'Your prompt'")
    print("   3. Check earnings: cd home/miner && python3 wallet.py balance")


if __name__ == "__main__":
    main()