# AITBC Ollama Plugin Provides GPU-powered LLM inference services through Ollama, allowing miners to earn AITBC by processing AI/ML inference jobs. ## Features - 🤖 **13 Available Models**: From lightweight 1B to large 14B models - 💰 **Earn AITBC**: Get paid for GPU inference work - 🚀 **Fast Processing**: Direct GPU acceleration via CUDA - 💬 **Chat & Generation**: Support for both chat and text generation - 💻 **Code Generation**: Specialized models for code generation ## Available Models | Model | Size | Best For | |-------|------|----------| | deepseek-r1:14b | 9GB | General reasoning, complex tasks | | qwen2.5-coder:14b | 9GB | Code generation, programming | | deepseek-coder-v2:latest | 9GB | Advanced code generation | | gemma3:12b | 8GB | General purpose, multilingual | | deepcoder:latest | 9GB | Code completion, debugging | | deepseek-coder:6.7b-base | 4GB | Lightweight code tasks | | llama3.2:3b-instruct-q8_0 | 3GB | Fast inference, instruction following | | mistral:latest | 4GB | Balanced performance | | llama3.2:latest | 2GB | Quick responses, general use | | gemma3:4b | 3GB | Efficient general tasks | | qwen2.5:1.5b | 1GB | Fast, lightweight tasks | | gemma3:1b | 815MB | Minimal resource usage | | lauchacarro/qwen2.5-translator:latest | 1GB | Translation tasks | ## Quick Start ### Prerequisites - Ollama installed and running locally (`ollama serve`) - At least one model pulled (example: `ollama pull mistral:latest`) - Python 3.13.5+ with `pip install -e .` if running from repo root ### Minimal Usage Example ```bash # 1) Run miner (exposes inference endpoint for jobs) python3 miner_plugin.py --host 0.0.0.0 --port 8001 # 2) In another terminal, submit a job via client python3 client_plugin.py chat mistral:latest "Summarize the AITBC marketplace in 3 bullets" # 3) View logs/results tail -f miner.log ``` Optional environment variables: - `OLLAMA_HOST` (default: http://127.0.0.1:11434) - `OLLAMA_MODELS` (comma-separated list to register; defaults to discovered models) - `OLLAMA_MAX_CONCURRENCY` (default: 2) ### 1. Start Ollama (if not running) ```bash ollama serve ``` ### 2. Start Mining ```bash cd /home/oib/windsurf/aitbc/plugins/ollama python3 miner_plugin.py ``` ### 3. Submit Jobs (in another terminal) ```bash # Text generation python3 client_plugin.py generate llama3.2:latest "Explain quantum computing" # Chat completion python3 client_plugin.py chat mistral:latest "What is the meaning of life?" # Code generation python3 client_plugin.py code deepseek-coder-v2:latest "Create a REST API in Python" --lang python ``` ## Pricing Cost is calculated per 1M tokens: - 14B models: ~0.12-0.14 AITBC - 12B models: ~0.10 AITBC - 6-9B models: ~0.06-0.08 AITBC - 3-4B models: ~0.02-0.04 AITBC - 1-2B models: ~0.01 AITBC Miners earn 150% of the cost (50% markup). ## API Usage ### Submit Generation Job ```python from client_plugin import OllamaClient client = OllamaClient("http://localhost:8001", "${CLIENT_API_KEY}") job_id = client.submit_generation( model="llama3.2:latest", prompt="Write a poem about AI", max_tokens=200 ) # Wait for result result = client.wait_for_result(job_id) print(result['result']['output']) ``` ### Submit Chat Job ```python messages = [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "How does blockchain work?"} ] job_id = client.submit_chat("mistral:latest", messages) ``` ### Submit Code Generation ```python job_id = client.submit_code_generation( model="deepseek-coder-v2:latest", prompt="Create a function to sort a list in Python", language="python" ) ``` ## Miner Configuration The miner automatically: - Registers all available Ollama models - Sends heartbeats with GPU stats - Processes jobs up to 2 concurrent tasks - Calculates earnings based on token usage ## Testing Run the test suite: ```bash python3 test_ollama_plugin.py ``` ## Integration with AITBC The Ollama plugin integrates seamlessly with: - **Coordinator**: Job distribution and management - **Wallet**: Automatic earnings tracking - **Explorer**: Job visibility as blocks - **GPU Monitoring**: Real-time resource tracking ## Tips 1. **Choose the right model**: Smaller models for quick tasks, larger for complex reasoning 2. **Monitor earnings**: Check with `cd home/miner && python3 wallet.py balance` 3. **Batch jobs**: Submit multiple jobs for better utilization 4. **Temperature tuning**: Lower temp (0.3) for code, higher (0.8) for creative tasks ## Troubleshooting - **Ollama not running**: Start with `ollama serve` - **Model not found**: Pull with `ollama pull ` - **Jobs timing out**: Increase TTL when submitting - **Low earnings**: Use larger models for higher value jobs