Files
aitbc/docs/1_project/PROJECT_STRUCTURE.md
oib f353e00172 chore(security): enhance environment configuration, CI workflows, and wallet daemon with security improvements
- Restructure .env.example with security-focused documentation, service-specific environment file references, and AWS Secrets Manager integration
- Update CLI tests workflow to single Python 3.13 version, add pytest-mock dependency, and consolidate test execution with coverage
- Add comprehensive security validation to package publishing workflow with manual approval gates, secret scanning, and release
2026-03-03 10:33:46 +01:00

6.6 KiB

GPU Acceleration Project Structure

📁 Directory Organization

gpu_acceleration/
├── __init__.py                    # Public API and module initialization
├── compute_provider.py            # Abstract interface for compute providers
├── cuda_provider.py              # CUDA backend implementation
├── cpu_provider.py               # CPU fallback implementation
├── apple_silicon_provider.py     # Apple Silicon backend implementation
├── gpu_manager.py                # High-level manager with auto-detection
├── api_service.py                # Refactored FastAPI service
├── REFACTORING_GUIDE.md          # Complete refactoring documentation
├── PROJECT_STRUCTURE.md          # This file
├── migration_examples/           # Migration examples and guides
│   ├── basic_migration.py        # Basic code migration example
│   ├── api_migration.py          # API migration example
│   ├── config_migration.py       # Configuration migration example
│   └── MIGRATION_CHECKLIST.md    # Complete migration checklist
├── legacy/                       # Legacy files (moved during migration)
│   ├── high_performance_cuda_accelerator.py
│   ├── fastapi_cuda_zk_api.py
│   ├── production_cuda_zk_api.py
│   └── marketplace_gpu_optimizer.py
├── cuda_kernels/                 # Existing CUDA kernels (unchanged)
│   ├── cuda_zk_accelerator.py
│   ├── field_operations.cu
│   └── liboptimized_field_operations.so
├── parallel_processing/          # Existing parallel processing (unchanged)
│   ├── distributed_framework.py
│   ├── marketplace_cache_optimizer.py
│   └── marketplace_monitor.py
├── research/                     # Existing research (unchanged)
│   ├── gpu_zk_research/
│   └── research_findings.md
└── backup_YYYYMMDD_HHMMSS/       # Backup of migrated files

🎯 Architecture Overview

Layer 1: Abstract Interface (compute_provider.py)

  • ComputeProvider: Abstract base class for all backends
  • ComputeBackend: Enumeration of available backends
  • ComputeDevice: Device information and management
  • ComputeProviderFactory: Factory pattern for backend creation

Layer 2: Backend Implementations

  • CUDA Provider: NVIDIA GPU acceleration with PyCUDA
  • CPU Provider: NumPy-based fallback implementation
  • Apple Silicon Provider: Metal-based Apple Silicon acceleration

Layer 3: High-Level Manager (gpu_manager.py)

  • GPUAccelerationManager: Main user-facing class
  • Auto-detection: Automatic backend selection
  • Fallback handling: Graceful degradation to CPU
  • Performance monitoring: Comprehensive metrics

Layer 4: API Layer (api_service.py)

  • FastAPI Integration: REST API for ZK operations
  • Backend-agnostic: No backend-specific code
  • Error handling: Proper error responses
  • Performance endpoints: Built-in performance monitoring

🔄 Migration Path

Before (Legacy)

gpu_acceleration/
├── high_performance_cuda_accelerator.py  # CUDA-specific implementation
├── fastapi_cuda_zk_api.py               # CUDA-specific API
├── production_cuda_zk_api.py            # CUDA-specific production API
└── marketplace_gpu_optimizer.py         # CUDA-specific optimizer

After (Refactored)

gpu_acceleration/
├── __init__.py                    # Clean public API
├── compute_provider.py            # Abstract interface
├── cuda_provider.py              # CUDA implementation
├── cpu_provider.py               # CPU fallback
├── apple_silicon_provider.py     # Apple Silicon implementation
├── gpu_manager.py                # High-level manager
├── api_service.py                # Refactored API
├── migration_examples/           # Migration guides
└── legacy/                       # Moved legacy files

🚀 Usage Patterns

Basic Usage

from gpu_acceleration import GPUAccelerationManager

# Auto-detect and initialize
gpu = GPUAccelerationManager()
gpu.initialize()
result = gpu.field_add(a, b)

Context Manager

from gpu_acceleration import GPUAccelerationContext

with GPUAccelerationContext() as gpu:
    result = gpu.field_mul(a, b)
    # Automatically shutdown

Backend Selection

from gpu_acceleration import create_gpu_manager

# Specify backend
gpu = create_gpu_manager(backend="cuda")
result = gpu.field_add(a, b)

Quick Functions

from gpu_acceleration import quick_field_add

result = quick_field_add(a, b)

📊 Benefits

Clean Architecture

  • Separation of Concerns: Clear interface between layers
  • Backend Agnostic: Business logic independent of backend
  • Testable: Easy to mock and test individual components

Flexibility

  • Multiple Backends: CUDA, Apple Silicon, CPU support
  • Auto-detection: Automatically selects best backend
  • Fallback Handling: Graceful degradation

Maintainability

  • Single Interface: One API to learn and maintain
  • Easy Extension: Simple to add new backends
  • Clear Documentation: Comprehensive documentation and examples

🔧 Configuration

Environment Variables

export AITBC_GPU_BACKEND=cuda
export AITBC_GPU_FALLBACK=true

Code Configuration

from gpu_acceleration import ZKOperationConfig

config = ZKOperationConfig(
    batch_size=2048,
    use_gpu=True,
    fallback_to_cpu=True,
    timeout=60.0
)

📈 Performance

Backend Performance

  • CUDA: ~95% of direct CUDA performance
  • Apple Silicon: Native Metal acceleration
  • CPU: Baseline performance with NumPy

Overhead

  • Interface Layer: <5% performance overhead
  • Auto-detection: One-time cost at initialization
  • Fallback Handling: Minimal overhead when not needed

🧪 Testing

Unit Tests

  • Backend interface compliance
  • Auto-detection logic
  • Fallback handling
  • Performance regression

Integration Tests

  • Multi-backend scenarios
  • API endpoint testing
  • Configuration validation
  • Error handling

Performance Tests

  • Benchmark comparisons
  • Memory usage analysis
  • Scalability testing
  • Resource utilization

🔮 Future Enhancements

Planned Backends

  • ROCm: AMD GPU support
  • OpenCL: Cross-platform support
  • Vulkan: Modern GPU API
  • WebGPU: Browser acceleration

Advanced Features

  • Multi-GPU: Automatic multi-GPU utilization
  • Memory Pooling: Efficient memory management
  • Async Operations: Asynchronous compute
  • Streaming: Large dataset support