Some checks failed
Blockchain Synchronization Verification / sync-verification (push) Failing after 7s
CLI Tests / test-cli (push) Failing after 9s
Cross-Chain Functionality Tests / test-cross-chain-sync (push) Successful in 3s
Cross-Chain Functionality Tests / test-cross-chain-transactions (push) Successful in 6s
Cross-Chain Functionality Tests / test-multi-chain-consensus (push) Successful in 2s
Deploy to Testnet / deploy-testnet (push) Successful in 1m28s
Documentation Validation / validate-docs (push) Failing after 10s
Documentation Validation / validate-policies-strict (push) Successful in 4s
Integration Tests / test-service-integration (push) Failing after 47s
Multi-Chain Island Architecture Tests / test-multi-chain-island (push) Successful in 3s
Multi-Node Blockchain Health Monitoring / health-check (push) Successful in 2s
Node Failover Simulation / failover-test (push) Failing after 45m25s
P2P Network Verification / p2p-verification (push) Successful in 6s
Package Tests / Python package - aitbc-agent-sdk (push) Successful in 33s
Package Tests / Python package - aitbc-core (push) Successful in 14s
Package Tests / Python package - aitbc-crypto (push) Successful in 9s
Package Tests / Python package - aitbc-sdk (push) Successful in 11s
Package Tests / JavaScript package - aitbc-sdk-js (push) Successful in 8s
Package Tests / JavaScript package - aitbc-token (push) Successful in 16s
Python Tests / test-python (push) Failing after 42s
Security Scanning / security-scan (push) Failing after 35s
Cross-Chain Functionality Tests / aggregate-results (push) Successful in 2s
Multi-Node Stress Testing / stress-test (push) Successful in 5s
Cross-Node Transaction Testing / transaction-test (push) Successful in 3s
- Added PACKAGE_VERSION to aitbc/__init__.py exports - Added block height validation and duplicate block handling in blockchain-node RPC router - Added HTTP 409 conflict response for duplicate block heights with different hashes - Changed certification router to use PartnershipProgramRequest model instead of individual parameters - Fixed import paths: certification_service -> certification, advanced_reinforcement_learning -> advanced_rl - Added MarketplaceStrategyOptimizer to advanced_rl module exports
412 lines
23 KiB
Markdown
412 lines
23 KiB
Markdown
# AITBC Codebase Analysis & Recommendations Plan
|
|
|
|
**Date:** 2026-05-12
|
|
**Scope:** Full analysis of /opt/aitbc/ -- architecture, performance, security, maintainability
|
|
**Codebase Version:** 0.6.0
|
|
**Python:** 3.13.5
|
|
|
|
---
|
|
|
|
## Executive Summary
|
|
|
|
AITBC is a large monorepo (~212K LOC Python, 75K LOC Solidity, 75K LOC CLI) implementing a blockchain-based AI agent compute network. The system has strong foundational infrastructure (28 systemd services, PoA consensus, multi-chain support, 20 Solidity contracts) but carries significant technical debt: a bloated coordinator-api (117K LOC, 91 files >500 lines), 925 print() statements in production code, 50 bare except clauses, 41 potentially hardcoded secrets, duplicated config/database modules across apps, and 14.3K LOC of stubs that may be dead code. The core blockchain-node (27K LOC) is well-structured, but the coordinator-api has grown into a monolith that needs decomposition.
|
|
|
|
---
|
|
|
|
## 1. Codebase Scale Metrics
|
|
|
|
| Component | Files | LOC | Notes |
|
|
|-----------|-------|-----|-------|
|
|
| **apps/** | 840 | 211,844 | Main application code |
|
|
| **cli/** | 232 | 74,805 | Unified CLI tool |
|
|
| **aitbc/** (core pkg) | 52 | 11,190 | Shared utilities |
|
|
| **packages/py/** | 42 | 5,458 | SDK packages |
|
|
| **contracts/** | 490 | 74,589 | Solidity contracts |
|
|
| **tests/** | 56 files | ~789 test functions | Good test count |
|
|
| **stubs/** | 65 | 14,300 | Placeholder services |
|
|
| **systemd/** | 28 | - | Service definitions |
|
|
|
|
**App breakdown (largest):**
|
|
- `coordinator-api`: 338 files, 116,568 LOC (55% of all app code)
|
|
- `blockchain-node`: 127 files, 26,895 LOC
|
|
- `zk-circuits`: 52 files, 15,160 LOC
|
|
- `stubs`: 65 files, 14,300 LOC
|
|
- `pool-hub`: 48 files, 5,461 LOC
|
|
- `agent-coordinator`: 45 files, 9,983 LOC
|
|
- `exchange`: 19 files, 5,144 LOC
|
|
- `wallet`: 27 files, 3,482 LOC
|
|
|
|
---
|
|
|
|
## 2. Architecture Analysis
|
|
|
|
### 2.1 Current Architecture Pattern
|
|
|
|
The codebase follows a **monorepo with per-app packaging** pattern:
|
|
- Root `pyproject.toml` at `/opt/aitbc/pyproject.toml` with shared deps
|
|
- 15 per-app `pyproject.toml` files for independent packaging
|
|
- 28 systemd services for process management
|
|
- Core shared library at `/opt/aitbc/aitbc/` (52 files)
|
|
- 3 Python packages in `/opt/aitbc/packages/py/` (crypto, sdk, agent-sdk)
|
|
|
|
### 2.2 Strengths
|
|
|
|
1. **Well-structured blockchain-node**: Clear separation of concerns with `consensus/`, `contracts/`, `network/`, `economics/`, `state/` subdirectories. Largest file is 1,278 lines (RPC router) which is reasonable.
|
|
2. **Solid middleware layer**: Core package provides `error_handler`, `performance`, `request_id`, `validation` middleware -- all properly implemented.
|
|
3. **Database migration support**: Alembic configs for blockchain-node (7 migrations), coordinator-api (2), pool-hub (5).
|
|
4. **Comprehensive infrastructure**: Terraform (15 files), Kubernetes (6 manifests), nginx config, Prometheus dashboards, Gitea CI/CD.
|
|
5. **Circuit breaker pattern**: Implemented in PoA consensus for resilience.
|
|
6. **Good test infrastructure**: 789 test functions across 56 files, 41 conftest.py files, pytest markers for unit/integration/e2e/slow.
|
|
|
|
### 2.3 Architecture Issues
|
|
|
|
#### CRITICAL: Coordinator-API Monolith (117K LOC)
|
|
|
|
The coordinator-api at 116,568 LOC across 338 files is the single biggest architectural risk:
|
|
- **91 files over 500 lines** (27% of all files)
|
|
- Largest file: `advanced_reinforcement_learning.py` at 2,000 lines
|
|
- Service layer has 99 files + 24 sub-module files = 123 service files
|
|
- Router layer has 61 files
|
|
- Domain layer has 32 files
|
|
- This service alone is 55% of all application code
|
|
|
|
**Root cause**: The coordinator-api has accumulated features without decomposition. It handles RL agents, marketplace, trading, certification, compliance, analytics, payments, staking, governance, AMM, AI surveillance, multi-language support, ZK proofs, and more -- all in one deployable unit.
|
|
|
|
#### HIGH: Duplicated Cross-App Modules
|
|
|
|
Each app independently implements:
|
|
- `config.py` in 6 different apps (blockchain-node, coordinator-api, agent-coordinator, blockchain-event-bridge, pool-hub)
|
|
- `database.py` in 4 apps (blockchain-node, coordinator-api, exchange, pool-hub)
|
|
- No shared database connection pooling across apps
|
|
|
|
#### HIGH: Stubs Directory (14.3K LOC of Dead Code?)
|
|
|
|
The `apps/stubs/` directory contains 65 files across 68 subdirectories totaling 14,300 LOC. These appear to be placeholder implementations. Including `compliance-service`, `exchange-integration`, `global-ai-agents`, `global-infrastructure`, `multi-region-load-balancer`, `plugin-analytics`, `plugin-marketplace`, `plugin-registry`, `plugin-security`, `simple-explorer`, `trading-engine`. Each has its own tests, suggesting they were meant to be real services but remain as stubs.
|
|
|
|
#### MEDIUM: Missing Package Structure
|
|
|
|
Only 128 of 1,026 app directories have `__init__.py` files (12.5%). This means most directories are not proper Python packages, which breaks relative imports and makes the codebase harder to navigate programmatically.
|
|
|
|
#### MEDIUM: Overlapping Marketplace Services
|
|
|
|
Three separate marketplace implementations exist:
|
|
- `apps/marketplace` (9 files, 1,762 LOC) -- appears to be the primary marketplace
|
|
- `apps/marketplace-service` (10 files, 875 LOC) -- unclear differentiation
|
|
- `apps/marketplace-service-debug` (8 files, 821 LOC) -- debug variant
|
|
|
|
The relationship and boundary between these three is unclear.
|
|
|
|
#### LOW: Coordinator-API Entry Point is a Stub
|
|
|
|
`/opt/aitbc/apps/coordinator-api/src/app.py` is a single line: `# Import the FastAPI app from main.py for compatibility`. There is no `main.py` in the same directory -- the actual entry point appears to be elsewhere.
|
|
|
|
---
|
|
|
|
## 3. Performance Analysis
|
|
|
|
### 3.1 Strengths
|
|
|
|
1. **Async-first architecture**: 362 files use `async def`, consistent with FastAPI best practices
|
|
2. **FastAPI Depends for DI**: 94 files use FastAPI's dependency injection system
|
|
3. **LRU caching implementation**: Core package provides `LRU cache` with hit/miss tracking
|
|
4. **Performance middleware**: `PerformanceLoggingMiddleware` tracks request duration and adds `X-Process-Time` headers
|
|
5. **Adaptive sync**: Blockchain sync uses tiered batch sizing (10K+ blocks: 500-1000 batch)
|
|
6. **Circuit breaker**: PoA consensus implements circuit breaker for fault tolerance
|
|
|
|
### 3.2 Performance Issues
|
|
|
|
#### HIGH: No Distributed Caching Layer
|
|
|
|
Despite Redis being in `pyproject.toml` dependencies, grep found only 21 files mentioning Redis (mostly multi-language services and docs). There is no Redis-based caching layer for:
|
|
- API response caching
|
|
- Session storage
|
|
- Rate limiting state
|
|
- Blockchain state caching
|
|
|
|
This means every request hits the database directly.
|
|
|
|
#### HIGH: No Connection Pooling
|
|
|
|
Despite `asyncpg` and `sqlalchemy` with async support being dependencies, there is no visible connection pool configuration across apps. Each app creates its own database connections independently.
|
|
|
|
#### HIGH: No Background Task System
|
|
|
|
No Celery, RQ, or FastAPI BackgroundTasks usage found outside of 3 files. Long-running operations (blockchain sync, analytics processing, RL training) appear to be handled inline or via systemd timers rather than a proper task queue.
|
|
|
|
#### MEDIUM: No API Rate Limiting
|
|
|
|
`slowapi` and `limits` are in production dependencies but only found in 3 files in coordinator-api routers. Most endpoints have no rate limiting.
|
|
|
|
#### MEDIUM: No Request Batching
|
|
|
|
The coordinator-api receives requests for 61 different router domains but implements no request batching for high-throughput operations like marketplace queries or analytics.
|
|
|
|
#### LOW: Logging Performance
|
|
|
|
The `PerformanceLoggingMiddleware` logs every request synchronously with `logger.info()`. At high throughput, this creates I/O bottlenecks. Should use `logger.debug()` or sampling.
|
|
|
|
---
|
|
|
|
## 4. Security Analysis
|
|
|
|
### 4.1 Strengths
|
|
|
|
1. **Security hardening module**: `aitbc/security_hardening.py` (396 lines) provides input validation, sanitization, ethereum address validation, URL validation, email validation
|
|
2. **JWT infrastructure**: 30 files reference JWT authentication
|
|
3. **CORS configuration**: 23 files configure CORS policies
|
|
4. **HTTPS enforcement**: Security headers module (`aitbc/security_headers.py`) present
|
|
5. **Encryption at rest**: `database_encryption.py` in blockchain-node provides SQLCipher support
|
|
6. **Multi-sig wallets**: Mentioned in README as implemented
|
|
7. **Audit logging**: `audit_logging.py` service in coordinator-api (584 lines)
|
|
8. **Dev dependencies**: `bandit` (security linter) and `safety` (dependency vulnerability scanner) configured
|
|
9. **Pre-commit hooks**: `.pre-commit-config.yaml` configured
|
|
10. **Error handler middleware**: Properly returns generic 500 messages without leaking internals
|
|
|
|
### 4.2 Security Issues
|
|
|
|
#### CRITICAL: Potentially Hardcoded Secrets (41 instances)
|
|
|
|
41 files contain patterns matching potential hardcoded secrets, API keys, or tokens:
|
|
- `coordinator-api/.../cross_chain_integration.py:191`: `private_key = "mock_private_key"`
|
|
- `coordinator-api/.../wallet_service.py:72`: `encrypted_private_key = "[ENCRYPTED_MOCK]"`
|
|
- Multiple test files with `api_key="test_key"` patterns
|
|
|
|
While some are test files, production code also contains these patterns. Need audit to distinguish real secrets from test fixtures.
|
|
|
|
#### HIGH: 50 Bare Except Clauses
|
|
|
|
50 instances of bare `except:` in production code across:
|
|
- coordinator-api: 44 instances
|
|
- exchange: 1 instance
|
|
- Others: 5 instances
|
|
|
|
Bare excepts catch `SystemExit`, `KeyboardInterrupt`, and `GeneratorExit`, making the application unkillable in some error scenarios and hiding real errors.
|
|
|
|
#### HIGH: SQL Injection Risks (21 instances)
|
|
|
|
21 files use f-strings in SQL execute statements:
|
|
- `multichain_ledger.py:98,110,123`: `cursor.execute(f"""...`) with table name interpolation
|
|
- Various migration scripts using `f"SELECT * FROM {table_name}"`
|
|
|
|
While some are internal migration scripts, the `multichain_ledger.py` pattern is dangerous if table names come from user input.
|
|
|
|
#### HIGH: Insufficient Rate Limiting
|
|
|
|
Only 3 coordinator-api routers implement rate limiting. The remaining 58+ routers have no protection against:
|
|
- Brute force authentication attempts
|
|
- API abuse
|
|
- DDoS at the application layer
|
|
|
|
#### MEDIUM: Feature Flags Not Utilized
|
|
|
|
`/opt/aitbc/feature_flags.json` is an empty object `{}`. Despite having `aitbc/feature_flags.py` (a feature flag system), no features are actually flagged. This means:
|
|
- No gradual rollout capability
|
|
- No kill switches for problematic features
|
|
- No A/B testing infrastructure
|
|
|
|
#### MEDIUM: CORS Configuration Cents
|
|
|
|
23 files configure CORS but the configuration is scattered and inconsistent. No centralized CORS policy management.
|
|
|
|
#### MEDIUM: Dependency Vulnerability Scanning Not Enforced
|
|
|
|
While `safety` and `bandit` are in dev dependencies and pre-commit config, there is no evidence of CI/CD pipeline enforcement of security scanning. The Gitea workflows reference "security scanning optimized for changed files" but this should be comprehensive.
|
|
|
|
#### LOW: Debug Mode References
|
|
|
|
Several debug patterns found:
|
|
- `marketplace-service-debug` app (8 files) appears to be a debug variant
|
|
- `mock_coordinator.py` in blockchain-node observability
|
|
|
|
---
|
|
|
|
## 5. Maintainability Analysis
|
|
|
|
### 5.1 Strengths
|
|
|
|
1. **Comprehensive documentation**: Full docs directory with architecture, deployment, security, and operations guides
|
|
2. **Standardized quality tools**: ruff (line length 127), mypy (relaxed mode for gradual adoption), black, isort, pre-commit
|
|
3. **Good test coverage infrastructure**: 789 test functions, pytest with coverage (50% minimum threshold), markers for test types
|
|
4. **API versioning**: `aitbc/api_versioning.py` provides API versioning support
|
|
5. **Distributed tracing**: `aitbc/distributed_tracing.py` exists for observability
|
|
6. **Blue-green deployment**: `aitbc/blue_green_deployment.py` for zero-downtime deploys
|
|
7. **Domain-driven design**: coordinator-api uses `domain/` subdirectory with bounded contexts
|
|
8. **Repository pattern**: coordinator-api has `repositories/` directory
|
|
9. **Skills documentation**: 8 operational skills docs for agents
|
|
|
|
### 5.2 Maintainability Issues
|
|
|
|
#### CRITICAL: 925 print() Statements in Production Code
|
|
|
|
925 instances of `print()` in `/opt/aitbc/apps/` and `/opt/aitbc/packages/`. This is the most impactful maintainability issue:
|
|
- Prints bypass structured logging, making log aggregation impossible
|
|
- Prints cannot be filtered by severity level
|
|
- Prints go to stdout, not to configured log handlers
|
|
- In async code, prints can interleave unpredictably
|
|
|
|
Most concentrated in:
|
|
- `agent-coordinator/src/coordinator.py` (many debug prints)
|
|
- `agent-compliance/src/compliance_agent.py`
|
|
- `blockchain-node/scripts/` (expected for scripts, but still 925 total)
|
|
|
|
#### HIGH: 65 Stub Services (14.3K LOC)
|
|
|
|
The `apps/stubs/` directory is a significant maintenance burden:
|
|
- Each stub has its own tests (increasing test suite time)
|
|
- Each stub has its own `pyproject.toml` (15 total per-app pyproject files)
|
|
- Creates confusion about which services are real vs. placeholders
|
|
- Must be maintained/skipped during CI/CD runs
|
|
|
|
#### HIGH: Massive Service File Sizes
|
|
|
|
The coordinator-api service layer has extreme file sizes:
|
|
- 2,000 lines: `advanced_reinforcement_learning.py`
|
|
- 1,368 lines: `certification_service.py`
|
|
- 1,324 lines: `multi_modal_fusion.py`
|
|
- 1,159 lines: `agent_integration.py`
|
|
- 1,127 lines: `enterprise_integration.py`
|
|
- 10 more files over 800 lines
|
|
|
|
Single-responsibility principle is clearly violated.
|
|
|
|
#### MEDIUM: Type Coverage Gaps
|
|
|
|
While mypy is configured, it runs in very permissive mode:
|
|
- `check_untyped_defs = false`
|
|
- `disallow_intyped_defs = false`
|
|
- `disallow_untyped_defs = false`
|
|
- `no_implicit_optional = false`
|
|
|
|
9 generator overrides exist that completely ignore errors in coordinator-api modules:
|
|
```python
|
|
[[tool.mypy.overrides]]
|
|
module = ["apps.coordinator-api.src.app.routers.*", ...]
|
|
ignore_errors = true
|
|
```
|
|
|
|
#### MEDIUM: Cross-App Module Duplication
|
|
|
|
The same concerns are implemented independently across apps:
|
|
- **Config management**: 6 separate `config.py` files with no shared base
|
|
- **Database**: 4 separate `database.py` files
|
|
- **Validation**: Separate validation in each app
|
|
|
|
The core `aitbc/` package provides utilities but apps don't consistently use them.
|
|
|
|
#### MEDIUM: Per-App pyproject.toml Proliferation
|
|
|
|
15 per-app `pyproject.toml` files create:
|
|
- Dependency version drift across apps
|
|
- Inconsistent linting/formatting rules
|
|
- Complex CI/CD matrix
|
|
- Harder to maintain security patches across all packages
|
|
|
|
#### LOW: 25 Uncommitted Changes
|
|
|
|
Repository has 25 uncommitted changes, indicating active development without proper commit hygiene. This makes it harder to track what changes are in production.
|
|
|
|
#### LOW: TODO/FIXME/HACK Count is Zero
|
|
|
|
While the analysis found 0 TODO/FIXME/HACK/XXX/BUG comments, this likely indicates these markers are not used rather than the absence of known issues. This is a maintainability concern because there's no in-code tracking of technical debt.
|
|
|
|
---
|
|
|
|
## 6. Recommendations by Time Horizon
|
|
|
|
### SHORT-TERM (0-2 weeks) -- Quick Wins
|
|
|
|
| # | Recommendation | Effort | Impact | File(s) |
|
|
|---|---------------|--------|--------|---------|
|
|
| S1 | **Replace all print() with logger**: Run automated refactoring to replace `print()` with `logger.debug/info` across all production code. Use `sed`/`ruff` automated fix. | Low | Critical | All 925 instances |
|
|
| S2 | **Fix bare except clauses**: Replace `except:` with `except Exception:` at minimum. Prefer specific exceptions. | Low | High | 50 locations |
|
|
| S3 | **Remove or isolate stubs**: Move `apps/stubs/` to a separate repo or mark clearly as `examples/`. Remove from main test suite and CI/CD. | Low | Medium | `/opt/aitbc/apps/stubs/` |
|
|
| S4 | **Add `.coveragerc` or omit stubs from coverage**: Prevent stub code from diluting coverage metrics. | Low | Low | `pyproject.toml` |
|
|
| S5 | **Unify CORS configuration**: Create a single CORS policy in core `aitbc/` package, reference from all apps. | Low | Medium | 23 files |
|
|
| S6 | **Audit hardcoded secrets**: Review all 41 instances, move real secrets to environment variables/vault. | Medium | Critical | 41 locations |
|
|
| S7 | **Fix SQL injection risks**: Use parameterized queries in `multichain_ledger.py` and migration scripts. | Medium | High | 21 locations |
|
|
|
|
### MEDIUM-TERM (2-6 weeks) -- Structural Improvements
|
|
|
|
| # | Recommendation | Effort | Impact | File(s) |
|
|
|---|---------------|--------|--------|---------|
|
|
| M1 | **Decompose coordinator-api**: Split into bounded-context services: `rl-service`, `marketplace-router`, `analytics-service`, `certification-service`, `compliance-service`. Each gets its own deployable unit. | High | Critical | 338 files |
|
|
| M2 | **Implement shared config base class**: Create `aitbc.config.BaseSettings` that all apps inherit from, with common patterns for env var loading, secrets management, and validation. | Medium | High | 6 config files |
|
|
| M3 | **Add connection pool management**: Implement shared `aitbc.database.ConnectionPool` using `asyncpg` or SQLAlchemy async pool. All apps use the pool instead of creating independent connections. | Medium | High | 4 database files |
|
|
| M4 | **Implement distributed caching**: Deploy Redis cluster, add caching layer to coordinator-api for: API responses, session state, rate limiting counters, blockchain state reads. | Medium | High | `aitbc/caching.py` expansion |
|
|
| M5 | **Add rate limiting to all routers**: Apply `slowapi` limits to all 61 coordinator-api routers with tiered limits by endpoint sensitivity. | Medium | High | 61 router files |
|
|
| M6 | **Tighten mypy configuration**: Enable `check_untyped_defs`, `disallow_untyped_defs` incrementally. Remove `ignore_errors = true` overrides. Start with blockchain-node (best structured). | Medium | Medium | `pyproject.toml` |
|
|
| M7 | **Consolidate marketplace services**: Clarify boundaries between `marketplace`, `marketplace-service`, and `marketplace-service-debug`. Merge or clearly document the purpose of each. | Medium | Medium | 3 marketplace dirs |
|
|
| M8 | **Add background task system**: Implement Celery or ARQ for long-running tasks: RL training, analytics aggregation, blockchain event processing. | Medium | High | New infrastructure |
|
|
|
|
### LONG-TERM (1-3 months) -- Strategic Improvements
|
|
|
|
| # | Recommendation | Effort | Impact | File(s) |
|
|
|---|---------------|--------|--------|---------|
|
|
| L1 | **Implement API gateway pattern**: The `apps/api-gateway` (4 files, 425 LOC) should become the single entry point handling auth, rate limiting, routing, and load balancing for all backend services. | High | Critical | `apps/api-gateway/` |
|
|
| L2 | **Adopt event-driven architecture**: Implement message bus (Redis Streams, RabbitMQ, or NATS) for inter-service communication. Replace direct service-to-service calls with events. | High | High | Cross-cutting |
|
|
| L3 | **Implement feature flag system**: Populate `feature_flags.json` with actual flags. Use `aitbc/feature_flags.py` for gradual rollouts of: new RL algorithms, marketplace features, pricing changes. | Medium | Medium | `feature_flags.json` |
|
|
| L4 | **Add comprehensive observability**: Implement distributed tracing (OpenTelemetry), structured logging (structlog is already a dep), and metrics (Prometheus client is already a dep) across all services. | High | High | Cross-cutting |
|
|
| L5 | **Create shared test fixtures**: Extract common test setup into `packages/py/aitbc-test-utils/` with shared fixtures, factories, and mocks. Reduce duplication across 41 conftest.py files. | Medium | Medium | `tests/` |
|
|
| L6 | **Implement contract upgrade pattern**: Review Solidity contracts for proxy pattern usage. 20 contracts with no visible upgrade mechanism is a long-term risk. | High | Medium | `contracts/` |
|
|
| L7 | **Documentation-as-code**: Move operational runbooks from docs/ into executable skills. The 8 existing skills docs are a good start but should be executable. | Medium | Low | `docs/skills/` |
|
|
|
|
---
|
|
|
|
## 7. Risk Matrix
|
|
|
|
| Issue | Severity | Likelihood | Risk Score | Time to Fix |
|
|
|-------|----------|------------|------------|-------------|
|
|
| Coordinator-api monolith | High | High (growing) | **Critical** | 4-8 weeks |
|
|
| Hardcoded secrets | Critical | Medium | **Critical** | 1-2 weeks |
|
|
| print() in production | Medium | High (growing) | **High** | 1 week |
|
|
| Bare except clauses | High | Medium | **High** | 1 week |
|
|
| No rate limiting | High | Medium | **High** | 2-3 weeks |
|
|
| SQL injection risks | Critical | Low (internal) | **High** | 1-2 weeks |
|
|
| No connection pooling | Medium | High | **Medium** | 2-3 weeks |
|
|
| No distributed caching | Medium | High | **Medium** | 3-4 weeks |
|
|
| Stub code in main repo | Low | Low | **Low** | 1 week |
|
|
| Type coverage gaps | Medium | Medium | **Medium** | 4-6 weeks |
|
|
|
|
---
|
|
|
|
## 8. Files Most Likely to Change
|
|
|
|
Based on the analysis, these files/directories will be most impacted by the recommendations:
|
|
|
|
1. `/opt/aitbc/apps/coordinator-api/src/app/services/` -- Decomposition target (123 files)
|
|
2. `/opt/aitbc/apps/coordinator-api/src/app/routers/` -- Rate limiting, CORS (61 files)
|
|
3. `/opt/aitbc/apps/coordinator-api/src/app/config.py` -- Shared config base
|
|
4. `/opt/aitbc/apps/coordinator-api/src/app/database.py` -- Connection pooling
|
|
5. `/opt/aitbc/aitbc/caching.py` -- Distributed caching expansion
|
|
6. `/opt/aitbc/aitbc/middleware/` -- CORS unification
|
|
7. `/opt/aitbc/apps/stubs/` -- Removal/isolation
|
|
8. `/opt/aitbc/pyproject.toml` -- mypy tightening, coverage config
|
|
9. `/opt/aitbc/apps/wallet/src/app/chain/multichain_ledger.py` -- SQL injection fix
|
|
10. `/opt/aitbc/apps/agent-coordinator/src/coordinator.py` -- print() removal
|
|
|
|
---
|
|
|
|
## 9. Verification Steps
|
|
|
|
After implementing recommendations, verify with:
|
|
|
|
1. **print() elimination**: `grep -rn "print(" apps/ packages/ | grep -v "node_modules" | grep -v "__pycache__" | wc -l` should return 0
|
|
2. **Bare except elimination**: `grep -rn "^\s*except\s*:" apps/ packages/ | wc -l` should return 0
|
|
3. **Test suite**: `pytest --cov=apps --cov=packages --cov=cli --cov-fail-under=50` should pass
|
|
4. **Security scan**: `bandit -r apps/ packages/ cli/` should return 0 issues
|
|
5. **Type check**: `mypy apps/ packages/ cli/` should pass with tightened config
|
|
6. **Rate limiting**: `curl -s -o /dev/null -w "%{http_code}" http://localhost:9001/api/...` repeated rapidly should return 429
|
|
7. **Health checks**: All 28 systemd services should report `active (running)` and `/health` endpoints should return 200
|
|
|
|
---
|
|
|
|
## 10. Open Questions
|
|
|
|
1. **Stubs purpose**: Are the 65 stub files in `apps/stubs/` planned for implementation, or should they be archived? Who owns the decision?
|
|
2. **Coordinator-api decomposition**: What is the preferred decomposition strategy -- by domain (RL, marketplace, analytics) or by function (read vs write)?
|
|
3. **Redis deployment**: Is there an existing Redis cluster, or does one need to be provisioned?
|
|
4. **Background task system**: Is there a preference between Celery, ARQ, or FastAPI BackgroundTasks for the task queue?
|
|
5. **Contract upgrade strategy**: Are the Solidity contracts expected to be upgradeable, or is immutability intentional?
|
|
6. **Marketplace service boundary**: What is the intended relationship between `marketplace`, `marketplace-service`, and `marketplace-service-debug`?
|
|
7. **Feature flag ownership**: Who manages the feature flag lifecycle -- product or engineering?
|