Failure Archive
This archive collects known failure patterns encountered during development, along with their causes and resolutions. Agents should consult it before debugging similar symptoms.
Failure: CLI Fails to Launch – Hardcoded Absolute Paths
Date: 2026-03-13
Symptom: ImportError: No module named 'trading_surveillance' when running aitbc --help or any subcommand.
Cause: Multiple command modules in cli/aitbc_cli/commands/ used:
sys.path.append('/home/oib/windsurf/aitbc/apps/coordinator-api/src/app/services')
This path is user-specific and does not exist on the aitbc1 host.
Modules affected:
- surveillance.py
- ai_trading.py
- ai_surveillance.py
- advanced_analytics.py
- regulatory.py
- enterprise_integration.py
Resolution:
- Added __init__.py to apps/coordinator-api/src/app/services/ to make it a proper package.
- Updated each affected command module to use:
  sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', '..', 'apps', 'coordinator-api', 'src'))
  from app.services.trading_surveillance import ...
  (or simply from app.services import <module> after path setup)
- Removed hardcoded fallback absolute paths.
- Verified: aitbc --help loads without errors; aitbc surveillance start works.
Prevention: Use package-relative imports; avoid user-specific absolute paths. Consider making coordinator-api a proper installable dependency.
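The path setup from the resolution can be sketched with pathlib. This is a minimal illustration, assuming the command modules live at cli/aitbc_cli/commands/<module>.py as described above; the helper names coordinator_src and add_to_sys_path are hypothetical and not part of the codebase.

```python
import sys
from pathlib import Path


def coordinator_src(module_file: str) -> Path:
    """Return <repo root>/apps/coordinator-api/src for a command module.

    Assumes the module sits three directories below the repository root
    (cli/aitbc_cli/commands/<module>.py).
    """
    repo_root = Path(module_file).resolve().parents[3]
    return repo_root / "apps" / "coordinator-api" / "src"


def add_to_sys_path(path: Path) -> None:
    """Append path to sys.path once, so repeated setup does not duplicate it."""
    entry = str(path)
    if entry not in sys.path:
        sys.path.append(entry)
```

A command module would call add_to_sys_path(coordinator_src(__file__)) before importing from app.services. Installing coordinator-api as a dependency, as suggested above, would remove the need for any sys.path manipulation.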
Failure: Missing Dependency – aiohttp
Symptom: ModuleNotFoundError: No module named 'aiohttp' when importing kyc_aml_providers.py.
Cause: cli/pyproject.toml did not declare aiohttp.
Resolution: poetry add aiohttp (or pip install aiohttp in venv). Updated pyproject.toml accordingly.
Prevention: Keep every dependency declared in pyproject.toml; run tests in a fresh environment.
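One way to surface undeclared dependencies early is a small import check run in a freshly created environment. The sketch below uses only the standard library; the function name missing_modules is hypothetical, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

In a fresh venv built from cli/pyproject.toml, missing_modules(["aiohttp"]) returning a non-empty list would indicate that the dependency is not declared.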
Failure: Package Build Fails Due to Missing README.md
Symptom: poetry build for packages/py/aitbc-agent-sdk fails with FileNotFoundError: README.md.
Cause: The package directory lacked a README.md, which some build configurations require.
Resolution: Created an empty or placeholder README.md. Later enhanced with usage examples.
Prevention: Ensure each package has at least a minimal README.md; add a pre-commit hook to check for it.
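A check of this kind could look like the following sketch. It assumes that packages are direct subdirectories of a common root (for example packages/py) and that each one contains a pyproject.toml; the function names and the default root are illustrative assumptions, not the hook actually used in the project.

```python
from pathlib import Path


def packages_missing_readme(packages_root):
    """Return package directories (those with a pyproject.toml) lacking README.md."""
    root = Path(packages_root)
    missing = []
    for pyproject in sorted(root.glob("*/pyproject.toml")):
        if not (pyproject.parent / "README.md").is_file():
            missing.append(pyproject.parent)
    return missing


def main(argv):
    """Print each offending package and return a shell-style exit code."""
    root = argv[0] if argv else "packages/py"  # assumed default location
    missing = packages_missing_readme(root)
    for path in missing:
        print(f"missing README.md: {path}")
    return 1 if missing else 0
```

A hook script would pass its command-line arguments to main and exit with the returned code, so a missing README blocks the commit.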
Failure: Starlette Broadcast Module Missing After Upgrade
Symptom: ImportError: cannot import name 'Broadcast' from 'starlette' after upgrading Starlette to 0.38+.
Cause: Starlette removed the Broadcast module in version 0.38.
Impact: P2P gossip backend (using Redis broadcast) fails to import. Services crash on startup.
Resolution:
- Pinned Starlette to >=0.37.2,<0.38 in pyproject.toml.
- Added comment explaining the pin and that production should replace broadcast with direct P2P.
Prevention: Avoid upgrading Starlette without testing; track deprecations.
See also: debugging-notes.md for diagnostic steps.
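To fail fast with a clear message instead of an ImportError at startup, a service could verify that the installed version falls inside the pinned range. The sketch below is a simplified comparison of numeric release segments, not a full PEP 440 implementation; version_tuple and within_pin are hypothetical names.

```python
import re


def version_tuple(version: str):
    """Parse the leading numeric release segment, e.g. '0.37.2' -> (0, 37, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def within_pin(version: str, lower=(0, 37, 2), upper=(0, 38)) -> bool:
    """True if lower <= version < upper, mirroring the >=0.37.2,<0.38 pin."""
    return lower <= version_tuple(version) < upper
```

At startup the installed version string could be obtained with importlib.metadata.version("starlette") and passed to within_pin.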
Failure: Docker Compose Not Found
Symptom: docker-compose: command not found even though Docker is installed.
Cause: System has Docker Compose v2 (docker compose) but not v1 (docker-compose). The project documentation referenced docker-compose.
Resolution: Updated documentation to use docker compose (or detect whichever is available). Alternatively, create a symlink or alias.
Prevention: Detect both variants in scripts; document both names.
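Detecting both variants can be sketched as below for scripts written in Python. It prefers the v2 plugin (docker compose) and falls back to the v1 binary (docker-compose). The function names are hypothetical, and the which and runs_ok parameters exist only so the lookup can be replaced in tests.

```python
import shutil
import subprocess


def _runs_ok(cmd):
    """Return True if cmd exits with status 0; its output is discarded."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
        )
    except OSError:
        return False
    return result.returncode == 0


def compose_command(which=shutil.which, runs_ok=_runs_ok):
    """Return the Compose invocation as an argv list, preferring v2.

    Returns None when neither variant is available.
    """
    if which("docker") and runs_ok(["docker", "compose", "version"]):
        return ["docker", "compose"]
    if which("docker-compose"):
        return ["docker-compose"]
    return None
```

A caller would extend the returned list, for example compose_command() + ["up", "-d"], and report a clear error when the result is None.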
Failure: Test Scripts Use Absolute Paths
Symptom: run_all_tests.sh fails with "No such file or directory" for test scenario scripts located in /home/oib/windsurf/aitbc/....
Cause: Test scripts referenced a specific user's home directory, not the project root.
Resolution: Rewrote paths to be project-relative using $(dirname "$0"). Example: $(dirname "$0")/test_scenario_a.sh.
Prevention: Never hardcode absolute paths; always compute relative to project root or script location.
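A simple scan can catch user-specific paths before they reach another host. The sketch below flags lines containing /home/<user>/ or /Users/<user>/ in .py and .sh files; the pattern, the suffix list, and the function name are assumptions chosen for illustration and would need tuning for a real repository.

```python
import re
from pathlib import Path

# Matches user-specific home directories such as /home/<user>/ or /Users/<user>/.
HOME_PATH = re.compile(r"/(?:home|Users)/[^/\s]+/")


def find_hardcoded_paths(root, suffixes=(".py", ".sh")):
    """Return (path, line number, stripped line) for each line with a home path."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if HOME_PATH.search(line):
                hits.append((path, number, line.strip()))
    return hits
```

Running this in CI or a pre-commit hook and failing on a non-empty result would have caught both this failure and the hardcoded sys.path entries in the CLI command modules.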
Failure: Gitea API Unstable During PR Approval
Symptom: Script monitor-prs.py fails to post approvals due to "connection reset" or 5xx errors from Gitea.
Cause: Gitea instance may be under load or temporarily unavailable.
Resolution: Added retry logic with exponential backoff. If the request still fails after the retries, log the error and skip the PR; the next run will attempt the approval again.
Prevention: Make API clients resilient to transient failures.
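Retry with exponential backoff can be sketched as follows. This is not the code in monitor-prs.py; the function name, the default attempt count, and the delays are assumptions. The wrapped call is expected to raise an exception on failure, so a client would need to turn 5xx responses into one of the retried exception types.

```python
import time


def retry_with_backoff(call, attempts=4, base_delay=1.0, factor=2.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call call() until it succeeds or the attempts are exhausted.

    Waits base_delay, base_delay * factor, ... between attempts and
    re-raises the last error so the caller can log it and skip.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= factor
```

ConnectionResetError is a subclass of ConnectionError, so "connection reset" failures are retried by default. The sleep parameter is injectable so the backoff schedule can be checked without waiting.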
Failure: Coordinator API DB Init Not Idempotent
Symptom: Running init_db() multiple times causes sqlite3.IntegrityError due to duplicate index creation.
Cause: init_db() did not handle indexes that already existed; it assumed a fresh DB.
Resolution: Wrapped index creation in try/except blocks catching sqlite3.IntegrityError (or using IF NOT EXISTS where supported). This made initialization idempotent.
Impact: Coordinator can be started repeatedly without manual DB cleanup.
Prevention: Design DB initialization to be idempotent from the start.
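The IF NOT EXISTS variant of the fix can be sketched with the standard sqlite3 module. The jobs table and the idx_jobs_status index are hypothetical placeholders, not the coordinator's actual schema.

```python
import sqlite3


def init_db(conn):
    """Create the schema; safe to call repeatedly on the same database."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
        )
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_jobs_status ON jobs (status)"
        )
```

Declaring the desired end state this way avoids try/except blocks around each statement, and a second call leaves the database unchanged.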
Add new failures chronologically below.