# CI Failures

This file tracks continuous integration failures, their diagnoses, and fixes. Consult it when CI breaks.

---

## CI Failure: Poetry Build Error – Missing README

**Date**: 2026-03-13

**Symptom**: Gitea Actions job fails during `poetry build`:

```
FileNotFoundError: [Errno 2] No such file or directory: 'README.md'
```

**Package**: `packages/py/aitbc-agent-sdk`

**Cause**: The package directory had no `README.md`. Poetry reads that file at build time when `pyproject.toml` lists it as the `readme`, which is presumably the case here.

**Fix**: Added a minimal `README.md` (later expanded with usage examples). Re-ran CI; the build passed.

**Action**: Recorded in `failures/failure-archive.md` as "Package Build Fails Due to Missing README.md".

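A pre-build check can catch this before `poetry build` runs. The sketch below is one possible approach, not something the repository is known to contain: the helper name `packages_missing_readme` is hypothetical, and it assumes every package directory holds a `pyproject.toml` and that the expected file is named `README.md`.

```python
from pathlib import Path


def packages_missing_readme(root: Path, readme_name: str = "README.md") -> list[Path]:
    """Return directories under `root` that contain pyproject.toml but no README.

    Hypothetical helper: the name and the README.md default are assumptions.
    """
    missing = []
    for pyproject in sorted(root.rglob("pyproject.toml")):
        package_dir = pyproject.parent
        # A package is flagged when the README file is absent from its directory.
        if not (package_dir / readme_name).is_file():
            missing.append(package_dir)
    return missing


# Possible use in a CI step (path taken from the entry above):
#   missing = packages_missing_readme(Path("packages/py"))
#   if missing: raise SystemExit(f"README.md missing in: {missing}")
```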

---

## CI Failure: ImportError in CLI Tests

**Symptom**: Test job for `cli` or import validation fails with:

```
ImportError: cannot import name 'trading_surveillance' from 'app.services'
```

**Cause**: One or more of: a Starlette/Broadcast version mismatch, a missing `app/services/__init__.py`, or import path issues.

**Resolution**: Ensured `app/services/__init__.py` exists; fixed the command module imports as described in the failure archive; pinned the Starlette version.

---

## CI Failure: Pytest Fails Due to Database Lock

**Symptom**: Intermittent test failures with `sqlite3.OperationalError: database is locked`.

**Cause**: Tests ran in parallel against the same SQLite file without isolation.

**Fix**: Switched unit tests to in-memory SQLite (`sqlite+aiosqlite:///:memory:`) and ensured each test gets a fresh DB. Alternatively, use a file-based database with `cache=shared` and proper cleanup.

**Action**: Add test isolation to `conftest.py`; ensure fixtures tear down connections.

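The isolation idea can be illustrated with a minimal sketch. It uses the standard-library `sqlite3` module rather than the project's async `aiosqlite` setup, and the `fresh_db` helper and `items` table are hypothetical. In `conftest.py` the same body could be wrapped in a `pytest` fixture instead of a context manager.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def fresh_db(schema: str = "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)"):
    """Yield a private in-memory SQLite connection and always close it.

    Hypothetical helper; the schema is an illustrative assumption.
    """
    conn = sqlite3.connect(":memory:")  # each connection gets its own database
    try:
        conn.executescript(schema)
        yield conn
    finally:
        conn.close()  # tear down so nothing leaks into the next test


def test_insert_is_isolated():
    with fresh_db() as db:
        db.execute("INSERT INTO items (name) VALUES ('a')")
        assert db.execute("SELECT COUNT(*) FROM items").fetchone()[0] == 1
    with fresh_db() as db:  # a later test starts from an empty table
        assert db.execute("SELECT COUNT(*) FROM items").fetchone()[0] == 0
```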

---

## CI Failure: Missing aiohttp Dependency

**Symptom**: Import error for `aiohttp` in `kyc_aml_providers.py`.

**Cause**: The dependency was not declared in `pyproject.toml`.

**Fix**: Added `aiohttp` to the dependencies and pushed the fix; CI passed once the package was installed.

---

## CI Failure: Syntax Error in Sibling's PR

**Symptom**: `monitor-prs.py` auto-requests changes because `py_compile` fails.

**Typical Cause**: Simple syntax mistake (missing colon, unmatched parentheses).

**Response**: Comment on the PR with the syntax error. The developer fixes and pushes; CI re-runs.

**Note**: This is expected behavior; the script is doing its job.

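To reproduce the check locally before pushing, something like the following may help. How `monitor-prs.py` actually invokes `py_compile` is not documented here, so this is only a sketch of the same kind of check; the `syntax_errors` helper is hypothetical.

```python
import py_compile


def syntax_errors(paths):
    """Return {path: message} for every file that fails to byte-compile.

    Hypothetical helper; it mirrors a py_compile check, not monitor-prs.py itself.
    """
    errors = {}
    for path in paths:
        try:
            # doraise=True raises PyCompileError instead of printing to stderr.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            errors[str(path)] = str(exc)
    return errors
```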

---

## CI Failure: Redis Connection Refused

**Symptom**: Tests that rely on Redis connectivity fail:

```
redis.exceptions.ConnectionError: Error 111 connecting to localhost:6379. Connection refused.
```

**Cause**: The Redis service is not running in the CI environment.

**Fix**: Either start Redis in the CI job before the tests run, or mock Redis in the tests. For integration tests that need a real Redis, add a service container or start Redis as a background process.

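When Redis is started as a background process, tests can still race ahead of it. A small wait loop, sketched below with a hypothetical `wait_for_port` helper, is one way to block until the port accepts connections. It only checks TCP reachability and does not speak the Redis protocol; the default host and port are taken from the error message above.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 6379,
                  timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll host:port until a TCP connection succeeds or `timeout` elapses.

    Hypothetical helper; returns True once the port is reachable, else False.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet; retry until the deadline
    return False
```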

---

## CI Failure: Port Already in Use

**Symptom**: A test that starts a server fails with `OSError: [Errno 98] Address already in use`.

**Cause**: A previous test did not shut its server down cleanly, so port 8006 (or another port) is still bound.

**Fix**: Shut servers down properly in test teardown: cancel the `asyncio` tasks and wait for the port to be released. Alternatively, use dynamic port allocation in CI.

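Dynamic port allocation can be sketched by binding to port 0 and letting the OS choose. The `find_free_port` helper is hypothetical, and there is a small window between releasing the port and the server binding it, so passing port 0 directly to the server is safer where the server supports it.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0.

    Hypothetical helper; the port is released again before returning.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "pick any free port"
        return sock.getsockname()[1]
```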

---

## CI Failure: Out of Memory (OOM)

**Symptom**: CI job killed with SIGKILL (exit code 137, i.e. 128 + 9).

**Cause**: Building many packages or running heavy tests exceeded the CI container's memory limit.

**Fix**: Reduce parallelism; use swap if allowed; split CI into smaller jobs; optimize tests.

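Shells report a process killed by signal N as exit code 128 + N, which is how 137 maps to SIGKILL. The hypothetical `describe_exit_code` helper below decodes that convention on POSIX systems. SIGKILL alone does not prove an OOM kill, so the container or kernel logs are still worth checking.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Translate a shell-style exit code into a readable description.

    Hypothetical helper; assumes the 128 + signal-number convention (POSIX).
    """
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            # Above 128 but not a signal number known on this platform.
            return f"exit code {code} (unknown signal {code - 128})"
    return f"exit code {code}"
```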

---

## CI Failure: Permission Denied on Executable Scripts

**Symptom**: `./scripts/claim-task.py: Permission denied` when cron tries to run it.

**Cause**: The script file is not executable (`chmod +x` was never applied).

**Fix**: Run `chmod +x scripts/claim-task.py` and make sure every script is committed with the correct mode.

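A quick audit can list scripts that would hit the same error. This is a sketch under assumptions: the `non_executable_scripts` helper is hypothetical, it only inspects the owner-executable bit, and it assumes the scripts match `*.py` in a single directory such as `scripts/`.

```python
import stat
from pathlib import Path


def non_executable_scripts(directory: Path, pattern: str = "*.py") -> list[Path]:
    """Return files matching `pattern` that lack the owner-executable bit.

    Hypothetical helper; the *.py pattern is an assumption about the repo layout.
    """
    return [
        path
        for path in sorted(directory.glob(pattern))
        if path.is_file() and not path.stat().st_mode & stat.S_IXUSR
    ]
```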

---

*Log new CI failures chronologically.*