# SneakyCode Implementation Roadmap

A phased plan progressing from bare-bones foundation to full autonomous coding agent.

---

## Phase 1 — Foundation: Models, Config, and Utilities

Establish the data layer and shared infrastructure everything else builds on.

| File | Description |
|------|-------------|
| `app/models/config.py` | Pydantic v2 config model — load and validate `config/config.yaml` |
| `app/models/message.py` | Message schema (role, content, tool_calls) |
| `app/models/tool_call.py` | ToolCall and ToolResult schemas |
| `app/utils/logging.py` | Centralized logger with Rich handler |
| `app/utils/display.py` | Rich console output helpers (stub — expanded in Phase 2) |
| `app/utils/file_helpers.py` | Safe path resolution, binary detection, size guards |
| `app/utils/token_counter.py` | Approximate token usage tracking (character-based heuristic for v1) |
| `app/main.py` | Entrypoint stub — arg parsing, config load, Rich console setup |

**Exit criteria:** `python -m app.main --help` runs, config loads and validates, models can be instantiated and serialized.

---

## Phase 2 — TUI and Interactive Shell

Get a working interactive terminal before wiring up the LLM.

| File | Description |
|------|-------------|
| `app/main.py` | Rich-based interactive REPL loop — prompt for user input, display responses |
| `app/utils/display.py` | Formatted output for agent messages, tool calls, errors, token usage |
| `app/agent/context.py` | Session state and conversation history management |

**Exit criteria:** User can type messages into a styled REPL, see them echoed back with formatting, and conversation history is tracked in memory.

---

## Phase 3 — LLM Integration (Ollama)

Connect to the local LLM and stream responses into the TUI.
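
To make the streaming step concrete, here is a minimal sketch of pulling text fragments out of an OpenAI-style server-sent event stream. It assumes the endpoint emits `data: {...}` lines with the text under `choices[].delta.content` and ends with `data: [DONE]`, which is the usual shape for OpenAI-compatible streaming. The function name `iter_content_deltas` is hypothetical, and the sketch works on already-decoded lines rather than a live HTTP response.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

In a real client the lines would come from the HTTP response body, and each yielded fragment would be appended to the live display as it arrives.
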

| File | Description |
|------|-------------|
| `app/services/llm.py` | Async httpx client wrapping Ollama's OpenAI-compatible `/v1/chat/completions` endpoint |
| `app/services/streaming.py` | SSE parsing, Rich live display, tool call extraction from accumulated stream |

**Integration:** Wire LLM into the REPL — user message goes to LLM, streamed response displays in real time.

**Exit criteria:** User can chat with the local model through the TUI with streamed output. Tool call JSON is parsed from the stream but not yet executed.

---

## Phase 4 — Tool Framework and Core Tools

Build the tool abstraction and implement safe, read-only tools first.

| File | Description |
|------|-------------|
| `app/tools/base.py` | `BaseTool` ABC and `ToolResult` dataclass |
| `app/tools/registry.py` | Tool registration, discovery, and JSON schema export for LLM system prompt |
| `app/services/permissions.py` | Two-tier approval gating (auto-approve reads; prompt for writes/deletes/shell) |
| `app/tools/filesystem.py` | `read_file`, `list_dir` |
| `app/tools/search.py` | `grep_files`, `find_files` |

**Exit criteria:** Tools register themselves, schemas export correctly for inclusion in the system prompt, read-only tools execute and return `ToolResult` objects. Permissions service gates execution.

---

## Phase 5 — Agent Loop (ReAct)

The core autonomy layer — reason, act, observe, repeat.

| File | Description |
|------|-------------|
| `app/agent/loop.py` | ReAct cycle: send conversation to LLM, parse tool calls, execute, feed results back, repeat |

**Key behaviors:**

- System prompt constructed with tool schemas from registry
- Permissions checks before each tool execution
- Loop termination on: plain-text response (no tool calls), explicit `finish` tool call, or `max_iterations` exceeded

**Exit criteria:** Agent can autonomously answer questions about the codebase by chaining `read_file`, `list_dir`, `grep_files`, and `find_files` tool calls in a multi-turn loop.
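
The loop described in this phase can be sketched roughly as follows. This is an illustration under stated assumptions, not the planned contents of `app/agent/loop.py`: messages are plain dicts, the LLM is any callable that returns an assistant message, tools are plain callables keyed by name, and `run_agent`, `approve`, and the `summary` argument of `finish` are hypothetical names.

```python
from typing import Any, Callable

Message = dict[str, Any]  # assumed shape: {"role": ..., "content": ..., "tool_calls": [...]}


def run_agent(
    llm: Callable[[list[Message]], Message],
    tools: dict[str, Callable[..., Any]],
    approve: Callable[[str, dict], bool],
    messages: list[Message],
    max_iterations: int = 10,
) -> str:
    """Reason, act, observe, repeat until one of the three stop conditions."""
    for _ in range(max_iterations):
        reply = llm(messages)
        messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            return reply.get("content", "")  # plain-text response ends the loop
        for call in calls:
            name, args = call["name"], call.get("arguments", {})
            if name == "finish":
                return args.get("summary", "")  # explicit finish tool call
            if name not in tools:
                result = f"error: unknown tool {name!r}"
            elif not approve(name, args):  # permissions check before execution
                result = "error: denied by user"
            else:
                result = tools[name](**args)
            # Feed the observation back so the next LLM call can use it.
            messages.append({"role": "tool", "name": name, "content": str(result)})
    return "stopped: max_iterations exceeded"
```

Passing the LLM and the approval check in as callables keeps the loop testable with scripted replies before the real services exist.
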

---

## Phase 6 — Write Tools and Shell

Unlock the agent's ability to modify code and run commands.

| File | Description |
|------|-------------|
| `app/tools/filesystem.py` | `write_file`, `make_dir`, `delete_file` (additions to existing module) |
| `app/tools/edit.py` | `str_replace` (unique-match required), `patch_apply` |
| `app/tools/shell.py` | `run_command` with command allow/deny lists and output truncation |

**All write/shell operations gated through permissions service.**

**Exit criteria:** Agent can autonomously create files, edit code via string replacement, and run shell commands — all with user approval for destructive operations.

---

## Phase 7 — Polish and Hardening

Production-readiness: error handling, resource limits, and documentation.

| Area | Description |
|------|-------------|
| Error handling | Recovery from malformed tool calls, LLM errors, network timeouts in agent loop |
| Token budget | Conversation truncation or summarization when approaching context limit |
| Graceful shutdown | Clean Ctrl+C handling, session state preservation |
| Testing | End-to-end integration tests (`tests/integration/`), unit tests (`tests/unit/`) |
| Documentation | `README.md` with setup and usage instructions, `docs/tools.md` tool reference |

**Exit criteria:** Agent handles edge cases gracefully, tests pass, and a new user can set up and use the project from the README alone.
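
As one possible reading of the token budget item above, the sketch below truncates history using a character-based estimate. The ratio of four characters per token, the function names `estimate_tokens` and `truncate_history`, and the policy of always keeping system messages plus the newest message are all assumptions made for illustration. Summarization, the other option the table mentions, is not shown.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate; the 4-characters-per-token ratio is an assumption."""
    return max(1, len(text) // chars_per_token)


def truncate_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep system messages plus the newest messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(estimate_tokens(m["content"]) for m in system)
    kept: list[dict] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if kept and used + cost > budget:
            break  # older messages no longer fit
        kept.append(message)  # the newest message is kept even if over budget
        used += cost
    return system + kept[::-1]  # restore chronological order
```

Dropping whole messages from the oldest end is the simplest policy. A fuller version would likely need to keep each tool call together with its result so the model never sees one without the other.
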

---

## File Coverage

Every file from the project structure in CLAUDE.md is accounted for:

| File | Phase |
|------|-------|
| `app/main.py` | 1, 2 |
| `app/models/config.py` | 1 |
| `app/models/message.py` | 1 |
| `app/models/tool_call.py` | 1 |
| `app/utils/logging.py` | 1 |
| `app/utils/display.py` | 1, 2 |
| `app/utils/file_helpers.py` | 1 |
| `app/utils/token_counter.py` | 1 |
| `app/agent/context.py` | 2 |
| `app/services/llm.py` | 3 |
| `app/services/streaming.py` | 3 |
| `app/tools/base.py` | 4 |
| `app/tools/registry.py` | 4 |
| `app/services/permissions.py` | 4 |
| `app/tools/filesystem.py` | 4, 6 |
| `app/tools/search.py` | 4 |
| `app/agent/loop.py` | 5 |
| `app/tools/edit.py` | 6 |
| `app/tools/shell.py` | 6 |