Files
SneakyCode/docs/ROADMAP.md
2026-03-11 07:21:21 -05:00

145 lines
5.8 KiB
Markdown

# SneakyCode Implementation Roadmap
A phased plan progressing from bare-bones foundation to full autonomous coding agent.
---
## Phase 1 — Foundation: Models, Config, and Utilities
Establish the data layer and shared infrastructure everything else builds on.
| File | Description |
|------|-------------|
| `app/models/config.py` | Pydantic v2 config model — load and validate `config/config.yaml` |
| `app/models/message.py` | Message schema (role, content, tool_calls) |
| `app/models/tool_call.py` | ToolCall and ToolResult schemas |
| `app/utils/logging.py` | Centralized logger with Rich handler |
| `app/utils/display.py` | Rich console output helpers (stub — expanded in Phase 2) |
| `app/utils/file_helpers.py` | Safe path resolution, binary detection, size guards |
| `app/utils/token_counter.py` | Approximate token usage tracking (character-based heuristic for v1) |
| `app/main.py` | Entrypoint stub — arg parsing, config load, Rich console setup |
**Exit criteria:** `python -m app.main --help` runs, config loads and validates, models can be instantiated and serialized.
---
## Phase 2 — TUI and Interactive Shell
Get a working interactive terminal before wiring up the LLM.
| File | Description |
|------|-------------|
| `app/main.py` | Rich-based interactive REPL loop — prompt for user input, display responses |
| `app/utils/display.py` | Formatted output for agent messages, tool calls, errors, token usage |
| `app/agent/context.py` | Session state and conversation history management |
**Exit criteria:** User can type messages into a styled REPL, see them echoed back with formatting, and conversation history is tracked in memory.
---
## Phase 3 — LLM Integration (Ollama)
Connect to the local LLM and stream responses into the TUI.
| File | Description |
|------|-------------|
| `app/services/llm.py` | Async httpx client wrapping Ollama's OpenAI-compatible `/v1/chat/completions` endpoint |
| `app/services/streaming.py` | SSE parsing, Rich live display, tool call extraction from accumulated stream |
**Integration:** Wire LLM into the REPL — user message goes to LLM, streamed response displays in real time.
**Exit criteria:** User can chat with the local model through the TUI with streamed output. Tool call JSON is parsed from the stream but not yet executed.
---
## Phase 4 — Tool Framework and Core Tools
Build the tool abstraction and implement safe, read-only tools first.
| File | Description |
|------|-------------|
| `app/tools/base.py` | `BaseTool` ABC and `ToolResult` dataclass |
| `app/tools/registry.py` | Tool registration, discovery, and JSON schema export for LLM system prompt |
| `app/services/permissions.py` | Two-tier approval gating (auto-approve reads; prompt for writes/deletes/shell) |
| `app/tools/filesystem.py` | `read_file`, `list_dir` |
| `app/tools/search.py` | `grep_files`, `find_files` |
**Exit criteria:** Tools register themselves, schemas export correctly for inclusion in the system prompt, read-only tools execute and return `ToolResult` objects. Permissions service gates execution.
---
## Phase 5 — Agent Loop (ReAct)
The core autonomy layer — reason, act, observe, repeat.
| File | Description |
|------|-------------|
| `app/agent/loop.py` | ReAct cycle: send conversation to LLM, parse tool calls, execute, feed results back, repeat |
**Key behaviors:**
- System prompt constructed with tool schemas from registry
- Permissions checks before each tool execution
- Loop termination on: plain-text response (no tool calls), explicit `finish` tool call, or `max_iterations` exceeded
**Exit criteria:** Agent can autonomously answer questions about the codebase by chaining `read_file`, `list_dir`, `grep_files`, and `find_files` tool calls in a multi-turn loop.
---
## Phase 6 — Write Tools and Shell
Unlock the agent's ability to modify code and run commands.
| File | Description |
|------|-------------|
| `app/tools/filesystem.py` | `write_file`, `make_dir`, `delete_file` (additions to existing module) |
| `app/tools/edit.py` | `str_replace` (unique-match required), `patch_apply` |
| `app/tools/shell.py` | `run_command` with command allow/deny lists and output truncation |
**All write/shell operations gated through permissions service.**
**Exit criteria:** Agent can autonomously create files, edit code via string replacement, and run shell commands — all with user approval for destructive operations.
---
## Phase 7 — Polish and Hardening
Production-readiness: error handling, resource limits, and documentation.
| Area | Description |
|------|-------------|
| Error handling | Recovery from malformed tool calls, LLM errors, network timeouts in agent loop |
| Token budget | Conversation truncation or summarization when approaching context limit |
| Graceful shutdown | Clean Ctrl+C handling, session state preservation |
| Testing | End-to-end integration tests (`tests/integration/`), unit tests (`tests/unit/`) |
| Documentation | `README.md` with setup and usage instructions, `docs/tools.md` tool reference |
**Exit criteria:** Agent handles edge cases gracefully, tests pass, and a new user can set up and use the project from the README alone.
---
## File Coverage
Every file from the project structure in CLAUDE.md is accounted for:
| File | Phase |
|------|-------|
| `app/main.py` | 1, 2 |
| `app/models/config.py` | 1 |
| `app/models/message.py` | 1 |
| `app/models/tool_call.py` | 1 |
| `app/utils/logging.py` | 1 |
| `app/utils/display.py` | 1, 2 |
| `app/utils/file_helpers.py` | 1 |
| `app/utils/token_counter.py` | 1 |
| `app/agent/context.py` | 2 |
| `app/services/llm.py` | 3 |
| `app/services/streaming.py` | 3 |
| `app/tools/base.py` | 4 |
| `app/tools/registry.py` | 4 |
| `app/services/permissions.py` | 4 |
| `app/tools/filesystem.py` | 4, 6 |
| `app/tools/search.py` | 4 |
| `app/agent/loop.py` | 5 |
| `app/tools/edit.py` | 6 |
| `app/tools/shell.py` | 6 |