Files
SneakyCode/docs/ROADMAP.md
2026-03-11 07:21:21 -05:00

5.8 KiB

SneakyCode Implementation Roadmap

A phased plan progressing from bare-bones foundation to full autonomous coding agent.


Phase 1 — Foundation: Models, Config, and Utilities

Establish the data layer and shared infrastructure everything else builds on.

File Description
app/models/config.py Pydantic v2 config model — load and validate config/config.yaml
app/models/message.py Message schema (role, content, tool_calls)
app/models/tool_call.py ToolCall and ToolResult schemas
app/utils/logging.py Centralized logger with Rich handler
app/utils/display.py Rich console output helpers (stub — expanded in Phase 2)
app/utils/file_helpers.py Safe path resolution, binary detection, size guards
app/utils/token_counter.py Approximate token usage tracking (character-based heuristic for v1)
app/main.py Entrypoint stub — arg parsing, config load, Rich console setup

Exit criteria: python -m app.main --help runs, config loads and validates, models can be instantiated and serialized.


Phase 2 — TUI and Interactive Shell

Get a working interactive terminal before wiring up the LLM.

File Description
app/main.py Rich-based interactive REPL loop — prompt for user input, display responses
app/utils/display.py Formatted output for agent messages, tool calls, errors, token usage
app/agent/context.py Session state and conversation history management

Exit criteria: User can type messages into a styled REPL, see them echoed back with formatting, and conversation history is tracked in memory.


Phase 3 — LLM Integration (Ollama)

Connect to the local LLM and stream responses into the TUI.

File Description
app/services/llm.py Async httpx client wrapping Ollama's OpenAI-compatible /v1/chat/completions endpoint
app/services/streaming.py SSE parsing, Rich live display, tool call extraction from accumulated stream

Integration: Wire LLM into the REPL — user message goes to LLM, streamed response displays in real time.

Exit criteria: User can chat with the local model through the TUI with streamed output. Tool call JSON is parsed from the stream but not yet executed.


Phase 4 — Tool Framework and Core Tools

Build the tool abstraction and implement safe, read-only tools first.

File Description
app/tools/base.py BaseTool ABC and ToolResult dataclass
app/tools/registry.py Tool registration, discovery, and JSON schema export for LLM system prompt
app/services/permissions.py Two-tier approval gating (auto-approve reads; prompt for writes/deletes/shell)
app/tools/filesystem.py read_file, list_dir
app/tools/search.py grep_files, find_files

Exit criteria: Tools register themselves, schemas export correctly for inclusion in the system prompt, read-only tools execute and return ToolResult objects. Permissions service gates execution.


Phase 5 — Agent Loop (ReAct)

The core autonomy layer — reason, act, observe, repeat.

File Description
app/agent/loop.py ReAct cycle: send conversation to LLM, parse tool calls, execute, feed results back, repeat

Key behaviors:

  • System prompt constructed with tool schemas from registry
  • Permissions checks before each tool execution
  • Loop termination on: plain-text response (no tool calls), explicit finish tool call, or max_iterations exceeded

Exit criteria: Agent can autonomously answer questions about the codebase by chaining read_file, list_dir, grep_files, and find_files tool calls in a multi-turn loop.


Phase 6 — Write Tools and Shell

Unlock the agent's ability to modify code and run commands.

File Description
app/tools/filesystem.py write_file, make_dir, delete_file (additions to existing module)
app/tools/edit.py str_replace (unique-match required), patch_apply
app/tools/shell.py run_command with command allow/deny lists and output truncation

All write/shell operations gated through permissions service.

Exit criteria: Agent can autonomously create files, edit code via string replacement, and run shell commands — all with user approval for destructive operations.


Phase 7 — Polish and Hardening

Production-readiness: error handling, resource limits, and documentation.

Area Description
Error handling Recovery from malformed tool calls, LLM errors, network timeouts in agent loop
Token budget Conversation truncation or summarization when approaching context limit
Graceful shutdown Clean Ctrl+C handling, session state preservation
Testing End-to-end integration tests (tests/integration/), unit tests (tests/unit/)
Documentation README.md with setup and usage instructions, docs/tools.md tool reference

Exit criteria: Agent handles edge cases gracefully, tests pass, and a new user can set up and use the project from the README alone.


File Coverage

Every file from the project structure in CLAUDE.md is accounted for:

File Phase
app/main.py 1, 2
app/models/config.py 1
app/models/message.py 1
app/models/tool_call.py 1
app/utils/logging.py 1
app/utils/display.py 1, 2
app/utils/file_helpers.py 1
app/utils/token_counter.py 1
app/agent/context.py 2
app/services/llm.py 3
app/services/streaming.py 3
app/tools/base.py 4
app/tools/registry.py 4
app/services/permissions.py 4
app/tools/filesystem.py 4, 6
app/tools/search.py 4
app/agent/loop.py 5
app/tools/edit.py 6
app/tools/shell.py 6