5.8 KiB
SneakyCode Implementation Roadmap
A phased plan progressing from bare-bones foundation to full autonomous coding agent.
Phase 1 — Foundation: Models, Config, and Utilities
Establish the data layer and shared infrastructure everything else builds on.
| File | Description |
|---|---|
app/models/config.py |
Pydantic v2 config model — load and validate config/config.yaml |
app/models/message.py |
Message schema (role, content, tool_calls) |
app/models/tool_call.py |
ToolCall and ToolResult schemas |
app/utils/logging.py |
Centralized logger with Rich handler |
app/utils/display.py |
Rich console output helpers (stub — expanded in Phase 2) |
app/utils/file_helpers.py |
Safe path resolution, binary detection, size guards |
app/utils/token_counter.py |
Approximate token usage tracking (character-based heuristic for v1) |
app/main.py |
Entrypoint stub — arg parsing, config load, Rich console setup |
Exit criteria: python -m app.main --help runs, config loads and validates, models can be instantiated and serialized.
Phase 2 — TUI and Interactive Shell
Get a working interactive terminal before wiring up the LLM.
| File | Description |
|---|---|
app/main.py |
Rich-based interactive REPL loop — prompt for user input, display responses |
app/utils/display.py |
Formatted output for agent messages, tool calls, errors, token usage |
app/agent/context.py |
Session state and conversation history management |
Exit criteria: User can type messages into a styled REPL, see them echoed back with formatting, and conversation history is tracked in memory.
Phase 3 — LLM Integration (Ollama)
Connect to the local LLM and stream responses into the TUI.
| File | Description |
|---|---|
app/services/llm.py |
Async httpx client wrapping Ollama's OpenAI-compatible /v1/chat/completions endpoint |
app/services/streaming.py |
SSE parsing, Rich live display, tool call extraction from accumulated stream |
Integration: Wire LLM into the REPL — user message goes to LLM, streamed response displays in real time.
Exit criteria: User can chat with the local model through the TUI with streamed output. Tool call JSON is parsed from the stream but not yet executed.
Phase 4 — Tool Framework and Core Tools
Build the tool abstraction and implement safe, read-only tools first.
| File | Description |
|---|---|
app/tools/base.py |
BaseTool ABC and ToolResult dataclass |
app/tools/registry.py |
Tool registration, discovery, and JSON schema export for LLM system prompt |
app/services/permissions.py |
Two-tier approval gating (auto-approve reads; prompt for writes/deletes/shell) |
app/tools/filesystem.py |
read_file, list_dir |
app/tools/search.py |
grep_files, find_files |
Exit criteria: Tools register themselves, schemas export correctly for inclusion in the system prompt, read-only tools execute and return ToolResult objects. Permissions service gates execution.
Phase 5 — Agent Loop (ReAct)
The core autonomy layer — reason, act, observe, repeat.
| File | Description |
|---|---|
app/agent/loop.py |
ReAct cycle: send conversation to LLM, parse tool calls, execute, feed results back, repeat |
Key behaviors:
- System prompt constructed with tool schemas from registry
- Permissions checks before each tool execution
- Loop termination on: plain-text response (no tool calls), explicit
finishtool call, ormax_iterationsexceeded
Exit criteria: Agent can autonomously answer questions about the codebase by chaining read_file, list_dir, grep_files, and find_files tool calls in a multi-turn loop.
Phase 6 — Write Tools and Shell
Unlock the agent's ability to modify code and run commands.
| File | Description |
|---|---|
app/tools/filesystem.py |
write_file, make_dir, delete_file (additions to existing module) |
app/tools/edit.py |
str_replace (unique-match required), patch_apply |
app/tools/shell.py |
run_command with command allow/deny lists and output truncation |
All write/shell operations gated through permissions service.
Exit criteria: Agent can autonomously create files, edit code via string replacement, and run shell commands — all with user approval for destructive operations.
Phase 7 — Polish and Hardening
Production-readiness: error handling, resource limits, and documentation.
| Area | Description |
|---|---|
| Error handling | Recovery from malformed tool calls, LLM errors, network timeouts in agent loop |
| Token budget | Conversation truncation or summarization when approaching context limit |
| Graceful shutdown | Clean Ctrl+C handling, session state preservation |
| Testing | End-to-end integration tests (tests/integration/), unit tests (tests/unit/) |
| Documentation | README.md with setup and usage instructions, docs/tools.md tool reference |
Exit criteria: Agent handles edge cases gracefully, tests pass, and a new user can set up and use the project from the README alone.
File Coverage
Every file from the project structure in CLAUDE.md is accounted for:
| File | Phase |
|---|---|
app/main.py |
1, 2 |
app/models/config.py |
1 |
app/models/message.py |
1 |
app/models/tool_call.py |
1 |
app/utils/logging.py |
1 |
app/utils/display.py |
1, 2 |
app/utils/file_helpers.py |
1 |
app/utils/token_counter.py |
1 |
app/agent/context.py |
2 |
app/services/llm.py |
3 |
app/services/streaming.py |
3 |
app/tools/base.py |
4 |
app/tools/registry.py |
4 |
app/services/permissions.py |
4 |
app/tools/filesystem.py |
4, 6 |
app/tools/search.py |
4 |
app/agent/loop.py |
5 |
app/tools/edit.py |
6 |
app/tools/shell.py |
6 |