Compare commits
28 Commits
f93b28215d...dev
| Author | SHA1 | Date | |
|---|---|---|---|
| | 6145a23296 | | |
| | 16d79df421 | | |
| | 1ee721ac10 | | |
| | d54a3480b8 | | |
| | d3b286ba40 | | |
| | d829e6553c | | |
| | 2c532adbbc | | |
| | be1ea81102 | | |
| | 3fe0f7af47 | | |
| | 05754fe06b | | |
| | 0886727437 | | |
| | 638aecb561 | | |
| | b878408f3e | | |
| | 5b5c3098bb | | |
| | 4e3da84578 | | |
| | 2ad3df521d | | |
| | 4496fce354 | | |
| | 133bcbda57 | | |
| | 7705008b9c | | |
| | 9273d14845 | | |
| | f0d8ef8f0a | | |
| | 25fa7dc82b | | |
| | 220c6613e4 | | |
| | 22f10cd8e9 | | |
| | 2ae8294e29 | | |
| | 26bcbc6c1f | | |
| | cc03f76593 | | |
| | 90a38f12d1 | | |
3
.gitignore
vendored
@@ -34,3 +34,6 @@ htmlcov/
 
 # Worktrees
 .worktrees/
+
+# SneakyCode local data
+.sneakycode/
40
.sneakycode/skills/brainstorm/prompt.md
Normal file
@@ -0,0 +1,40 @@
+# Brainstorm Skill
+
+You are in **brainstorming mode**. Your goal is creative ideation — generating multiple approaches, exploring trade-offs, and helping the user think through possibilities before committing to an implementation.
+
+## Process
+
+1. **Clarify the goal**: Make sure you understand what the user wants to achieve. Ask clarifying questions if needed.
+2. **Divergent thinking**: Generate at least 3 distinct approaches. Push beyond the obvious — include creative or unconventional options.
+3. **Evaluate trade-offs**: For each approach, identify:
+   - Pros and cons
+   - Complexity and effort estimate (low / medium / high)
+   - Risk factors
+   - What it enables or prevents in the future
+4. **Synthesize**: Recommend your top pick with reasoning, but present all options fairly.
+5. **Refine**: Ask the user which direction appeals to them and iterate.
+
+## Guidelines
+
+- Read relevant code first to ground your suggestions in reality (the explore skill has already run if chained).
+- Don't just list options — explain *why* each one is interesting or viable.
+- Be bold. Brainstorming is the place for ambitious ideas.
+- If the user's initial framing seems limiting, gently challenge it.
+- Avoid implementation details at this stage — focus on approach and design.
+
+## Output Format
+
+Present options as numbered approaches with clear headings:
+
+### Approach 1: [Name]
+[Description, pros, cons, complexity]
+
+### Approach 2: [Name]
+[Description, pros, cons, complexity]
+
+### Approach 3: [Name]
+[Description, pros, cons, complexity]
+
+**Recommendation**: [Your pick and why]
+
+When brainstorming is complete and the user has chosen a direction, call `finish_skill` summarizing the chosen approach.
9
.sneakycode/skills/brainstorm/skill.yaml
Normal file
@@ -0,0 +1,9 @@
+name: brainstorm
+description: Creative ideation — divergent thinking, option generation, structured exploration
+version: "1.0"
+triggers: ["/brainstorm", "/bs"]
+config_overrides:
+  temperature: 1.2
+tools_disable: [write_file, make_dir, delete_file, str_replace, patch_apply, run_command]
+chain: [explore]
+prompts: [prompt.md]
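The `chain` and `tools_disable` fields suggest how a skill might be activated: chained skills run first, and the listed tools are hidden from the model while the skill is active. The following is a minimal sketch of that idea only. `SkillManifest`, `resolve_chain`, and `filter_tools` are hypothetical names that do not appear in this diff, and the real `SkillRunner` may behave differently.

```python
from dataclasses import dataclass, field


@dataclass
class SkillManifest:
    """Hypothetical in-memory form of a skill.yaml file (not from the diff)."""

    name: str
    chain: list[str] = field(default_factory=list)
    tools_disable: list[str] = field(default_factory=list)


def resolve_chain(name: str, manifests: dict[str, SkillManifest]) -> list[str]:
    """Return skills in run order: chained prerequisites first, then `name`."""
    order: list[str] = []
    visiting: set[str] = set()

    def visit(skill: str) -> None:
        if skill in order:
            return  # already scheduled
        if skill in visiting:
            raise ValueError(f"cycle in skill chain at {skill!r}")
        visiting.add(skill)
        for dependency in manifests[skill].chain:
            visit(dependency)
        visiting.discard(skill)
        order.append(skill)

    visit(name)
    return order


def filter_tools(all_tools: list[str], manifest: SkillManifest) -> list[str]:
    """Drop every tool the manifest lists under tools_disable."""
    disabled = set(manifest.tools_disable)
    return [tool for tool in all_tools if tool not in disabled]
```

Under these assumptions, resolving `brainstorm` would schedule `explore` before it, and a registry could pass its tool names through `filter_tools` before building the schema sent to the model.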
31
.sneakycode/skills/explore/prompt.md
Normal file
@@ -0,0 +1,31 @@
+# Explore Skill
+
+You are in **exploration mode**. Your goal is to deeply understand the codebase or a specific area of it. Do NOT make any changes — only read, search, and analyze.
+
+## Approach
+
+1. **Start broad**: Use `list_dir` and `find_files` to understand the project structure
+2. **Trace paths**: Follow imports, function calls, and data flow through the code
+3. **Map relationships**: Identify which files depend on which, and how components interact
+4. **Read carefully**: Use `read_file` to examine key files in detail
+5. **Search patterns**: Use `grep_files` to find usage patterns, implementations, and references
+
+## Output Format
+
+Produce a structured summary with:
+
+- **Architecture overview**: High-level description of the system's structure
+- **Key components**: List of important files/classes and their responsibilities
+- **Data flow**: How data moves through the system (requests, transformations, storage)
+- **Dependencies**: Internal and external dependency map
+- **Patterns**: Design patterns, conventions, and idioms used in the codebase
+- **Observations**: Anything notable — potential issues, tech debt, clever solutions
+
+## Guidelines
+
+- Be thorough but focused. If the user specified an area, concentrate there.
+- Don't guess — read the actual code before making claims.
+- Quote specific file paths and line numbers when referencing code.
+- If you find something unexpected or concerning, flag it clearly.
+
+When you have completed your exploration, call `finish_skill` with a brief summary of your findings.
9
.sneakycode/skills/explore/skill.yaml
Normal file
@@ -0,0 +1,9 @@
+name: explore
+description: Deep codebase exploration — traces paths, maps architecture, summarizes findings
+version: "1.0"
+triggers: ["/explore", "/ex"]
+config_overrides:
+  temperature: 0.3
+tools_disable: [write_file, make_dir, delete_file, str_replace, patch_apply, run_command]
+chain: []
+prompts: [prompt.md]
50
.sneakycode/skills/plan/prompt.md
Normal file
@@ -0,0 +1,50 @@
+# Plan Skill
+
+You are in **planning mode**. Your goal is to break down a task into a clear, actionable implementation plan. The explore skill has already run (if chained), so you have codebase context.
+
+## Process
+
+1. **Define scope**: Clearly state what the plan covers and what it does not.
+2. **Decompose**: Break the task into discrete, ordered steps. Each step should be:
+   - Small enough to implement in one focused session
+   - Clear enough that someone unfamiliar could follow it
+   - Testable — you can verify the step was done correctly
+3. **Identify dependencies**: Note which steps depend on others and the critical path.
+4. **Map to files**: For each step, list the specific files to create or modify.
+5. **Flag risks**: Identify anything that could go wrong, require decisions, or block progress.
+
+## Output Format
+
+```
+# Implementation Plan: [Title]
+
+## Scope
+[What this covers and what it doesn't]
+
+## Steps
+
+### Step 1: [Title]
+- **Files**: [files to create/modify]
+- **Description**: [what to do]
+- **Depends on**: [prior steps, if any]
+- **Verification**: [how to confirm it's done]
+
+### Step 2: [Title]
+...
+
+## Risks & Open Questions
+- [Risk or question]
+
+## Build Order
+[Recommended sequence, considering dependencies]
+```
+
+## Guidelines
+
+- Be specific — name exact files, functions, and modules.
+- Keep steps granular. "Implement the backend" is too vague. "Add the /api/users endpoint with GET and POST handlers" is good.
+- Consider both happy path and error cases in your plan.
+- If you need to make assumptions, state them explicitly.
+- Use `run_command` if you need to check project state (e.g., installed packages, running services).
+
+When the plan is complete and the user has approved it, call `finish_skill` with a one-line summary.
9
.sneakycode/skills/plan/skill.yaml
Normal file
@@ -0,0 +1,9 @@
+name: plan
+description: Break down tasks, create roadmaps, plan implementations
+version: "1.0"
+triggers: ["/plan"]
+config_overrides:
+  temperature: 0.5
+tools_disable: [write_file, make_dir, delete_file, str_replace, patch_apply]
+chain: [explore]
+prompts: [prompt.md]
47
.sneakycode/skills/write-document/prompt.md
Normal file
@@ -0,0 +1,47 @@
+# Write Document Skill
+
+You are in **document writing mode**. Your goal is to draft, edit, or improve written documents — READMEs, technical specs, changelogs, guides, or any prose content.
+
+## Workflow
+
+### 1. Understand the Document
+- What type of document? (README, spec, changelog, tutorial, etc.)
+- Who is the audience? (developers, users, stakeholders)
+- What is the desired tone? (formal, casual, technical)
+- Are there existing documents to reference or update?
+
+### 2. Outline
+Before writing, propose a structure:
+- List the main sections
+- Note what each section should cover
+- Get user approval on the outline before drafting
+
+### 3. Draft
+Write the full document based on the approved outline:
+- Use clear, concise language
+- Follow Markdown formatting conventions
+- Include code examples where appropriate
+- Be specific — avoid vague statements
+
+### 4. Revise
+After the initial draft:
+- Check for consistency in tone and terminology
+- Verify technical accuracy by reading referenced code
+- Ensure all sections from the outline are covered
+- Trim unnecessary content
+
+## Document Templates
+
+**README**: Project name, description, installation, usage, configuration, contributing, license
+**Technical Spec**: Context, goals, non-goals, design, alternatives considered, implementation plan
+**Changelog**: Version, date, categories (Added, Changed, Fixed, Removed)
+**Guide/Tutorial**: Prerequisites, step-by-step instructions, examples, troubleshooting
+
+## Guidelines
+
+- Read existing project docs and code to ensure accuracy.
+- Match the existing documentation style if updating.
+- Prefer concrete examples over abstract descriptions.
+- Use the `write_file` tool to save the document when the user approves.
+
+When the document is complete and saved, call `finish_skill` with a summary of what was written.
8
.sneakycode/skills/write-document/skill.yaml
Normal file
@@ -0,0 +1,8 @@
+name: write-document
+description: Draft and edit documents — READMEs, specs, changelogs, prose
+version: "1.0"
+triggers: ["/write-doc", "/doc"]
+config_overrides:
+  temperature: 0.7
+chain: []
+prompts: [prompt.md]
122
README.md
@@ -26,31 +26,79 @@ pip install -e ".[dev]"
 
 ## Configuration
 
-Edit `config/config.yaml` to configure the agent. Key settings:
+Edit `config/config.yaml` to configure the agent. The full configuration reference:
 
 ```yaml
 llm:
   model: "qwen3.5:latest"  # Ollama model name
   endpoint: "http://localhost:11434"  # Ollama endpoint
-  max_retries: 3  # Retry attempts on transient errors
-  retry_backoff_base: 1.0  # Exponential backoff base (seconds)
+  api_path: "/v1/chat/completions"  # API endpoint path
+  temperature: 0.1  # Sampling temperature
+  max_tokens: 4096  # Maximum tokens in LLM response
+  timeout: 120  # Request timeout in seconds
+  max_retries: 3  # Retry attempts on transient errors
+  retry_backoff_base: 1.0  # Exponential backoff base (seconds)
+  retry_backoff_max: 30.0  # Maximum backoff seconds
 
 agent:
   max_iterations: 25  # Max tool-call iterations per turn
   max_conversation_tokens: 32000  # Token budget for conversation
   workspace_root: "."  # Project directory for file operations
   truncation_keep_recent: 10  # Messages preserved during truncation
   truncation_threshold: 0.85  # Budget fraction that triggers truncation
 
 session:
-  auto_save: true  # Save session after each turn
-  max_session_age_hours: 72  # Auto-cleanup old sessions
-  offer_resume: true  # Offer to resume on startup
+  session_dir: ".sneakycode/sessions"  # Directory for session files
+  auto_save: true  # Save session after each turn
+  max_session_age_hours: 72  # Auto-cleanup old sessions
+  offer_resume: true  # Offer to resume on startup
 
 permissions:
   auto_approve: [read_file, list_dir, grep_files, find_files, finish]
   prompt_user: [write_file, delete_file, run_command, str_replace, patch_apply, make_dir]
   deny: []
+
+tools:
+  shell:
+    allowed_commands:  # Commands the LLM may run
+      - git
+      - python
+      - pip
+      - pytest
+      - ruff
+      - ls
+      - cat
+      - head
+      - tail
+      - wc
+      - diff
+      - grep
+      - find
+      - echo
+    denied_commands:  # Blocked commands
+      - rm -rf /
+      - sudo
+      - curl
+      - wget
+    max_output_bytes: 65536  # Max captured output size (bytes)
+  filesystem:
+    max_file_size_bytes: 1048576  # 1 MB — max file size for read/write
+    binary_detection: true  # Detect and reject binary files
+
+display:
+  show_tool_calls: true  # Show tool call details in output
+  show_token_usage: true  # Show token usage stats
+  stream_output: true  # Stream LLM output to terminal
+
+skills:
+  enabled: true  # Enable the skills system
+  directories:  # Directories to scan for skill files
+    - ".sneakycode/skills"
+
+debug:
+  enabled: false  # Enable debug logging
+  log_dir: ".sneakycode/logs"  # Debug log directory
+  max_files: 10  # Max debug log files to retain
 ```
 
 Environment variable `SNEAKYCODE_CONFIG` can override the config file path.
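Two of the settings above imply simple arithmetic that is easy to check by hand. The sketch below is an illustration only: the README names the settings but not the formulas, so the doubling schedule in `backoff_delays` and the rounding in `truncation_trigger` are assumptions, and both function names are hypothetical.

```python
def backoff_delays(max_retries: int, base: float, cap: float) -> list[float]:
    """Delay before each retry, assuming base * 2**attempt clamped to cap."""
    return [min(base * (2 ** attempt), cap) for attempt in range(max_retries)]


def truncation_trigger(max_conversation_tokens: int, threshold: float) -> int:
    """Token count at which truncation would start, rounded to a whole token."""
    return round(max_conversation_tokens * threshold)


# Values taken from the config reference above.
delays = backoff_delays(max_retries=3, base=1.0, cap=30.0)
trigger = truncation_trigger(max_conversation_tokens=32000, threshold=0.85)
print(delays, trigger)
```

With the defaults shown, the retries would wait 1, 2, and 4 seconds, and `retry_backoff_max` would only matter from the sixth attempt onward (32 seconds clamped to 30). Truncation would begin once the conversation reaches 27,200 of its 32,000 tokens.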
@@ -58,9 +106,12 @@ Environment variable `SNEAKYCODE_CONFIG` can override the config file path.
 ## Usage
 
 ```bash
-# Start the interactive REPL
+# Start the interactive TUI
 sneakycode
 
+# Open a specific project directory
+sneakycode /path/to/project
+
 # Or run directly
 python -m app.main
 
@@ -68,25 +119,38 @@ python -m app.main
 sneakycode --config path/to/config.yaml --verbose --log-file sneakycode.log
 ```
 
+### CLI Options
+
+| Option | Description |
+|--------------------------|--------------------------------------------------|
+| `DIRECTORY` | Project directory to use as workspace root |
+| `--config PATH` | Path to config YAML file (default: `config/config.yaml`) |
+| `-v`, `--verbose` | Enable verbose (DEBUG) logging |
+| `--log-file PATH` | Path to log file for persistent logging |
+
 ### REPL Commands
 
 | Command | Description |
-|------------|--------------------------------------|
-| `/quit` | Save session and exit |
-| `/history` | Show conversation history |
-| `/clear` | Clear conversation history |
-| `/save` | Manually save session |
-| `/session` | Show session info (messages, tokens) |
+|-------------------|----------------------------------------------------|
+| `/help` | Show available commands |
+| `/quit` | Save session and exit (also `/exit`, `/bye`) |
+| `/history` | Show conversation history |
+| `/clear` | Clear conversation history |
+| `/save` | Manually save session |
+| `/session` | Show session info (messages, tokens, start time) |
+| `/models` | List available Ollama models |
+| `/models <name>` | Switch to a different model |
+| `/skills` | List available skills |
 
 ### Session Persistence
 
 Sessions are automatically saved after each agent turn and on exit. On startup, SneakyCode offers to resume the most recent session for the current workspace.
 
-Session files are stored in `.sneakycode/sessions/` within the workspace root.
+Session files are stored in `.sneakycode/sessions/` within the workspace root (configurable via `session.session_dir`).
 
 ## Available Tools
 
-SneakyCode provides 11 tools across 5 categories. See [docs/tools.md](docs/tools.md) for the full reference.
+SneakyCode provides tools across 6 categories. See [docs/tools.md](docs/tools.md) for the full reference.
 
 | Category | Tools | Permission |
 |------------|-------------------------------------------------|---------------|
@@ -96,6 +160,17 @@ SneakyCode provides 11 tools across 5 categories. See [docs/tools.md](docs/tools
 | Edit | `str_replace`, `patch_apply` | User confirm |
 | Shell | `run_command` | User confirm |
 | Control | `finish` | Auto-approved |
+| Skills | `load_skill` | Auto-approved |
 
+The `load_skill` tool is available when `skills.enabled` is `true` in the config. It allows the LLM to load skill instructions from the configured skill directories.
+
+## Skills
+
+SneakyCode includes a skills system that lets you provide reusable instruction sets to the LLM. Skills are markdown files placed in `.sneakycode/skills/` (or any directory listed in `skills.directories`).
+
+Skills are auto-discovered on startup. The LLM can load them via the `load_skill` tool, and you can list available skills with the `/skills` command.
+
+To create a skill, add a `.md` file to your skills directory with a descriptive filename (e.g., `refactoring.md`). The file content is injected into the conversation when the skill is loaded.
+
 ## Development
 
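The auto-discovery described in the README's Skills section can be pictured as a scan of each configured directory for `.md` files. This is a minimal sketch under that reading, not the project's `SkillsManager`: `discover_skills` is a hypothetical helper, and using the filename stem as the skill name is an assumption.

```python
from pathlib import Path


def discover_skills(directories: list[str]) -> dict[str, Path]:
    """Map each skill name (the filename stem) to its Markdown file.

    Directories that do not exist are skipped. When two directories hold
    a file with the same stem, the first directory listed wins.
    """
    found: dict[str, Path] = {}
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for path in sorted(root.glob("*.md")):
            found.setdefault(path.stem, path)
    return found


if __name__ == "__main__":
    for name, path in discover_skills([".sneakycode/skills"]).items():
        print(f"{name}: {path}")
```

A file named `refactoring.md` would then be listed as the skill `refactoring`, which matches the README's example filename.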
@@ -121,6 +196,7 @@ app/
 ├── models/    # Pydantic config and message schemas
 ├── services/  # LLM client, streaming, permissions, session persistence
 ├── tools/     # Tool implementations (one file per group)
+├── ui/        # Textual TUI application and widgets
 └── utils/     # Logging, display, file helpers, token counter
 config/
 └── config.yaml  # Application configuration
@@ -7,7 +7,7 @@ import time
 from typing import TYPE_CHECKING, Any
 
 from app.agent.context import SessionContext
-from app.models.config import AppConfig
+from app.models.config import AgentMode, AppConfig
 from app.models.message import Message
 from app.models.tool_call import ToolCall, ToolResult, ToolResultStatus
 from app.services.llm import LLMClient, LLMConnectionError, LLMError, LLMStreamError
@@ -19,6 +19,7 @@ from app.utils.logging import get_logger
 
 if TYPE_CHECKING:
     from app.services.debug_log import DebugLogger
+    from app.services.skill_runner import SkillRunner
     from app.services.skills import SkillsManager
 
 logger = get_logger(__name__)
@@ -45,6 +46,7 @@ class AgentLoop:
         display: DisplayAdapter | None = None,
         debug_logger: DebugLogger | None = None,
         skills_manager: SkillsManager | None = None,
+        skill_runner: SkillRunner | None = None,
     ) -> None:
         self._config = config
         self._ctx = ctx
@@ -55,7 +57,14 @@ class AgentLoop:
         self._display = display
         self._debug = debug_logger
         self._skills = skills_manager
+        self._skill_runner = skill_runner
         self._tools_schema = registry.get_openai_tools_schema()
+        if self._permissions.mode == AgentMode.PLAN:
+            read_only = PermissionsService.READ_ONLY_TOOLS
+            self._tools_schema = [
+                t for t in self._tools_schema
+                if t["function"]["name"] in read_only
+            ]
         self._system_prompt = self._build_system_prompt()
         self._cancelled = False
 
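The PLAN-mode filter in the hunk above keeps only schema entries whose function name is in a read-only set. A standalone sketch of the same filter follows. The contents of `READ_ONLY_TOOLS` are an assumption copied from the tool names listed in the PLAN-mode prompt later in this diff, since the definition of `PermissionsService.READ_ONLY_TOOLS` is not shown, and `make_tool` and `restrict_to_read_only` are hypothetical helpers.

```python
# Assumed contents; the real PermissionsService.READ_ONLY_TOOLS is not in the diff.
READ_ONLY_TOOLS = frozenset(
    {"read_file", "list_dir", "grep_files", "find_files", "finish"}
)


def make_tool(name: str) -> dict:
    """Build a minimal OpenAI-style tool schema entry for illustration."""
    return {
        "type": "function",
        "function": {"name": name, "parameters": {"type": "object", "properties": {}}},
    }


def restrict_to_read_only(tools_schema: list[dict]) -> list[dict]:
    """Keep only the tools whose function name is in READ_ONLY_TOOLS."""
    return [t for t in tools_schema if t["function"]["name"] in READ_ONLY_TOOLS]


schema = [make_tool("read_file"), make_tool("write_file"), make_tool("finish")]
print([t["function"]["name"] for t in restrict_to_read_only(schema)])
```

Filtering the schema means the model is never offered the write tools in the first place, which complements the prompt-level instruction rather than relying on it.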
@@ -81,12 +90,52 @@ class AgentLoop:
         )
         if self._skills:
             prompt += self._skills.get_system_prompt_snippet()
+        if self._skill_runner and self._skill_runner.is_active:
+            prompt += (
+                f"\n\nCurrently active skill: {self._skill_runner.active_skill_name}. "
+                "When the skill's objective is complete, call the `finish_skill` tool."
+            )
+        if self._permissions.mode == AgentMode.PLAN:
+            prompt += (
+                "\n\nYou are in PLAN mode. You may only use read-only tools: "
+                "read_file, list_dir, grep_files, find_files, finish. "
+                "Do NOT attempt to write files, edit code, or run commands. "
+                "Instead, describe what changes you would make, which files "
+                "you would modify, and provide the reasoning for each change."
+            )
         return prompt
 
+    # Models whose chat templates understand /no_think directives.
+    _THINKING_MODEL_PREFIXES = ("qwen", "qwq")
+
+    def _model_supports_no_think(self) -> bool:
+        """Check if the current model uses a thinking chat template."""
+        model_lower = self._config.llm.model.lower()
+        return any(model_lower.startswith(p) for p in self._THINKING_MODEL_PREFIXES)
+
     def _get_messages_with_system_prompt(self) -> list[Message]:
-        """Prepend the system prompt to conversation history."""
+        """Prepend the system prompt to conversation history.
+
+        When thinking is disabled on a model that supports it, appends a
+        system-level /no_think directive after the last user message so
+        Qwen 3.x (and similar) chat templates see it.
+        """
         system_msg = Message(role="system", content=self._system_prompt)
-        return [system_msg] + self._ctx.get_history()
+        history = self._ctx.get_history()
+
+        if not self._config.llm.thinking and self._model_supports_no_think() and history:
+            history = list(history)
+            # Find last user message and insert a system hint after it
+            for i in range(len(history) - 1, -1, -1):
+                if history[i].role == "user":
+                    no_think_msg = Message(
+                        role="system",
+                        content="/no_think",
+                    )
+                    history.insert(i + 1, no_think_msg)
+                    break
+
+        return [system_msg] + history
 
     async def run_turn(self, user_input: str) -> None:
         """Execute one full agent turn: add user message, loop until done.
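The `/no_think` handling in the hunk above copies the history, walks it backwards, and inserts one system message directly after the most recent user message. The sketch below isolates that step so it can be run on its own. The `Message` dataclass is a simplified stand-in for `app.models.message.Message`, and `with_no_think` is a hypothetical function name.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Simplified stand-in for the project's Message model."""

    role: str
    content: str


def with_no_think(history: list[Message]) -> list[Message]:
    """Return a copy of history with /no_think after the last user message."""
    result = list(history)  # copy so the stored conversation is not mutated
    for i in range(len(result) - 1, -1, -1):
        if result[i].role == "user":
            result.insert(i + 1, Message(role="system", content="/no_think"))
            break
    return result


history = [
    Message("user", "List the files"),
    Message("assistant", "Calling list_dir"),
    Message("tool", "README.md"),
]
print([m.role for m in with_no_think(history)])
```

Because the directive is added to a copy at request time, it never accumulates in the saved session, and a history with no user message passes through unchanged.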
@@ -99,6 +148,7 @@ class AgentLoop:
 
         max_iter = self._config.agent.max_iterations
         reasoning_only_streak = 0
+        empty_streak = 0
         for iteration in range(1, max_iter + 1):
             if self._cancelled:
                 if self._display:
@@ -153,16 +203,32 @@ class AgentLoop:
                 reasoning_only_streak += 1
                 self._ctx.pop_last_message()
 
-                if reasoning_only_streak >= _MAX_REASONING_RETRIES:
-                    # Nudge the model by injecting a user hint
-                    if self._display:
+                # When thinking is disabled, reasoning-only is expected model noise.
+                # Nudge immediately and silently to avoid wasting iterations.
+                thinking_disabled = not self._config.llm.thinking
+
+                # If the last context messages are tool errors, nudge immediately
+                # rather than wasting retries — the model is likely confused by the error.
+                has_recent_tool_error = any(
+                    m.role == "tool" and m.content and m.content.startswith("Unknown ")
+                    for m in self._ctx.get_history()[-3:]
+                )
+
+                should_nudge = (
+                    thinking_disabled
+                    or has_recent_tool_error
+                    or reasoning_only_streak >= _MAX_REASONING_RETRIES
+                )
+
+                if should_nudge:
+                    if not thinking_disabled and self._display:
                         self._display.write_warning(
                             f"Model produced reasoning but no response {reasoning_only_streak} times. "
                             "Nudging model to respond..."
                         )
                     self._ctx.add_message(
                         "user",
-                        "Please respond with your answer. Do not just think — provide your actual response.",
+                        "Please respond with your answer. If a tool call failed, briefly explain what happened and continue.",
                     )
                     reasoning_only_streak = 0
                 else:
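The new nudge condition in the hunk above combines three independent triggers. The sketch below restates it as a pure function so each trigger can be checked in isolation. The value of `MAX_REASONING_RETRIES` is an assumption, since `_MAX_REASONING_RETRIES` is defined outside this diff, and messages are reduced to `(role, content)` pairs for brevity.

```python
MAX_REASONING_RETRIES = 2  # assumed value; the real constant is not shown in the diff


def should_nudge(
    thinking_enabled: bool,
    recent_messages: list[tuple[str, str]],
    reasoning_only_streak: int,
    max_retries: int = MAX_REASONING_RETRIES,
) -> bool:
    """Decide whether to inject a user hint after a reasoning-only reply.

    recent_messages holds (role, content) pairs, oldest first; only the
    last three are inspected, mirroring the [-3:] slice in the diff.
    """
    has_recent_tool_error = any(
        role == "tool" and content.startswith("Unknown ")
        for role, content in recent_messages[-3:]
    )
    return (
        not thinking_enabled
        or has_recent_tool_error
        or reasoning_only_streak >= max_retries
    )


print(should_nudge(True, [("tool", "Unknown tool: foo")], 1))
```

Note that the tool-error check matches only results beginning with `"Unknown "`, so other kinds of tool failure still go through the normal retry count.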
@@ -173,10 +239,42 @@ class AgentLoop:
             # Successful response — reset streak
             reasoning_only_streak = 0
 
+            # Detect completely empty response (no content, no tool calls)
+            if not assistant_msg.content and not assistant_msg.tool_calls:
+                empty_streak += 1
+                self._ctx.pop_last_message()  # Don't keep empty messages
+                if empty_streak >= 2:
+                    if self._display:
+                        self._display.write_warning(
+                            "Model returned repeated empty responses — "
+                            "try a different model or check Ollama logs."
+                        )
+                    break
+                if self._display:
+                    self._display.write_warning("Model returned empty response. Retrying without tools...")
+                # Retry without tool schemas — some models return empty when
+                # tools are in the payload but the model can't handle them.
+                assistant_msg = await self._llm_step(skip_tools=True)
+                if assistant_msg is None:
+                    break
+                if assistant_msg.content:
+                    self._ctx.add_message("assistant", assistant_msg.content)
+                    if self._display:
+                        self._display.write_assistant_message(assistant_msg.content)
+                    self._handler.reset()
+                    break
+                # Still empty even without tools
+                self._handler.reset()
+                continue
+
+            empty_streak = 0  # reset on successful non-empty response
+
+            # Display any assistant text content (even if tool calls follow)
+            if self._display and assistant_msg.content:
+                self._display.write_assistant_message(assistant_msg.content)
+
             # No tool calls → task complete (plain text response)
             if not assistant_msg.tool_calls:
-                if self._display and assistant_msg.content:
-                    self._display.write_assistant_message(assistant_msg.content)
                 break
 
             # Execute tool calls
@@ -192,28 +290,37 @@ class AgentLoop:
                     name=result.tool_name,
                 )
 
-            # Check if finish tool was called
+            # Rebuild tools schema and system prompt if skill state may have changed
+            if any(r.tool_name in ("load_skill", "finish_skill") for r in results):
+                self._tools_schema = self._registry.get_openai_tools_schema()
+                self._system_prompt = self._build_system_prompt()
+
+            # Check if finish tool was called (finish_skill does NOT break the loop)
             if any(r.tool_name == "finish" for r in results):
                 break
         else:
             if self._display:
                 self._display.write_warning(f"Agent reached maximum iterations ({max_iter}). Stopping.")
 
-    async def _llm_step(self) -> Message | None:
+    async def _llm_step(self, *, skip_tools: bool = False) -> Message | None:
         """Stream one LLM response and return the accumulated Message.
 
         Uses retry-enabled streaming. On mid-stream errors, attempts to recover
         partial content if available.
 
+        Args:
+            skip_tools: If True, send the request without tool schemas (fallback mode).
+
         Returns:
             The assistant Message, or None if an error occurred.
         """
         messages = self._get_messages_with_system_prompt()
         if self._debug:
             self._debug.log_request(messages, self._config.llm.model)
+        tools = None if skip_tools else self._tools_schema
         t0 = time.monotonic()
         try:
-            chunk_iter = self._client.stream_chat_with_retry(messages, tools=self._tools_schema)
+            chunk_iter = self._client.stream_chat_with_retry(messages, tools=tools)
|
||||||
result = await self._handler.process_stream(chunk_iter)
|
result = await self._handler.process_stream(chunk_iter)
|
||||||
if result and self._debug:
|
if result and self._debug:
|
||||||
elapsed = (time.monotonic() - t0) * 1000
|
elapsed = (time.monotonic() - t0) * 1000
|
||||||
|
|||||||
@@ -1,12 +1,39 @@
|
|||||||
"""Pydantic configuration models mapping to config/config.yaml."""
|
"""Pydantic configuration models mapping to config/config.yaml."""
|
||||||
|
|
||||||
import os
|
import os
|
||||||
|
from enum import StrEnum
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
import yaml
|
import yaml
|
||||||
from pydantic import BaseModel, Field, model_validator
|
from pydantic import BaseModel, Field, model_validator
|
||||||
|
|
||||||
|
|
||||||
|
class AgentMode(StrEnum):
|
||||||
|
"""Runtime agent mode controlling permission behavior."""
|
||||||
|
|
||||||
|
NORMAL = "normal"
|
||||||
|
PLAN = "plan"
|
||||||
|
AUTO = "auto"
|
||||||
|
|
||||||
|
|
||||||
|
class ModelProfile(BaseModel):
|
||||||
|
"""Per-model overrides applied when switching models."""
|
||||||
|
|
||||||
|
max_conversation_tokens: int | None = Field(
|
||||||
|
default=None, description="Token budget override for this model's context window"
|
||||||
|
)
|
||||||
|
thinking: bool | None = Field(
|
||||||
|
default=None, description="Override thinking mode for this model"
|
||||||
|
)
|
||||||
|
temperature: float | None = Field(
|
||||||
|
default=None, description="Override sampling temperature"
|
||||||
|
)
|
||||||
|
max_tokens: int | None = Field(
|
||||||
|
default=None, description="Override max response tokens"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class LLMConfig(BaseModel):
|
class LLMConfig(BaseModel):
|
||||||
"""LLM backend configuration."""
|
"""LLM backend configuration."""
|
||||||
|
|
||||||
@@ -19,6 +46,14 @@ class LLMConfig(BaseModel):
|
|||||||
max_retries: int = Field(default=3, description="Max retry attempts on transient errors")
|
max_retries: int = Field(default=3, description="Max retry attempts on transient errors")
|
||||||
retry_backoff_base: float = Field(default=1.0, description="Base seconds for exponential backoff")
|
retry_backoff_base: float = Field(default=1.0, description="Base seconds for exponential backoff")
|
||||||
retry_backoff_max: float = Field(default=30.0, description="Maximum backoff seconds")
|
retry_backoff_max: float = Field(default=30.0, description="Maximum backoff seconds")
|
||||||
|
thinking: bool = Field(
|
||||||
|
default=True,
|
||||||
|
description="Enable model thinking/reasoning mode (disable to reduce reasoning-only loops)",
|
||||||
|
)
|
||||||
|
extra_body: dict[str, Any] = Field(
|
||||||
|
default_factory=dict,
|
||||||
|
description="Extra parameters merged into the API request body (model-specific)",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class AgentConfig(BaseModel):
|
class AgentConfig(BaseModel):
|
||||||
@@ -55,11 +90,19 @@ class ShellToolConfig(BaseModel):
|
|||||||
max_output_bytes: int = Field(default=65536, description="Max output capture size in bytes")
|
max_output_bytes: int = Field(default=65536, description="Max output capture size in bytes")
|
||||||
|
|
||||||
|
|
||||||
|
class FileCacheConfig(BaseModel):
|
||||||
|
"""File cache configuration."""
|
||||||
|
|
||||||
|
enabled: bool = Field(default=True, description="Enable file content caching")
|
||||||
|
max_entries: int = Field(default=128, description="Maximum cached file entries (LRU eviction)")
|
||||||
|
|
||||||
|
|
||||||
class FilesystemToolConfig(BaseModel):
|
class FilesystemToolConfig(BaseModel):
|
||||||
"""Filesystem tool limits."""
|
"""Filesystem tool limits."""
|
||||||
|
|
||||||
max_file_size_bytes: int = Field(default=1_048_576, description="Max file size for read/write")
|
max_file_size_bytes: int = Field(default=1_048_576, description="Max file size for read/write")
|
||||||
binary_detection: bool = Field(default=True, description="Detect and reject binary files")
|
binary_detection: bool = Field(default=True, description="Detect and reject binary files")
|
||||||
|
cache: FileCacheConfig = Field(default_factory=FileCacheConfig, description="File cache settings")
|
||||||
|
|
||||||
|
|
||||||
class ToolsConfig(BaseModel):
|
class ToolsConfig(BaseModel):
|
||||||
@@ -119,6 +162,10 @@ class AppConfig(BaseModel):
|
|||||||
session: SessionConfig = Field(default_factory=SessionConfig)
|
session: SessionConfig = Field(default_factory=SessionConfig)
|
||||||
debug: DebugConfig = Field(default_factory=DebugConfig)
|
debug: DebugConfig = Field(default_factory=DebugConfig)
|
||||||
skills: SkillsConfig = Field(default_factory=SkillsConfig)
|
skills: SkillsConfig = Field(default_factory=SkillsConfig)
|
||||||
|
model_profiles: dict[str, ModelProfile] = Field(
|
||||||
|
default_factory=dict,
|
||||||
|
description="Per-model overrides keyed by model name prefix",
|
||||||
|
)
|
||||||
|
|
||||||
@model_validator(mode="after")
|
@model_validator(mode="after")
|
||||||
def resolve_workspace_root(self) -> "AppConfig":
|
def resolve_workspace_root(self) -> "AppConfig":
|
||||||
@@ -126,6 +173,39 @@ class AppConfig(BaseModel):
|
|||||||
self.agent.workspace_root = self.agent.workspace_root.resolve()
|
self.agent.workspace_root = self.agent.workspace_root.resolve()
|
||||||
return self
|
return self
|
||||||
|
|
||||||
|
def get_model_profile(self, model: str) -> ModelProfile | None:
|
||||||
|
"""Find the best matching model profile by prefix.
|
||||||
|
|
||||||
|
Matches the longest prefix first (e.g., "llama3.1" beats "llama3"
|
||||||
|
for model "llama3.1:latest"). Returns None if no profile matches.
|
||||||
|
"""
|
||||||
|
model_lower = model.lower().split(":")[0] # strip tag
|
||||||
|
best_match: str | None = None
|
||||||
|
for key in self.model_profiles:
|
||||||
|
key_lower = key.lower()
|
||||||
|
if model_lower == key_lower or model_lower.startswith(key_lower):
|
||||||
|
if best_match is None or len(key) > len(best_match):
|
||||||
|
best_match = key
|
||||||
|
return self.model_profiles.get(best_match) if best_match else None
|
||||||
|
|
||||||
|
def apply_model_profile(self, model: str) -> ModelProfile | None:
|
||||||
|
"""Apply the matching model profile overrides to the active config.
|
||||||
|
|
||||||
|
Returns the applied profile, or None if no profile matched.
|
||||||
|
"""
|
||||||
|
profile = self.get_model_profile(model)
|
||||||
|
if profile is None:
|
||||||
|
return None
|
||||||
|
if profile.max_conversation_tokens is not None:
|
||||||
|
self.agent.max_conversation_tokens = profile.max_conversation_tokens
|
||||||
|
if profile.thinking is not None:
|
||||||
|
self.llm.thinking = profile.thinking
|
||||||
|
if profile.temperature is not None:
|
||||||
|
self.llm.temperature = profile.temperature
|
||||||
|
if profile.max_tokens is not None:
|
||||||
|
self.llm.max_tokens = profile.max_tokens
|
||||||
|
return profile
|
||||||
|
|
||||||
|
|
||||||
# Default config file location relative to project root
|
# Default config file location relative to project root
|
||||||
_DEFAULT_CONFIG_PATH = Path("config/config.yaml")
|
_DEFAULT_CONFIG_PATH = Path("config/config.yaml")
|
||||||
|
|||||||
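The prefix matching in `get_model_profile` above can be tried out on its own. The sketch below copies the selection rule onto a plain list of keys; `best_profile_key` and the example keys are hypothetical, and the real method returns a `ModelProfile` rather than the key.

```python
from __future__ import annotations


def best_profile_key(model: str, profile_keys: list[str]) -> str | None:
    """Pick the longest key that is a prefix of the tag-stripped model name."""
    model_lower = model.lower().split(":")[0]  # "llama3.1:latest" -> "llama3.1"
    best: str | None = None
    for key in profile_keys:
        # startswith() already covers the exact-match case.
        if model_lower.startswith(key.lower()):
            if best is None or len(key) > len(best):
                best = key
    return best


keys = ["llama3", "llama3.1", "qwen"]
print(best_profile_key("llama3.1:latest", keys))
print(best_profile_key("Qwen3:8b", keys))
print(best_profile_key("mistral:7b", keys))
```

Because only the model name has its `:tag` stripped, a profile key that itself contains a tag (for example `llama3.1:8b`) would never match under this rule.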
39
app/models/skill.py
Normal file
@@ -0,0 +1,39 @@
|
|||||||
|
"""Pydantic models for structured skill packages."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from pydantic import BaseModel, Field
|
||||||
|
|
||||||
|
|
||||||
|
class SkillConfigOverrides(BaseModel):
|
||||||
|
"""Scoped config overrides applied while a skill is active."""
|
||||||
|
|
||||||
|
temperature: float | None = Field(default=None, description="Override sampling temperature")
|
||||||
|
max_tokens: int | None = Field(default=None, description="Override max tokens")
|
||||||
|
tools_enable: list[str] | None = Field(
|
||||||
|
default=None, description="Whitelist — only these tools available when set"
|
||||||
|
)
|
||||||
|
tools_disable: list[str] | None = Field(
|
||||||
|
default=None, description="Blacklist — disable specific tools"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class SkillManifest(BaseModel):
|
||||||
|
"""Parsed skill.yaml manifest for a skill package directory."""
|
||||||
|
|
||||||
|
name: str = Field(description="Unique skill identifier")
|
||||||
|
description: str = Field(description="Human-readable skill description")
|
||||||
|
version: str = Field(default="1.0", description="Skill version")
|
||||||
|
triggers: list[str] = Field(
|
||||||
|
default_factory=list, description="Slash commands that activate this skill"
|
||||||
|
)
|
||||||
|
config_overrides: SkillConfigOverrides = Field(
|
||||||
|
default_factory=SkillConfigOverrides, description="Scoped config overrides"
|
||||||
|
)
|
||||||
|
chain: list[str] = Field(
|
||||||
|
default_factory=list, description="Skill names to run first (dependencies)"
|
||||||
|
)
|
||||||
|
prompts: list[str] = Field(
|
||||||
|
default_factory=lambda: ["prompt.md"],
|
||||||
|
description="Markdown prompt files to load, in order",
|
||||||
|
)
|
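To show how the manifest defaults above behave, here is a stdlib-only mirror of `SkillManifest` built from a dict such as `yaml.safe_load` might return. `ManifestSketch` and the `raw` contents are hypothetical, and `config_overrides` is left out for brevity. The real model is a Pydantic `BaseModel`, which validates types and, with default settings, ignores unknown keys, whereas this dataclass does neither.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ManifestSketch:
    """Dataclass approximation of the SkillManifest fields and defaults."""
    name: str
    description: str
    version: str = "1.0"
    triggers: list[str] = field(default_factory=list)
    chain: list[str] = field(default_factory=list)
    prompts: list[str] = field(default_factory=lambda: ["prompt.md"])


# Hypothetical parsed skill.yaml content; only name and description are required.
raw = {
    "name": "brainstorm",
    "description": "Creative ideation",
    "triggers": ["/brainstorm"],
}
manifest = ManifestSketch(**raw)
print(manifest.prompts, manifest.version, manifest.chain)
```

A manifest that omits `prompts` therefore falls back to a single `prompt.md` in the package directory.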
||||||
@@ -151,6 +151,15 @@ class LLMClient:
|
|||||||
if tools:
|
if tools:
|
||||||
payload["tools"] = tools
|
payload["tools"] = tools
|
||||||
|
|
||||||
|
# When thinking is disabled, inject chat_template_kwargs for backends
|
||||||
|
# that support it (Qwen 3.x thinking models).
|
||||||
|
if not self._config.thinking and self._config.model.lower().startswith(("qwen", "qwq")):
|
||||||
|
payload.setdefault("chat_template_kwargs", {})["enable_thinking"] = False
|
||||||
|
|
||||||
|
# Merge model-specific extra parameters (e.g., reasoning_effort)
|
||||||
|
if self._config.extra_body:
|
||||||
|
payload.update(self._config.extra_body)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
async with self._client.stream(
|
async with self._client.stream(
|
||||||
"POST", self._config.api_path, json=payload
|
"POST", self._config.api_path, json=payload
|
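The payload changes in the hunk above can be reproduced with plain dicts. In this sketch, `build_payload` and the base keys (`model`, `messages`, `stream`) are assumptions, since the diff does not show how the client builds the initial payload; the `thinking` and `extra_body` handling follows the added lines.

```python
def build_payload(model, messages, *, thinking, tools=None, extra_body=None):
    """Assemble a request body the way the added lines do (base keys assumed)."""
    payload = {"model": model, "messages": messages, "stream": True}
    if tools:
        payload["tools"] = tools
    # Only Qwen/QwQ-prefixed models get the template switch.
    if not thinking and model.lower().startswith(("qwen", "qwq")):
        payload.setdefault("chat_template_kwargs", {})["enable_thinking"] = False
    if extra_body:
        payload.update(extra_body)  # shallow merge: top-level keys are replaced
    return payload


p1 = build_payload("qwen3:8b", [], thinking=False,
                   extra_body={"reasoning_effort": "low"})
p2 = build_payload("qwen3:8b", [], thinking=False,
                   extra_body={"chat_template_kwargs": {"foo": 1}})
print(p1)
print(p2)
```

Because `dict.update` is a shallow merge, an `extra_body` that carries its own `chat_template_kwargs` (as in `p2`) replaces the whole nested dict, and the injected `enable_thinking` flag is lost.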
||||||
@@ -162,20 +171,32 @@ class LLMClient:
|
|||||||
status_code=response.status_code,
|
status_code=response.status_code,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
chunk_count = 0
|
||||||
async for line in response.aiter_lines():
|
async for line in response.aiter_lines():
|
||||||
if not line.startswith("data: "):
|
line = line.strip()
|
||||||
|
if not line:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
data = line[6:] # strip "data: " prefix
|
# SSE format: "data: {json}" or "data: [DONE]"
|
||||||
|
if line.startswith("data: "):
|
||||||
if data.strip() == "[DONE]":
|
data = line[6:]
|
||||||
return
|
if data.strip() == "[DONE]":
|
||||||
|
break
|
||||||
|
elif line.startswith("{"):
|
||||||
|
# Plain NDJSON fallback (some Ollama versions)
|
||||||
|
data = line
|
||||||
|
else:
|
||||||
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
yield json.loads(data)
|
yield json.loads(data)
|
||||||
|
chunk_count += 1
|
||||||
except json.JSONDecodeError:
|
except json.JSONDecodeError:
|
||||||
logger.warning("malformed_sse_chunk", data=data[:200])
|
logger.warning("malformed_sse_chunk", data=data[:200])
|
||||||
|
|
||||||
|
if chunk_count == 0:
|
||||||
|
logger.warning("empty_stream", model=self._config.model)
|
||||||
|
|
||||||
except httpx.ConnectError as e:
|
except httpx.ConnectError as e:
|
||||||
raise LLMConnectionError(f"Cannot connect to LLM endpoint: {e}") from e
|
raise LLMConnectionError(f"Cannot connect to LLM endpoint: {e}") from e
|
||||||
except httpx.TimeoutException as e:
|
except httpx.TimeoutException as e:
|
||||||
|
|||||||
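The revised stream parsing in `LLMClient` accepts both SSE `data: ` lines and bare NDJSON lines. The sketch below applies the same branching to a list of strings. `parse_stream_lines` is a hypothetical helper: it collects chunks into a list where the real code yields them from an async generator, and it drops malformed chunks silently where the real code logs a warning.

```python
import json


def parse_stream_lines(lines):
    """Return decoded JSON chunks from SSE or NDJSON lines, stopping at [DONE]."""
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("data: "):
            data = line[6:]  # strip the "data: " prefix
            if data.strip() == "[DONE]":
                break
        elif line.startswith("{"):
            data = line  # plain NDJSON fallback
        else:
            continue  # SSE comments, event names, and so on
        try:
            chunks.append(json.loads(data))
        except json.JSONDecodeError:
            pass  # the diff logs "malformed_sse_chunk" here
    return chunks


sample = [
    'data: {"a": 1}',
    "",
    ": keepalive",
    '{"b": 2}',
    "data: not-json",
    "data: [DONE]",
    'data: {"c": 3}',
]
parsed = parse_stream_lines(sample)
print(parsed)
```

One caveat under this rule: the SSE format allows `data:` with no space after the colon, and such lines fall into the final `else` branch and are skipped.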
@@ -4,16 +4,20 @@ from __future__ import annotations
|
|||||||
|
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
|
import re
|
||||||
import shlex
|
import shlex
|
||||||
from collections.abc import Awaitable, Callable
|
from collections.abc import Awaitable, Callable
|
||||||
|
|
||||||
from app.models.config import PermissionsConfig, ToolsConfig
|
from app.models.config import AgentMode, PermissionsConfig, ToolsConfig
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Type alias for the async prompt callback
|
# Type alias for the async prompt callback
|
||||||
PromptCallback = Callable[[str, str], Awaitable[bool]]
|
PromptCallback = Callable[[str, str], Awaitable[bool]]
|
||||||
|
|
||||||
|
# Detect shell redirects that write to files (>, >>, heredocs)
|
||||||
|
_WRITE_REDIRECT_PATTERN = re.compile(r"(?:>\s*\S|>>|<<)")
|
||||||
|
|
||||||
|
|
||||||
class PermissionDenied(Exception):
|
class PermissionDenied(Exception):
|
||||||
"""Raised when a tool is denied execution by permissions policy."""
|
"""Raised when a tool is denied execution by permissions policy."""
|
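It may help to see what `_WRITE_REDIRECT_PATTERN` actually matches. The pattern below is copied from the diff; `needs_approval` and the sample commands are hypothetical and only exercise the regex.

```python
import re

# Pattern copied from the diff above.
_WRITE_REDIRECT_PATTERN = re.compile(r"(?:>\s*\S|>>|<<)")


def needs_approval(cmd: str) -> bool:
    """True when the command text contains something that looks like a redirect."""
    return bool(_WRITE_REDIRECT_PATTERN.search(cmd))


for cmd in [
    "echo hi > out.txt",   # file write
    "cat <<EOF",           # heredoc
    "make 2>&1",           # stderr-to-stdout, no file involved
    "grep 'a>b' notes",    # '>' inside a quoted argument
    "sort < in.txt",       # input redirect only
    "ls -la",
]:
    print(f"{cmd!r}: {needs_approval(cmd)}")
```

The match is purely textual, so it is conservative: `2>&1` and a quoted `>` are flagged as well. As the check only routes the command to a user prompt, these false positives cost an extra confirmation and do not block the command.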
||||||
@@ -26,6 +30,10 @@ class PermissionsService:
|
|||||||
shows a modal dialog. Without a callback, unlisted tools are denied.
|
shows a modal dialog. Without a callback, unlisted tools are denied.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
READ_ONLY_TOOLS: frozenset[str] = frozenset({
|
||||||
|
"read_file", "list_dir", "grep_files", "find_files", "finish",
|
||||||
|
})
|
||||||
|
|
||||||
def __init__(
|
def __init__(
|
||||||
self,
|
self,
|
||||||
config: PermissionsConfig,
|
config: PermissionsConfig,
|
||||||
@@ -34,6 +42,16 @@ class PermissionsService:
|
|||||||
self.config = config
|
self.config = config
|
||||||
self._tools_config = tools_config
|
self._tools_config = tools_config
|
||||||
self._prompt_callback: PromptCallback | None = None
|
self._prompt_callback: PromptCallback | None = None
|
||||||
|
self._mode: AgentMode = AgentMode.NORMAL
|
||||||
|
|
||||||
|
@property
|
||||||
|
def mode(self) -> AgentMode:
|
||||||
|
"""Current agent mode."""
|
||||||
|
return self._mode
|
||||||
|
|
||||||
|
@mode.setter
|
||||||
|
def mode(self, value: AgentMode) -> None:
|
||||||
|
self._mode = value
|
||||||
|
|
||||||
def set_prompt_callback(self, callback: PromptCallback) -> None:
|
def set_prompt_callback(self, callback: PromptCallback) -> None:
|
||||||
"""Set the async callback used to prompt the user for permission.
|
"""Set the async callback used to prompt the user for permission.
|
||||||
@@ -63,6 +81,16 @@ class PermissionsService:
|
|||||||
logger.info("Tool '%s' is in deny list — blocked", tool_name)
|
logger.info("Tool '%s' is in deny list — blocked", tool_name)
|
||||||
return False
|
return False
|
||||||
|
|
||||||
|
if self._mode == AgentMode.AUTO:
|
||||||
|
logger.debug("Tool '%s' auto-approved (AUTO mode)", tool_name)
|
||||||
|
return True
|
||||||
|
|
||||||
|
if self._mode == AgentMode.PLAN:
|
||||||
|
if tool_name not in self.READ_ONLY_TOOLS:
|
||||||
|
logger.info("Tool '%s' blocked in Plan mode (read-only tools only)", tool_name)
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
if tool_name in self.config.auto_approve:
|
if tool_name in self.config.auto_approve:
|
||||||
logger.debug("Tool '%s' is auto-approved", tool_name)
|
logger.debug("Tool '%s' is auto-approved", tool_name)
|
||||||
return True
|
return True
|
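The order of checks in the hunk above (deny list, then mode, then `auto_approve`) can be condensed into a small decision function. This is a sketch under assumptions: `Mode`, `decide`, and the `None` result for "ask the user" are hypothetical, and what the real method does after the `auto_approve` check is not shown in the diff. `Mode` uses `str, Enum` so the sketch also runs on Python versions without `StrEnum`.

```python
from enum import Enum


class Mode(str, Enum):
    NORMAL = "normal"
    PLAN = "plan"
    AUTO = "auto"


READ_ONLY_TOOLS = frozenset(
    {"read_file", "list_dir", "grep_files", "find_files", "finish"}
)


def decide(tool, mode, deny=(), auto_approve=()):
    """Return True (allow), False (block), or None (placeholder: ask the user)."""
    if tool in deny:
        return False  # the deny list is checked before any mode logic
    if mode is Mode.AUTO:
        return True
    if mode is Mode.PLAN:
        return tool in READ_ONLY_TOOLS
    if tool in auto_approve:
        return True
    return None


print(decide("write_file", Mode.PLAN))
print(decide("shell", Mode.AUTO, deny={"shell"}))
print(decide("shell", Mode.NORMAL))
```

Since the deny check comes first, AUTO mode does not override an explicit deny entry, and PLAN mode allows a read-only tool even when it is absent from `auto_approve`.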
||||||
@@ -104,6 +132,11 @@ class PermissionsService:
|
|||||||
logger.info("Shell command '%s' matches denied prefix '%s'", cmd, denied)
|
logger.info("Shell command '%s' matches denied prefix '%s'", cmd, denied)
|
||||||
return False
|
return False
|
||||||
|
|
||||||
|
# Detect shell redirects that write to files — require approval
|
||||||
|
if _WRITE_REDIRECT_PATTERN.search(cmd):
|
||||||
|
logger.info("Shell command '%s' contains file-write redirect — requiring approval", cmd)
|
||||||
|
return None # fall through to user prompt
|
||||||
|
|
||||||
# Allowed commands: base executable match
|
# Allowed commands: base executable match
|
||||||
if shell_config.allowed_commands:
|
if shell_config.allowed_commands:
|
||||||
if base_cmd in shell_config.allowed_commands:
|
if base_cmd in shell_config.allowed_commands:
|
||||||
|
|||||||
@@ -52,6 +52,10 @@ class SessionManager:
|
|||||||
self._session_dir = workspace_root / config.session_dir
|
self._session_dir = workspace_root / config.session_dir
|
||||||
self._session_id = f"{self._workspace_hash}_{datetime.now(UTC).strftime('%Y%m%d_%H%M%S')}"
|
self._session_id = f"{self._workspace_hash}_{datetime.now(UTC).strftime('%Y%m%d_%H%M%S')}"
|
||||||
|
|
||||||
|
def update_model(self, model: str) -> None:
|
||||||
|
"""Update the model name for session metadata."""
|
||||||
|
self._model = model
|
||||||
|
|
||||||
def save(self, ctx: "SessionContext") -> Path:
|
def save(self, ctx: "SessionContext") -> Path:
|
||||||
"""Save session state to a JSON file via atomic write.
|
"""Save session state to a JSON file via atomic write.
|
||||||
|
|
||||||
|
|||||||
234
app/services/skill_runner.py
Normal file
@@ -0,0 +1,234 @@
|
|||||||
|
"""SkillRunner — orchestrates skill activation, chaining, config scoping, and deactivation."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
|
||||||
|
from app.agent.context import SessionContext
|
||||||
|
from app.models.config import AppConfig
|
||||||
|
from app.models.skill import SkillManifest
|
||||||
|
from app.services.skills import Skill, SkillsManager
|
||||||
|
from app.tools.registry import ToolRegistry
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class SkillChainError(Exception):
|
||||||
|
"""Raised when skill chain resolution fails (e.g., cycle detected)."""
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class _SkillSnapshot:
|
||||||
|
"""Captured state before skill activation, for restoration on deactivate."""
|
||||||
|
|
||||||
|
temperature: float
|
||||||
|
max_tokens: int
|
||||||
|
disabled_tools: set[str] = field(default_factory=set)
|
||||||
|
|
||||||
|
|
||||||
|
class SkillRunner:
|
||||||
|
"""Manages skill lifecycle: activation, chaining, config overrides, deactivation.
|
||||||
|
|
||||||
|
Only one skill can be active at a time. Activating a new skill while one
|
||||||
|
is active will first deactivate the current skill.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
skills_manager: SkillsManager,
|
||||||
|
config: AppConfig,
|
||||||
|
ctx: SessionContext,
|
||||||
|
registry: ToolRegistry,
|
||||||
|
) -> None:
|
||||||
|
self._skills = skills_manager
|
||||||
|
self._config = config
|
||||||
|
self._ctx = ctx
|
||||||
|
self._registry = registry
|
||||||
|
self._active_skill: Skill | None = None
|
||||||
|
self._snapshot: _SkillSnapshot | None = None
|
||||||
|
|
||||||
|
@property
|
||||||
|
def is_active(self) -> bool:
|
||||||
|
"""Whether a skill is currently active."""
|
||||||
|
return self._active_skill is not None
|
||||||
|
|
||||||
|
@property
|
||||||
|
def active_skill_name(self) -> str | None:
|
||||||
|
"""Name of the currently active skill, or None."""
|
||||||
|
return self._active_skill.name if self._active_skill else None
|
||||||
|
|
||||||
|
@property
|
||||||
|
def active_skill(self) -> Skill | None:
|
||||||
|
"""The currently active skill, or None."""
|
||||||
|
return self._active_skill
|
||||||
|
|
||||||
|
def activate(self, skill_name: str) -> str | None:
|
||||||
|
"""Activate a skill by name.
|
||||||
|
|
||||||
|
Resolves chain dependencies (depth-first), applies config overrides,
|
||||||
|
injects prompt content into conversation context.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
skill_name: Name of the skill to activate.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The concatenated prompt content injected, or None on failure.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
SkillChainError: If chain resolution detects a cycle.
|
||||||
|
"""
|
||||||
|
skill = self._skills.get_skill(skill_name)
|
||||||
|
if skill is None:
|
||||||
|
logger.warning("Cannot activate unknown skill: %s", skill_name)
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Deactivate current skill if one is active
|
||||||
|
if self._active_skill is not None:
|
||||||
|
self.deactivate()
|
||||||
|
|
||||||
|
# Resolve chain dependencies
|
||||||
|
chain = self._resolve_chain(skill, set())
|
||||||
|
|
||||||
|
# Snapshot current config for restoration
|
||||||
|
self._snapshot = _SkillSnapshot(
|
||||||
|
temperature=self._config.llm.temperature,
|
||||||
|
max_tokens=self._config.llm.max_tokens,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Collect and inject chain skill prompts first
|
||||||
|
all_prompts: list[str] = []
|
||||||
|
for chained_skill in chain:
|
||||||
|
content = self._skills.load_skill(chained_skill.name)
|
||||||
|
if content:
|
||||||
|
all_prompts.append(f"[Chained skill: {chained_skill.name}]\n{content}")
|
||||||
|
|
||||||
|
# Load the target skill's prompts
|
||||||
|
content = self._skills.load_skill(skill.name)
|
||||||
|
if content:
|
||||||
|
all_prompts.append(content)
|
||||||
|
|
||||||
|
# Apply config overrides from the target skill
|
||||||
|
if skill.manifest:
|
||||||
|
self._apply_overrides(skill.manifest)
|
||||||
|
|
||||||
|
# Inject prompts into context
|
||||||
|
full_prompt = "\n\n".join(all_prompts) if all_prompts else None
|
||||||
|
if full_prompt:
|
||||||
|
self._ctx.add_message(
|
||||||
|
"system",
|
||||||
|
f"[Skill activated: {skill.name}]\n{full_prompt}",
|
||||||
|
)
|
||||||
|
|
||||||
|
self._active_skill = skill
|
||||||
|
logger.info("Skill activated: %s", skill.name)
|
||||||
|
return full_prompt
|
||||||
|
|
||||||
|
def activate_by_trigger(self, trigger: str) -> str | None:
|
||||||
|
"""Activate a skill by its /command trigger.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
trigger: The trigger string (with or without leading /).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The concatenated prompt content, or None if no skill matches.
|
||||||
|
"""
|
||||||
|
skill = self._skills.get_skill_by_trigger(trigger)
|
||||||
|
if skill is None:
|
||||||
|
return None
|
||||||
|
return self.activate(skill.name)
|
||||||
|
|
||||||
|
def deactivate(self, summary: str | None = None) -> None:
|
||||||
|
"""Deactivate the current skill, restoring config and tool state.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
summary: Optional summary message to inject into context.
|
||||||
|
"""
|
||||||
|
if self._active_skill is None:
|
||||||
|
return
|
||||||
|
|
||||||
|
skill_name = self._active_skill.name
|
||||||
|
|
||||||
|
# Restore config
|
||||||
|
if self._snapshot is not None:
|
||||||
|
self._config.llm.temperature = self._snapshot.temperature
|
||||||
|
self._config.llm.max_tokens = self._snapshot.max_tokens
|
||||||
|
self._registry.restore_filter(self._snapshot.disabled_tools)
|
||||||
|
self._snapshot = None
|
||||||
|
|
||||||
|
if summary:
|
||||||
|
self._ctx.add_message(
|
||||||
|
"system",
|
||||||
|
f"[Skill completed: {skill_name}] {summary}",
|
||||||
|
)
|
||||||
|
|
||||||
|
self._active_skill = None
|
||||||
|
logger.info("Skill deactivated: %s", skill_name)
|
||||||
|
|
||||||
|
def _resolve_chain(
|
||||||
|
self, skill: Skill, in_progress: set[str], completed: set[str] | None = None,
|
||||||
|
) -> list[Skill]:
|
||||||
|
"""Depth-first resolution of skill chain dependencies.
|
||||||
|
|
||||||
|
Uses separate in_progress (current path) and completed sets to correctly
|
||||||
|
handle diamond dependencies without false cycle detection.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
skill: The skill whose chain to resolve.
|
||||||
|
in_progress: Skills on the current recursion path (for cycle detection).
|
||||||
|
completed: Skills already fully resolved (skip duplicates).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Ordered list of chained skills to activate before the target.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
SkillChainError: If a cycle is detected.
|
||||||
|
"""
|
||||||
|
if completed is None:
|
||||||
|
completed = set()
|
||||||
|
|
||||||
|
if skill.manifest is None or not skill.manifest.chain:
|
||||||
|
return []
|
||||||
|
|
||||||
|
result: list[Skill] = []
|
||||||
|
for dep_name in skill.manifest.chain:
|
||||||
|
if dep_name in completed:
|
||||||
|
continue # Already resolved via another branch (diamond dep)
|
||||||
|
|
||||||
|
if dep_name in in_progress:
|
||||||
|
raise SkillChainError(
|
||||||
|
f"Cycle detected in skill chain: {dep_name} already in progress "
|
||||||
|
f"(path: {' -> '.join(in_progress)} -> {dep_name})"
|
||||||
|
)
|
||||||
|
|
||||||
|
dep_skill = self._skills.get_skill(dep_name)
|
||||||
|
if dep_skill is None:
|
||||||
|
logger.warning("Chained skill not found: %s (required by %s)", dep_name, skill.name)
|
||||||
|
continue
|
||||||
|
|
||||||
|
in_progress.add(dep_name)
|
||||||
|
result.extend(self._resolve_chain(dep_skill, in_progress, completed))
|
||||||
|
in_progress.discard(dep_name)
|
||||||
|
completed.add(dep_name)
|
||||||
|
result.append(dep_skill)
|
||||||
|
|
||||||
|
return result
|
||||||
|
|
||||||
|
def _apply_overrides(self, manifest: SkillManifest) -> None:
|
||||||
|
"""Apply config overrides from a skill manifest."""
|
||||||
|
overrides = manifest.config_overrides
|
||||||
|
|
||||||
|
if overrides.temperature is not None:
|
||||||
|
self._config.llm.temperature = overrides.temperature
|
||||||
|
|
||||||
|
if overrides.max_tokens is not None:
|
||||||
|
self._config.llm.max_tokens = overrides.max_tokens
|
||||||
|
|
||||||
|
if overrides.tools_enable is not None or overrides.tools_disable is not None:
|
||||||
|
previous = self._registry.apply_filter(
|
||||||
|
enable=overrides.tools_enable,
|
||||||
|
disable=overrides.tools_disable,
|
||||||
|
)
|
||||||
|
# Store for restoration
|
||||||
|
if self._snapshot:
|
||||||
|
self._snapshot.disabled_tools = previous
|
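The depth-first resolution in `_resolve_chain` can be exercised on a plain dependency dict. The sketch below is an approximation: `resolve_chain` and the graphs are hypothetical, it raises `ValueError` in place of `SkillChainError`, and it tracks the current path in a list rather than a set so that the path in the error message comes out in traversal order.

```python
def resolve_chain(name, graph, in_progress=None, completed=None):
    """Return dependencies of `name` in activation order (dependencies first)."""
    in_progress = [] if in_progress is None else in_progress
    completed = set() if completed is None else completed
    order = []
    for dep in graph.get(name, []):
        if dep in completed:
            continue  # already resolved via another branch (diamond)
        if dep in in_progress:
            raise ValueError(f"cycle: {' -> '.join(in_progress)} -> {dep}")
        if dep not in graph:
            continue  # unknown skill: skipped (the diff logs a warning)
        in_progress.append(dep)
        order.extend(resolve_chain(dep, graph, in_progress, completed))
        in_progress.pop()
        completed.add(dep)
        order.append(dep)
    return order


diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(resolve_chain("a", diamond))

cyclic = {"x": ["y"], "y": ["x"]}
try:
    resolve_chain("x", cyclic)
    cycle_error = None
except ValueError as exc:
    cycle_error = str(exc)
print(cycle_error)
```

In the diamond case `d` is emitted once, because the `completed` set lets the second branch skip it without tripping the cycle check. As in the diff, the root skill is not seeded into `in_progress`, so a cycle through the root is only detected one level deeper.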
||||||
@@ -1,46 +1,92 @@
|
|||||||
"""Skills manager — scans for and loads skill markdown files."""
|
"""Skills manager — scans for and loads skill packages and legacy markdown files."""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from pydantic import BaseModel
|
import yaml
|
||||||
|
from pydantic import BaseModel, ValidationError
|
||||||
|
|
||||||
from app.models.config import SkillsConfig
|
from app.models.config import SkillsConfig
|
||||||
|
from app.models.skill import SkillManifest
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
class Skill(BaseModel):
|
class Skill(BaseModel):
|
||||||
"""Metadata for a discovered skill file."""
|
"""Metadata for a discovered skill (package or legacy flat file)."""
|
||||||
|
|
||||||
name: str
|
name: str
|
||||||
description: str
|
description: str
|
||||||
path: Path
|
path: Path
|
||||||
|
manifest: SkillManifest | None = None
|
||||||
|
|
||||||
|
|
||||||
class SkillsManager:
|
class SkillsManager:
|
||||||
"""Discovers, indexes, and loads skill files from configured directories."""
|
"""Discovers, indexes, and loads skill files from configured directories.
|
||||||
|
|
||||||
|
Supports both:
|
||||||
|
- Directory-based packages (contain skill.yaml + prompt .md files)
|
||||||
|
- Legacy flat .md files (backwards compatible)
|
||||||
|
"""
|
||||||
|
|
||||||
def __init__(self, config: SkillsConfig, workspace_root: Path) -> None:
|
def __init__(self, config: SkillsConfig, workspace_root: Path) -> None:
|
||||||
self._config = config
|
self._config = config
|
||||||
self._workspace = workspace_root
|
self._workspace = workspace_root
|
||||||
self._skills: dict[str, Skill] = {}
|
self._skills: dict[str, Skill] = {}
|
||||||
|
self._trigger_map: dict[str, str] = {} # trigger -> skill name
|
||||||
self._scan()
|
self._scan()
|
||||||
|
|
||||||
def _scan(self) -> None:
|
def _scan(self) -> None:
|
||||||
"""Scan configured directories for .md skill files."""
|
"""Scan configured directories for skill packages and legacy .md files."""
|
||||||
for skill_dir in self._config.directories:
|
for skill_dir in self._config.directories:
|
||||||
resolved = (self._workspace / skill_dir) if not skill_dir.is_absolute() else skill_dir
|
resolved = (self._workspace / skill_dir) if not skill_dir.is_absolute() else skill_dir
|
||||||
if not resolved.is_dir():
|
if not resolved.is_dir():
|
||||||
logger.debug("Skills directory does not exist: %s", resolved)
|
logger.debug("Skills directory does not exist: %s", resolved)
|
||||||
continue
|
continue
|
||||||
for md in sorted(resolved.glob("*.md")):
|
|
||||||
name = md.stem
|
for entry in sorted(resolved.iterdir()):
|
||||||
desc = self._extract_description(md)
|
if entry.is_dir():
|
||||||
self._skills[name] = Skill(name=name, description=desc, path=md)
|
self._scan_package(entry)
|
||||||
logger.debug("Discovered skill: %s (%s)", name, desc)
|
elif entry.suffix == ".md":
|
||||||
|
self._scan_legacy(entry)
|
||||||
|
|
||||||
|
def _scan_package(self, pkg_dir: Path) -> None:
|
||||||
|
"""Scan a directory-based skill package containing skill.yaml."""
|
||||||
|
manifest_path = pkg_dir / "skill.yaml"
|
||||||
|
if not manifest_path.exists():
|
||||||
|
logger.debug("Skipping directory without skill.yaml: %s", pkg_dir)
|
||||||
|
return
|
||||||
|
|
||||||
|
try:
|
||||||
|
raw = yaml.safe_load(manifest_path.read_text())
|
||||||
|
manifest = SkillManifest(**raw)
|
||||||
|
except (yaml.YAMLError, ValidationError, TypeError) as e:
|
||||||
|
logger.warning("Failed to parse skill manifest %s: %s", manifest_path, e)
|
||||||
|
return
|
||||||
|
|
||||||
|
skill = Skill(
|
||||||
|
name=manifest.name,
|
||||||
|
description=manifest.description,
|
||||||
|
path=pkg_dir,
|
||||||
|
manifest=manifest,
|
||||||
|
)
|
||||||
|
self._skills[manifest.name] = skill
|
||||||
|
|
||||||
|
# Register triggers
|
||||||
|
for trigger in manifest.triggers:
|
||||||
|
normalized = trigger.lstrip("/").lower()
|
||||||
|
self._trigger_map[normalized] = manifest.name
|
||||||
|
|
||||||
|
logger.debug("Discovered skill package: %s (%s)", manifest.name, manifest.description)
|
||||||
|
|
||||||
|
def _scan_legacy(self, md_path: Path) -> None:
|
||||||
|
"""Scan a legacy flat .md skill file."""
|
||||||
|
name = md_path.stem
|
||||||
|
desc = self._extract_description(md_path)
|
||||||
|
self._skills[name] = Skill(name=name, description=desc, path=md_path)
|
||||||
|
logger.debug("Discovered legacy skill: %s (%s)", name, desc)
|
||||||
|
|
||||||
@staticmethod
|
@staticmethod
|
||||||
def _extract_description(path: Path) -> str:
|
def _extract_description(path: Path) -> str:
|
||||||
@@ -55,10 +101,51 @@ class SkillsManager:
|
|||||||
"""Return all discovered skills."""
|
"""Return all discovered skills."""
|
||||||
return list(self._skills.values())
|
return list(self._skills.values())
|
||||||
|
|
||||||
|
def get_skill(self, name: str) -> Skill | None:
|
||||||
|
"""Look up a skill by name."""
|
||||||
|
return self._skills.get(name)
|
||||||
|
|
||||||
|
def get_skill_by_trigger(self, trigger: str) -> Skill | None:
|
||||||
|
"""Look up a skill by /command trigger.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
trigger: The trigger string (with or without leading /).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The matching Skill, or None.
|
||||||
|
"""
|
||||||
|
normalized = trigger.lstrip("/").lower()
|
||||||
|
skill_name = self._trigger_map.get(normalized)
|
||||||
|
if skill_name:
|
||||||
|
return self._skills.get(skill_name)
|
||||||
|
return None
|
||||||
|
|
||||||
def load_skill(self, name: str) -> str | None:
|
def load_skill(self, name: str) -> str | None:
|
||||||
"""Load the full content of a skill by name. Returns None if not found."""
|
"""Load the full content of a skill by name.
|
||||||
|
|
||||||
|
For package skills, concatenates all prompt .md files.
|
||||||
|
For legacy skills, returns the .md file content.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Concatenated prompt content, or None if not found.
|
||||||
|
"""
|
||||||
skill = self._skills.get(name)
|
skill = self._skills.get(name)
|
||||||
return skill.path.read_text() if skill else None
|
if skill is None:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if skill.manifest is not None:
|
||||||
|
# Package skill: load prompt files
|
||||||
|
parts: list[str] = []
|
||||||
|
for prompt_file in skill.manifest.prompts:
|
||||||
|
prompt_path = skill.path / prompt_file
|
||||||
|
if prompt_path.exists():
|
||||||
|
parts.append(prompt_path.read_text())
|
||||||
|
else:
|
||||||
|
logger.warning("Prompt file not found: %s", prompt_path)
|
||||||
|
return "\n\n".join(parts) if parts else None
|
||||||
|
else:
|
||||||
|
# Legacy flat file
|
||||||
|
return skill.path.read_text()
|
||||||
|
|
||||||
def get_system_prompt_snippet(self) -> str:
|
def get_system_prompt_snippet(self) -> str:
|
||||||
"""Generate a snippet for the system prompt listing available skills."""
|
"""Generate a snippet for the system prompt listing available skills."""
|
||||||
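The trigger handling in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `TriggerIndex`, `register`, and `lookup` are hypothetical names, and only the `lstrip("/").lower()` normalization is taken from the diff.

```python
from typing import Dict, List, Optional


def normalize_trigger(trigger: str) -> str:
    """Strip leading slashes and lowercase, mirroring the diff's normalization."""
    return trigger.lstrip("/").lower()


class TriggerIndex:
    """Hypothetical stand-in for the trigger map kept by SkillsManager."""

    def __init__(self) -> None:
        self._trigger_map: Dict[str, str] = {}

    def register(self, skill_name: str, triggers: List[str]) -> None:
        # A later registration of the same trigger silently overwrites the earlier one.
        for trigger in triggers:
            self._trigger_map[normalize_trigger(trigger)] = skill_name

    def lookup(self, trigger: str) -> Optional[str]:
        return self._trigger_map.get(normalize_trigger(trigger))


index = TriggerIndex()
index.register("brainstorm", ["/brainstorm", "/Ideate"])
print(index.lookup("IDEATE"))
print(index.lookup("/unknown"))
```

Because both registration and lookup go through the same normalization, `/Ideate`, `ideate`, and `IDEATE` all resolve to the same skill. Two skills that declare the same trigger would collide, with the last one scanned winning.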
@@ -66,6 +153,10 @@ class SkillsManager:
|
|||||||
return ""
|
return ""
|
||||||
lines = ["\nAvailable skills (invoke with /skill-name):"]
|
lines = ["\nAvailable skills (invoke with /skill-name):"]
|
||||||
for s in self._skills.values():
|
for s in self._skills.values():
|
||||||
lines.append(f" - /{s.name}: {s.description}")
|
if s.manifest and s.manifest.triggers:
|
||||||
|
trigger_str = ", ".join(s.manifest.triggers)
|
||||||
|
lines.append(f" - {trigger_str}: {s.description}")
|
||||||
|
else:
|
||||||
|
lines.append(f" - /{s.name}: {s.description}")
|
||||||
lines.append("To use a skill's full instructions, call the load_skill tool.")
|
lines.append("To use a skill's full instructions, call the load_skill tool.")
|
||||||
return "\n".join(lines)
|
return "\n".join(lines)
|
||||||
|
|||||||
@@ -60,8 +60,10 @@ class StreamHandler:
         """
         thinking_notified = False
         last_update_time = 0.0
+        chunk_count = 0
 
         async for chunk in chunk_iter:
+            chunk_count += 1
             self._process_chunk(chunk)
 
             if not self._display_config.stream_output:
@@ -96,6 +98,14 @@ class StreamHandler:
|
|||||||
self._on_done()
|
self._on_done()
|
||||||
|
|
||||||
tool_calls = self._build_tool_calls() or None
|
tool_calls = self._build_tool_calls() or None
|
||||||
|
|
||||||
|
if chunk_count > 0 and not self._accumulated_content and not tool_calls:
|
||||||
|
logger.debug(
|
||||||
|
"stream_empty_result",
|
||||||
|
chunks_received=chunk_count,
|
||||||
|
had_reasoning=bool(self._accumulated_reasoning),
|
||||||
|
)
|
||||||
|
|
||||||
return Message(
|
return Message(
|
||||||
role="assistant",
|
role="assistant",
|
||||||
content=self._accumulated_content or None,
|
content=self._accumulated_content or None,
|
||||||
@@ -183,11 +193,8 @@ class StreamHandler:
         return bool(self._accumulated_reasoning) and not self._accumulated_content and not self._tool_calls
 
     def reset(self) -> None:
-        """Clear all accumulators for the next turn."""
+        """Clear accumulators for the next LLM call, preserving UI callbacks."""
         self._accumulated_content = ""
         self._accumulated_reasoning = ""
         self._tool_calls.clear()
         self._usage = None
-        self._on_content = None
-        self._on_thinking = None
-        self._on_done = None
|
|||||||
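The effect of the `reset()` change can be illustrated with a reduced handler. This is a sketch under assumptions: `MiniStreamHandler` and `feed` are hypothetical, and only the idea of clearing accumulators while leaving callbacks registered comes from the diff.

```python
from typing import Callable, Optional


class MiniStreamHandler:
    """Hypothetical, reduced handler: one accumulator plus one UI callback."""

    def __init__(self, on_content: Optional[Callable[[str], None]] = None) -> None:
        self._on_content = on_content
        self._accumulated_content = ""

    def feed(self, text: str) -> None:
        self._accumulated_content += text
        if self._on_content is not None:
            self._on_content(text)

    def reset(self) -> None:
        # Clear per-call state only; the callback stays registered.
        self._accumulated_content = ""


seen = []
handler = MiniStreamHandler(on_content=seen.append)
handler.feed("first call")
handler.reset()
handler.feed("second call")
print(seen)
```

If `reset()` also cleared `_on_content`, as the removed lines did, the second `feed` would update the accumulator but never reach the UI callback.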
@@ -1,13 +1,17 @@
 """Edit tools: str_replace and patch_apply."""
 
+from __future__ import annotations
+
 import subprocess
 from pathlib import Path
 from typing import Any
 
 from pydantic import BaseModel, Field
 
+from app.models.config import AppConfig
 from app.models.tool_call import ToolResult, ToolResultStatus
 from app.tools.base import BaseTool
+from app.utils.file_cache import FileCache, cached_read_file
 from app.utils.file_helpers import (
     FileSizeError,
     PathSecurityError,
@@ -37,6 +41,12 @@ class StrReplaceTool(BaseTool):
|
|||||||
)
|
)
|
||||||
params_model = StrReplaceParams
|
params_model = StrReplaceParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
def execute(
|
def execute(
|
||||||
self, *, tool_call_id: str, file_path: str, old_str: str, new_str: str, **kwargs: Any
|
self, *, tool_call_id: str, file_path: str, old_str: str, new_str: str, **kwargs: Any
|
||||||
) -> ToolResult:
|
) -> ToolResult:
|
||||||
@@ -44,11 +54,12 @@ class StrReplaceTool(BaseTool):
|
|||||||
|
|
||||||
# Read the file
|
# Read the file
|
||||||
try:
|
try:
|
||||||
content = safe_read_file(
|
content = cached_read_file(
|
||||||
file_path,
|
file_path,
|
||||||
self.workspace_root,
|
self.workspace_root,
|
||||||
max_size_bytes=fs_config.max_file_size_bytes,
|
max_size_bytes=fs_config.max_file_size_bytes,
|
||||||
check_binary=fs_config.binary_detection,
|
check_binary=fs_config.binary_detection,
|
||||||
|
cache=self._file_cache,
|
||||||
)
|
)
|
||||||
except PathSecurityError as exc:
|
except PathSecurityError as exc:
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
@@ -117,8 +128,14 @@ class StrReplaceTool(BaseTool):
|
|||||||
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
||||||
rel_path = safe_path.relative_to(self.workspace_root)
|
rel_path = safe_path.relative_to(self.workspace_root)
|
||||||
except (PathSecurityError, ValueError):
|
except (PathSecurityError, ValueError):
|
||||||
|
safe_path = None
|
||||||
rel_path = Path(file_path)
|
rel_path = Path(file_path)
|
||||||
|
|
||||||
|
# Pre-warm cache with the new content (we already have it in memory).
|
||||||
|
if self._file_cache is not None and safe_path is not None:
|
||||||
|
self._file_cache.invalidate(safe_path)
|
||||||
|
self._file_cache.put(safe_path, new_content)
|
||||||
|
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
tool_name=self.name,
|
tool_name=self.name,
|
||||||
@@ -144,6 +161,12 @@ class PatchApplyTool(BaseTool):
|
|||||||
)
|
)
|
||||||
params_model = PatchApplyParams
|
params_model = PatchApplyParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
def execute(self, *, tool_call_id: str, file_path: str, patch: str, **kwargs: Any) -> ToolResult:
|
def execute(self, *, tool_call_id: str, file_path: str, patch: str, **kwargs: Any) -> ToolResult:
|
||||||
try:
|
try:
|
||||||
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
||||||
@@ -195,6 +218,9 @@ class PatchApplyTool(BaseTool):
|
|||||||
error=f"Patch failed (exit {result.returncode}): {result.stderr or result.stdout}",
|
error=f"Patch failed (exit {result.returncode}): {result.stderr or result.stdout}",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
if self._file_cache is not None:
|
||||||
|
self._file_cache.invalidate(safe_path)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
rel_path = safe_path.relative_to(self.workspace_root)
|
rel_path = safe_path.relative_to(self.workspace_root)
|
||||||
except ValueError:
|
except ValueError:
|
||||||
|
|||||||
@@ -1,12 +1,16 @@
 """Filesystem tools: read_file, list_dir, write_file, make_dir, delete_file."""
 
+from __future__ import annotations
+
 from pathlib import Path
 from typing import Any
 
 from pydantic import BaseModel, Field
 
+from app.models.config import AppConfig
 from app.models.tool_call import ToolResult, ToolResultStatus
 from app.tools.base import BaseTool
+from app.utils.file_cache import FileCache, cached_read_file
 from app.utils.file_helpers import (
     BinaryFileError,
     FileSizeError,
@@ -23,6 +27,12 @@ class ReadFileParams(BaseModel):
|
|||||||
file_path: str = Field(description="Path to the file to read (relative to workspace root)")
|
file_path: str = Field(description="Path to the file to read (relative to workspace root)")
|
||||||
|
|
||||||
|
|
||||||
|
class ReadManyFilesParams(BaseModel):
|
||||||
|
"""Parameters for the read_many_files tool."""
|
||||||
|
|
||||||
|
file_paths: list[str] = Field(description="List of file paths to read (relative to workspace root)")
|
||||||
|
|
||||||
|
|
||||||
class ReadFileTool(BaseTool):
|
class ReadFileTool(BaseTool):
|
||||||
"""Read the contents of a file within the workspace."""
|
"""Read the contents of a file within the workspace."""
|
||||||
|
|
||||||
@@ -30,14 +40,22 @@ class ReadFileTool(BaseTool):
|
|||||||
description = "Read the full contents of a text file. Returns the file content as a string."
|
description = "Read the full contents of a text file. Returns the file content as a string."
|
||||||
params_model = ReadFileParams
|
params_model = ReadFileParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
def execute(self, *, tool_call_id: str, file_path: str, **kwargs: Any) -> ToolResult:
|
def execute(self, *, tool_call_id: str, file_path: str, **kwargs: Any) -> ToolResult:
|
||||||
fs_config = self.config.tools.filesystem
|
fs_config = self.config.tools.filesystem
|
||||||
|
hits_before = self._file_cache.stats.hits if self._file_cache else 0
|
||||||
try:
|
try:
|
||||||
content = safe_read_file(
|
content = cached_read_file(
|
||||||
file_path,
|
file_path,
|
||||||
self.workspace_root,
|
self.workspace_root,
|
||||||
max_size_bytes=fs_config.max_file_size_bytes,
|
max_size_bytes=fs_config.max_file_size_bytes,
|
||||||
check_binary=fs_config.binary_detection,
|
check_binary=fs_config.binary_detection,
|
||||||
|
cache=self._file_cache,
|
||||||
)
|
)
|
||||||
except PathSecurityError as exc:
|
except PathSecurityError as exc:
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
@@ -47,11 +65,12 @@ class ReadFileTool(BaseTool):
|
|||||||
error=str(exc),
|
error=str(exc),
|
||||||
)
|
)
|
||||||
except FileNotFoundError as exc:
|
except FileNotFoundError as exc:
|
||||||
|
filename = Path(file_path).name
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
tool_name=self.name,
|
tool_name=self.name,
|
||||||
status=ToolResultStatus.ERROR,
|
status=ToolResultStatus.ERROR,
|
||||||
error=str(exc),
|
error=f"{exc}. Use find_files to locate it, e.g. find_files(pattern=\"{filename}\")",
|
||||||
)
|
)
|
||||||
except FileSizeError as exc:
|
except FileSizeError as exc:
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
@@ -68,6 +87,23 @@ class ReadFileTool(BaseTool):
|
|||||||
error=str(exc),
|
error=str(exc),
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# On cache hit the file is unchanged — its content is already in
|
||||||
|
# conversation context from the earlier read, so avoid resending it.
|
||||||
|
was_cache_hit = (
|
||||||
|
self._file_cache is not None
|
||||||
|
and self._file_cache.stats.hits > hits_before
|
||||||
|
)
|
||||||
|
if was_cache_hit:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.SUCCESS,
|
||||||
|
output=(
|
||||||
|
f"[Cached] {file_path} is unchanged since last read "
|
||||||
|
f"({len(content):,} chars). Content is already in conversation context."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
tool_name=self.name,
|
tool_name=self.name,
|
||||||
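The cache-hit detection used by `ReadFileTool.execute` (compare the hit counter before and after the read) can be sketched as below. This is an illustration under assumptions: `TinyCache`, `CacheStats`, and `read_with_notice` are hypothetical, and the real `FileCache` and `cached_read_file` may behave differently, for example by validating file modification times.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0


@dataclass
class TinyCache:
    """Hypothetical in-memory cache keyed by path string."""

    stats: CacheStats = field(default_factory=CacheStats)
    _data: Dict[str, str] = field(default_factory=dict)

    def get(self, key: str) -> Optional[str]:
        value = self._data.get(key)
        if value is None:
            self.stats.misses += 1
        else:
            self.stats.hits += 1
        return value

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def invalidate(self, key: str) -> None:
        self._data.pop(key, None)


def read_with_notice(cache: TinyCache, key: str, loader: Callable[[str], str]) -> str:
    """Return the content on a miss, or a short notice when the read was a cache hit."""
    hits_before = cache.stats.hits
    content = cache.get(key)
    if content is None:
        content = loader(key)
        cache.put(key, content)
    if cache.stats.hits > hits_before:
        return f"[Cached] {key} is unchanged since last read ({len(content):,} chars)."
    return content


cache = TinyCache()
first = read_with_notice(cache, "a.txt", lambda _: "hello")
second = read_with_notice(cache, "a.txt", lambda _: "hello")
cache.invalidate("a.txt")
third = read_with_notice(cache, "a.txt", lambda _: "hello, world")
print(first, second, third, sep="\n")
```

Comparing counters avoids changing the read function's return type, but it assumes no other reader touches the same cache between the two counter reads. A cache shared across threads would likely need an explicit hit flag returned from the read instead.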
@@ -76,6 +112,76 @@ class ReadFileTool(BaseTool):
|
|||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class ReadManyFilesTool(BaseTool):
|
||||||
|
"""Read contents of multiple files at once."""
|
||||||
|
|
||||||
|
name = "read_many_files"
|
||||||
|
description = (
|
||||||
|
"Read contents of multiple files at once. Returns each file's content "
|
||||||
|
"prefixed with its path header."
|
||||||
|
)
|
||||||
|
params_model = ReadManyFilesParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
|
def execute(self, *, tool_call_id: str, file_paths: list[str], **kwargs: Any) -> ToolResult:
|
||||||
|
if not file_paths:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error="file_paths list is empty",
|
||||||
|
)
|
||||||
|
|
||||||
|
fs_config = self.config.tools.filesystem
|
||||||
|
sections: list[str] = []
|
||||||
|
success_count = 0
|
||||||
|
|
||||||
|
for fp in file_paths:
|
||||||
|
hits_before = self._file_cache.stats.hits if self._file_cache else 0
|
||||||
|
try:
|
||||||
|
content = cached_read_file(
|
||||||
|
fp,
|
||||||
|
self.workspace_root,
|
||||||
|
max_size_bytes=fs_config.max_file_size_bytes,
|
||||||
|
check_binary=fs_config.binary_detection,
|
||||||
|
cache=self._file_cache,
|
||||||
|
)
|
||||||
|
was_hit = (
|
||||||
|
self._file_cache is not None
|
||||||
|
and self._file_cache.stats.hits > hits_before
|
||||||
|
)
|
||||||
|
if was_hit:
|
||||||
|
sections.append(
|
||||||
|
f"=== {fp} ===\n[Cached] Unchanged since last read "
|
||||||
|
f"({len(content):,} chars). Already in conversation context."
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
sections.append(f"=== {fp} ===\n{content}")
|
||||||
|
success_count += 1
|
||||||
|
except (PathSecurityError, FileNotFoundError, FileSizeError, BinaryFileError) as exc:
|
||||||
|
sections.append(f"=== {fp} ===\n[ERROR] {exc}")
|
||||||
|
|
||||||
|
if success_count == 0:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error="All files failed to read:\n" + "\n".join(sections),
|
||||||
|
)
|
||||||
|
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.SUCCESS,
|
||||||
|
output="\n".join(sections),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
class ListDirParams(BaseModel):
|
class ListDirParams(BaseModel):
|
||||||
"""Parameters for the list_dir tool."""
|
"""Parameters for the list_dir tool."""
|
||||||
|
|
||||||
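The aggregation behaviour of `ReadManyFilesTool` (one section per path, per-file errors reported inline, overall failure only when every read fails) can be sketched with the standard library alone. This is a simplified illustration: `read_many` is a hypothetical helper, and it omits the path-security, size, and binary checks that the real tool delegates to `cached_read_file`.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def read_many(root: Path, file_paths: List[str]) -> Tuple[bool, str]:
    """Read each path under root; return (any_success, joined sections)."""
    sections = []
    success_count = 0
    for fp in file_paths:
        try:
            content = (root / fp).read_text(encoding="utf-8")
        except (FileNotFoundError, IsADirectoryError, UnicodeDecodeError) as exc:
            # A failed file becomes an inline error section; the loop continues.
            sections.append(f"=== {fp} ===\n[ERROR] {exc}")
            continue
        sections.append(f"=== {fp} ===\n{content}")
        success_count += 1
    return success_count > 0, "\n".join(sections)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_text("alpha", encoding="utf-8")
    ok, output = read_many(root, ["a.txt", "missing.txt"])
    ok_missing, _ = read_many(root, ["missing.txt"])

print(ok, ok_missing)
print(output)
```

Reporting partial failures inside a successful result lets the caller use the files that did load without a second round trip, at the cost of having to scan the output for `[ERROR]` markers.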
@@ -167,6 +273,12 @@ class WriteFileTool(BaseTool):
|
|||||||
)
|
)
|
||||||
params_model = WriteFileParams
|
params_model = WriteFileParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
def execute(self, *, tool_call_id: str, file_path: str, content: str, **kwargs: Any) -> ToolResult:
|
def execute(self, *, tool_call_id: str, file_path: str, content: str, **kwargs: Any) -> ToolResult:
|
||||||
fs_config = self.config.tools.filesystem
|
fs_config = self.config.tools.filesystem
|
||||||
try:
|
try:
|
||||||
@@ -191,6 +303,9 @@ class WriteFileTool(BaseTool):
|
|||||||
error=str(exc),
|
error=str(exc),
|
||||||
)
|
)
|
||||||
|
|
||||||
|
if self._file_cache is not None:
|
||||||
|
self._file_cache.invalidate(safe_path)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
rel_path = safe_path.relative_to(self.workspace_root)
|
rel_path = safe_path.relative_to(self.workspace_root)
|
||||||
except ValueError:
|
except ValueError:
|
||||||
@@ -272,6 +387,12 @@ class DeleteFileTool(BaseTool):
|
|||||||
description = "Delete a single file. Does not delete directories."
|
description = "Delete a single file. Does not delete directories."
|
||||||
params_model = DeleteFileParams
|
params_model = DeleteFileParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self, workspace_root: Path, config: AppConfig, file_cache: FileCache | None = None
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._file_cache = file_cache
|
||||||
|
|
||||||
def execute(self, *, tool_call_id: str, file_path: str, **kwargs: Any) -> ToolResult:
|
def execute(self, *, tool_call_id: str, file_path: str, **kwargs: Any) -> ToolResult:
|
||||||
try:
|
try:
|
||||||
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
safe_path = resolve_safe_path(file_path, self.workspace_root)
|
||||||
@@ -309,6 +430,9 @@ class DeleteFileTool(BaseTool):
|
|||||||
error=f"Failed to delete file: {exc}",
|
error=f"Failed to delete file: {exc}",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
if self._file_cache is not None:
|
||||||
|
self._file_cache.invalidate(safe_path)
|
||||||
|
|
||||||
try:
|
try:
|
||||||
rel_path = safe_path.relative_to(self.workspace_root)
|
rel_path = safe_path.relative_to(self.workspace_root)
|
||||||
except ValueError:
|
except ValueError:
|
||||||
|
|||||||
@@ -8,6 +8,7 @@ from typing import TYPE_CHECKING, Any
 
 from app.models.config import AppConfig
 from app.tools.base import BaseTool
+from app.utils.file_cache import FileCache
 
 if TYPE_CHECKING:
     from app.services.skills import SkillsManager
@@ -20,6 +21,7 @@ class ToolRegistry:
 
     def __init__(self) -> None:
         self._tools: dict[str, BaseTool] = {}
+        self._disabled: set[str] = set()
 
     def register(self, tool: BaseTool) -> None:
         """Register a tool instance. Raises ValueError on duplicate name."""
@@ -29,26 +31,78 @@ class ToolRegistry:
|
|||||||
logger.debug("Registered tool: %s", tool.name)
|
logger.debug("Registered tool: %s", tool.name)
|
||||||
|
|
||||||
def get(self, name: str) -> BaseTool | None:
|
def get(self, name: str) -> BaseTool | None:
|
||||||
"""Look up a tool by name."""
|
"""Look up a tool by name. Returns None if disabled or not found."""
|
||||||
|
if name in self._disabled:
|
||||||
|
return None
|
||||||
return self._tools.get(name)
|
return self._tools.get(name)
|
||||||
|
|
||||||
def get_all(self) -> dict[str, BaseTool]:
|
def get_all(self) -> dict[str, BaseTool]:
|
||||||
"""Return all registered tools."""
|
"""Return all registered tools (excluding disabled)."""
|
||||||
return dict(self._tools)
|
return {k: v for k, v in self._tools.items() if k not in self._disabled}
|
||||||
|
|
||||||
def get_openai_tools_schema(self) -> list[dict[str, Any]]:
|
def get_openai_tools_schema(self) -> list[dict[str, Any]]:
|
||||||
"""Return OpenAI function-calling schemas for all registered tools."""
|
"""Return OpenAI function-calling schemas for all active tools."""
|
||||||
return [tool.get_openai_schema() for tool in self._tools.values()]
|
return [
|
||||||
|
tool.get_openai_schema()
|
||||||
|
for tool in self._tools.values()
|
||||||
|
if tool.name not in self._disabled
|
||||||
|
]
|
||||||
|
|
||||||
|
def apply_filter(
|
||||||
|
self,
|
||||||
|
*,
|
||||||
|
enable: list[str] | None = None,
|
||||||
|
disable: list[str] | None = None,
|
||||||
|
) -> set[str]:
|
||||||
|
"""Apply a tool filter, returning the previous disabled set for restoration.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
enable: If set, only these tools (plus always-on tools) are available.
|
||||||
|
disable: Specific tools to disable.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The previous disabled set (snapshot for restore).
|
||||||
|
"""
|
||||||
|
previous = set(self._disabled)
|
||||||
|
|
||||||
|
if enable is not None:
|
||||||
|
# Whitelist mode: disable everything not in the enable list
|
||||||
|
self._disabled = {name for name in self._tools if name not in enable}
|
||||||
|
elif disable is not None:
|
||||||
|
# Blacklist mode: add to existing disabled set (preserves global disables)
|
||||||
|
self._disabled = set(self._disabled) | set(disable)
|
||||||
|
else:
|
||||||
|
self._disabled = set()
|
||||||
|
|
||||||
|
return previous
|
||||||
|
|
||||||
|
def restore_filter(self, previous: set[str]) -> None:
|
||||||
|
"""Restore a previous filter state."""
|
||||||
|
self._disabled = previous
|
||||||
|
|
||||||
|
def all_tool_names(self) -> list[str]:
|
||||||
|
"""Return all registered tool names (including disabled)."""
|
||||||
|
return list(self._tools.keys())
|
||||||
|
|
||||||
|
|
||||||
def create_default_registry(
|
def create_default_registry(
|
||||||
workspace_root: Path,
|
workspace_root: Path,
|
||||||
config: AppConfig,
|
config: AppConfig,
|
||||||
skills_manager: SkillsManager | None = None,
|
skills_manager: SkillsManager | None = None,
|
||||||
|
skill_runner: object | None = None,
|
||||||
|
file_cache: FileCache | None = None,
|
||||||
) -> ToolRegistry:
|
) -> ToolRegistry:
|
||||||
"""Create a ToolRegistry populated with all built-in tools."""
|
"""Create a ToolRegistry populated with all built-in tools.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
workspace_root: Workspace root path.
|
||||||
|
config: Application configuration.
|
||||||
|
skills_manager: Optional skills manager for skill tools.
|
||||||
|
skill_runner: Optional SkillRunner for package skill activation.
|
||||||
|
file_cache: Optional file cache shared across file-reading tools.
|
||||||
|
"""
|
||||||
# Read tools
|
# Read tools
|
||||||
from app.tools.filesystem import ListDirTool, ReadFileTool
|
from app.tools.filesystem import ListDirTool, ReadFileTool, ReadManyFilesTool
|
||||||
|
|
||||||
# Write tools
|
# Write tools
|
||||||
from app.tools.filesystem import DeleteFileTool, MakeDirTool, WriteFileTool
|
from app.tools.filesystem import DeleteFileTool, MakeDirTool, WriteFileTool
|
||||||
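The snapshot-and-restore pattern behind `apply_filter` and `restore_filter` in this hunk can be sketched as follows. `MiniRegistry` and its `active` method are hypothetical; the branch logic is copied from the diff, and the tool names are only examples.

```python
from typing import Dict, List, Optional, Set


class MiniRegistry:
    """Hypothetical reduced registry that tracks only tool names."""

    def __init__(self, names: List[str]) -> None:
        self._tools: Dict[str, object] = {name: object() for name in names}
        self._disabled: Set[str] = set()

    def active(self) -> List[str]:
        return [name for name in self._tools if name not in self._disabled]

    def apply_filter(
        self,
        *,
        enable: Optional[List[str]] = None,
        disable: Optional[List[str]] = None,
    ) -> Set[str]:
        previous = set(self._disabled)  # snapshot for later restoration
        if enable is not None:
            # Whitelist mode: the disabled set is rebuilt from scratch.
            self._disabled = {name for name in self._tools if name not in enable}
        elif disable is not None:
            # Blacklist mode: extend whatever was already disabled.
            self._disabled = set(self._disabled) | set(disable)
        else:
            self._disabled = set()
        return previous

    def restore_filter(self, previous: Set[str]) -> None:
        self._disabled = previous


registry = MiniRegistry(["read_file", "write_file", "run_command"])
registry.apply_filter(disable=["run_command"])          # e.g. a global disable
snapshot = registry.apply_filter(enable=["read_file"])  # e.g. a skill-scoped whitelist
during = registry.active()
registry.restore_filter(snapshot)
after = registry.active()
print(during, after)
```

One consequence worth noting: whitelist mode replaces the disabled set rather than intersecting with it, so a tool that was disabled earlier becomes active again if it appears in `enable`. Calling `apply_filter()` with neither argument clears all disables.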
@@ -68,7 +122,8 @@ def create_default_registry(
|
|||||||
registry = ToolRegistry()
|
registry = ToolRegistry()
|
||||||
|
|
||||||
# Read
|
# Read
|
||||||
registry.register(ReadFileTool(workspace_root, config))
|
registry.register(ReadFileTool(workspace_root, config, file_cache=file_cache))
|
||||||
|
registry.register(ReadManyFilesTool(workspace_root, config, file_cache=file_cache))
|
||||||
registry.register(ListDirTool(workspace_root, config))
|
registry.register(ListDirTool(workspace_root, config))
|
||||||
|
|
||||||
# Search
|
# Search
|
||||||
@@ -76,13 +131,13 @@ def create_default_registry(
|
|||||||
registry.register(FindFilesTool(workspace_root, config))
|
registry.register(FindFilesTool(workspace_root, config))
|
||||||
|
|
||||||
# Write
|
# Write
|
||||||
registry.register(WriteFileTool(workspace_root, config))
|
registry.register(WriteFileTool(workspace_root, config, file_cache=file_cache))
|
||||||
registry.register(MakeDirTool(workspace_root, config))
|
registry.register(MakeDirTool(workspace_root, config))
|
||||||
registry.register(DeleteFileTool(workspace_root, config))
|
registry.register(DeleteFileTool(workspace_root, config, file_cache=file_cache))
|
||||||
|
|
||||||
# Edit
|
# Edit
|
||||||
registry.register(StrReplaceTool(workspace_root, config))
|
registry.register(StrReplaceTool(workspace_root, config, file_cache=file_cache))
|
||||||
registry.register(PatchApplyTool(workspace_root, config))
|
registry.register(PatchApplyTool(workspace_root, config, file_cache=file_cache))
|
||||||
|
|
||||||
# Shell
|
# Shell
|
||||||
registry.register(RunCommandTool(workspace_root, config))
|
registry.register(RunCommandTool(workspace_root, config))
|
||||||
@@ -92,8 +147,11 @@ def create_default_registry(
|
|||||||
|
|
||||||
# Skills (conditional)
|
# Skills (conditional)
|
||||||
if skills_manager is not None:
|
if skills_manager is not None:
|
||||||
from app.tools.skills import LoadSkillTool
|
from app.services.skill_runner import SkillRunner as SkillRunnerType
|
||||||
|
from app.tools.skills import FinishSkillTool, LoadSkillTool
|
||||||
|
|
||||||
registry.register(LoadSkillTool(workspace_root, config, skills_manager))
|
runner = skill_runner if isinstance(skill_runner, SkillRunnerType) else None
|
||||||
|
registry.register(LoadSkillTool(workspace_root, config, skills_manager, runner))
|
||||||
|
registry.register(FinishSkillTool(workspace_root, config, runner))
|
||||||
|
|
||||||
return registry
|
return registry
|
||||||
|
|||||||
@@ -1,5 +1,6 @@
 """Shell tool: run_command."""
 
+import re
 import shlex
 import subprocess
 from typing import Any
@@ -11,6 +12,9 @@ from app.tools.base import BaseTool
 
 _DEFAULT_TIMEOUT = 30
 
+# Detect shell redirects that write to files (>, >>, heredocs)
+_WRITE_REDIRECT_PATTERN = re.compile(r"(?:>\s*\S|>>|<<)")
+
 
 class RunCommandParams(BaseModel):
     """Parameters for the run_command tool."""
@@ -43,6 +47,18 @@ class RunCommandTool(BaseTool):
|
|||||||
error=f"Command denied: matches blocked prefix '{denied}'",
|
error=f"Command denied: matches blocked prefix '{denied}'",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Defense-in-depth: reject commands that contain file-write redirects
|
||||||
|
if _WRITE_REDIRECT_PATTERN.search(command):
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error=(
|
||||||
|
"Command contains file-write redirect (>, >>, or <<) "
|
||||||
|
"which bypasses file-write permissions. Use write_file instead."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
# Allow check: first token must be in allowed_commands
|
# Allow check: first token must be in allowed_commands
|
||||||
try:
|
try:
|
||||||
tokens = shlex.split(command)
|
tokens = shlex.split(command)
|
||||||
|
|||||||
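To see what the redirect check accepts and rejects, the pattern from the diff can be run against a few sample commands. The `has_write_redirect` helper and the sample strings are illustrative assumptions; only the regular expression itself is taken from the change.

```python
import re

# Pattern copied from the diff above.
_WRITE_REDIRECT_PATTERN = re.compile(r"(?:>\s*\S|>>|<<)")


def has_write_redirect(command: str) -> bool:
    """Return True when the raw command string matches the redirect pattern."""
    return _WRITE_REDIRECT_PATTERN.search(command) is not None


samples = [
    "ls -la",                # no redirect
    "sort < input.txt",      # input redirect only
    "echo hi > out.txt",     # plain output redirect
    "cat <<EOF",             # heredoc
    "pytest 2>&1",           # stderr merge: writes no file, still matches
    'grep "a>b" notes.txt',  # '>' inside quotes still matches
]
for sample in samples:
    print(f"{sample!r}: {has_write_redirect(sample)}")
```

Because the pattern runs on the raw string before `shlex.split`, it is deliberately conservative: it also rejects commands where `>` appears inside quotes or in a descriptor merge such as `2>&1`. That trade-off seems consistent with the "defense-in-depth" intent, though it may surprise users running otherwise harmless commands.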
@@ -1,17 +1,20 @@
|
|||||||
"""Load skill tool — allows the LLM to load skill instructions on demand."""
|
"""Skill tools — load and finish skills during agent operation."""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from typing import Any, ClassVar
|
from typing import TYPE_CHECKING, Any, ClassVar
|
||||||
|
|
||||||
from pydantic import BaseModel, Field
|
from pydantic import BaseModel, Field
|
||||||
|
|
||||||
from app.models.config import AppConfig
|
from app.models.config import AppConfig
|
||||||
from app.models.tool_call import ToolResult, ToolResultStatus
|
from app.models.tool_call import ToolResult, ToolResultStatus
|
||||||
from app.services.skills import SkillsManager
|
|
||||||
from app.tools.base import BaseTool
|
from app.tools.base import BaseTool
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from app.services.skill_runner import SkillRunner
|
||||||
|
from app.services.skills import SkillsManager
|
||||||
|
|
||||||
|
|
||||||
class LoadSkillParams(BaseModel):
|
class LoadSkillParams(BaseModel):
|
||||||
"""Parameters for the load_skill tool."""
|
"""Parameters for the load_skill tool."""
|
||||||
@@ -23,6 +26,8 @@ class LoadSkillTool(BaseTool):
|
|||||||
"""Load a skill's full instructions by name.
|
"""Load a skill's full instructions by name.
|
||||||
|
|
||||||
Use when a skill is relevant to the current task.
|
Use when a skill is relevant to the current task.
|
||||||
|
For package skills, this activates the full skill lifecycle
|
||||||
|
(config overrides, chaining, prompt injection).
|
||||||
"""
|
"""
|
||||||
|
|
||||||
name: ClassVar[str] = "load_skill"
|
name: ClassVar[str] = "load_skill"
|
||||||
@@ -37,14 +42,22 @@ class LoadSkillTool(BaseTool):
|
|||||||
workspace_root: Path,
|
workspace_root: Path,
|
||||||
config: AppConfig,
|
config: AppConfig,
|
||||||
skills_manager: SkillsManager,
|
skills_manager: SkillsManager,
|
||||||
|
skill_runner: SkillRunner | None = None,
|
||||||
) -> None:
|
) -> None:
|
||||||
super().__init__(workspace_root, config)
|
super().__init__(workspace_root, config)
|
||||||
self._skills = skills_manager
|
self._skills = skills_manager
|
||||||
|
self._runner = skill_runner
|
||||||
|
|
||||||
|
def set_skill_runner(self, runner: SkillRunner) -> None:
|
||||||
|
"""Late-bind the SkillRunner (avoids circular init dependencies)."""
|
||||||
|
self._runner = runner
|
||||||
|
|
||||||
def execute(self, *, tool_call_id: str, **kwargs: Any) -> ToolResult:
|
def execute(self, *, tool_call_id: str, **kwargs: Any) -> ToolResult:
|
||||||
skill_name: str = kwargs["name"]
|
skill_name: str = kwargs["name"]
|
||||||
content = self._skills.load_skill(skill_name)
|
|
||||||
if content is None:
|
# Check if skill exists
|
||||||
|
skill = self._skills.get_skill(skill_name)
|
||||||
|
if skill is None:
|
||||||
available = [s.name for s in self._skills.list_skills()]
|
available = [s.name for s in self._skills.list_skills()]
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
@@ -52,9 +65,94 @@ class LoadSkillTool(BaseTool):
|
|||||||
status=ToolResultStatus.ERROR,
|
status=ToolResultStatus.ERROR,
|
||||||
error=f"Unknown skill '{skill_name}'. Available: {available}",
|
error=f"Unknown skill '{skill_name}'. Available: {available}",
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# For package skills with a runner, use full activation flow
|
||||||
|
if skill.manifest is not None and self._runner is not None:
|
||||||
|
content = self._runner.activate(skill_name)
|
||||||
|
if content is None:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error=f"Failed to activate skill '{skill_name}'",
|
||||||
|
)
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.SUCCESS,
|
||||||
|
output=f"Skill '{skill_name}' activated.\n\n{content}",
|
||||||
|
)
|
||||||
|
|
||||||
|
# Legacy skill: just load content
|
||||||
|
content = self._skills.load_skill(skill_name)
|
||||||
|
if content is None:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error=f"Failed to load skill '{skill_name}'",
|
||||||
|
)
|
||||||
return ToolResult(
|
return ToolResult(
|
||||||
tool_call_id=tool_call_id,
|
tool_call_id=tool_call_id,
|
||||||
tool_name=self.name,
|
tool_name=self.name,
|
||||||
status=ToolResultStatus.SUCCESS,
|
status=ToolResultStatus.SUCCESS,
|
||||||
output=content,
|
output=content,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class FinishSkillParams(BaseModel):
|
||||||
|
"""Parameters for the finish_skill tool."""
|
||||||
|
|
||||||
|
summary: str = Field(
|
||||||
|
default="Skill complete.",
|
||||||
|
description="Brief summary of what was accomplished during the skill",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
class FinishSkillTool(BaseTool):
|
||||||
|
"""Signal that the active skill is complete and should be deactivated.
|
||||||
|
|
||||||
|
Restores config overrides and tool availability to pre-skill state.
|
||||||
|
The agent loop continues after this (unlike the finish tool).
|
||||||
|
"""
|
||||||
|
|
||||||
|
name: ClassVar[str] = "finish_skill"
|
||||||
|
description: ClassVar[str] = (
|
||||||
|
"Call this when the active skill's task is complete. "
|
||||||
|
"Deactivates the skill and restores normal config. "
|
||||||
|
"The conversation continues after this."
|
||||||
|
)
|
||||||
|
params_model: ClassVar[type[BaseModel]] = FinishSkillParams
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
workspace_root: Path,
|
||||||
|
config: AppConfig,
|
||||||
|
skill_runner: SkillRunner | None = None,
|
||||||
|
) -> None:
|
||||||
|
super().__init__(workspace_root, config)
|
||||||
|
self._runner = skill_runner
|
||||||
|
|
||||||
|
def set_skill_runner(self, runner: SkillRunner) -> None:
|
||||||
|
"""Late-bind the SkillRunner (avoids circular init dependencies)."""
|
||||||
|
self._runner = runner
|
||||||
|
|
||||||
|
def execute(self, *, tool_call_id: str, **kwargs: Any) -> ToolResult:
|
||||||
|
summary: str = kwargs.get("summary", "Skill complete.")
|
||||||
|
|
||||||
|
if self._runner is None or not self._runner.is_active:
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.ERROR,
|
||||||
|
error="No skill is currently active.",
|
||||||
|
)
|
||||||
|
|
||||||
|
skill_name = self._runner.active_skill_name
|
||||||
|
self._runner.deactivate(summary=summary)
|
||||||
|
return ToolResult(
|
||||||
|
tool_call_id=tool_call_id,
|
||||||
|
tool_name=self.name,
|
||||||
|
status=ToolResultStatus.SUCCESS,
|
||||||
|
output=f"Skill '{skill_name}' completed: {summary}",
|
||||||
|
)
|
||||||
|
|||||||
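Review note: the `set_skill_runner` methods above late-bind the runner because the runner needs the tool registry, which only exists after the tools are constructed. A minimal sketch of that ordering follows. `Runner` and `Tool` are hypothetical stand-ins, not the project's `SkillRunner` or `LoadSkillTool`, and the return strings are invented for illustration.

```python
from __future__ import annotations


class Runner:
    """Hypothetical stand-in for a runner that needs the registry at init."""

    def __init__(self, registry: dict[str, Tool]) -> None:
        self.registry = registry
        self.active: str | None = None

    def activate(self, name: str) -> str:
        # Record the active skill and return its (made-up) instructions.
        self.active = name
        return f"instructions for {name}"


class Tool:
    """Hypothetical tool that works without a runner until one is bound."""

    def __init__(self, runner: Runner | None = None) -> None:
        self._runner = runner

    def set_runner(self, runner: Runner) -> None:
        # Late binding: called once the runner exists.
        self._runner = runner

    def execute(self, name: str) -> str:
        if self._runner is None:
            # No runner yet: fall back to the plain "load content" path.
            return f"legacy load: {name}"
        return "activated: " + self._runner.activate(name)


# 1. Build the registry first (tools have no runner yet).
registry = {"load_skill": Tool()}
before = registry["load_skill"].execute("brainstorm")

# 2. Build the runner from the registry, then bind it back into the tool.
runner = Runner(registry)
registry["load_skill"].set_runner(runner)
after = registry["load_skill"].execute("brainstorm")
```

The setter keeps both constructors free of a circular dependency, at the cost of a window in which the tool silently takes the fallback path.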
173
app/ui/app.py
@@ -10,17 +10,19 @@ from rich.panel import Panel
|
|||||||
from rich.text import Text
|
from rich.text import Text
|
||||||
from textual.app import App, ComposeResult
|
from textual.app import App, ComposeResult
|
||||||
from textual.binding import Binding
|
from textual.binding import Binding
|
||||||
from textual.widgets import Header, RichLog
|
from textual.widgets import Input, RichLog
|
||||||
|
from textual import work
|
||||||
|
|
||||||
from app.agent.context import SessionContext
|
from app.agent.context import SessionContext
|
||||||
from app.agent.loop import AgentLoop
|
from app.agent.loop import AgentLoop
|
||||||
from app.models.config import AppConfig
|
from app.models.config import AgentMode, AppConfig
|
||||||
from app.services.llm import LLMClient
|
from app.services.llm import LLMClient
|
||||||
from app.services.permissions import PermissionsService
|
from app.services.permissions import PermissionsService
|
||||||
from app.services.session import SessionManager
|
from app.services.session import SessionManager
|
||||||
from app.services.streaming import StreamHandler
|
from app.services.streaming import StreamHandler
|
||||||
from app.tools.registry import create_default_registry
|
from app.tools.registry import create_default_registry
|
||||||
from app.ui.widgets import (
|
from app.ui.widgets import (
|
||||||
|
HeaderPanel,
|
||||||
HistoryInput,
|
HistoryInput,
|
||||||
PermissionModal,
|
PermissionModal,
|
||||||
SessionResumeModal,
|
SessionResumeModal,
|
||||||
@@ -44,6 +46,7 @@ class SneakyCodeApp(App):
|
|||||||
|
|
||||||
BINDINGS = [
|
BINDINGS = [
|
||||||
Binding("ctrl+c", "cancel_or_quit", "Cancel/Quit", show=False),
|
Binding("ctrl+c", "cancel_or_quit", "Cancel/Quit", show=False),
|
||||||
|
Binding("ctrl+p", "cycle_mode", "Cycle Mode"),
|
||||||
]
|
]
|
||||||
|
|
||||||
def __init__(self, config: AppConfig, session_mgr: SessionManager | None = None) -> None:
|
def __init__(self, config: AppConfig, session_mgr: SessionManager | None = None) -> None:
|
||||||
@@ -57,12 +60,12 @@ class SneakyCodeApp(App):
|
|||||||
self._permissions: PermissionsService | None = None
|
self._permissions: PermissionsService | None = None
|
||||||
self._debug_logger = None
|
self._debug_logger = None
|
||||||
self._skills_manager = None
|
self._skills_manager = None
|
||||||
|
self._skill_runner = None
|
||||||
self._current_worker: Worker | None = None
|
self._current_worker: Worker | None = None
|
||||||
self._cancel_count = 0
|
self._cancel_count = 0
|
||||||
self.sub_title = config.llm.model
|
|
||||||
|
|
||||||
def compose(self) -> ComposeResult:
|
def compose(self) -> ComposeResult:
|
||||||
yield Header()
|
yield HeaderPanel(model_name=self._config.llm.model)
|
||||||
yield RichLog(id="chat-log", highlight=True, markup=True)
|
yield RichLog(id="chat-log", highlight=True, markup=True)
|
||||||
yield StreamingStatic("", id="streaming")
|
yield StreamingStatic("", id="streaming")
|
||||||
yield StatusBar()
|
yield StatusBar()
|
||||||
@@ -72,6 +75,9 @@ class SneakyCodeApp(App):
|
|||||||
"""Initialize agent components after the app is mounted."""
|
"""Initialize agent components after the app is mounted."""
|
||||||
setup_logging_for_tui()
|
setup_logging_for_tui()
|
||||||
|
|
||||||
|
# Apply model profile for the initial model before creating context
|
||||||
|
self._config.apply_model_profile(self._config.llm.model)
|
||||||
|
|
||||||
self._ctx = SessionContext(self._config)
|
self._ctx = SessionContext(self._config)
|
||||||
|
|
||||||
# Create long-lived agent dependencies (reused across turns)
|
# Create long-lived agent dependencies (reused across turns)
|
||||||
@@ -94,19 +100,52 @@ class SneakyCodeApp(App):
|
|||||||
self._config.skills, self._config.agent.workspace_root
|
self._config.skills, self._config.agent.workspace_root
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Create file cache if enabled
|
||||||
|
self._file_cache = None
|
||||||
|
fs_cache_cfg = self._config.tools.filesystem.cache
|
||||||
|
if fs_cache_cfg.enabled:
|
||||||
|
from app.utils.file_cache import FileCache
|
||||||
|
|
||||||
|
self._file_cache = FileCache(max_entries=fs_cache_cfg.max_entries)
|
||||||
|
|
||||||
|
# Create tool registry (SkillRunner wired after registry exists)
|
||||||
self._tool_registry = create_default_registry(
|
self._tool_registry = create_default_registry(
|
||||||
self._config.agent.workspace_root,
|
self._config.agent.workspace_root,
|
||||||
self._config,
|
self._config,
|
||||||
skills_manager=self._skills_manager,
|
skills_manager=self._skills_manager,
|
||||||
|
file_cache=self._file_cache,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Create SkillRunner and late-bind it to skill tools
|
||||||
|
if self._skills_manager is not None and self._tool_registry is not None:
|
||||||
|
from app.services.skill_runner import SkillRunner
|
||||||
|
|
||||||
|
self._skill_runner = SkillRunner(
|
||||||
|
self._skills_manager,
|
||||||
|
self._config,
|
||||||
|
self._ctx,
|
||||||
|
self._tool_registry,
|
||||||
|
)
|
||||||
|
# Late-bind runner to skill tools already in the registry
|
||||||
|
load_tool = self._tool_registry.get("load_skill")
|
||||||
|
if load_tool and hasattr(load_tool, "set_skill_runner"):
|
||||||
|
load_tool.set_skill_runner(self._skill_runner)
|
||||||
|
finish_tool = self._tool_registry.get("finish_skill")
|
||||||
|
if finish_tool and hasattr(finish_tool, "set_skill_runner"):
|
||||||
|
finish_tool.set_skill_runner(self._skill_runner)
|
||||||
|
|
||||||
# Set up permission prompt callback
|
# Set up permission prompt callback
|
||||||
async def permission_prompt(tool_name: str, description: str) -> bool:
|
async def permission_prompt(tool_name: str, description: str) -> bool:
|
||||||
return await self._show_permission_modal(tool_name, description)
|
return await self._show_permission_modal(tool_name, description)
|
||||||
|
|
||||||
self._permissions.set_prompt_callback(permission_prompt)
|
self._permissions.set_prompt_callback(permission_prompt)
|
||||||
|
|
||||||
# Offer session resume if configured
|
# Offer session resume if configured (must run in a worker for push_screen_wait)
|
||||||
|
self._offer_session_resume()
|
||||||
|
|
||||||
|
@work
|
||||||
|
async def _offer_session_resume(self) -> None:
|
||||||
|
"""Offer to resume a previous session, running in a worker for modal support."""
|
||||||
if self._session_mgr and self._config.session.offer_resume:
|
if self._session_mgr and self._config.session.offer_resume:
|
||||||
saved = self._session_mgr.load_latest()
|
saved = self._session_mgr.load_latest()
|
||||||
if saved:
|
if saved:
|
||||||
@@ -118,7 +157,6 @@ class SneakyCodeApp(App):
|
|||||||
log.write(Text("Session restored", style="bold green"))
|
log.write(Text("Session restored", style="bold green"))
|
||||||
else:
|
else:
|
||||||
log.write(Text("Starting fresh session", style="cyan"))
|
log.write(Text("Starting fresh session", style="cyan"))
|
||||||
|
|
||||||
self.query_one(HistoryInput).focus()
|
self.query_one(HistoryInput).focus()
|
||||||
|
|
||||||
async def on_input_submitted(self, event: Input.Submitted) -> None:
|
async def on_input_submitted(self, event: Input.Submitted) -> None:
|
||||||
@@ -131,6 +169,10 @@ class SneakyCodeApp(App):
|
|||||||
event.input.record(user_input)
|
event.input.record(user_input)
|
||||||
log = self.query_one("#chat-log", RichLog)
|
log = self.query_one("#chat-log", RichLog)
|
||||||
|
|
||||||
|
# Echo user prompt (condensed for multi-line)
|
||||||
|
from app.utils.display import render_user_message
|
||||||
|
log.write(render_user_message(user_input))
|
||||||
|
|
||||||
# Handle slash commands
|
# Handle slash commands
|
||||||
if user_input.startswith("/"):
|
if user_input.startswith("/"):
|
||||||
await self._handle_slash_command(user_input, log)
|
await self._handle_slash_command(user_input, log)
|
||||||
@@ -147,7 +189,26 @@ class SneakyCodeApp(App):
|
|||||||
async def _handle_slash_command(self, command: str, log: RichLog) -> None:
|
async def _handle_slash_command(self, command: str, log: RichLog) -> None:
|
||||||
"""Process slash commands."""
|
"""Process slash commands."""
|
||||||
cmd = command.lower()
|
cmd = command.lower()
|
||||||
if cmd == "/quit":
|
if cmd == "/help":
|
||||||
|
from rich.table import Table
|
||||||
|
|
||||||
|
table = Table(title="SneakyCode Commands", show_lines=False)
|
||||||
|
table.add_column("Command", style="cyan", no_wrap=True)
|
||||||
|
table.add_column("Description")
|
||||||
|
table.add_row("/help", "Show this help message")
|
||||||
|
table.add_row("/quit, /exit, /bye", "Save session and exit")
|
||||||
|
table.add_row("/clear", "Clear conversation history")
|
||||||
|
table.add_row("/history", "Show conversation history")
|
||||||
|
table.add_row("/save", "Manually save session")
|
||||||
|
table.add_row("/session", "Show session info (messages, tokens, start time)")
|
||||||
|
table.add_row("/models, /model", "List available Ollama models")
|
||||||
|
table.add_row("/model <name>", "Switch to a different model")
|
||||||
|
table.add_row("/mode", "Show current agent mode")
|
||||||
|
table.add_row("/mode normal|plan|auto", "Switch agent mode")
|
||||||
|
table.add_row("/skills", "List available skills")
|
||||||
|
table.add_row("/<skill>", "Load a skill by name")
|
||||||
|
log.write(table)
|
||||||
|
elif cmd in ("/quit", "/exit", "/bye"):
|
||||||
self._save_session()
|
self._save_session()
|
||||||
self.exit()
|
self.exit()
|
||||||
elif cmd == "/clear":
|
elif cmd == "/clear":
|
||||||
@@ -173,7 +234,7 @@ class SneakyCodeApp(App):
|
|||||||
f"Started: {self._ctx.start_time.isoformat()}",
|
f"Started: {self._ctx.start_time.isoformat()}",
|
||||||
style="cyan",
|
style="cyan",
|
||||||
))
|
))
|
||||||
elif cmd.startswith("/models"):
|
elif cmd.split()[0] in ("/models", "/model"):
|
||||||
parts = command.split(maxsplit=1)
|
parts = command.split(maxsplit=1)
|
||||||
if len(parts) == 1:
|
if len(parts) == 1:
|
||||||
# List available models
|
# List available models
|
||||||
@@ -197,8 +258,44 @@ class SneakyCodeApp(App):
|
|||||||
else:
|
else:
|
||||||
new_model = parts[1].strip()
|
new_model = parts[1].strip()
|
||||||
self._config.llm.model = new_model
|
self._config.llm.model = new_model
|
||||||
self.sub_title = new_model
|
if self._session_mgr:
|
||||||
log.write(Text(f"Switched to model: {new_model}", style="bold green"))
|
self._session_mgr.update_model(new_model)
|
||||||
|
# Apply model-specific profile overrides
|
||||||
|
profile = self._config.apply_model_profile(new_model)
|
||||||
|
if profile and self._ctx:
|
||||||
|
# Update token budget if the profile overrides it
|
||||||
|
self._ctx.token_counter.budget = self._config.agent.max_conversation_tokens
|
||||||
|
self.query_one(HeaderPanel).update_model(new_model)
|
||||||
|
header = self.query_one(HeaderPanel)
|
||||||
|
header.update_tokens(
|
||||||
|
self._ctx.estimated_tokens if self._ctx else 0,
|
||||||
|
self._config.agent.max_conversation_tokens,
|
||||||
|
)
|
||||||
|
msg = f"Switched to model: {new_model}"
|
||||||
|
if profile:
|
||||||
|
overrides = []
|
||||||
|
if profile.max_conversation_tokens is not None:
|
||||||
|
overrides.append(f"tokens={profile.max_conversation_tokens:,}")
|
||||||
|
if profile.thinking is not None:
|
||||||
|
overrides.append(f"thinking={'on' if profile.thinking else 'off'}")
|
||||||
|
if overrides:
|
||||||
|
msg += f" ({', '.join(overrides)})"
|
||||||
|
log.write(Text(msg, style="bold green"))
|
||||||
|
elif cmd.split()[0] == "/mode":
|
||||||
|
parts = command.split(maxsplit=1)
|
||||||
|
if len(parts) == 1:
|
||||||
|
current = self._permissions.mode
|
||||||
|
log.write(Text(f"Current mode: {current.value}", style="cyan"))
|
||||||
|
else:
|
||||||
|
mode_str = parts[1].strip().lower()
|
||||||
|
try:
|
||||||
|
new_mode = AgentMode(mode_str)
|
||||||
|
except ValueError:
|
||||||
|
log.write(Text(f"Unknown mode: {mode_str}. Use normal, plan, or auto.", style="yellow"))
|
||||||
|
return
|
||||||
|
self._permissions.mode = new_mode
|
||||||
|
self.query_one(HeaderPanel).update_mode(new_mode)
|
||||||
|
log.write(Text(f"Switched to {new_mode.value} mode", style="bold green"))
|
||||||
elif cmd == "/skills":
|
elif cmd == "/skills":
|
||||||
if self._skills_manager:
|
if self._skills_manager:
|
||||||
skills = self._skills_manager.list_skills()
|
skills = self._skills_manager.list_skills()
|
||||||
@@ -216,7 +313,24 @@ class SneakyCodeApp(App):
|
|||||||
else:
|
else:
|
||||||
log.write(Text("Skills system is disabled", style="yellow"))
|
log.write(Text("Skills system is disabled", style="yellow"))
|
||||||
else:
|
else:
|
||||||
# Try as skill invocation
|
# Try as skill trigger (package skill via SkillRunner)
|
||||||
|
if self._skill_runner and self._skills_manager:
|
||||||
|
skill = self._skills_manager.get_skill_by_trigger(cmd.lstrip("/"))
|
||||||
|
if skill is not None:
|
||||||
|
content = self._skill_runner.activate(skill.name)
|
||||||
|
status_bar = self.query_one(StatusBar)
|
||||||
|
status_bar.set_active_skill(skill.name)
|
||||||
|
log.write(Text(f"Skill activated: {skill.name}", style="bold green"))
|
||||||
|
# Run an agent turn so the LLM sees the skill context
|
||||||
|
self._cancel_count = 0
|
||||||
|
self._current_worker = self.run_worker(
|
||||||
|
self._run_agent_turn(f"[Skill activated: {skill.name}]"),
|
||||||
|
name="agent-turn",
|
||||||
|
exclusive=True,
|
||||||
|
)
|
||||||
|
return
|
||||||
|
|
||||||
|
# Try as legacy skill invocation
|
||||||
skill_name = cmd.lstrip("/")
|
skill_name = cmd.lstrip("/")
|
||||||
if self._skills_manager:
|
if self._skills_manager:
|
||||||
content = self._skills_manager.load_skill(skill_name)
|
content = self._skills_manager.load_skill(skill_name)
|
||||||
@@ -225,7 +339,7 @@ class SneakyCodeApp(App):
|
|||||||
self._ctx.add_message("system", f"[Skill: {skill_name}]\n{content}")
|
self._ctx.add_message("system", f"[Skill: {skill_name}]\n{content}")
|
||||||
log.write(Text(f"Loaded skill: {skill_name}", style="bold green"))
|
log.write(Text(f"Loaded skill: {skill_name}", style="bold green"))
|
||||||
return
|
return
|
||||||
log.write(Text(f"⚠ Unknown command: {command}", style="yellow"))
|
log.write(Text(f"Unknown command: {command}", style="yellow"))
|
||||||
|
|
||||||
async def _run_agent_turn(self, user_input: str) -> None:
|
async def _run_agent_turn(self, user_input: str) -> None:
|
||||||
"""Run a single agent turn (called as a worker)."""
|
"""Run a single agent turn (called as a worker)."""
|
||||||
@@ -243,12 +357,19 @@ class SneakyCodeApp(App):
|
|||||||
status_bar.start_streaming()
|
status_bar.start_streaming()
|
||||||
|
|
||||||
# Set up streaming UI callbacks
|
# Set up streaming UI callbacks
|
||||||
|
header = self.query_one(HeaderPanel)
|
||||||
|
|
||||||
def on_content(content: str) -> None:
|
def on_content(content: str) -> None:
|
||||||
streaming_widget.update(
|
streaming_widget.update(
|
||||||
Panel(Markdown(content), title="Assistant", border_style="green", expand=True)
|
Panel(Markdown(content), title="Assistant", border_style="green", expand=True)
|
||||||
)
|
)
|
||||||
streaming_widget.show_streaming()
|
streaming_widget.show_streaming()
|
||||||
status_bar.update_stream_tokens(len(content) // 4)
|
stream_tokens = len(content) // 4
|
||||||
|
status_bar.update_stream_tokens(stream_tokens)
|
||||||
|
header.update_tokens(
|
||||||
|
self._ctx.estimated_tokens + stream_tokens,
|
||||||
|
self._ctx.token_counter.budget,
|
||||||
|
)
|
||||||
|
|
||||||
def on_thinking() -> None:
|
def on_thinking() -> None:
|
||||||
streaming_widget.update(Text("Thinking...", style="dim"))
|
streaming_widget.update(Text("Thinking...", style="dim"))
|
||||||
@@ -265,12 +386,23 @@ class SneakyCodeApp(App):
|
|||||||
self._tool_registry, self._permissions, display,
|
self._tool_registry, self._permissions, display,
|
||||||
debug_logger=self._debug_logger,
|
debug_logger=self._debug_logger,
|
||||||
skills_manager=self._skills_manager,
|
skills_manager=self._skills_manager,
|
||||||
|
skill_runner=self._skill_runner,
|
||||||
)
|
)
|
||||||
|
|
||||||
await agent.run_turn(user_input)
|
await agent.run_turn(user_input)
|
||||||
|
|
||||||
status_bar.stop_streaming()
|
status_bar.stop_streaming()
|
||||||
|
|
||||||
|
# Update token display in header
|
||||||
|
header = self.query_one(HeaderPanel)
|
||||||
|
header.update_tokens(self._ctx.estimated_tokens, self._ctx.token_counter.budget)
|
||||||
|
|
||||||
|
# Update skill indicator (skill may have been deactivated via finish_skill)
|
||||||
|
if self._skill_runner and not self._skill_runner.is_active:
|
||||||
|
status_bar.set_active_skill(None)
|
||||||
|
elif self._skill_runner and self._skill_runner.is_active:
|
||||||
|
status_bar.set_active_skill(self._skill_runner.active_skill_name)
|
||||||
|
|
||||||
# Auto-save
|
# Auto-save
|
||||||
if self._config.session.auto_save:
|
if self._config.session.auto_save:
|
||||||
self._save_session()
|
self._save_session()
|
||||||
@@ -290,6 +422,21 @@ class SneakyCodeApp(App):
|
|||||||
log = self.query_one("#chat-log", RichLog)
|
log = self.query_one("#chat-log", RichLog)
|
||||||
log.write(Text("⚠ Cancelling... (press Ctrl+C again to quit)", style="yellow"))
|
log.write(Text("⚠ Cancelling... (press Ctrl+C again to quit)", style="yellow"))
|
||||||
|
|
||||||
|
def action_cycle_mode(self) -> None:
|
||||||
|
"""Cycle through agent modes: Normal → Plan → Auto → Normal."""
|
||||||
|
if self._permissions is None:
|
||||||
|
return
|
||||||
|
cycle = {
|
||||||
|
AgentMode.NORMAL: AgentMode.PLAN,
|
||||||
|
AgentMode.PLAN: AgentMode.AUTO,
|
||||||
|
AgentMode.AUTO: AgentMode.NORMAL,
|
||||||
|
}
|
||||||
|
new_mode = cycle[self._permissions.mode]
|
||||||
|
self._permissions.mode = new_mode
|
||||||
|
self.query_one(HeaderPanel).update_mode(new_mode)
|
||||||
|
log = self.query_one("#chat-log", RichLog)
|
||||||
|
log.write(Text(f"Mode: {new_mode.value}", style="bold green"))
|
||||||
|
|
||||||
async def on_unmount(self) -> None:
|
async def on_unmount(self) -> None:
|
||||||
"""Clean up the LLM client on app shutdown."""
|
"""Clean up the LLM client on app shutdown."""
|
||||||
if self._client is not None:
|
if self._client is not None:
|
||||||
|
|||||||
@@ -55,8 +55,8 @@ Screen {
|
|||||||
Input {
|
Input {
|
||||||
dock: bottom;
|
dock: bottom;
|
||||||
margin: 0;
|
margin: 0;
|
||||||
|
border: heavy darkcyan;
|
||||||
|
padding: 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
Header {
|
/* HeaderPanel styles are in DEFAULT_CSS on the widget itself */
|
||||||
dock: top;
|
|
||||||
}
|
|
||||||
|
|||||||
@@ -11,6 +11,82 @@ from textual.widgets import Button, Input, Static
|
|||||||
|
|
||||||
from rich.text import Text
|
from rich.text import Text
|
||||||
|
|
||||||
|
from app.models.config import AgentMode
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Header Panel
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
|
class HeaderPanel(Static):
|
||||||
|
"""Single-line header showing model name, agent mode, and token usage."""
|
||||||
|
|
||||||
|
DEFAULT_CSS = """
|
||||||
|
HeaderPanel {
|
||||||
|
dock: top;
|
||||||
|
height: 1;
|
||||||
|
background: darkcyan;
|
||||||
|
color: $text;
|
||||||
|
padding: 0 2;
|
||||||
|
}
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, model_name: str) -> None:
|
||||||
|
super().__init__("")
|
||||||
|
self._model_name = model_name
|
||||||
|
self._mode: AgentMode = AgentMode.NORMAL
|
||||||
|
self._tokens: int = 0
|
||||||
|
self._budget: int = 0
|
||||||
|
|
||||||
|
def on_resize(self) -> None:
|
||||||
|
self._refresh_display()
|
||||||
|
|
||||||
|
def update_model(self, name: str) -> None:
|
||||||
|
"""Update the displayed model name."""
|
||||||
|
self._model_name = name
|
||||||
|
self._refresh_display()
|
||||||
|
|
||||||
|
def update_mode(self, mode: AgentMode) -> None:
|
||||||
|
"""Update the displayed agent mode."""
|
||||||
|
self._mode = mode
|
||||||
|
self._refresh_display()
|
||||||
|
|
||||||
|
def update_tokens(self, tokens: int, budget: int) -> None:
|
||||||
|
"""Update the token usage display."""
|
||||||
|
self._tokens = tokens
|
||||||
|
self._budget = budget
|
||||||
|
self._refresh_display()
|
||||||
|
|
||||||
|
def _refresh_display(self) -> None:
|
||||||
|
"""Rebuild the header text."""
|
||||||
|
left = Text.assemble(
|
||||||
|
("⚡ SneakyCode", "bold"),
|
||||||
|
" │ ",
|
||||||
|
(self._model_name, "bold"),
|
||||||
|
)
|
||||||
|
|
||||||
|
mode_styles = {
|
||||||
|
AgentMode.NORMAL: ("NORMAL", "bold black on white"),
|
||||||
|
AgentMode.PLAN: ("PLAN", "bold black on yellow"),
|
||||||
|
AgentMode.AUTO: ("AUTO", "bold white on red"),
|
||||||
|
}
|
||||||
|
mode_label, mode_style = mode_styles[self._mode]
|
||||||
|
mode_text = Text.assemble((" ", mode_style), (mode_label, mode_style), (" ", mode_style))
|
||||||
|
|
||||||
|
right = Text(f"~{self._tokens:,} / {self._budget:,} tokens")
|
||||||
|
|
||||||
|
# Pad between sections
|
||||||
|
total_content = left.plain + " " + mode_text.plain + " " + right.plain
|
||||||
|
available = self.size.width if self.size.width > 0 else 80
|
||||||
|
gap_left = max(1, (available - len(total_content)) // 2)
|
||||||
|
gap_right = max(1, available - len(total_content) - gap_left)
|
||||||
|
|
||||||
|
full = Text.assemble(
|
||||||
|
left, " " * gap_left, mode_text, " " * gap_right, right,
|
||||||
|
)
|
||||||
|
self.update(full)
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Modal Dialogs
|
# Modal Dialogs
|
||||||
@@ -139,20 +215,13 @@ class StatusBar(Static):
|
|||||||
|
|
||||||
def __init__(self) -> None:
|
def __init__(self) -> None:
|
||||||
super().__init__("")
|
super().__init__("")
|
||||||
self._tokens: int = 0
|
|
||||||
self._budget: int = 0
|
|
||||||
self._iteration: int = 0
|
self._iteration: int = 0
|
||||||
self._max_iterations: int = 0
|
self._max_iterations: int = 0
|
||||||
self._streaming: bool = False
|
self._streaming: bool = False
|
||||||
self._spinner_frame: int = 0
|
self._spinner_frame: int = 0
|
||||||
self._spinner_timer: Timer | None = None
|
self._spinner_timer: Timer | None = None
|
||||||
self._stream_tokens: int = 0
|
self._stream_tokens: int = 0
|
||||||
|
self._active_skill: str | None = None
|
||||||
def update_tokens(self, tokens: int, budget: int) -> None:
|
|
||||||
"""Update the token usage display."""
|
|
||||||
self._tokens = tokens
|
|
||||||
self._budget = budget
|
|
||||||
self._refresh_display()
|
|
||||||
|
|
||||||
def update_iteration(self, iteration: int, max_iterations: int) -> None:
|
def update_iteration(self, iteration: int, max_iterations: int) -> None:
|
||||||
"""Update the iteration count display."""
|
"""Update the iteration count display."""
|
||||||
@@ -184,16 +253,21 @@ class StatusBar(Static):
|
|||||||
self._spinner_frame = (self._spinner_frame + 1) % len(self._SPINNER)
|
self._spinner_frame = (self._spinner_frame + 1) % len(self._SPINNER)
|
||||||
self._refresh_display()
|
self._refresh_display()
|
||||||
|
|
||||||
|
def set_active_skill(self, skill_name: str | None) -> None:
|
||||||
|
"""Set or clear the active skill indicator."""
|
||||||
|
self._active_skill = skill_name
|
||||||
|
self._refresh_display()
|
||||||
|
|
||||||
def _refresh_display(self) -> None:
|
def _refresh_display(self) -> None:
|
||||||
"""Rebuild the status bar text."""
|
"""Rebuild the status bar text."""
|
||||||
parts: list[str] = []
|
parts: list[str] = []
|
||||||
|
if self._active_skill:
|
||||||
|
parts.append(f"[Skill: {self._active_skill}]")
|
||||||
if self._streaming:
|
if self._streaming:
|
||||||
spinner = self._SPINNER[self._spinner_frame]
|
spinner = self._SPINNER[self._spinner_frame]
|
||||||
parts.append(f"{spinner} Thinking")
|
parts.append(f"{spinner} Thinking")
|
||||||
if self._stream_tokens > 0:
|
if self._stream_tokens > 0:
|
||||||
parts.append(f"~{self._stream_tokens:,} tokens")
|
parts.append(f"~{self._stream_tokens:,} tokens")
|
||||||
if self._budget > 0:
|
|
||||||
parts.append(f"Tokens: ~{self._tokens:,} / {self._budget:,}")
|
|
||||||
if self._max_iterations > 0:
|
if self._max_iterations > 0:
|
||||||
parts.append(f"Iteration {self._iteration}/{self._max_iterations}")
|
parts.append(f"Iteration {self._iteration}/{self._max_iterations}")
|
||||||
self.update(Text(" \u2502 ".join(parts), style="dim"))
|
self.update(Text(" \u2502 ".join(parts), style="dim"))
|
||||||
|
|||||||
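Review note: the gap arithmetic in `HeaderPanel._refresh_display` above can be checked in isolation. The sketch below reproduces the same formula on plain strings; `header_gaps` and `header_line` are hypothetical helper names, and the sample strings are made up.

```python
def header_gaps(left: str, mode: str, right: str, width: int) -> tuple[int, int]:
    """Return (gap_left, gap_right) using the same formula as the diff."""
    total_content = left + " " + mode + " " + right
    # A width of 0 (widget not laid out yet) falls back to 80 columns.
    available = width if width > 0 else 80
    gap_left = max(1, (available - len(total_content)) // 2)
    gap_right = max(1, available - len(total_content) - gap_left)
    return gap_left, gap_right


def header_line(left: str, mode: str, right: str, width: int) -> str:
    """Assemble the padded line the way the widget concatenates its parts."""
    gap_left, gap_right = header_gaps(left, mode, right, width)
    return left + " " * gap_left + mode + " " * gap_right + right


wide = header_line("ab", "M", "cd", 20)
narrow = header_gaps("ab", "M", "cd", 5)
fallback = header_gaps("ab", "M", "cd", 0)
```

Because `total_content` counts two separator spaces that are not part of the assembled line, the result in this sketch is two characters narrower than `width` when there is room. When the terminal is too narrow, both gaps clamp to 1 and the line overflows.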
@@ -10,6 +10,7 @@ from __future__ import annotations
|
|||||||
|
|
||||||
from typing import TYPE_CHECKING, Protocol
|
from typing import TYPE_CHECKING, Protocol
|
||||||
|
|
||||||
|
from rich.markdown import Markdown
|
||||||
from rich.panel import Panel
|
from rich.panel import Panel
|
||||||
from rich.table import Table
|
from rich.table import Table
|
||||||
from rich.text import Text
|
from rich.text import Text
|
||||||
@@ -43,14 +44,27 @@ if TYPE_CHECKING:
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
|
||||||
def render_user_message(content: str) -> Panel:
|
def render_user_message(content: str) -> Text:
|
||||||
"""Render a user message as a styled panel."""
|
"""Render a condensed user prompt as a single styled line.
|
||||||
return Panel(content, title="You", border_style="cyan", expand=False)
|
|
||||||
|
Multi-line input is collapsed to the first line with a line count suffix.
|
||||||
|
Long single lines are truncated.
|
||||||
|
"""
|
||||||
|
lines = content.splitlines()
|
||||||
|
first = lines[0] if lines else content
|
||||||
|
max_len = 120
|
||||||
|
if len(first) > max_len:
|
||||||
|
first = first[:max_len] + "…"
|
||||||
|
suffix = f" (+{len(lines) - 1} lines)" if len(lines) > 1 else ""
|
||||||
|
text = Text()
|
||||||
|
text.append("You: ", style="bold cyan")
|
||||||
|
text.append(first + suffix, style="cyan")
|
||||||
|
return text
|
||||||
|
|
||||||
|
|
||||||
def render_assistant_message(content: str) -> Panel:
|
def render_assistant_message(content: str) -> Panel:
|
||||||
"""Render an assistant message as a styled panel."""
|
"""Render an assistant message as a styled panel."""
|
||||||
return Panel(content, title="Assistant", border_style="green", expand=True)
|
return Panel(Markdown(content), title="Assistant", border_style="green", expand=True)
|
||||||
|
|
||||||
|
|
||||||
def render_tool_call(name: str, args: str) -> Text:
|
def render_tool_call(name: str, args: str) -> Text:
|
||||||
@@ -222,8 +236,8 @@ def print_success(message: str) -> None:
|
|||||||
|
|
||||||
|
|
||||||
def print_user_message(content: str) -> None:
|
def print_user_message(content: str) -> None:
|
||||||
"""Print a user message in a styled panel."""
|
"""Print a condensed user prompt line."""
|
||||||
console.print(Panel(content, title="You", border_style="cyan", expand=False))
|
console.print(render_user_message(content))
|
||||||
|
|
||||||
|
|
||||||
def print_assistant_message(content: str) -> None:
|
def print_assistant_message(content: str) -> None:
|
||||||
|
|||||||
185
app/utils/file_cache.py
Normal file
@@ -0,0 +1,185 @@
|
|||||||
|
"""File cache with LRU eviction and mtime-based invalidation."""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
from collections import OrderedDict
|
||||||
|
from dataclasses import dataclass, field
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from app.utils.file_helpers import (
|
||||||
|
BinaryFileError,
|
||||||
|
FileSizeError,
|
||||||
|
PathSecurityError,
|
||||||
|
check_file_size,
|
||||||
|
is_binary_file,
|
||||||
|
resolve_safe_path,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass(slots=True)
|
||||||
|
class CacheEntry:
|
||||||
|
"""A cached file's content and modification timestamp."""
|
||||||
|
|
||||||
|
content: str
|
||||||
|
mtime_ns: int
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class CacheStats:
|
||||||
|
"""Running statistics for a FileCache instance."""
|
||||||
|
|
||||||
|
hits: int = 0
|
||||||
|
misses: int = 0
|
||||||
|
invalidations: int = 0
|
||||||
|
evictions: int = 0
|
||||||
|
|
||||||
|
@property
|
||||||
|
def hit_rate(self) -> float:
|
||||||
|
"""Return cache hit rate as a float between 0.0 and 1.0."""
|
||||||
|
total = self.hits + self.misses
|
||||||
|
if total == 0:
|
||||||
|
return 0.0
|
||||||
|
return self.hits / total
|
||||||
|
|
||||||
|
|
||||||
|
class FileCache:
|
||||||
|
"""LRU file-content cache with mtime-based invalidation.
|
||||||
|
|
||||||
|
Keyed by resolved absolute ``Path``. Each lookup performs a cheap
|
||||||
|
``stat()`` syscall to verify the file hasn't changed on disk — if the
|
||||||
|
nanosecond mtime differs the entry is evicted and the caller gets a
|
||||||
|
cache miss.
|
||||||
|
|
||||||
|
Not thread-safe (single-threaded agent loop).
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, max_entries: int = 128) -> None:
|
||||||
|
self._max_entries = max_entries
|
||||||
|
self._entries: OrderedDict[Path, CacheEntry] = OrderedDict()
|
||||||
|
self._stats = CacheStats()
|
||||||
|
|
||||||
|
# -- public API --------------------------------------------------
|
||||||
|
|
||||||
|
def get(self, path: Path) -> str | None:
|
||||||
|
"""Return cached content if *path* hasn't changed, else ``None``.
|
||||||
|
|
||||||
|
A ``stat()`` call checks ``st_mtime_ns``; on mismatch the stale
|
||||||
|
entry is silently removed.
|
||||||
|
"""
|
||||||
|
entry = self._entries.get(path)
|
||||||
|
if entry is None:
|
||||||
|
self._stats.misses += 1
|
||||||
|
return None
|
||||||
|
|
||||||
|
try:
|
||||||
|
current_mtime_ns = path.stat().st_mtime_ns
|
||||||
|
except OSError:
|
||||||
|
# File gone — evict and miss.
|
||||||
|
self._remove(path)
|
||||||
|
self._stats.misses += 1
|
||||||
|
return None
|
||||||
|
|
||||||
|
if current_mtime_ns != entry.mtime_ns:
|
||||||
|
self._remove(path)
|
||||||
|
self._stats.invalidations += 1
|
||||||
|
self._stats.misses += 1
|
||||||
|
return None
|
||||||
|
|
||||||
|
# Cache hit — move to end (most-recently used).
|
||||||
|
self._entries.move_to_end(path)
|
||||||
|
self._stats.hits += 1
|
||||||
|
return entry.content
|
||||||
|
|
||||||
|
def put(self, path: Path, content: str) -> None:
|
||||||
|
"""Store *content* for *path* with its current ``st_mtime_ns``.
|
||||||
|
|
||||||
|
Evicts the least-recently-used entry when over capacity.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
mtime_ns = path.stat().st_mtime_ns
|
||||||
|
except OSError:
|
||||||
|
# Can't stat — don't cache.
|
||||||
|
return
|
||||||
|
|
||||||
|
if path in self._entries:
|
||||||
|
# Update existing; move to end.
|
||||||
|
self._entries[path] = CacheEntry(content=content, mtime_ns=mtime_ns)
|
||||||
|
self._entries.move_to_end(path)
|
||||||
|
else:
|
||||||
|
self._entries[path] = CacheEntry(content=content, mtime_ns=mtime_ns)
|
||||||
|
|
||||||
|
# Evict LRU if over capacity.
|
||||||
|
while len(self._entries) > self._max_entries:
|
||||||
|
self._entries.popitem(last=False)
|
||||||
|
self._stats.evictions += 1
|
||||||
|
|
||||||
|
def invalidate(self, path: Path) -> None:
|
||||||
|
"""Remove *path* from the cache if present."""
|
||||||
|
if path in self._entries:
|
||||||
|
del self._entries[path]
|
||||||
|
self._stats.invalidations += 1
|
||||||
|
|
||||||
|
def clear(self) -> None:
|
||||||
|
"""Remove all entries."""
|
||||||
|
self._entries.clear()
|
||||||
|
|
||||||
|
@property
|
||||||
|
def stats(self) -> CacheStats:
|
||||||
|
"""Return the running cache statistics."""
|
||||||
|
return self._stats
|
||||||
|
|
||||||
|
def __len__(self) -> int:
|
||||||
|
return len(self._entries)
|
||||||
|
|
||||||
|
# -- internals ---------------------------------------------------
|
||||||
|
|
||||||
|
def _remove(self, path: Path) -> None:
|
||||||
|
"""Delete an entry without bumping invalidation stats."""
|
||||||
|
self._entries.pop(path, None)
|
||||||
|
|
||||||
|
|
||||||
|
def cached_read_file(
|
||||||
|
path: str | Path,
|
||||||
|
workspace_root: Path,
|
||||||
|
max_size_bytes: int = 1_048_576,
|
||||||
|
check_binary: bool = True,
|
||||||
|
cache: FileCache | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Read a file with full security checks, using *cache* when available.
|
||||||
|
|
||||||
|
Security checks (path sandboxing, size limit, binary detection) run on
|
||||||
|
**every** call — only the ``Path.read_text()`` I/O is skipped on a cache
|
||||||
|
hit.
|
||||||
|
|
||||||
|
When *cache* is ``None`` this behaves identically to
|
||||||
|
:func:`~app.utils.file_helpers.safe_read_file`.
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
PathSecurityError: If the path escapes the workspace.
|
||||||
|
FileSizeError: If the file is too large.
|
||||||
|
BinaryFileError: If the file is binary and *check_binary* is True.
|
||||||
|
FileNotFoundError: If the file does not exist.
|
||||||
|
"""
|
||||||
|
safe_path = resolve_safe_path(path, workspace_root)
|
||||||
|
|
||||||
|
if not safe_path.exists():
|
||||||
|
raise FileNotFoundError(f"File not found: {safe_path}")
|
||||||
|
|
||||||
|
check_file_size(safe_path, max_size_bytes)
|
||||||
|
|
||||||
|
if check_binary and is_binary_file(safe_path):
|
||||||
|
raise BinaryFileError(f"File appears to be binary: {safe_path}")
|
||||||
|
|
||||||
|
# Try cache.
|
||||||
|
if cache is not None:
|
||||||
|
cached = cache.get(safe_path)
|
||||||
|
if cached is not None:
|
||||||
|
return cached
|
||||||
|
|
||||||
|
# Cache miss (or no cache) — read from disk.
|
||||||
|
content = safe_path.read_text(encoding="utf-8")
|
||||||
|
|
||||||
|
if cache is not None:
|
||||||
|
cache.put(safe_path, content)
|
||||||
|
|
||||||
|
return content
|
||||||
@@ -36,6 +36,11 @@ class TokenCounter:
|
|||||||
"""The configured token budget."""
|
"""The configured token budget."""
|
||||||
return self._budget
|
return self._budget
|
||||||
|
|
||||||
|
@budget.setter
|
||||||
|
def budget(self, value: int) -> None:
|
||||||
|
"""Update the token budget (e.g., when switching models)."""
|
||||||
|
self._budget = value
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def cumulative_usage(self) -> TokenUsage:
|
def cumulative_usage(self) -> TokenUsage:
|
||||||
"""Cumulative token usage across all tracked calls."""
|
"""Cumulative token usage across all tracked calls."""
|
||||||
|
|||||||
@@ -10,14 +10,32 @@ llm:
|
|||||||
max_retries: 3
|
max_retries: 3
|
||||||
retry_backoff_base: 1.0
|
retry_backoff_base: 1.0
|
||||||
retry_backoff_max: 30.0
|
retry_backoff_max: 30.0
|
||||||
|
thinking: false # Disable model thinking/reasoning mode (reduces reasoning-only loops)
|
||||||
|
# Extra parameters merged into the API request body (model-specific).
|
||||||
|
# Examples:
|
||||||
|
# OpenAI: reasoning_effort: "low"
|
||||||
|
extra_body: {}
|
||||||
|
|
||||||
agent:
|
agent:
|
||||||
max_iterations: 25
|
max_iterations: 25
|
||||||
max_conversation_tokens: 32000
|
max_conversation_tokens: 32000 # Default token budget (overridden by model_profiles)
|
||||||
workspace_root: "."
|
workspace_root: "."
|
||||||
truncation_keep_recent: 10
|
truncation_keep_recent: 10
|
||||||
truncation_threshold: 0.85
|
truncation_threshold: 0.85
|
||||||
|
|
||||||
|
# Per-model overrides — matched by longest model name prefix.
|
||||||
|
# Unset fields fall through to the defaults above.
|
||||||
|
model_profiles:
|
||||||
|
llama3:
|
||||||
|
max_conversation_tokens: 120000
|
||||||
|
thinking: false
|
||||||
|
qwen:
|
||||||
|
max_conversation_tokens: 32000
|
||||||
|
thinking: false
|
||||||
|
qwq:
|
||||||
|
max_conversation_tokens: 32000
|
||||||
|
thinking: true
|
||||||
|
|
||||||
permissions:
|
permissions:
|
||||||
auto_approve:
|
auto_approve:
|
||||||
- read_file
|
- read_file
|
||||||
@@ -43,7 +61,6 @@ tools:
|
|||||||
- pytest
|
- pytest
|
||||||
- ruff
|
- ruff
|
||||||
- ls
|
- ls
|
||||||
- cat
|
|
||||||
- head
|
- head
|
||||||
- tail
|
- tail
|
||||||
- wc
|
- wc
|
||||||
@@ -51,6 +68,10 @@ tools:
|
|||||||
- grep
|
- grep
|
||||||
- find
|
- find
|
||||||
- echo
|
- echo
|
||||||
|
- which
|
||||||
|
- jq
|
||||||
|
- type
|
||||||
|
- file
|
||||||
denied_commands:
|
denied_commands:
|
||||||
- rm -rf /
|
- rm -rf /
|
||||||
- sudo
|
- sudo
|
||||||
@@ -60,6 +81,9 @@ tools:
|
|||||||
filesystem:
|
filesystem:
|
||||||
max_file_size_bytes: 1048576 # 1 MB
|
max_file_size_bytes: 1048576 # 1 MB
|
||||||
binary_detection: true
|
binary_detection: true
|
||||||
|
cache:
|
||||||
|
enabled: true
|
||||||
|
max_entries: 128
|
||||||
|
|
||||||
session:
|
session:
|
||||||
session_dir: ".sneakycode/sessions"
|
session_dir: ".sneakycode/sessions"
|
||||||
|
|||||||
144
docs/ROADMAP.md
@@ -1,144 +0,0 @@
|
|||||||
# SneakyCode Implementation Roadmap
|
|
||||||
|
|
||||||
A phased plan progressing from bare-bones foundation to full autonomous coding agent.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 1 — Foundation: Models, Config, and Utilities
|
|
||||||
|
|
||||||
Establish the data layer and shared infrastructure everything else builds on.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/models/config.py` | Pydantic v2 config model — load and validate `config/config.yaml` |
|
|
||||||
| `app/models/message.py` | Message schema (role, content, tool_calls) |
|
|
||||||
| `app/models/tool_call.py` | ToolCall and ToolResult schemas |
|
|
||||||
| `app/utils/logging.py` | Centralized logger with Rich handler |
|
|
||||||
| `app/utils/display.py` | Rich console output helpers (stub — expanded in Phase 2) |
|
|
||||||
| `app/utils/file_helpers.py` | Safe path resolution, binary detection, size guards |
|
|
||||||
| `app/utils/token_counter.py` | Approximate token usage tracking (character-based heuristic for v1) |
|
|
||||||
| `app/main.py` | Entrypoint stub — arg parsing, config load, Rich console setup |
|
|
||||||
|
|
||||||
**Exit criteria:** `python -m app.main --help` runs, config loads and validates, models can be instantiated and serialized.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 2 — TUI and Interactive Shell
|
|
||||||
|
|
||||||
Get a working interactive terminal before wiring up the LLM.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/main.py` | Rich-based interactive REPL loop — prompt for user input, display responses |
|
|
||||||
| `app/utils/display.py` | Formatted output for agent messages, tool calls, errors, token usage |
|
|
||||||
| `app/agent/context.py` | Session state and conversation history management |
|
|
||||||
|
|
||||||
**Exit criteria:** User can type messages into a styled REPL, see them echoed back with formatting, and conversation history is tracked in memory.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 3 — LLM Integration (Ollama)
|
|
||||||
|
|
||||||
Connect to the local LLM and stream responses into the TUI.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/services/llm.py` | Async httpx client wrapping Ollama's OpenAI-compatible `/v1/chat/completions` endpoint |
|
|
||||||
| `app/services/streaming.py` | SSE parsing, Rich live display, tool call extraction from accumulated stream |
|
|
||||||
|
|
||||||
**Integration:** Wire LLM into the REPL — user message goes to LLM, streamed response displays in real time.
|
|
||||||
|
|
||||||
**Exit criteria:** User can chat with the local model through the TUI with streamed output. Tool call JSON is parsed from the stream but not yet executed.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 4 — Tool Framework and Core Tools
|
|
||||||
|
|
||||||
Build the tool abstraction and implement safe, read-only tools first.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/tools/base.py` | `BaseTool` ABC and `ToolResult` dataclass |
|
|
||||||
| `app/tools/registry.py` | Tool registration, discovery, and JSON schema export for LLM system prompt |
|
|
||||||
| `app/services/permissions.py` | Two-tier approval gating (auto-approve reads; prompt for writes/deletes/shell) |
|
|
||||||
| `app/tools/filesystem.py` | `read_file`, `list_dir` |
|
|
||||||
| `app/tools/search.py` | `grep_files`, `find_files` |
|
|
||||||
|
|
||||||
**Exit criteria:** Tools register themselves, schemas export correctly for inclusion in the system prompt, read-only tools execute and return `ToolResult` objects. Permissions service gates execution.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 5 — Agent Loop (ReAct)
|
|
||||||
|
|
||||||
The core autonomy layer — reason, act, observe, repeat.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/agent/loop.py` | ReAct cycle: send conversation to LLM, parse tool calls, execute, feed results back, repeat |
|
|
||||||
|
|
||||||
**Key behaviors:**
|
|
||||||
- System prompt constructed with tool schemas from registry
|
|
||||||
- Permissions checks before each tool execution
|
|
||||||
- Loop termination on: plain-text response (no tool calls), explicit `finish` tool call, or `max_iterations` exceeded
|
|
||||||
|
|
||||||
**Exit criteria:** Agent can autonomously answer questions about the codebase by chaining `read_file`, `list_dir`, `grep_files`, and `find_files` tool calls in a multi-turn loop.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 6 — Write Tools and Shell
|
|
||||||
|
|
||||||
Unlock the agent's ability to modify code and run commands.
|
|
||||||
|
|
||||||
| File | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| `app/tools/filesystem.py` | `write_file`, `make_dir`, `delete_file` (additions to existing module) |
|
|
||||||
| `app/tools/edit.py` | `str_replace` (unique-match required), `patch_apply` |
|
|
||||||
| `app/tools/shell.py` | `run_command` with command allow/deny lists and output truncation |
|
|
||||||
|
|
||||||
**All write/shell operations gated through permissions service.**
|
|
||||||
|
|
||||||
**Exit criteria:** Agent can autonomously create files, edit code via string replacement, and run shell commands — all with user approval for destructive operations.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 7 — Polish and Hardening
|
|
||||||
|
|
||||||
Production-readiness: error handling, resource limits, and documentation.
|
|
||||||
|
|
||||||
| Area | Description |
|
|
||||||
|------|-------------|
|
|
||||||
| Error handling | Recovery from malformed tool calls, LLM errors, network timeouts in agent loop |
|
|
||||||
| Token budget | Conversation truncation or summarization when approaching context limit |
|
|
||||||
| Graceful shutdown | Clean Ctrl+C handling, session state preservation |
|
|
||||||
| Testing | End-to-end integration tests (`tests/integration/`), unit tests (`tests/unit/`) |
|
|
||||||
| Documentation | `README.md` with setup and usage instructions, `docs/tools.md` tool reference |
|
|
||||||
|
|
||||||
**Exit criteria:** Agent handles edge cases gracefully, tests pass, and a new user can set up and use the project from the README alone.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## File Coverage
|
|
||||||
|
|
||||||
Every file from the project structure in CLAUDE.md is accounted for:
|
|
||||||
|
|
||||||
| File | Phase |
|
|
||||||
|------|-------|
|
|
||||||
| `app/main.py` | 1, 2 |
|
|
||||||
| `app/models/config.py` | 1 |
|
|
||||||
| `app/models/message.py` | 1 |
|
|
||||||
| `app/models/tool_call.py` | 1 |
|
|
||||||
| `app/utils/logging.py` | 1 |
|
|
||||||
| `app/utils/display.py` | 1, 2 |
|
|
||||||
| `app/utils/file_helpers.py` | 1 |
|
|
||||||
| `app/utils/token_counter.py` | 1 |
|
|
||||||
| `app/agent/context.py` | 2 |
|
|
||||||
| `app/services/llm.py` | 3 |
|
|
||||||
| `app/services/streaming.py` | 3 |
|
|
||||||
| `app/tools/base.py` | 4 |
|
|
||||||
| `app/tools/registry.py` | 4 |
|
|
||||||
| `app/services/permissions.py` | 4 |
|
|
||||||
| `app/tools/filesystem.py` | 4, 6 |
|
|
||||||
| `app/tools/search.py` | 4 |
|
|
||||||
| `app/agent/loop.py` | 5 |
|
|
||||||
| `app/tools/edit.py` | 6 |
|
|
||||||
| `app/tools/shell.py` | 6 |
|
|
||||||
1802
docs/superpowers/plans/2026-03-11-textual-tui.md
Normal file
File diff suppressed because it is too large
192
docs/superpowers/specs/2026-03-11-textual-tui-design.md
Normal file
@@ -0,0 +1,192 @@
|
|||||||
|
# Textual TUI Redesign — Design Spec
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Replace the current sequential print-and-scroll terminal UI with a full persistent split-screen TUI using Textual. Input is pinned at the bottom, scrollable message history above, with a header showing app/model info and a footer showing token usage and iteration count.
|
||||||
|
|
||||||
|
## Layout
|
||||||
|
|
||||||
|
```
+------------------- Header --------------------+
| SneakyCode                  qwen2.5-coder:32b |
+-----------------------------------------------+
|                                               |
|  +--- You ---+                                |
|  | prompt    |        <- RichLog widget       |
|  +-----------+        (handles own scrolling) |
|                                               |
|  Thinking...                                  |
|                                               |
|  +-- Assistant --+                            |
|  | response...   |                            |
|  +---------------+                            |
|                                               |
|  > read_file README.md -- 148 lines, 5128 ch  |
|  > grep_files "pattern" -- 3 matches          |
|                                               |
+-----------------------------------------------+
|  Tokens: ~1,511 / 32,000 | Iteration 5/25     |  <- StatusBar
+-----------------------------------------------+
|  > [input cursor]                             |  <- Input widget
+-----------------------------------------------+
```
|
||||||
|
|
||||||
|
**Widget hierarchy (no VerticalScroll wrapper — RichLog handles its own scrolling):**
|
||||||
|
- `Header` — Textual built-in, title="SneakyCode", subtitle=model name
|
||||||
|
- `RichLog` (id="chat-log") — main scroll area, accepts Rich renderables via `.write()`
|
||||||
|
- `StreamingStatic` — persistent hidden `Static` widget, shown/hidden during streaming (avoids mount/unmount overhead)
|
||||||
|
- `StatusBar` — custom `Static` widget, 1 row, docked above Input
|
||||||
|
- `Input` — Textual built-in, pinned at bottom
|
||||||
|
|
||||||
|
## New Files
|
||||||
|
|
||||||
|
### `app/ui/app.py` — Textual App
|
||||||
|
|
||||||
|
SneakyCodeApp subclasses `textual.app.App`. Responsibilities:
|
||||||
|
|
||||||
|
- `compose()` yields: Header, RichLog(id="chat-log"), StreamingStatic(id="streaming"), StatusBar(id="status"), Input
|
||||||
|
- `on_input_submitted()` handler: reads input value, clears input, writes user panel to chat log, dispatches agent turn as a worker
|
||||||
|
- Agent turn runs via `run_worker()` (async worker, NOT threaded) so the UI stays responsive. Since the worker is async and on the event loop, widget methods can be called directly — no `call_from_thread()` needed.
|
||||||
|
- Slash commands (/quit, /history, /clear, /save, /session) parsed from input before dispatching to agent
|
||||||
|
- Holds references to config, SessionContext, AgentLoop (created in `on_mount`)
|
||||||
|
- Header subtitle set to model name from config
|
||||||
|
- `on_worker_state_changed()` handler: catches worker errors and writes error panels to RichLog
|
||||||
|
- Ctrl+C binding: cancels the running agent worker (does NOT quit the app). A second Ctrl+C or `/quit` exits.
|
||||||
|
|
||||||
|
### `app/ui/widgets.py` — Custom Widgets
|
||||||
|
|
||||||
|
**StatusBar** — A simple `Static` widget styled as a footer bar. Displays token usage and iteration count. Updated by the agent loop after each LLM step via `status_bar.update(renderable)`.
|
||||||
|
|
||||||
|
**StreamingStatic** — A `Static` widget that stays mounted but hidden. During streaming, it becomes visible and receives `update()` calls with partial content. When streaming ends, it is hidden and its content is cleared. This avoids the overhead of mounting/unmounting on every LLM response.
|
||||||
|
|
||||||
|
### `app/ui/styles.tcss` — Textual CSS
|
||||||
|
|
||||||
|
Layout rules:
|
||||||
|
- RichLog fills available height (fraction-based sizing, e.g. `height: 1fr`)
|
||||||
|
- StreamingStatic: `display: none` by default, shown during streaming
|
||||||
|
- StatusBar is 1 row, docked bottom above Input
|
||||||
|
- Input is 1 row, docked at very bottom
|
||||||
|
- Color scheme matches existing SNEAKYCODE_THEME (cyan for user, green for assistant, magenta for tools, dim for metadata)
|
||||||
|
|
||||||
|
## Modified Files
|
||||||
|
|
||||||
|
### `app/main.py`
|
||||||
|
|
||||||
|
- Remove `_run_repl()` async function entirely
|
||||||
|
- Remove `console.input()` usage
|
||||||
|
- `main()` creates config, runs preflight via `asyncio.run(_preflight(config))` (before Textual starts — this is fine, separate event loop), then instantiates and runs `SneakyCodeApp(config).run()`
|
||||||
|
- CLI arg parsing stays (--config, -v, --log-file)
|
||||||
|
- Session resume: `_offer_session_resume()` moves into `SneakyCodeApp.on_mount()` — instead of `console.input()`, push a modal screen asking "Resume previous session? [y/n]" with button/key handlers
|
||||||
|
- Auto-save: triggers after each agent turn completes (in the worker completion handler)
|
||||||
|
- SIGTERM handler: removed — Textual manages its own signal handling and shutdown lifecycle
|
||||||
|
|
||||||
|
### `app/services/streaming.py`
|
||||||
|
|
||||||
|
- Remove `from rich.live import Live` and `from rich.spinner import Spinner`
|
||||||
|
- `process_stream()` no longer creates a `Rich.Live` context
|
||||||
|
- Instead, accepts callback parameters:
|
||||||
|
- `on_content: Callable[[str], None]` — called with accumulated content on each content chunk
|
||||||
|
- `on_thinking: Callable[[], None]` — called once when first reasoning token arrives
|
||||||
|
- `on_done: Callable[[], None]` — called when streaming completes
|
||||||
|
- **Throttling:** Content callback fires at most every 100ms (track last update time, skip intermediate chunks). Final content always fires on stream end.
|
||||||
|
- Since the agent runs as an async worker (on the event loop), callbacks can directly call widget methods — no `call_from_thread()` needed.
|
||||||
|
- All accumulation and tool-call parsing logic stays identical
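
The throttling rule above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `ThrottledCallback`, its `clock` parameter, and the `flush()` method are hypothetical names introduced here. The spec only states that the content callback fires at most every 100ms and that the final content always fires when the stream ends.

```python
import time
from typing import Callable, Optional


class ThrottledCallback:
    """Forward accumulated content at most once per interval.

    Hypothetical helper (not from the spec): wraps an ``on_content``
    callback and drops intermediate updates that arrive too quickly.
    """

    def __init__(
        self,
        callback: Callable[[str], None],
        interval: float = 0.1,  # 100ms, as stated in the spec
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._callback = callback
        self._interval = interval
        self._clock = clock
        self._last_fired: Optional[float] = None
        self._pending: Optional[str] = None

    def __call__(self, content: str) -> None:
        now = self._clock()
        if self._last_fired is None or now - self._last_fired >= self._interval:
            # Enough time has passed: deliver this update immediately.
            self._last_fired = now
            self._pending = None
            self._callback(content)
        else:
            # Too soon: remember the latest content and skip this update.
            self._pending = content

    def flush(self) -> None:
        """Deliver any skipped content; intended to be called on stream end."""
        if self._pending is not None:
            content, self._pending = self._pending, None
            self._callback(content)
```

Because each call carries the full accumulated content, skipping intermediate chunks loses nothing as long as `flush()` runs once when streaming completes.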
|
||||||
|
|
||||||
|
### `app/utils/display.py`
|
||||||
|
|
||||||
|
- All `print_*` functions become `render_*` functions that return Rich renderables:
|
||||||
|
- `render_user_message(content) -> Panel`
|
||||||
|
- `render_assistant_message(content) -> Panel`
|
||||||
|
- `render_tool_call(name, args) -> Text`
|
||||||
|
- `render_tool_result(name, output, is_error) -> Text`
|
||||||
|
- `render_iteration_header(iteration, max_iter) -> Text`
|
||||||
|
- `render_warning(message) -> Text`
|
||||||
|
- `render_error(message) -> Text`
|
||||||
|
- `print_banner()` removed — Header widget replaces it
|
||||||
|
- `print_token_usage()` becomes `render_token_usage() -> Text` for the StatusBar
|
||||||
|
- `print_history()` becomes `render_history() -> Table` — written to RichLog, may need width constraints for narrow terminals
|
||||||
|
- A `DisplayAdapter` class wraps a `RichLog` reference and provides `write_user_message()`, `write_tool_call()`, etc. methods that call `render_*` then `rich_log.write()`
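
A rough sketch of the adapter idea, under stated assumptions: the `render_*` functions here are plain-string stand-ins (the real ones return Rich renderables), and `SupportsWrite` is a hypothetical protocol standing in for `RichLog`, of which only the `.write()` method named in the spec is used.

```python
from typing import Any, Protocol


class SupportsWrite(Protocol):
    """Anything with a RichLog-style ``write()`` method (assumed interface)."""

    def write(self, renderable: Any) -> Any: ...


def render_user_message(content: str) -> str:
    # Stand-in: the real function returns a Rich renderable, not a str.
    return f"You: {content}"


def render_tool_call(name: str, args: str) -> str:
    # Stand-in: the real function returns a Rich Text object.
    return f"> {name} {args}"


class DisplayAdapter:
    """Route display calls through render_* and then into the log widget."""

    def __init__(self, log: SupportsWrite) -> None:
        self._log = log

    def write_user_message(self, content: str) -> None:
        self._log.write(render_user_message(content))

    def write_tool_call(self, name: str, args: str) -> None:
        self._log.write(render_tool_call(name, args))
```

Keeping rendering separate from writing means the `render_*` functions can be unit-tested without a running Textual app, while the adapter can be tested against a mock log.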
|
||||||
|
|
||||||
|
### `app/agent/loop.py`
|
||||||
|
|
||||||
|
- `AgentLoop.__init__()` accepts a `DisplayAdapter` instead of calling `display.py` print functions directly
|
||||||
|
- All display calls route through the adapter: `self._display.write_tool_call(name, args)`, `self._display.write_iteration_header(i, max)`, etc.
|
||||||
|
- `_execute_tool_calls()` becomes `async def _execute_tool_calls()` to support async permission checks
|
||||||
|
- The loop logic (ReAct pattern, retry, truncation) is unchanged
|
||||||
|
|
||||||
|
### `app/services/permissions.py`
|
||||||
|
|
||||||
|
- `PermissionsService.check()` becomes `async def check()`
|
||||||
|
- Instead of `rich.prompt.Confirm.ask()` (blocking stdin read), it:
|
||||||
|
1. Creates an `asyncio.Event`
|
||||||
|
2. Posts a custom message to the app requesting a permission modal
|
||||||
|
3. The app pushes a modal screen with the permission question and approve/deny buttons
|
||||||
|
4. When the user responds, the modal sets the event and stores the result
|
||||||
|
5. `check()` awaits the event and reads the result
|
||||||
|
- Edge cases: dismiss without choosing = deny. Ctrl+C during modal = deny. Focus returns to Input after modal dismisses.
|
||||||
|
|
||||||
|
### `app/utils/logging.py`
|
||||||
|
|
||||||
|
- **Critical change:** The shared `console = Console()` instance will corrupt the Textual display since Textual takes exclusive terminal control
|
||||||
|
- When running under Textual: disable `RichHandler` (console handler), keep only the file handler
|
||||||
|
- Add a `setup_logging_for_tui()` function that reconfigures logging to file-only mode
|
||||||
|
- Called from `SneakyCodeApp.on_mount()` before any agent work begins
|
||||||
|
- The `console` object still exists but should not be used for output during TUI mode — all output goes through the DisplayAdapter
|
||||||
|
- Consider: `--log-file` becomes required (or auto-set to a default) when running in TUI mode, so logs are not lost
|
||||||
|
|
||||||
|
## Unchanged Files
|
||||||
|
|
||||||
|
- `app/services/llm.py` — HTTP client, SSE parsing untouched
|
||||||
|
- `app/agent/context.py` — session state untouched
|
||||||
|
- `app/models/*` — all data models untouched
|
||||||
|
- `app/tools/*` — all tool implementations untouched
|
||||||
|
- `app/utils/file_helpers.py` — path safety untouched
|
||||||
|
- `app/utils/token_counter.py` — token counting untouched
|
||||||
|
|
||||||
|
## Key Patterns
|
||||||
|
|
||||||
|
### Streaming in Textual
|
||||||
|
|
||||||
|
The agent loop runs as an async worker (on the event loop, NOT threaded). During streaming:
|
||||||
|
|
||||||
|
1. App shows `StreamingStatic` widget, writes "Thinking..." initially
|
||||||
|
2. Worker calls `StreamHandler.process_stream(chunks, on_content=..., on_thinking=..., on_done=...)`
|
||||||
|
3. `on_content` callback: updates `StreamingStatic` with `Panel(Markdown(partial_content), title="Assistant", border_style="green")` — throttled to ~100ms intervals
|
||||||
|
4. `on_done` callback: hides `StreamingStatic`, writes final content to `RichLog` via `DisplayAdapter`
|
||||||
|
|
||||||
|
Since the worker is async (not threaded), callbacks run on the event loop and can call widget methods directly.
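
The callback flow described above might look something like this simplified sketch. It is not the actual `StreamHandler.process_stream`: the chunk shape (dicts with optional `"reasoning"` and `"content"` keys) is an assumption made for illustration, and throttling and tool-call parsing are left out.

```python
import asyncio
from typing import AsyncIterator, Callable


async def process_stream(
    chunks: AsyncIterator[dict],
    on_content: Callable[[str], None],
    on_thinking: Callable[[], None],
    on_done: Callable[[], None],
) -> str:
    """Accumulate streamed content and notify the UI through callbacks."""
    accumulated = ""
    thinking_reported = False
    async for chunk in chunks:
        # Assumed chunk shape: {"reasoning": str} and/or {"content": str}.
        if chunk.get("reasoning") and not thinking_reported:
            thinking_reported = True
            on_thinking()  # fires once, on the first reasoning token
        piece = chunk.get("content")
        if piece:
            accumulated += piece
            on_content(accumulated)  # always the full text so far
    on_done()
    return accumulated
```

Since the callbacks are plain synchronous callables invoked from a coroutine on the event loop, the app could pass functions that update widgets directly.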
|
||||||
|
|
||||||
|
### Permission Prompts
|
||||||
|
|
||||||
|
1. Agent loop (in async worker) calls `await permissions.check(operation, details)`
|
||||||
|
2. `check()` creates an `asyncio.Event` and posts `PermissionRequest` message to the app
|
||||||
|
3. App handles `PermissionRequest`: pushes a modal screen with the question, approve/deny buttons
|
||||||
|
4. Modal screen: on button press, stores result and sets the event
|
||||||
|
5. `check()` awaits the event, reads result, returns approved/denied
|
||||||
|
6. Focus management: Input loses focus when modal appears, regains focus when modal dismisses
|
||||||
|
7. Default on dismiss/Ctrl+C: deny
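
The event-based handshake in the steps above can be sketched with plain `asyncio`. The names below are illustrative assumptions: the spec mentions a `PermissionRequest` message and an async `check()`, but the fields, the `resolve()` method, and the `post_request` callable (standing in for posting a message to the Textual app) are hypothetical.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PermissionRequest:
    """Hypothetical request object handed to the app for a modal prompt."""

    operation: str
    details: str
    approved: bool = False  # deny by default (covers dismiss / Ctrl+C)
    responded: asyncio.Event = field(default_factory=asyncio.Event)

    def resolve(self, approved: bool) -> None:
        """Called by the modal: store the result, then wake the waiter."""
        self.approved = approved
        self.responded.set()


class PermissionsService:
    def __init__(self, post_request: Callable[[PermissionRequest], None]) -> None:
        # post_request stands in for posting a message to the app.
        self._post_request = post_request

    async def check(self, operation: str, details: str) -> bool:
        request = PermissionRequest(operation, details)
        self._post_request(request)
        await request.responded.wait()  # suspends without blocking the UI
        return request.approved
```

Awaiting the event suspends only the agent worker, so the event loop stays free to render the modal and handle the button press that eventually calls `resolve()`.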
|
||||||
|
|
||||||
|
### Cancellation
|
||||||
|
|
||||||
|
- Ctrl+C (first press): cancels the running agent worker via `worker.cancel()`. The agent loop should check for cancellation between iterations.
|
||||||
|
- Ctrl+C (second press) or `/quit`: exits the app via `app.exit()`
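
As an illustration of first-press cancellation, the sketch below uses a bare `asyncio` task in place of a Textual worker. That substitution is an assumption made to keep the example self-contained; `agent_turn` and `demo` are hypothetical names. Cancelling the task raises `CancelledError` at the loop's next `await`, which gives the agent loop a natural cancellation point between iterations.

```python
import asyncio


async def agent_turn(max_iterations: int, log: list) -> None:
    """Stand-in for the agent loop; each iteration awaits at least once."""
    try:
        for i in range(1, max_iterations + 1):
            log.append(f"iteration {i}")
            await asyncio.sleep(0)  # stand-in for an LLM or tool await
    except asyncio.CancelledError:
        log.append("cancelled")  # clean-up hook, e.g. hide streaming widget
        raise  # re-raise so the task is marked as cancelled


async def demo() -> list:
    log: list = []
    task = asyncio.create_task(agent_turn(25, log))
    await asyncio.sleep(0)  # let the first iteration start
    task.cancel()  # what the first Ctrl+C would trigger
    try:
        await task
    except asyncio.CancelledError:
        pass  # the app keeps running; only the turn was cancelled
    return log
```

Re-raising `CancelledError` after clean-up matters: swallowing it would make the cancelled turn look like a normal completion.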
|
||||||
|
|
||||||
|
## Dependencies
|
||||||
|
|
||||||
|
- Add `textual>=4.0.0` to pyproject.toml dependencies
|
||||||
|
|
||||||
|
## Verification
|
||||||
|
|
||||||
|
1. Run the app — header shows app name + model, no console corruption
|
||||||
|
2. Type a prompt — user panel appears in scroll area, input clears
|
||||||
|
3. During LLM streaming — assistant response types out live (throttled) in the scroll area
|
||||||
|
4. Thinking indicator shows during reasoning-only phases
|
||||||
|
5. Tool calls appear as compact lines in the scroll area
|
||||||
|
6. Footer shows token usage and iteration count, updating each step
|
||||||
|
7. Scroll area auto-scrolls to bottom on new content
|
||||||
|
8. /quit, /clear, /history commands work from the input
|
||||||
|
9. Permission prompts show as modal, approve/deny work, focus returns to input
|
||||||
|
10. Ctrl+C cancels running agent turn without quitting
|
||||||
|
11. Worker errors display as error panels in the scroll area
|
||||||
|
12. Logging goes to file only — no console corruption
|
||||||
|
13. Session resume works on startup via modal dialog
|
||||||
@@ -1 +1,12 @@
|
|||||||
Pressing up should cycle history like claude code.
|
# UI Issues
|
||||||
|
On /clear, we need to reset the token counter in the header panel.
|
||||||
|
|
||||||
|
# Bugs
|
||||||
|
|
||||||
|
# Improvements
|
||||||
|
Add -p to the command-line args so that the agent can run the prompt and return data directly via STDOUT.
|
||||||
|
|
||||||
|
# Open questions:
|
||||||
|
How might we pass a directory to this app and have it use that directory as its workspace, so I don't have to copy files or do odd things to work in other directories?
|
||||||
|
|
||||||
|
How do we keep huge files from taking up so many tokens?
|
||||||
@@ -21,10 +21,10 @@ from app.utils.display import (
 
 
 class TestRenderFunctions:
-    def test_render_user_message_returns_panel(self) -> None:
+    def test_render_user_message_returns_text(self) -> None:
         result = render_user_message("hello")
-        assert isinstance(result, Panel)
-        assert result.title == "You"
+        assert isinstance(result, Text)
+        assert "hello" in result.plain
 
     def test_render_assistant_message_returns_panel(self) -> None:
         result = render_assistant_message("response")
@@ -72,7 +72,7 @@ class TestDisplayAdapter:
         adapter.write_user_message("hello")
         mock_log.write.assert_called_once()
         arg = mock_log.write.call_args[0][0]
-        assert isinstance(arg, Panel)
+        assert isinstance(arg, Text)
 
     def test_write_tool_call(self) -> None:
         mock_log = MagicMock()
314
tests/unit/test_file_cache.py
Normal file
@@ -0,0 +1,314 @@
+"""Tests for the file cache with LRU eviction and mtime invalidation."""
+
+import os
+import time
+from pathlib import Path
+
+import pytest
+
+from app.models.config import AppConfig, load_config
+from app.models.tool_call import ToolResultStatus
+from app.tools.filesystem import ReadFileTool, ReadManyFilesTool
+from app.utils.file_cache import CacheStats, FileCache, cached_read_file
+from app.utils.file_helpers import BinaryFileError, FileSizeError, PathSecurityError
+
+
+# ---------------------------------------------------------------------------
+# FileCache unit tests
+# ---------------------------------------------------------------------------
+
+
+class TestFileCache:
+    def test_put_and_get_roundtrip(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "hello.txt"
+        f.write_text("hello world")
+
+        cache.put(f, "hello world")
+        assert cache.get(f) == "hello world"
+
+    def test_get_returns_none_for_missing_key(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        assert cache.get(tmp_path / "nope.txt") is None
+
+    def test_mtime_change_causes_miss(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "data.txt"
+        f.write_text("v1")
+        cache.put(f, "v1")
+
+        # Mutate the file so mtime changes
+        time.sleep(0.05)  # ensure mtime differs
+        f.write_text("v2")
+
+        assert cache.get(f) is None  # stale → miss
+        assert cache.stats.invalidations == 1
+
+    def test_lru_eviction_at_capacity(self, tmp_path: Path) -> None:
+        cache = FileCache(max_entries=3)
+        files = []
+        for i in range(4):
+            f = tmp_path / f"f{i}.txt"
+            f.write_text(f"content-{i}")
+            files.append(f)
+
+        # Fill cache to capacity
+        for f in files[:3]:
+            cache.put(f, f.read_text())
+        assert len(cache) == 3
+
+        # Adding a 4th evicts the LRU (files[0])
+        cache.put(files[3], files[3].read_text())
+        assert len(cache) == 3
+        assert cache.get(files[0]) is None  # evicted
+        assert cache.stats.evictions == 1
+
+        # files[1..3] still present
+        for f in files[1:]:
+            assert cache.get(f) is not None
+
+    def test_invalidate_removes_entry(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "rm.txt"
+        f.write_text("bye")
+        cache.put(f, "bye")
+        assert len(cache) == 1
+
+        cache.invalidate(f)
+        assert len(cache) == 0
+        assert cache.get(f) is None
+        assert cache.stats.invalidations == 1
+
+    def test_invalidate_noop_for_missing(self) -> None:
+        cache = FileCache()
+        cache.invalidate(Path("/nonexistent"))
+        assert cache.stats.invalidations == 0
+
+    def test_clear_empties_cache(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        for i in range(5):
+            f = tmp_path / f"c{i}.txt"
+            f.write_text(str(i))
+            cache.put(f, str(i))
+        assert len(cache) == 5
+
+        cache.clear()
+        assert len(cache) == 0
+
+    def test_stats_accuracy(self, tmp_path: Path) -> None:
+        cache = FileCache(max_entries=2)
+        a = tmp_path / "a.txt"
+        b = tmp_path / "b.txt"
+        c = tmp_path / "c.txt"
+        a.write_text("a")
+        b.write_text("b")
+        c.write_text("c")
+
+        # Miss
+        cache.get(a)
+        assert cache.stats.misses == 1
+        assert cache.stats.hits == 0
+
+        # Put + hit
+        cache.put(a, "a")
+        cache.get(a)
+        assert cache.stats.hits == 1
+
+        # Fill + evict
+        cache.put(b, "b")
+        cache.put(c, "c")  # evicts a
+        assert cache.stats.evictions == 1
+
+    def test_hit_rate(self) -> None:
+        stats = CacheStats(hits=3, misses=1)
+        assert stats.hit_rate == pytest.approx(0.75)
+
+    def test_hit_rate_zero_total(self) -> None:
+        stats = CacheStats()
+        assert stats.hit_rate == 0.0
+
+    def test_file_deleted_after_caching(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "gone.txt"
+        f.write_text("here")
+        cache.put(f, "here")
+
+        f.unlink()
+        assert cache.get(f) is None  # stat fails → miss
+
+    def test_put_skips_when_stat_fails(self) -> None:
+        cache = FileCache()
+        cache.put(Path("/totally/nonexistent"), "data")
+        assert len(cache) == 0
+
+    def test_get_moves_to_end(self, tmp_path: Path) -> None:
+        """Accessing an entry makes it most-recently-used, protecting from eviction."""
+        cache = FileCache(max_entries=3)
+        files = []
+        for i in range(3):
+            f = tmp_path / f"lru{i}.txt"
+            f.write_text(f"c{i}")
+            files.append(f)
+            cache.put(f, f"c{i}")
+
+        # Touch files[0] to make it MRU
+        cache.get(files[0])
+
+        # Add a new entry — files[1] (LRU) should be evicted, not files[0]
+        extra = tmp_path / "extra.txt"
+        extra.write_text("x")
+        cache.put(extra, "x")
+
+        assert cache.get(files[0]) is not None  # protected by access
+        assert cache.get(files[1]) is None  # evicted
+
+
+# ---------------------------------------------------------------------------
+# cached_read_file tests
+# ---------------------------------------------------------------------------
+
+
+class TestCachedReadFile:
+    def test_without_cache_matches_safe_read(self, tmp_path: Path) -> None:
+        f = tmp_path / "plain.txt"
+        f.write_text("hello")
+        content = cached_read_file(f, tmp_path, cache=None)
+        assert content == "hello"
+
+    def test_populates_on_miss_returns_on_hit(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "cached.txt"
+        f.write_text("data")
+
+        # First call: miss → read from disk → populate cache
+        content1 = cached_read_file(f, tmp_path, cache=cache)
+        assert content1 == "data"
+        assert cache.stats.misses == 1
+        assert cache.stats.hits == 0
+
+        # Second call: hit → from cache
+        content2 = cached_read_file(f, tmp_path, cache=cache)
+        assert content2 == "data"
+        assert cache.stats.hits == 1
+
+    def test_security_checks_run_on_cached_path(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        with pytest.raises(PathSecurityError):
+            cached_read_file("/etc/passwd", tmp_path, cache=cache)
+
+    def test_binary_check_runs_on_cached_path(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "bin.dat"
+        f.write_bytes(b"\x00binary\x00")
+        with pytest.raises(BinaryFileError):
+            cached_read_file(f, tmp_path, cache=cache)
+
+    def test_size_check_runs_on_cached_path(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        f = tmp_path / "big.txt"
+        f.write_text("x" * 200)
+
+        # First read populates cache
+        cached_read_file(f, tmp_path, max_size_bytes=1000, cache=cache)
+
+        # Now make file too big on disk — security check should catch it
+        # even though content is cached
+        f.write_text("x" * 2000)
+        with pytest.raises(FileSizeError):
+            cached_read_file(f, tmp_path, max_size_bytes=1000, cache=cache)
+
+    def test_file_not_found(self, tmp_path: Path) -> None:
+        cache = FileCache()
+        with pytest.raises(FileNotFoundError):
+            cached_read_file(tmp_path / "nope.txt", tmp_path, cache=cache)
+
+
+# ---------------------------------------------------------------------------
+# Tool-level cache-hit dedup tests
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def config() -> AppConfig:
+    return load_config()
+
+
+@pytest.fixture
+def tmp_workspace(tmp_path: Path, config: AppConfig) -> tuple[Path, AppConfig]:
+    config.agent.workspace_root = tmp_path
+    return tmp_path, config
+
+
+class TestReadFileToolCacheHit:
+    def test_first_read_returns_full_content(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        cache = FileCache()
+        (ws / "hello.txt").write_text("hello world")
+
+        tool = ReadFileTool(ws, cfg, file_cache=cache)
+        result = tool.run("tc-1", {"file_path": "hello.txt"})
+        assert result.status == ToolResultStatus.SUCCESS
+        assert result.output == "hello world"
+
+    def test_second_read_returns_cached_message(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        cache = FileCache()
+        (ws / "hello.txt").write_text("hello world")
+
+        tool = ReadFileTool(ws, cfg, file_cache=cache)
+        tool.run("tc-1", {"file_path": "hello.txt"})
+
+        result2 = tool.run("tc-2", {"file_path": "hello.txt"})
+        assert result2.status == ToolResultStatus.SUCCESS
+        assert "[Cached]" in result2.output
+        assert "hello.txt" in result2.output
+        assert "hello world" not in result2.output
+
+    def test_changed_file_returns_full_content_again(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        cache = FileCache()
+        f = ws / "data.txt"
+        f.write_text("v1")
+
+        tool = ReadFileTool(ws, cfg, file_cache=cache)
+        tool.run("tc-1", {"file_path": "data.txt"})
+
+        # Mutate file so mtime changes
+        time.sleep(0.05)
+        f.write_text("v2")
+
+        result2 = tool.run("tc-2", {"file_path": "data.txt"})
+        assert result2.status == ToolResultStatus.SUCCESS
+        assert result2.output == "v2"
+        assert "[Cached]" not in result2.output
+
+    def test_no_cache_always_returns_content(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        (ws / "hello.txt").write_text("hello")
+
+        tool = ReadFileTool(ws, cfg, file_cache=None)
+        r1 = tool.run("tc-1", {"file_path": "hello.txt"})
+        r2 = tool.run("tc-2", {"file_path": "hello.txt"})
+        assert r1.output == "hello"
+        assert r2.output == "hello"
+
+
+class TestReadManyFilesToolCacheHit:
+    def test_cached_files_get_short_message(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        cache = FileCache()
+        (ws / "a.txt").write_text("alpha")
+        (ws / "b.txt").write_text("bravo")
+
+        tool = ReadManyFilesTool(ws, cfg, file_cache=cache)
+
+        # First read — full content
+        r1 = tool.run("tc-1", {"file_paths": ["a.txt", "b.txt"]})
+        assert "alpha" in r1.output
+        assert "bravo" in r1.output
+
+        # Second read — cached messages
+        r2 = tool.run("tc-2", {"file_paths": ["a.txt", "b.txt"]})
+        assert "[Cached]" in r2.output
+        assert "alpha" not in r2.output
+        assert "bravo" not in r2.output
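The tests above pin down the observable behaviour of `FileCache` and `CacheStats`, but `app/utils/file_cache.py` itself is not shown in this part of the diff. The following is a minimal sketch of a cache that would satisfy the `TestFileCache` cases, written from the tests alone and not taken from the actual implementation. Assumptions: the default `max_entries` of 128, the use of `st_mtime_ns` for staleness, and counting a stale entry as both an invalidation and a miss.

```python
from __future__ import annotations

import os
from collections import OrderedDict
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0
    evictions: int = 0
    invalidations: int = 0

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


class FileCache:
    """LRU cache of file contents, invalidated when the file's mtime changes."""

    def __init__(self, max_entries: int = 128) -> None:  # default is an assumption
        self._max_entries = max_entries
        # Maps path -> (mtime in ns at put time, content); order tracks recency.
        self._entries: OrderedDict[Path, tuple[int, str]] = OrderedDict()
        self.stats = CacheStats()

    def __len__(self) -> int:
        return len(self._entries)

    @staticmethod
    def _mtime_ns(path: Path) -> int | None:
        try:
            return os.stat(path).st_mtime_ns
        except OSError:
            return None  # missing or unreadable file

    def put(self, path: Path, content: str) -> None:
        mtime = self._mtime_ns(path)
        if mtime is None:
            return  # cannot stat the file, so do not cache it
        self._entries[path] = (mtime, content)
        self._entries.move_to_end(path)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # drop the least recently used
            self.stats.evictions += 1

    def get(self, path: Path) -> str | None:
        entry = self._entries.get(path)
        if entry is None:
            self.stats.misses += 1
            return None
        cached_mtime, content = entry
        if self._mtime_ns(path) != cached_mtime:
            # File changed or vanished: treat as stale (assumed to count as a miss too).
            del self._entries[path]
            self.stats.invalidations += 1
            self.stats.misses += 1
            return None
        self._entries.move_to_end(path)  # mark as most recently used
        self.stats.hits += 1
        return content

    def invalidate(self, path: Path) -> None:
        if self._entries.pop(path, None) is not None:
            self.stats.invalidations += 1

    def clear(self) -> None:
        self._entries.clear()
```

`OrderedDict` keeps the recency order cheaply: `move_to_end` on every hit and `popitem(last=False)` on overflow give LRU behaviour without a separate list. The tool-level `[Cached]` messages tested above would sit one layer higher, in the tools that call the cache.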
69
tests/unit/test_filesystem_read_many.py
Normal file
@@ -0,0 +1,69 @@
+"""Tests for the read_many_files tool."""
+
+from pathlib import Path
+
+import pytest
+
+from app.models.config import AppConfig, load_config
+from app.models.tool_call import ToolResultStatus
+from app.tools.filesystem import ReadManyFilesTool
+
+
+@pytest.fixture
+def config() -> AppConfig:
+    return load_config()
+
+
+@pytest.fixture
+def tmp_workspace(tmp_path: Path, config: AppConfig) -> tuple[Path, AppConfig]:
+    """Create a temporary workspace for read_many_files tests."""
+    config.agent.workspace_root = tmp_path
+    return tmp_path, config
+
+
+class TestReadManyFilesTool:
+    def test_read_multiple_files(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        (ws / "a.txt").write_text("alpha")
+        (ws / "b.txt").write_text("bravo")
+        tool = ReadManyFilesTool(ws, cfg)
+        result = tool.run("tc-1", {"file_paths": ["a.txt", "b.txt"]})
+        assert result.status == ToolResultStatus.SUCCESS
+        assert "=== a.txt ===" in result.output
+        assert "alpha" in result.output
+        assert "=== b.txt ===" in result.output
+        assert "bravo" in result.output
+
+    def test_partial_failure(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        (ws / "exists.txt").write_text("hello")
+        tool = ReadManyFilesTool(ws, cfg)
+        result = tool.run("tc-2", {"file_paths": ["exists.txt", "missing.txt"]})
+        assert result.status == ToolResultStatus.SUCCESS
+        assert "hello" in result.output
+        assert "[ERROR]" in result.output
+        assert "=== missing.txt ===" in result.output
+
+    def test_all_files_fail(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        tool = ReadManyFilesTool(ws, cfg)
+        result = tool.run("tc-3", {"file_paths": ["no1.txt", "no2.txt"]})
+        assert result.status == ToolResultStatus.ERROR
+        assert "All files failed" in (result.error or "")
+
+    def test_empty_file_paths(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        tool = ReadManyFilesTool(ws, cfg)
+        result = tool.run("tc-4", {"file_paths": []})
+        assert result.status == ToolResultStatus.ERROR
+        assert "empty" in (result.error or "").lower()
+
+    def test_path_security_inline_error(self, tmp_workspace: tuple[Path, AppConfig]) -> None:
+        ws, cfg = tmp_workspace
+        (ws / "safe.txt").write_text("ok")
+        tool = ReadManyFilesTool(ws, cfg)
+        result = tool.run("tc-5", {"file_paths": ["safe.txt", "../../etc/passwd"]})
+        assert result.status == ToolResultStatus.SUCCESS
+        assert "ok" in result.output
+        assert "[ERROR]" in result.output
+        assert "outside" in result.output.lower()
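The `read_many_files` tests above imply an output contract: one `=== path ===` section per requested file, per-file failures reported inline as `[ERROR]`, and an overall error only when the list is empty or every file fails. A rough sketch of that aggregation logic is shown below. It is not the real `ReadManyFilesTool`: the function name `read_many`, the `(ok, text)` return shape, and the exact messages are assumptions, and the binary and size checks the real tool presumably performs are left out.

```python
from pathlib import Path


def read_many(workspace: Path, file_paths: list[str]) -> tuple[bool, str]:
    """Return (ok, text); ok is False when the list is empty or every read fails."""
    if not file_paths:
        return False, "file_paths is empty"
    root = workspace.resolve()
    sections: list[str] = []
    failures = 0
    for rel in file_paths:
        target = (root / rel).resolve()
        try:
            # Reject anything that resolves outside the workspace root.
            if not target.is_relative_to(root):
                raise PermissionError("path is outside the workspace")
            body = target.read_text()
        except OSError as exc:
            # Report the failure inline and carry on with the other files.
            failures += 1
            body = f"[ERROR] {exc}"
        sections.append(f"=== {rel} ===\n{body}")
    if failures == len(file_paths):
        return False, "All files failed to read"
    return True, "\n\n".join(sections)
```

Reporting failures inline keeps one bad path from discarding the content of the files that did load, which matches the partial-failure and path-security cases in the tests.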
@@ -90,6 +90,6 @@ class TestRunCommandTool:
         # Create a file in the workspace to verify cwd
         (ws / "marker.txt").write_text("found")
         tool = RunCommandTool(ws, cfg)
-        result = tool.run("tc-9", {"command": "cat marker.txt"})
+        result = tool.run("tc-9", {"command": "head marker.txt"})
         assert result.status == ToolResultStatus.SUCCESS
         assert "found" in result.output
@@ -108,7 +108,7 @@ class TestToolRegistry:
         registry = create_default_registry(workspace, config)
         names = set(registry.get_all().keys())
         assert names == {
-            "read_file", "list_dir", "grep_files", "find_files",
+            "read_file", "read_many_files", "list_dir", "grep_files", "find_files",
             "write_file", "make_dir", "delete_file",
             "str_replace", "patch_apply",
             "run_command",
@@ -118,7 +118,7 @@ class TestToolRegistry:
     def test_schema_export(self, workspace: Path, config: AppConfig) -> None:
         registry = create_default_registry(workspace, config)
         schemas = registry.get_openai_tools_schema()
-        assert len(schemas) == 11
+        assert len(schemas) == 12
         assert all(s["type"] == "function" for s in schemas)
 
 