feat(engine,ui): unify detection in rules engine, add function rules & per-script matches; improve scripts table UX

Core changes
- Centralize detection in the Rules Engine; browser.py now focuses on fetch/extract/persist.
- Add class-based adapters:
  - FactAdapter: converts snippets → structured facts.
  - FunctionRuleAdapter: wraps dict-based rule functions for engine input (str or dict).
- Register function rules (code-based) alongside YAML rules:
  - form_action_missing
  - form_http_on_https_page
  - form_submits_to_different_host
  - script_src_uses_data_or_blob
  - script_src_has_dangerous_extension
  - script_third_party_host
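
A minimal sketch of how one function rule and its adapter could fit together. The `action` fact key, the `(matched, reason)` return shape, and the adapter's handling of plain strings are assumptions for illustration, not the shipped implementation:

```python
from typing import Any, Callable, Dict, Optional, Tuple, Union
from urllib.parse import urlparse

Facts = Dict[str, Any]
RuleResult = Tuple[bool, Optional[str]]


def form_http_on_https_page(facts: Facts) -> RuleResult:
    """Flag a form that posts over plain HTTP from a page served over HTTPS."""
    # "action" is an assumed key; "base_url" is named in the commit message.
    base_scheme = urlparse(facts.get("base_url") or "").scheme.lower()
    action_scheme = urlparse(facts.get("action") or "").scheme.lower()
    if base_scheme == "https" and action_scheme == "http":
        return True, "Form submits over HTTP from an HTTPS page"
    return False, None


class FunctionRuleAdapter:
    """Hypothetical adapter: lets a dict-based rule accept str or dict input."""

    def __init__(self, fn: Callable[[Facts], RuleResult]) -> None:
        self.fn = fn

    def __call__(self, data: Union[str, Facts]) -> RuleResult:
        if isinstance(data, dict):
            return self.fn(data)
        # Assumption: plain text carries no facts a function rule can use.
        return False, None
```

Wrapping the rule keeps the engine's call site uniform: it can hand over whatever it has, and only dict input reaches the rule function.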

Rules & YAML
- Expand/normalize YAML rules with severities + tags; tighten patterns.
- Add new regex rules: new_function_usage, unescape_usage, string_timer_usage, long_hex_constants.
- Move iframe rule to `text` category.
- Keep existing script/form/text rules; all compile under IGNORECASE.
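
As a rough illustration of the new regex rules, the sketch below compiles four patterns under `re.IGNORECASE`. The rule names come from this commit; the patterns themselves are guesses and the shipped YAML may differ:

```python
import re

# Hypothetical patterns for the rule names listed above.
RAW_RULES = {
    "new_function_usage": r"\bnew\s+Function\s*\(",
    "unescape_usage": r"\bunescape\s*\(",
    "string_timer_usage": r"\bset(?:Timeout|Interval)\s*\(\s*[\"']",
    "long_hex_constants": r"\b0x[0-9a-f]{16,}\b",
}

# Every pattern is compiled case-insensitively, as the commit describes.
COMPILED = {name: re.compile(pat, re.IGNORECASE) for name, pat in RAW_RULES.items()}


def matching_rules(text):
    """Return the names of all rules whose pattern occurs in the text."""
    return [name for name, rx in COMPILED.items() if rx.search(text)]
```

Because of `IGNORECASE`, the hex pattern only needs the lowercase range `a-f` to catch constants written in uppercase too.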

Browser / analysis refactor
- browser.py:
  - Remove inline heuristics; rely on engine for PASS/FAIL, reason, severity, tags.
  - Build page-level overview (`rule_checks`) across categories.
  - Analyze forms: add `base_url` + `base_hostname` to snippet so function rules can evaluate; include per-form rule_checks.
  - Analyze scripts: **per-script evaluation**:
    - Inline → run regex script rules on inline text.
    - External → run function script rules with a facts dict (src/src_hostname/base_url/base_hostname).
    - Only include scripts that matched ≥1 rule; attach severity/tags to matches.
  - Persist single source of truth: `/data/<uuid>/results.json`.
  - Backward-compat: the `engine=` kwarg of `fetch_page_artifacts(..., engine=...)` is still accepted but ignored.
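
The per-script flow above could look roughly like this. Scripts are plain dicts here to keep the sketch self-contained; the `eval_usage` rule, the input shape, and the result keys are assumptions, and severity/tags are omitted for brevity:

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical inline rule; real rules come from the YAML file.
INLINE_RULES = {"eval_usage": re.compile(r"\beval\s*\(", re.IGNORECASE)}


def script_third_party_host(facts):
    """True when the script is served from a different host than the page."""
    src_host = facts.get("src_hostname")
    base_host = facts.get("base_hostname")
    return bool(src_host and base_host and src_host != base_host)


FUNCTION_RULES = {"script_third_party_host": script_third_party_host}


def analyze_scripts(scripts, base_url):
    """Evaluate each script on its own and keep only those with matches."""
    base_hostname = urlparse(base_url).hostname
    results = []
    for script in scripts:
        src = script.get("src")
        if src:
            # External script: build the facts dict for function rules.
            full_src = urljoin(base_url, src)
            facts = {
                "src": full_src,
                "src_hostname": urlparse(full_src).hostname,
                "base_url": base_url,
                "base_hostname": base_hostname,
            }
            matches = [name for name, fn in FUNCTION_RULES.items() if fn(facts)]
        else:
            # Inline script: run the regex rules on the script text.
            text = script.get("text", "")
            matches = [name for name, rx in INLINE_RULES.items() if rx.search(text)]
        if matches:
            kind = "external" if src else "inline"
            results.append({"type": kind, "src": src, "matches": matches})
    return results
```

Relative `src` values are resolved against `base_url` first, so a same-site path such as `/static/app.js` is not mistaken for a third-party host.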

UI/UX
- Suspicious Scripts table now shows only matched scripts.
- Add severity badges and tag chips; tooltips show rule description.
- Prevent table blowouts:
  - Fixed layout + ellipsis + wrapping helpers (`.scripts-table`, `.breakable`, `details pre.code`).
  - Shortened inline snippet preview (configurable).
- Minor template niceties (e.g., rel="noopener" on external links where applicable).

Config
- Add `ui.snippet_preview_len` to settings.yaml; default 160.
- Load into `app.config["SNIPPET_PREVIEW_LEN"]` and use in `analyze_scripts`.
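
A small sketch of the config wiring, assuming the parsed `settings.yaml` arrives as a nested dict and that `config` behaves like Flask's `app.config` (a dict subclass). The helper names and the fallback on invalid values are illustrative only:

```python
DEFAULT_SNIPPET_PREVIEW_LEN = 160  # default stated in this commit


def load_snippet_preview_len(settings, config):
    """Copy ui.snippet_preview_len into config, falling back to the default."""
    ui = settings.get("ui") or {}
    try:
        value = int(ui.get("snippet_preview_len", DEFAULT_SNIPPET_PREVIEW_LEN))
    except (TypeError, ValueError):
        value = DEFAULT_SNIPPET_PREVIEW_LEN
    if value <= 0:
        # Assumption: a non-positive length is treated as a misconfiguration.
        value = DEFAULT_SNIPPET_PREVIEW_LEN
    config["SNIPPET_PREVIEW_LEN"] = value
    return value


def preview(text, limit):
    """Shorten an inline snippet for table display."""
    return text if len(text) <= limit else text[:limit] + "..."
```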

Init / wiring
- Import and register function rules as `Rule(...)` objects (not dicts).
- Hook Rules Engine to Flask logger for verbose/diagnostic output.
- Log totals on startup; keep YAML path override via `SNEAKYSCOPE_RULES_FILE`.
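
To illustrate the wiring, here is a sketch of registering function rules as `Rule` objects and resolving the rules file path. The `Rule` fields and the `RuleEngine` class shown here are stand-ins whose shape is assumed; only `add_rule()` and `SNEAKYSCOPE_RULES_FILE` are named in this commit:

```python
import os
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Stand-in for the engine's Rule class; field names are assumptions."""
    name: str
    category: str
    rule_type: str = "function"
    function: Optional[Callable] = None
    severity: str = "medium"
    tags: List[str] = field(default_factory=list)


class RuleEngine:
    """Stand-in engine that only accepts Rule instances."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        if not isinstance(rule, Rule):
            raise TypeError(f"expected Rule, got {type(rule).__name__}")
        self.rules.append(rule)


def rules_path(default: str) -> str:
    """Prefer the SNEAKYSCOPE_RULES_FILE override when it is set."""
    return os.environ.get("SNEAKYSCOPE_RULES_FILE", default)
```

Rejecting dicts at `add_rule()` turns the kind of mistake behind the boot crash into an immediate, readable error.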

Bug fixes
- Fix boot crash: pass `Rule` instances to `engine.add_rule()` instead of dicts.
- Fix “N/A” in scripts table by actually computing per-script matches.
- Ensure form rules fire by including `base_url`/`base_hostname` in form snippets.

Roadmap
- Update roadmap to reflect completed items:
  - “Show each check and whether it triggered (pass/fail list per rule)”
  - Severity levels + tags in Suspicious Scripts
  - Results.json as route source of truth
  - Scripts table UX (badges, tooltips, layout fix)
2025-08-20 21:33:30 -05:00
parent 70d29f9f95
commit 1eb2a52f17
14 changed files with 1108 additions and 423 deletions


@@ -0,0 +1,31 @@
# Feature Session Plan SneakyScope
**Feature:**
* \[Short description of the feature or improvement]
**Effort:**
* Easy / Medium / Hard
**Dependencies:**
* \[List of prerequisites or related tasks that must be done first]
**Design Notes:**
* \[Goals, considerations, analyst/UX needs, edge cases, pitfalls to avoid]
**Implementation Tasks:**
* [ ] Step 1
* [ ] Step 2
* [ ] Step 3
**Validation / Testing:**
* \[How we'll verify it works, e.g., test cases, UI check, API output, logs]
**Next Steps After Completion:**
* \[What this unblocks or enables, i.e. the next feature/dependency]

docs/README.md Normal file

@@ -0,0 +1,6 @@
# Roadmap and chats
## Vibecode? Brotha Ewww
No, I don't "vibe code". There is a huge difference between asking AI to do everything and asking it to "give me a boilerplate function" and tweaking from there. I've been coding for over 20 years; these fingers have typed enough. So yes, I use AI while I code.
Some of these little files in here are just helpful ways I've started using AI to help keep me on track with the project. Feel free to borrow.


@@ -1,71 +1,32 @@
# SneakyScope — Roadmap (Updated 8-20-25)
## Priority 1 Core Functionality / Stability
## Priority 1 Core Analysis / Stability
**Permissions / Storage Paths**
* Opt-in “fetch external scripts” mode (off by default): on submission, download external script content (size/time limits) and run rules on fetched content.
* Remove remaining legacy form “flagged\_reasons” plumbing once all equivalent function rules are in place.
* Unit tests: YAML compilation, function-rule adapters, and per-script/per-form rule cases.
* `/data` and other mounted volumes setup handled by `sandbox.sh`
* ✅ Downloads, screenshots, and HTML artifacts are written correctly (`safe_write` in `io_helpers.py`)
## Priority 2 API Layer
---
* API endpoints: `/screenshot`, `/source`, `/analyse`.
* OpenAPI spec: create `openapi/openapi.yaml` and serve at `/api/openapi.yaml`.
* Docs UI: Swagger UI or Redoc at `/docs`.
## Priority 2 Data Accuracy / Enrichment
## Priority 3 UI / UX
**WHOIS & GeoIP Enhancements**
* Front page/input handling: auto-prepend `http://`/`https://`/`www.` for bare domains.
* Source code viewer: embed page source in an editor view for readability.
* Scripts table: toggle between “Only suspicious” and “All scripts”.
* Rules Lab (WYSIWYG tester): paste a rule, validate/compile, run against sample text; lightweight nav entry.
* ✅ Implemented Python-based WHOIS parsing with fallback to raw WHOIS text
* ✅ Default `"Possible Privacy"` or `"N/A"` for missing WHOIS fields
* ✅ GeoIP + ASN + ISP info displayed per IP in **accordion tables**
* ✅ Cache WHOIS and GeoIP results to reduce repeated queries
## Priority 4 Artifact Management & Ops
**Suspicious Scripts & Forms**
* Retention/cleanup policy for old artifacts (age/size thresholds).
* Make periodic maintenance scripts for storage; cleanup options set in `settings.yaml`.
* Results caching UX: add “Re-run analysis” vs. “Load from cache” controls in the results UI.
* [ ] Expand flagged script and form output with reasons for analysts
* [ ] Show each check and if it triggered flags (pass/fail for each check)
## Priority 5 Extras / Integrations
**Add Suspicious BEC words**
* ✅ Look for things like `"reset password"`
* ✅ Make configurable via a config file (yaml doc with rules)
---
## Priority 3 User Interface / UX
**Front Page / Input Handling**
* [ ] Automatically prepend `http://`, `https://`, and/or `www.` if a user only enters a domain
**Result Templates / Cards**
* [ ] load sourcecode for webpage in a code editor view or code block on page so that it's easier to read
* [ ] Update result cards with clear, analyst-friendly explanations
* [ ] Include flagged logic and reason lists for scripts and forms
* ✅ Display GeoIP results in accordion tables (✅ done)
---
## Priority 4 API Layer
**API Endpoints**
* [ ] Add `/screenshot` endpoint
* [ ] Add `/source` endpoint
* [ ] Add `/analyse` endpoint
**OpenAPI + Docs**
* [ ] Create initial `openapi/openapi.yaml` spec file
* [ ] Serve spec at `/api/openapi.yaml`
* [ ] Wire up Swagger UI or Redoc at `/docs` for interactive API exploration
---
## Priority 5 Optional / Cleanup
**Artifact Management**
* [ ] Implement saving of results from a UUID as "results.json" so we don't rerun all the rules and just load from cache.
* [ ] Implement cleanup or retention policy for old artifacts
* [ ] Optional: Add periodic maintenance scripts for storage
**Extra Features**
* [ ] Placeholder for additional features (e.g., bulk URL analysis, alerting, integrations)
* Bulk URL analysis (batch/queue).
* Alerting & integrations (webhooks, Slack, email).
* Optional: analyst verdict tags and export (CSV/JSON).

docs/workflow.md Normal file

@@ -0,0 +1,12 @@
### 🛠 SneakyScope Feature Workflow
1. Pick feature from roadmap
2. Drop in **Feature Session Plan** template
3. Fill in description, effort, dependencies, design notes
4. Expand into tasks → implement code → test/validate
5. Update roadmap (remove/complete, reorder if needed)
---
This way, every session starts with the same rhythm, and we don't lose context between chats.