Phillip Tarrant 3a24b392f2 feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine
- API: add `POST /api/analyze_script` (app/blueprints/api.py)
  - Fetch one external script to artifacts, run rules, return findings + snippet
  - Uses new ExternalScriptFetcher (results_path aware) and job UUID
  - Returns: { ok, final_url, status_code, bytes, truncated, sha256, artifact_path, findings[], snippet, snippet_len }
  - TODO: document in openapi/openapi.yaml

- Fetcher: update `app/utils/external_fetch.py`
  - Constructed with `results_path` (UUID dir); writes to `<results_path>/scripts/fetched/<index>.js`
  - Loads settings via `get_settings()`, logs via std logging

- UI (results.html):
  - Move “Analyze external script” action into **Content Snippet** column for external rows
  - Clicking replaces button with `<details>` snippet, shows rule matches, and adds “open in viewer” link
  - Robust fetch handler (checks JSON, shows errors); builds viewer URL from absolute artifact path

- Viewer:
  - New route: `GET /view/artifact/<run_uuid>/<path:filename>` (app/blueprints/ui.py)
  - New template: Monaco-based read-only code viewer (viewer.html)
  - Removes SRI on loader to avoid integrity block; loads file via `raw_url` and detects language by extension

- Forms:
  - Refactor `analyze_forms` to mirror scripts analysis:
    - Uses rule engine (`category == "form"`) across regex/function rules
    - Emits rows only when matches exist
    - Includes `content_snippet`, `action`, `method`, `inputs`, `rules`
  - Replace legacy plumbing (`flagged`, `flag_reasons`, `status`) in output
  - Normalize form function rules to canonical returns `(bool, Optional[str])`:
    - `form_action_missing`
    - `form_http_on_https_page`
    - `form_submits_to_different_host`
    - Add minor hardening (lowercasing hosts, no-op actions, clearer reasons)

- CSS: add `.forms-table` to mirror `.scripts-table` (5 columns)
  - Fixed table layout, widths per column, chip/snippet styling, responsive tweaks

- Misc:
  - Fix “working outside app context” issue by avoiding `current_app` at import time (left storage logic inside routes)
  - Add “View Source” link to open page source in viewer

Refs:
- Roadmap: mark “Source code viewer” done; keep TODO to add `/api/analyze_script` to OpenAPI
2025-08-21 15:32:24 -05:00

SneakyScope

A lightweight web-based sandbox for analyzing websites and domains. SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified Rules Engine (YAML + function rules) against scripts, forms, and text. Results are saved per run at analysis time, so each run is a point-in-time record that does not change, and they are rendered with analyst-friendly tables, severity badges, and tags.

Repo: https://git.sneakygeek.net/ptarrant/SneakyScope
Status: Private (may become public later)


🚀 Features

Unified Detection (Rules Engine)

  • Regex rules from YAML + function rules in code for context-aware checks.

  • PASS/FAIL per rule with reason, severity (low|medium|high), and tags.

  • Per-script matches:

    • Inline scripts → run regex rules on the code.
    • External scripts → run function rules with structured facts (src, hostnames, etc.).
  • Page-level overview: complete PASS/FAIL tables by category (script, form, text).
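
As a rough illustration of the per-rule results and the page-level overview described above, the sketch below groups rule outcomes into PASS/FAIL buckets per category. The `RuleResult` class, its field names, and `overview_by_category` are hypothetical and are not taken from the SneakyScope code base.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RuleResult:
    # Hypothetical record shape: field names are assumptions, not the project's schema.
    name: str
    category: str   # "script", "form", or "text"
    passed: bool    # True = PASS, False = FAIL
    reason: str
    severity: str   # "low", "medium", or "high"
    tags: List[str] = field(default_factory=list)


def overview_by_category(results: List[RuleResult]) -> Dict[str, Dict[str, List[str]]]:
    """Group rule names into PASS/FAIL buckets, one entry per category."""
    table: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: {"PASS": [], "FAIL": []})
    for result in results:
        bucket = "PASS" if result.passed else "FAIL"
        table[result.category][bucket].append(result.name)
    return dict(table)


demo_results = [
    RuleResult("eval_usage", "script", False, "eval( found in inline script", "high", ["obfuscation"]),
    RuleResult("form_action_missing", "form", True, "action attribute present", "low"),
]
overview = overview_by_category(demo_results)
```

A template could then iterate over `overview` to render one PASS/FAIL table per category.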

Domain & IP Enrichment

  • WHOIS with robust fallbacks (N/A, Possible Privacy when fields are missing).
  • GeoIP, ASN, and ISP details.
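
One way to read the WHOIS fallback behavior is sketched below. It assumes that "N/A" is used when the lookup returns nothing at all and "Possible Privacy" when an individual field is absent from an otherwise valid record; that split, the field list, and the function name are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical field list; the real enrichment code may track different fields.
WHOIS_FIELDS = ("registrar", "creation_date", "registrant_name")


def normalize_whois(raw: Optional[Dict[str, Any]]) -> Dict[str, str]:
    """Return display-ready WHOIS values without raising on missing data."""
    if not raw:
        # Assumption: no record at all maps every field to "N/A".
        return {name: "N/A" for name in WHOIS_FIELDS}
    normalized: Dict[str, str] = {}
    for name in WHOIS_FIELDS:
        value = raw.get(name)
        if value in (None, "", []):
            # Assumption: an empty field in a valid record hints at privacy redaction.
            normalized[name] = "Possible Privacy"
        else:
            normalized[name] = str(value)
    return normalized
```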

Results & UX

  • Per-run artifacts under /data/<uuid>/:

    • screenshot.png, source.txt, results.json (single source of truth).
  • Suspicious Scripts table shows only matched scripts with:

    • Severity badges and tag chips (tooltip shows rule reason).
    • Snippet preview length configurable via settings.yaml.

🧱 Architecture at a Glance

  • Flask app (Gunicorn in Docker)

  • Playwright for headless page fetch/render

  • BeautifulSoup4 for parsing

  • Rules Engine

    • YAML regex rules (config/suspicious_rules.yaml)
    • Function rules (app/rules/function_rules.py) registered on startup
  • Artifacts: persistent path mounted at /data (configurable)


⚙️ Setup

1) Clone

Since this repo is private, you'll need credentials (HTTPS with a personal access token) or SSH access.

HTTPS (with token):

git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
cd SneakyScope

SSH:

git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
cd SneakyScope

2) Configure Environment

Copy and edit env:

cp .env.example .env

Important vars:

  • SECRET_KEY: Flask secret (set in production).
  • MAXMIND_LICENSE_KEY: for GeoIP (optional if you disable GeoIP).
  • SNEAKYSCOPE_RULES_FILE: override path to YAML rules (optional).

3) Settings

settings.yaml controls UI/behavior. Example:

app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160  # controls inline script snippet length in UI
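
To show how a value like ui.snippet_preview_len might be applied, here is a minimal sketch. It takes an already-parsed settings dict (standing in for whatever `get_settings()` returns) and truncates a script snippet; the function name, the whitespace collapsing, and the "..." suffix are assumptions.

```python
from typing import Any, Dict

DEFAULT_PREVIEW_LEN = 160  # matches the example value in settings.yaml


def snippet_preview(code: str, settings: Dict[str, Any]) -> str:
    """Collapse whitespace and cut the snippet to the configured preview length."""
    raw = (settings.get("ui") or {}).get("snippet_preview_len", DEFAULT_PREVIEW_LEN)
    # Fall back to the default if the setting is missing, non-integer, or not positive.
    limit = raw if isinstance(raw, int) and raw > 0 else DEFAULT_PREVIEW_LEN
    collapsed = " ".join(code.split())
    if len(collapsed) <= limit:
        return collapsed
    return collapsed[:limit] + "..."
```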

4) Run with Docker Compose

docker-compose up --build

This builds the image and starts the web app. Run artifacts are written to /data inside the container; mount a host directory there in Compose so they persist between restarts.


🧪 Using SneakyScope

  1. Open the web UI and submit a URL.

  2. On completion you'll see:

    • URL Overview (with permalink to /results/<uuid>)
    • Enrichment (WHOIS/GeoIP)
    • Redirects
    • Forms (inputs + per-form rule checks)
    • Suspicious Scripts (only scripts that matched rules; badges/tags, snippet)
    • Screenshot and Source

Artifacts for each run live under /data/<uuid>/:

  • results.json: complete structured result consumed by the UI.
  • source.txt, screenshot.png, and other files as added.
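
Because results.json is the single source of truth for a run, a script outside the UI can read it directly. The sketch below assumes only the <data_root>/<uuid>/results.json layout described above; the function name is hypothetical, and the UUID check is an added precaution rather than something the README specifies.

```python
import json
from pathlib import Path
from typing import Any, Dict
from uuid import UUID


def load_run_results(data_root: Path, run_uuid: str) -> Dict[str, Any]:
    """Load <data_root>/<uuid>/results.json for a single run."""
    # UUID() raises ValueError for anything that is not a UUID, which also
    # rejects path tricks such as "../"; str() yields the canonical form.
    safe_id = str(UUID(run_uuid))
    path = data_root / safe_id / "results.json"
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `load_run_results(Path("/data"), "<uuid>")` would return the parsed result for that run, with `<uuid>` replaced by a real run ID.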

📝 Rules

YAML (regex) Rules

config/suspicious_rules.yaml contains regex rules (compiled IGNORECASE). Example:

- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
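
To make the IGNORECASE behavior concrete, the sketch below mirrors the eval_usage rule as a Python dict, compiles its pattern with `re.IGNORECASE`, and applies it to a snippet. The dict stands in for the parsed YAML entry, and `match_rule` and its return shape are hypothetical, not the engine's actual interface.

```python
import re
from typing import Any, Dict, Optional

# The YAML rule above, written as the dict a YAML loader would produce.
rule: Dict[str, Any] = {
    "name": "eval_usage",
    "description": "Use of eval() in script",
    "category": "script",
    "type": "regex",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}

compiled = re.compile(rule["pattern"], re.IGNORECASE)


def match_rule(text: str) -> Optional[Dict[str, Any]]:
    """Return a small finding dict if the rule matches, otherwise None."""
    hit = compiled.search(text)
    if hit is None:
        return None
    return {
        "rule": rule["name"],
        "severity": rule["severity"],
        "tags": rule["tags"],
        "matched": hit.group(0),
    }
```

The `\b` word boundary keeps identifiers such as `medieval(` from matching, while IGNORECASE catches variants like `EVAL (`.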

Function Rules (code)

Rules needing context (e.g., compare action host to page host) live in:

  • app/rules/function_rules.py:

    • FactAdapter converts snippets to structured facts.

    • FunctionRuleAdapter lets dict-expecting rules run from engine inputs.

    • Implementations like:

      • form_action_missing
      • form_http_on_https_page
      • form_submits_to_different_host
      • script_src_uses_data_or_blob
      • script_src_has_dangerous_extension
      • script_third_party_host

They're registered at startup in app/__init__.py alongside YAML rules.
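
A function rule of this kind could look roughly like the sketch below, which returns the `(bool, Optional[str])` shape mentioned in the latest commit. It is an illustrative reimplementation, not the project's code: the function name, the fact keys `action` and `base_url`, and the reason text are assumptions.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlparse


def sketch_form_submits_to_different_host(facts: Dict[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (matched, reason); reason is None when the rule does not match."""
    action = (facts.get("action") or "").strip()
    page_host = (urlparse(facts.get("base_url") or "").hostname or "").lower()

    # No-op actions never leave the page, so they cannot match.
    if not action or action == "#" or action.lower().startswith("javascript:"):
        return False, None

    action_host = (urlparse(action).hostname or "").lower()
    if not action_host:
        # A relative action resolves against the page's own host.
        return False, None

    if action_host != page_host:
        return True, f"Form submits to {action_host}, page is on {page_host}"
    return False, None
```

Returning a reason only on a match keeps the output small and gives the UI a ready-made tooltip string.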


🔧 Configuration Tips

  • Snippet length: tweak ui.snippet_preview_len in settings.yaml (default 160).
  • Rules file override: set SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml.
  • Artifacts path: by default /data in the container (mount via Compose).
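
The rules file override could be resolved along the lines of the sketch below. The environment variable name comes from this README; the helper name and the use of config/suspicious_rules.yaml as the fallback are assumptions about how the app picks its default.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Assumed default, taken from the path quoted in the Rules section.
DEFAULT_RULES_FILE = Path("config/suspicious_rules.yaml")


def resolve_rules_file(env: Optional[Mapping[str, str]] = None) -> Path:
    """Prefer SNEAKYSCOPE_RULES_FILE when set and non-empty, else the default."""
    env = os.environ if env is None else env
    override = (env.get("SNEAKYSCOPE_RULES_FILE") or "").strip()
    return Path(override) if override else DEFAULT_RULES_FILE
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.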

📂 Project Structure (high-level)

app/
  __init__.py                 # Flask app factory (loads YAML + function rules)
  browser.py                  # fetch + analysis orchestrator (writes results.json)
  routes.py                   # web views
  rules/
    function_rules.py         # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py           # engine + Rule class + YAML loader
    io_helpers.py             # safe_write, etc.
    settings.py               # get_settings()
  templates/                  # Jinja2 templates
  static/                     # CSS/JS
  config/
    suspicious_rules.yaml     # regex rules
docs/
  roadmap.md                  # ongoing plan and priorities

🧭 Roadmap (short version)

Full details: docs/roadmap.md

  • Core Analysis / Stability

    • Opt-in fetch external scripts (size/time limits) and evaluate fetched content.
    • Remove remaining legacy form “flagged_reasons” once function rules cover them.
    • Unit tests: YAML compilation, adapters, per-artifact rule cases.
  • API Layer

    • Endpoints: /screenshot, /source, /analyse
    • OpenAPI at /api/openapi.yaml; docs at /docs (Swagger/Redoc)
  • UI / UX

    • Auto-prepend http(s):// and/or www. for bare domains
    • Source viewer (embedded editor)
    • Scripts table toggle: “Only suspicious” / “All scripts”
    • Rules Lab (WYSIWYG tester) for rapid rule validation
  • Artifact Management & Ops

    • Retention/cleanup policy (age/size)
    • Periodic maintenance scripts (configurable in settings.yaml)
    • Results caching UX (re-run vs. load from cache)
  • Extras / Integrations

    • Bulk URL analysis
    • Alerting/webhooks (Slack/email)
    • Analyst verdict tags + export (CSV/JSON)

🤝 Contributing

This repository is currently private on a self-hosted git server.

  • Internal contributors: use feature branches and open merge requests on https://git.sneakygeek.net/ptarrant/SneakyScope.
  • If/when the repo is made public, we'll welcome issues and PRs from the community.

⚠️ Disclaimer

SneakyScope is intended for defensive security analysis and educational use. Only analyze content you are authorized to test.

