Phillip Tarrant 3a24b392f2 feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine

- API: add `POST /api/analyze_script` (app/blueprints/api.py)
  - Fetch one external script to artifacts, run rules, return findings + snippet
  - Uses new ExternalScriptFetcher (results_path aware) and job UUID
  - Returns: { ok, final_url, status_code, bytes, truncated, sha256, artifact_path, findings[], snippet, snippet_len }
  - TODO: document in openapi/openapi.yaml

- Fetcher: update `app/utils/external_fetch.py`
  - Constructed with `results_path` (UUID dir); writes to `<results_path>/scripts/fetched/<index>.js`
  - Loads settings via `get_settings()`, logs via std logging

- UI (results.html):
  - Move “Analyze external script” action into **Content Snippet** column for external rows
  - Clicking replaces button with `<details>` snippet, shows rule matches, and adds “open in viewer” link
  - Robust fetch handler (checks JSON, shows errors); builds viewer URL from absolute artifact path

- Viewer:
  - New route: `GET /view/artifact/<run_uuid>/<path:filename>` (app/blueprints/ui.py)
  - New template: Monaco-based read-only code viewer (viewer.html)
  - Removes SRI on loader to avoid integrity block; loads file via `raw_url` and detects language by extension

- Forms:
  - Refactor `analyze_forms` to mirror scripts analysis:
    - Uses rule engine (`category == "form"`) across regex/function rules
    - Emits rows only when matches exist
    - Includes `content_snippet`, `action`, `method`, `inputs`, `rules`
  - Replace legacy plumbing (`flagged`, `flag_reasons`, `status`) in output
  - Normalize form function rules to canonical returns `(bool, Optional[str])`:
    - `form_action_missing`
    - `form_http_on_https_page`
    - `form_submits_to_different_host`
    - Add minor hardening (lowercasing hosts, no-op actions, clearer reasons)

- CSS: add `.forms-table` to mirror `.scripts-table` (5 columns)
  - Fixed table layout, widths per column, chip/snippet styling, responsive tweaks

- Misc:
  - Fix “working outside app context” issue by avoiding `current_app` at import time (left storage logic inside routes)
  - Add “View Source” link to open page source in viewer

Refs:
- Roadmap: mark “Source code viewer” done; keep TODO to add `/api/analyze_script` to OpenAPI

2025-08-21 15:32:24 -05:00

app

feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine

2025-08-21 15:32:24 -05:00

docs

feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine

2025-08-21 15:32:24 -05:00

openapi

first commit

2025-08-20 21:22:28 +00:00

.env.example

first commit

2025-08-20 21:22:28 +00:00

.gitignore

first commit

2025-08-20 21:22:28 +00:00

docker-compose.yaml

first commit

2025-08-20 21:22:28 +00:00

Dockerfile

first commit

2025-08-20 21:22:28 +00:00

entrypoint.sh

first commit

2025-08-20 21:22:28 +00:00

Readme.md

updating readme

2025-08-21 08:58:05 -05:00

requirements.txt

first commit

2025-08-20 21:22:28 +00:00

sandbox.sh

first commit

2025-08-20 21:22:28 +00:00

Readme.md

SneakyScope

A lightweight web-based sandbox for analyzing websites and domains. SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified Rules Engine (YAML + function rules) against scripts, forms, and text. Results are saved per run at the time of analysis, so each run is a point-in-time record that does not change, and they are rendered with analyst-friendly tables, severity badges, and tags.

Repo: https://git.sneakygeek.net/ptarrant/SneakyScope
Status: Private (may become public later)

🚀 Features

Unified Detection (Rules Engine)

Regex rules from YAML + function rules in code for context-aware checks.
PASS/FAIL per rule with reason, severity (low|medium|high), and tags.
Per-script matches:
- Inline scripts → run regex rules on the code.
- External scripts → run function rules with structured facts (src, hostnames, etc.).
Page-level overview: complete PASS/FAIL tables by category (script, form, text).
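
As a rough illustration of the "structured facts" idea for external scripts, the sketch below derives a few facts from a script `src` and feeds them to a function-style check. The fact keys (`src_scheme`, `src_hostname`, `page_hostname`) and both function names are assumptions made for this example and are not taken from the project's code.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def build_script_facts(src: str, page_url: str) -> dict:
    """Turn a script src and the page URL into a flat facts dict (assumed shape)."""
    parsed_src = urlparse(src)
    return {
        "src": src,
        "src_scheme": parsed_src.scheme.lower(),
        "src_hostname": (parsed_src.hostname or "").lower(),
        "page_hostname": (urlparse(page_url).hostname or "").lower(),
    }


def src_uses_data_or_blob(facts: dict) -> Tuple[bool, Optional[str]]:
    """Match when the script is loaded from a data: or blob: URL."""
    scheme = facts["src_scheme"]
    if scheme in ("data", "blob"):
        return True, f"Script src uses {scheme}: URL"
    return False, None


facts = build_script_facts("data:text/javascript,alert(1)", "https://example.com/")
print(src_uses_data_or_blob(facts))
```

Keeping fact extraction separate from the check means several rules can share one parsed view of the script.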

Domain & IP Enrichment

WHOIS with robust fallbacks (N/A, Possible Privacy when fields are missing).
GeoIP, ASN, and ISP details.

Results & UX

Per-run artifacts under /data/<uuid>/:
- screenshot.png, source.txt, results.json (single source of truth).
Suspicious Scripts table shows only matched scripts with:
- Severity badges and tag chips (tooltip shows rule reason).
- Snippet preview length configurable via settings.yaml.

🧱 Architecture at a Glance

Flask app (Gunicorn in Docker)
Playwright for headless page fetch/render
BeautifulSoup4 for parsing
Rules Engine
- YAML regex rules (config/suspicious_rules.yaml)
- Function rules (app/rules/function_rules.py) registered on startup
Artifacts: persistent path mounted at /data (configurable)

⚙️ Setup

1) Clone

Since this repo is private, you’ll need credentials (HTTPS with a personal access token) or SSH access.

HTTPS (with token):

git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
cd SneakyScope

SSH:

git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
cd SneakyScope

2) Configure Environment

Copy and edit env:

cp .env.example .env

Important vars:

SECRET_KEY – Flask secret (set in production).
MAXMIND_LICENSE_KEY – for GeoIP (optional if you disable GeoIP).
SNEAKYSCOPE_RULES_FILE – override path to YAML rules (optional).

3) Settings

settings.yaml controls UI/behavior. Example:

app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160  # controls inline script snippet length in UI
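
One way a setting like `ui.snippet_preview_len` could be applied is sketched below. The helper name `make_preview`, the whitespace collapsing, and the trailing ellipsis are assumptions for illustration; the project may build previews differently.

```python
# Mirrors the relevant part of the settings.yaml example as a plain dict.
settings = {"ui": {"snippet_preview_len": 160}}


def make_preview(code: str, settings: dict) -> str:
    """Return a one-line preview no longer than the configured length."""
    limit = int(settings.get("ui", {}).get("snippet_preview_len", 160))
    # Collapse runs of whitespace so the preview fits on one table row.
    collapsed = " ".join(code.split())
    if len(collapsed) <= limit:
        return collapsed
    # Mark truncation with a single ellipsis character.
    return collapsed[:limit] + "…"


print(make_preview("function  a() {\n  return 1;\n}", settings))
```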

4) Run with Docker Compose

docker-compose up --build

This builds the image and starts the web app. Run artifacts are written to the /data directory in the container. Mount a host directory at that path in Compose so they persist between restarts.

🧪 Using SneakyScope

Open the web UI and submit a URL.
On completion you’ll see:
- URL Overview (with permalink to /results/<uuid>)
- Enrichment (WHOIS/GeoIP)
- Redirects
- Forms (inputs + per-form rule checks)
- Suspicious Scripts (only scripts that matched rules; badges/tags, snippet)
- Screenshot and Source

Artifacts for each run live under /data/<uuid>/:

results.json – complete structured result consumed by the UI.
source.txt, screenshot.png, and other files as added.
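
Because results.json is the single source of truth for a run, a saved run can be inspected outside the UI. The helper below is a minimal sketch; the function name `load_run` is hypothetical, and nothing is assumed about the keys inside the file.

```python
import json
from pathlib import Path


def load_run(data_root: str, run_uuid: str) -> dict:
    """Load results.json for one run stored under <data_root>/<run_uuid>/."""
    results_file = Path(data_root) / run_uuid / "results.json"
    if not results_file.is_file():
        raise FileNotFoundError(f"No results.json for run {run_uuid}")
    with results_file.open(encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `sorted(load_run("/data", run_uuid))` lists the top-level keys of a run, where `run_uuid` is the identifier from the /results/<uuid> permalink.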

📝 Rules

YAML (regex) Rules

config/suspicious_rules.yaml contains regex rules (compiled IGNORECASE). Example:

- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
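
To see how a rule like this behaves, the sketch below compiles the same pattern with `re.IGNORECASE` and runs it against two snippets. The dict layout and the `run_regex_rule` helper are assumptions for illustration; the loader in app/utils/rules_engine.py may represent rules differently.

```python
import re

# Hypothetical in-memory form of the eval_usage rule from the YAML example.
rule = {
    "name": "eval_usage",
    "category": "script",
    "type": "regex",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}

# Patterns are compiled case-insensitively, as stated above.
compiled = re.compile(rule["pattern"], re.IGNORECASE)


def run_regex_rule(text: str) -> dict:
    """Evaluate the rule against one piece of script text."""
    matched = compiled.search(text) is not None
    return {
        "rule": rule["name"],
        "matched": matched,
        "severity": rule["severity"] if matched else None,
        "tags": rule["tags"] if matched else [],
    }


print(run_regex_rule("var x = EVAL (payload);"))  # matches despite case and space
print(run_regex_rule("var medieval = 1;"))        # no match: \b blocks "medieval"
```

The `\b` anchor keeps identifiers that merely end in "eval" from matching, and `\s*` tolerates whitespace before the opening parenthesis.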

Function Rules (code)

Rules needing context (e.g., compare action host to page host) live in:

app/rules/function_rules.py:
- FactAdapter – converts snippets to structured facts.
- FunctionRuleAdapter – lets dict-expecting rules run from engine inputs.
- Implementations like:
  - form_action_missing
  - form_http_on_https_page
  - form_submits_to_different_host
  - script_src_uses_data_or_blob
  - script_src_has_dangerous_extension
  - script_third_party_host

They’re registered at startup in app/__init__.py alongside YAML rules.
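
A function rule that compares the form action host to the page host might look roughly like the sketch below. It follows the `(bool, Optional[str])` return convention mentioned in the latest commit message, but the function name, the `facts` keys (`base_url`, `action`), and the reason text are assumptions and not the project's actual implementation.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def form_submits_to_different_host_sketch(facts: dict) -> Tuple[bool, Optional[str]]:
    """Return (matched, reason) for a form described by an assumed facts dict."""
    action = (facts.get("action") or "").strip()
    base_host = (urlparse(facts.get("base_url") or "").hostname or "").lower()

    # Empty or fragment-only actions submit back to the page itself.
    if not action or action.startswith("#"):
        return False, None

    action_host = (urlparse(action).hostname or "").lower()
    # Relative actions carry no host, so they stay on the page's host.
    if not action_host or not base_host:
        return False, None

    if action_host != base_host:
        return True, f"Form submits to {action_host}, page is on {base_host}"
    return False, None


print(form_submits_to_different_host_sketch(
    {"base_url": "https://example.com/login", "action": "https://evil.test/post"}
))
```

Returning a reason only on a match keeps the PASS case cheap and gives the UI a ready-made tooltip string for the FAIL case.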

🔧 Configuration Tips

Snippet length: tweak ui.snippet_preview_len in settings.yaml (default 160).
Rules file override: set SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml.
Artifacts path: by default /data in the container (mount via Compose).
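
The rules file override can be pictured as a simple environment lookup with a fallback. This is a sketch only: the helper name `resolve_rules_file` and the relative default path are assumptions based on the paths mentioned in this README.

```python
import os
from pathlib import Path
from typing import Mapping

# Assumed default, taken from the path given in the Rules section.
DEFAULT_RULES_FILE = Path("config/suspicious_rules.yaml")


def resolve_rules_file(environ: Mapping[str, str] = os.environ) -> Path:
    """Prefer SNEAKYSCOPE_RULES_FILE when set and non-empty, else the default."""
    override = environ.get("SNEAKYSCOPE_RULES_FILE", "").strip()
    return Path(override) if override else DEFAULT_RULES_FILE


print(resolve_rules_file({"SNEAKYSCOPE_RULES_FILE": "/path/to/your.yaml"}))
```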

📂 Project Structure (high-level)

app/
  __init__.py                 # Flask app factory (loads YAML + function rules)
  browser.py                  # fetch + analysis orchestrator (writes results.json)
  routes.py                   # web views
  rules/
    function_rules.py         # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py           # engine + Rule class + YAML loader
    io_helpers.py             # safe_write, etc.
    settings.py               # get_settings()
  templates/                  # Jinja2 templates
  static/                     # CSS/JS
  config/
    suspicious_rules.yaml     # regex rules
docs/
  roadmap.md                  # ongoing plan and priorities

🧭 Roadmap (short version)

Full details: docs/roadmap.md

Core Analysis / Stability
- Opt-in fetch external scripts (size/time limits) and evaluate fetched content.
- Remove remaining legacy form “flagged_reasons” once function rules cover them.
- Unit tests: YAML compilation, adapters, per-artifact rule cases.
API Layer
- Endpoints: /screenshot, /source, /analyse
- OpenAPI at /api/openapi.yaml; docs at /docs (Swagger/Redoc)
UI / UX
- Auto-prepend http(s):// and www. for bare domains
- Source viewer (embedded editor)
- Scripts table toggle: “Only suspicious” / “All scripts”
- Rules Lab (WYSIWYG tester) for rapid rule validation
Artifact Management & Ops
- Retention/cleanup policy (age/size)
- Periodic maintenance scripts (configurable in settings.yaml)
- Results caching UX (re-run vs. load from cache)
Extras / Integrations
- Bulk URL analysis
- Alerting/webhooks (Slack/email)
- Analyst verdict tags + export (CSV/JSON)

🤝 Contributing

This repository is currently private on a self-hosted git server.

Internal contributors: use feature branches and open merge requests on https://git.sneakygeek.net/ptarrant/SneakyScope.
If/when the repo is made public, we’ll welcome issues and PRs from the community.

⚠️ Disclaimer

SneakyScope is intended for defensive security analysis and educational use. Only analyze content you are authorized to test.

Languages: Python 70.7%, HTML 18.8%, CSS 8.2%, Shell 1.7%, Dockerfile 0.6%
