Phillip Tarrant 1eb2a52f17 feat(engine,ui): unify detection in rules engine, add function rules & per-script matches; improve scripts table UX
Core changes
- Centralize detection in the Rules Engine; browser.py now focuses on fetch/extract/persist.
- Add class-based adapters:
  - FactAdapter: converts snippets → structured facts.
  - FunctionRuleAdapter: wraps dict-based rule functions for engine input (str or dict).
- Register function rules (code-based) alongside YAML rules:
  - form_action_missing
  - form_http_on_https_page
  - form_submits_to_different_host
  - script_src_uses_data_or_blob
  - script_src_has_dangerous_extension
  - script_third_party_host
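
As a rough illustration of how a code-based rule and its adapter might fit together, the sketch below implements two of the rules named above over a facts dict. Only the rule names and the `base_url`/`base_hostname` keys come from this commit; the `action` key, the `(matched, reason)` return shape, and the adapter's `__call__` interface are assumptions made for the example.

```python
from urllib.parse import urljoin, urlparse


def form_http_on_https_page(facts):
    """Flag a form that posts over plain HTTP from an HTTPS page."""
    base_url = facts.get("base_url", "")
    action = facts.get("action", "")  # hypothetical key for the form action
    if not action or urlparse(base_url).scheme != "https":
        return False, None
    target = urlparse(urljoin(base_url, action))
    if target.scheme == "http":
        return True, "Form on HTTPS page submits over HTTP"
    return False, None


def form_submits_to_different_host(facts):
    """Flag a form whose action resolves to a host other than the page's."""
    base_url = facts.get("base_url", "")
    base_host = (facts.get("base_hostname") or "").lower()
    action = facts.get("action", "")
    if not action or not base_host:
        return False, None
    target_host = (urlparse(urljoin(base_url, action)).hostname or "").lower()
    if target_host and target_host != base_host:
        return True, f"Form submits to {target_host}"
    return False, None


class FunctionRuleAdapter:
    """Lets a dict-based rule function accept either a str or a dict."""

    def __init__(self, fn, key="text"):
        self.fn = fn
        self.key = key  # hypothetical: dict key used when given a str

    def __call__(self, value):
        # Wrap a bare string so the rule function always sees a dict.
        facts = value if isinstance(value, dict) else {self.key: value}
        return self.fn(facts)
```

Resolving the action with `urljoin` before comparing means relative actions such as `/login` inherit the page's scheme and host, so they do not trigger either rule.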

Rules & YAML
- Expand/normalize YAML rules with severities + tags; tighten patterns.
- Add new regex rules: new_function_usage, unescape_usage, string_timer_usage, long_hex_constants.
- Move iframe rule to `text` category.
- Keep existing script/form/text rules; all compile under IGNORECASE.
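
To make the new regex rules concrete, here is a minimal sketch of how rules with these names could be compiled under `re.IGNORECASE` and applied to script text. The patterns are guesses written for illustration, not the ones shipped in the project's YAML file, and `matching_rules` is a hypothetical helper.

```python
import re

# Hypothetical patterns; the project's real ones live in its YAML rules file.
PATTERNS = {
    "new_function_usage": r"\bnew\s+Function\s*\(",
    "unescape_usage": r"\bunescape\s*\(",
    "string_timer_usage": r"\bset(?:Timeout|Interval)\s*\(\s*['\"]",
    "long_hex_constants": r"\b0x[0-9a-f]{16,}\b",
}

# Every pattern is compiled case-insensitively, as the commit describes.
COMPILED = {name: re.compile(p, re.IGNORECASE) for name, p in PATTERNS.items()}


def matching_rules(text):
    """Return the sorted names of all rules whose pattern occurs in text."""
    return sorted(name for name, rx in COMPILED.items() if rx.search(text))
```

Case-insensitive compilation lets one pattern cover upper- and lower-case hex digits, at the cost of also matching identifiers that differ from the JavaScript built-ins only by case.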

Browser / analysis refactor
- browser.py:
  - Remove inline heuristics; rely on engine for PASS/FAIL, reason, severity, tags.
  - Build page-level overview (`rule_checks`) across categories.
  - Analyze forms: add `base_url` + `base_hostname` to snippet so function rules can evaluate; include per-form rule_checks.
  - Analyze scripts: **per-script evaluation**:
    - Inline -> run regex script rules on inline text.
    - External -> run function script rules with a facts dict (src/src_hostname/base_url/base_hostname).
    - Only include scripts that matched ≥1 rule; attach severity/tags to matches.
  - Persist single source of truth: `/data/<uuid>/results.json`.
  - Backward-compat: `fetch_page_artifacts(..., engine=...)` still accepts the `engine` kwarg but ignores it.
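
The per-script evaluation described above can be sketched roughly as follows. This assumes each script arrives as a dict with either a `src` or a `text` key, and it uses two stand-in rules in place of the real engine; the function body is an illustration, not the code in browser.py.

```python
import re
from urllib.parse import urlparse

# Hypothetical stand-ins for the engine's regex and function script rules.
REGEX_RULES = {"unescape_usage": re.compile(r"\bunescape\s*\(", re.IGNORECASE)}


def script_third_party_host(facts):
    """True when the script host differs from the page host."""
    src_host = facts.get("src_hostname")
    return bool(src_host) and src_host != facts.get("base_hostname")


FUNCTION_RULES = {"script_third_party_host": script_third_party_host}


def analyze_scripts(scripts, base_url):
    """Evaluate each script; keep only those that matched at least one rule."""
    base_hostname = urlparse(base_url).hostname
    results = []
    for script in scripts:
        src = script.get("src")
        if src:
            # External script: function rules see a facts dict.
            facts = {
                "src": src,
                "src_hostname": urlparse(src).hostname,
                "base_url": base_url,
                "base_hostname": base_hostname,
            }
            matches = [n for n, fn in FUNCTION_RULES.items() if fn(facts)]
        else:
            # Inline script: regex rules run on the script text.
            text = script.get("text", "")
            matches = [n for n, rx in REGEX_RULES.items() if rx.search(text)]
        if matches:
            results.append({"src": src, "matches": matches})
    return results
```

Dropping scripts with no matches at this stage is what keeps the Suspicious Scripts table limited to rows that have something to show.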

UI/UX
- Suspicious Scripts table now shows only matched scripts.
- Add severity badges and tag chips; tooltips show rule description.
- Prevent table blowouts:
  - Fixed layout + ellipsis + wrapping helpers (`.scripts-table`, `.breakable`, `details pre.code`).
  - Shortened inline snippet preview (configurable).
- Minor template niceties (e.g., rel="noopener" on external links where applicable).

Config
- Add `ui.snippet_preview_len` to settings.yaml; default 160.
- Load into `app.config["SNIPPET_PREVIEW_LEN"]` and use in `analyze_scripts`.
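
A minimal sketch of the config handling, assuming settings.yaml has already been parsed into a plain dict. The helper names are hypothetical; in the app the resulting value would be stored in `app.config["SNIPPET_PREVIEW_LEN"]`.

```python
DEFAULT_SNIPPET_PREVIEW_LEN = 160  # default stated in this commit


def snippet_preview_len(settings):
    """Read ui.snippet_preview_len from parsed settings, falling back to 160."""
    ui = settings.get("ui") or {}
    value = ui.get("snippet_preview_len", DEFAULT_SNIPPET_PREVIEW_LEN)
    try:
        value = int(value)
    except (TypeError, ValueError):
        return DEFAULT_SNIPPET_PREVIEW_LEN
    # Treat zero or negative lengths as misconfiguration.
    return value if value > 0 else DEFAULT_SNIPPET_PREVIEW_LEN


def preview(text, limit):
    """Shorten an inline snippet to at most `limit` characters."""
    if len(text) <= limit:
        return text
    return text[: max(limit - 1, 0)] + "…"
```

Falling back to the default on a missing, non-numeric, or non-positive value keeps a bad settings file from producing empty previews.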

Init / wiring
- Import and register function rules as `Rule(...)` objects (not dicts).
- Hook Rules Engine to Flask logger for verbose/diagnostic output.
- Log totals on startup; keep YAML path override via `SNEAKYSCOPE_RULES_FILE`.
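
The wiring above might look something like the sketch below. The `Rule` fields, the `RuleEngine` class, and the default rules path are all assumptions for illustration; only `add_rule()` taking `Rule` instances and the `SNEAKYSCOPE_RULES_FILE` variable come from this commit.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, List, Optional

DEFAULT_RULES_FILE = "config/suspicious_rules.yaml"  # hypothetical default path


@dataclass
class Rule:
    """Hypothetical minimal rule record; the project's Rule has its own fields."""
    name: str
    category: str
    function: Optional[Callable] = None
    severity: str = "low"
    tags: List[str] = field(default_factory=list)


class RuleEngine:
    def __init__(self):
        self.rules = []

    def add_rule(self, rule):
        # Reject plain dicts early instead of failing later during evaluation.
        if not isinstance(rule, Rule):
            raise TypeError(f"expected Rule, got {type(rule).__name__}")
        self.rules.append(rule)


def rules_file_path(environ=None):
    """Return the YAML rules path, honoring the SNEAKYSCOPE_RULES_FILE override."""
    environ = os.environ if environ is None else environ
    return environ.get("SNEAKYSCOPE_RULES_FILE") or DEFAULT_RULES_FILE
```

A type check in `add_rule()` is one way to surface a dict-versus-`Rule` mix-up at registration time with a clear message, rather than as a crash at boot.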

Bug fixes
- Fix boot crash: pass `Rule` instances to `engine.add_rule()` instead of dicts.
- Fix “N/A” in scripts table by actually computing per-script matches.
- Ensure form rules fire by including `base_url`/`base_hostname` in form snippets.

Roadmap
- Update roadmap to reflect completed items:
  - “Show each check and whether it triggered (pass/fail list per rule)”
  - Severity levels + tags in Suspicious Scripts
  - Results.json as route source of truth
  - Scripts table UX (badges, tooltips, layout fix)
2025-08-20 21:33:30 -05:00

URL Sandbox

A lightweight web-based sandbox for analyzing websites and domains.
It performs WHOIS lookups, GeoIP enrichment, script/form inspection, and provides analyst-friendly output.


🚀 Features

  • Domain & IP Enrichment
    • WHOIS lookups with fallback to raw text when fields are missing
    • Explicit handling of privacy-protected WHOIS records (shown as N/A or Possible Privacy)
    • GeoIP (City, Region, Country, Latitude/Longitude)
    • ASN, ISP, and network details
  • Flagged Content Analysis
    • Suspicious script detection
    • Suspicious form detection
    • Nested bullet-style reporting for clarity
  • Improved UX
    • Automatically adds http://, https://, and www. when only a bare domain is provided
    • Modal spinner to indicate background analysis (Analyzing website…)
  • Resilient GeoLite2 Database Management
    • Downloads the MaxMind GeoLite2-City database on first startup
    • Checks file age and only re-downloads if older than 14 days (configurable via environment variable)
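
The file-age check for the GeoLite2 database can be sketched as below. This is an assumed implementation based on the file's modification time; the function name is hypothetical, and `max_age_days` stands in for the value read from the environment variable, whose name is not given here.

```python
import os
import time

SECONDS_PER_DAY = 86400


def needs_download(db_path, max_age_days=14, now=None):
    """True if the database file is missing or older than max_age_days."""
    if not os.path.exists(db_path):
        return True
    now = time.time() if now is None else now
    # Age is measured from the file's last modification time.
    age_days = (now - os.path.getmtime(db_path)) / SECONDS_PER_DAY
    return age_days > max_age_days
```

Accepting `now` as a parameter makes the age logic easy to exercise without waiting two weeks or touching file timestamps.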

⚙️ Setup Instructions

1. Clone the Repository

git clone https://github.com/yourusername/url-sandbox.git
cd url-sandbox

2. Create a MaxMind Account & License Key

  1. Go to MaxMind GeoLite2
  2. Sign up for a free account
  3. Navigate to Account > Manage License Keys
  4. Generate a new license key

3. Configure Environment Variables

All environment variables are loaded from a .env file.

  1. Copy the sample file:
   cp .env.example .env
  2. Edit .env and set your values (see .env.example for available options).

Make sure to add your MaxMind License Key under MAXMIND_LICENSE_KEY.

4. Run with Docker Compose

docker-compose up --build

This will:

  • Build the app
  • Download the GeoLite2 database if not present or too old
  • Start the web interface

📝 Example Output

WHOIS Info

  • Registrar: MarkMonitor, Inc.
  • Organization: Possible Privacy
  • Creation: 1997-09-15
  • Expiration: 2028-09-14

GeoIP Info

  • IP: 172.66.159.20
    • City: N/A
    • Region: N/A
    • Country: United States
    • Coordinates: (37.751, -97.822)
    • ASN: 13335
    • ISP: Cloudflare, Inc.

📌 Roadmap

See Next Steps Checklist for planned features:

  • Improved UI templates
  • Artifact cleanup
  • Proxy support (optional)

Languages
Python 70.7%
HTML 18.8%
CSS 8.2%
Shell 1.7%
Dockerfile 0.6%