Phillip Tarrant 693f7d67b9 feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload
- Add SSL/TLS intelligence pipeline:
  - crt.sh lookup with expired-filtering and root-domain wildcard resolution
  - live TLS version/cipher probe with weak/legacy flags and probe notes
- UI: card + matrix rendering, raw JSON toggle, and host/wildcard cert lists
- Front page: checkbox to optionally fetch certificate/CT data

- Introduce `URLNormalizer` with punycode support and typo repair
  - Auto-prepend `https://` for bare domains (e.g., `google.com`)
  - Optional quick HTTPS reachability + `http://` fallback
- Provide singleton via function-cached `@singleton_loader`:
  - `get_url_normalizer()` reads defaults from Settings (if present)
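
A minimal sketch of the bare-domain handling described above, not the actual `URLNormalizer`. The function name `normalize_url` and its `default_scheme` parameter are hypothetical; typo repair and the reachability probe are left out.

```python
from urllib.parse import urlsplit


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Prepend a scheme to bare domains and punycode-encode the host.

    Hypothetical helper for illustration only. Userinfo (user:pass@) is
    dropped, and no typo repair or reachability check is attempted.
    """
    text = raw.strip()
    if "://" not in text:
        # Bare domain such as "google.com": assume HTTPS by default.
        text = f"{default_scheme}://{text}"
    parts = urlsplit(text)
    host = parts.hostname or ""
    # Internationalized hostnames become their ASCII (punycode) form.
    ascii_host = host.encode("idna").decode("ascii") if host else host
    netloc = f"{ascii_host}:{parts.port}" if parts.port else ascii_host
    return parts._replace(netloc=netloc).geturl()


print(normalize_url("google.com"))
print(normalize_url("bücher.de"))
```

An `http://` fallback, as mentioned above, would sit on top of this: try the HTTPS form first and only rewrite the scheme if the quick probe fails.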

- Standardize function-rule return shape to `(bool, dict|None)` across
  `form_*` and `script_*` rules; include structured payloads (`note`, hosts, ext, etc.)
- Harden `FunctionRuleAdapter`:
  - Coerce legacy returns `(bool)`, `(bool, str)` → normalized outputs
  - Adapt non-dict inputs to facts (category-aware and via provided adapter)
  - Return `(True, dict)` on match, `(False, None)` on miss
  - Bind-time logging with file:line + function id for diagnostics
- `RuleEngine`:
  - Back rules by private `self._rules`; `rules` property returns copy
  - Idempotent `add_rule(replace=False)` with in-place replace and regex (re)compile
  - Fix AttributeError from property assignment during `__init__`
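
The legacy-return coercion could look roughly like the sketch below. `coerce_rule_result` is a hypothetical name, and wrapping a legacy string under a `note` key is an assumption based on the payload fields listed above, not the adapter's confirmed behavior.

```python
from typing import Any, Optional, Tuple

Result = Tuple[bool, Optional[dict]]


def coerce_rule_result(raw: Any) -> Result:
    """Normalize legacy return shapes to (bool, dict | None).

    Hypothetical helper: accepts a bare bool, (bool,), (bool, str) or
    (bool, dict) and returns (True, dict) on match, (False, None) on miss.
    """
    payload: Any = None
    if isinstance(raw, tuple):
        matched = bool(raw[0]) if raw else False
        payload = raw[1] if len(raw) > 1 else None
    else:
        matched = bool(raw)

    if not matched:
        return (False, None)
    if isinstance(payload, dict):
        return (True, payload)
    if payload is None:
        return (True, {})
    # Assumption: legacy string reasons are kept under a "note" key.
    return (True, {"note": str(payload)})
```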

- Replace hidden singleton factory with explicit builder + global state:
  - `app/rules/factory.py::build_rules_engine()` builds and logs totals
  - `app/state.py` exposes `set_rules_engine()` / `get_rules_engine()` as the single source of truth
  - `app/wsgi.py` builds once at preload and publishes via `set_rules_engine()`
- Add lightweight debug hooks (`SS_DEBUG_RULES=1`) to trace engine id and rule counts

- Unify logging wiring:
  - `wire_logging_once(app)` clears and attaches a single handler chain
  - Create two named loggers: `sneakyscope.app` and `sneakyscope.engine`
  - Disable propagation to prevent dupes; include pid/logger name in format
- Remove stray/duplicate handlers and import-time logging
- Optional dedup filter for bursty repeats (kept off by default)
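
For illustration, a single-handler wiring along these lines can be built with the standard `logging` module. This is a sketch, not the code in `app/logging_setup.py`: the `app` argument is accepted but unused here, and the format string and the `_WIRED` guard are assumptions.

```python
import logging

_WIRED = False  # module-level guard so repeated calls are no-ops


def wire_logging_once(app=None, level: int = logging.INFO) -> None:
    """Attach one shared handler to the two named loggers, exactly once."""
    global _WIRED
    if _WIRED:
        return
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"
        )
    )
    for name in ("sneakyscope.app", "sneakyscope.engine"):
        logger = logging.getLogger(name)
        logger.handlers.clear()    # drop stray or duplicate handlers
        logger.addHandler(handler)
        logger.setLevel(level)
        logger.propagate = False   # avoid duplicates via the root logger
    _WIRED = True
```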

- Gunicorn: enable `--preload` in entrypoint to avoid thread races and double registration
- Documented foreground vs background log “double consumer” caveat (attach vs `compose logs`)

- Jinja: replace `{% return %}` with structured `if/elif/else` branches
- Add toggle button to show raw JSON for TLS/CT section

- Consumers should import the rules engine via:
  - `from app.state import get_rules_engine`
- Use `build_rules_engine()` **only** during preload/init to construct the instance,
  then publish with `set_rules_engine()`. Do not call old singleton factories.
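
A process-global holder of this kind might be as small as the sketch below. Only the two function names come from this commit; the lock and the choice to raise `RuntimeError` when nothing has been published are assumptions.

```python
import threading
from typing import Any, Optional

_lock = threading.Lock()
_engine: Optional[Any] = None


def set_rules_engine(engine: Any) -> None:
    """Publish the engine built at preload time for the whole process."""
    global _engine
    with _lock:
        _engine = engine


def get_rules_engine() -> Any:
    """Return the published engine; fail loudly if preload never ran."""
    with _lock:
        if _engine is None:
            # Assumption: the real module may return None instead.
            raise RuntimeError("rules engine has not been published yet")
        return _engine
```

With Gunicorn's `--preload`, the build and `set_rules_engine()` call happen once in the master process before workers fork, so every worker starts with the same published instance.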

- New/changed modules (high level):
  - `app/utils/urltools.py` (+) — URLNormalizer + `get_url_normalizer()`
  - `app/rules/function_rules.py` (±) — normalized payload returns
  - `engine/function_rule_adapter.py` (±) — coercion, fact adaptation, bind logs
  - `app/utils/rules_engine.py` (±) — `_rules`, idempotent `add_rule`, fixes
  - `app/rules/factory.py` (±) — pure builder; totals logged post-registration
  - `app/state.py` (+) — process-global rules engine
  - `app/logging_setup.py` (±) — single chain, two named loggers
  - `app/wsgi.py` (±) — preload build + `set_rules_engine()`
  - `entrypoint.sh` (±) — add `--preload`
  - templates (±) — TLS card, raw toggle; front-page checkbox

Closes: flaky rule-type warnings, duplicate logs, and multi-worker race on rules init.
2025-08-21 22:05:16 -05:00

SneakyScope

A lightweight web-based sandbox for analyzing websites and domains. SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified Rules Engine (YAML + function rules) against scripts, forms, and text. Results are saved per run at the time of analysis, giving you a point-in-time record that does not change, and are rendered with analyst-friendly tables, severity badges, and tags.

Repo: https://git.sneakygeek.net/ptarrant/SneakyScope
Status: Private (may become public later)


🚀 Features

Unified Detection (Rules Engine)

  • Regex rules from YAML + function rules in code for context-aware checks.

  • PASS/FAIL per rule with reason, severity (low|medium|high), and tags.

  • Per-script matches:

    • Inline scripts → run regex rules on the code.
    • External scripts → run function rules with structured facts (src, hostnames, etc.).
  • Page-level overview: complete PASS/FAIL tables by category (script, form, text).

Domain & IP Enrichment

  • WHOIS with robust fallbacks (N/A, Possible Privacy when fields are missing).
  • GeoIP, ASN, and ISP details.

Results & UX

  • Per-run artifacts under /data/<uuid>/:

    • screenshot.png, source.txt, results.json (single source of truth).
  • Suspicious Scripts table shows only matched scripts with:

    • Severity badges and tag chips (tooltip shows rule reason).
    • Snippet preview length configurable via settings.yaml.

🧱 Architecture at a Glance

  • Flask app (Gunicorn in Docker)

  • Playwright for headless page fetch/render

  • BeautifulSoup4 for parsing

  • Rules Engine

    • YAML regex rules (config/suspicious_rules.yaml)
    • Function rules (app/rules/function_rules.py) registered on startup
  • Artifacts: persistent path mounted at /data (configurable)


⚙️ Setup

1) Clone

Since this repo is private, you'll need credentials (HTTPS with a personal access token) or SSH access.

HTTPS (with token):

git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
cd SneakyScope

SSH:

git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
cd SneakyScope

2) Configure Environment

Copy and edit env:

cp .env.example .env

Important vars:

  • SECRET_KEY: Flask secret (set in production).
  • MAXMIND_LICENSE_KEY: for GeoIP (optional if you disable GeoIP).
  • SNEAKYSCOPE_RULES_FILE: override path to YAML rules (optional).

3) Settings

settings.yaml controls UI/behavior. Example:

app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160  # controls inline script snippet length in UI
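
As a rough illustration of what `snippet_preview_len` controls, a preview helper might truncate inline script text as below. `preview_snippet` is a hypothetical name, and collapsing whitespace before truncating is an assumption, not necessarily what the UI does.

```python
def preview_snippet(code: str, max_len: int = 160) -> str:
    """Return a one-line preview of a script, capped at max_len characters."""
    collapsed = " ".join(code.split())  # fold newlines and runs of spaces
    if len(collapsed) <= max_len:
        return collapsed
    return collapsed[:max_len].rstrip() + "..."
```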

4) Run with Docker Compose

docker-compose up --build

This builds the image and starts the web app. The /data directory in the container is where run artifacts are written—mount a host directory in Compose to persist between restarts.


🧪 Using SneakyScope

  1. Open the web UI and submit a URL.

  2. On completion you'll see:

    • URL Overview (with permalink to /results/<uuid>)
    • Enrichment (WHOIS/GeoIP)
    • Redirects
    • Forms (inputs + per-form rule checks)
    • Suspicious Scripts (only scripts that matched rules; badges/tags, snippet)
    • Screenshot and Source

Artifacts for each run live under /data/<uuid>/:

  • results.json: complete structured result consumed by the UI.
  • source.txt, screenshot.png, and other files as added.
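
Because each run is a plain directory, a saved result can be read back with the standard library. The sketch below assumes only the /data/<uuid>/results.json layout described above; `load_run_results` is a hypothetical helper and makes no assumption about the keys inside the file.

```python
import json
from pathlib import Path


def load_run_results(run_id: str, data_root: str = "/data") -> dict:
    """Load the structured result for one run from <data_root>/<run_id>/."""
    path = Path(data_root) / run_id / "results.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```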

📝 Rules

YAML (regex) Rules

config/suspicious_rules.yaml contains regex rules (compiled IGNORECASE). Example:

- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
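
To see how a rule like this behaves, the sketch below compiles the same pattern with `re.IGNORECASE` and searches a script body. The dict mirrors the YAML entry by hand; `rule_matches` is a hypothetical helper, not the engine's API.

```python
import re

# Hand-written mirror of the eval_usage YAML entry above.
rule = {
    "name": "eval_usage",
    "category": "script",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}

compiled = re.compile(rule["pattern"], re.IGNORECASE)


def rule_matches(text: str) -> bool:
    """Return True when the compiled pattern is found anywhere in text."""
    return compiled.search(text) is not None


print(rule_matches("window.EVAL (payload)"))  # case-insensitive match
print(rule_matches("medieval(castle)"))       # \b prevents a mid-word hit
```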

Function Rules (code)

Rules needing context (e.g., compare action host to page host) live in:

  • app/rules/function_rules.py:

    • FactAdapter converts snippets to structured facts.

    • FunctionRuleAdapter lets dict-expecting rules run from engine inputs.

    • Implementations like:

      • form_action_missing
      • form_http_on_https_page
      • form_submits_to_different_host
      • script_src_uses_data_or_blob
      • script_src_has_dangerous_extension
      • script_third_party_host

They're registered at startup in app/__init__.py alongside YAML rules.
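
A function rule of this kind might look like the sketch below. It is not the code in app/rules/function_rules.py: the fact keys `action` and `base_url` are hypothetical, and the `(bool, dict | None)` return shape is assumed.

```python
from typing import Optional, Tuple
from urllib.parse import urljoin, urlsplit


def form_submits_to_different_host(facts: dict) -> Tuple[bool, Optional[dict]]:
    """Match when a form's action resolves to a host other than the page's."""
    action = (facts.get("action") or "").strip()   # hypothetical fact key
    base_url = facts.get("base_url") or ""         # hypothetical fact key
    if not action:
        return (False, None)

    # Resolve relative actions ("/submit") against the page URL first.
    target = urlsplit(urljoin(base_url, action))
    page_host = (urlsplit(base_url).hostname or "").lower()
    action_host = (target.hostname or "").lower()

    if page_host and action_host and action_host != page_host:
        return (True, {
            "note": f"form posts to {action_host}, page is {page_host}",
            "hosts": [page_host, action_host],
        })
    return (False, None)
```

Resolving the action with `urljoin` before comparing hosts keeps relative and same-origin actions from being flagged.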


🔧 Configuration Tips

  • Snippet length: tweak ui.snippet_preview_len in settings.yaml (default 160).
  • Rules file override: set SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml.
  • Artifacts path: by default /data in the container (mount via Compose).

📂 Project Structure (high-level)

app/
  __init__.py                 # Flask app factory (loads YAML + function rules)
  browser.py                  # fetch + analysis orchestrator (writes results.json)
  routes.py                   # web views
  rules/
    function_rules.py         # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py           # engine + Rule class + YAML loader
    io_helpers.py             # safe_write, etc.
    settings.py               # get_settings()
  templates/                  # Jinja2 templates
  static/                     # CSS/JS
  config/
    suspicious_rules.yaml     # regex rules
docs/
  roadmap.md                  # ongoing plan and priorities

🧭 Roadmap (short version)

Full details: docs/roadmap.md

  • Core Analysis / Stability

    • Opt-in fetch external scripts (size/time limits) and evaluate fetched content.
    • Remove remaining legacy form “flagged_reasons” once function rules cover them.
    • Unit tests: YAML compilation, adapters, per-artifact rule cases.
  • API Layer

    • Endpoints: /screenshot, /source, /analyse
    • OpenAPI at /api/openapi.yaml; docs at /docs (Swagger/Redoc)
  • UI / UX

    • Auto-prepend http(s):// or www. for bare domains
    • Source viewer (embedded editor)
    • Scripts table toggle: “Only suspicious” / “All scripts”
    • Rules Lab (WYSIWYG tester) for rapid rule validation
  • Artifact Management & Ops

    • Retention/cleanup policy (age/size)
    • Periodic maintenance scripts (configurable in settings.yaml)
    • Results caching UX (re-run vs. load from cache)
  • Extras / Integrations

    • Bulk URL analysis
    • Alerting/webhooks (Slack/email)
    • Analyst verdict tags + export (CSV/JSON)

🤝 Contributing

This repository is currently private on a self-hosted git server.

  • Internal contributors: use feature branches and open merge requests on https://git.sneakygeek.net/ptarrant/SneakyScope.
  • If/when the repo is made public, we'll welcome issues and PRs from the community.

⚠️ Disclaimer

SneakyScope is intended for defensive security analysis and educational use. Only analyze content you are authorized to test.


Languages
Python 70.7%
HTML 18.8%
CSS 8.2%
Shell 1.7%
Dockerfile 0.6%