
Phillip Tarrant 693f7d67b9 feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload

- Add SSL/TLS intelligence pipeline:
  - crt.sh lookup with expired-filtering and root-domain wildcard resolution
  - live TLS version/cipher probe with weak/legacy flags and probe notes
- UI: card + matrix rendering, raw JSON toggle, and host/wildcard cert lists
- Front page: checkbox to optionally fetch certificate/CT data

- Introduce `URLNormalizer` with punycode support and typo repair
  - Auto-prepend `https://` for bare domains (e.g., `google.com`)
  - Optional quick HTTPS reachability + `http://` fallback
- Provide singleton via function-cached `@singleton_loader`:
  - `get_url_normalizer()` reads defaults from Settings (if present)
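
A minimal sketch of the bare-domain and punycode handling described above. The function name `normalize_url` and its parameters are assumptions for illustration; this is not the actual `URLNormalizer` code, and it covers neither typo repair nor the reachability fallback.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Hypothetical helper: add a scheme to bare domains and punycode the host."""
    candidate = raw.strip()
    # A bare domain such as "google.com" carries no scheme, so prepend one.
    if "://" not in candidate:
        candidate = f"{default_scheme}://{candidate}"
    parts = urlsplit(candidate)
    host = parts.hostname or ""  # urlsplit lowercases the hostname
    # Encode internationalized hostnames with the stdlib IDNA codec.
    ascii_host = host.encode("idna").decode("ascii")
    # Note: any userinfo (user:pass@) is dropped in this sketch.
    netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )
```

With this sketch, `normalize_url("google.com")` yields `https://google.com`, and an explicit `http://` scheme is left in place.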

- Standardize function-rule return shape to `(bool, dict|None)` across
  `form_*` and `script_*` rules; include structured payloads (`note`, hosts, ext, etc.)
- Harden `FunctionRuleAdapter`:
  - Coerce legacy returns `(bool)`, `(bool, str)` → normalized outputs
  - Adapt non-dict inputs to facts (category-aware and via provided adapter)
  - Return `(True, dict)` on match, `(False, None)` on miss
  - Bind-time logging with file:line + function id for diagnostics
- `RuleEngine`:
  - Back rules by private `self._rules`; `rules` property returns copy
  - Idempotent `add_rule(replace=False)` with in-place replace and regex (re)compile
  - Fix AttributeError from property assignment during `__init__`
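
The legacy-return coercion listed above could look roughly like this. `coerce_rule_result` is a hypothetical name, and storing a legacy string under `note` is an assumption based on the payload keys mentioned in this message, not the adapter's actual code.

```python
from typing import Any, Optional, Tuple

Outcome = Tuple[bool, Optional[dict]]


def coerce_rule_result(result: Any) -> Outcome:
    """Hypothetical coercion of legacy rule returns into (bool, dict | None)."""
    if isinstance(result, tuple) and len(result) == 2:
        matched, payload = bool(result[0]), result[1]
    else:
        # Legacy rules that returned a bare bool.
        matched, payload = bool(result), None
    if not matched:
        return (False, None)  # a miss never carries a payload
    if payload is None:
        return (True, {})
    if isinstance(payload, dict):
        return (True, payload)
    # Legacy (bool, str) shape: keep the string as a note (assumed key name).
    return (True, {"note": str(payload)})
```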

- Replace hidden singleton factory with explicit builder + global state:
  - `app/rules/factory.py::build_rules_engine()` builds and logs totals
  - `app/state.py` exposes `set_rules_engine()` / `get_rules_engine()` as the single source of truth
  - `app/wsgi.py` builds once at preload and publishes via `set_rules_engine()`
- Add lightweight debug hooks (`SS_DEBUG_RULES=1`) to trace engine id and rule counts
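
A sketch of what a process-global holder like `app/state.py` might contain. Only the two function names come from this commit; raising when the engine is unset is an assumption (the real module may return `None` instead).

```python
from typing import Optional

# Module-level slot; populated once at preload (assumed layout).
_rules_engine: Optional[object] = None


def set_rules_engine(engine: object) -> None:
    """Publish the engine built during preload."""
    global _rules_engine
    _rules_engine = engine


def get_rules_engine() -> object:
    """Return the published engine; fail loudly if preload never ran."""
    if _rules_engine is None:
        raise RuntimeError("rules engine not set; call set_rules_engine() at preload")
    return _rules_engine
```

Because the slot is filled before Gunicorn forks workers (with `--preload`), each worker inherits the same already-built engine rather than racing to construct its own.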

- Unify logging wiring:
  - `wire_logging_once(app)` clears and attaches a single handler chain
  - Create two named loggers: `sneakyscope.app` and `sneakyscope.engine`
  - Disable propagation to prevent dupes; include pid/logger name in format
- Remove stray/duplicate handlers and import-time logging
- Optional dedup filter for bursty repeats (kept off by default)
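
The single-handler-chain idea can be sketched as follows. `wire_loggers_once` is a hypothetical stand-in that omits the Flask `app` argument taken by the real `wire_logging_once(app)`; the format string is likewise an assumption.

```python
import logging

_WIRED = False  # guard so repeated calls do not stack handlers


def wire_loggers_once(level: int = logging.INFO) -> None:
    """Attach one shared handler to the two named loggers, exactly once."""
    global _WIRED
    if _WIRED:
        return
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"
        )
    )
    for name in ("sneakyscope.app", "sneakyscope.engine"):
        logger = logging.getLogger(name)
        logger.handlers.clear()   # drop stray or duplicate handlers
        logger.addHandler(handler)
        logger.setLevel(level)
        logger.propagate = False  # keep the root logger from re-emitting
    _WIRED = True
```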

- Gunicorn: enable `--preload` in entrypoint to avoid thread races and double registration
- Documented foreground vs background log “double consumer” caveat (attach vs `compose logs`)

- Jinja: replace `{% return %}` with structured `if/elif/else` branches
- Add toggle button to show raw JSON for TLS/CT section

- Consumers should import the rules engine via:
  - `from app.state import get_rules_engine`
- Use `build_rules_engine()` **only** during preload/init to construct the instance,
  then publish with `set_rules_engine()`. Do not call old singleton factories.

- New/changed modules (high level):
  - `app/utils/urltools.py` (+) — URLNormalizer + `get_url_normalizer()`
  - `app/rules/function_rules.py` (±) — normalized payload returns
  - `engine/function_rule_adapter.py` (±) — coercion, fact adaptation, bind logs
  - `app/utils/rules_engine.py` (±) — `_rules`, idempotent `add_rule`, fixes
  - `app/rules/factory.py` (±) — pure builder; totals logged post-registration
  - `app/state.py` (+) — process-global rules engine
  - `app/logging_setup.py` (±) — single chain, two named loggers
  - `app/wsgi.py` (±) — preload build + `set_rules_engine()`
  - `entrypoint.sh` (±) — add `--preload`
  - templates (±) — TLS card, raw toggle; front-page checkbox

Closes: flaky rule-type warnings, duplicate logs, and multi-worker race on rules init.

2025-08-21 22:05:16 -05:00

app

feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload

2025-08-21 22:05:16 -05:00

docs

feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine

2025-08-21 15:32:24 -05:00

openapi

first commit

2025-08-20 21:22:28 +00:00

.env.example

first commit

2025-08-20 21:22:28 +00:00

.gitignore

first commit

2025-08-20 21:22:28 +00:00

docker-compose.yaml

first commit

2025-08-20 21:22:28 +00:00

Dockerfile

first commit

2025-08-20 21:22:28 +00:00

entrypoint.sh

feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload

2025-08-21 22:05:16 -05:00

Readme.md

updating readme

2025-08-21 08:58:05 -05:00

requirements.txt

feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload

2025-08-21 22:05:16 -05:00

sandbox.sh

first commit

2025-08-20 21:22:28 +00:00

Readme.md

SneakyScope

A lightweight web-based sandbox for analyzing websites and domains. SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified Rules Engine (YAML + function rules) against scripts, forms, and text. Results are saved per run at the time of analysis, giving you a point-in-time record that doesn't change, and are rendered with analyst-friendly tables, severity badges, and tags.

Repo: https://git.sneakygeek.net/ptarrant/SneakyScope

Status: Private (may become public later)

🚀 Features

Unified Detection (Rules Engine)

Regex rules from YAML + function rules in code for context-aware checks.
PASS/FAIL per rule with reason, severity (low|medium|high), and tags.
Per-script matches:
- Inline scripts → run regex rules on the code.
- External scripts → run function rules with structured facts (src, hostnames, etc.).
Page-level overview: complete PASS/FAIL tables by category (script, form, text).

Domain & IP Enrichment

WHOIS with robust fallbacks (N/A, Possible Privacy when fields are missing).
GeoIP, ASN, and ISP details.

Results & UX

Per-run artifacts under /data/<uuid>/:
- screenshot.png, source.txt, results.json (single source of truth).
Suspicious Scripts table shows only matched scripts with:
- Severity badges and tag chips (tooltip shows rule reason).
- Snippet preview length configurable via settings.yaml.

🧱 Architecture at a Glance

Flask app (Gunicorn in Docker)
Playwright for headless page fetch/render
BeautifulSoup4 for parsing
Rules Engine
- YAML regex rules (config/suspicious_rules.yaml)
- Function rules (app/rules/function_rules.py) registered on startup
Artifacts: persistent path mounted at /data (configurable)

⚙️ Setup

1) Clone

Since this repo is private, you’ll need credentials (HTTPS with a personal access token) or SSH access.

HTTPS (with token):

git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
cd SneakyScope

SSH:

git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
cd SneakyScope

2) Configure Environment

Copy and edit env:

cp .env.example .env

Important vars:

SECRET_KEY – Flask secret (set in production).
MAXMIND_LICENSE_KEY – for GeoIP (optional if you disable GeoIP).
SNEAKYSCOPE_RULES_FILE – override path to YAML rules (optional).

3) Settings

settings.yaml controls UI/behavior. Example:

app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160  # controls inline script snippet length in UI

4) Run with Docker Compose

docker-compose up --build

This builds the image and starts the web app. The /data directory in the container is where run artifacts are written—mount a host directory in Compose to persist between restarts.

🧪 Using SneakyScope

Open the web UI and submit a URL.
On completion you’ll see:
- URL Overview (with permalink to /results/<uuid>)
- Enrichment (WHOIS/GeoIP)
- Redirects
- Forms (inputs + per-form rule checks)
- Suspicious Scripts (only scripts that matched rules; badges/tags, snippet)
- Screenshot and Source

Artifacts for each run live under /data/<uuid>/:

results.json – complete structured result consumed by the UI.
source.txt, screenshot.png, and other files as added.
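
As a rough illustration, a saved run could be read back like this. `load_run_results` and its `data_root` parameter are hypothetical names; only the `/data/<uuid>/results.json` layout comes from this README.

```python
import json
import uuid
from pathlib import Path


def load_run_results(run_id: str, data_root: Path = Path("/data")) -> dict:
    """Load results.json for one run (hypothetical helper)."""
    # Parsing as a UUID rejects values such as "../etc" before touching disk.
    safe_id = str(uuid.UUID(run_id))
    path = data_root / safe_id / "results.json"
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)
```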

📝 Rules

YAML (regex) Rules

config/suspicious_rules.yaml contains regex rules (compiled IGNORECASE). Example:

- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
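
To show how such a rule behaves once compiled, here is a small sketch using the example above as a plain dict. The `run_regex_rule` helper and its payload keys are assumptions, not the engine's actual API.

```python
import re

# The YAML example above, written as the dict a loader might produce.
rule = {
    "name": "eval_usage",
    "description": "Use of eval() in script",
    "category": "script",
    "type": "regex",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}

# Rules are compiled case-insensitively, as noted above.
compiled = re.compile(rule["pattern"], re.IGNORECASE)


def run_regex_rule(text: str) -> tuple:
    """Return (True, payload) on match, (False, None) on miss (assumed shape)."""
    match = compiled.search(text)
    if match is None:
        return (False, None)
    return (True, {"rule": rule["name"], "severity": rule["severity"],
                   "matched": match.group(0)})
```

The leading `\b` keeps the pattern from firing on identifiers that merely end in "eval", such as `medieval(`.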

Function Rules (code)

Rules needing context (e.g., compare action host to page host) live in:

app/rules/function_rules.py:
- FactAdapter – converts snippets to structured facts.
- FunctionRuleAdapter – lets dict-expecting rules run from engine inputs.
- Implementations like:
  - form_action_missing
  - form_http_on_https_page
  - form_submits_to_different_host
  - script_src_uses_data_or_blob
  - script_src_has_dangerous_extension
  - script_third_party_host

They’re registered at startup in app/__init__.py alongside YAML rules.
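
A function rule of this kind might be sketched as below. The rule name matches one listed above, but the body and the fact keys (`action`, `base_hostname`) are assumptions for illustration rather than the project's implementation.

```python
from urllib.parse import urlsplit


def form_submits_to_different_host(facts: dict) -> tuple:
    """Flag forms whose action points at a host other than the page's host."""
    action = (facts.get("action") or "").strip()
    page_host = (facts.get("base_hostname") or "").lower()
    action_host = (urlsplit(action).hostname or "").lower()
    # Relative actions (no host) submit to the page's own host: not a match.
    if not action_host or not page_host:
        return (False, None)
    if action_host == page_host:
        return (False, None)
    return (True, {
        "note": f"form posts to {action_host}, page is {page_host}",
        "action_host": action_host,
        "page_host": page_host,
    })
```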

🔧 Configuration Tips

Snippet length: tweak ui.snippet_preview_len in settings.yaml (default 160).
Rules file override: set SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml.
Artifacts path: by default /data in the container (mount via Compose).
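
The rules-file override could be resolved along these lines. `resolve_rules_file` is a hypothetical helper; the environment variable name and default path are the ones given in this README.

```python
import os
from pathlib import Path

DEFAULT_RULES_FILE = Path("config/suspicious_rules.yaml")


def resolve_rules_file(environ=None) -> Path:
    """Prefer SNEAKYSCOPE_RULES_FILE when set and non-empty, else the default."""
    env = os.environ if environ is None else environ
    override = (env.get("SNEAKYSCOPE_RULES_FILE") or "").strip()
    return Path(override) if override else DEFAULT_RULES_FILE
```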

📂 Project Structure (high-level)

app/
  __init__.py                 # Flask app factory (loads YAML + function rules)
  browser.py                  # fetch + analysis orchestrator (writes results.json)
  routes.py                   # web views
  rules/
    function_rules.py         # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py           # engine + Rule class + YAML loader
    io_helpers.py             # safe_write, etc.
    settings.py               # get_settings()
  templates/                  # Jinja2 templates
  static/                     # CSS/JS
  config/
    suspicious_rules.yaml     # regex rules
docs/
  roadmap.md                  # ongoing plan and priorities

🧭 Roadmap (short version)

Full details: docs/roadmap.md

Core Analysis / Stability
- Opt-in fetching of external scripts (size/time limits) and evaluation of the fetched content.
- Remove remaining legacy form “flagged_reasons” once function rules cover them.
- Unit tests: YAML compilation, adapters, per-artifact rule cases.
API Layer
- Endpoints: /screenshot, /source, /analyse
- OpenAPI at /api/openapi.yaml; docs at /docs (Swagger/Redoc)
UI / UX
- Auto-prepend http(s):// and www. for bare domains
- Source viewer (embedded editor)
- Scripts table toggle: “Only suspicious” / “All scripts”
- Rules Lab (WYSIWYG tester) for rapid rule validation
Artifact Management & Ops
- Retention/cleanup policy (age/size)
- Periodic maintenance scripts (configurable in settings.yaml)
- Results caching UX (re-run vs. load from cache)
Extras / Integrations
- Bulk URL analysis
- Alerting/webhooks (Slack/email)
- Analyst verdict tags + export (CSV/JSON)

🤝 Contributing

This repository is currently private on a self-hosted git server.

Internal contributors: use feature branches and open merge requests on https://git.sneakygeek.net/ptarrant/SneakyScope.
If/when the repo is made public, we’ll welcome issues and PRs from the community.

⚠️ Disclaimer

SneakyScope is intended for defensive security analysis and educational use. Only analyze content you are authorized to test.

Languages

Python 70.7%

HTML 18.8%

CSS 8.2%

Shell 1.7%

Dockerfile 0.6%
