diff --git a/Readme.md b/Readme.md
index 04f47e3..def7c61 100644
--- a/Readme.md
+++ b/Readme.md
@@ -1,92 +1,248 @@
-# URL Sandbox
+# SneakyScope
-A lightweight web-based sandbox for analyzing websites and domains.
-It performs WHOIS lookups, GeoIP enrichment, script/form inspection, and provides analyst-friendly output.
+A lightweight web-based sandbox for analyzing websites and domains.
+SneakyScope fetches a page in a sandbox, enriches with WHOIS/GeoIP, and runs a unified **Rules Engine** (YAML + function rules) against scripts, forms, and text. Results are saved per-run and rendered with analyst-friendly tables, severity badges, and tags. Results are saved at time of analysis per run so you have a point in time result that doesn't change.
+
+> Repo: [https://git.sneakygeek.net/ptarrant/SneakyScope](https://git.sneakygeek.net/ptarrant/SneakyScope)
+> Status: **Private** (may become public later)
 ---
 ## 🚀 Features
-- **Domain & IP Enrichment**
-  - WHOIS lookups with fallback to raw text when fields are missing
-  - Explicit handling of privacy-protected WHOIS records (`N/A` or `Possible Privacy`)
-  - GeoIP (City, Region, Country, Latitude/Longitude)
-  - ASN, ISP, and network details
-- **Flagged Content Analysis**
-  - Suspicious script detection
-  - Suspicious form detection
-  - Nested bullet-style reporting for clarity
-- **Improved UX**
-  - Automatic addition of `http://`, `https://`, and `www.` if only a domain is provided
-  - Modal spinner to indicate background analysis (`Analyzing website…`)
-- **Resilient GeoLite2 Database Management**
-  - Downloads the MaxMind GeoLite2-City database on first startup
-  - Checks file age and only re-downloads if older than **14 days** (configurable via environment variable)
+### Unified Detection (Rules Engine)
+
+* **Regex rules from YAML** + **function rules in code** for context-aware checks.
+* PASS/FAIL per rule with **reason**, **severity** (`low|medium|high`), and **tags**.
+* **Per-script matches**:
+
+  * Inline scripts → run regex rules on the code.
+  * External scripts → run function rules with structured facts (`src`, hostnames, etc.).
+* **Page-level overview**: complete PASS/FAIL tables by category (`script`, `form`, `text`).
+
+### Domain & IP Enrichment
+
+* WHOIS with robust fallbacks (`N/A`, `Possible Privacy` when fields are missing).
+* GeoIP, ASN, and ISP details.
+
+### Results & UX
+
+* **Per-run artifacts** under `/data//`:
+
+  * `screenshot.png`, `source.txt`, `results.json` (single source of truth).
+* **Suspicious Scripts** table shows only **matched** scripts with:
+
+  * **Severity badges** and **tag chips** (tooltip shows rule reason).
+  * Snippet preview length configurable via `settings.yaml`.
 ---
-## ⚙️ Setup Instructions
+## 🧱 Architecture at a Glance
+
+* **Flask** app (Gunicorn in Docker)
+* **Playwright** for headless page fetch/render
+* **BeautifulSoup4** for parsing
+* **Rules Engine**
+
+  * YAML regex rules (`config/suspicious_rules.yaml`)
+  * Function rules (`app/rules/function_rules.py`) registered on startup
+* **Artifacts**: persistent path mounted at `/data` (configurable)
+
+---
+
+## ⚙️ Setup
+
+### 1) Clone
+
+> Since this repo is private, you’ll need credentials (HTTPS with a personal access token) **or** SSH access.
+
+**HTTPS (with token):**
-### 1. Clone the Repository
 ```bash
-git clone https://github.com/yourusername/url-sandbox.git
-cd url-sandbox
+git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
+cd SneakyScope
 ```
-### 2. Create a MaxMind Account & License Key
-1. Go to [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
-2. Sign up for a free account
-3. Navigate to **Account > Manage License Keys**
-4. Generate a new license key
+**SSH:**
-### 3. Configure Environment Variables
-All environment variables are loaded from a `.env` file.
-
-1.
Copy the sample file:
 ```bash
-   cp .env.example .env
-````
+git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
+cd SneakyScope
+```
-2. Edit `.env` and set your values (see [`.env.example`](./.env.example) for available options).
+### 2) Configure Environment
-Make sure to add your **MaxMind License Key** under `MAXMIND_LICENSE_KEY`.
+Copy and edit env:
+```bash
+cp .env.example .env
+```
+
+Important vars:
+
+* `SECRET_KEY` – Flask secret (set in production).
+* `MAXMIND_LICENSE_KEY` – for GeoIP (optional if you disable GeoIP).
+* `SNEAKYSCOPE_RULES_FILE` – override path to YAML rules (optional).
+
+### 3) Settings
+
+`settings.yaml` controls UI/behavior. Example:
+
+```yaml
+app:
+  name: "SneakyScope"
+  version_major: 0
+  version_minor: 1
+
+ui:
+  snippet_preview_len: 160  # controls inline script snippet length in UI
+```
+
+### 4) Run with Docker Compose
-### 4. Run with Docker Compose
 ```bash
 docker-compose up --build
 ```
-This will:
-- Build the app
-- Download the GeoLite2 database if not present or too old
-- Start the web interface
+This builds the image and starts the web app. The `/data` directory in the container is where run artifacts are written—mount a host directory in Compose to persist between restarts.
 ---
-## 📝 Example Output
+## 🧪 Using SneakyScope
-**WHOIS Info**
-- Registrar: MarkMonitor, Inc.
-- Organization: Possible Privacy
-- Creation: 1997-09-15
-- Expiration: 2028-09-14
+1. Open the web UI and submit a URL.
+2. On completion you’ll see:
-**GeoIP Info**
-- IP: 172.66.159.20
-  - City: N/A
-  - Region: N/A
-  - Country: United States
-  - Coordinates: (37.751, -97.822)
-  - ASN: 13335
-  - ISP: Cloudflare, Inc.
+   * **URL Overview** (with permalink to `/results/`)
+   * **Enrichment** (WHOIS/GeoIP)
+   * **Redirects**
+   * **Forms** (inputs + per-form rule checks)
+   * **Suspicious Scripts** (only scripts that matched rules; badges/tags, snippet)
+   * **Screenshot** and **Source**
+
+Artifacts for each run live under `/data//`:
+
+* `results.json` – complete structured result consumed by the UI.
+* `source.txt`, `screenshot.png`, and other files as added.
 ---
-## 📌 Roadmap
-See [Next Steps Checklist](docs/roadmap.md) for planned features:
-- Improved UI templates
-- Artifact cleanup
-- Proxy support (optional)
+## 📝 Rules
----
\ No newline at end of file
+### YAML (regex) Rules
+
+`config/suspicious_rules.yaml` contains regex rules (compiled `IGNORECASE`). Example:
+
+```yaml
+- name: eval_usage
+  description: "Use of eval() in script"
+  category: script
+  type: regex
+  pattern: '\beval\s*\('
+  severity: high
+  tags: [obfuscation, unsafe-eval]
+```
+
+### Function Rules (code)
+
+Rules needing **context** (e.g., compare action host to page host) live in:
+
+* `app/rules/function_rules.py`:
+
+  * `FactAdapter` – converts snippets to structured facts.
+  * `FunctionRuleAdapter` – lets dict-expecting rules run from engine inputs.
+  * Implementations like:
+
+    * `form_action_missing`
+    * `form_http_on_https_page`
+    * `form_submits_to_different_host`
+    * `script_src_uses_data_or_blob`
+    * `script_src_has_dangerous_extension`
+    * `script_third_party_host`
+
+They’re registered at startup in `app/__init__.py` alongside YAML rules.
+
+---
+
+## 🔧 Configuration Tips
+
+* **Snippet length**: tweak `ui.snippet_preview_len` in `settings.yaml` (default 160).
+* **Rules file override**: set `SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml`.
+* **Artifacts path**: by default `/data` in the container (mount via Compose).
+
+---
+
+## 📂 Project Structure (high-level)
+
+```
+app/
+  __init__.py  # Flask app factory (loads YAML + function rules)
+  browser.py  # fetch + analysis orchestrator (writes results.json)
+  routes.py  # web views
+  rules/
+    function_rules.py  # FactAdapter, FunctionRuleAdapter, function rules
+  utils/
+    rules_engine.py  # engine + Rule class + YAML loader
+    io_helpers.py  # safe_write, etc.
+    settings.py  # get_settings()
+  templates/  # Jinja2 templates
+  static/  # CSS/JS
+  config/
+    suspicious_rules.yaml  # regex rules
+docs/
+  roadmap.md  # ongoing plan and priorities
+```
+
+---
+
+## 🧭 Roadmap (short version)
+
+Full details: `docs/roadmap.md`
+
+* **Core Analysis / Stability**
+
+  * Opt-in **fetch external scripts** (size/time limits) and evaluate fetched content.
+  * Remove remaining legacy form “flagged\_reasons” once function rules cover them.
+  * Unit tests: YAML compilation, adapters, per-artifact rule cases.
+
+* **API Layer**
+
+  * Endpoints: `/screenshot`, `/source`, `/analyse`
+  * OpenAPI at `/api/openapi.yaml`; docs at `/docs` (Swagger/Redoc)
+
+* **UI / UX**
+
+  * Auto-prepend `http(s)://`/`www.` for bare domains
+  * Source viewer (embedded editor)
+  * Scripts table toggle: “Only suspicious” / “All scripts”
+  * **Rules Lab** (WYSIWYG tester) for rapid rule validation
+
+* **Artifact Management & Ops**
+
+  * Retention/cleanup policy (age/size)
+  * Periodic maintenance scripts (configurable in `settings.yaml`)
+  * Results caching UX (re-run vs. load from cache)
+
+* **Extras / Integrations**
+
+  * Bulk URL analysis
+  * Alerting/webhooks (Slack/email)
+  * Analyst verdict tags + export (CSV/JSON)
+
+---
+
+## 🤝 Contributing
+
+This repository is currently **private** on a self-hosted git server.
+
+* Internal contributors: use feature branches and open merge requests on `https://git.sneakygeek.net/ptarrant/SneakyScope`.
+* If/when the repo is made public, we’ll welcome issues and PRs from the community.
+
+---
+
+## ⚠️ Disclaimer
+
+SneakyScope is intended for defensive security analysis and educational use.
+Only analyze content you are authorized to test.
+
+---
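
Reviewer note (not part of the patch): the Rules section in this diff describes two rule styles, YAML regex rules compiled with `IGNORECASE` and function rules that compare facts such as the form action host against the page host. The sketch below is one way those two ideas might look in Python. It is illustrative only: `run_regex_rule`, `submits_to_different_host`, and the result dict layout are hypothetical names chosen here, not the code in `app/utils/rules_engine.py` or `app/rules/function_rules.py`, and a plain dict stands in for the parsed YAML entry.

```python
import re
from urllib.parse import urlparse

# Mirrors the `eval_usage` YAML example from the diff.
# Assumption: a plain dict stands in for the parsed YAML entry.
EVAL_RULE = {
    "name": "eval_usage",
    "category": "script",
    "type": "regex",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}


def run_regex_rule(rule, text):
    """Hypothetical helper: report whether a regex rule matches the text."""
    # The README says YAML patterns are compiled with IGNORECASE.
    compiled = re.compile(rule["pattern"], re.IGNORECASE)
    return {
        "name": rule["name"],
        "matched": compiled.search(text) is not None,
        "severity": rule["severity"],
        "tags": list(rule["tags"]),
    }


def submits_to_different_host(page_url, form_action):
    """Hypothetical function rule: True when the form posts to another host."""
    action_host = urlparse(form_action).hostname
    if action_host is None:
        # Relative or empty action: the submission stays on the page's host.
        return False
    return action_host != urlparse(page_url).hostname


inline_hit = run_regex_rule(EVAL_RULE, "var x = EVAL (payload);")
inline_miss = run_regex_rule(EVAL_RULE, "console.log('medieval(');")
```

The `\b` word boundary is what keeps `medieval(` from matching, while `IGNORECASE` still catches `EVAL (`. Whether a match is reported as PASS or FAIL in the actual engine is not stated in the diff, so the sketch only exposes a neutral `matched` flag.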