updating readme
272
Readme.md
@@ -1,92 +1,248 @@
# URL Sandbox
# SneakyScope

A lightweight web-based sandbox for analyzing websites and domains.
It performs WHOIS lookups, GeoIP enrichment, script/form inspection, and provides analyst-friendly output.
SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified **Rules Engine** (YAML + function rules) against scripts, forms, and text. Results are rendered with analyst-friendly tables, severity badges, and tags. Each run is saved at analysis time, so you keep a point-in-time result that doesn't change.

> Repo: [https://git.sneakygeek.net/ptarrant/SneakyScope](https://git.sneakygeek.net/ptarrant/SneakyScope)
> Status: **Private** (may become public later)

---

## 🚀 Features

- **Domain & IP Enrichment**
  - WHOIS lookups with fallback to raw text when fields are missing
  - Explicit handling of privacy-protected WHOIS records (`N/A` or `Possible Privacy`)
  - GeoIP (City, Region, Country, Latitude/Longitude)
  - ASN, ISP, and network details
- **Flagged Content Analysis**
  - Suspicious script detection
  - Suspicious form detection
  - Nested bullet-style reporting for clarity
- **Improved UX**
  - Automatic addition of `http://`, `https://`, and `www.` if only a domain is provided
  - Modal spinner to indicate background analysis (`Analyzing website…`)
- **Resilient GeoLite2 Database Management**
  - Downloads the MaxMind GeoLite2-City database on first startup
  - Checks file age and only re-downloads if older than **14 days** (configurable via environment variable)
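
The age check described in the last bullet could look roughly like the sketch below. It is only an illustration: the environment variable name `GEOIP_DB_MAX_AGE_DAYS`, the helper `needs_refresh`, and the example path are assumptions, not names taken from the project.

```python
import os
import time
from pathlib import Path
from typing import Optional

# Hypothetical variable name; the real one is defined in the project's .env.example.
MAX_AGE_DAYS = int(os.environ.get("GEOIP_DB_MAX_AGE_DAYS", "14"))


def needs_refresh(db_path: Path, max_age_days: int = MAX_AGE_DAYS,
                  now: Optional[float] = None) -> bool:
    """Return True if the database file is missing or older than max_age_days."""
    if not db_path.exists():
        return True  # first startup: nothing downloaded yet
    current = time.time() if now is None else now
    age_seconds = current - db_path.stat().st_mtime
    return age_seconds > max_age_days * 86400


if __name__ == "__main__":
    # Illustrative path only.
    if needs_refresh(Path("data/GeoLite2-City.mmdb")):
        print("GeoLite2 database missing or stale; download required")
```

Using the file's modification time keeps the check stateless: no extra metadata file is needed to remember when the last download happened.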
### Unified Detection (Rules Engine)

* **Regex rules from YAML** + **function rules in code** for context-aware checks.
* PASS/FAIL per rule with **reason**, **severity** (`low|medium|high`), and **tags**.
* **Per-script matches**:

  * Inline scripts → run regex rules on the code.
  * External scripts → run function rules with structured facts (`src`, hostnames, etc.).
* **Page-level overview**: complete PASS/FAIL tables by category (`script`, `form`, `text`).
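
To make the flow above concrete, here is a minimal sketch of a rule that is either regex-based or function-based and reports PASS/FAIL with a reason, severity, and tags. It is not the project's engine: the `Rule` fields, `evaluate_script`, the fact keys `src_host`/`page_host`, and the severity and tag given to the function rule are all assumptions, and `FAIL` is taken here to mean "the rule matched".

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    name: str
    category: str                    # "script", "form", or "text"
    severity: str                    # "low", "medium", or "high"
    tags: List[str] = field(default_factory=list)
    pattern: Optional[re.Pattern] = None                        # set for regex rules
    function: Optional[Callable[[Dict], Optional[str]]] = None  # set for function rules

    def run(self, text: str = "", facts: Optional[Dict] = None) -> Dict:
        """Evaluate the rule; a non-empty reason means the rule matched (FAIL)."""
        reason = None
        if self.pattern is not None:
            match = self.pattern.search(text)
            if match:
                reason = f"matched {match.group(0)!r}"
        elif self.function is not None:
            reason = self.function(facts or {})
        return {"rule": self.name, "result": "FAIL" if reason else "PASS",
                "reason": reason, "severity": self.severity, "tags": self.tags}


def third_party_host(facts: Dict) -> Optional[str]:
    """Function rule: return a reason when the script host differs from the page host."""
    src_host, page_host = facts.get("src_host"), facts.get("page_host")
    if src_host and page_host and src_host != page_host:
        return f"script host {src_host} differs from page host {page_host}"
    return None


RULES = [
    Rule("eval_usage", "script", "high", ["obfuscation", "unsafe-eval"],
         pattern=re.compile(r"\beval\s*\(", re.IGNORECASE)),
    Rule("script_third_party_host", "script", "low", ["third-party"],
         function=third_party_host),
]


def evaluate_script(script: Dict) -> List[Dict]:
    """Inline scripts get regex rules on their code; external scripts get function rules."""
    results = []
    for rule in RULES:
        if script.get("src"):
            if rule.function is not None:
                results.append(rule.run(facts=script))
        elif rule.pattern is not None:
            results.append(rule.run(text=script.get("code", "")))
    return results
```

Calling `evaluate_script({"code": "eval(payload)"})` returns one result dict for `eval_usage` with `result` set to `"FAIL"`; a page-level overview would simply collect these dicts per category.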

### Domain & IP Enrichment

* WHOIS with robust fallbacks (`N/A`, `Possible Privacy` when fields are missing).
* GeoIP, ASN, and ISP details.
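
One way to read the WHOIS fallback is as a normalization step that replaces missing fields with a placeholder. The sketch below assumes that owner-identifying fields become `Possible Privacy` and everything else becomes `N/A`; the field names and the function `normalize_whois` are hypothetical, and the project's actual rule for choosing between the two placeholders may differ.

```python
from typing import Any, Dict, Mapping

# Hypothetical field names for illustration.
WHOIS_FIELDS = ("registrar", "organization", "creation_date", "expiration_date")
# Assumption: only owner-identifying fields are labelled as possibly privacy-protected.
PRIVACY_FIELDS = {"organization"}


def normalize_whois(record: Mapping[str, Any]) -> Dict[str, str]:
    """Return a display-ready WHOIS dict with placeholders for missing fields."""
    normalized: Dict[str, str] = {}
    for name in WHOIS_FIELDS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            normalized[name] = "Possible Privacy" if name in PRIVACY_FIELDS else "N/A"
        else:
            normalized[name] = str(value).strip()
    return normalized
```

Normalizing once, before rendering, means templates never have to deal with `None` or empty strings.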

### Results & UX

* **Per-run artifacts** under `/data/<uuid>/`:

  * `screenshot.png`, `source.txt`, `results.json` (single source of truth).
* **Suspicious Scripts** table shows only **matched** scripts with:

  * **Severity badges** and **tag chips** (tooltip shows rule reason).
  * Snippet preview length configurable via `settings.yaml`.
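
The per-run layout can be pictured with a short sketch that writes `source.txt` and `results.json` into a fresh UUID-named directory. The helpers `save_run` and `load_run` are hypothetical names, not the project's functions, and the screenshot is left out because it needs a browser.

```python
import json
import uuid
from pathlib import Path
from typing import Any, Dict


def save_run(results: Dict[str, Any], source_html: str, data_root: Path) -> Path:
    """Write one run's artifacts to <data_root>/<uuid>/ and return that directory."""
    run_dir = data_root / str(uuid.uuid4())
    run_dir.mkdir(parents=True, exist_ok=False)  # a run directory is never reused
    (run_dir / "source.txt").write_text(source_html, encoding="utf-8")
    (run_dir / "results.json").write_text(json.dumps(results, indent=2), encoding="utf-8")
    return run_dir


def load_run(run_dir: Path) -> Dict[str, Any]:
    """Read back results.json, the single source of truth for a run."""
    return json.loads((run_dir / "results.json").read_text(encoding="utf-8"))
```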

---

## ⚙️ Setup Instructions
## 🧱 Architecture at a Glance

* **Flask** app (Gunicorn in Docker)
* **Playwright** for headless page fetch/render
* **BeautifulSoup4** for parsing
* **Rules Engine**

  * YAML regex rules (`config/suspicious_rules.yaml`)
  * Function rules (`app/rules/function_rules.py`) registered on startup
* **Artifacts**: persistent path mounted at `/data` (configurable)

---

## ⚙️ Setup

### 1) Clone

> Since this repo is private, you’ll need credentials (HTTPS with a personal access token) **or** SSH access.

**HTTPS (with token):**

### 1. Clone the Repository
```bash
git clone https://github.com/yourusername/url-sandbox.git
cd url-sandbox
git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
cd SneakyScope
```

### 2. Create a MaxMind Account & License Key
1. Go to [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
2. Sign up for a free account
3. Navigate to **Account > Manage License Keys**
4. Generate a new license key

**SSH:**

```bash
git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
cd SneakyScope
```

### 3. Configure Environment Variables
All environment variables are loaded from a `.env` file.

1. Copy the sample file:
```bash
cp .env.example .env
```
2. Edit `.env` and set your values (see [`.env.example`](./.env.example) for available options).

### 2) Configure Environment

Make sure to add your **MaxMind License Key** under `MAXMIND_LICENSE_KEY`.
Copy and edit env:

```bash
cp .env.example .env
```

Important vars:

* `SECRET_KEY` – Flask secret (set in production).
* `MAXMIND_LICENSE_KEY` – for GeoIP (optional if you disable GeoIP).
* `SNEAKYSCOPE_RULES_FILE` – override path to YAML rules (optional).

### 3) Settings

`settings.yaml` controls UI/behavior. Example:

```yaml
app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160 # controls inline script snippet length in UI
```

### 4) Run with Docker Compose

### 4. Run with Docker Compose
```bash
docker-compose up --build
```

This will:
- Build the app
- Download the GeoLite2 database if not present or too old
- Start the web interface

This builds the image and starts the web app. Run artifacts are written to the `/data` directory in the container; mount a host directory in Compose to persist them between restarts.

---

## 📝 Example Output
## 🧪 Using SneakyScope

**WHOIS Info**
- Registrar: MarkMonitor, Inc.
- Organization: Possible Privacy
- Creation: 1997-09-15
- Expiration: 2028-09-14
1. Open the web UI and submit a URL.
2. On completion you’ll see:

**GeoIP Info**
- IP: 172.66.159.20
- City: N/A
- Region: N/A
- Country: United States
- Coordinates: (37.751, -97.822)
- ASN: 13335
- ISP: Cloudflare, Inc.
* **URL Overview** (with permalink to `/results/<uuid>`)
* **Enrichment** (WHOIS/GeoIP)
* **Redirects**
* **Forms** (inputs + per-form rule checks)
* **Suspicious Scripts** (only scripts that matched rules; badges/tags, snippet)
* **Screenshot** and **Source**

Artifacts for each run live under `/data/<uuid>/`:

* `results.json` – complete structured result consumed by the UI.
* `source.txt`, `screenshot.png`, and other files as added.

---

## 📌 Roadmap
See [Next Steps Checklist](docs/roadmap.md) for planned features:
- Improved UI templates
- Artifact cleanup
- Proxy support (optional)
## 📝 Rules

### YAML (regex) Rules

`config/suspicious_rules.yaml` contains regex rules (compiled `IGNORECASE`). Example:

```yaml
- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
```

### Function Rules (code)

Rules needing **context** (e.g., compare action host to page host) live in:

* `app/rules/function_rules.py`:

  * `FactAdapter` – converts snippets to structured facts.
  * `FunctionRuleAdapter` – lets dict-expecting rules run from engine inputs.
  * Implementations like:

    * `form_action_missing`
    * `form_http_on_https_page`
    * `form_submits_to_different_host`
    * `script_src_uses_data_or_blob`
    * `script_src_has_dangerous_extension`
    * `script_third_party_host`
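
As an illustration of a context-aware check, the sketch below shows what a rule like `form_submits_to_different_host` might do. The rule name comes from the list above, but the signature, the fact keys `action` and `base_url`, and the `(matched, reason)` return shape are assumptions rather than the project's actual code.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urljoin, urlparse


def form_submits_to_different_host(facts: Dict[str, str]) -> Tuple[bool, Optional[str]]:
    """Flag forms whose action resolves to a host other than the page's host."""
    action = (facts.get("action") or "").strip()
    page_url = facts.get("base_url") or ""
    if not action:
        return False, None  # a missing action is left to a separate rule
    # Resolve relative actions against the page URL before comparing hosts.
    action_host = urlparse(urljoin(page_url, action)).hostname
    page_host = urlparse(page_url).hostname
    if action_host and page_host and action_host != page_host:
        return True, f"form posts to {action_host}, page is {page_host}"
    return False, None
```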

They’re registered at startup in `app/__init__.py` alongside YAML rules.

---

## 🔧 Configuration Tips

* **Snippet length**: tweak `ui.snippet_preview_len` in `settings.yaml` (default 160).
* **Rules file override**: set `SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml`.
* **Artifacts path**: by default `/data` in the container (mount via Compose).

---

## 📂 Project Structure (high-level)

```
app/
  __init__.py            # Flask app factory (loads YAML + function rules)
  browser.py             # fetch + analysis orchestrator (writes results.json)
  routes.py              # web views
  rules/
    function_rules.py    # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py      # engine + Rule class + YAML loader
    io_helpers.py        # safe_write, etc.
    settings.py          # get_settings()
  templates/             # Jinja2 templates
  static/                # CSS/JS
config/
  suspicious_rules.yaml  # regex rules
docs/
  roadmap.md             # ongoing plan and priorities
```

---

## 🧭 Roadmap (short version)

Full details: `docs/roadmap.md`

* **Core Analysis / Stability**

  * Opt-in **fetch external scripts** (size/time limits) and evaluate fetched content.
  * Remove remaining legacy form “flagged\_reasons” once function rules cover them.
  * Unit tests: YAML compilation, adapters, per-artifact rule cases.

* **API Layer**

  * Endpoints: `/screenshot`, `/source`, `/analyse`
  * OpenAPI at `/api/openapi.yaml`; docs at `/docs` (Swagger/Redoc)

* **UI / UX**

  * Auto-prepend `http(s)://`/`www.` for bare domains
  * Source viewer (embedded editor)
  * Scripts table toggle: “Only suspicious” / “All scripts”
  * **Rules Lab** (WYSIWYG tester) for rapid rule validation

* **Artifact Management & Ops**

  * Retention/cleanup policy (age/size)
  * Periodic maintenance scripts (configurable in `settings.yaml`)
  * Results caching UX (re-run vs. load from cache)

* **Extras / Integrations**

  * Bulk URL analysis
  * Alerting/webhooks (Slack/email)
  * Analyst verdict tags + export (CSV/JSON)

---

## 🤝 Contributing

This repository is currently **private** on a self-hosted git server.

* Internal contributors: use feature branches and open merge requests on `https://git.sneakygeek.net/ptarrant/SneakyScope`.
* If/when the repo is made public, we’ll welcome issues and PRs from the community.

---

## ⚠️ Disclaimer

SneakyScope is intended for defensive security analysis and educational use.
Only analyze content you are authorized to test.

---