updating readme
268
Readme.md
@@ -1,92 +1,248 @@
|
|||||||
# URL Sandbox
|
# SneakyScope
|
||||||
|
|
||||||
A lightweight web-based sandbox for analyzing websites and domains.
|
A lightweight web-based sandbox for analyzing websites and domains.
|
||||||
It performs WHOIS lookups, GeoIP enrichment, script/form inspection, and provides analyst-friendly output.
|
SneakyScope fetches a page in a sandbox, enriches it with WHOIS/GeoIP data, and runs a unified **Rules Engine** (YAML + function rules) against scripts, forms, and text. Results are saved per run at analysis time, giving a point-in-time record that does not change, and are rendered with analyst-friendly tables, severity badges, and tags.
|
||||||
|
|
||||||
|
> Repo: [https://git.sneakygeek.net/ptarrant/SneakyScope](https://git.sneakygeek.net/ptarrant/SneakyScope)
|
||||||
|
> Status: **Private** (may become public later)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 🚀 Features
|
## 🚀 Features
|
||||||
|
|
||||||
- **Domain & IP Enrichment**
|
### Unified Detection (Rules Engine)
|
||||||
- WHOIS lookups with fallback to raw text when fields are missing
|
|
||||||
- Explicit handling of privacy-protected WHOIS records (`N/A` or `Possible Privacy`)
|
* **Regex rules from YAML** + **function rules in code** for context-aware checks.
|
||||||
- GeoIP (City, Region, Country, Latitude/Longitude)
|
* PASS/FAIL per rule with **reason**, **severity** (`low|medium|high`), and **tags**.
|
||||||
- ASN, ISP, and network details
|
* **Per-script matches**:
|
||||||
- **Flagged Content Analysis**
|
|
||||||
- Suspicious script detection
|
* Inline scripts → run regex rules on the code.
|
||||||
- Suspicious form detection
|
* External scripts → run function rules with structured facts (`src`, hostnames, etc.).
|
||||||
- Nested bullet-style reporting for clarity
|
* **Page-level overview**: complete PASS/FAIL tables by category (`script`, `form`, `text`).
|
||||||
- **Improved UX**
|
|
||||||
- Automatic addition of `http://`, `https://`, and `www.` if only a domain is provided
|
### Domain & IP Enrichment
|
||||||
- Modal spinner to indicate background analysis (`Analyzing website…`)
|
|
||||||
- **Resilient GeoLite2 Database Management**
|
* WHOIS with robust fallbacks (`N/A`, `Possible Privacy` when fields are missing).
|
||||||
- Downloads the MaxMind GeoLite2-City database on first startup
|
* GeoIP, ASN, and ISP details.
|
||||||
- Checks file age and only re-downloads if older than **14 days** (configurable via environment variable)
|
|
||||||
|
### Results & UX
|
||||||
|
|
||||||
|
* **Per-run artifacts** under `/data/<uuid>/`:
|
||||||
|
|
||||||
|
* `screenshot.png`, `source.txt`, `results.json` (single source of truth).
|
||||||
|
* **Suspicious Scripts** table shows only **matched** scripts with:
|
||||||
|
|
||||||
|
* **Severity badges** and **tag chips** (tooltip shows rule reason).
|
||||||
|
* Snippet preview length configurable via `settings.yaml`.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## ⚙️ Setup Instructions
|
## 🧱 Architecture at a Glance
|
||||||
|
|
||||||
|
* **Flask** app (Gunicorn in Docker)
* **Playwright** for headless page fetch/render
* **BeautifulSoup4** for parsing
* **Rules Engine**

  * YAML regex rules (`config/suspicious_rules.yaml`)
  * Function rules (`app/rules/function_rules.py`) registered on startup
* **Artifacts**: persistent path mounted at `/data` (configurable)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## ⚙️ Setup
|
||||||
|
|
||||||
|
### 1) Clone
|
||||||
|
|
||||||
|
> Since this repo is private, you’ll need credentials (HTTPS with a personal access token) **or** SSH access.
|
||||||
|
|
||||||
|
**HTTPS (with token):**
|
||||||
|
|
||||||
### 1. Clone the Repository
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://github.com/yourusername/url-sandbox.git
|
git clone https://git.sneakygeek.net/ptarrant/SneakyScope.git
|
||||||
cd url-sandbox
|
cd SneakyScope
|
||||||
```
|
```
|
||||||
|
|
||||||
### 2. Create a MaxMind Account & License Key
|
**SSH:**
|
||||||
1. Go to [MaxMind GeoLite2](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
|
|
||||||
2. Sign up for a free account
|
|
||||||
3. Navigate to **Account > Manage License Keys**
|
|
||||||
4. Generate a new license key
|
|
||||||
|
|
||||||
### 3. Configure Environment Variables
|
```bash
|
||||||
All environment variables are loaded from a `.env` file.
|
git clone git@git.sneakygeek.net:ptarrant/SneakyScope.git
|
||||||
|
cd SneakyScope
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2) Configure Environment
|
||||||
|
|
||||||
|
Copy and edit env:
|
||||||
|
|
||||||
1. Copy the sample file:
|
|
||||||
```bash
|
```bash
|
||||||
cp .env.example .env
|
cp .env.example .env
|
||||||
```
|
```
|
||||||
|
|
||||||
2. Edit `.env` and set your values (see [`.env.example`](./.env.example) for available options).
|
Important vars:
|
||||||
|
|
||||||
Make sure to add your **MaxMind License Key** under `MAXMIND_LICENSE_KEY`.
|
* `SECRET_KEY` – Flask secret (set in production).
|
||||||
|
* `MAXMIND_LICENSE_KEY` – for GeoIP (optional if you disable GeoIP).
|
||||||
|
* `SNEAKYSCOPE_RULES_FILE` – override path to YAML rules (optional).
|
||||||
|
|
||||||
|
### 3) Settings
|
||||||
|
|
||||||
|
`settings.yaml` controls UI/behavior. Example:
|
||||||
|
|
||||||
|
```yaml
app:
  name: "SneakyScope"
  version_major: 0
  version_minor: 1

ui:
  snippet_preview_len: 160  # controls inline script snippet length in UI
```
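
As a rough illustration of how `ui.snippet_preview_len` might be applied, the sketch below cuts an inline script down to the configured length. The `settings` dict stands in for the parsed `settings.yaml`, and `preview_snippet` is a hypothetical helper, not a function taken from the codebase.

```python
# Hypothetical stand-in for the parsed settings.yaml (assumption for illustration).
settings = {"ui": {"snippet_preview_len": 160}}


def preview_snippet(code: str, settings: dict) -> str:
    """Return `code` cut to the configured preview length, marking any cut with an ellipsis."""
    limit = int(settings.get("ui", {}).get("snippet_preview_len", 160))
    flat = " ".join(code.split())  # collapse whitespace so the preview stays on one line
    if len(flat) <= limit:
        return flat
    return flat[:limit] + "…"
```

Falling back to 160 when the key is missing mirrors the default mentioned in this README; whether the real application does the same is an assumption.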
|
||||||
|
|
||||||
|
### 4) Run with Docker Compose
|
||||||
|
|
||||||
### 4. Run with Docker Compose
|
|
||||||
```bash
|
```bash
|
||||||
docker-compose up --build
|
docker-compose up --build
|
||||||
```
|
```
|
||||||
|
|
||||||
This will:
|
This builds the image and starts the web app. Run artifacts are written to the `/data` directory in the container; mount a host directory in Compose to persist them between restarts.
|
||||||
- Build the app
|
|
||||||
- Download the GeoLite2 database if not present or too old
|
|
||||||
- Start the web interface
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 📝 Example Output
|
## 🧪 Using SneakyScope
|
||||||
|
|
||||||
**WHOIS Info**
|
1. Open the web UI and submit a URL.
|
||||||
- Registrar: MarkMonitor, Inc.
|
2. On completion you’ll see:
|
||||||
- Organization: Possible Privacy
|
|
||||||
- Creation: 1997-09-15
|
|
||||||
- Expiration: 2028-09-14
|
|
||||||
|
|
||||||
**GeoIP Info**
|
* **URL Overview** (with permalink to `/results/<uuid>`)
|
||||||
- IP: 172.66.159.20
|
* **Enrichment** (WHOIS/GeoIP)
|
||||||
- City: N/A
|
* **Redirects**
|
||||||
- Region: N/A
|
* **Forms** (inputs + per-form rule checks)
|
||||||
- Country: United States
|
* **Suspicious Scripts** (only scripts that matched rules; badges/tags, snippet)
|
||||||
- Coordinates: (37.751, -97.822)
|
* **Screenshot** and **Source**
|
||||||
- ASN: 13335
|
|
||||||
- ISP: Cloudflare, Inc.
|
Artifacts for each run live under `/data/<uuid>/`:
|
||||||
|
|
||||||
|
* `results.json` – complete structured result consumed by the UI.
|
||||||
|
* `source.txt`, `screenshot.png`, and other files as added.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 📌 Roadmap
|
## 📝 Rules
|
||||||
See [Next Steps Checklist](docs/roadmap.md) for planned features:
|
|
||||||
- Improved UI templates
|
### YAML (regex) Rules
|
||||||
- Artifact cleanup
|
|
||||||
- Proxy support (optional)
|
`config/suspicious_rules.yaml` contains regex rules (compiled `IGNORECASE`). Example:
|
||||||
|
|
||||||
|
```yaml
- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
```
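
To make the rule format concrete, here is a minimal sketch of how one such rule could be compiled and checked against a script. The rule is written as the dict a YAML loader would produce, and `run_regex_rule` is a hypothetical evaluator for illustration only; the real engine lives in `rules_engine.py` and may behave differently.

```python
import re

# The eval_usage rule from the YAML example, as a plain dict.
rule = {
    "name": "eval_usage",
    "description": "Use of eval() in script",
    "category": "script",
    "type": "regex",
    "pattern": r"\beval\s*\(",
    "severity": "high",
    "tags": ["obfuscation", "unsafe-eval"],
}


def run_regex_rule(rule: dict, text: str) -> dict:
    """Hypothetical evaluator: report whether the rule's pattern occurs in `text`."""
    compiled = re.compile(rule["pattern"], re.IGNORECASE)  # rules are compiled IGNORECASE
    matched = compiled.search(text) is not None
    return {
        "name": rule["name"],
        "matched": matched,
        "severity": rule["severity"],
        "tags": list(rule["tags"]),
    }
```

The `\b` word boundary keeps the pattern from firing on identifiers that merely end in `eval`, such as `medieval(`.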
|
||||||
|
|
||||||
|
### Function Rules (code)
|
||||||
|
|
||||||
|
Rules needing **context** (e.g., compare action host to page host) live in:
|
||||||
|
|
||||||
|
* `app/rules/function_rules.py`:

  * `FactAdapter` – converts snippets to structured facts.
  * `FunctionRuleAdapter` – lets dict-expecting rules run from engine inputs.
  * Implementations like:

    * `form_action_missing`
    * `form_http_on_https_page`
    * `form_submits_to_different_host`
    * `script_src_uses_data_or_blob`
    * `script_src_has_dangerous_extension`
    * `script_third_party_host`
|
||||||
|
|
||||||
|
They’re registered at startup in `app/__init__.py` alongside YAML rules.
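
For a sense of what a context-aware function rule might look like, the sketch below takes a facts dict and flags a form that posts over plain HTTP from an HTTPS page. It borrows the name `form_http_on_https_page` from the list above, but the body, the fact keys (`base_url`, `action`), and the `(flagged, reason)` return shape are all assumptions rather than the project's actual code.

```python
from typing import Tuple
from urllib.parse import urlparse


def form_http_on_https_page(facts: dict) -> Tuple[bool, str]:
    """Hypothetical rule: flag an HTTP form action on an HTTPS page.

    `facts` is assumed to carry "base_url" (the page URL) and "action"
    (the form's action attribute). Returns (flagged, reason).
    """
    page_scheme = urlparse(facts.get("base_url", "")).scheme.lower()
    action_scheme = urlparse(facts.get("action", "")).scheme.lower()
    if page_scheme == "https" and action_scheme == "http":
        return True, "Form submits over plain HTTP from an HTTPS page"
    # Relative actions have no scheme and inherit the page's, so they are not flagged.
    return False, ""
```

A regex alone cannot express this check, because the verdict depends on comparing two values rather than on the text of either one.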
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🔧 Configuration Tips
|
||||||
|
|
||||||
|
* **Snippet length**: tweak `ui.snippet_preview_len` in `settings.yaml` (default 160).
|
||||||
|
* **Rules file override**: set `SNEAKYSCOPE_RULES_FILE=/path/to/your.yaml`.
|
||||||
|
* **Artifacts path**: by default `/data` in the container (mount via Compose).
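
The rules-file override can be pictured as a simple environment lookup with a fallback. This is a sketch under assumptions: `resolve_rules_file` and `DEFAULT_RULES_FILE` are hypothetical names, and only the variable `SNEAKYSCOPE_RULES_FILE` and the default path come from this README.

```python
import os
from pathlib import Path

# Default location of the YAML rules, as documented above.
DEFAULT_RULES_FILE = Path("config/suspicious_rules.yaml")


def resolve_rules_file(environ=None) -> Path:
    """Return the override path if SNEAKYSCOPE_RULES_FILE is set, else the default."""
    env = os.environ if environ is None else environ
    override = env.get("SNEAKYSCOPE_RULES_FILE", "").strip()
    return Path(override) if override else DEFAULT_RULES_FILE
```

Passing the environment as an argument keeps the lookup easy to exercise in tests without touching the real process environment.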
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📂 Project Structure (high-level)
|
||||||
|
|
||||||
|
```
app/
  __init__.py            # Flask app factory (loads YAML + function rules)
  browser.py             # fetch + analysis orchestrator (writes results.json)
  routes.py              # web views
  rules/
    function_rules.py    # FactAdapter, FunctionRuleAdapter, function rules
  utils/
    rules_engine.py      # engine + Rule class + YAML loader
    io_helpers.py        # safe_write, etc.
    settings.py          # get_settings()
  templates/             # Jinja2 templates
  static/                # CSS/JS
config/
  suspicious_rules.yaml  # regex rules
docs/
  roadmap.md             # ongoing plan and priorities
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🧭 Roadmap (short version)
|
||||||
|
|
||||||
|
Full details: `docs/roadmap.md`
|
||||||
|
|
||||||
|
* **Core Analysis / Stability**

  * Opt-in **fetch external scripts** (size/time limits) and evaluate fetched content.
  * Remove remaining legacy form “flagged\_reasons” once function rules cover them.
  * Unit tests: YAML compilation, adapters, per-artifact rule cases.

* **API Layer**

  * Endpoints: `/screenshot`, `/source`, `/analyse`
  * OpenAPI at `/api/openapi.yaml`; docs at `/docs` (Swagger/Redoc)

* **UI / UX**

  * Auto-prepend `http(s)://`/`www.` for bare domains
  * Source viewer (embedded editor)
  * Scripts table toggle: “Only suspicious” / “All scripts”
  * **Rules Lab** (WYSIWYG tester) for rapid rule validation

* **Artifact Management & Ops**

  * Retention/cleanup policy (age/size)
  * Periodic maintenance scripts (configurable in `settings.yaml`)
  * Results caching UX (re-run vs. load from cache)

* **Extras / Integrations**

  * Bulk URL analysis
  * Alerting/webhooks (Slack/email)
  * Analyst verdict tags + export (CSV/JSON)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🤝 Contributing
|
||||||
|
|
||||||
|
This repository is currently **private** on a self-hosted git server.
|
||||||
|
|
||||||
|
* Internal contributors: use feature branches and open merge requests on `https://git.sneakygeek.net/ptarrant/SneakyScope`.
|
||||||
|
* If/when the repo is made public, we’ll welcome issues and PRs from the community.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## ⚠️ Disclaimer
|
||||||
|
|
||||||
|
SneakyScope is intended for defensive security analysis and educational use.
|
||||||
|
Only analyze content you are authorized to test.
|
||||||
|
|
||||||
---
|
---
|
||||||