feat: on-demand external script analysis + code viewer; refactor form analysis to rule engine
- API: add `POST /api/analyze_script` (app/blueprints/api.py)
  - Fetch one external script to artifacts, run rules, return findings + snippet
  - Uses new ExternalScriptFetcher (results_path aware) and job UUID
  - Returns: { ok, final_url, status_code, bytes, truncated, sha256, artifact_path, findings[], snippet, snippet_len }
  - TODO: document in openapi/openapi.yaml
- Fetcher: update `app/utils/external_fetch.py`
  - Constructed with `results_path` (UUID dir); writes to `<results_path>/scripts/fetched/<index>.js`
  - Loads settings via `get_settings()`, logs via std logging
- UI (results.html):
  - Move “Analyze external script” action into **Content Snippet** column for external rows
  - Clicking replaces button with `<details>` snippet, shows rule matches, and adds “open in viewer” link
  - Robust fetch handler (checks JSON, shows errors); builds viewer URL from absolute artifact path
- Viewer:
  - New route: `GET /view/artifact/<run_uuid>/<path:filename>` (app/blueprints/ui.py)
  - New template: Monaco-based read-only code viewer (viewer.html)
  - Removes SRI on loader to avoid integrity block; loads file via `raw_url` and detects language by extension
- Forms:
  - Refactor `analyze_forms` to mirror scripts analysis:
    - Uses rule engine (`category == "form"`) across regex/function rules
    - Emits rows only when matches exist
    - Includes `content_snippet`, `action`, `method`, `inputs`, `rules`
  - Replace legacy plumbing (`flagged`, `flag_reasons`, `status`) in output
  - Normalize form function rules to canonical returns `(bool, Optional[str])`:
    - `form_action_missing`
    - `form_http_on_https_page`
    - `form_submits_to_different_host`
  - Add minor hardening (lowercasing hosts, no-op actions, clearer reasons)
- CSS: add `.forms-table` to mirror `.scripts-table` (5 columns)
  - Fixed table layout, widths per column, chip/snippet styling, responsive tweaks
- Misc:
  - Fix “working outside app context” issue by avoiding `current_app` at import time (left storage logic inside routes)
  - Add “View Source” link to open page source in viewer
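
The request and response fields listed for `POST /api/analyze_script` can be exercised with a small client-side helper. This is a minimal sketch, not part of the commit: `build_analyze_request` and `summarize_response` are hypothetical names, and it assumes only the field names given in the commit message.

```python
import json
from typing import Any, Dict, List


def build_analyze_request(job_id: str, url: str, category: str = "script") -> str:
    """Serialize the JSON body for POST /api/analyze_script (hypothetical helper)."""
    if not job_id.strip() or not url.strip():
        raise ValueError("job_id and url are both required")
    return json.dumps({"job_id": job_id.strip(), "url": url.strip(), "category": category})


def summarize_response(payload: Dict[str, Any]) -> List[str]:
    """Turn a decoded response dict into short, human-readable lines."""
    if not payload.get("ok"):
        return [f"error: {payload.get('error', 'unknown')}"]
    lines = [f"{payload.get('final_url')} -> {payload.get('bytes')} bytes"]
    for finding in payload.get("findings", []):
        lines.append(f"[{finding.get('severity')}] {finding.get('name')}")
    return lines
```

The helper only builds and reads JSON; sending the request is left to whichever HTTP client the caller already uses.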
Refs:
- Roadmap: mark “Source code viewer” done; keep TODO to add `/api/analyze_script` to OpenAPI
@@ -19,7 +19,8 @@ from .rules.function_rules import (
     form_action_missing,
 )
 
-from . import routes  # blueprint
+from app.blueprints import ui  # ui blueprint
+from app.blueprints import api  # api blueprint
 
 # from .utils import io_helpers  # if need logging/setup later
 # from .utils import cache_db  # available for future injections
@@ -136,7 +137,8 @@ def create_app() -> Flask:
     app.config["APP_VERSION"] = f"v{settings.app.version_major}.{settings.app.version_minor}"
 
     # Register blueprints
-    app.register_blueprint(routes.bp)
+    app.register_blueprint(ui.bp)
+    app.register_blueprint(api.api_bp)
 
     # Example log lines so we know we booted cleanly
     app.logger.info(f"SneakyScope started: {app.config['APP_NAME']} {app.config['APP_VERSION']}")
212
app/blueprints/api.py
Normal file
@@ -0,0 +1,212 @@
+# app/blueprints/api.py
+"""
+API blueprint for JSON endpoints.
+
+Endpoints:
+  POST /api/analyze_script
+    Body:
+      {
+        "job_id": "<uuid>",   # or "uuid": "<uuid>"
+        "url": "https://cdn.example.com/app.js",
+        "category": "script"  # optional, defaults to "script"
+      }
+    Response:
+      {
+        "ok": true,
+        "final_url": "...",
+        "status_code": 200,
+        "bytes": 12345,
+        "truncated": false,
+        "sha256": "...",
+        "artifact_path": "/abs/path/to/<uuid>/scripts/fetched/<index>.js",
+        "findings": [ { "name": "...", "description": "...", "severity": "...", "tags": [...], "reason": "..." }, ... ],
+        "snippet": "<trimmed content>",
+        "snippet_len": 45678
+      }
+"""
+
+import os
+import time
+from flask import Blueprint, request, jsonify, current_app, send_file, abort
+from pathlib import Path
+
+from app.utils.settings import get_settings
+from app.utils.external_fetcher import ExternalScriptFetcher
+from werkzeug.exceptions import HTTPException
+
+api_bp = Blueprint("api", __name__, url_prefix="/api")
+
+
+def _resolve_results_path(job_id: str) -> str:
+    """
+    Compute the absolute results directory for a given job UUID.
+    Prefers <BASE>/artifacts/<uuid>, falls back to <BASE>/<uuid>.
+    """
+    base_dir = "/data"
+
+    candidate_with_artifacts = os.path.join(base_dir, "artifacts", job_id)
+    if os.path.isdir(candidate_with_artifacts):
+        return candidate_with_artifacts
+
+    fallback = os.path.join(base_dir, job_id)
+    os.makedirs(fallback, exist_ok=True)
+    return fallback
+
+
+def _make_snippet(text: str, max_chars: int = 1200) -> str:
+    """Produce a trimmed, safe-to-render snippet of the script contents."""
+    if not text:
+        return ""
+    snippet = text.strip()
+    return (snippet[:max_chars] + "…") if len(snippet) > max_chars else snippet
+
+@api_bp.errorhandler(400)
+@api_bp.errorhandler(403)
+@api_bp.errorhandler(404)
+@api_bp.errorhandler(405)
+def _api_err(err):
+    """
+    Return JSON for common client errors.
+    """
+    if isinstance(err, HTTPException):
+        code = err.code
+        name = (err.name or "error").lower()
+    else:
+        code = 400
+        name = "error"
+    return jsonify({"ok": False, "error": name}), code
+
+
+@api_bp.errorhandler(500)
+def _api_500(err):
+    """
+    Return JSON for server errors and log the exception.
+    """
+    try:
+        current_app.logger.exception("API 500")
+    except Exception:
+        pass
+    return jsonify({"ok": False, "error": "internal server error"}), 500
+
+
+@api_bp.post("/analyze_script")
+def analyze_script():
+    """
+    Analyze EXACTLY one external script URL for a given job UUID.
+
+    Expected JSON body:
+      { "job_id": "<uuid>", "url": "https://cdn.example.com/app.js", "category": "script" }
+    """
+    body = request.get_json(silent=True) or {}
+
+    job_id_raw = body.get("job_id") or body.get("uuid")
+    script_url_raw = body.get("url")
+    category = (body.get("category") or "script").strip() or "script"  # default to "script"
+
+    job_id = (job_id_raw or "").strip() if isinstance(job_id_raw, str) else ""
+    script_url = (script_url_raw or "").strip() if isinstance(script_url_raw, str) else ""
+
+    # log this request
+    current_app.logger.info(f"Got request to analyze {script_url} via API")
+
+    if not job_id or not script_url:
+        return jsonify({"ok": False, "error": "Missing job_id (or uuid) or url"}), 400
+
+    settings = get_settings()
+
+    if not settings.external_fetch.enabled:
+        return jsonify({"ok": False, "error": "Feature disabled"}), 400
+
+    # Resolve the UUID-backed results directory for this run.
+    results_path = _resolve_results_path(job_id)
+
+    # Initialize the fetcher; it reads its own settings internally.
+    fetcher = ExternalScriptFetcher(results_path=results_path)
+
+    # Unique index for the saved file name: <results_path>/scripts/fetched/<index>.js
+    unique_index = int(time.time() * 1000)
+
+    outcome = fetcher.fetch_one(script_url=script_url, index=unique_index)
+    if not outcome.ok or not outcome.saved_path:
+        return jsonify({
+            "ok": False,
+            "error": outcome.reason,
+            "status_code": outcome.status_code,
+            "final_url": outcome.final_url
+        }), 502
+
+    # Read bytes and decode to UTF-8 for rules and snippet
+    try:
+        with open(outcome.saved_path, "rb") as fh:
+            js_text = fh.read().decode("utf-8", errors="ignore")
+    except Exception:
+        js_text = ""
+
+    # Pull the rules engine from the app (prefer attribute, then config).
+    findings = []
+    try:
+        engine = getattr(current_app, "rule_engine", None)
+        if engine is None:
+            engine = current_app.config.get("RULE_ENGINE")
+    except Exception:
+        engine = None
+
+    if engine is not None and hasattr(engine, "run_all"):
+        try:
+            # run_all returns PASS/FAIL for each rule; we only surface FAIL (matched) to the UI
+            all_results = engine.run_all(js_text, category=category)
+            if isinstance(all_results, list):
+                matched = []
+                for r in all_results:
+                    try:
+                        if (r.get("result") == "FAIL"):
+                            matched.append({
+                                "name": r.get("name"),
+                                "description": r.get("description"),
+                                "severity": r.get("severity"),
+                                "tags": r.get("tags") or [],
+                                "reason": r.get("reason"),
+                                "category": r.get("category"),
+                            })
+                    except Exception:
+                        # Ignore malformed entries
+                        continue
+                findings = matched
+        except Exception as exc:
+            try:
+                current_app.logger.error("Rule engine error", extra={"error": str(exc)})
+            except Exception:
+                pass
+            findings = []
+
+    snippet = _make_snippet(js_text, max_chars=settings.ui.snippet_preview_len)
+
+    return jsonify({
+        "ok": True,
+        "final_url": outcome.final_url,
+        "status_code": outcome.status_code,
+        "bytes": outcome.bytes_fetched,
+        "truncated": outcome.truncated,
+        "sha256": outcome.sha256_hex,
+        "artifact_path": outcome.saved_path,
+        "findings": findings,  # only FAILed rules
+        "snippet": snippet,
+        "snippet_len": len(js_text)
+    })
+
+
+@api_bp.get("/artifacts/<run_uuid>/<filename>")
+def get_artifact_raw(run_uuid, filename):
+    # prevent path traversal
+    if "/" in filename or ".." in filename:
+        abort(400)
+
+    run_dir = _resolve_results_path(run_uuid)
+    full_path = Path(run_dir) / filename
+
+    # if file is not there, give a 404
+    if not os.path.isfile(full_path):
+        abort(404)
+
+    # else return file
+    return send_file(full_path, as_attachment=False)
@@ -1,3 +1,5 @@
+# app/blueprints/ui.py
+
 import os
 import json
 import asyncio
@@ -5,11 +7,10 @@ from pathlib import Path
 from datetime import datetime
 from flask import Blueprint, render_template, request, redirect, url_for, flash, current_app, send_file, abort
 
-# from .browser import fetch_page_artifacts
-from .utils.browser import get_browser
-from .utils.enrichment import enrich_url
-from .utils.settings import get_settings
-from .utils.io_helpers import get_recent_results
+from app.utils.browser import get_browser
+from app.utils.enrichment import enrich_url
+from app.utils.settings import get_settings
+from app.utils.io_helpers import get_recent_results
 
 bp = Blueprint("main", __name__)
 
@@ -34,9 +35,6 @@ def index():
     The number of recent runs is controlled via settings.cache.recent_runs_count (int).
     Falls back to 10 if not present or invalid.
     """
-    # Resolve SANDBOX_STORAGE from app config
-    storage = Path(current_app.config["SANDBOX_STORAGE"]).resolve()
-
     # Pull recent count from settings with a safe fallback
     try:
         # settings is already initialized at module import in your file
@@ -46,13 +44,15 @@ def index():
     except Exception:
         recent_count = 10
 
+    # Resolve SANDBOX_STORAGE from app config
+    storage = Path(current_app.config["SANDBOX_STORAGE"]).resolve()
+
     # Build the recent list (non-fatal if storage is empty or unreadable)
     recent_results = get_recent_results(storage, recent_count, current_app.logger)
 
     # Pass to template; your index.html will hide the card if list is empty
     return render_template("index.html", recent_results=recent_results)
 
-
 @bp.route("/analyze", methods=["POST"])
 def analyze():
     url = request.form.get("url", "").strip()
@@ -60,7 +60,7 @@ def analyze():
     if not url:
         flash("Please enter a URL.", "error")
         return redirect(url_for("main.index"))
 
     storage = Path(current_app.config["SANDBOX_STORAGE"]).resolve()
     storage.mkdir(parents=True, exist_ok=True)
 
@@ -87,6 +87,7 @@ def analyze():
 
 @bp.route("/results/<run_uuid>", methods=["GET"])
 def view_result(run_uuid: str):
+    # Resolve SANDBOX_STORAGE from app config
     storage = Path(current_app.config["SANDBOX_STORAGE"]).resolve()
     run_dir = storage / run_uuid
     results_path = run_dir / "results.json"
@@ -105,6 +106,7 @@ def view_result(run_uuid: str):
 
 @bp.route("/artifacts/<run_uuid>/<filename>", methods=["GET"])
 def artifacts(run_uuid: str, filename: str):
+    # Resolve SANDBOX_STORAGE from app config
     storage = Path(current_app.config["SANDBOX_STORAGE"]).resolve()
     run_dir = storage / run_uuid
     full_path = run_dir / filename
@@ -123,3 +125,11 @@ def artifacts(run_uuid: str, filename: str):
     return send_file(full_path)
 
 
+@bp.get("/view/artifact/<run_uuid>/<filename>")
+def view_artifact(run_uuid, filename):
+    # Build a safe raw URL that streams the file (you said you already have this route)
+    raw_url = url_for('api.get_artifact_raw', run_uuid=run_uuid, filename=filename)
+    # Optional: derive language server-side if you prefer
+    language = None  # e.g., 'javascript'
+    return render_template('viewer.html', filename=filename, raw_url=raw_url, language=language)
+
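
`view_artifact` leaves `language = None` and lets the viewer pick a language from the file extension. If the server-side option mentioned in the comment were preferred, it could look roughly like the sketch below. `guess_viewer_language` and the extension table are assumptions for illustration, and the language ids are assumed to be ones the editor accepts.

```python
from pathlib import PurePosixPath

# Extension -> editor language id (illustrative table, not taken from the commit).
_LANG_BY_EXT = {
    ".js": "javascript",
    ".mjs": "javascript",
    ".json": "json",
    ".html": "html",
    ".htm": "html",
    ".css": "css",
    ".txt": "plaintext",
}


def guess_viewer_language(filename: str, default: str = "plaintext") -> str:
    """Map a filename to a language id by its (case-insensitive) extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    return _LANG_BY_EXT.get(suffix, default)
```

Unknown or missing extensions fall back to `default`, so the template always receives a usable value.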
@@ -8,5 +8,11 @@ cache:
   whois_cache_days: 7
   geoip_cache_days: 7
 
+external_script_fetch:
+  enabled: True
+  max_total_mb: 5
+  max_time_ms: 3000
+  max_redirects: 3
+
 ui:
   snippet_preview_len: 300
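
The new `external_script_fetch` settings are expressed in megabytes and milliseconds. A fetcher typically needs bytes and seconds, so a thin typed wrapper can do the conversion once. This is a sketch under assumptions: `ExternalScriptFetchLimits` and `limits_from_mapping` are hypothetical names, and the real `get_settings()` loader may expose these values differently.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ExternalScriptFetchLimits:
    """Typed view of the external_script_fetch section (defaults mirror the YAML)."""
    enabled: bool = True
    max_total_mb: int = 5
    max_time_ms: int = 3000
    max_redirects: int = 3

    @property
    def max_total_bytes(self) -> int:
        return self.max_total_mb * 1024 * 1024

    @property
    def timeout_seconds(self) -> float:
        return self.max_time_ms / 1000.0


def limits_from_mapping(raw: Dict[str, Any]) -> ExternalScriptFetchLimits:
    """Build limits from a parsed settings mapping, ignoring unknown keys."""
    section = raw.get("external_script_fetch") or {}
    fields = ("enabled", "max_total_mb", "max_time_ms", "max_redirects")
    return ExternalScriptFetchLimits(**{k: section[k] for k in fields if k in section})
```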
@@ -22,6 +22,7 @@ from __future__ import annotations
 from typing import Any, Dict, Optional
 from urllib.parse import urlparse
 
+_NOOP_ACTIONS = {"", "#", "javascript:void(0)", "javascript:void(0);"}
 
 # ---------------------------------------------------------------------------
 # Adapters
@@ -169,35 +170,48 @@ def script_third_party_host(facts: Dict[str, Any]):
 
 # ---------------- Form rules ----------------
 
-def form_submits_to_different_host(facts: Dict[str, Any]):
-    """Flags <form> actions that submit to a different hostname than the page."""
-    base_host = facts.get("base_hostname") or ""
-    action = facts.get("action") or ""
-    try:
-        action_host = urlparse(action).hostname
-        if action_host and base_host and action_host != base_host:
-            return True, "Form submits to a different host"
-    except Exception:
-        # Parsing failed; treat as no match rather than erroring out
-        pass
+def form_action_missing(facts: Dict[str, Any]):
+    """Flags <form> elements with no meaningful action attribute."""
+    action = (facts.get("action") or "").strip()
+    if action in _NOOP_ACTIONS:
+        return True, "Form has no action attribute (or uses a no-op action)"
     return False, None
 
 
 def form_http_on_https_page(facts: Dict[str, Any]):
     """Flags forms submitting over HTTP while the page was loaded over HTTPS."""
-    base_url = facts.get("base_url") or ""
-    action = facts.get("action") or ""
+    base_url = (facts.get("base_url") or "").strip()
+    action = (facts.get("action") or "").strip()
+
     try:
-        if urlparse(base_url).scheme == "https" and urlparse(action).scheme == "http":
-            return True, "Form submits over insecure HTTP"
+        base_scheme = (urlparse(base_url).scheme or "").lower()
+        parsed_act = urlparse(action)
+        act_scheme = (parsed_act.scheme or "").lower()
     except Exception:
-        pass
+        return False, None  # parsing trouble → don’t flag
+
+    # Only flag absolute http:// actions on https pages.
+    # Relative or schemeless ('//host/...') isn’t flagged here (it won’t be HTTP on an HTTPS page).
+    if base_scheme == "https" and act_scheme == "http":
+        return True, f"Submits over insecure HTTP (action={parsed_act.geturl()})"
     return False, None
 
 
-def form_action_missing(facts: Dict[str, Any]):
-    """Flags <form> elements with no action attribute."""
-    action = (facts.get("action") or "").strip()
-    if not action:
-        return True, "Form has no action attribute"
-    return False, None
+def form_submits_to_different_host(facts: Dict[str, Any]):
+    """Flags <form> actions that submit to a different hostname than the page."""
+    base_host = (facts.get("base_hostname") or "").strip().lower()
+    action = (facts.get("action") or "").strip()
+
+    if not action or action in _NOOP_ACTIONS:
+        return False, None
+
+    try:
+        parsed = urlparse(action)
+        act_host = (parsed.hostname or "").lower()
+    except Exception:
+        return False, None
+
+    # Only compare when the action specifies a host (absolute URL or schemeless //host/path).
+    if act_host and base_host and act_host != base_host:
+        return True, f"Submits to a different host ({act_host} vs {base_host})"
+    return False, None
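
Each form rule above now returns the same `(bool, Optional[str])` pair, which lets a caller treat every function rule uniformly. The sketch below shows one way such a caller might look. `run_function_rules` and `action_uses_http` are hypothetical and are not the project's rule engine; only the return contract is taken from the diff.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple
from urllib.parse import urlparse

RuleFn = Callable[[Dict[str, Any]], Tuple[bool, Optional[str]]]


def action_uses_http(facts: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
    """Illustrative rule following the (matched, reason) contract."""
    scheme = (urlparse((facts.get("action") or "").strip()).scheme or "").lower()
    if scheme == "http":
        return True, "Action uses plain HTTP"
    return False, None


def run_function_rules(rules: Dict[str, RuleFn], facts: Dict[str, Any]) -> List[Dict[str, str]]:
    """Call each rule with the same facts and collect only the matches."""
    matches: List[Dict[str, str]] = []
    for name, fn in rules.items():
        try:
            matched, reason = fn(facts)
        except Exception:
            continue  # a failing rule is treated as "no match"
        if matched:
            matches.append({"name": name, "reason": reason or ""})
    return matches
```

Because a raising rule is skipped rather than propagated, one faulty rule does not hide matches from the others.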
@@ -279,6 +279,7 @@ details ul, details p {
   }
 }
 
+/* SCRIPTS TABLE */
 .scripts-table td ul {
   margin: 0.25rem 0 0.25rem 1rem;
   padding-left: 1rem;
@@ -305,6 +306,59 @@ details ul, details p {
   white-space: nowrap;
 }
 
+
+/* lists & small text inside cells */
+.forms-table td ul {
+  margin: 0.25rem 0 0.25rem 1rem;
+  padding-left: 1rem;
+}
+.forms-table td small {
+  opacity: 0.85;
+}
+
+/* keep the table from exploding */
+.forms-table {
+  table-layout: fixed;
+  width: 100%;
+}
+
+/* columns: Action | Method | Inputs | Matches | Form Snippet */
+.forms-table th:nth-child(1) { width: 15rem; } /* Action */
+.forms-table th:nth-child(2) { width: 5rem; } /* Method */
+.forms-table th:nth-child(3) { width: 15rem; } /* Inputs */
+.forms-table th:nth-child(5) { width: 24rem; } /* Snippet */
+.forms-table th:nth-child(4) { width: auto; } /* Matches grows */
+
+/* ellipsize cells by default */
+.forms-table td,
+.forms-table th {
+  overflow: hidden;
+  text-overflow: ellipsis;
+  white-space: nowrap;
+}
+
+/* nicer wrapping inside snippet/details & input chips */
+.forms-table details { white-space: normal; }
+.forms-table details > pre.code {
+  white-space: pre-wrap; /* let long lines wrap */
+  max-height: 28rem;
+  overflow: auto;
+}
+.forms-table .chips {
+  display: flex;
+  gap: 0.25rem;
+  flex-wrap: wrap;
+  white-space: normal; /* allow chip text to wrap if needed */
+}
+
+/* (optional) responsive tweaks */
+@media (max-width: 1200px) {
+  .forms-table th:nth-child(1) { width: 22rem; }
+  .forms-table th:nth-child(3) { width: 16rem; }
+  .forms-table th:nth-child(5) { width: 18rem; }
+}
+
+
 /* let URLs/snippets wrap *inside* their cell when expanded content shows */
 .breakable {
   white-space: normal;
@@ -30,4 +30,7 @@
     <small>{{ app_name }} - A self-hosted URL analysis sandbox - {{ app_version }}</small>
   </footer>
 </body>
 </html>
+
+{% block page_js %}
+{% endblock %}
@@ -90,6 +90,9 @@
     100% { transform: rotate(360deg); }
   }
 </style>
+{% endblock %}
+
+{% block page_js %}
 
 <script>
 const form = document.getElementById('analyze-form');
@@ -79,21 +79,6 @@
     {% endfor %}
   {% endif %}
 
-  <!-- BEC Words -->
-  {% if enrichment.bec_words %}
-    <h3>BEC Words Detected</h3>
-    <table class="enrichment-table">
-      <thead>
-        <tr><th>Word</th></tr>
-      </thead>
-      <tbody>
-        {% for word in enrichment.bec_words %}
-          <tr><td>{{ word }}</td></tr>
-        {% endfor %}
-      </tbody>
-    </table>
-  {% endif %}
-
   {% if not enrichment.whois and not enrichment.raw_whois and not enrichment.geoip and not enrichment.bec_words %}
     <p>No enrichment data available.</p>
   {% endif %}
@@ -129,90 +114,131 @@
 
 <!-- Forms -->
 <div class="card" id="forms">
   <h2>Forms</h2>
-  {% if forms %}
-    {% for form in forms %}
-      <details class="card {% if form.flagged %}flagged{% endif %}" style="padding:0.5rem; margin-bottom:0.5rem;">
-        <summary>{{ form.status }} — Action: {{ form.action }} ({{ form.method | upper }})</summary>
-        <table class="enrichment-table">
-          <thead>
-            <tr>
-              <th>Input Name</th>
-              <th>Type</th>
-            </tr>
-          </thead>
-          <tbody>
-            {% for inp in form.inputs %}
-              <tr>
-                <td>{{ inp.name }}</td>
-                <td>{{ inp.type }}</td>
-              </tr>
+
+  {% if forms and forms|length > 0 %}
+    <table class="enrichment-table forms-table">
+      <thead>
+        <tr>
+          <th>Action</th>
+          <th>Method</th>
+          <th>Inputs</th>
+          <th>Matches (Rules)</th>
+          <th>Form Snippet</th>
+        </tr>
+      </thead>
+      <tbody>
+        {% for f in forms %}
+          <tr>
+            <!-- Action -->
+            <td class="breakable">
+              {% if f.action %}
+                {{ f.action[:25] }}{% if f.action|length > 25 %}…{% endif %}
+              {% else %}
+                <span class="text-dim">(no action)</span>
+              {% endif %}
+            </td>
+
+            <!-- Method -->
+            <td>{{ (f.method or 'get')|upper }}</td>
+
+            <!-- Inputs -->
+            <td>
+              {% if f.inputs and f.inputs|length > 0 %}
+                <div class="chips">
+                  {% for inp in f.inputs %}
+                    <span class="chip" title="{{ (inp.name or '') ~ ' : ' ~ (inp.type or 'text') }}">
+                      {{ inp.name or '(unnamed)' }}<small> : {{ (inp.type or 'text') }}</small>
+                    </span>
                   {% endfor %}
-          </tbody>
-        </table>
-        {% if form.flagged %}
-          <p><strong>Flag Reasons:</strong></p>
-          <ul>
-            {% for reason in form.flag_reasons %}
-              <li>{{ reason }}</li>
-            {% endfor %}
-          </ul>
-        {% endif %}
-      </details>
-    {% endfor %}
-  {% else %}
-    <p>No forms detected.</p>
-  {% endif %}
-  <p><a href="#top-jump-list">Back to top</a></p>
+                </div>
+              {% else %}
+                <span class="text-dim">None</span>
+              {% endif %}
+            </td>
+
+            <!-- Matches (Rules) -->
+            <td>
+              {% if f.rules and f.rules|length > 0 %}
+                <ul>
+                  {% for r in f.rules %}
+                    <li title="{{ r.description or '' }}">
+                      {{ r.name }}
+                      {% if r.severity %}
+                        <span class="badge sev-{{ r.severity|lower }}">{{ r.severity|title }}</span>
+                      {% endif %}
+                      {% if r.tags %}
+                        {% for t in r.tags %}
+                          <span class="chip" title="Tag: {{ t }}">{{ t }}</span>
+                        {% endfor %}
+                      {% endif %}
+                      {% if r.description %}
+                        <small> — {{ r.description }}</small>
+                      {% endif %}
+                    </li>
+                  {% endfor %}
+                </ul>
+              {% else %}
+                <span class="text-dim">N/A</span>
+              {% endif %}
+            </td>
+
+            <!-- Form Snippet -->
+            <td>
+              {% if f.content_snippet %}
+                <details>
+                  <summary>View snippet ({{ f.content_snippet|length }} chars)</summary>
+                  <pre class="code">{{ f.content_snippet }}</pre>
+                </details>
+              {% else %}
+                <span class="text-dim">N/A</span>
+              {% endif %}
+            </td>
+          </tr>
+        {% endfor %}
+      </tbody>
+    </table>
+  {% else %}
+    <p class="text-dim">No form issues detected.</p>
+  {% endif %}
+
+  <p><a href="#top-jump-list">Back to top</a></p>
 </div>
 
+
 <!-- Suspicious Scripts -->
 <div class="card" id="scripts">
   <h2>Suspicious Scripts</h2>
 
   {% if suspicious_scripts %}
     <table class="enrichment-table scripts-table">
       <thead>
         <tr>
           <th>Type</th>
           <th>Source URL</th>
-          <th>Content Snippet</th>
-          <th>Matches (Rules & Heuristics)</th>
+          <th>Matches (Rules & Heuristics)</th>
+          <th>Content Snippet</th>
         </tr>
       </thead>
       <tbody>
         {% for s in suspicious_scripts %}
           <tr>
             <!-- Type -->
             <td>{{ s.type or 'unknown' }}</td>
 
             <!-- Source URL -->
             <td class="breakable">
               {% if s.src %}
-                <a href="{{ s.src }}" target="_blank">{{ s.src[:50] }}</a>
-              {% else %}
-                N/A
-              {% endif %}
-            </td>
+                <a href="{{ s.src }}" target="_blank" rel="noopener">{{ s.src[:50] }}</a>
+              {% else %} N/A {% endif %}
+            </td>
 
-            <!-- Inline content snippet (collapsible) -->
-            <td>
-              {% if s.content_snippet %}
-                <details>
-                  <summary>View snippet ({{ s.content_snippet|length }} chars) </summary>
-                  <pre class="code">({{ s.content_snippet}}</pre>
-                </details>
-              {% else %}
-                N/A
-              {% endif %}
-            </td>
+            <!-- Matches (Rules & Heuristics) -->
+            <td data-role="matches-cell">
+              {% set has_rules = s.rules and s.rules|length > 0 %}
+              {% set has_heur = s.heuristics and s.heuristics|length > 0 %}
 
-            <!-- Rules & Heuristics -->
-            <td>
-              {% set has_rules = s.rules and s.rules|length > 0 %}
-              {% set has_heur = s.heuristics and s.heuristics|length > 0 %}
-
-              {% if has_rules %}
+              {% if has_rules %}
                 <strong>Rules</strong>
                 <ul>
                   {% for r in s.rules %}
@@ -234,23 +260,45 @@
                 </ul>
               {% endif %}
 
               {% if has_heur %}
                 <strong>Heuristics</strong>
                 <ul>
                   {% for h in s.heuristics %}
                     <li>{{ h }}</li>
                   {% endfor %}
                 </ul>
               {% endif %}
 
               {% if not has_rules and not has_heur %}
-                N/A
+                <span class="text-dim">N/A</span>
+              {% endif %}
+            </td>
+
+            <!-- Content Snippet (reused for Analyze button / dynamic snippet) -->
+            <td data-role="snippet-cell">
+              {% if s.content_snippet %}
+                <details>
+                  <summary>View snippet ({{ s.content_snippet|length }} chars)</summary>
+                  <pre class="code">{{ s.content_snippet }}</pre>
+                </details>
+              {% else %}
+                {% if s.type == 'external' and s.src %}
+                  <button
+                    type="button"
+                    class="btn btn-sm btn-primary btn-analyze-snippet"
+                    data-url="{{ s.src }}"
+                    data-job="{{ uuid }}">Analyze external script</button>
+                {% else %}
+                  <span class="text-dim">N/A</span>
                 {% endif %}
-            </td>
-          </tr>
-        {% endfor %}
+              {% endif %}
+            </td>
+          </tr>
+        {% endfor %}
       </tbody>
     </table>
+
+
   {% else %}
     <p>No suspicious scripts detected.</p>
   {% endif %}
@@ -269,8 +317,154 @@
|
|||||||
<!-- Source -->
|
<!-- Source -->
|
||||||
<div class="card" id="source">
|
<div class="card" id="source">
|
||||||
<h2>Source</h2>
|
<h2>Source</h2>
|
||||||
<p><a href="{{ url_for('main.artifacts', run_uuid=uuid, filename='source.txt') }}" target="_blank">View Source</a></p>
|
<p><a href="{{ url_for('main.view_artifact', run_uuid=uuid, filename='source.html') }}" target="_blank">View Source</a></p>
|
||||||
<p><a href="#top-jump-list">Back to top</a></p>
|
<p><a href="#top-jump-list">Back to top</a></p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
{% endblock %}
|
{% endblock %}
|
||||||
|
|
||||||
|
{% block page_js %}
|
||||||
|
<script>
|
||||||
|
/**
|
||||||
|
* From an absolute artifact path like:
|
||||||
|
* /data/<uuid>/scripts/fetched/0.js
|
||||||
|
* /data/<uuid>/1755803694244.js
|
||||||
|
* C:\data\<uuid>\1755803694244.js
|
||||||
|
* return { uuid, rel } where rel is the path segment(s) after the uuid.
|
||||||
|
*/
|
||||||
|
function parseArtifactPath(artifactPath) {
|
||||||
|
if (!artifactPath) return { uuid: null, rel: null };
|
||||||
|
const norm = String(artifactPath).replace(/\\/g, '/'); // windows -> posix
|
||||||
|
const re = /\/([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})\/(.+)$/;
|
||||||
|
const m = norm.match(re);
|
||||||
|
if (!m) return { uuid: null, rel: null };
|
||||||
|
return { uuid: m[1], rel: m[2] };
|
||||||
|
}
|
||||||
|
|
||||||
|
/** Build /view/artifact/<uuid>/<path:filename> */
|
||||||
|
function buildViewerUrlFromAbsPath(artifactPath) {
|
||||||
|
const { uuid, rel } = parseArtifactPath(artifactPath);
|
||||||
|
if (!uuid || !rel) return '#';
|
||||||
|
const encodedRel = rel.split('/').map(encodeURIComponent).join('/');
|
||||||
|
return `/view/artifact/${encodeURIComponent(uuid)}/${encodedRel}`;
|
||||||
|
}
|
||||||
|
|
||||||
|
document.addEventListener('click', function (e) {
|
||||||
|
const btn = e.target.closest('.btn-analyze-snippet');
|
||||||
|
if (!btn) return;
|
||||||
|
|
||||||
|
const row = btn.closest('tr');
|
||||||
|
const snippetCell = btn.closest('[data-role="snippet-cell"]') || btn.parentElement;
|
||||||
|
const matchesCell = row ? row.querySelector('[data-role="matches-cell"]') : null;
|
||||||
|
|
||||||
|
const url = btn.dataset.url;
|
||||||
|
const job = btn.dataset.job;
|
||||||
|
|
||||||
|
// Replace button with a lightweight loading text
|
||||||
|
const loading = document.createElement('span');
|
||||||
|
loading.className = 'text-dim';
|
||||||
|
loading.textContent = 'Analyzing…';
|
||||||
|
btn.replaceWith(loading);
|
||||||
|
|
||||||
|
fetch('/api/analyze_script', {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' }, // include CSRF header if applicable
|
||||||
|
body: JSON.stringify({ job_id: job, url: url})
|
||||||
|
})
|
||||||
|
.then(r => r.json())
|
||||||
|
.then(data => {
|
||||||
|
if (!data.ok) {
|
||||||
|
loading.textContent = 'Error: ' + (data.error || 'Unknown');
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
// --- Build the snippet details element ---
|
||||||
|
const snippetText = data.snippet || ''; // backend should return a preview
|
||||||
|
const snippetLen = data.snippet_len || snippetText.length;
|
||||||
|
|
||||||
|
// --- Artifact path and viewer URL ---
|
||||||
|
const filepath = data.artifact_path || ''; // e.g., "/data/3ec90584-076e-457c-924b-861be7e11a34/1755803694244.js"
|
||||||
|
const viewerUrl = buildViewerUrlFromAbsPath(filepath);
|
||||||
|
|
||||||
|
|
||||||
|
const details = document.createElement('details');
|
||||||
|
const summary = document.createElement('summary');
|
||||||
|
summary.textContent = 'View snippet (' + snippetLen + ' chars' + (data.truncated ? ', truncated' : '') + ', ' + data.bytes + ' bytes)'; // snippetLen falls back to the snippet's own length when snippet_len is absent
|
||||||
|
|
||||||
|
const pre = document.createElement('pre');
|
||||||
|
pre.className = 'code';
|
||||||
|
pre.textContent = snippetText; // textContent preserves literal code safely
|
||||||
|
|
||||||
|
// Assemble the details element (summary + code block)
|
||||||
|
details.appendChild(summary);
|
||||||
|
details.appendChild(pre);
|
||||||
|
|
||||||
|
const link = document.createElement('a');
|
||||||
|
link.href = viewerUrl;
|
||||||
|
link.target = '_blank';
|
||||||
|
link.rel = 'noopener';
|
||||||
|
link.textContent = 'open in viewer';
|
||||||
|
|
||||||
|
summary.appendChild(document.createElement('br')); // line break under the summary text
|
||||||
|
summary.appendChild(link);
|
||||||
|
|
||||||
|
// Replace "Analyzing…" with the new details block
|
||||||
|
loading.replaceWith(details);
|
||||||
|
|
||||||
|
// --- Update the Matches cell with rule findings ---
|
||||||
|
if (matchesCell) {
|
||||||
|
if (Array.isArray(data.findings) && data.findings.length) {
|
||||||
|
const frag = document.createDocumentFragment();
|
||||||
|
const strong = document.createElement('strong');
|
||||||
|
strong.textContent = 'Rules';
|
||||||
|
const ul = document.createElement('ul');
|
||||||
|
|
||||||
|
data.findings.forEach(function (f) {
|
||||||
|
const li = document.createElement('li');
|
||||||
|
const name = f.name || 'Rule';
|
||||||
|
const desc = f.description ? ' — ' + f.description : '';
|
||||||
|
li.textContent = name + desc;
|
||||||
|
|
||||||
|
// Optional badges for severity/tags if present
|
||||||
|
if (f.severity) {
|
||||||
|
const badge = document.createElement('span');
|
||||||
|
badge.className = 'badge sev-' + String(f.severity).toLowerCase();
|
||||||
|
badge.textContent = String(f.severity).charAt(0).toUpperCase() + String(f.severity).slice(1);
|
||||||
|
li.appendChild(document.createTextNode(' '));
|
||||||
|
li.appendChild(badge);
|
||||||
|
}
|
||||||
|
if (Array.isArray(f.tags)) {
|
||||||
|
f.tags.forEach(function (t) {
|
||||||
|
const chip = document.createElement('span');
|
||||||
|
chip.className = 'chip';
|
||||||
|
chip.title = 'Tag: ' + t;
|
||||||
|
chip.textContent = t;
|
||||||
|
li.appendChild(document.createTextNode(' '));
|
||||||
|
li.appendChild(chip);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
ul.appendChild(li);
|
||||||
|
});
|
||||||
|
|
||||||
|
frag.appendChild(strong);
|
||||||
|
frag.appendChild(ul);
|
||||||
|
|
||||||
|
// Replace placeholder N/A or existing heuristics-only content
|
||||||
|
matchesCell.innerHTML = '';
|
||||||
|
matchesCell.appendChild(frag);
|
||||||
|
} else {
|
||||||
|
matchesCell.innerHTML = '<span class="text-dim">No rule matches.</span>';
|
||||||
|
}
|
||||||
|
}
|
||||||
|
})
|
||||||
|
.catch(function (err) {
|
||||||
|
loading.textContent = 'Request failed: ' + err;
|
||||||
|
});
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
|
||||||
|
|
||||||
|
{% endblock %}
|
||||||
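The client-side helpers `parseArtifactPath` and `buildViewerUrlFromAbsPath` above can be mirrored in Python, for example to unit-test the path handling or to build the viewer URL on the server instead. The following is a minimal sketch, not part of this commit: the function names `parse_artifact_path` and `build_viewer_url` are hypothetical, and `urllib.parse.quote` only approximates `encodeURIComponent` (it also escapes `!'()*`).

```python
import re
from typing import Optional, Tuple
from urllib.parse import quote

# Same "UUID segment, then the rest of the path" pattern as the JavaScript regex.
_UUID_PATH_RE = re.compile(
    r"/([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})/(.+)$"
)


def parse_artifact_path(artifact_path: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (uuid, rel), where rel is the path after the UUID directory."""
    if not artifact_path:
        return (None, None)
    norm = str(artifact_path).replace("\\", "/")  # Windows -> POSIX separators
    match = _UUID_PATH_RE.search(norm)
    if match is None:
        return (None, None)
    return (match.group(1), match.group(2))


def build_viewer_url(artifact_path: Optional[str]) -> str:
    """Build /view/artifact/<uuid>/<rel>, or '#' when the path cannot be parsed."""
    run_uuid, rel = parse_artifact_path(artifact_path)
    if not run_uuid or not rel:
        return "#"
    # Encode each segment separately so the "/" separators survive.
    encoded_rel = "/".join(quote(segment, safe="") for segment in rel.split("/"))
    return f"/view/artifact/{quote(run_uuid, safe='')}/{encoded_rel}"
```

Returning `#` for unparseable paths matches the JavaScript fallback, so a missing or malformed `artifact_path` yields an inert link rather than a broken route.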
111
app/templates/viewer.html
Normal file
@@ -0,0 +1,111 @@
|
|||||||
|
{% extends "base.html" %}
|
||||||
|
{% block content %}
|
||||||
|
<div style="max-width:1100px;margin:0 auto;padding:1rem 1.25rem;">
|
||||||
|
<header style="display:flex;align-items:center;justify-content:space-between;gap:1rem;flex-wrap:wrap;">
|
||||||
|
<div>
|
||||||
|
<h2 style="margin:0;font-size:1.1rem;">Code Viewer</h2>
|
||||||
|
<div class="text-dim" style="font-size:0.9rem;">
|
||||||
|
<strong>File:</strong> <span id="fileName">{{ filename }}</span>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div style="display:flex;gap:.5rem;align-items:center;">
|
||||||
|
<button id="copyBtn" class="btn btn-sm">Copy</button>
|
||||||
|
<button id="wrapBtn" class="btn btn-sm">Toggle wrap</button>
|
||||||
|
<a id="openRaw" class="btn btn-sm" href="{{ raw_url }}" target="_blank" rel="noopener">Open raw</a>
|
||||||
|
<a id="downloadRaw" class="btn btn-sm" href="{{ raw_url }}" download>Download</a>
|
||||||
|
</div>
|
||||||
|
</header>
|
||||||
|
|
||||||
|
<div id="viewerStatus" class="text-dim" style="margin:.5rem 0 .75rem;"></div>
|
||||||
|
<div id="editor" style="height:72vh;border:1px solid #1f2a36;border-radius:8px;"></div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Monaco AMD loader (no integrity to avoid mismatch) -->
|
||||||
|
<script src="https://cdnjs.cloudflare.com/ajax/libs/monaco-editor/0.49.0/min/vs/loader.min.js"
|
||||||
|
crossorigin="anonymous" referrerpolicy="no-referrer"></script>
|
||||||
|
|
||||||
|
<script>
|
||||||
|
(function () {
|
||||||
|
const RAW_URL = {{ raw_url|tojson }};
|
||||||
|
const FILENAME = {{ filename|tojson }};
|
||||||
|
const LANGUAGE = {{ language|default('', true)|tojson }};
|
||||||
|
|
||||||
|
const statusEl = document.getElementById('viewerStatus');
|
||||||
|
|
||||||
|
function extToLang(name) {
|
||||||
|
if (!name) return 'plaintext';
|
||||||
|
const m = name.toLowerCase().match(/\.([a-z0-9]+)$/);
|
||||||
|
const ext = m ? m[1] : '';
|
||||||
|
const map = {js:'javascript',mjs:'javascript',cjs:'javascript',ts:'typescript',json:'json',
|
||||||
|
html:'html',htm:'html',css:'css',py:'python',sh:'shell',bash:'shell',
|
||||||
|
yml:'yaml',yaml:'yaml',md:'markdown',txt:'plaintext',log:'plaintext'};
|
||||||
|
return map[ext] || 'plaintext';
|
||||||
|
}
|
||||||
|
|
||||||
|
// Wait until the AMD loader has defined window.require
|
||||||
|
function waitForRequire(msLeft = 5000) {
|
||||||
|
return new Promise((resolve, reject) => {
|
||||||
|
const t0 = performance.now();
|
||||||
|
(function poll() {
|
||||||
|
if (window.require && typeof window.require === 'function') return resolve();
|
||||||
|
if (performance.now() - t0 > msLeft) return reject(new Error('Monaco loader not available'));
|
||||||
|
setTimeout(poll, 25);
|
||||||
|
})();
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
function configureMonaco() {
|
||||||
|
// Point AMD loader at the CDN
|
||||||
|
require.config({ paths: { 'vs': 'https://cdnjs.cloudflare.com/ajax/libs/monaco-editor/0.49.0/min/vs' } });
|
||||||
|
// Worker bootstrap
|
||||||
|
window.MonacoEnvironment = {
|
||||||
|
getWorkerUrl: function () {
|
||||||
|
const base = 'https://cdnjs.cloudflare.com/ajax/libs/monaco-editor/0.49.0/min/';
|
||||||
|
const code = "self.MonacoEnvironment={baseUrl:'" + base + "'};importScripts('" + base + "vs/base/worker/workerMain.js');";
|
||||||
|
return 'data:text/javascript;charset=utf-8,' + encodeURIComponent(code);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
async function main() {
|
||||||
|
try {
|
||||||
|
statusEl.textContent = 'Loading file…';
|
||||||
|
await waitForRequire();
|
||||||
|
configureMonaco();
|
||||||
|
|
||||||
|
const resp = await fetch(RAW_URL, { cache: 'no-store' });
|
||||||
|
const text = await resp.text();
|
||||||
|
|
||||||
|
require(['vs/editor/editor.main'], function () {
|
||||||
|
const editor = monaco.editor.create(document.getElementById('editor'), {
|
||||||
|
value: text,
|
||||||
|
language: LANGUAGE || extToLang(FILENAME),
|
||||||
|
readOnly: true,
|
||||||
|
automaticLayout: true,
|
||||||
|
wordWrap: 'on',
|
||||||
|
minimap: { enabled: false },
|
||||||
|
scrollBeyondLastLine: false,
|
||||||
|
theme: 'vs-dark'
|
||||||
|
});
|
||||||
|
|
||||||
|
// Buttons
|
||||||
|
document.getElementById('copyBtn')?.addEventListener('click', async () => {
|
||||||
|
try { await navigator.clipboard.writeText(editor.getValue()); statusEl.textContent = 'Copied.'; }
|
||||||
|
catch (e) { statusEl.textContent = 'Copy failed: ' + e; }
|
||||||
|
});
|
||||||
|
document.getElementById('wrapBtn')?.addEventListener('click', () => {
|
||||||
|
const opts = editor.getRawOptions();
|
||||||
|
editor.updateOptions({ wordWrap: opts.wordWrap === 'on' ? 'off' : 'on' });
|
||||||
|
});
|
||||||
|
|
||||||
|
statusEl.textContent = (resp.ok ? '' : `Warning: HTTP ${resp.status}`) + (text.length ? '' : ' (empty file)');
|
||||||
|
});
|
||||||
|
} catch (err) {
|
||||||
|
statusEl.textContent = 'Viewer error: ' + err.message;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
main();
|
||||||
|
})();
|
||||||
|
</script>
|
||||||
|
{% endblock %}
|
||||||
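The viewer template accepts an optional `language` value and falls back to the client-side `extToLang` lookup when it is empty. If the route wanted to supply `language` itself, the same extension table could live in Python. This is a sketch under that assumption only: the helper name `ext_to_language` is hypothetical, and the diff does not show whether the route passes `language` at all.

```python
import re

# Mirrors the extension map in viewer.html; values are Monaco language ids.
_EXT_TO_LANGUAGE = {
    "js": "javascript", "mjs": "javascript", "cjs": "javascript",
    "ts": "typescript", "json": "json",
    "html": "html", "htm": "html", "css": "css",
    "py": "python", "sh": "shell", "bash": "shell",
    "yml": "yaml", "yaml": "yaml", "md": "markdown",
    "txt": "plaintext", "log": "plaintext",
}

# Trailing ".ext" made of lowercase letters and digits, as in the template's regex.
_EXT_RE = re.compile(r"\.([a-z0-9]+)\Z")


def ext_to_language(filename: str) -> str:
    """Map a filename to a Monaco language id, defaulting to 'plaintext'."""
    if not filename:
        return "plaintext"
    match = _EXT_RE.search(filename.lower())
    ext = match.group(1) if match else ""
    return _EXT_TO_LANGUAGE.get(ext, "plaintext")
```

Keeping the fallback at `plaintext` means unknown or extensionless artifacts still open in the viewer, just without highlighting.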
@@ -33,7 +33,7 @@ from flask import current_app
|
|||||||
from playwright.async_api import async_playwright, TimeoutError as PWTimeoutError
|
from playwright.async_api import async_playwright, TimeoutError as PWTimeoutError
|
||||||
|
|
||||||
from app.utils.io_helpers import safe_write
|
from app.utils.io_helpers import safe_write
|
||||||
from app.enrichment import enrich_url
|
from app.utils.enrichment import enrich_url
|
||||||
from app.utils.settings import get_settings
|
from app.utils.settings import get_settings
|
||||||
|
|
||||||
# Load settings once for constants / defaults
|
# Load settings once for constants / defaults
|
||||||
@@ -202,85 +202,111 @@ class Browser:
|
|||||||
# -----------------------------------------------------------------------
|
# -----------------------------------------------------------------------
|
||||||
# Form & Script analysis (plumbing only; detection is in the rules engine)
|
# Form & Script analysis (plumbing only; detection is in the rules engine)
|
||||||
# -----------------------------------------------------------------------
|
# -----------------------------------------------------------------------
|
||||||
def analyze_forms(self, html: str, base_url: str) -> List[Dict[str, Any]]:
|
def analyze_forms(self, html: str, base_url: str = "") -> List[Dict[str, Any]]:
|
||||||
"""
|
"""
|
||||||
Parse forms from the page HTML and apply rule-based checks (engine), keeping
|
Collect form artifacts and evaluate per-form matches via the rules engine.
|
||||||
only simple plumbing heuristics here (no security logic).
|
Only include rows that matched at least one rule.
|
||||||
|
|
||||||
Returns list of dicts with keys:
|
Returns list of dicts with keys (per matched form):
|
||||||
- action, method, inputs
|
- type: "form"
|
||||||
- flagged (bool), flag_reasons (list[str]), status (str)
|
- action, method, inputs
|
||||||
- rule_checks: {'checks': [...], 'summary': {...}} (per-form snippet evaluation)
|
- content_snippet: str
|
||||||
|
- rules: List[{name, description, severity?, tags?}]
|
||||||
"""
|
"""
|
||||||
soup = BeautifulSoup(html, "lxml")
|
soup = BeautifulSoup(html, "lxml")
|
||||||
forms_info: List[Dict[str, Any]] = []
|
results: List[Dict[str, Any]] = []
|
||||||
page_hostname = urlparse(base_url).hostname
|
|
||||||
|
engine = self._get_rule_engine()
|
||||||
|
base_hostname = urlparse(base_url).hostname or ""
|
||||||
|
# Use the same preview length as the scripts analysis
|
||||||
|
try:
|
||||||
|
preview_len = getattr(settings.ui, "snippet_preview_len", 200) # keep parity with scripts
|
||||||
|
except Exception:
|
||||||
|
preview_len = 200
|
||||||
|
|
||||||
for form in soup.find_all("form"):
|
for form in soup.find_all("form"):
|
||||||
action = form.get("action")
|
try:
|
||||||
method = form.get("method", "get").lower()
|
action = (form.get("action") or "").strip()
|
||||||
|
method = (form.get("method") or "get").strip().lower()
|
||||||
|
|
||||||
inputs: List[Dict[str, Any]] = []
|
inputs: List[Dict[str, Any]] = []
|
||||||
for inp in form.find_all("input"):
|
for inp in form.find_all("input"):
|
||||||
input_name = inp.get("name")
|
inputs.append({
|
||||||
input_type = inp.get("type", "text")
|
"name": inp.get("name"),
|
||||||
inputs.append({"name": input_name, "type": input_type})
|
"type": (inp.get("type") or "text").strip().lower(),
|
||||||
|
})
|
||||||
|
|
||||||
flagged_reasons: List[str] = []
|
# Use the actual form markup for regex rules
|
||||||
|
form_markup = str(form)
|
||||||
|
# UI-friendly snippet
|
||||||
|
content_snippet = form_markup[:preview_len]
|
||||||
|
|
||||||
if not action or str(action).strip() == "":
|
matches: List[Dict[str, Any]] = []
|
||||||
flagged_reasons.append("No action specified")
|
if engine is not None:
|
||||||
else:
|
for r in getattr(engine, "rules", []):
|
||||||
|
if getattr(r, "category", None) != "form":
|
||||||
|
continue
|
||||||
|
rtype = getattr(r, "rule_type", None)
|
||||||
|
|
||||||
|
try:
|
||||||
|
ok = False
|
||||||
|
reason = ""
|
||||||
|
if rtype == "regex":
|
||||||
|
# Run against the raw form HTML
|
||||||
|
ok, reason = r.run(form_markup)
|
||||||
|
elif rtype == "function":
|
||||||
|
# Structured facts for function-style rules
|
||||||
|
facts = {
|
||||||
|
"category": "form",
|
||||||
|
"base_url": base_url,
|
||||||
|
"base_hostname": base_hostname,
|
||||||
|
"action": action,
|
||||||
|
"action_hostname": urlparse(action).hostname or "",
|
||||||
|
"method": method,
|
||||||
|
"inputs": inputs,
|
||||||
|
"markup": form_markup,
|
||||||
|
}
|
||||||
|
ok, reason = r.run(facts)
|
||||||
|
else:
|
||||||
|
continue
|
||||||
|
|
||||||
|
if ok:
|
||||||
|
matches.append({
|
||||||
|
"name": getattr(r, "name", "unknown_rule"),
|
||||||
|
"description": (reason or "") or getattr(r, "description", ""),
|
||||||
|
"severity": getattr(r, "severity", None),
|
||||||
|
"tags": getattr(r, "tags", None),
|
||||||
|
})
|
||||||
|
except Exception as rule_exc:
|
||||||
|
# Be defensive—bad rule shouldn't break the form pass
|
||||||
|
try:
|
||||||
|
self.logger.debug("Form rule error", extra={"rule": getattr(r, "name", "?"), "error": str(rule_exc)})
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
continue
|
||||||
|
|
||||||
|
if matches:
|
||||||
|
results.append({
|
||||||
|
"type": "form",
|
||||||
|
"action": action,
|
||||||
|
"method": method,
|
||||||
|
"inputs": inputs,
|
||||||
|
"content_snippet": content_snippet,
|
||||||
|
"rules": matches,
|
||||||
|
})
|
||||||
|
|
||||||
|
except Exception as exc:
|
||||||
|
# Keep analysis resilient
|
||||||
try:
|
try:
|
||||||
action_host = urlparse(action).hostname
|
self.logger.error("Form analysis error", extra={"error": str(exc)})
|
||||||
if not str(action).startswith("/") and action_host != page_hostname:
|
|
||||||
flagged_reasons.append("Submits to a different host")
|
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
|
results.append({
|
||||||
|
"type": "form",
|
||||||
|
"heuristics": [f"Form analysis error: {exc}"],
|
||||||
|
})
|
||||||
|
|
||||||
try:
|
return results
|
||||||
if urlparse(action).scheme == "http" and urlparse(base_url).scheme == "https":
|
|
||||||
flagged_reasons.append("Submits over insecure HTTP")
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
|
|
||||||
for hidden in form.find_all("input", type="hidden"):
|
|
||||||
name_value = hidden.get("name") or ""
|
|
||||||
if "password" in name_value.lower():
|
|
||||||
flagged_reasons.append("Hidden password field")
|
|
||||||
|
|
||||||
flagged = bool(flagged_reasons)
|
|
||||||
|
|
||||||
# Serialize a simple form snippet for rule category='form'
|
|
||||||
snippet_lines = []
|
|
||||||
snippet_lines.append(f"base_url={base_url}")
|
|
||||||
snippet_lines.append(f"base_hostname={page_hostname}")
|
|
||||||
snippet_lines.append(f"action={action}")
|
|
||||||
snippet_lines.append(f"method={method}")
|
|
||||||
snippet_lines.append("inputs=")
|
|
||||||
|
|
||||||
i = 0
|
|
||||||
n = len(inputs)
|
|
||||||
while i < n:
|
|
||||||
item = inputs[i]
|
|
||||||
snippet_lines.append(f" - name={item.get('name')} type={item.get('type')}")
|
|
||||||
i = i + 1
|
|
||||||
form_snippet = "\n".join(snippet_lines)
|
|
||||||
|
|
||||||
# Per-form rule checks (PASS/FAIL list via engine)
|
|
||||||
rule_checks = self.run_rule_checks(form_snippet, category="form")
|
|
||||||
|
|
||||||
forms_info.append({
|
|
||||||
"action": action,
|
|
||||||
"method": method,
|
|
||||||
"inputs": inputs,
|
|
||||||
"flagged": flagged,
|
|
||||||
"flag_reasons": flagged_reasons,
|
|
||||||
"status": "flagged" if flagged else "possibly safe",
|
|
||||||
"rule_checks": rule_checks
|
|
||||||
})
|
|
||||||
|
|
||||||
return forms_info
|
|
||||||
|
|
||||||
def analyze_scripts(self, html: str, base_url: str = "") -> List[Dict[str, Any]]:
|
def analyze_scripts(self, html: str, base_url: str = "") -> List[Dict[str, Any]]:
|
||||||
"""
|
"""
|
||||||
@@ -370,7 +396,7 @@ class Browser:
|
|||||||
|
|
||||||
Writes:
|
Writes:
|
||||||
- /data/<uuid>/screenshot.png
|
- /data/<uuid>/screenshot.png
|
||||||
- /data/<uuid>/source.txt
|
- /data/<uuid>/source.html
|
||||||
- /data/<uuid>/results.json (single source of truth for routes)
|
- /data/<uuid>/results.json (single source of truth for routes)
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
@@ -381,7 +407,7 @@ class Browser:
|
|||||||
run_dir.mkdir(parents=True, exist_ok=True)
|
run_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
screenshot_path = run_dir / "screenshot.png"
|
screenshot_path = run_dir / "screenshot.png"
|
||||||
source_path = run_dir / "source.txt"
|
source_path = run_dir / "source.html"
|
||||||
results_path = run_dir / "results.json"
|
results_path = run_dir / "results.json"
|
||||||
|
|
||||||
redirects: List[Dict[str, Any]] = []
|
redirects: List[Dict[str, Any]] = []
|
||||||
|
|||||||
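The refactored `analyze_forms` above hands function-style rules a `facts` dict and expects the canonical `(bool, Optional[str])` return. As an illustration of that contract, here is a minimal sketch of what a rule such as `form_submits_to_different_host` might look like. Only the rule name and the `facts` keys come from this commit; the body is an assumption, not the project's implementation.

```python
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlparse


def form_submits_to_different_host(facts: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
    """Return (matched, reason) for a form whose action points at another host."""
    base_host = (facts.get("base_hostname") or "").strip().lower()
    action_host = (facts.get("action_hostname") or "").strip().lower()

    # Relative or empty actions carry no hostname, so they stay on the page's host.
    if not base_host or not action_host:
        return (False, None)

    if action_host != base_host:
        return (True, f"Form submits to a different host: {action_host} (page host: {base_host})")
    return (False, None)


# Example facts, built the same way analyze_forms builds them.
page_url = "https://example.com/login"
action = "https://collector.example.net/post"
facts = {
    "category": "form",
    "base_url": page_url,
    "base_hostname": urlparse(page_url).hostname or "",
    "action": action,
    "action_hostname": urlparse(action).hostname or "",
    "method": "post",
    "inputs": [{"name": "password", "type": "password"}],
    "markup": "<form action='https://collector.example.net/post' method='post'></form>",
}
matched, reason = form_submits_to_different_host(facts)
```

Because the rule returns a reason string on a match, `analyze_forms` can surface it as the `description` of the finding and fall back to the rule's static description otherwise.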
@@ -9,8 +9,8 @@ from ipaddress import ip_address
|
|||||||
import socket
|
import socket
|
||||||
|
|
||||||
# Local imports
|
# Local imports
|
||||||
from .utils.cache_db import get_cache
|
from app.utils.cache_db import get_cache
|
||||||
from .utils.settings import get_settings
|
from app.utils.settings import get_settings
|
||||||
|
|
||||||
# Configure logging
|
# Configure logging
|
||||||
logging.basicConfig(level=logging.INFO, format="[%(levelname)s] %(message)s")
|
logging.basicConfig(level=logging.INFO, format="[%(levelname)s] %(message)s")
|
||||||
@@ -39,9 +39,6 @@ def enrich_url(url: str) -> dict:
|
|||||||
# --- GeoIP ---
|
# --- GeoIP ---
|
||||||
result["geoip"] = enrich_geoip(hostname)
|
result["geoip"] = enrich_geoip(hostname)
|
||||||
|
|
||||||
# --- BEC Words ---
|
|
||||||
result["bec_words"] = [w for w in BEC_WORDS if w.lower() in url.lower()]
|
|
||||||
|
|
||||||
return result
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
|||||||
338
app/utils/external_fetcher.py
Normal file
@@ -0,0 +1,338 @@
|
|||||||
|
# sneakyscope/app/utils/external_fetcher.py
|
||||||
|
import hashlib
|
||||||
|
import os
|
||||||
|
import logging
|
||||||
|
from dataclasses import dataclass
|
||||||
|
from typing import Optional, Tuple, List
|
||||||
|
from urllib.parse import urljoin, urlparse
|
||||||
|
|
||||||
|
import requests
|
||||||
|
|
||||||
|
from app.utils.settings import get_settings
|
||||||
|
|
||||||
|
settings = get_settings()
|
||||||
|
|
||||||
|
_ALLOWED_SCHEMES = {"http", "https"}
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class FetchResult:
|
||||||
|
"""
|
||||||
|
Outcome for a single external script fetch.
|
||||||
|
"""
|
||||||
|
ok: bool
|
||||||
|
reason: str
|
||||||
|
source_url: str
|
||||||
|
final_url: str
|
||||||
|
status_code: Optional[int]
|
||||||
|
content_type: Optional[str]
|
||||||
|
bytes_fetched: int
|
||||||
|
truncated: bool
|
||||||
|
sha256_hex: Optional[str]
|
||||||
|
saved_path: Optional[str]
|
||||||
|
|
||||||
|
|
||||||
|
class ExternalScriptFetcher:
|
||||||
|
"""
|
||||||
|
Minimal, safe-by-default fetcher for external JS files.
|
||||||
|
|
||||||
|
Notes / assumptions:
|
||||||
|
- All artifacts for this run live under the UUID-backed `results_path` you pass in.
|
||||||
|
- Saves bytes to: <results_path>/<index>.js
|
||||||
|
- Manual redirects up to `max_redirects`.
|
||||||
|
- Streaming with a hard byte cap derived from `max_total_mb`.
|
||||||
|
- Never raises network exceptions to callers; failures are encoded in FetchResult.
|
||||||
|
- Settings are read from get_settings().external_fetch (enabled, max_total_mb, max_time_ms, max_redirects).
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, results_path: str, session: Optional[requests.Session] = None):
|
||||||
|
"""
|
||||||
|
Args:
|
||||||
|
results_path: Absolute path to the run's UUID directory (e.g., /data/<run_uuid>).
|
||||||
|
session: Optional requests.Session to reuse connections; a new one is created if not provided.
|
||||||
|
"""
|
||||||
|
# Derived value: MiB -> bytes
|
||||||
|
self.max_total_bytes: int = settings.external_fetch.max_total_mb * 1024 * 1024
|
||||||
|
|
||||||
|
# Logger
|
||||||
|
self.logger = logging.getLogger(__name__)  # module-qualified logger name rather than a file path
|
||||||
|
|
||||||
|
# Where to write artifacts for this job/run (UUID directory)
|
||||||
|
self.results_path = results_path
|
||||||
|
|
||||||
|
# HTTP session with a predictable UA
|
||||||
|
self.session = session or requests.Session()
|
||||||
|
self.session.headers.update({"User-Agent": "SneakyScope/1.0"})
|
||||||
|
|
||||||
|
# -------------------------
|
||||||
|
# Internal helper methods
|
||||||
|
# -------------------------
|
||||||
|
|
||||||
|
def _timeout(self) -> Tuple[float, float]:
|
||||||
|
"""
|
||||||
|
Compute (connect_timeout, read_timeout) in seconds from max_time_ms.
|
||||||
|
Keeps a conservative split so either phase gets a fair chance.
|
||||||
|
"""
|
||||||
|
total = max(0.1, settings.external_fetch.max_time_ms / 1000.0)
|
||||||
|
connect = min(1.5, total * 0.5) # cap connect timeout
|
||||||
|
read = max(0.5, total * 0.5) # floor read timeout
|
||||||
|
return (connect, read)
|
||||||
|
|
||||||
|
def _scheme_allowed(self, url: str) -> bool:
|
||||||
|
"""
|
||||||
|
Return True if URL uses an allowed scheme (http/https).
|
||||||
|
"""
|
||||||
|
scheme = (urlparse(url).scheme or "").lower()
|
||||||
|
return scheme in _ALLOWED_SCHEMES
|
||||||
|
|
||||||
|
def _artifact_path(self, index: int) -> str:
|
||||||
|
"""
|
||||||
|
Build an output path like:
|
||||||
|
<results_path>/<index>.js
|
||||||
|
|
||||||
|
Ensures the directory exists.
|
||||||
|
"""
|
||||||
|
base_dir = os.path.join(self.results_path)
|
||||||
|
# Make sure parent directories exist (idempotent)
|
||||||
|
os.makedirs(base_dir, exist_ok=True)
|
||||||
|
filename = f"{index}.js"
|
||||||
|
return os.path.join(base_dir, filename)
|
||||||
|
|
||||||
|
# -------------------------
|
||||||
|
# Public API
|
||||||
|
# -------------------------
|
||||||
|
|
||||||
|
def fetch_one(self, script_url: str, index: int) -> FetchResult:
|
||||||
|
"""
|
||||||
|
Fetch exactly one external script with manual redirect handling and a hard per-file byte cap.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
script_url: The script URL to retrieve.
|
||||||
|
index: Numeric index used solely for naming the artifact file (<index>.js).
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
FetchResult with status, metadata, and saved path (if successful).
|
||||||
|
"""
|
||||||
|
# Feature gate: allow callers to rely on a consistent failure when globally disabled.
|
||||||
|
if not settings.external_fetch.enabled:
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason="Feature disabled",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=script_url,
|
||||||
|
status_code=None,
|
||||||
|
content_type=None,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Scheme guard: refuse anything not http/https in this v1.
|
||||||
|
if not self._scheme_allowed(script_url):
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason="Scheme not allowed",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=script_url,
|
||||||
|
status_code=None,
|
||||||
|
content_type=None,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
current_url = script_url
|
||||||
|
status_code: Optional[int] = None
|
||||||
|
content_type: Optional[str] = None
|
||||||
|
redirects_followed = 0
|
||||||
|
|
||||||
|
# Manual redirect loop so we can enforce max_redirects precisely.
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
resp = self.session.get(
|
||||||
|
current_url,
|
||||||
|
stream=True,
|
||||||
|
allow_redirects=False,
|
||||||
|
timeout=self._timeout(),
|
||||||
|
)
|
||||||
|
except requests.exceptions.Timeout:
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason="Timeout",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=current_url,
|
||||||
|
status_code=status_code,
|
||||||
|
content_type=content_type,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
except requests.exceptions.RequestException as e:
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason=f"Network error: {e.__class__.__name__}",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=current_url,
|
||||||
|
status_code=status_code,
|
||||||
|
content_type=content_type,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
|
||||||
|
status_code = resp.status_code
|
||||||
|
content_type = resp.headers.get("Content-Type")
|
||||||
|
|
||||||
|
# Handle redirects explicitly (3xx with Location)
|
||||||
|
if status_code in (301, 302, 303, 307, 308) and "Location" in resp.headers:
|
||||||
|
if redirects_followed >= settings.external_fetch.max_redirects:
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason="Max redirects exceeded",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=current_url,
|
||||||
|
status_code=status_code,
|
||||||
|
content_type=content_type,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
next_url = urljoin(current_url, resp.headers["Location"])
|
||||||
|
if not self._scheme_allowed(next_url):
|
||||||
|
return FetchResult(
|
||||||
|
ok=False,
|
||||||
|
reason="Redirect to disallowed scheme",
|
||||||
|
source_url=script_url,
|
||||||
|
final_url=next_url,
|
||||||
|
status_code=status_code,
|
||||||
|
content_type=content_type,
|
||||||
|
bytes_fetched=0,
|
||||||
|
truncated=False,
|
||||||
|
sha256_hex=None,
|
||||||
|
saved_path=None,
|
||||||
|
)
|
||||||
|
current_url = next_url
|
||||||
|
redirects_followed += 1
|
||||||
|
resp.close()  # release the streamed connection, then loop to follow the next hop
|
||||||
|
continue
|
||||||
|
|
||||||
+            # Not a redirect: stream response body with a hard byte cap.
+            cap = self.max_total_bytes
+            total = 0
+            truncated = False
+            chunks: List[bytes] = []
+
+            try:
+                for chunk in resp.iter_content(chunk_size=8192):
+                    if not chunk:
+                        # Skip keep-alive chunks
+                        continue
+                    new_total = total + len(chunk)
+                    if new_total > cap:
+                        # Only keep what fits and stop
+                        remaining = cap - total
+                        if remaining > 0:
+                            chunks.append(chunk[:remaining])
+                            total += remaining
+                        truncated = True
+                        break
+                    chunks.append(chunk)
+                    total = new_total
+            except requests.exceptions.Timeout:
+                return FetchResult(
+                    ok=False,
+                    reason="Timeout while reading",
+                    source_url=script_url,
+                    final_url=current_url,
+                    status_code=status_code,
+                    content_type=content_type,
+                    bytes_fetched=total,
+                    truncated=truncated,
+                    sha256_hex=None,
+                    saved_path=None,
+                )
+            except requests.exceptions.RequestException as e:
+                return FetchResult(
+                    ok=False,
+                    reason=f"Network error while reading: {e.__class__.__name__}",
+                    source_url=script_url,
+                    final_url=current_url,
+                    status_code=status_code,
+                    content_type=content_type,
+                    bytes_fetched=total,
+                    truncated=truncated,
+                    sha256_hex=None,
+                    saved_path=None,
+                )
+
+            data = b"".join(chunks)
+            if not data:
+                return FetchResult(
+                    ok=False,
+                    reason="Empty response",
+                    source_url=script_url,
+                    final_url=current_url,
+                    status_code=status_code,
+                    content_type=content_type,
+                    bytes_fetched=0,
+                    truncated=False,
+                    sha256_hex=None,
+                    saved_path=None,
+                )
+
+            # Persist to <results_path>/scripts/fetched/<index>.js
+            out_path = self._artifact_path(index)
+            try:
+                with open(out_path, "wb") as f:
+                    f.write(data)
+            except OSError as e:
+                return FetchResult(
+                    ok=False,
+                    reason=f"Write error: {e.__class__.__name__}",
+                    source_url=script_url,
+                    final_url=current_url,
+                    status_code=status_code,
+                    content_type=content_type,
+                    bytes_fetched=total,
+                    truncated=truncated,
+                    sha256_hex=None,
+                    saved_path=None,
+                )
+
+            sha256_hex = hashlib.sha256(data).hexdigest()
+
+            # Structured log line for visibility/metrics
+            try:
+                self.logger.info(
+                    "External script fetched",
+                    extra={
+                        "source_url": script_url,
+                        "final_url": current_url,
+                        "status": status_code,
+                        "bytes": total,
+                        "truncated": truncated,
+                        "sha256": sha256_hex,
+                        "saved_path": out_path,
+                    },
+                )
+            except Exception:
+                # Logging should never break the pipeline
+                pass
+
+            return FetchResult(
+                ok=True,
+                reason="OK",
+                source_url=script_url,
+                final_url=current_url,
+                status_code=status_code,
+                content_type=content_type,
+                bytes_fetched=total,
+                truncated=truncated,
+                sha256_hex=sha256_hex,
+                saved_path=out_path,
+            )
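The byte-capped read in the fetcher above can be exercised in isolation. The following is a minimal sketch, not the project's code: `read_capped` is a hypothetical helper name, and it takes any iterable of byte chunks in place of `resp.iter_content(...)` so it can run without a network call.

```python
from typing import Iterable, List, Tuple


def read_capped(chunks: Iterable[bytes], cap: int) -> Tuple[bytes, bool]:
    """Join chunks up to `cap` bytes and report whether data was cut off.

    Hypothetical stand-alone version of the streaming loop shown in the diff.
    """
    kept: List[bytes] = []
    total = 0
    truncated = False
    for chunk in chunks:
        if not chunk:
            # Skip empty keep-alive chunks
            continue
        new_total = total + len(chunk)
        if new_total > cap:
            # Keep only the part that still fits, then stop reading
            remaining = cap - total
            if remaining > 0:
                kept.append(chunk[:remaining])
                total += remaining
            truncated = True
            break
        kept.append(chunk)
        total = new_total
    return b"".join(kept), truncated


if __name__ == "__main__":
    data, cut = read_capped([b"abcd", b"", b"efgh"], cap=6)
    print(data, cut)
```

A body that lands exactly on the cap is not marked truncated; the flag is set only when a further non-empty chunk arrives after the cap is reached.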
@@ -39,6 +39,14 @@ BASE_DIR = Path(__file__).resolve().parent.parent
 DEFAULT_SETTINGS_FILE = BASE_DIR / "config" / "settings.yaml"
 
 # ---------- CONFIG DATA CLASSES ----------
+@dataclass
+class External_FetchConfig:
+    enabled: bool = True
+    max_total_mb: int = 5
+    max_time_ms: int = 3000
+    max_redirects: int = 3
+    concurrency: int = 3
+
 @dataclass
 class UIConfig:
     snippet_preview_len: int = 160
@@ -61,6 +69,7 @@ class AppConfig:
 class Settings:
     cache: Cache_Config = field(default_factory=Cache_Config)
     ui: UIConfig = field(default_factory=UIConfig)
+    external_fetch: External_FetchConfig = field(default_factory=External_FetchConfig)
     app: AppConfig = field(default_factory=AppConfig)
 
     @classmethod
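For orientation, here is one way the new `External_FetchConfig` values might be turned into the limits the fetcher uses. This is a sketch under assumptions: `derive_limits` is a hypothetical helper, treating a megabyte as 1024 * 1024 bytes is a guess about how `max_total_bytes` is computed, and the seconds value is what a `requests` timeout would expect.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class External_FetchConfig:
    # Defaults copied from the diff above
    enabled: bool = True
    max_total_mb: int = 5
    max_time_ms: int = 3000
    max_redirects: int = 3
    concurrency: int = 3


def derive_limits(cfg: External_FetchConfig) -> Tuple[int, float]:
    """Hypothetical helper: convert config units to bytes and seconds."""
    max_total_bytes = cfg.max_total_mb * 1024 * 1024  # assumes binary megabytes
    timeout_seconds = cfg.max_time_ms / 1000.0
    return max_total_bytes, timeout_seconds


if __name__ == "__main__":
    print(derive_limits(External_FetchConfig()))
```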
@@ -1,23 +1,18 @@
 # SneakyScope — Roadmap (Updated 8-21-25)
 
 ## Priority 1 – Core Analysis / Stability
 
-* Opt-in “fetch external scripts” mode (off by default): on submission, download external script content (size/time limits) and run rules on fetched content.
-* Remove remaining legacy form “flagged\_reasons” plumbing once all equivalent function rules are in place.
-* Unit tests: YAML compilation, function-rule adapters, and per-script/per-form rule cases.
 * SSL/TLS intelligence: for HTTPS targets, pull certificate details from crt.sh (filtering expired); if a subdomain, also resolve the root domain to capture any wildcard certificates; probe the endpoint to enumerate supported TLS versions/ciphers and flag weak/legacy protocols.
 
 ## Priority 2 – API Layer
 
 * API endpoints: `/screenshot`, `/source`, `/analyse`.
-* OpenAPI spec: create `openapi/openapi.yaml` and serve at `/api/openapi.yaml`.
+* **OpenAPI**: add `POST /api/analyze_script` (request/response schemas, examples) to `openapi/openapi.yaml`; serve at `/api/openapi.yaml`.
 * Docs UI: Swagger UI or Redoc at `/docs`.
+* (Nice-to-have) API JSON error consistency: handlers for 400/403/404/405/500 that always return JSON.
 
 ## Priority 3 – UI / UX
 
 * Front page/input handling: auto-prepend `http://`/`https://`/`www.` for bare domains.
-* Source code viewer: embed page source in an editor view for readability.
-* Scripts table: toggle between “Only suspicious” and “All scripts”.
 * Rules Lab (WYSIWYG tester): paste a rule, validate/compile, run against sample text; lightweight nav entry.
 
 ## Priority 4 – Artifact Management & Ops
@@ -33,6 +28,6 @@
 * Domain reputation (local feeds): build and refresh a consolidated domain/URL reputation store from URLHaus database dump and OpenPhish community dataset (scheduled pulls with dedup/normalize).
 * Threat intel connectors (settings-driven): add `settings.yaml` entries for VirusTotal and ThreatFox API keys (plus future providers); when present, enrich lookups and merge results into the unified reputation checks during analysis.
 
-## Backlog / Far‑Off Plans
+## Backlog / Far-Off Plans
 
 * Server profile scan: run a lightweight nmap service/banner scan on common web/alt ports (80, 443, 8000, 8080, 8443, etc.) and SSH; combine with server headers to infer stack (e.g., IIS vs. Linux/\*nix).