SneakyScope/app/config/suspicious_rules.yaml
Phillip Tarrant 55cd81aec0 feat(text): add text analysis pipeline & surface results in UI
- engine: add analyse_text() to extract visible page text and evaluate
  category="text" rules; collect matched phrases and expose as
  `content_snippet` (deduped, length-capped via settings.ui.snippet_preview_len).
- engine: remove unused code
- browser: remove duplicate enrichment call
- engine: improve regex compilation — honor per-rule flags (string or list)
  and default IGNORECASE when category=="text".
- engine: add dispatch logging "[engine] applying categories: …" gated by
  settings.app.print_rule_dispatch.
- ui(templates): add `templates/partials/result_text.html` mirroring the forms
  table; renders page-level records and their matched rules.
- ui(controller): wire `analyse_text()` into scan path and expose
  `payload["suspicious_text"]`.
- rules(text): add `identity_verification_prompt`, `gated_document_access`,
  `email_collection_prompt`; broaden `credential_reset`.

fix: text indicators were not displayed due to missing analyzer and mismatched result shape.

Result shape:
  suspicious_text: [
    {
      "type": "page",
      "content_snippet": "...matched phrases…",
      "rules": [
        {"name": "...", "description": "...", "severity": "medium", "tags": ["..."]}
      ]
    }
  ]
2025-08-22 17:18:50 -05:00
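
The commit message describes flag-aware regex compilation and a page-level `suspicious_text` record. The following is a minimal Python sketch of how those two pieces could fit together, not the project's actual engine code. The flag-name table, the `"; "` snippet separator, the case-insensitive dedupe, and the default `snippet_len=160` (a stand-in for `settings.ui.snippet_preview_len`) are all assumptions made for illustration.

```python
from __future__ import annotations

import re
from typing import Any

# Hypothetical mapping of flag names to re flags; the real engine may differ.
_FLAG_NAMES = {
    "i": re.IGNORECASE, "ignorecase": re.IGNORECASE,
    "m": re.MULTILINE, "multiline": re.MULTILINE,
    "s": re.DOTALL, "dotall": re.DOTALL,
}


def compile_rule(rule: dict[str, Any]) -> re.Pattern[str]:
    """Compile a rule's pattern, accepting 'flags' as a string or a list."""
    raw = rule.get("flags") or []
    names = [raw] if isinstance(raw, str) else list(raw)
    flags = 0
    for name in names:
        flags |= _FLAG_NAMES.get(str(name).lower(), 0)
    # Assumption: "default IGNORECASE" applies only when no flags are given.
    if not names and rule.get("category") == "text":
        flags |= re.IGNORECASE
    return re.compile(rule["pattern"], flags)


def analyse_text(
    visible_text: str, rules: list[dict[str, Any]], snippet_len: int = 160
) -> list[dict[str, Any]]:
    """Evaluate category="text" rules and build one page-level record."""
    matched_rules: list[dict[str, Any]] = []
    phrases: list[str] = []
    seen: set[str] = set()
    for rule in rules:
        if rule.get("category") != "text":
            continue
        hits = [m.group(0) for m in compile_rule(rule).finditer(visible_text)]
        if not hits:
            continue
        matched_rules.append({
            "name": rule["name"],
            "description": rule.get("description", ""),
            "severity": rule.get("severity"),
            "tags": rule.get("tags", []),
        })
        for hit in hits:
            key = hit.lower()  # dedupe matched phrases case-insensitively
            if key not in seen:
                seen.add(key)
                phrases.append(hit)
    if not matched_rules:
        return []
    snippet = "; ".join(phrases)[:snippet_len]  # length-capped preview
    return [{"type": "page", "content_snippet": snippet, "rules": matched_rules}]
```

Returning an empty list when nothing matches lets a template skip the text table entirely, which mirrors how the result shape is a list of records rather than a single object.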


# config/suspicious_rules.yaml
# Baseline suspicious rules for SneakyScope
# Organized by category: script, form, text
# Notes:
# - Engine honors per-rule 'flags' (string or list) and defaults to IGNORECASE for category "text".
# - 'severity' is optional: low | medium | high
# - 'tags' is optional: list of strings for grouping
# --- Script Rules ---
- name: eval_usage
  description: "Use of eval() in script"
  category: script
  type: regex
  pattern: '\beval\s*\('
  severity: high
  tags: [obfuscation, unsafe-eval]
- name: new_function_usage
  description: "Use of Function constructor (new Function)"
  category: script
  type: regex
  pattern: '\bnew\s+Function\s*\('
  severity: high
  tags: [obfuscation]
- name: document_write
  description: "Use of document.write (often abused in malicious injections)"
  category: script
  type: regex
  pattern: '\bdocument\s*\.\s*write\s*\('
  severity: medium
  tags: [injection, legacy-api]
- name: inline_event_handler
  description: "Inline event handler detected (onload, onclick, etc.)"
  category: script
  type: regex
  pattern: '\bon(load|click|error|mouseover|mouseenter|submit|keydown|keyup|change)\s*='
  severity: medium
  tags: [inline-handlers, potential-xss]
- name: obfuscated_encoding
  description: "Suspicious use of atob()/btoa() (base64 encode/decode)"
  category: script
  type: regex
  pattern: '\b(atob|btoa)\s*\('
  severity: medium
  tags: [encoding, obfuscation]
- name: unescape_usage
  description: "Use of unescape() (legacy/obfuscation)"
  category: script
  type: regex
  pattern: '\bunescape\s*\('
  severity: low
  tags: [legacy-api, obfuscation]
- name: string_timer_usage
  description: "String passed to setTimeout/setInterval (sink for XSS)"
  category: script
  type: regex
  pattern: '\bset(?:Timeout|Interval)\s*\(\s*[''"`].+[''"`]\s*,'
  severity: medium
  tags: [xss-sink]
- name: long_hex_constants
  description: "Long hex-like constants (possible obfuscation)"
  category: script
  type: regex
  pattern: '["'']?0x[0-9a-fA-F]{16,}["'']?'
  severity: low
  tags: [obfuscation]
# --- Form Rules ---
- name: suspicious_form_action_absolute
  description: "Form action uses absolute URL (potential credential exfiltration)"
  category: form
  type: regex
  pattern: '<form\b[^>]*\baction\s*=\s*[''"]https?://'
  severity: medium
  tags: [exfiltration, form]
- name: hidden_inputs
  description: "Form with hidden inputs (could be used to smuggle data)"
  category: form
  type: regex
  pattern: '<input\b[^>]*\btype\s*=\s*[''"]hidden[''"]'
  severity: low
  tags: [stealth, form]
- name: password_field
  description: "Form requests a password field"
  category: form
  type: regex
  pattern: '<input\b[^>]*\btype\s*=\s*[''"]password[''"]'
  severity: high
  tags: [credentials, form]
# --- Text Rules (Social Engineering / BEC / Lures) ---
- name: identity_verification_prompt
  description: "Prompts to verify identity/account/email, often gating access"
  category: text
  type: regex
  # e.g., "verify your identity", "confirm your email", "validate account"
  pattern: '\b(verify|confirm|validate)\s+(?:your\s+)?(identity|account|email)\b'
  flags: [i]
  severity: medium
  tags: [bec, verification, gating]
- name: gated_document_access
  description: "Language gating document access behind an action"
  category: text
  type: regex
  # e.g., "secure document", "confidential document", "access your document", "unlock document"
  pattern: '(secure|confidential)\s+document|access\s+(?:the|your)?\s*document|unlock\s+document'
  flags: [i]
  severity: medium
  tags: [lure, document]
- name: email_collection_prompt
  description: "Explicit prompt to enter/provide an email address to proceed"
  category: text
  type: regex
  # e.g., "enter your email address", "provide email", "use your email to continue"
  # "address" and its leading whitespace are optional together, so "provide email." also matches
  pattern: '\b(enter|provide|use)\s+(?:your\s+)?email(?:\s+address)?\b'
  flags: [i]
  severity: low
  tags: [data-collection, email]
- name: credential_reset
  description: "Password/credential reset or login-to-continue wording"
  category: text
  type: regex
  # includes: reset password, update credentials, log in to (verify|view|access), password expiry/expiration
  pattern: '\b(reset\s*password|update\s*credentials|log\s*in\s*to\s*(?:verify|view|access)|password\s*(?:expiry|expiration|expires))\b'
  flags: [i]
  severity: medium
  tags: [bec, credentials]
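- name: suspicious_iframe
  description: "Iframe tag present (possible phishing/malvertising/drive-by)"
  # NOTE: this pattern targets raw HTML markup, while category "text" rules are
  # evaluated against visible page text, so it may not fire under this category.
  category: text
  type: regex
  pattern: '<iframe\b[^>]*\bsrc\s*=\s*[''"][^''"]+[''"]'
  severity: medium
  tags: [iframe, phishing, malvertising]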
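
To sanity-check the text rules above before shipping a change, a small harness can run sample phrases through the patterns. The sketch below copies three of the patterns verbatim from this file and compiles them with `re.IGNORECASE`, matching their `flags: [i]`. The `TEXT_RULES` table and the `matching_rules()` helper are hypothetical names used only for this illustration and are not part of SneakyScope.

```python
import re

# Patterns copied from the text rules in this file (hypothetical test table).
TEXT_RULES = {
    "identity_verification_prompt":
        r"\b(verify|confirm|validate)\s+(?:your\s+)?(identity|account|email)\b",
    "gated_document_access":
        r"(secure|confidential)\s+document|access\s+(?:the|your)?\s*document|unlock\s+document",
    "credential_reset":
        r"\b(reset\s*password|update\s*credentials|log\s*in\s*to\s*(?:verify|view|access)"
        r"|password\s*(?:expiry|expiration|expires))\b",
}


def matching_rules(text: str) -> list[str]:
    """Return the sorted names of the text rules whose pattern matches `text`."""
    return sorted(
        name
        for name, pattern in TEXT_RULES.items()
        if re.search(pattern, text, re.IGNORECASE)
    )


if __name__ == "__main__":
    samples = [
        "Please confirm your email to continue",
        "Unlock document: log in to view the file",
        "Your password expires tomorrow",
        "Welcome to our store",
    ]
    for sample in samples:
        print(f"{sample!r} -> {matching_rules(sample)}")
```

Keeping benign phrases such as the last sample in the list is a cheap way to notice when a broadened pattern starts producing false positives.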