Files
SneakyScope/docs/rules-rebuild-remember.md
Phillip Tarrant b59bf67329 added notes to settings.yaml
moved core app config (name, version) out of settings and into app/app_settings.py
added ability to brand SneakyScope to any name
added caching of cert information from crt.sh (cache enable and lenght is configurable in settings.yaml)

streamlined header/footer loading to be more correct
2025-08-23 20:37:44 -05:00

114 lines
4.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
Got it — heres a structured outline you can keep as a “memory primer” for when you want to resume this work (whether continuing the current engine or pivoting to a SaaS build). Think of it as a **checkpoint map** of what weve covered and what directions you flagged.
---
# SneakyScope Rules Engine Refactor & SaaS Considerations (Outline)
## 1. Current State
* Engine: `app/rules/rules_engine.py` — mixed YAML + code rules, `(bool, str)` return shape expected.
* Issue: function-based rules returning `(bool, dict)` → caused invalid type warnings.
* Stopgap: `FunctionRuleAdapter` converted returns.
* Desire: Remove YAML entirely; rules defined in code only.
---
## 2. Agreed Direction
* **Rule definition approach**:
* Option B chosen → decorator-based registration.
* Every rule defined in `app/rules/` as Python functions.
* Rules register with metadata (`name`, `category`, `severity`, etc.).
* **Return shape**:
* Always return a **Result dict** (no adapter needed).
* Engine enforces schema and fills in defaults.
* **Engine relocation**:
* Move to `app/utils/rules_engine/`.
* Responsibilities: load, validate, freeze registry, run rules, aggregate results, log/report.
---
## 3. Result Schema (concept)
* **Per RuleResult**
* Required: `ok: bool`, `message: str`.
* Identity: `name`, `category`, `severity`, `tags`, `rule_version`.
* Detail: `data: object|null`.
* Timing: `duration_ms`.
* Errors: structured `error` object if exceptions occur.
* Provenance: `source_module`, optional `policy` snapshot.
* **Per AnalysisResult (run-level envelope)**
* Input scope: target URL, category, content hash, facts profile.
* Provenance: run\_id, engine\_version, ruleset\_checksum, timestamp, duration.
* Results: array of RuleResults.
* Summary: counts by severity, match count, errors, first match, top severity.
* Artifacts: references (screenshot, DOM snapshot, etc.).
* Policy snapshot: optional central policy/overrides.
---
## 4. Operational Standards
* **Determinism**: identical inputs + ruleset\_checksum → identical results.
* **Message stability**: avoid wording churn; expand via `data`.
* **Size limits**: `message ≤ 256 chars`; `data ≤ 816 KB`.
* **Errors**: `ok=false` if error present; always emit `message`.
* **Severity**: rule sets default; policy may override.
* **Tags**: controlled vocabulary; additive.
---
## 5. Migration Plan
1. Create new `rules_engine` package in `app/utils/`.
2. Add decorator/registry for rules.
3. Port all rules from YAML → Python modules grouped by category.
4. Delete YAML loader + adapters.
5. Update call sites to build `facts` and call `engine.run(...)`.
6. Add CI tests:
* Schema compliance.
* No duplicates.
* Ruleset checksum snapshot.
7. Integration tests with real fixtures.
8. Benchmark & harden (caps on input size, rule runtime).
---
## 6. SaaS Expansion (future)
* **Multi-tenancy**: separate org/user scopes for data and rule runs.
* **RBAC**: roles (admin, analyst, viewer).
* **Compliance**: logging, retention, export, audit trails.
* **Rules**: centrally maintained, not user-editable.
* **APIs**: authenticated endpoints, per-user quotas.
* **Observability**: per-tenant metrics, alerts.
* **Security**: sandboxing, strict module allowlists, compliance with SOC2/ISO.
* **Data controls**: PII redaction, encryption, retention policies.
---
## 7. Future-Proofing Hooks
* Versioning: ruleset checksum + per-rule versions.
* Extensibility: support `actions`, `links`, `evidence` in Result.
* Policy: central config for thresholds/overrides.
* Hot reload (optional, dev-only).
* Rule provenance tracking (source\_module, commit SHA).
---
✅ This outline is enough to “re-hydrate” the context later — you wont need to dig back into old logs to remember why `(bool, str)` didnt fit, why YAML was removed, or what schema we were converging on.
---
Do you want me to also save this in a **short “README-spec” style** (like `RESULTS.md`) so it can live in your repo as the contract doc for rules, or should I keep this as just your personal checkpoint outline?