Moved core app config (name, version) out of settings and into app/app_settings.py.
Added the ability to brand SneakyScope to any name.
Added caching of cert information from crt.sh (cache enable and length are configurable in settings.yaml).
Streamlined header/footer loading to be more correct.
Got it — here’s a structured outline you can keep as a “memory primer” for when you want to resume this work (whether continuing the current engine or pivoting to a SaaS build). Think of it as a checkpoint map of what we’ve covered and what directions you flagged.
SneakyScope – Rules Engine Refactor & SaaS Considerations (Outline)
1. Current State
- Engine: app/rules/rules_engine.py, mixed YAML + code rules; (bool, str) return shape expected.
- Issue: function-based rules returning (bool, dict) → caused invalid type warnings.
- Stopgap: FunctionRuleAdapter converted returns.
- Desire: Remove YAML entirely; rules defined in code only.
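To make the mismatch concrete, here is a minimal sketch of a function-based rule that returns (bool, dict) and an adapter that coerces it to (bool, str). The rule name, the facts key, and the adapter body are all assumptions for illustration; the real FunctionRuleAdapter may behave differently.

```python
import json
from typing import Any, Callable, Dict, Tuple


def has_inline_script(facts: Dict[str, Any]) -> Tuple[bool, Dict[str, Any]]:
    """Hypothetical function-based rule that returns (bool, dict)."""
    count = facts.get("inline_script_count", 0)
    return count > 0, {"inline_script_count": count}


class FunctionRuleAdapter:
    """Illustrative stand-in only; the real adapter's behavior may differ."""

    def __init__(self, fn: Callable[[Dict[str, Any]], Tuple[bool, Any]]) -> None:
        self.fn = fn

    def __call__(self, facts: Dict[str, Any]) -> Tuple[bool, str]:
        ok, detail = self.fn(facts)
        if not isinstance(detail, str):
            # Flatten structured detail into a string so the engine's
            # (bool, str) check passes; the structure is lost downstream.
            detail = json.dumps(detail, sort_keys=True, default=str)
        return bool(ok), detail


adapted = FunctionRuleAdapter(has_inline_script)
print(adapted({"inline_script_count": 2}))
```

The flattening step is the weak point: structured detail survives only as a string, which is one reason a single Result dict is the cleaner target.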
2. Agreed Direction
- Rule definition approach:
  - Option B chosen → decorator-based registration.
  - Every rule defined in app/rules/ as Python functions.
  - Rules register with metadata (name, category, severity, etc.).
- Return shape:
  - Always return a Result dict (no adapter needed).
  - Engine enforces schema and fills in defaults.
- Engine relocation:
  - Move to app/utils/rules_engine/.
  - Responsibilities: load, validate, freeze registry, run rules, aggregate results, log/report.
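One way the decorator-based registration could look, as a rough sketch. RuleRegistry, RuleSpec, the default severity, and the example rule are hypothetical names chosen for illustration, not the agreed API.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping, Tuple

RuleFn = Callable[[Dict[str, Any]], Dict[str, Any]]


@dataclass(frozen=True)
class RuleSpec:
    """Metadata captured at registration time (hypothetical shape)."""
    name: str
    category: str
    severity: str
    tags: Tuple[str, ...]
    fn: RuleFn


class RuleRegistry:
    def __init__(self) -> None:
        self._rules: Dict[str, RuleSpec] = {}
        self._frozen = False

    def rule(self, name: str, category: str, severity: str = "info",
             tags: Tuple[str, ...] = ()) -> Callable[[RuleFn], RuleFn]:
        def decorator(fn: RuleFn) -> RuleFn:
            if self._frozen:
                raise RuntimeError("registry is frozen; cannot add " + name)
            if name in self._rules:
                raise ValueError("duplicate rule name: " + name)
            self._rules[name] = RuleSpec(name, category, severity, tuple(tags), fn)
            return fn  # the function itself is left unchanged
        return decorator

    def freeze(self) -> Mapping[str, RuleSpec]:
        """Stop further registration and return a read-only view."""
        self._frozen = True
        return MappingProxyType(dict(self._rules))


registry = RuleRegistry()


@registry.rule(name="form_posts_to_http", category="form",
               severity="high", tags=("transport",))
def form_posts_to_http(facts: Dict[str, Any]) -> Dict[str, Any]:
    insecure = str(facts.get("form_action", "")).startswith("http://")
    message = "form posts over plain HTTP" if insecure else "form action is not plain HTTP"
    return {"ok": insecure, "message": message}


rules = registry.freeze()
```

Freezing before the first run keeps the rule set fixed for the lifetime of the process, which supports the determinism goal; duplicate names fail at import time rather than at run time.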
3. Result Schema (concept)
- Per RuleResult
  - Required: ok: bool, message: str.
  - Identity: name, category, severity, tags, rule_version.
  - Detail: data: object|null.
  - Timing: duration_ms.
  - Errors: structured error object if exceptions occur.
  - Provenance: source_module, optional policy snapshot.
- Per AnalysisResult (run-level envelope)
  - Input scope: target URL, category, content hash, facts profile.
  - Provenance: run_id, engine_version, ruleset_checksum, timestamp, duration.
  - Results: array of RuleResults.
  - Summary: counts by severity, match count, errors, first match, top severity.
  - Artifacts: references (screenshot, DOM snapshot, etc.).
  - Policy snapshot: optional central policy/overrides.
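The per-rule schema and the run-level summary could be sketched with dataclasses as below. The default values, the severity ladder, and the reading of "match" as ok=True with no error are assumptions; the outline does not fix any of them.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# Assumed severity ladder, lowest to highest; the outline does not define one.
SEVERITY_ORDER = ("info", "low", "medium", "high", "critical")


@dataclass
class RuleResult:
    ok: bool                                   # required
    message: str                               # required
    name: str = ""
    category: str = ""
    severity: str = "info"
    tags: List[str] = field(default_factory=list)
    rule_version: str = "1"
    data: Optional[Dict[str, Any]] = None
    duration_ms: float = 0.0
    error: Optional[Dict[str, str]] = None
    source_module: str = ""


def summarize(results: List[RuleResult]) -> Dict[str, Any]:
    """Build the run-level summary; a 'match' is assumed to mean ok=True and no error."""
    matches = [r for r in results if r.ok and r.error is None]
    top = max((r.severity for r in matches), key=SEVERITY_ORDER.index, default=None)
    return {
        "counts_by_severity": dict(Counter(r.severity for r in matches)),
        "match_count": len(matches),
        "error_count": sum(1 for r in results if r.error is not None),
        "first_match": matches[0].name if matches else None,
        "top_severity": top,
    }


example = [
    RuleResult(ok=True, message="a", name="r1", severity="low"),
    RuleResult(ok=False, message="b", name="r2"),
    RuleResult(ok=True, message="c", name="r3", severity="high"),
    RuleResult(ok=False, message="rule failed", name="r4",
               error={"type": "ValueError", "detail": "boom"}),
]
summary = summarize(example)
```

A severity outside SEVERITY_ORDER would raise ValueError in this sketch, which is one place a controlled vocabulary check at registration would help.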
4. Operational Standards
- Determinism: identical inputs + ruleset_checksum → identical results.
- Message stability: avoid wording churn; expand via data.
- Size limits: message ≤ 256 chars; data ≤ 8–16 KB.
- Errors: ok=false if error present; always emit message.
- Severity: rule sets default; policy may override.
- Tags: controlled vocabulary; additive.
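The size and error standards above lend themselves to a small normalization step in the engine. This is a sketch under assumptions: the function name, the fallback message text, the 16 KB cap (the upper end of the stated range), and the truncation marker are all illustrative.

```python
import json
from typing import Any, Dict

MAX_MESSAGE_CHARS = 256
MAX_DATA_BYTES = 16 * 1024  # assumed: upper end of the 8-16 KB range


def enforce_limits(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a result dict that satisfies the operational standards."""
    out = dict(result)

    # An error always forces ok=False.
    if out.get("error") is not None:
        out["ok"] = False

    # Always emit a message, capped at the size limit.
    message = str(out.get("message") or "no message provided")
    out["message"] = message[:MAX_MESSAGE_CHARS]

    # Replace oversized data with a small marker instead of truncating JSON mid-value.
    data = out.get("data")
    if data is not None:
        encoded = json.dumps(data, sort_keys=True, default=str).encode("utf-8")
        if len(encoded) > MAX_DATA_BYTES:
            out["data"] = {"truncated": True, "original_bytes": len(encoded)}
    return out


checked = enforce_limits({
    "ok": True,
    "message": "x" * 300,
    "data": {"blob": "a" * 20000},
    "error": {"type": "ValueError"},
})
```

Serializing with sort_keys=True keeps the size check independent of dict ordering, in line with the determinism standard.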
5. Migration Plan
- Create new rules_engine package in app/utils/.
- Add decorator/registry for rules.
- Port all rules from YAML → Python modules grouped by category.
- Delete YAML loader + adapters.
- Update call sites to build facts and call engine.run(...).
- Add CI tests:
  - Schema compliance.
  - No duplicates.
  - Ruleset checksum snapshot.
- Integration tests with real fixtures.
- Benchmark & harden (caps on input size, rule runtime).
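For the call-site step, a minimal sketch of what engine.run(...) might do: iterate rules in a stable order, time each one, and convert exceptions into structured errors. The Engine class, its constructor, and the two sample rules are hypothetical; only the engine.run(...) call and the facts argument come from the plan above.

```python
import time
from typing import Any, Callable, Dict, List

RuleFn = Callable[[Dict[str, Any]], Dict[str, Any]]


class Engine:
    """Hypothetical engine shape; takes a name -> rule function mapping."""

    def __init__(self, rules: Dict[str, RuleFn]) -> None:
        self._rules = dict(rules)

    def run(self, facts: Dict[str, Any]) -> List[Dict[str, Any]]:
        results: List[Dict[str, Any]] = []
        for name in sorted(self._rules):  # stable order for determinism
            start = time.perf_counter()
            try:
                raw = self._rules[name](facts)
                result = {
                    "ok": bool(raw["ok"]),
                    "message": str(raw["message"]),
                    "data": raw.get("data"),
                    "error": None,
                }
            except Exception as exc:  # a failing rule must not abort the run
                result = {
                    "ok": False,
                    "message": "rule failed: " + type(exc).__name__,
                    "data": None,
                    "error": {"type": type(exc).__name__, "detail": str(exc)},
                }
            result["name"] = name
            result["duration_ms"] = (time.perf_counter() - start) * 1000.0
            results.append(result)
        return results


def title_present(facts: Dict[str, Any]) -> Dict[str, Any]:
    found = bool(facts.get("title"))
    return {"ok": found, "message": "title present" if found else "title missing"}


def broken_rule(facts: Dict[str, Any]) -> Dict[str, Any]:
    raise ValueError("boom")


engine = Engine({"title_present": title_present, "broken": broken_rule})
results = engine.run({"title": "Login"})
```

A rule that returns a malformed value (missing keys, or not a dict) lands in the same error path here, so schema violations surface as structured errors instead of crashing the run.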
6. SaaS Expansion (future)
- Multi-tenancy: separate org/user scopes for data and rule runs.
- RBAC: roles (admin, analyst, viewer).
- Compliance: logging, retention, export, audit trails.
- Rules: centrally maintained, not user-editable.
- APIs: authenticated endpoints, per-user quotas.
- Observability: per-tenant metrics, alerts.
- Security: sandboxing, strict module allowlists, compliance with SOC2/ISO.
- Data controls: PII redaction, encryption, retention policies.
7. Future-Proofing Hooks
- Versioning: ruleset checksum + per-rule versions.
- Extensibility: support actions, links, evidence in Result.
- Policy: central config for thresholds/overrides.
- Hot reload (optional, dev-only).
- Rule provenance tracking (source_module, commit SHA).
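A ruleset checksum can be derived from per-rule identity fields so that any added, removed, or re-versioned rule changes the value. The function name and the choice of fields (name, rule_version, source_module) are assumptions; a real implementation might also hash rule source or a commit SHA.

```python
import hashlib
import json
from typing import Dict, Iterable


def ruleset_checksum(rules: Iterable[Dict[str, str]]) -> str:
    """SHA-256 over a canonical, order-independent listing of rule identities."""
    canonical = sorted(
        (r["name"], r["rule_version"], r["source_module"]) for r in rules
    )
    payload = json.dumps(canonical, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


ruleset = [
    {"name": "form_posts_to_http", "rule_version": "1", "source_module": "app.rules.forms"},
    {"name": "title_present", "rule_version": "2", "source_module": "app.rules.page"},
]
checksum = ruleset_checksum(ruleset)
```

Storing this value in a CI snapshot makes unintended ruleset changes show up as a test failure, and recording it on each run ties results back to the exact rule set that produced them.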
✅ This outline is enough to “re-hydrate” the context later — you won’t need to dig back into old logs to remember why (bool, str) didn’t fit, why YAML was removed, or what schema we were converging on.
Do you want me to also save this in a short “README-spec” style (like RESULTS.md) so it can live in your repo as the contract doc for rules, or should I keep this as just your personal checkpoint outline?