
// audit plane

The spine under every move.

Every Magistry write — catalog mutation, ad spend change, customer reply — passes through one append-only plane. Evidence captured, judge score recorded, reversal op pre-stored. Phase-gated, kill-switched, rate-limited.

Append-only · Reversal pre-stored · Dry-run by default

audit plane · today
decision_log         : 142 rows
  ├─ DRAFT             62
  ├─ DISCOUNT_TEST     14
  ├─ PUBLISH            8
  └─ SCALE_WINNER       3

ad_decision_log      :  47 rows
  ├─ ADJUST_BUDGET     22
  ├─ AUDIENCE_WRITE     9
  ├─ CREATIVE_BRIEF     6
  └─ NEGATIVE_TERMS    10

cs_thread_evaluations:  318 sends
  ├─ judge_pass        302
  ├─ judge_hold         11
  └─ escalated           5

kill_switch          : OFF (all systems live)
rate_limiter         : 0 ceilings hit
advisory_locks       : 2 active, 0 queued

// why it exists

Autonomy you can read.

If you can't read what an agent did and why, you can't trust it. The audit plane exists so every operator — engineer, finance lead, CEO — can answer one question fast: what moved, and on what basis.

An agent without an audit is just a leak.

Autonomy without a paper trail means you find out about a bad write in a dashboard, days later. The audit plane is what turns 'the agent did something' into a row you can read, evaluate, and reverse.

Every specialist writes to the same plane.

Catalog, Campaign, CS, Disputes, Social, Researcher — they all share one place for evidence, citations, and reversal ops. One query gets you the whole day, across every system.

Append-only is the only honest model.

Magistry never edits or deletes a row. Reversals create new rows that point to the original. Your audit history is a ledger, not a Wikipedia.
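A minimal sketch of the append-only idea, assuming an in-memory store. The `Ledger` and `Row` names, the field layout, and the `reverses` pointer are illustrative assumptions, not Magistry's actual storage.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a row cannot be edited after it is written
class Row:
    row_id: int
    action: str
    delta: str
    reverses: Optional[int] = None  # id of the row this one undoes, if any


class Ledger:
    """Hypothetical append-only store: rows are added, never changed or removed."""

    def __init__(self) -> None:
        self._rows: List[Row] = []

    def append(self, action: str, delta: str, reverses: Optional[int] = None) -> Row:
        row = Row(row_id=len(self._rows) + 1, action=action, delta=delta, reverses=reverses)
        self._rows.append(row)
        return row

    def rows(self) -> Tuple[Row, ...]:
        # Hand back an immutable copy so callers cannot rewrite history.
        return tuple(self._rows)


ledger = Ledger()
original = ledger.append("ADJUST_BUDGET", "+12%")
# The reversal is a new row that points at the original; nothing is overwritten.
reversal = ledger.append("ADJUST_BUDGET", "-12%", reverses=original.row_id)
```

The class exposes no update or delete method on purpose: the only way to change the story is to add to it.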

// what it stores

Three tables. Every move.

One table each for the Catalog, Campaign, and CS specialists. Same shape, same evidence schema, same reversal pattern. A read across all three reconstructs the agent's full operational day.

decision_log

Catalog Specialist

Every lifecycle transition for every SKU. From-state, to-state, action, trigger, evidence (jsonb), applied_to_shopify, reversal op.

// schema

product_id · from_state · to_state · action · trigger · evidence · applied_at

ad_decision_log

Campaign Specialist

Every paid-side write — budget changes, bid strategy shifts, audience writes, creative pushes, negative-term additions. Per channel.

// schema

channel · skill · action · delta · rate_limit_status · evidence · applied_at

cs_thread_evaluations

CS Specialist

Every draft and send — judge scores per axis (policy, fact, brand_voice, risk), policy citation, supplier signal, language flag, verdict.

// schema

thread_id · draft_id · judge_score · policy_cite · language · verdict · sent_at
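As a rough illustration of reading across all three tables, the sketch below builds them in an in-memory SQLite database and merges them into one timeline. The column names come from the schema lists above; the column types, the use of SQLite, and every sample value are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE decision_log (
    product_id TEXT, from_state TEXT, to_state TEXT,
    "action" TEXT, "trigger" TEXT, evidence TEXT, applied_at TEXT
);
CREATE TABLE ad_decision_log (
    channel TEXT, skill TEXT, "action" TEXT, delta TEXT,
    rate_limit_status TEXT, evidence TEXT, applied_at TEXT
);
CREATE TABLE cs_thread_evaluations (
    thread_id TEXT, draft_id TEXT, judge_score REAL, policy_cite TEXT,
    language TEXT, verdict TEXT, sent_at TEXT
);
""")

# Made-up sample rows, one per table.
conn.execute("INSERT INTO decision_log VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("sku-1", "draft", "published", "PUBLISH", "schedule", "{}",
              "2024-01-01T09:00:00"))
conn.execute("INSERT INTO ad_decision_log VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("channel-a", "budget", "ADJUST_BUDGET", "+12%", "OK", "{}",
              "2024-01-01T11:00:00"))
conn.execute("INSERT INTO cs_thread_evaluations VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("t-1", "d-1", 0.9, "policy-1", "en", "judge_pass",
              "2024-01-01T10:00:00"))

# One query, three tables, a single timeline ordered by timestamp.
DAY_QUERY = """
SELECT 'decision_log' AS source, "action" AS what, applied_at AS ts FROM decision_log
UNION ALL
SELECT 'ad_decision_log', "action", applied_at FROM ad_decision_log
UNION ALL
SELECT 'cs_thread_evaluations', verdict, sent_at FROM cs_thread_evaluations
ORDER BY ts
"""
day = conn.execute(DAY_QUERY).fetchall()
```

Each tuple in `day` carries the source table, what happened, and when, so one loop can render the whole day regardless of which specialist wrote the row.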

// phase model

Two phases. One executor.

Magistry will not write to your real systems until you say so. Phase 1 is shadow operations — full reasoning, no mutations. Phase 2 is live — same rows, executor applies. Same shape, same reversal.

// phase 1

Dry-run — plan, judge, log, do not write.

On every new account, every new agent, every new policy. Magistry runs the full loop end-to-end but holds back the final mutation. You read the rows, you flag the questionable ones, you decide when to flip the switch.

  • All evidence captured, judge scores computed.
  • applied_to_shopify = false on every row.
  • Reversal ops still pre-stored — symmetric with live.
  • Operator approval needed to move to Phase 2.
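The dry-run loop above can be sketched as follows. This is a simplified illustration: `PlannedRow`, `run_cycle`, the mode strings, and the `UNPUBLISH` reversal op are hypothetical names, and only `applied_to_shopify` is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PlannedRow:
    action: str
    evidence: Dict[str, str]
    judge_score: float
    reversal_op: str                  # pre-stored even in dry-run
    applied_to_shopify: bool = False  # stays False for the whole of Phase 1


def run_cycle(rows: List[PlannedRow], mode: str,
              mutate: Callable[[PlannedRow], None]) -> List[PlannedRow]:
    """Log every row; call `mutate` only when mode is 'live'."""
    log: List[PlannedRow] = []
    for row in rows:
        if mode == "live":
            mutate(row)
            row.applied_to_shopify = True
        log.append(row)  # logged in both modes, so the rows keep the same shape
    return log


calls: List[str] = []
planned = [PlannedRow("PUBLISH", {"source": "example"}, 0.9, "UNPUBLISH")]
log = run_cycle(planned, mode="dry_run", mutate=lambda row: calls.append(row.action))
```

In dry-run the `mutate` callback is never invoked, yet the logged row still carries its evidence, judge score, and reversal op.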

// phase 2

Live — executor applies pending rows inside policy.

Trust earned, mode set to live. The executor walks the queue and applies. Same rows, same evidence, but now the mutation actually lands. Rate limits hold, kill switch overrides, advisory locks prevent stomps.

  • Idempotent writes — re-runs never double-apply.
  • Per-action rate limits enforced before mutation.
  • Kill switch state checked at write-time, not config time.
  • Reversal op stored alongside the apply for one-click rollback.
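One common way to get idempotent writes is to key each apply on the row id and skip ids already seen. The sketch below assumes that approach; the `Executor` class and its `budget` stand-in are hypothetical, and the source does not say how Magistry achieves idempotency.

```python
from typing import Set


class Executor:
    """Hypothetical executor that applies each row id at most once."""

    def __init__(self) -> None:
        self.applied_ids: Set[int] = set()
        self.budget = 100.0  # stand-in for the external system being mutated

    def apply(self, row_id: int, delta_pct: float) -> bool:
        if row_id in self.applied_ids:
            return False  # already applied: a re-run is a no-op
        self.budget *= 1 + delta_pct / 100
        self.applied_ids.add(row_id)
        return True


executor = Executor()
first_run = executor.apply(row_id=1, delta_pct=10)
second_run = executor.apply(row_id=1, delta_pct=10)  # replay of the same row
```

The replay returns `False` and leaves `budget` untouched, which is the property a crashed-and-restarted cycle needs.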

// safety integration

Where the rails live.

The audit plane is where the safety primitives bite. Every primitive is enforced at write-time, not config time — meaning a kill switch flipped mid-cycle stops the next row, not the next reboot.

Kill Switch

stores.config->>'kill_switch'

Checked at every write. Global or scoped per agent. Flipped ON: all pending mutations halt, and queued rows are marked as held. No flush, no race.
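A small sketch of what checking at write-time means in practice, assuming the switch lives in a config mapping read once per row. The `execute_queue` function and the `HELD` and `APPLIED` status strings are illustrative names.

```python
from typing import Callable, Dict, List


def execute_queue(queue: List[str], config: Dict[str, str],
                  apply: Callable[[str], None]) -> Dict[str, str]:
    statuses: Dict[str, str] = {}
    for row in queue:
        # Re-read the switch for every row instead of caching it at startup.
        if config.get("kill_switch") == "ON":
            statuses[row] = "HELD"
            continue
        apply(row)
        statuses[row] = "APPLIED"
    return statuses


config = {"kill_switch": "OFF"}


def apply(row: str) -> None:
    # Simulate an operator flipping the switch while row-2 is being applied.
    if row == "row-2":
        config["kill_switch"] = "ON"


statuses = execute_queue(["row-1", "row-2", "row-3"], config, apply)
```

Because the check happens inside the loop, `row-3` is held even though the switch was OFF when the cycle started.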

Rate Limiter

safety/rate_limiter.check_action()

Per-action budgets like draft_max_per_week=20. Exceeded: row is logged with status=RATE_LIMITED and waits — never blasts through the ceiling.
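A hedged sketch of a per-action ceiling. The `check_action` name and the `RATE_LIMITED` status echo the text, but the signature, the `RateLimiter` class, and the `OK` status are assumptions, and the weekly window reset is left out for brevity.

```python
from collections import Counter
from typing import Dict

LIMITS: Dict[str, int] = {"draft": 20}  # mirrors draft_max_per_week=20 from the text


class RateLimiter:
    def __init__(self, limits: Dict[str, int]) -> None:
        self.limits = limits
        self.used: Counter = Counter()

    def check_action(self, action: str) -> str:
        """Count the action and return 'OK', or return 'RATE_LIMITED' at the ceiling."""
        limit = self.limits.get(action)
        if limit is not None and self.used[action] >= limit:
            return "RATE_LIMITED"  # the row is logged and waits; nothing is applied
        self.used[action] += 1
        return "OK"


limiter = RateLimiter(LIMITS)
statuses = [limiter.check_action("draft") for _ in range(21)]
```

The first 20 calls pass and the 21st is refused without incrementing the counter, so the ceiling is never exceeded.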

Advisory Locks

safety/locks.store_lock(store, key)

Prevents concurrent executors from stomping on each other. Two cycles can plan in parallel; only one can execute against the same store key at a time.
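The sketch below imitates the lock with in-process `threading.Lock` objects keyed by `(store, key)`. It borrows the `store_lock(store, key)` name from the text, but the context-manager shape and the non-blocking behavior are assumptions; a production system would more plausibly use database-level advisory locks.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator, Tuple

_locks: Dict[Tuple[str, str], threading.Lock] = {}
_registry_guard = threading.Lock()


@contextmanager
def store_lock(store: str, key: str) -> Iterator[bool]:
    """Try to take the lock for (store, key); yield True if acquired, False if busy."""
    with _registry_guard:
        lock = _locks.setdefault((store, key), threading.Lock())
    acquired = lock.acquire(blocking=False)
    try:
        yield acquired
    finally:
        if acquired:
            lock.release()


with store_lock("store-1", "executor") as first:
    # A second executor on the same store key is turned away while the first holds it.
    with store_lock("store-1", "executor") as second:
        pass
    # A different store is unaffected.
    with store_lock("store-2", "executor") as other:
        pass
```

Planning needs no lock at all in this model; only the execute step is wrapped, which is what lets two cycles plan in parallel.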

Trademark Filter

apply_policy() pre-publish

Generated copy is screened against your registered trademark list before any catalog write. Hit → row marked NEEDS_REVIEW, never silently pushed.
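A simplified sketch of a pre-publish screen, assuming whole-word, case-insensitive matching against a plain list of terms. The `apply_policy` name and the `NEEDS_REVIEW` status come from the text; the signature, the `OK` status, and the sample term are hypothetical.

```python
import re
from typing import Iterable


def apply_policy(copy: str, trademarks: Iterable[str]) -> str:
    """Return 'NEEDS_REVIEW' if any listed term appears as a whole word, else 'OK'."""
    for term in trademarks:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, copy, flags=re.IGNORECASE):
            return "NEEDS_REVIEW"  # held for a human; never pushed silently
    return "OK"


trademark_list = ["Acme"]  # made-up term for the example
flagged = apply_policy("Soft as an ACME blanket", trademark_list)
clean = apply_policy("Soft cotton blanket", trademark_list)
```

Real trademark screening usually needs fuzzier matching (plurals, spacing, lookalike characters); the point here is only that a hit changes the row status instead of blocking the log entry.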

// reversal model

One-click rollback, by design.

Every applied row carries its own reversal op — pre-computed at plan time, not improvised at rollback time. Click the row, hit rollback, the executor pushes the inverse mutation and stamps a new row pointing at the original.

Append-only means your audit history grows; nothing gets overwritten. The pair of rows — original + reversal — is the story.

ad_decision_log · rollback chain
row #84193                    reversible
  action      : ADJUST_BUDGET
  delta       : +12%
  status      : APPLIED
  reversal_op : ADJUST_BUDGET δ=-12%

row #84194  (reverses #84193)
  action      : ADJUST_BUDGET
  delta       : -12%
  status      : APPLIED
  reverses    : 84193
  triggered_by: operator (jane@store.com)

— audit chain stays intact —
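The rollback chain above can be sketched as follows, assuming the inverse is computed once when the row is planned and simply read back at rollback time. The `plan_row` and `rollback` functions and the dictionary layout are illustrative assumptions.

```python
from typing import Any, Dict, List, Optional

log: List[Dict[str, Any]] = []  # append-only list standing in for the table


def plan_row(action: str, delta_pct: int, reverses: Optional[int] = None,
             triggered_by: str = "agent") -> Dict[str, Any]:
    row = {
        "id": len(log) + 1,
        "action": action,
        "delta_pct": delta_pct,
        # The inverse is computed here, at plan time, and stored with the row.
        "reversal_op": {"action": action, "delta_pct": -delta_pct},
        "reverses": reverses,
        "triggered_by": triggered_by,
    }
    log.append(row)
    return row


def rollback(row_id: int, operator: str) -> Dict[str, Any]:
    original = next(r for r in log if r["id"] == row_id)
    op = original["reversal_op"]  # read the stored op; nothing is improvised
    return plan_row(op["action"], op["delta_pct"], reverses=row_id, triggered_by=operator)


first = plan_row("ADJUST_BUDGET", 12)
undo = rollback(first["id"], "operator (jane@store.com)")
```

The sketch treats the delta as an opaque signed value, as the panel does. Whether a -12% step exactly restores the prior budget depends on how deltas are defined, which the source does not specify.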

// audit plane

See the day Magistry just had.

Every decision row, across every specialist, in one queryable place. Read it, score it, reverse it, ship the digest to leadership — autonomy you can defend in a board meeting.

Append-only · Reversible · Built to be read