// changelog

What shipped, what's reversible, and what's next.

Every release is a row in a log. Every row is reversible. Below: ten cycles of shipped work, in plain language, with the executors and gates that moved.

10 releases in 4 months · Append-only history · Phase 2 default

// filter

Browse by surface.

Each release is tagged to the agent or surface it touches. Filter the timeline below — or scroll for the full feed.

// showing 10 of 10 releases · sorted desc by date

// the feed

One row per shipped change.

No marketing-ese. The executor name, the gate it crossed, the rollback path — all on the row.

2026-05-20 · v1.4.0 · Campaign

Negative-keyword harvesting goes autonomous.

The anomaly responder can now write NEGATIVE_TERMS directly into Google Ads with tier-gated rollback. Previously this required an operator click for every cluster. Now the planner-judge-executor loop runs end-to-end on terms below the store's intent floor.

New harvest_negatives executor with per-ad-group rate limit (12 writes / 24h).
Reversal op pre-stored on every NEGATIVE_TERMS row — one tap restores the term.
Self-calibrating intent floor — anchored to your historical converter distribution, not a global 0.6.
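
The 12 writes / 24h limit per ad group can be pictured as a sliding window. A minimal sketch, not the shipped harvest_negatives executor: the class name AdGroupWriteLimiter, its try_write method, and the choice of a sliding (rather than fixed) window are assumptions for illustration.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60  # 24h window, from the release note
MAX_WRITES = 12                # 12 writes per ad group, from the release note


class AdGroupWriteLimiter:
    """Hypothetical sliding-window limiter, keyed by ad group."""

    def __init__(self, max_writes=MAX_WRITES, window=WINDOW_SECONDS):
        self.max_writes = max_writes
        self.window = window
        # ad_group_id -> timestamps (seconds) of writes still inside the window
        self._writes = defaultdict(deque)

    def try_write(self, ad_group_id, now):
        """Return True and record the write if the ad group is under its limit."""
        recent = self._writes[ad_group_id]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_writes:
            return False
        recent.append(now)
        return True
```

Each ad group keeps its own queue, so a busy ad group cannot use up the allowance of a quiet one.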

2026-05-12 · v1.3.4 · Catalog

Cost confidence Tier B now eligible for OPTIMIZE_LOSER.

Soft optimisation (copy and image regeneration) no longer requires verified Tier A cost data. Discount writes are still gated to Tier A, so margin policy stays enforced where money moves.

Tier B unlocks: OPTIMIZE_LOSER, FLAG_ORPHAN, KEEP. Tier C remains read-only.
New tier_upgrade job watches Shopify cost_per_item edits and re-tiers the SKU on next cycle.
Decision rows now stamp the tier at decision time, not at read time — full audit fidelity.
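
The tier gates above amount to a small permission table. A minimal sketch under stated assumptions: the action name "DISCOUNT", the helper names, and the idea that Tier A also keeps every Tier B action are illustrative guesses, not the product's schema.

```python
# Tier B's action set is taken from the release note.
TIER_B_ACTIONS = frozenset({"OPTIMIZE_LOSER", "FLAG_ORPHAN", "KEEP"})

ALLOWED_ACTIONS = {
    # Assumption: Tier A can do everything Tier B can, plus discount writes.
    # "DISCOUNT" is a placeholder name for the Tier-A-only discount write.
    "A": TIER_B_ACTIONS | {"DISCOUNT"},
    "B": TIER_B_ACTIONS,
    "C": frozenset(),  # read-only: no actions at all
}


def is_action_allowed(tier, action):
    """Return True if a SKU at this cost-confidence tier may trigger the action."""
    return action in ALLOWED_ACTIONS.get(tier, frozenset())


def make_decision_row(sku, tier, action):
    """Build a decision row that stamps the tier as it was at decision time."""
    if not is_action_allowed(tier, action):
        raise PermissionError(f"{action} is not allowed for Tier {tier}")
    # Storing the tier on the row means a later re-tier does not rewrite history.
    return {"sku": sku, "action": action, "tier_at_decision": tier}
```

For example, make_decision_row("SKU-1", "B", "KEEP") returns a row stamped with Tier B, while the same call with "DISCOUNT" raises PermissionError.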

2026-04-30 · v1.3.0 · CS

Reply Judge second-pass scoring shipped.

The CS Specialist now runs every drafted reply through a separate judge model before send. Replies scoring below 0.78 brand-voice fidelity are flagged for operator review instead of going out.

Brand voice anchor — sampled from your last 200 sent replies on connect.
Empathy + clarity sub-scores stored on the decision row for monthly audit.
Operator override on flagged replies preserves the judge score for calibration.

2026-04-22 · v1.2.9 · Otto

Otto answers questions about your decision_log.

Ask Otto in-app: 'why did the agent vault SKU-LIN-228?' and it traces the rows — cycle, action, evidence, judge score — into a single answer with deep links to each row.

Inline timeline rendering — every row clickable, every decision opens in the audit plane.
Otto refuses to answer when evidence is missing — no hallucinated rationales.
Question history saved per-operator, searchable by SKU or campaign ID.

2026-04-08 · v1.2.5 · Social

Pinterest joins Social Autopilot.

Pinterest pin generation, board placement, and seasonal scheduling are now first-class. Same attention-queue ranking — every pin scored on EV before publish.

Idea pin + standard pin support, with image-variation pipeline reused from Meta.
Board match heuristic — pins routed to the closest semantic board on your account.
Seasonal anchor — Pinterest pins published 45 days ahead of the search peak, not the sale date.

2026-03-28 · v1.2.0 · Disputes

Chargeback evidence packets auto-assembled.

When a dispute lands, the Disputes module pulls the order, the message history, the shipment proof, and the policy excerpt into a single signed PDF — ready to upload to Shopify Payments inside 60 seconds.

Evidence linked back to the decision_log rows that touched the order — provenance preserved.
Friendly-fraud signal score: 0.0 (legit) → 1.0 (likely friendly fraud) for triage routing.
Operator can re-open and add evidence; original packet stays append-only.

2026-03-15 · v1.1.8 · Researcher

Reverse-image supplier match now runs on 4 marketplaces.

Researcher's reverse-image lane now searches AliExpress, 1688, DHgate, and Made-in-China in parallel. Results are frequency-clustered, with a confidence band per supplier candidate.

Top 3 supplier candidates surfaced per discovered product, with per-candidate cost & MOQ.
Trademark + uniqueness filter runs before any candidate enters the draft pipeline.
New 'compete map' shows which competing stores already source from the same supplier.

2026-02-26 · v1.1.3 · Audit

decision_log is now signed at the row level.

Every decision_log row carries an HMAC signature derived from the row's content and the previous row's signature — a tamper-evident chain. Operators get a one-click verify on any row.

Per-store HMAC key stored in Supabase Vault — never leaves the boundary.
Verify endpoint returns chain status: 'verified' / 'broken at row N' / 'tail unverified'.
Old rows back-signed during a one-time migration on upgrade.
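
A tamper-evident chain of this kind can be sketched with Python's standard hmac module. This is an illustration, not the production signer: SHA-256 as the digest, JSON with sorted keys as the canonical row encoding, an empty signature before the first row, and reading 'tail unverified' as "trailing rows not yet signed" are all assumptions.

```python
import hashlib
import hmac
import json

GENESIS = b""  # assumed: the "previous signature" of the first row is empty


def sign_row(key, content, prev_sig):
    """HMAC over the previous row's signature plus this row's content."""
    payload = prev_sig + json.dumps(content, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).digest()


def append_row(key, log, content):
    """Append a signed row; each signature depends on the one before it."""
    prev_sig = log[-1]["sig"] if log else GENESIS
    log.append({"content": content, "sig": sign_row(key, content, prev_sig)})


def verify_chain(key, log):
    """Walk the chain and report its status in the release note's vocabulary."""
    prev_sig = GENESIS
    for n, row in enumerate(log, start=1):
        sig = row.get("sig")
        if sig is None:
            return "tail unverified"  # assumed meaning: unsigned rows at the end
        expected = sign_row(key, row["content"], prev_sig)
        # compare_digest avoids leaking match position through timing.
        if not hmac.compare_digest(expected, sig):
            return f"broken at row {n}"
        prev_sig = sig
    return "verified"
```

Because each signature folds in the previous one, editing any row changes the expected signature of that row and invalidates everything after it.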

2026-02-04 · v1.1.0 · Campaign

Campaign Specialist moves to Phase 2 by default for new stores.

After 14 days of dry-run-only proof, new accounts can flip Campaign Specialist to live writes with one click — the kill switch stays default-on and per-action rate limits are pre-set to a conservative band.

Phase 2 ramp — first 7 days capped at 20% of suggested write volume.
Operator dashboard now shows 'would-have-done' vs 'did' for the entire dry-run window.
Auto-rollback if 3 consecutive judge scores drop below 0.7.
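
The ramp cap and the auto-rollback trigger are both simple rules. A minimal sketch, assuming the cap rounds down and that "3 consecutive" refers to the three most recent judge scores; the function names are hypothetical.

```python
RAMP_DAYS = 7             # first 7 days of Phase 2, from the release note
RAMP_PERCENT = 20         # capped at 20% of suggested write volume
ROLLBACK_THRESHOLD = 0.7  # judge score floor
ROLLBACK_STREAK = 3       # consecutive low scores that trigger rollback


def write_cap(suggested_writes, days_since_live):
    """Number of writes allowed today; days_since_live is 0 on the first live day."""
    if days_since_live < RAMP_DAYS:
        # Integer arithmetic, rounding down (an assumption).
        return suggested_writes * RAMP_PERCENT // 100
    return suggested_writes


def should_auto_rollback(judge_scores):
    """True if the most recent ROLLBACK_STREAK scores are all below the threshold."""
    if len(judge_scores) < ROLLBACK_STREAK:
        return False
    latest = judge_scores[-ROLLBACK_STREAK:]
    return all(score < ROLLBACK_THRESHOLD for score in latest)
```

With 50 suggested writes, the cap is 10 per day during the ramp and 50 from day 7 onward.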

2026-01-18 · v1.0.5 · Catalog

Self-calibrating winner floor.

The classifier no longer uses a global 'ROAS > 2.0' threshold. Each store gets its own winner floor — max(2.0, p75 of your own distribution) — recomputed weekly off your last 90 days.

Per-store thresholds visible in the audit plane, with the math explained.
Floor never moves more than 0.3x week-over-week — stability gate prevents whiplash.
Old single-threshold behaviour preserved behind a kill-switch flag for migration.
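
The floor formula, max(2.0, p75), can be worked through in a few lines. A sketch under stated assumptions: the percentile method is Python's "inclusive" quartile, and the stability gate is read here as "at most 30% relative change from last week's floor", which is only one possible reading of "0.3x".

```python
import statistics

GLOBAL_MIN_FLOOR = 2.0  # the old global ROAS threshold, now a lower bound
MAX_WEEKLY_MOVE = 0.3   # assumed to mean a 30% relative change per week


def winner_floor(roas_values, previous_floor=None):
    """Per-store winner floor from the store's own ROAS values (last 90 days)."""
    # quantiles(n=4) returns the three quartile cut points; index 2 is p75.
    p75 = statistics.quantiles(roas_values, n=4, method="inclusive")[2]
    floor = max(GLOBAL_MIN_FLOOR, p75)
    if previous_floor is not None:
        # Stability gate (assumed interpretation): clamp the weekly move.
        lower = previous_floor * (1 - MAX_WEEKLY_MOVE)
        upper = previous_floor * (1 + MAX_WEEKLY_MOVE)
        floor = min(max(floor, lower), upper)
    return floor
```

A store whose ROAS values are 1, 2, 3, 4, 5 has a p75 of 4.0, so its floor is 4.0. A store whose p75 falls under 2.0 keeps the 2.0 floor.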

// see it run

Want the next release on your store?

Connect a read-only Shopify token and the agents draft their first dry-run cycle inside an hour. You see exactly what every future release would have done before it touches a thing.

Dry-run by default · Append-only logs · One-click rollback