FIG. 01 / THE THESIS

Fund the loops that prove they help.

Agents are markdown files with a price. A deterministic allocator decides which one deserves the next dollar. You rate what helped. The budget learns.

ONE GO BINARY · PLAIN FILES · NO MODEL JUDGES THE PORTFOLIO · MIT

THE ALLOCATION · MARKET OPEN

00:30 ✓ uptime $0.008 · all healthy, 200 OK in 142ms
01:00 ★ outcome useful · exposed a pricing move, allocation raised
02:00 ✓ security $0.214 · raw SQL in /search, patch attached
02:41 → competitor skipped · agent has unrated output
04:15 ✓ repo-health $0.21 · verify passed, auto-rated useful
05:12 ⊘ dbcleanup held · plan pending your approval
06:00 ✓ digest $0.089 · morning brief compiled from 5 agents
06:01 ▲ portfolio · spend shifted toward verified return

FIG. 02 / THE DOCTRINE · FIVE CLAUSES

§ 01

Cron believes every job is equally valuable, forever.

watchd does not. verified outcomes per dollar decide who runs.

§ 02

A schedule is eligibility, not entitlement.

due means allowed to compete. the allocator decides what runs.

§ 03

Useful compounds. Harmful starves.

ratings are append-only. corrections are the compounding asset.

§ 04

No model judges the portfolio.

one inspectable formula. identical state, identical decision.

§ 05

Anything dangerous waits for you.

observe → propose → act. authority is a ceiling, never a default.

FIG. 03 / ONE NIGHT

LOCAL TIME

23:00

While you sleep, the allocator decides which due agent deserves the next dollar.

You slept. They didn't.

6 RUNS · $0.37 TOTAL · 1 PLAN AWAITING YOUR APPROVAL

23:08» watchd up

5 agents armed. schedules are eligibility, not entitlement.

00:30 ✓ uptime $0.008

All healthy. 200 OK in 142ms.

01:00 ✓ errors $0.021

3 new timeouts in the payment worker. Same root cause.

02:00 ✓ security $0.214

New commit adds raw SQL in /search. Flagged with a patch suggestion.

04:15 ✓ competitor $0.041

Acme cut Pro pricing 20%. Second cut this quarter. Reading as a price war.

06:00 ✓ digest $0.089

Morning brief compiled. 4 findings, 1 plan pending your approval.

FIG. 04 / THE ALLOCATOR · DETERMINISTIC

One formula decides. You can read it.

Every scheduled agent is a strategy competing for a finite daily budget. A bandit, not a vibe: no model scores the portfolio, no prompt decides who runs. Four terms, all inspectable.

score = weight × (expected + ε · uncertainty) / cost

internal/portfolio/portfolio.go

WEIGHT

goal importance

Declared in the goal file. product: 3, chores: 1. Money follows what you said matters.

EXPECTED

(1 + useful − harmful) / (2 + rated)

Laplace-smoothed verified return. New agents start humble at one half; every rating moves the posterior.

ε · UNCERTAINTY

√( ln(N + 2) / (n + 1) )

A bounded exploration bonus, UCB-style. Undertested agents keep getting auditions, so an incumbent never owns the budget.

COST

actual average spend

The denominator is reality: observed dollars per run, not the number the agent promised. Floor at one cent.

EVERY DECISION SHIPS A REASON:

✓ highest verified return

→ agent has unrated output

→ pending review cap reached

→ global daily budget exhausted

→ goal daily budget exhausted

→ unrated review cap reached

IDENTICAL STATE PRODUCES THE IDENTICAL DECISION. EVERY ADMISSION REASON IS STORED, QUERYABLE, AND YOURS TO AUDIT.
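One way to get "identical state, identical decision" is to sort candidates by score with a stable tie-break and attach a reason to every outcome. The sketch below assumes one admission per round, a name-order tie-break, and a single budget check; the types are hypothetical, and the "outscored this round" string is a placeholder that does not appear in the reason list above.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a due agent with a precomputed score. Hypothetical type.
type Candidate struct {
	Name       string
	Score      float64
	AvgCost    float64
	HasUnrated bool
}

// Decision records what happened to one candidate and why.
type Decision struct {
	Name   string
	Admit  bool
	Reason string
}

// Allocate admits at most one agent: the highest-scoring candidate that is
// not held back. Sorting by score, then by name, keeps ties deterministic.
func Allocate(cands []Candidate, budgetLeft float64) []Decision {
	sorted := append([]Candidate(nil), cands...) // never mutate the input
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Score != sorted[j].Score {
			return sorted[i].Score > sorted[j].Score
		}
		return sorted[i].Name < sorted[j].Name
	})
	var out []Decision
	admitted := false
	for _, c := range sorted {
		switch {
		case c.HasUnrated:
			out = append(out, Decision{c.Name, false, "agent has unrated output"})
		case c.AvgCost > budgetLeft:
			out = append(out, Decision{c.Name, false, "global daily budget exhausted"})
		case !admitted:
			admitted = true
			out = append(out, Decision{c.Name, true, "highest verified return"})
		default:
			// Placeholder reason, not taken from the source list.
			out = append(out, Decision{c.Name, false, "outscored this round"})
		}
	}
	return out
}

func main() {
	cands := []Candidate{
		{"security", 2.0, 0.21, false},
		{"competitor", 9.0, 0.04, true},
		{"uptime", 2.0, 0.01, false},
	}
	for _, d := range Allocate(cands, 1.00) {
		fmt.Printf("%-10s admit=%-5v %s\n", d.Name, d.Admit, d.Reason)
	}
}
```

Note that competitor has the best score and still does not run: unrated output holds it, which is the behavior the 02:41 ticker line shows.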

FIG. 05 / THE LEDGER · APPEND-ONLY

Three words train the market.

useful, neutral, harmful. Ratings append, never overwrite, so your corrections stay auditable and become the portfolio's compounding asset. Where a shell command can check reality, you don't even have to type them.

RATE THE NEXT RUN OF competitor ↓

expected = (1 + 1 − 0) / (2 + 2) = 0.500

TOMORROW'S ALLOCATION MOVES WITH THIS NUMBER

competitor_0609 · useful · exposed a pricing move

competitor_0610 · neutral · no change detected
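The two ratings above reproduce the 0.500 figure, and an append-only ledger makes a later correction just another entry. This is a minimal sketch with hypothetical types; it assumes the latest rating for a run supersedes earlier ones when the ledger is replayed, which the source does not spell out.

```go
package main

import "fmt"

// Rating is one ledger entry. Value is "useful", "neutral", or "harmful".
type Rating struct {
	RunID string
	Value string
}

// Ledger is append-only: corrections are new entries, never edits.
type Ledger struct{ entries []Rating }

// Append adds an entry to the end of the ledger.
func (l *Ledger) Append(r Rating) { l.entries = append(l.entries, r) }

// Expected replays the ledger and applies (1 + useful - harmful) / (2 + rated).
// Assumption: the latest rating per run wins.
func (l *Ledger) Expected() float64 {
	latest := map[string]string{}
	for _, r := range l.entries {
		latest[r.RunID] = r.Value
	}
	useful, harmful := 0, 0
	for _, v := range latest {
		switch v {
		case "useful":
			useful++
		case "harmful":
			harmful++
		}
	}
	return float64(1+useful-harmful) / float64(2+len(latest))
}

func main() {
	var l Ledger
	l.Append(Rating{"competitor_0609", "useful"})
	l.Append(Rating{"competitor_0610", "neutral"})
	fmt.Printf("%.3f\n", l.Expected()) // 0.500

	// A correction appends; the original entry stays in the history.
	l.Append(Rating{"competitor_0610", "harmful"})
	fmt.Printf("%.3f\n", l.Expected()) // 0.250
}
```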

OR LET REALITY DO THE RATING ↓

---
name: repo-health
goal: product
verify: go test ./...
verify_timeout: 2m
---

satisfied

already true before the run: $0.00, no tokens burned

useful

false before, true after: auto-rated, no human needed

incomplete

still false after execution: recorded neutral

error

verifier failed or timed out: quarantined from the stats

THE VERIFIER RUNS BEFORE AND AFTER. ALREADY TRUE MEANS NO MODEL IS EVEN WOKEN UP. ITS OUTPUT IS CAPPED AT 8 KiB AND TREATED AS UNTRUSTED DATA.

BACKLOG? watchd rate TRIAGES EVERYTHING UNRATED, ONE KEYPRESS PER RUN. THE ALLOCATOR HOLDS AGENTS WITH UNRATED OUTPUT, SO RATING THROUGHPUT IS THE MARKET'S LIQUIDITY.

FIG. 06 / AUTHORITY · observe / propose / act

Power is granted, never assumed.

Goals set a ceiling; agents can only lower it. A gated run gets read-only tools and must end with a concrete plan. Nothing executes until you approve, and notify pushes the plan to your phone instead of waiting to be noticed.

observe

read-only tools. looking is free.

propose

read-only, plus a plan held for your approval.

act

configured tools. gate: true still demotes it to propose.
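Read as code, the three levels form an ordered scale with two clamps. This is a minimal sketch with hypothetical names: the goal's ceiling caps whatever the agent requests, and gate: true demotes act to propose.

```go
package main

import "fmt"

// Level is an authority level, ordered from least to most powerful.
type Level int

const (
	Observe Level = iota
	Propose
	Act
)

func (l Level) String() string {
	return [...]string{"observe", "propose", "act"}[l]
}

// Effective resolves the authority a run actually gets. The goal sets the
// ceiling, the agent can only lower it, and gate demotes act to propose.
func Effective(ceiling, requested Level, gate bool) Level {
	level := requested
	if level > ceiling {
		level = ceiling
	}
	if gate && level == Act {
		level = Propose
	}
	return level
}

func main() {
	fmt.Println(Effective(Act, Act, true))      // propose
	fmt.Println(Effective(Observe, Act, false)) // observe
	fmt.Println(Effective(Act, Act, false))     // act
}
```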

---
name: dbcleanup
schedule: 1d
model: sonnet
gate: true
notify: "ntfy pub alerts 'watchd: $WATCHD_AGENT \
  $WATCHD_STATUS'"
---
 
# DB Cleanup
 
Find bloated tables, unused indexes, and
rows older than the retention policy.
Propose a cleanup plan.

$ watchd pending

AWAITING APPROVAL

ID    AGENT      STATUS
a3f9  dbcleanup  pending (2m ago)

PLAN · READ-ONLY PASS

1. VACUUM ANALYZE on 4 bloated tables

2. Drop unused index idx_sessions_legacy

3. Archive 48,210 rows >180d from events

APPROVING RESUMES THE SAME SESSION, SO IT EXECUTES EXACTLY THE PLAN YOU READ. EVIDENCE IS RECHECKED FIRST, SO A RECOVERED SYSTEM SUPERSEDES A STALE PLAN.

FIG. 07 / MEMORY · memory: true

Loops that compound.

One line of frontmatter gives an agent a notes file it curates itself: injected at the start of every run, rewritten at the end. A scanner that re-reports the same findings is noise. One that remembers builds a position.

RUN 01 · 23:30

memory/competitor.md — memory is empty — the agent writes a baseline

## Baseline — 3 competitors

- Acme: Pro $49/mo, AI addon beta

- Initech: usage-based, no free tier

- Globex: enterprise only, POC req'd

RUN 02 · 05:30

memory/competitor.md — reads the baseline, reports only the delta

## Baseline — 3 competitors

- Acme: Pro $49/mo, AI addon beta

- Initech: usage-based, no free tier

- Globex: enterprise only, POC req'd

+ 06-09: Acme launches annual billing

+ Initech free tier rumored (HN)

RUN 03 · 11:30

memory/competitor.md — connects runs into a trend you can act on

## Baseline — 3 competitors

- Acme: Pro $49 -> $39 (-20%)

- Initech: usage-based, no free tier

- Globex: enterprise only, POC req'd

- 06-09: Acme launches annual billing

+ 2nd Acme price cut this quarter

+ pattern: price war forming.

+ watch for the Globex response

CURATED, NOT TRANSCRIPT-STUFFED. STALE ENTRIES GET PRUNED, AND A POISONED PAGE SCRAPED IN RUN 12 NEVER BECOMES STANDING INSTRUCTIONS FOR RUN 13.

FIG. 08 / WHY NOT CRON + BASH

Cron runs scripts. watchd runs judgment.

A bash script checks if the endpoint returned 200. An agent notices the 200 took four seconds, that the body is an error page wearing a success code, that the same timeout pattern showed up last Tuesday. You write intent in plain language; the model does the interpreting; the allocator pays for what proves out.

CRON + BASH

✗ Every recurring job assumed equally valuable, forever.

✗ Write a bash script. Parse output yourself. No structure.

✗ No cost tracking. You find out at the end of the month.

✗ No run history. stdout is gone when the terminal closes.

✗ Runs with your full permissions from minute one.

✗ No memory. Every run starts from zero and re-reports.

WATCHD

✓ Budget flows to verified value. Schedules are only eligibility.

✓ Write a markdown file. The model handles the thinking.

✓ Cost tracked per run. Budgets enforced mid-run.

✓ Every run a queryable record with provenance hashes.

✓ observe / propose / act: authority is a ceiling.

✓ memory: true means runs build a position and compound.

SAME VERDICT FOR CLOUD ROUTINES: A HANDFUL OF STATELESS RUNS PER DAY, A ONE-HOUR MINIMUM INTERVAL, NO APPROVAL PRIMITIVE. WATCHD LOOPS ARE UNLIMITED AND SUB-MINUTE, REMEMBER ACROSS RUNS, AND WAIT FOR YOU.

FIG. 09 / THE ANATOMY · ONE FILE

An agent is a delegation you can read.

Frontmatter is the contract, the body is the brief. Four decisions, then write it like a delegation to a smart intern: where to look, what judgment to apply, what shape to return. The whole craft is the sharpness of the judgment line.

goal

why it deserves money. points at a file in goals/ that sets weight and the permission ceiling.

model

haiku for cheap high-frequency pings, sonnet for judgment work.

budget

hard per-run ceiling in dollars, enforced mid-run, not after.

memory

the compounding switch. on: builds on last run. off: groundhog day.

agents/show-hn.md · MANUAL UNTIL IT EARNS A SCHEDULE

---
name: show-hn
goal: ideas                           ← why
model: sonnet                         ← judgment, not pings
budget: 0.40                          ← ceiling
memory: true                          ← compound
---

Fetch the current Show HN entries.    ← where to look
Extract the 5 highest-leverage        ← the judgment
mechanisms I could apply to my own
work this week. Mechanisms, not
product descriptions.
Three lines per idea. Skip what       ← the shape
your memory already reported; flag
patterns that keep reappearing.

THE LOOP: watchd run show-hn → watchd logs → watchd rate. RATE USEFUL AND THE ALLOCATOR FUNDS IT MORE. ADD schedule: 1d ONLY WHEN IT EARNS IT.

FIG. 10 / UNDER THE HOOD · SINGLE BINARY

2,600 lines of Go. The model thinks. The math chooses.

No AI runtime, no database, no API keys to manage. watchd spawns claude -p, parses the JSON, and records cost, evidence, allocation, outcomes, and instruction hashes. Every run answers "which instructions produced this output": prompt_hash and agent_hash travel with the record.
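Provenance hashing can be as small as a digest over the exact bytes that went in. The sketch below assumes SHA-256 with hex encoding and a hypothetical RunRecord shape; only the prompt_hash and agent_hash field names come from the text above, and the real record likely carries more.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// RunRecord is a hypothetical run record. Only the two hash field names
// are taken from the source; the rest is illustrative.
type RunRecord struct {
	Agent      string  `json:"agent"`
	CostUSD    float64 `json:"cost_usd"`
	AgentHash  string  `json:"agent_hash"`
	PromptHash string  `json:"prompt_hash"`
}

// hashHex returns the hex-encoded SHA-256 digest of b (assumed algorithm).
func hashHex(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// NewRecord hashes the agent file as written on disk and the final prompt
// actually sent, so a run can be traced to the instructions behind it.
func NewRecord(name string, agentFile, prompt []byte, cost float64) RunRecord {
	return RunRecord{
		Agent:      name,
		CostUSD:    cost,
		AgentHash:  hashHex(agentFile),
		PromptHash: hashHex(prompt),
	}
}

func main() {
	agentFile := []byte("---\nname: uptime\n---\nCheck the endpoint.\n")
	prompt := []byte("Check the endpoint.\n")
	rec := NewRecord("uptime", agentFile, prompt, 0.008)
	out, err := json.MarshalIndent(rec, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Hashing the agent file and the prompt separately distinguishes "the file changed" from "the same file produced a different prompt", for example because memory was injected.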

cmd/watchd

Entry point. One binary, no runtime deps.

internal/cli

run, up, portfolio, outcome, pending, approve

internal/agent

Parses markdown agents + YAML frontmatter

internal/portfolio

Goals, policy, review limits, deterministic allocation

internal/runner

Spawns claude -p: memory, gate, verify, notify

internal/store

Run history as JSON files with provenance hashes

internal/daemon

Scheduler loop for recurring agents

≈ 2,600 LINES OF GO · ONE DEPENDENCY: yaml.v3 · PLAIN FILES, NO DATABASE

cli -> runner
cli -> store
cli -> agent
cli -> daemon
cli -> portfolio
daemon -> runner
runner -> agent
runner -> store
portfolio -> agent
portfolio -> store

FIG. 11 / OPEN THE MARKET · 2 MINUTES

Fund your first loop tonight.
