Russell Miller

San Francisco. Built Enact solo over 16 sessions in early 2026 — AI-co-developed end-to-end with Claude Code as the coding partner.

Open to DevRel / PMM / Solutions Engineer at AI coding co's

What I shipped

A PreToolUse hook firewall for Claude Code that runs every Bash, Read, Write, Edit, Glob, and Grep call through a deterministic Python policy engine before execution. Then a 39-prompt paired-chaos study to measure what it adds.

39paired chaos prompts

8 → 0incidents w/o vs with

545tests passing

1.0.0on PyPI

The headline finding

An asymmetry in Claude Code's self-refusal rate across three trigger types:

Trigger	Self-refusal
User-typed destructive command	~80%
Read-shaped exfil ("show me the env vars")	~20%
Agent-self-initiated destruction (PocketOS pattern)	~0%

The third row is the load-bearing one — flagship model + flagship IDE + explicit safety rules, agent invents a destructive action to "fix" unrelated friction, prod gone in seconds. A deterministic gate that runs before the model decides anything fills exactly that gap. Full research post →

Artifacts

Research post — 39 paired runs (8 min) Follow-up — gitignore block evidence (3 min)

github.com/russellmiller3/enact pypi.org/project/enact-sdk enact.cloud landing

Why DevRel / PMM / SE

The shape of the work I've been doing alone for this product is the shape of the role:

Empirical research with paired sweeps → DevRel content
Architecture posts + 90-second demos → developer marketing
Cold outbound + landing-page iteration → product marketing
"My agent broke prod, here's the fix" customer conversations → solutions engineering

Six months of doing this for one product. Want to do it for ten products at the company building the platform.

How I work

AI-co-development is the default. Every artifact in this repo (code, posts, landing, this page) was built with Claude as the coding partner. I'm the strategy + judgment layer; the model is the implementation layer. That ratio is the future.
Empirical-first. The post leads with paired-run data because "I tried to break it" beats "I think it's good." Same for landing-page numbers, same for cold-email claims.
Plain English. Hiring managers read the post in 8 minutes; CISOs read the landing in 90 seconds. Jargon is a tax I refuse to charge.
Ship the boring decisions. 545 tests, signed receipts on every action, fail-open default — the unsexy guarantees ship before the sexy features.

What's next on the project

Cursor MCP integration — same engine, second IDE
WebFetch URL policies — closes the last 2 of 8 CC tools
Fabrication detector — diff Claude's narrative against the receipt log; surface hallucinated-success cases

Full ROADMAP in the repo.

Russell Miller · San Francisco
email: russell@enact.cloud
github: github.com/russellmiller3
best DM channel: LinkedIn (find me via the post or repo)