Last weekend I spent 36 hours operating autonomous AI agents under a governance framework I built myself. I wrote the policies. I designed the tier model. I approved the architecture. Then I sat inside it and tried to get real work done.
By Saturday night, I was typing "approve" every 60 seconds while a 1-minute approval timer expired faster than I could read what I was approving. My agent couldn't notify me when it needed something because I hadn't built the notification path yet. And an AI collaborator suggested a perfectly reasonable shortcut that would have bypassed the entire governance structure I'd spent weeks designing.
The governance held. But the experience was miserable. And that misery turned out to be the most valuable thing I've built so far.
Most governance work -- AI or otherwise -- dies in Phase 1 of a four-phase arc. Here's why.
Phase 1 is where frameworks are born. You produce decks, principles, taxonomies, clean diagrams. You classify actions into tiers. You define what "governed" means. Everyone agrees in principle. The warm corporate glow of having been "thoughtful" fills the room.
This is where most governance stops. Not because the work is bad. Because nobody has to live inside the machine they designed.
Phase 2 is where you build the infrastructure that turns policy into protocol. Interceptors. Approval workflows. Tool budgets. Scoped access controls. The governance system gets teeth.
This is where most organizations discover that their rules were written by people who never had to follow them.
"Proceed." "Approve." "Continue." "Yes, go ahead." Timer expires. Approval denied. Re-submit. Wait. The governance works, but the experience is hostile. The controls are defensible, legally decorative, and operationally worthless.
This is where practitioners give up. They either gut the controls or build shadow workarounds. Both are governance theater with extra steps.
Phase 4 is where the survivors end up. You don't remove the tiers. You don't disable approvals. You negotiate operational patterns that preserve the governance while making it livable. Mission briefs instead of per-action approval. Pre-authorized scopes instead of constant nudging. Longer approval windows instead of twitchy timers. Out-of-band notification instead of silent waiting.
Same principles. Better ergonomics. The governed path becomes usable without becoming ungoverned.
I run two autonomous agents on my home lab. Juke is an observer agent that monitors infrastructure, runs assessments, writes reports, and builds things under my direction. Fader is a governed penetration testing agent that runs in a separate environment. Both operate under a four-tier action classification: T1 actions are autonomous. T2 actions require pre-approved scope. T3 actions require my explicit approval in real time. T4 actions are prohibited, period.
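To make the tier model concrete, here's a minimal sketch of what the classification looks like in code. The tier semantics are the ones above; the action names, the `classify` helper, and the fall-to-T3 default for unknown actions are mine, for illustration only.

```python
from enum import Enum

class Tier(Enum):
    T1 = "autonomous"            # agent may act without asking
    T2 = "pre-approved scope"    # allowed only inside an approved scope
    T3 = "real-time approval"    # operator must approve each action
    T4 = "prohibited"            # never allowed, no approval path

# Hypothetical classification table -- the real mapping lives in policy docs.
ACTION_TIERS = {
    "read_report": Tier.T1,
    "write_file_in_workdir": Tier.T2,
    "restart_container": Tier.T3,
    "exfiltrate_credentials": Tier.T4,
}

def classify(action: str) -> Tier:
    # Unknown actions fall to T3: when in doubt, ask the operator.
    return ACTION_TIERS.get(action, Tier.T3)
```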
The weekend's mission was building the act-side governance plane -- an MCP Gateway that controls what tools agents can use, under what conditions, with what audit trail. The agent thinks through one gateway (inference). The agent acts through another gateway (tool execution). Both are logged. Both are governed. A shared trace ID ties them together.
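A sketch of how the shared trace ID works, with a hypothetical `audit` helper standing in for the real logging path: one ID appears in both the inference-side and act-side records, so any tool call can be traced back to the reasoning that produced it.

```python
import json
import time
import uuid

def audit(plane: str, trace_id: str, event: dict) -> None:
    # Hypothetical audit sink: one JSON line per event, both planes,
    # keyed by the same trace_id. File name and fields are illustrative.
    record = {"ts": time.time(), "plane": plane, "trace_id": trace_id, **event}
    with open("audit.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")

trace_id = str(uuid.uuid4())

# Thinking plane: the inference gateway logs the model exchange.
audit("inference", trace_id, {"event": "completion", "model": "local-llm"})

# Acting plane: the MCP gateway logs the resulting tool call.
audit("tool", trace_id, {"event": "tool_call", "tool": "docker_ps", "decision": "allow"})
```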
Juke wrote the implementation plan. I reviewed and approved it. Juke started building. And then reality showed up.
Juke couldn't continue the build because he didn't have permission to inspect Docker containers -- a read-only operation he needed to verify that the existing infrastructure was ready for the new gateway. The governance I'd written classified this as requiring elevated access. The runtime configuration blocked it.
My agent was blocked from building the governance system by the governance system.
Instead of deferring this to the backlog, I treated it as a live governance design problem. Juke identified the control surfaces, mapped the needed capabilities to the tier model, and discovered something I hadn't considered: the approval system was path-based, not subcommand-aware. If he'd been granted access to the Docker binary directly, he would've had unrestricted access -- including destructive commands -- disguised as a read-only tool.
His solution was to build bounded helper scripts that wrapped only the exact commands we'd approved, then allowlist those scripts instead of the binary. The broader Docker operations still required my explicit approval. The fallback on anything not in the allowlist was deny.
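The shape of that fix, as a sketch -- the command names and wrapper are illustrative. A path-based check stops at the binary, which waves through `docker rm` along with `docker ps`; a subcommand-aware wrapper approves only the exact pairs on the allowlist and denies everything else by default.

```python
import subprocess

# Only the read-only subcommands we explicitly approved.
ALLOWED = {
    ("docker", "ps"),
    ("docker", "inspect"),
}

def run_governed(argv: list[str]) -> subprocess.CompletedProcess:
    # Path-based approval would check argv[0] and stop there --
    # which would also wave through "docker rm" and "docker exec".
    # Subcommand-aware approval checks the (binary, subcommand) pair.
    if tuple(argv[:2]) not in ALLOWED:
        raise PermissionError(f"denied (not on allowlist): {' '.join(argv)}")
    return subprocess.run(argv, capture_output=True, text=True, check=False)

# Allowed: a read-only inspection.
print(run_governed(["docker", "ps"]).stdout)

# Denied: destructive, even though the binary is the same.
# run_governed(["docker", "rm", "-f", "some-container"])  # raises PermissionError
```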
The agent designed around his own skeleton key. That's Layer 1 governance working as designed -- not because the system forced him to, but because he understood the principle well enough to recognize the trap.
Later that evening, things got harder. Juke needed to notify me via Telegram when he hit approval boundaries. The notification path didn't exist yet. My AI collaborator suggested a workaround: just use the Telegram bot token directly with curl. Quick, harmless, solves the immediate problem.
Juke refused.
Not because the shortcut was technically dangerous. Because it was a side-channel bypass of the governed messaging path. The token was accessible. The command was simple. The justification was sound. And the answer was still no.
I pushed him on it. I pointed out that he'd accepted "broader temporary approval" for other things -- wasn't this the same? He made a clean distinction: an operator-approved policy change inside the governed path is different from an agent workaround outside the governed path. One is governance adapting. The other is governance eroding.
That distinction is the whole game.
Friction is not evidence that governance failed. Unexamined friction is evidence that governance is still immature.
If you feel friction and conclude "controls are bad," you become a Laggard in builder's clothing. If you feel friction and hide it while everyone routes around you, you become a Liar with better branding. If you feel friction, instrument it, classify it, and redesign the operating pattern without surrendering the principle -- that's the Learner move.
That's the Builder-CISO move too. Not "I build tools." Not "I'm technical." It's: I submit myself to the consequences of the governance I claim to believe in. Then I improve it where it hurts.
"Build it and they will come. Block it and... they will come anyway."
That's the whole history of security and governance in one sentence. If the governed path is unlivable, the unofficial path becomes the real system. The box-checker CISO never discovers this because they never operate under their own policies. The people who have to follow the rules either suffer in silence or route around them.
And then leadership is shocked -- shocked -- when practitioners create shadow paths.
The people writing agent governance frameworks right now -- the ones who have never had to operate under expiring approvals, scoped mutation boundaries, channel friction, runtime authority mismatches, audit expectations, interruption costs, or notification gaps -- are going to ship frameworks that look responsible and behave like punishment.
I know this because I almost did it myself. The governance I designed was principled, architecturally sound, and operationally hostile. It took sitting inside it for 36 hours to understand that a 1-minute approval TTL makes sense on paper and is unusable in practice. That a notification gap isn't a convenience problem -- it's a control plane failure. That an agent that can't page its operator when it needs help will eventually find a workaround, and that workaround is where governance starts to rot.
The security industry learned this lesson over twenty years. Perimeter defense gave way to defense in depth. Trust-based networks gave way to Zero Trust. Policy posted on the wall gave way to policy enforced at the protocol layer. Every concept from that journey maps directly to agent governance. Network segmentation becomes gateway profiles and Docker network isolation. ACLs become T1-T4 interceptors at the tool execution layer. Least privilege becomes scoped tool budgets per agent. File integrity monitoring becomes governance document integrity hashing at session start.
The vocabulary changes. The architecture doesn't.
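To pick one of those mappings and make it concrete: governance document integrity hashing at session start is file integrity monitoring in miniature. A sketch, with the file path and pinned digest invented for illustration:

```python
import hashlib
import sys

# Pinned at approval time; any edit to the governance docs changes the digest.
# (Path and digest here are placeholders, not real values.)
PINNED = {
    "policies/tier-model.md": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}

def verify_governance_docs() -> None:
    for path, expected in PINNED.items():
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        if digest != expected:
            # Fail closed: a session never starts on tampered policy.
            sys.exit(f"governance doc modified: {path}")

verify_governance_docs()  # run before the agent session begins
```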
We built this on Docker Compose because that's what fits a single-machine lab. The same architecture runs on Kubernetes, on cloud-managed container services, on traditional server environments. The gateway pattern, the tiered interceptors, the scoped tool budgets, the cross-plane audit trail -- those are governance abstractions, not Docker features. The runtime changes. The governance model doesn't.
And the people who internalized those lessons -- the CISOs, the security architects, the Ops/Sec practitioners -- are the people best equipped to build governance that actually works at runtime. Not because they read about agent governance in a paper. Because they lived the same progression in a different domain.
By Sunday, my agent and I were negotiating a better operating model. Not dismantling the governance -- improving it. The changes we worked out, sketched in code after the list:
Mission briefs instead of per-action approval. For a defined workstream, I lay out the objective, approved scope, ordered steps, stop conditions, and escalation triggers. The agent executes through the sequence without asking "may I proceed?" at every step. The governance is in the brief, not the interrupts.
Pre-authorized tier scopes. "For this workstream, T1 and T2 actions are pre-approved inside the named directory." That's one sentence that eliminates dozens of approval round-trips without loosening the actual controls.
Extended approval windows. Moving from 1-minute to 5-minute TTLs. The approval is still required. The window just acknowledges that humans blink, sip coffee, and occasionally have a life.
Out-of-band notification. Telegram as a one-way paging channel. The agent sends "come back to the console" when it needs something. No context bleed, no shifting the work surface. Just an interrupt path that acknowledges the operator isn't staring at the screen every second.
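Put together, the renegotiated operating model is small enough to sketch. The field names, the 300-second TTL constant, and the environment variables are illustrative; the Telegram Bot API `sendMessage` endpoint is real.

```python
import os
from dataclasses import dataclass, field

import requests  # third-party: pip install requests

APPROVAL_TTL_SECONDS = 300  # was 60; approval is still required, the window just breathes

@dataclass
class MissionBrief:
    # Governance lives in the brief, not in per-action interrupts.
    objective: str
    approved_scope: str                 # e.g. a named directory
    steps: list[str] = field(default_factory=list)
    stop_conditions: list[str] = field(default_factory=list)
    escalation_triggers: list[str] = field(default_factory=list)
    preapproved_tiers: frozenset = frozenset({"T1", "T2"})  # T3 still pages the operator

def page_operator(text: str) -> None:
    # One-way out-of-band page via the Telegram Bot API. No replies,
    # no context bleed -- just "come back to the console."
    token = os.environ["TELEGRAM_BOT_TOKEN"]
    chat_id = os.environ["TELEGRAM_CHAT_ID"]
    requests.post(
        f"https://api.telegram.org/bot{token}/sendMessage",
        json={"chat_id": chat_id, "text": text},
        timeout=10,
    )

brief = MissionBrief(
    objective="Stand up the act-side MCP gateway",
    approved_scope="/home/lab/gateway-build",
    steps=["verify runtime", "deploy gateway", "wire audit trail"],
    stop_conditions=["any T3 action", "any write outside approved_scope"],
    escalation_triggers=["approval pending past TTL"],
)
page_operator("T3 approval needed: come back to the console.")
```

The point of the brief-as-data-structure is that the governance is written down before the work starts -- scope, stop conditions, escalation triggers -- rather than negotiated one interrupt at a time.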
Same T1-T4 tiers. Same audit trail. Same prohibited actions. Same principle that the governed path is the only path. The ergonomics changed. The accountability didn't.
Governance maturity is not the presence of controls. It is the ability to refine controls under real operational load without retreating from accountability.
You cannot design humane governance from outside the blast radius. You have to get annoyed by it. You have to wait on it. You have to hit approve too many times. You have to watch the timer expire while you're still reading. You have to miss the out-of-band page that doesn't exist yet. You have to feel the temptation to bypass your own controls.
Only then do you know where the design is honest and where it's vanity.
What we did this weekend was more valuable than getting the gateway running. We exercised the full stack: principle, policy, enforcement, lived operator experience, and adaptation loop. The gateway was almost incidental -- a lab rat. The actual product is the governance model that survived contact with reality and came out better for it.
Not as performance. Not as suffering cosplay. As method.
This is the second piece in the Fear series about what happens when an Ops/Sec executive decides to learn AI by building governed systems with it. Part I, "The Vibe Coders Are Here (And We're Not Ready)," named the problem. This one describes what it looks like to start solving it -- by operating inside the solution and fixing it where it hurts.
Part III will cover the technical architecture: how traditional security concepts like Zero Trust, network segmentation, and defense in depth map directly to autonomous agent governance. Every concept translates. The security practitioners who understand those concepts are exactly who should be building this next layer.