01
Case Studies · The Work

What I've built.

Not a wall of logos — a few systems told in depth. For each: the problem, the architecture decision and why, how agents were supervised and evaluated, the outcome, and what I'd do differently.

Multi-tenant governance platform · Production · v2.5.0

Control Tower

Problem

Five Microsoft tenants, 200+ locations, and no unified view of cost, identity, compliance, or lifecycle evidence. Leadership couldn't answer basic governance questions across the estate.

Architecture

Python/FastAPI + HTMX + PostgreSQL, built on the Azure SDK (Cost Management, Resource Manager, Policy, Defender). Container images are published to GHCR, and authentication is OIDC-only, with no stored secrets. I chose server-rendered HTMX over a SPA to keep the attack surface small and the system auditable.

Oversight & evals

Built by a supervised agent fleet. Every release is gated by 7,386 automated tests and a 48/48 automated judge score, and DR drills are completed before promotion.
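A minimal sketch of how a release gate like this could be expressed, assuming the test run, judge score, and DR drill result are collected into one record. The `ReleaseEvidence` shape and the `gate_failures` function are hypothetical names for illustration, not the production code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseEvidence:
    """Evidence collected for one release candidate (hypothetical shape)."""
    tests_run: int
    tests_failed: int
    judge_score: int
    judge_max: int
    dr_drill_passed: bool


def gate_failures(ev: ReleaseEvidence) -> list[str]:
    """Return the reasons a release is blocked; an empty list means promote."""
    reasons = []
    if ev.tests_run == 0:
        reasons.append("no tests ran")
    if ev.tests_failed > 0:
        reasons.append(f"{ev.tests_failed} test(s) failed")
    if ev.judge_score < ev.judge_max:
        reasons.append(f"judge score {ev.judge_score}/{ev.judge_max} below full marks")
    if not ev.dr_drill_passed:
        reasons.append("DR drill not completed")
    return reasons


if __name__ == "__main__":
    candidate = ReleaseEvidence(7386, 0, 48, 48, True)
    print(gate_failures(candidate) or "promote")
```

Returning the list of reasons, rather than a bare boolean, keeps the blocked-release record auditable: the same output can be attached to the release as evidence.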

Outcome

v2.5.0 in production June 2026, all five tenants syncing live.

What I'd do differently

I'd stand up the evaluation harness earlier. The judge caught regressions the test suite missed and quietly became the real release gate.

System diagram

Diagram summary: tenants HTT, BCC, FN, TLL, DCE; Control Tower (FastAPI · HTMX · PostgreSQL); Azure SDK (Cost Management, Resource Manager, Policy, Defender); security and auth (OIDC-only auth, workload identity federation, Docker -> GHCR, zero stored secrets). Release gate: 7,386 automated tests · judge 48/48 · DR drills complete.
Python · FastAPI · HTMX · PostgreSQL · Azure SDK · Docker · OIDC
Identity-aware RAG · Phase 2 · In build

Knowledge Fabric

Problem

Support knowledge scattered across POS configs (Mindbody, Zenoti), SharePoint, and 100K+ Freshdesk tickets. Answers had to respect brand and role boundaries — staff in one brand shouldn't surface another brand's playbook.

Architecture

A multi-agent ingestion pipeline with SQLite + FTS5 memory and Entra identity as the grounding layer — 1,523 verified humans across a 15-slot attribute schema. Identity is the filter, not an afterthought bolted on later.
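To illustrate what "identity is the filter" can look like at query time, here is a small sketch using Python's built-in `sqlite3` with an FTS5 table. It assumes the local SQLite build includes FTS5. The table layout, the `brand` and `role` columns, and the sample rows are hypothetical, not the Knowledge Fabric schema.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index; brand and role are stored but not tokenized."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5(title, body, brand UNINDEXED, role UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO docs (title, body, brand, role) VALUES (?, ?, ?, ?)", rows
    )
    return conn


def search(conn, query, brand, roles):
    """Full-text search restricted to the caller's brand and roles."""
    if not roles:
        return []  # no roles means no access, not unfiltered access
    marks = ",".join("?" for _ in roles)
    sql = (
        "SELECT title FROM docs "
        "WHERE docs MATCH ? AND brand = ? "
        f"AND role IN ({marks}) ORDER BY rank"
    )
    return [row[0] for row in conn.execute(sql, (query, brand, *roles))]


# Hypothetical sample data: two brands, two roles.
conn = build_index([
    ("Brand A refund playbook", "how to refund a membership", "brand_a", "staff"),
    ("Brand B refund playbook", "how to refund a package", "brand_b", "staff"),
    ("Brand A payroll notes", "refund of payroll errors", "brand_a", "manager"),
])
```

Calling `search(conn, "refund", "brand_a", ["staff"])` returns only the Brand A staff playbook. The brand and role constraints sit in the same SQL statement as the text match, so no code path can run the search without them.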

Oversight & evals

A planner agent orchestrating specialists — crawler, programmer, code-reviewer, security-auditor. Phase 1 closed at 52/52 QA checks.

Outcome

Phase 1 shipped; Phase 2 in build, with consumers planned for Teams, Outlook, and Freshdesk.

What I'd do differently

Grounding on identity from day one paid off — retrofitting role boundaries into a RAG system after the fact is painful.

System diagram

Diagram summary: data sources (POS configs: Mindbody · Zenoti; SharePoint docs and wikis; Freshdesk, 100K+ tickets); planner agent with crawler, programmer, code-reviewer, and security-auditor specialists; memory store (SQLite · FTS5 · vector index); Entra identity filter (1,523 verified users · 15-slot attribute schema · brand + role boundaries); consumers: Teams, Outlook, Freshdesk.
Multi-agent · RAG · SQLite · FTS5 · Entra ID · Python
Cost governance the COO actually uses · Live · Drives exec decisions

Estate Trace

Problem

Nobody could answer 'what do we run and what does it cost' across 41 repos, 11 subscriptions, 178 Azure resources, plus reseller-billed licensing.

Architecture

Live-pull scripts against GitHub, Microsoft Graph, and Azure Cost Management feeding a single canonical workbook — with 12/12 verification checks gating every refresh.

Oversight & evals

Each pull is verified against the prior canonical snapshot; all 12/12 checks must pass before the workbook is published as source of truth.
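A rough sketch of the verify-then-publish step, assuming each snapshot is a plain dictionary. The three checks, their names, the 20% tolerance, and the field names are illustrative assumptions; the actual workbook runs 12 checks that are not listed here.

```python
def check_no_subscription_dropped(prior, current):
    """Every subscription in the prior snapshot must still be present."""
    return set(prior["subscriptions"]) <= set(current["subscriptions"])


def check_resource_count_stable(prior, current, tolerance=0.2):
    """Flag pulls where the resource count swings more than the tolerance."""
    before, after = prior["resource_count"], current["resource_count"]
    return before > 0 and abs(after - before) / before <= tolerance


def check_cost_non_negative(prior, current):
    """A negative monthly cost usually means a bad pull, not a refund."""
    return all(v >= 0 for v in current["monthly_cost_by_subscription"].values())


CHECKS = [
    check_no_subscription_dropped,
    check_resource_count_stable,
    check_cost_non_negative,
]


def publish_if_verified(prior, current):
    """Return (canonical snapshot, failed check names).

    The new pull becomes canonical only when every check passes;
    otherwise the prior snapshot stays the source of truth.
    """
    failed = [check.__name__ for check in CHECKS if not check(prior, current)]
    return (current, []) if not failed else (prior, failed)


# Hypothetical toy snapshots.
prior = {
    "subscriptions": ["sub-a", "sub-b"],
    "resource_count": 10,
    "monthly_cost_by_subscription": {"sub-a": 100.0, "sub-b": 50.0},
}
good_pull = dict(prior, resource_count=11)
bad_pull = dict(prior, subscriptions=["sub-a"])
```

A failed refresh leaves the previous workbook in place and names the checks that failed, which keeps the published numbers trustworthy even when a pull goes wrong.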

Outcome

~$12K/mo run-rate quantified with a ~40% optimization path and 25 prioritized remediation actions. It drives executive financial decisions.

What I'd do differently

The real value was the verification checks. Once people trusted the numbers, the workbook became the thing every cost conversation started from.

Python · GitHub API · MS Graph · Azure Cost Mgmt
Code private — architecture summarized; diagram on the roadmap.
OIDC lifecycle across 5 tenants · Production

Zero-secret automation

Problem

Lifecycle automation — onboarding, offboarding, leave-of-absence — across five tenants, without storing a single credential anywhere.

Architecture

GitHub Actions with workload identity federation per tenant, driven from Monday.com and CSV inputs. Least privilege and federated trust by construction, zero long-lived secrets.
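The trust check behind workload identity federation can be sketched in a few lines. A federated credential names an issuer, a subject, and an audience, and a GitHub Actions OIDC token is accepted only when all three match. The issuer and audience values below are the standard ones for GitHub Actions and Entra; the repository name, environment names, and the `trusts` helper are hypothetical. Signature and expiry validation, which the identity provider also performs, is deliberately left out.

```python
from dataclasses import dataclass

GITHUB_ISSUER = "https://token.actions.githubusercontent.com"
ENTRA_AUDIENCE = "api://AzureADTokenExchange"


@dataclass(frozen=True)
class FederatedCredential:
    """One trust entry per tenant (hypothetical shape)."""
    tenant: str
    subject: str
    issuer: str = GITHUB_ISSUER
    audience: str = ENTRA_AUDIENCE


def trusts(cred, claims):
    """True only if the token's issuer, subject, and audience all match."""
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == cred.issuer
        and claims.get("sub") == cred.subject
        and cred.audience in audiences
    )


# Hypothetical credential: only the tenant-a environment of one repo is trusted.
cred = FederatedCredential(
    tenant="tenant-a",
    subject="repo:example-org/lifecycle:environment:tenant-a",
)
token_claims = {
    "iss": GITHUB_ISSUER,
    "sub": "repo:example-org/lifecycle:environment:tenant-a",
    "aud": ENTRA_AUDIENCE,
}
```

Because the subject pins a specific repository and environment, a workflow running for one tenant cannot obtain a token for another, and there is no long-lived secret to leak or rotate.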

Oversight & evals

Every run leaves an audit trail; permissions are scoped per-tenant and least-privilege — the security story FDE interviewers ask about.

Outcome

Lifecycle automation running across all five tenants with no stored secrets and full audit trails.

What I'd do differently

Federated identity took longer to wire than secrets would have, but it removed an entire class of risk. Worth every hour.

System diagram

Diagram summary: trigger sources (Monday.com HR workflow events; CSV import for bulk lifecycle ops); GitHub Actions with workload identity and per-tenant scope, zero stored secrets; tenants HTT, BCC, FN, TLL, DCE, each with OIDC, federated trust, and least privilege; operations: onboard, offboard, leave of absence. Every run produces a full audit trail; no secrets ever touch disk.
GitHub Actions · Workload Identity Federation · OIDC · Monday.com
Local-first AI email agent (personal) · Personal · Shipped

Mysa Mail

Problem

Email overwhelm — but running cloud AI over a personal inbox is a privacy non-starter. I wanted the same production discipline off the clock.

Architecture

FastAPI + SvelteKit, 5,700+ LOC backend, with 5 classifiers and approval workflows before any destructive write. Local Ollama by default, cloud opt-in. WCAG 2.2 AA throughout.

Oversight & evals

Audit-trailed inference with explicit approval gates before destructive actions — the agent proposes, the human disposes.
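The approval-before-write pattern can be sketched as a small queue, assuming actions are tagged by name and destructive ones are held until a human approves them. The `ApprovalQueue` class, the action names, and the log format are illustrative assumptions, not the Mysa Mail code.

```python
from dataclasses import dataclass, field

# Hypothetical set of actions that must never run without approval.
DESTRUCTIVE = {"delete", "archive", "unsubscribe"}


@dataclass
class ApprovalQueue:
    pending: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def propose(self, action, message_id, reason):
        """The agent proposes; destructive actions wait, safe ones apply."""
        proposal = {"action": action, "message_id": message_id, "reason": reason}
        if action in DESTRUCTIVE:
            self.pending.append(proposal)
            self.audit_log.append(("proposed", action, message_id))
            return "awaiting_approval"
        self.audit_log.append(("applied", action, message_id))
        return "applied"

    def approve(self, message_id):
        """The human disposes: release one pending proposal for execution."""
        for proposal in self.pending:
            if proposal["message_id"] == message_id:
                self.pending.remove(proposal)
                self.audit_log.append(("approved", proposal["action"], message_id))
                return proposal
        return None
```

For example, `propose("label", "m1", "newsletter")` applies immediately, while `propose("delete", "m2", "spam")` returns `"awaiting_approval"` and nothing happens until `approve("m2")` is called. Every step lands in `audit_log`, so the trail exists whether or not the action ever runs.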

Outcome

Shipped and in daily use, with local-first inference and an accessible UI.

What I'd do differently

The approval-before-write pattern is the thing I now reach for in every agent I build.

FastAPI · SvelteKit · Ollama · SQLite · WCAG 2.2 AA
The production quality bar · Production

Group Management Hub

Problem

Group and access sprawl across four tenants — 334 groups with no consistent governance or quality bar.

Architecture

The same supervised-agent discipline as Control Tower, held to a strict gate: 7,984 tests and WCAG AA across the UI.

Oversight & evals

7,984 automated tests gating every release; accessibility validated to WCAG AA before promotion.

Outcome

Production across four tenants managing 334 groups — the exhibit for the quality bar I hold every system to.

What I'd do differently

Boring, exhaustive testing is the unglamorous reason these systems can be agent-built and still trusted in production.

Python · FastAPI · Azure · WCAG AA
Code private — architecture summarized; diagram on the roadmap.

More from the workshop.

Smaller trails — open-source tools and personal builds that carry the same discipline.

Open Source

Code Puppy

An open-source AI code agent — multi-agent architecture with specialized sub-agents for QA, architecture, research, and planning. No IDE, no vendor lock-in.

TypeScript · Multi-Agent · CLI

Shipped

360 Communicator

Cross-platform desktop app for franchise communication — Tauri + React + Vite. Unifies fragmented tooling for multi-unit operators.

Tauri · React · Vite · TypeScript

Live

C&M Barber Shop

Conversion-first, zero-framework static site — Cloudflare Pages, WCAG 2.2 AA, self-hosted fonts, A+ security headers. Accessible doesn't mean boring.

HTML · CSS · Cloudflare Pages · WCAG 2.2

Enterprise

Passkey Enable Azure

Modernizing identity from passwords to FIDO2/passkeys across enterprise environments — security engineering meets user experience.

Azure · FIDO2 · Identity · Security

Process Innovation

Reframing how teams move from idea to ship.

Trails & Mountain Biking

Northwest Arkansas singletrack as creative fuel.

Family, Community, Nature

The grove that grounds everything else.