How I Built an Agent-Publish Framework That Codes and Tests Itself

Every night my agent runs research. Every morning I wake up to markdown files sitting in a folder, going nowhere. That is the 3 AM problem. This post is about how I am fixing it, with code that writes code, tests itself, and ships to GitHub Pages on a schedule. No hand-holding. No manual copy-paste.

TL;DR: I built agent-publish, a Python CLI that turns agent markdown into styled HTML and pushes it to GitHub Pages. A cron job runs daily, pulls tasks from a Kanban file, writes code, runs tests, commits, and pushes. The agent does the work. I just review the output.

The Problem

I run three research crons every Monday. ITSM/MSP agentic AI at 9 AM. Design evolution at 2 PM. Skills radar at 7 PM. Each one produces a markdown report. For months those reports lived in ~/.hermes/cron/output/, invisible to everyone including me.

The gap between "agent produced research" and "human can read it" was manual drudge work: copy markdown, paste into a converter, style it, commit, push. I did it once. Then twice. By the third time I knew I would rather write a tool than do this again.

That is how agent-publish started. Not as a grand vision. As a refusal to do repetitive shit.

What Agent-Publish Does

At its core it is a markdown-to-HTML pipeline with three jobs:

Convert markdown to semantic HTML with syntax highlighting, tables, and a TOC.
Style it with inline CSS. No external dependencies. No CDN fonts. No JavaScript.
Publish to a GitHub Pages repo with fingerprint-based deduplication. No duplicate commits.

The CLI is dead simple:

$ agent-publish research.md --repo ~/Projects/vadapayasam.github.io --theme default

That one command produces a self-contained HTML file, copies it to the right path, commits with a cache-busting hash, and pushes to main. If the content has not changed, it skips the commit entirely.
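For a feel of the command-line surface, here is a minimal argparse sketch that accepts the same arguments as the command above. It is an illustration, not the project's actual entry point: build_parser is a hypothetical name, and I am assuming --repo is required and --theme is limited to the three built-in themes.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the command shown above."""
    parser = argparse.ArgumentParser(prog="agent-publish")
    parser.add_argument("source", type=Path, help="markdown file to publish")
    parser.add_argument(
        "--repo",
        type=Path,
        required=True,  # assumption: the real CLI may have a default
        help="local checkout of the GitHub Pages repo",
    )
    parser.add_argument(
        "--theme",
        default="default",
        choices=["default", "minimal", "brutalist"],
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["research.md", "--repo", "site", "--theme", "minimal"]
)
```

Passing an unknown theme makes argparse exit with a usage error, which is the kind of loud failure you want from a tool that runs unattended.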

The Architecture

Converter

Parses markdown with Python-Markdown extensions for fenced code, tables, and TOC. Extracts the H1 as title. Generates a content fingerprint for dedup.
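The title extraction and fingerprinting can be sketched with the standard library alone. This is a rough illustration, not the real converter.py: the function names are hypothetical, and SHA-256 over whitespace-normalised text is my assumption about what a content fingerprint could look like.

```python
import hashlib
import re


def extract_title(markdown_text: str, fallback: str = "Untitled") -> str:
    """Return the text of the first ATX-style H1 ("# Title"), if any.

    Limitation: a "# comment" line inside a fenced code block would also match.
    """
    match = re.search(r"^#[ \t]+(.+?)[ \t]*$", markdown_text, flags=re.MULTILINE)
    return match.group(1).strip() if match else fallback


def fingerprint(markdown_text: str) -> str:
    """Hash the content after normalising line endings and trailing whitespace."""
    unix_text = markdown_text.replace("\r\n", "\n")
    lines = [line.rstrip() for line in unix_text.split("\n")]
    normalised = "\n".join(lines).strip() + "\n"
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Normalising before hashing means a report re-saved with different line endings does not count as new content.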

Publisher

Git-aware deployment. Checks a JSON cache before writing. Commits with a short prefix. Pushes to origin. Handles asset copying alongside the HTML.
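The cache check is the piece that prevents duplicate commits. Here is a minimal sketch of the idea, assuming the cache is a flat JSON object that maps a target path to the last published fingerprint. The cache layout and the function names are hypothetical, not taken from the repo.

```python
import json
from pathlib import Path


def load_cache(cache_path: Path) -> dict:
    """Read the cache file, treating a missing file as an empty cache."""
    if not cache_path.exists():
        return {}
    return json.loads(cache_path.read_text(encoding="utf-8"))


def needs_publish(cache_path: Path, target: str, fingerprint: str) -> bool:
    """True when the target was never published or its content has changed."""
    return load_cache(cache_path).get(target) != fingerprint


def record_publish(cache_path: Path, target: str, fingerprint: str) -> None:
    """Store the fingerprint after a successful publish."""
    cache = load_cache(cache_path)
    cache[target] = fingerprint
    cache_path.write_text(
        json.dumps(cache, indent=2, sort_keys=True), encoding="utf-8"
    )
```

A publisher built this way would call needs_publish before touching git and record_publish only after the push succeeds, so a failed push gets retried on the next run.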

Themes

Three built-in CSS families: default (warm cream), minimal (brutal black and white), and brutalist (neon on black). All inline. No external files.
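"All inline" here means the stylesheet travels inside the HTML file. A sketch of how that could look is below. The CSS rules are placeholders I made up to match the theme descriptions, and render_page is a hypothetical helper, not the project's themes.py.

```python
from html import escape

# Placeholder rules only; the real themes are richer than one line each.
THEMES = {
    "default": "body{background:#fdf6e3;color:#222;font-family:Georgia,serif}",
    "minimal": "body{background:#fff;color:#000;font-family:sans-serif}",
    "brutalist": "body{background:#000;color:#39ff14;font-family:monospace}",
}


def render_page(title: str, body_html: str, theme: str = "default") -> str:
    """Wrap converted HTML in a page that embeds its CSS in a <style> tag."""
    css = THEMES[theme]  # raises KeyError for an unknown theme
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title><style>{css}</style></head>"
        f"<body><main>{body_html}</main></body></html>"
    )
```

Because nothing is linked, the output file renders the same from disk, from GitHub Pages, or from an email attachment.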

Validator

Post-publish checks: HTML structure, CSS renders without deps, URL is reachable, commit is non-empty. Fails loudly if anything breaks.
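The structure and no-dependency checks can be approximated with the standard library's HTML parser. This sketch covers only those two checks, not URL reachability or the empty-commit check, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class SelfContainedCheck(HTMLParser):
    """Collect signs that a page is missing structure or pulls in external files."""

    def __init__(self) -> None:
        super().__init__()
        self.has_main = False
        self.problems: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "main":
            self.has_main = True
        elif tag == "script":
            self.problems.append("script tag found")
        elif tag == "link" and attributes.get("rel") == "stylesheet":
            self.problems.append(f"external stylesheet: {attributes.get('href')}")


def validate_html(html_text: str) -> list[str]:
    """Return a list of problems; an empty list means the page passed."""
    checker = SelfContainedCheck()
    checker.feed(html_text)
    checker.close()
    problems = list(checker.problems)
    if not checker.has_main:
        problems.append("missing <main> element")
    return problems
```

Returning a list instead of raising on the first problem lets the caller report everything that is wrong in one go, then fail.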

The repo lives at github.com/thisisprabha/agent-publish. It is MIT licensed. Use it, fork it, ignore it. I do not care.

But That Is Just the Tool

The interesting part is not the code. It is how the agent runs it without me.

Here is the daily loop:

Cron fires at 1 AM IST. A Hermes cron job loads the kanban-codex-lane skill.
It reads KANBAN.md inside the repo. That file has five columns: Backlog, Ready, In Progress, Review, Done.
Picks the top two Ready cards. Never more than two. Constraint forces focus.
Executes each card: write code, run tests, fix failures, commit.
Moves cards from Ready to In Progress to Done.
Pushes the updated Kanban state and all commits to GitHub.
Sends a Telegram summary with what got done, what failed, and what is next.

The Kanban file is the single source of truth. Not Todoist. Not a Trello board. A markdown file in Git. The cron reads only that file, so the context window stays tiny. No 15K-token roadmap planning sessions. Just: read two cards, do the work, commit.
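Picking the top two Ready cards is a small parsing job. The sketch below assumes a layout the post does not spell out: each column is a "## Name" heading and each card is a "- ID title" bullet under it. Treat the format and the function names as assumptions.

```python
import re

COLUMNS = ("Backlog", "Ready", "In Progress", "Review", "Done")


def parse_board(text: str) -> dict[str, list[str]]:
    """Group card lines under the column heading they appear beneath."""
    board: dict[str, list[str]] = {name: [] for name in COLUMNS}
    current = None
    for line in text.splitlines():
        heading = re.match(r"^##\s+(.+?)\s*$", line)
        if heading:
            # Unknown headings close the current column instead of polluting it.
            current = heading.group(1) if heading.group(1) in board else None
        elif current and line.startswith("- "):
            board[current].append(line[2:].strip())
    return board


def pick_ready(text: str, limit: int = 2) -> list[str]:
    """Return at most `limit` cards from the top of the Ready column."""
    return parse_board(text)["Ready"][:limit]
```

The limit is a plain slice, so reordering the bullets by hand is all it takes to change what the agent works on next.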

How the Daily Sprint Works in Practice

On June 2, 2026, the cron woke up and found AP-001 and AP-002 in the Ready column:

AP-001 DONE — Scaffold core converter module (markdown to HTML with fenced code and tables)
AP-002 DONE — Create base CSS system inline, clean minimal style, mobile responsive

It wrote converter.py, themes.py, and a test that verified the HTML output had a <main> tag and no broken tables. Then it ran pytest. Tests passed. Cards moved to Done. Commit: 🤖 Agent publish: scaffold converter + themes.
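Moving a card is just a line edit in the Kanban file. A minimal sketch, again assuming "## Column" headings and "- ID title" bullets, which is my guess at the layout, not something the repo confirms:

```python
def move_card(board_text: str, card_id: str, target_column: str) -> str:
    """Move the bullet for `card_id` to the top of `target_column`."""
    lines = board_text.splitlines()
    heading = f"## {target_column}"
    if heading not in lines:
        raise ValueError(f"column {target_column!r} not found")
    card_index = next(
        (i for i, line in enumerate(lines) if line.startswith(f"- {card_id} ")),
        None,
    )
    if card_index is None:
        raise ValueError(f"card {card_id!r} not found")
    card_line = lines.pop(card_index)
    # Look the heading up after the pop, since indexes may have shifted.
    lines.insert(lines.index(heading) + 1, card_line)
    return "\n".join(lines) + "\n"
```

Both checks happen before anything is modified, so a typo in a card ID leaves the board untouched.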

The next day it will pick AP-003 and AP-004: the CLI entry point and the GitHub Pages deployment helper. Two cards. No scope creep.

The Model Layer

Here is where it gets slightly unhinged. The cron does not just run shell commands. It loads an LLM skill, reads the Kanban context, and reasons about what code to write. It can:

Generate Python modules from a card description.
Write tests before implementation, or after, depending on the card.
Debug its own test failures by reading stderr and patching the code.
Refuse to commit if tests fail. No green-check theatre.
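The "refuse to commit if tests fail" rule is the easiest of these to show in code. This is a sketch under assumptions, not the skill's actual logic: commit_if_green and run_command are hypothetical names, and the command runner is injectable so the gate can be exercised without a real repo.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(command: Sequence[str]) -> int:
    """Run a command in the current directory and return its exit code."""
    return subprocess.run(command, check=False).returncode


def commit_if_green(message: str, run: Runner = run_command) -> bool:
    """Commit only when pytest exits with 0; otherwise leave the tree alone."""
    if run(["pytest", "-q"]) != 0:
        return False  # failing tests (or no tests collected) block the commit
    run(["git", "add", "--all"])
    return run(["git", "commit", "-m", message]) == 0
```

pytest also exits non-zero when it collects no tests, so a card that forgets to write tests cannot sneak a commit through this gate.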

The model is Kimi k2.6 via a custom provider. It is cheap, has a 1M token context window, and does not mind reading stack traces. I do not use it for design work. That is my job. But for plumbing code, tests, and repetitive publishing logic? It is faster than me and does not get bored.

The Honest Part (What Is Still Broken)

I am not going to sell you a finished product. Here is what still sucks:

Problem	Status	Why It Hurts
Syntax highlighting requires Pygments	WIP	Adds a dependency. Inline CSS is cleaner but hard to maintain.
No dark mode toggle	TODO	Media query is easy. Toggle with JS is not. Personal preference: ship without.
Image asset copying is naive	WIP	Relative paths break if the markdown references parent directories.
Config file is YAML	DONE	Actually fine. TOML would be cooler but YAML is readable.
No PyPI package yet	TODO	Need `setup.py`, classifiers, and a release workflow. Boring but required.
Tests are minimal	WIP	Four sample markdowns verify structure. Need integration tests for git push.

The Kanban board has twelve cards total. Seven are Done. Five are in flight. I am not adding new features until those five are closed. Scope discipline is the only reason this has not become abandonware yet.

Why Git-Based Kanban Instead of Todoist

I use Todoist for my daily life. Groceries, calls, review requests. But for agent sprints it is the wrong shape. Todoist tasks are flat. They do not know about git state, commit hashes, or test failures.

A markdown Kanban file in the repo solves that:

Git history shows when a card moved from Ready to Done. Immutable audit trail.
The agent reads it with zero API calls. No rate limits. No auth tokens.
I can edit it manually if I want to reorder priorities. The agent will pick up the change on the next run.
It is portable. Anyone who clones the repo sees the exact same board.

Todoist still gets a mirror task called [Mirror] agent-publish daily sprint. That is for my human brain, not the agent. The agent does not touch Todoist.

How to Use This for Your Own Projects

If you want to copy this, here is the shortest path:

Clone thisisprabha/agent-publish and install it: pip install -e .
Create a repo with GitHub Pages enabled. Any static site generator works. I use raw HTML in a sketch/ folder.
Write a KANBAN.md with your first two tasks in the Ready column. Keep them small. "Scaffold converter" is a day of work. "Build entire platform" is a year of disappointment.
Set a cron in Hermes (or any agent scheduler) that runs daily, loads the repo, reads KANBAN.md, executes the Ready cards, commits, and pushes.
Do not watch it. Check the Telegram summary in the morning. If it failed, read the error. If it succeeded, move on with your life.

The magic is not the tool. It is the constraint: two cards per day, no more. That forces the agent to finish things. Without that limit you get beautiful half-finished architecture diagrams and zero working code.

What Comes Next

The immediate backlog is finishing the core package: PyPI release, better tests, and a README that does not lie about features. After that I want to extract three more agent workflows from my vault:

Agent-lane: A subagent orchestration pattern using worktree isolation. One agent writes, one tests, one reviews. No single thread choking on context limits.
Agent-self-audit: Nightly log analysis that finds broken patterns in cron output and opens repair tasks automatically.
Agent-research-guard: A state-aware research pipeline that fingerprints web sources before expensive LLM calls, skipping duplicates and drifting queries.

Each one will get its own repo, its own Kanban, and its own daily cron. The goal is a fleet of small agents with narrow scope, not one god-agent that knows everything and does nothing.

Built by Prabha with agent-publish. Code is MIT. Opinions are mine. If you find a bug, open an issue. If you want to argue about YAML vs TOML, keep it to yourself.