rote takes an agent skill you already trust and graduates it into a real workflow. Everything provable becomes plain code. The LLM stays only where judgment is genuinely required.
✓ Apache 2.0  ✓ On PyPI today  ✓ Six runtime targets
# graduate a skill into a durable pipeline
$ uvx --from rote-cli rote graduate ./my-skill --out ./graduated/
# re-emit the same pipeline to another runtime
$ uvx --from rote-cli rote emit ./graduated --runtime temporal
Why compile a skill
rote fixes the three problems below the way a compiler would: it moves the deterministic parts into code and keeps the model only where inputs are truly open-ended.
A fuzzy skill run takes ten to twenty minutes of agent time, every single time.
You pay full model tokens on every run, even for steps that never change.
The same input can produce a different path tomorrow. That is hard to test and harder to trust.
How it works
A SKILL.md and its references folder are all rote needs: the same format your agents already use.
An LLM reads the skill against a structured rubric and sorts each step into one of five node kinds by how deterministic it can be.
One pipeline.yaml plus generated code for the engine you already operate. Emission is plain code and byte-identical every time, so you can regression-test it.
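Byte-identical emission means a regression test can be as simple as hashing two emitted trees and comparing the digests. A minimal sketch, assuming the emitted pipeline is an ordinary directory of files; `tree_digest` is a hypothetical helper written for this example and is not part of rote:

```python
import hashlib
from pathlib import Path


def tree_digest(root: str) -> str:
    """Return one SHA-256 digest covering every file under root.

    Hypothetical helper, not part of rote. Relative paths and file bytes
    are hashed in sorted order, so the result does not depend on the
    order in which the filesystem happens to list files.
    """
    base = Path(root)
    h = hashlib.sha256()
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        h.update(path.relative_to(base).as_posix().encode("utf-8"))
        h.update(b"\0")  # separator between the path and its contents
        h.update(path.read_bytes())
        h.update(b"\0")  # separator between files
    return h.hexdigest()
```

Emitting the same graduated pipeline into two directories and asserting that `tree_digest(a) == tree_digest(b)` would flag any drift in the generated code.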
Five node kinds
Each step becomes a node with the cheapest implementation that still does the job.
pure_function: Provable logic becomes plain Python. Deterministic, testable, free to run.
external_call: API and tool calls with typed inputs and outputs, handled by your runtime's retries.
llm_judge: A single model call behind a typed signature, kept only where inputs are genuinely unbounded.
agent_loop: A bounded agent loop, preserved only for steps that truly need exploration.
hitl_gate: A human approval gate that durably suspends the pipeline and resumes when you say so.
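One way to picture the five kinds is as a small tagged data model. The sketch below is illustrative only: the `NodeKind` values mirror the names above, but the `Node` class, the step names, and `model_free_share` are assumptions made for this example, not rote's actual intermediate representation.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    PURE_FUNCTION = "pure_function"
    EXTERNAL_CALL = "external_call"
    LLM_JUDGE = "llm_judge"
    AGENT_LOOP = "agent_loop"
    HITL_GATE = "hitl_gate"


# Kinds that call a model at run time; the other three do not.
MODEL_KINDS = {NodeKind.LLM_JUDGE, NodeKind.AGENT_LOOP}


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical step name
    kind: NodeKind


def model_free_share(nodes: list[Node]) -> float:
    """Fraction of nodes that run without any model call."""
    if not nodes:
        return 0.0
    free = sum(1 for n in nodes if n.kind not in MODEL_KINDS)
    return free / len(nodes)


# A made-up four-step pipeline: only score_fit needs a model.
pipeline = [
    Node("normalize_email", NodeKind.PURE_FUNCTION),
    Node("lookup_account", NodeKind.EXTERNAL_CALL),
    Node("score_fit", NodeKind.LLM_JUDGE),
    Node("approve_send", NodeKind.HITL_GATE),
]
share = model_free_share(pipeline)  # 3 of 4 nodes
```

Tagging each step this way makes the cost profile of a pipeline easy to inspect: every node outside `MODEL_KINDS` spends no model tokens at run time.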
Runtime targets
The intermediate representation is independent of any runtime. Emit the same pipeline to any of six targets today, with Restate planned.
Python, the default. SQLite in dev, Postgres in prod, no orchestrator to run.
Python workers for teams already on Temporal.
TypeScript at the edge.
The DBOS model for TypeScript stacks.
TypeScript, event-driven.
A raw adapter with no engine at all.
The research behind the idea
rote builds on Compiled AI, a 2026 paper by Trooskens et al. on compiling LLM workflows into deterministic pipelines. Their measurements, not ours: 57x fewer tokens, 450x lower median latency, and 100 percent reproducibility.
57x fewer tokens than running the agent loop
450x lower median latency
100% reproducibility, versus 95% at temperature zero
rote's own bundled example graduates a real sales outreach skill into a 22-node pipeline, 78.9 percent plain code, in about 13 minutes for about $0.70. It also caught three mandatory exclusion checks the human baseline missed.
One-off tasks and open-ended research need flexibility. Keep those as live agent loops.
rote shines on the skill you have already run twenty times and now want to run a thousand more, unattended.
Approval gates are a first-class node kind. The pipeline durably pauses until someone signs off.
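To make "durably pauses" concrete, here is a minimal sketch of the idea using only Python's standard `sqlite3` module. It is not how rote or any of its runtime targets implements gates; the table layout and the function names `open_store`, `suspend`, `approve`, and `can_resume` are assumptions made for this example. The point is that the gate's state lives on disk, so the process can exit while waiting and a later process can pick the run back up.

```python
import sqlite3


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the on-disk store that records gate state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gates ("
        "run_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def suspend(conn: sqlite3.Connection, run_id: str) -> None:
    """Record that a run is waiting at an approval gate."""
    conn.execute(
        "INSERT OR IGNORE INTO gates (run_id, status) VALUES (?, 'pending')",
        (run_id,),
    )
    conn.commit()


def approve(conn: sqlite3.Connection, run_id: str) -> None:
    """Record a human sign-off for a waiting run."""
    conn.execute(
        "UPDATE gates SET status = 'approved' WHERE run_id = ?", (run_id,)
    )
    conn.commit()


def can_resume(conn: sqlite3.Connection, run_id: str) -> bool:
    """True only once the run's gate has been approved."""
    row = conn.execute(
        "SELECT status FROM gates WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row is not None and row[0] == "approved"
```

A worker would call `suspend` and stop; after someone calls `approve`, any later process that opens the same file sees `can_resume` return true and continues from the gate.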
Claude Code plugin
Add the marketplace inside Claude Code, then ask Claude to graduate a skill in plain language. A second skill serves your graduated pipelines as MCP tools, so agents can call them like any other tool.
# inside Claude Code
$ /plugin marketplace add trevhud/rote
Questions engineers ask
rote is early and moving fast. Join the waitlist and we will let you know as new runtimes and major releases land.
Prefer to dive in now? rote-cli is on PyPI and the source is on GitHub.