After this lesson you'll be able to write a multi-step Workflow with per-step retry config, explain why step logic must be idempotent, and know when to reach for Workflows instead of a plain Worker or Queues.
A normal Worker lives for the length of one request: if it crashes halfway through a five-step process, everything it hadn't already persisted is gone, and you're rebuilding the state from scratch. Workflows is Cloudflare's answer to that problem — durable execution for long-running, multi-step processes. You write your process as a sequence of named steps; the platform checkpoints the return value of each completed step, and if the underlying instance crashes, gets rescheduled, or a step throws, Workflows resumes from the last completed step instead of restarting the whole run. A Workflow instance can legitimately run for minutes, hours, or weeks — it's built for that, not just tolerating it.
Under the hood, each Workflow instance is backed by a Durable Object. Workflows layers a step API on top of it, but knowing this explains the behavior: one instance processes its steps one at a time, in order, and its state survives restarts the same way a Durable Object's storage does.
You define a Workflow as a class extending WorkflowEntrypoint, with a run(event, step) method. Inside run, you call step.do("step name", callback) for each unit of work. Three things matter about that call:
- The step name acts as a cache key. Once a named step completes, its result is persisted, and a retry or resume reuses that result instead of re-running the step. Names must therefore be deterministic (a fixed string, never something like `step-${Date.now()}`), or the engine can't tell a resumed step from a new one.
- An optional config object sets that step's retry policy (`limit`, `delay`, `backoff`) and `timeout`; a step that throws is retried automatically under that policy.
- step.sleep() / step.sleepUntil() pause a Workflow, even for days, without holding a running instance or billing CPU time while it waits.

The consequence of automatic retries is the single most important rule in this lesson: step callbacks must be idempotent, and the control flow around them must be deterministic. Retries mean a step's code can run more than once for the same logical attempt (e.g. the API call succeeds but the response is lost before Workflows records it, triggering a retry that repeats the call). Code outside a step.do(), meaning anything placed directly in run(), can also re-execute if the engine restarts. Side effects (writes, sends, charges) therefore belong strictly inside steps, and a step that isn't naturally idempotent needs to check whether its effect already happened before performing it again (e.g. "has this order already been marked paid?" before charging).
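To make the cache-key and replay behavior concrete, here is a toy in-memory model of step checkpointing. It is a sketch for building intuition, not how the Workflows runtime is implemented: `ToyEngine`, `runReport`, `simulateCrashAndResume`, and `runWithUnstableName` are hypothetical names, and the model is synchronous where real step callbacks are async.

```typescript
// Toy model only: the real engine persists checkpoints durably and runs async
// callbacks. This version is synchronous and in-memory for brevity.
class ToyEngine {
  // Completed step results keyed by step name (stand-in for durable storage).
  private checkpoints = new Map<string, unknown>();
  // Names of steps whose callbacks actually ran, in order.
  public executed: string[] = [];

  do<T>(name: string, callback: () => T): T {
    if (this.checkpoints.has(name)) {
      // Replay: a step with this name already completed, so reuse its result.
      return this.checkpoints.get(name) as T;
    }
    this.executed.push(name);
    const result = callback();
    this.checkpoints.set(name, result);
    return result;
  }
}

// A two-step run() body. `crashAfterFetch` simulates the instance dying mid-run.
function runReport(engine: ToyEngine, crashAfterFetch: boolean): number {
  const items = engine.do("fetch source data", () => [10, 20, 30]);
  if (crashAfterFetch) throw new Error("simulated crash");
  return engine.do("transform result", () => items.reduce((sum, n) => sum + n, 0));
}

function simulateCrashAndResume(): { total: number; executed: string[] } {
  const engine = new ToyEngine();
  try {
    runReport(engine, true); // first attempt dies after step 1 is checkpointed
  } catch {
    // Swallowed here; a real engine would reschedule the instance.
  }
  const total = runReport(engine, false); // resume re-enters run() from the top
  return { total, executed: engine.executed };
}

// Counter-example: a name that changes between attempts never hits the cache.
function runWithUnstableName(engine: ToyEngine, attempt: number): number[] {
  return engine.do(`fetch-${attempt}`, () => [10, 20, 30]);
}
```

In this model the resumed run starts again at the top of `runReport`, yet the first step's callback runs only once because its fixed name finds the stored result. Calling `runWithUnstableName` with a different `attempt` each time executes the callback every time, which is the failure mode a `Date.now()`-based name would cause.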
A three-step Workflow: call an external API, process the result, write it to R2. Each step has its own retry policy sized to how flaky and how expensive that particular step is.
```ts
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers";

// Bindings used below. Assumed shape; generate the real type with `wrangler types`.
interface Env {
  REPORTS_BUCKET: R2Bucket;
}

type Params = { reportId: string };

// Shape of the upstream response, inferred from how step 2 reads it.
type SourceReport = { items: { amount: number }[] };

export class ReportWorkflow extends WorkflowEntrypoint<Env, Params> {
  async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
    const { reportId } = event.payload;

    // Step 1: call an external API. Retries handle transient network/5xx errors.
    const raw = await step.do(
      "fetch source data",
      {
        retries: { limit: 5, delay: "10 seconds", backoff: "exponential" },
        timeout: "30 seconds",
      },
      async () => {
        const res = await fetch(`https://api.example.com/reports/${reportId}`);
        // Throwing marks the attempt as failed so the retry policy kicks in.
        if (!res.ok) throw new Error(`upstream ${res.status}`);
        return (await res.json()) as SourceReport;
      }
    );

    // Step 2: pure transformation. No side effects, so it's safe to retry
    // even without a special idempotency check.
    const processed = await step.do("transform result", async () => {
      return { reportId, total: raw.items.reduce((sum, i) => sum + i.amount, 0) };
    });

    // Step 3: write to R2. Check-before-write makes the retry idempotent:
    // if a previous attempt already wrote the object, don't write it again.
    await step.do(
      "persist to R2",
      { retries: { limit: 3, delay: "5 seconds", backoff: "linear" } },
      async () => {
        const key = `reports/${reportId}.json`;
        const existing = await this.env.REPORTS_BUCKET.head(key);
        if (existing) return; // already written by a prior attempt
        await this.env.REPORTS_BUCKET.put(key, JSON.stringify(processed));
      }
    );
  }
}
```
Binding and trigger config in wrangler.toml:
```toml
[[workflows]]
name = "report-workflow"
binding = "REPORT_WORKFLOW"
class_name = "ReportWorkflow"
```
Kick off an instance from a Worker via the binding:
```ts
const instance = await env.REPORT_WORKFLOW.create({ params: { reportId: "abc123" } });
return Response.json({ id: instance.id, status: await instance.status() });
```
Workflows bills on three dimensions: requests, CPU time, and storage. As of this writing:
| Dimension | Free | Paid (Workers Paid plan) |
|---|---|---|
| Requests | 100,000/day (shared with Workers) | 10M/month included, then $0.30/million |
| CPU time | 10ms CPU per invocation | 30M CPU-ms/month included, then $0.02/million CPU-ms |
| Storage (state) | 1GB/month | 1GB/month included, then $0.20/GB-month |
Two details worth internalizing: time spent waiting on a fetch response or paused in step.sleep() does not incur CPU time — you're billed for compute, not wall-clock duration. And a Workflow instance's state is retained for 3 days (Free) or 30 days (Paid) by default, which is what the storage line item measures. Pricing changes; confirm current numbers on the live page linked below before quoting them in a proposal.
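As a rough illustration of how the paid-plan overage arithmetic works, the sketch below hard-codes the allowances and rates from the table above. `MonthlyUsage` and `estimateOverageUsd` are hypothetical names, the function covers only usage beyond the included amounts (it does not include any plan subscription fee), and the constants will go stale whenever pricing changes.

```typescript
// Paid-plan allowances and overage rates copied from the pricing table above.
// These are a snapshot; confirm against the live pricing page before relying on them.
const INCLUDED_REQUESTS = 10_000_000;
const INCLUDED_CPU_MS = 30_000_000;
const INCLUDED_STORAGE_GB = 1;
const USD_PER_MILLION_REQUESTS = 0.3;
const USD_PER_MILLION_CPU_MS = 0.02;
const USD_PER_GB_MONTH = 0.2;

// Hypothetical usage record for one billing month.
interface MonthlyUsage {
  requests: number;
  cpuMs: number;
  storageGbMonth: number;
}

// Returns the estimated overage in USD: only usage above each allowance is billed.
function estimateOverageUsd(usage: MonthlyUsage): number {
  const over = (used: number, included: number): number => Math.max(0, used - included);
  const requestCost =
    (over(usage.requests, INCLUDED_REQUESTS) / 1_000_000) * USD_PER_MILLION_REQUESTS;
  const cpuCost = (over(usage.cpuMs, INCLUDED_CPU_MS) / 1_000_000) * USD_PER_MILLION_CPU_MS;
  const storageCost = over(usage.storageGbMonth, INCLUDED_STORAGE_GB) * USD_PER_GB_MONTH;
  return requestCost + cpuCost + storageCost;
}
```

For example, 15M requests, 50M CPU-ms, and 3 GB-months of state come to 5 × $0.30 + 20 × $0.02 + 2 × $0.20, or about $2.30 of overage. Time spent sleeping or awaiting I/O never enters `cpuMs`, which is why long-idle Workflows stay cheap.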
step.sleep), check activation status, send a follow-up: durable, long-lived, and cheap to leave idle.

A common mistake is to put await charge(customer, amount) directly in a step.do() callback and rely on the retry config to "just handle" failures. But if the charge succeeds and the network drops before Workflows records the step as complete, the retry re-runs the callback and re-charges the customer. The fix is to make the side effect itself idempotent (pass an idempotency key to the payment API, or check "has this already been charged?" before charging) rather than assuming a step only ever executes once. The same trap applies to logic placed outside step.do(), directly in run(): that code can re-execute on engine restart with no caching at all, so anything with a side effect (writes, sends, random IDs used for dedup) needs to live inside a step, not beside one.
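The double-charge scenario and the idempotency-key fix can be simulated without any real payment API. Everything in this sketch is hypothetical: `FakePaymentProvider` stands in for a provider that deduplicates by key, `withRetries` stands in for a step's retry policy, and the "lost response" is forced on the first attempt.

```typescript
// Hypothetical provider that deduplicates charges by idempotency key.
class FakePaymentProvider {
  private chargeIdByKey = new Map<string, string>();
  public chargesMade = 0;
  public totalCents = 0;

  charge(idempotencyKey: string, amountCents: number): string {
    const prior = this.chargeIdByKey.get(idempotencyKey);
    if (prior !== undefined) return prior; // repeated request: return the original charge
    this.chargesMade += 1;
    this.totalCents += amountCents;
    const chargeId = `ch_${this.chargesMade}`;
    this.chargeIdByKey.set(idempotencyKey, chargeId);
    return chargeId;
  }
}

// Runs `attempt` up to `limit` times, loosely mimicking a step's retry policy.
function withRetries<T>(limit: number, attempt: (attemptNumber: number) => T): T {
  let lastError: unknown;
  for (let n = 1; n <= limit; n++) {
    try {
      return attempt(n);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// The key is derived from the order, so every retry sends the same key.
function chargeOrder(provider: FakePaymentProvider, orderId: string, amountCents: number): string {
  return withRetries(3, (attemptNumber) => {
    const chargeId = provider.charge(`order-${orderId}`, amountCents);
    // Simulate the charge landing but the acknowledgement being lost once.
    if (attemptNumber === 1) throw new Error("response lost");
    return chargeId;
  });
}
```

Here the callback runs twice but the provider records a single charge, because the second call carries the same key. Deriving the key from stable input such as the order ID matters: a key generated fresh on each attempt would defeat the deduplication.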
Cloudflare Workflows — Rules of Workflows is the canonical page on determinism and idempotency requirements referenced in this lesson; pair it with the Workflows pricing page for current numbers, since pricing is subject to change.
A step's name acts as a cache key — once that named step completes, its result is persisted, and a retry or resume reuses the cached result instead of re-running the step. That's why step names must be deterministic (fixed, not built from Date.now() or similar).
Each Workflow instance is backed by a Durable Object, which is where the single-instance execution and persisted step state come from — Workflows is a step-oriented API layered on top of that.