Cloudflare Containers pricing now bills CPU on active usage, not provisioned capacity. For engineering leaders, that means your container bill starts tracking what your code actually burns, measured at $0.00002 per vCPU‑second, while memory and disk pricing remain unchanged. The change landed on November 21, 2025 and applies to Containers and Sandboxes immediately. (developers.cloudflare.com)
Here’s the thing: pricing changes rarely stay neutral. If your workloads are spiky, I/O‑bound, event‑driven, or sleep often, you likely save money. If you’re running hot at 80–100% CPU all day, you’ll see less impact without architectural tweaks. Let’s unpack the mechanics and get practical about what to tune this week.
What exactly changed in Cloudflare Containers pricing?
Cloudflare moved CPU charges from “allocated vCPU over time” to “actual CPU consumed over time.” Same unit price, new meter. CPU is billed at $0.00002 per vCPU‑second, memory at $0.0000025 per GiB‑second, and disk at $0.00000007 per GB‑second. Workers Paid includes monthly allotments (for example, 375 vCPU‑minutes, 25 GiB‑hours of memory, and 200 GB‑hours of disk) before overage rates apply. (developers.cloudflare.com)
Operationally, charges start when a request hits the container or when you manually start it, and stop when the instance sleeps (scale‑to‑zero). Instance types still define the ceilings (vCPU, memory, disk), but you no longer pay for CPU headroom you don’t use. (developers.cloudflare.com)
Cloudflare Containers pricing: how to model your bill
Let’s get practical. Build a quick model so finance and engineering see the same picture.
Inputs
For each distinct workload, capture:
- Average CPU utilization when active (u), as a decimal (e.g., 0.25).
- Active time per instance per day (Ta), in seconds (exclude sleep).
- Average memory working set (M), in GiB, during active time.
- Disk footprint (D), in GB, during active time.
- Concurrent instances when active (C).
CPU cost
CPU cost ≈ C × Ta × u × vCPU_limit × $0.00002. If your instance type tops out at 1 vCPU and you use 25% on average, u = 0.25. (developers.cloudflare.com)
Memory and disk
Memory cost ≈ C × Ta × M × $0.0000025. Disk cost ≈ C × Ta × D × $0.00000007. These remain provisioned‑style over active time; the CPU shift doesn’t change them. (developers.cloudflare.com)
Included allotments
Subtract Workers Paid included amounts across the month (375 vCPU‑minutes, 25 GiB‑hours, 200 GB‑hours) before applying overage rates. Track by meter to avoid surprises. (developers.cloudflare.com)
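The inputs, formulas, and allotment handling above fold into a small script. A minimal sketch, assuming the per‑second rates and Workers Paid allotments quoted in this article; the function and field names are our own shorthand, not a Cloudflare API.

```python
# Sketch of the per-meter cost model described above. Rates and
# allotments are the figures quoted in this article; verify against
# Cloudflare's current pricing page before relying on the output.

CPU_RATE = 0.00002       # $ per vCPU-second
MEM_RATE = 0.0000025     # $ per GiB-second
DISK_RATE = 0.00000007   # $ per GB-second

# Workers Paid monthly allotments, converted to per-second units
INCLUDED_VCPU_S = 375 * 60    # 375 vCPU-minutes
INCLUDED_GIB_S = 25 * 3600    # 25 GiB-hours of memory
INCLUDED_GB_S = 200 * 3600    # 200 GB-hours of disk

def monthly_cost(u, active_s_per_day, mem_gib, disk_gb, concurrency,
                 vcpu_limit=1.0, days=30):
    """Return (cpu_$, mem_$, disk_$) overage after included allotments."""
    inst_s = concurrency * active_s_per_day * days  # instance-seconds
    vcpu_s = inst_s * u * vcpu_limit
    gib_s = inst_s * mem_gib
    gb_s = inst_s * disk_gb
    cpu = max(0.0, vcpu_s - INCLUDED_VCPU_S) * CPU_RATE
    mem = max(0.0, gib_s - INCLUDED_GIB_S) * MEM_RATE
    disk = max(0.0, gb_s - INCLUDED_GB_S) * DISK_RATE
    return cpu, mem, disk
```

Called with u=0.2, four active hours per day, 6 GiB, 12 GB, and five instances, this returns roughly ($8.19, $32.18, $1.76) for a 30‑day month after allotments — a quick way for finance and engineering to agree on the same numbers.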
Quick example
Say you run standard‑2 (1 vCPU, 6 GiB, 12 GB) for an API that’s active four hours per day, average CPU 20%, concurrency 5. CPU: 5 × (4 × 3600) × 0.2 × 1 × 0.00002 = $0.288/day. Memory: 5 × (4 × 3600) × 6 × 0.0000025 = $1.08/day. Disk: 5 × (4 × 3600) × 12 × 0.00000007 ≈ $0.060/day. Note that the memory meter dominates here. Adjust for included allotments at the account level. (developers.cloudflare.com)
Who wins—and who won’t—under active CPU billing?
Winners: bursty APIs, webhook handlers, I/O‑bound scrapers, chat and stream orchestrators, background jobs waiting on upstream services, and daytime‑only internal tools. These benefit because CPU burns only during real work, then drops to near‑zero when sleeping. (developers.cloudflare.com)
Less impact: steady CPU‑bound services (long‑running transforms, inference at sustained >60–70% CPU), or memory‑heavy apps where the memory meter dominates. You’ll likely need design tweaks—batching differently, autosleep, and concurrency controls—to see savings.
FAQ: People also ask
Does this change apply to Sandboxes too?
Yes—Cloudflare’s update applies to Containers and Sandboxes: CPU now bills on active usage; memory and disk pricing are unchanged. (developers.cloudflare.com)
What are the current instance types and ceilings?
Common instance shapes range from 1/16 vCPU with 256 MiB (lite/dev) up through multi‑vCPU options. Choose the smallest ceiling that safely holds your peak demand, then let the active CPU meter do its work. (developers.cloudflare.com)
How many containers can I run at once?
Since September 2025, Cloudflare raised concurrent resource limits substantially (e.g., up to 100 vCPU, 400 GiB memory, and 2 TB disk across live instances), enabling larger fleets per account. Check your specific account limits before a scale‑out test. (developers.cloudflare.com)
How does this compare to AWS Lambda or Fargate?
Lambda charges on GB‑seconds plus request count and optional ephemeral storage; it scales to zero and is great for short, stateless tasks. Containers give you OS‑level flexibility, persistent disk within a running instance, and now CPU that bills only while you work. The cost frontier will depend on your memory profile and duty cycle. Benchmark both with the same traffic shape. (aws.amazon.com)
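To benchmark both with the same traffic shape, put the two meters side by side. A sketch under stated assumptions: the Lambda x86 rate and per‑request charge below are AWS’s published prices at the time of writing, and the traffic numbers in the usage note are illustrative, not a benchmark result.

```python
# Rough apples-to-apples sketch: the same traffic priced against
# Lambda's GB-second meter and Containers' active vCPU-second +
# GiB-second meters. Verify current rates before relying on this.

LAMBDA_GB_S = 0.0000166667   # $ per GB-second (x86, published rate)
LAMBDA_REQ = 0.20 / 1e6      # $ per request
CF_CPU = 0.00002             # $ per vCPU-second
CF_MEM = 0.0000025           # $ per GiB-second

def lambda_cost(requests, avg_duration_s, mem_gb):
    """Compute + request charges for a Lambda replay of the traffic."""
    return requests * avg_duration_s * mem_gb * LAMBDA_GB_S + requests * LAMBDA_REQ

def containers_cost(active_s, cpu_util, vcpu_limit, mem_gib):
    """Active-time charges for the same traffic on a warm container."""
    return active_s * (cpu_util * vcpu_limit * CF_CPU + mem_gib * CF_MEM)
```

For example, one million 200 ms invocations at 1 GB price out around $3.53 on Lambda, while a single 1 vCPU/1 GiB container running warm all day at 20% CPU comes to roughly $0.56 — but the frontier flips as memory footprint and duty cycle change, which is why replaying your real traffic matters.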
Your 7‑day tuning checklist
Run this sprint and ship changes fast:
- Instrument CPU and active time. If you don’t already export CPU, add lightweight sampling in your app or agent. Tag spans with instance type and request IDs. Decide what “active” means for your workload.
- Right‑size instance types. Drop one size if headroom is consistently >50% and latency SLOs hold. Re‑run load tests after each change. (developers.cloudflare.com)
- Enable aggressive autosleep. Shorten idle timeouts where possible so sleeping time is real savings. Verify cold‑start latency and pre‑warm for critical hours. (developers.cloudflare.com)
- Shape concurrency. Prefer a few warm instances with controlled concurrency over many idling instances. Cap C per endpoint based on P95 CPU per request.
- Move chatty storage to R2 to blunt egress. Containers egress isn’t free; offload assets and large reads to R2, where egress to the internet is free. (developers.cloudflare.com)
- Split the CPU hogs. Separate hot CPU paths (image/video transforms, heavy JSON processing) into dedicated workers/containers so the rest of the fleet stays cool.
- Preview and alert on cost. Convert vCPU‑seconds, GiB‑seconds, and GB‑seconds to dollars nightly and alert when deltas exceed 15% day‑over‑day.
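The last bullet — converting meters to dollars nightly and alerting on deltas — can be sketched in a few lines. The meter‑reading dict shape is our own convention; wire it to whatever exports your billing or metrics pipeline produces.

```python
# Nightly cost check: convert raw meter readings to dollars and flag
# day-over-day jumps above a threshold (15% per the checklist above).
# Rates are the per-second prices quoted in this article.

RATES = {"vcpu_s": 0.00002, "gib_s": 0.0000025, "gb_s": 0.00000007}

def daily_dollars(meters):
    """meters: {"vcpu_s": ..., "gib_s": ..., "gb_s": ...} for one day."""
    return sum(meters[k] * RATES[k] for k in RATES)

def should_alert(yesterday, today, threshold=0.15):
    """True when today's spend exceeds yesterday's by > threshold."""
    prev, cur = daily_dollars(yesterday), daily_dollars(today)
    return prev > 0 and (cur - prev) / prev > threshold
```

Run it from the same cron job that snapshots your usage, and page only on the alert — the dollar conversion itself is cheap enough to log every night.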
A simple framework to re‑architect for savings
Use this three‑lane framework for decisions in platform reviews:
Lane A: Shrink & sleep. If CPU utilization is below 30% and latency budget is loose, downsize and cut timeouts in half. Focus on cache hits and upstream backoffs.
Lane B: Split & specialize. If P95 CPU per request is high but bursty, split hot paths to a dedicated size, keep cold paths tiny, and introduce mediators or queues so cold paths don’t hold hot CPU hostage.
Lane C: Shift & standardize. If you’re still >60% CPU for hours, consider dedicated compute patterns or even a different product (e.g., Workers for short stateless steps, or keeping some heavy jobs on a batch platform). Re‑benchmark after each shift.
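The three lanes can be encoded as a tiny triage function for platform reviews. A minimal sketch: the 30% and 60% thresholds come straight from the lanes above, while the function signature and the "review" fallback are our own shorthand.

```python
# Minimal encoding of the three-lane framework described above.

def pick_lane(avg_cpu_util, p95_cpu_per_req_high, latency_budget_loose):
    """Map a workload's profile to Lane A, B, C, or manual review."""
    if avg_cpu_util > 0.60:
        return "C"  # Shift & standardize: sustained hot CPU for hours
    if p95_cpu_per_req_high:
        return "B"  # Split & specialize: bursty hot paths
    if avg_cpu_util < 0.30 and latency_budget_loose:
        return "A"  # Shrink & sleep: downsize and cut timeouts
    return "review"  # Doesn't fit a lane cleanly; inspect manually
```

Keeping the fallback explicit matters: workloads between 30% and 60% utilization with tight latency budgets deserve a human look rather than an automatic downsizing.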
Cost math you can copy
Cloudflare’s example shows why this change matters: a standard‑2 instance running one hour at only 20% CPU now pays about $0.0144 for CPU instead of the previous $0.072 when billed on full allocation—an 80% drop. Multiply that across fleets and months and you’re looking at real budget impact. (developers.cloudflare.com)
Tip: keep the memory meter honest. If memory creeps up, it can erase CPU savings. Track working‑set drift and add periodic heap profiles to your on‑call runbook. (developers.cloudflare.com)
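A working‑set drift check for the on‑call runbook can be as simple as comparing a recent median of RSS samples against a baseline. A sketch under stated assumptions: the sampling source (agent, /proc, your APM) is left to your stack, and the 20% drift threshold and 10‑sample window are illustrative defaults.

```python
# Flag working-set drift: given periodic RSS samples (GiB), alert when
# the recent median creeps above a baseline by more than max_drift.

from statistics import median

def memory_drift(samples_gib, baseline_gib, max_drift=0.20, window=10):
    """True when the median of the last `window` samples has drifted
    above baseline_gib by more than max_drift (a fraction)."""
    recent = median(samples_gib[-window:])
    return (recent - baseline_gib) / baseline_gib > max_drift
```

Using a median over a window keeps one-off GC spikes or burst allocations from paging anyone; sustained creep is what erodes the memory meter.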
Risks, limits, and gotchas
Cold starts and jitter: dialing down timeouts too aggressively can spike tail latency. Guard with pre‑warm windows during peak and circuit‑breakers to upstreams.
Hidden egress: containers that fetch lots of third‑party data may see egress charges. Store and serve from R2 or cache at the edge where possible to reduce paid egress from Containers. (developers.cloudflare.com)
Per‑account ceilings: even with higher concurrent limits, don’t assume infinite scale on day one. Stage rollouts and watch error budgets. (developers.cloudflare.com)
Comparing apples and oranges: Lambda’s GB‑second math and Cloudflare’s vCPU‑second math are different lenses. Always replay the same real traffic profile in both before declaring victory. (aws.amazon.com)
What to do next
- Build a one‑page cost model for your top three services using the formulas above and your New Relic/Datadog traces.
- Run a two‑hour load test at realistic concurrency. Capture CPU%, active seconds, and memory GB‑seconds.
- Apply Lane A/B/C decisions and retest. Document SLO/$$ trade‑offs in PRs.
- Schedule a budget checkpoint in 14 days to compare projected vs. actual.
Should you switch platforms?
Containers shine if you need OS‑level tooling, small persistent disks while active, or long‑lived processes with bursts. Workers shine for ultra‑fast, short stateless functions at the edge. Lambda still wins on massive ecosystem integrations and event sources, and it’s easy to compare costs because Lambda publishes transparent example math. Your best option might be a hybrid: Workers for fan‑out and auth, Containers for heavier per‑request logic, and a batch platform for sustained compute. (developers.cloudflare.com)
Related deep dives and tools
If you’re modeling this change at scale, our earlier breakdown on Cloudflare Containers’ real pricing levers pairs well with this update. For cost‑cutting strategies beyond compute, see our take on reducing cloud networking spend with regional NAT Gateways. And for keeping eyes on live systems, our CloudWatch observability playbook for modern AI systems shows how to wire cost and performance signals into your pager, fast.
If you want help modeling or executing the plan end‑to‑end, our services team runs fixed‑scope cost and performance sprints that hand you the math, dashboards, and patches.
Benchmarks to run before you celebrate
Traffic replay: take one busy hour, replay at 1.25× and 2×, and capture CPU% and active seconds.
Steady‑state soak: run at your median TPS for 2 hours and confirm CPU stays below plan.
Tail latency test: add 500ms of artificial upstream delay and verify autosleep/pre‑warm settings don’t blow your P99.9.
Concurrency sweeps: vary C from 1 to 2× expected peak to find the sweet spot for CPU burn vs. queueing. Log vCPU‑seconds per successful request to spot regressions.
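The regression metric from the last benchmark — vCPU‑seconds per successful request — takes only a few lines to aggregate. The per‑request record shape here is our own convention.

```python
# Aggregate vCPU-seconds per successful request from a benchmark run.
# Charging failed requests' CPU against successes makes error storms
# show up as regressions, which is usually what you want.

def vcpu_s_per_success(samples):
    """samples: iterable of (vcpu_seconds, succeeded) per request."""
    total = sum(v for v, ok in samples)              # all CPU burned
    successes = sum(1 for _, ok in samples if ok)    # only useful work
    return total / successes if successes else float("inf")
```

Log this per sweep point during the concurrency sweep; the C value that minimizes it is your sweet spot between CPU burn and queueing.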
Zooming out
This move by Cloudflare aligns serverless containers with reality: you shouldn’t pay for CPU you never touch. It also pressures teams to measure the basics—CPU%, live seconds, and memory GB‑seconds—because those are now the bill. With higher resource ceilings since September and transparent per‑unit pricing, the platform has runway for serious production loads; the big wins will come from careful right‑sizing, smart sleep, and removing CPU‑hot paths from the critical path. (developers.cloudflare.com)
