
Cloudflare Containers Pricing Just Changed: What to Fix Now

Cloudflare flipped the switch on November 21, 2025: Containers and Sandboxes now bill CPU by active usage, not what you provision. For many teams, that’s real money back—especially for bursty, spiky workloads. For others, it’s a wake-up call to right-size instances, adjust timeouts, and rethink autoscaling assumptions. This post breaks down what exactly changed, who benefits (and who doesn’t), the numbers that matter, and a battle-tested checklist to capture savings in days, not quarters.
Published Nov 23, 2025 · Web development · 10 min read

On November 21, 2025, Cloudflare shifted how it bills compute: Containers and Sandboxes now charge for CPU based on active usage instead of provisioned capacity. If you’ve been watching your bill creep because instances sat around underutilized, this Cloudflare Containers pricing update is your chance to claw money back—without rewriting your app.

Here’s the thing: pricing changes are rarely neutral. This one helps teams with bursty workloads, inconsistent concurrency, or long I/O waits. If you run hot at 80–100% CPU all day, you’ll see less impact and may need to tweak architecture to win. Let’s get practical.

Illustration comparing provisioned vs active CPU billing

What exactly changed—and what stayed the same

Cloudflare’s Containers and Sandboxes now bill CPU at $0.00002 per vCPU‑second based on what your instances actually burn while they’re doing work. If your container is idle or sleeping, you’re not paying for CPU. Memory and disk still follow their existing pricing, and egress rates remain unchanged by region.

Key numbers and dates to anchor on:

  • Change date: Friday, November 21, 2025 (effective immediately for Containers and Sandboxes).
  • CPU price: $0.00002 per vCPU‑second, billed on active usage.
  • Workers Paid plan includes CPU, memory, and disk allotments used by Containers; beyond that, you pay the per‑unit rates.
  • Memory: billed per GiB‑second; Disk: per GB‑second; neither changed with this update.
  • Egress: still tiered by region (e.g., North America/Europe starting at cents per GB) with included monthly allotments.

Cloudflare also currently offers several container instance types to pick from (e.g., 1/16 vCPU with 256 MiB RAM up through multi‑vCPU options). You pay for CPU only when the instance is doing work; the instance type sets the ceilings for CPU, memory, and disk, and determines how much each active second costs.

Who benefits—and who doesn’t

If your service patterns look like any of these, you’re likely to win:

  • Bursty traffic with long I/O waits (database, API calls, queues). CPU naps while I/O happens—now you stop paying for those naps.
  • Event‑driven jobs with sporadic spikes (webhooks, backfills, nightly transforms).
  • Spiky tenants in multi‑tenant SaaS where noisy neighbors used to force you into oversized instances “just in case.”

Where savings are modest without further changes:

  • Compute‑bound workloads running near 100% CPU (e.g., transcoding, ML inference without batching). You’ll pay roughly what you used to, because you’re actually using the CPU you provisioned.
  • Always‑on cron‑style daemons that spin but don’t sleep. Convert them to event‑driven or scale‑to‑zero patterns.

Cloudflare Containers pricing: how to estimate your new bill

You don’t need a spreadsheet to get directional guidance. Use this quick mental model for one container type:

  1. Measure average CPU utilization over a typical hour (from logs or metrics). Example: 20% on a 1 vCPU instance.
  2. Convert to active vCPU‑seconds: 1 vCPU × 3,600 s × 20% = 720 vCPU‑s.
  3. Multiply by price: 720 × $0.00002 = $0.0144 for that hour’s CPU.

Previously you effectively paid as if the vCPU were fully allocated for that hour ($0.072). The change brings that down in proportion to your real utilization. Multiply by instance count, hours active, and days per month for a rough monthly estimate, then add memory, disk, and egress as before.
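The three steps above reduce to a tiny helper. A sketch, using the $0.00002 per vCPU‑second rate quoted in this post; the function name is ours, not a Cloudflare API:

```python
# Sketch of the estimation steps above; numbers match the article's example.
ACTIVE_CPU_PRICE = 0.00002  # USD per vCPU-second (rate quoted in this post)

def hourly_cpu_cost(vcpus: float, avg_utilization: float) -> float:
    """CPU cost for one hour at a given average utilization (0.0-1.0)."""
    active_vcpu_seconds = vcpus * 3600 * avg_utilization
    return active_vcpu_seconds * ACTIVE_CPU_PRICE

# 1 vCPU at 20% average utilization for one hour:
print(round(hourly_cpu_cost(1, 0.20), 4))  # 0.0144
# Old effective cost, fully provisioned for the hour:
print(round(hourly_cpu_cost(1, 1.0), 4))   # 0.072
```

Swap in your own instance counts and hours active, then add memory, disk, and egress on top as before.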

What about Workers, Durable Objects, and networking?

Containers sit alongside Workers. Requests still arrive via a Worker, and each container instance has a backing Durable Object. You’ll continue to see Workers and Durable Objects usage on your bill. The new CPU accounting applies to Containers and Sandboxes; other prices and inclusions for Workers remain as documented.

Networking remains a lever. If your app is egress‑heavy, optimize payloads, compress, coalesce round‑trips, and consider data localization to keep traffic intra‑region where possible.

The 90‑Minute Cost Cut Plan

If you want savings this week, run this short, focused exercise with your team. Block 90 minutes, screenshare your dashboards, and make decisions in the room.

  1. Pick two services that drive the top 40% of CPU minutes. Don’t boil the ocean.
  2. Grab seven days of metrics: average CPU %, p95 CPU %, requests, and I/O wait. If you lack metrics, add them today—CPU and request counters inside your Worker and container are enough.
  3. Right‑size the instance type. If p95 CPU never exceeds 35% on a 2 vCPU instance, drop to 1 vCPU and keep headroom with concurrency limits.
  4. Shorten idle timeouts so instances sleep sooner. Start conservative (e.g., 60s → 20s) and watch cold‑start impact.
  5. Batch tiny tasks. Accumulate small jobs into larger runs so instances do meaningful work while awake, instead of waking and sleeping constantly for micro‑bursts.
  6. Add backpressure via Durable Object queues or Workflows so bursts don’t spin up more concurrent containers than needed.
  7. Rinse and re‑estimate. Recompute vCPU‑seconds with your new settings. Ship the smallest safe change today; leave bigger refactors for a sprint.
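Step 3’s right‑sizing rule can be sketched as a helper. The candidate sizes and the 80% target‑peak assumption are illustrative, not Cloudflare guidance:

```python
# Illustrative right-sizing check for step 3. Candidate instance sizes and
# the target peak utilization are assumptions for the sketch.
CANDIDATE_VCPUS = [1/16, 1/4, 1/2, 1, 2, 4]  # example instance ceilings

def suggest_vcpus(current_vcpus: float, p95_utilization: float,
                  target_peak: float = 0.8) -> float:
    """Smallest candidate size whose p95 utilization stays under target_peak."""
    demand = current_vcpus * p95_utilization  # vCPUs actually needed at p95
    for size in CANDIDATE_VCPUS:
        if demand / size <= target_peak:
            return size
    return CANDIDATE_VCPUS[-1]

# p95 never exceeds 35% on a 2 vCPU instance -> 0.7 vCPU of real demand:
print(suggest_vcpus(2, 0.35))  # 1
```

The example mirrors the plan: a 2 vCPU instance idling at 35% p95 drops to 1 vCPU, with concurrency limits providing the headroom.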

Design patterns that maximize the new model

The new billing favors architectures that keep CPU either busy in short bursts or fully asleep. These patterns work well:

  • Edge gate, container burst: Terminate HTTP at a Worker that authenticates, rate‑limits, and enriches requests. Only send heavy ones into a container. Light reads can finish at the edge.
  • Workflow fan‑in/fan‑out: Use Workflows to orchestrate multi‑step jobs, pause on I/O, and resume without keeping CPU clocking.
  • Queue with adaptive concurrency: Drain at a fixed concurrency that keeps CPU near 50–70% during spikes, then let instances sleep. Fewer, fatter bursts often cost less than many tiny wakes.
  • Smart caching: Cache compiled templates, query results, and static artifacts in KV/R2 to cut CPU loops. Less recompute equals fewer vCPU‑seconds.
Edge to container architecture diagram emphasizing sleep/burst pattern
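Why do fewer, fatter bursts win? A toy model makes it concrete. The 50 ms per‑wake CPU overhead is an assumed illustration (init, cache warm, connection setup), not a measured Cloudflare figure:

```python
# Toy model comparing many micro-bursts against batched wakes.
PRICE = 0.00002          # USD per vCPU-second
WAKE_OVERHEAD_S = 0.05   # assumed CPU spent per wake (init, cache warm, etc.)

def cpu_cost(jobs: int, cpu_per_job_s: float, jobs_per_wake: int) -> float:
    """Active-CPU cost of running `jobs` with a given batching factor."""
    wakes = -(-jobs // jobs_per_wake)  # ceiling division
    active_seconds = jobs * cpu_per_job_s + wakes * WAKE_OVERHEAD_S
    return active_seconds * PRICE

# 10,000 tiny jobs (10 ms CPU each): one wake per job vs batches of 100.
print(round(cpu_cost(10_000, 0.01, 1), 4))    # 0.012  (micro-bursts)
print(round(cpu_cost(10_000, 0.01, 100), 4))  # 0.0021 (batched)
```

Under these assumptions the wake overhead dominates the unbatched case; batching cuts the CPU line by more than 5x for the same work.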

What’s the catch?

Two realities to plan around:

  • Autoscaling and load balancing for Containers are still evolving. Today you can scale manually or with helper code; utilization‑based autoscale and latency‑aware routing are on the roadmap. Design for burst tolerance and avoid hard assumptions about instant multi‑region scale‑out.
  • Constantly hot compute won’t magically get cheaper. If your CPU is pegged, the new billing looks a lot like the old world. To win, you’ll need batching, model/codec tuning, or pushing parts to more efficient runtimes.

People also ask: short, direct answers

Does this change how Workers are priced?

No. The shift applies to CPU for Containers and Sandboxes. Workers still follow their documented pricing and inclusions. You’ll see separate line items for Workers, Durable Objects, and any Container usage.

Is this better or worse than paying for provisioned vCPU?

If your average utilization is under ~60%, paying for active vCPU‑seconds is typically better. Above that, it’s comparable—your real wins come from reducing idle time and smoothing load.

Will my latency increase if instances sleep more?

Possibly on the first request after sleep (cold‑ish start). You can tune sleep timeouts and keep a warm instance per region for steady flows while still letting the fleet scale to zero between bursts.

A simple framework to tune for vCPU‑seconds

Use the SCALE framework with your leads and SREs:

  • Size for the p95, not the p5. Instance type should track your 95th percentile CPU, with a margin—not your marketing peak.
  • Cache aggressively at the edge. KV/R2/HTML edge caching cuts container wakes and CPU churn.
  • Adapt concurrency. Target a CPU band (e.g., 55–70%) and adjust worker pools to stay there during spikes.
  • Let it sleep. Tune idle timeouts; it’s better to sleep for minutes than hover at 5% CPU for hours.
  • Eliminate waste. Remove debug middleware in prod, pre‑compile templates, and batch chat/AI calls.
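The “Adapt concurrency” dial can be as simple as a step controller that nudges the pool toward the band. A sketch; the 55–70% band is the article’s example, the step size and bounds are our assumptions:

```python
# Minimal step controller for the "adapt concurrency" dial. Band values come
# from the article's example; step size and pool bounds are assumptions.
LOW, HIGH = 0.55, 0.70  # target CPU utilization band

def adjust_concurrency(current: int, cpu_utilization: float,
                       min_c: int = 1, max_c: int = 64) -> int:
    """Nudge the worker-pool size to keep CPU inside the target band."""
    if cpu_utilization > HIGH:   # too hot: shed concurrency, apply backpressure
        return max(min_c, current - 1)
    if cpu_utilization < LOW:    # too idle: take on more work, finish sooner
        return min(max_c, current + 1)
    return current               # inside the band: hold steady

print(adjust_concurrency(8, 0.85))  # 7
print(adjust_concurrency(8, 0.40))  # 9
print(adjust_concurrency(8, 0.60))  # 8
```

Run it on each metrics tick; draining faster when idle means instances finish the burst and get back to sleep sooner.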

Numbers to keep on your desk

  • Change date: November 21, 2025.
  • CPU price: $0.00002 per vCPU‑second for Containers and Sandboxes.
  • Included monthly usage exists on paid plans; beyond that, CPU, memory, disk, and egress are billed per unit.
  • Typical instance sizes today include 1/16 vCPU (256 MiB) up through multi‑vCPU options like 2–4 vCPU with 8–12 GiB memory.
  • Regional egress pricing remains tiered; North America/Europe start at low cents per GB with included allotments.

Worked example: take a bursty API from red to green

Scenario: a multi‑tenant analytics API runs on 1 vCPU containers with 6 GiB RAM. Traffic spikes at the top of the hour for 6–8 minutes; average CPU over the hour is 22% but peaks hit 75%. Previously you provisioned for the worst case and paid for it all hour.

With active CPU billing, you pay for ~22% of the hour’s vCPU‑seconds. Push further by:

  • Reducing idle timeout to 30s so instances sleep between spikes.
  • Batching writes and compressing JSON to cut CPU per request.
  • Running a single warm instance per region for cache priming; allow others to scale to zero.

Teams see 30–60% CPU line‑item reductions with changes like this—often more if they also clean up N+1 queries and templating loops.
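Running the scenario through the same vCPU‑second math shows the headline saving before any further tuning (a sketch; the old cost assumes the full hour was provisioned):

```python
# Worked-example math: 1 vCPU container, 22% average utilization per hour.
PRICE = 0.00002  # USD per vCPU-second

vcpus, avg_util = 1, 0.22
old_hourly = vcpus * 3600 * PRICE             # provisioned for the whole hour
new_hourly = vcpus * 3600 * avg_util * PRICE  # active-usage billing

print(round(old_hourly, 4))                        # 0.072
print(round(new_hourly, 5))                        # 0.01584
print(f"{1 - new_hourly / old_hourly:.0%} lower")  # 78% lower
```

The billing change alone cuts this CPU line by ~78%; the timeout, batching, and warm‑instance tweaks above compound on top of that.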

Guardrails: observability and limits

Billing by active CPU doesn’t remove the need for guardrails. Set per‑request CPU ceilings, clamp concurrency, and watch your 95th percentile CPU per instance type. In practice, we track three dials weekly: average vCPU‑seconds per request, warm‑start ratio, and idle time per instance. When any drifts, we fix it in the next deploy.

Operational risks and edge cases

  • Hot caches hide problems. Cold starts after deploys can spike CPU if you don’t warm key paths. Automate warming via smoke tests or synthetic traffic.
  • Chatty east‑west traffic. If containers call each other frequently, collapse hops via Service Bindings or move simple compute to the Worker to cut wakeups.
  • Long‑running CPU jobs. For video or ML, consider chunking, streaming, or offloading to batch systems that price better at sustained CPU.

What to do next

  • Run the 90‑minute review and ship one instance‑type change plus one timeout change this week.
  • Add vCPU‑seconds per request to your dashboards so you can spot regressions fast.
  • Move lightweight middleware to the edge Worker to reduce container wakeups.
  • Schedule a focused audit if you’re on a deadline or lack bandwidth.

If you want a second set of eyes, our team at ByBowu helps product and platform leads ship faster and run cheaper. See a snapshot of how we work in what we do and browse real engagements in our portfolio. For teams already standardizing on Cloudflare, our CDN resilience playbook pairs well with this cost focus so you don’t trade savings for stability. When you’re ready, drop us a line via contacts.

FAQ for finance and leadership

How should we forecast costs for 2026?

Model CPU as variable with utilization bands, not as fixed provisioned cost. For each service, estimate low/medium/high utilization scenarios (20/50/80%) and run vCPU‑seconds math against traffic forecasts. Treat memory, disk, and egress as largely unchanged.
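That forecasting approach can be sketched in a few lines. The instance counts and active hours are placeholder inputs you would replace with real traffic forecasts:

```python
# Scenario forecast per the low/medium/high utilization bands above.
# Instance counts and active hours are placeholders, not real inputs.
PRICE = 0.00002  # USD per vCPU-second

def monthly_cpu_cost(vcpus: float, instances: int,
                     active_hours_per_day: float, utilization: float) -> float:
    """Approximate monthly CPU cost (30-day month) at a utilization band."""
    seconds = active_hours_per_day * 3600 * 30
    return vcpus * instances * seconds * utilization * PRICE

for band in (0.20, 0.50, 0.80):  # low / medium / high utilization scenarios
    cost = monthly_cpu_cost(vcpus=1, instances=5,
                            active_hours_per_day=24, utilization=band)
    print(f"{band:.0%}: ${cost:.2f}/month")
```

Five always‑on 1 vCPU instances land between roughly $52 and $207 a month in CPU depending on the band, which is exactly the spread finance should carry into the forecast.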

Will our SLOs suffer if we let instances sleep?

Not if you tune it. Keep one warm instance in regions serving steady traffic, and shorten timeouts only on spiky paths. Monitor p95 latency before/after; in most APIs, savings outstrip the small cold‑start tax.

Should we move CPU‑heavy jobs off Containers?

Sometimes. If your workloads are sustained CPU burners, compare options that discount long, continuous CPU or support specialized accelerators. Keep request/response logic at the edge and send the heavy lifting to a purpose‑built lane.

Final thought

Cloud pricing tweaks usually feel like a moving target. This time, the target favors smart engineering. The move to active CPU billing rewards teams that right‑size instances, encourage sleep, and keep compute focused. Grab the easy wins this week, then layer in better batching, caching, and orchestration. Your bill—and your pager—will thank you.

Developer dashboard with CPU utilization improvements
Written by Viktoria Sulzhyk · BYBOWU
