
GitHub Copilot Premium Requests: Avoid Surprise Bills

GitHub Copilot quietly flipped a billing switch on December 2, 2025 that affects every enterprise and team using premium models. If you haven’t updated your policies, you could be on the hook for unplanned charges as devs try Claude, Gemini, or agent features. This piece explains the change in plain English, clarifies what counts as a premium request, and gives you a pragmatic, step‑by‑step plan to control spend without kneecapping developer velocity. If you own engineering budgets or lead a platform team, this one is for you.
Published: Dec 08, 2025 · Category: AI · Read time: 9 min

On December 2, 2025, GitHub began removing legacy $0 budgets for organizations and enterprises, changing how GitHub Copilot premium requests are controlled and billed. If your team relies on Claude, Gemini, or agent features, this shift can quietly turn blocked usage into paid usage—unless you adjust policies and budgets now. (github.blog)

Illustration of an engineering org dashboard with AI usage and billing alerts

What changed on December 2, 2025?

Historically, many orgs had a default $0 Copilot premium request budget. When a developer hit their monthly allowance, additional premium requests were blocked. GitHub is removing those $0 account‑level budgets for Enterprise and Team accounts created before August 22, 2025. After this change, whether overage usage is allowed depends on your “Premium request paid usage” policy—not a static budget. (github.blog)

GitHub’s docs and changelog note the phase‑in date and the policy pivot clearly: if paid usage is enabled, premium requests over the allowance can be billed; if it’s disabled, usage is blocked. Many admins will need to revisit defaults they set months ago. (docs.github.com)

What exactly counts as a premium request?

Premium requests are consumed when developers use models or features beyond the included baseline. For paid plans, GPT‑4.1 and GPT‑4o are included and do not consume premium requests; higher‑end or specialized models do, and each carries a multiplier (for example, Claude Opus can count as 10 requests per prompt). This matters because one long code review with an expensive model can burn a week’s allowance. (docs.github.com)

Common multipliers today include 1× for general‑purpose models like Claude Sonnet 4 or Gemini 2.5 Pro, discounted 0.25–0.33× for lightweight models, and up to 10× for top‑tier reasoning models. Check your tenant’s current model list—the menu evolves and multipliers can change. (docs.github.com)
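To make the multiplier math concrete, here is a minimal sketch in Python. The model names and multipliers mirror the examples above and are illustrative only, not your tenant’s live price list.

```python
# Illustrative multipliers only -- check your tenant's current model list.
MODEL_MULTIPLIERS = {
    "gpt-4.1": 0.0,            # included on paid plans, no premium requests
    "claude-sonnet-4": 1.0,
    "gemini-2.5-pro": 1.0,
    "lightweight-model": 0.33,
    "claude-opus": 10.0,
}

def premium_requests_consumed(prompts: int, model: str) -> float:
    """Each prompt draws down (1 x multiplier) premium requests."""
    return prompts * MODEL_MULTIPLIERS[model]

# A ten-prompt review session with a 10x model consumes 100 premium requests:
# more than a week's share of a 300-request monthly allowance.
print(premium_requests_consumed(10, "claude-opus"))  # 100.0
```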

How many premium requests come with each plan?

The current baseline allowances are straightforward: Free includes 50 premium requests; Pro includes 300; Pro+ includes 1,500. Business and Enterprise typically carry 300 and 1,000 premium requests per user per month, respectively. These reset on the first of each month. If you need more, you can let users go past their allowance and pay per request. (github.com)

Overages are priced at $0.04 per premium request, multiplied by the selected model’s rate. That’s manageable at small volumes, but it adds up quickly with 10× models or automated reviews running at scale. (docs.github.com)
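As a rough sketch of how the overage bill accrues, using the $0.04 rate above and assuming multipliers are already reflected in the request count:

```python
OVERAGE_RATE_USD = 0.04  # per premium request beyond the monthly allowance

def monthly_overage_cost(premium_requests_used: float, allowance: int) -> float:
    """Only requests above the allowance are billed. Multipliers are already
    baked into premium_requests_used (one 10x prompt counted as 10 requests)."""
    return max(0.0, premium_requests_used - allowance) * OVERAGE_RATE_USD

# One developer on a 300-request plan who used 800 premium requests:
print(monthly_overage_cost(800, 300))  # 500 extra requests -> 20.0 USD
# The same pattern across a 25-person team is roughly $500/month.
```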

Why this matters to engineering and finance

Here’s the thing: the old $0 budget guardrail was blunt but effective. The new approach shifts control to policies and per‑tool SKUs. That’s better for fine‑tuning—but risky if your defaults now permit spend without caps. In practical terms, platform leads need to re‑assert budget boundaries, and CFOs will want cost visibility tied to features and models, not just seats. GitHub has also been expanding supported models and features—another reason usage can spike if policies aren’t updated. (arstechnica.com)

A 90‑minute triage to stop surprise bills

If you do nothing else today, run this 90‑minute checklist with your platform engineer and your billing admin:

1) Verify your current state (20 minutes)

Download usage for the last two months and identify who hit the ceiling and which features/models consumed premium requests. In parallel, confirm whether your “Premium request paid usage” policy is enabled and whether any budgets remain configured. (docs.github.com)
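A small script like the following can do the grouping. The column names (user, model, premium_requests) are assumptions about the export format; rename them to match the fields in the file you actually download.

```python
import csv
from collections import defaultdict

by_user: dict[str, float] = defaultdict(float)
by_model: dict[str, float] = defaultdict(float)

# Column names are assumptions -- adjust to your actual usage export.
with open("copilot_usage_export.csv", newline="") as f:
    for row in csv.DictReader(f):
        qty = float(row["premium_requests"])
        by_user[row["user"]] += qty
        by_model[row["model"]] += qty

print("Top consumers:")
for user, qty in sorted(by_user.items(), key=lambda kv: kv[1], reverse=True)[:10]:
    print(f"  {user}: {qty:.0f} premium requests")

print("By model:")
for model, qty in sorted(by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"  {model}: {qty:.0f}")
```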

2) Decide your default stance (10 minutes)

Choose one of two defaults per environment (prod vs. sandboxes): block overages globally, or enable paid usage but set a strict monthly cap. Tip: consider blocking in production until you model the impact of multipliers and only enabling paid usage for a small pilot group. (github.blog)

3) Set budgets and per‑tool rules (20 minutes)

Define a Bundled premium requests budget with a sensible cap. If your org uses multiple AI tools (for example, coding agent or Spark), ensure the cap applies across tools to avoid whack‑a‑mole overages. (docs.github.com)

4) Right‑size allowances by plan (15 minutes)

Move heavy users of code review or agents to Enterprise (1,000/user) and keep light chat users on Business (300/user). Align this to team goals: velocity, defect reduction, or PR throughput.

5) Lock in model strategy (15 minutes)

Default to included models for general chat and completions. Permit 1× models for targeted tasks (test generation, refactors). Restrict 10× models to short bursts with explicit approvals. Document multipliers in your internal wiki so devs know the “cost per click.” (docs.github.com)

6) Communicate and monitor (10 minutes)

Post a short Loom or Slack announcement explaining what’s changing, where to see remaining premium requests, and who approves overage unlocks. Encourage teams to check the Copilot status icon and usage dashboards weekly. (github.blog)

People also ask: common questions we’re hearing

Do I have to pay to use Claude or Gemini in Copilot?

Not always. On paid plans, included models (GPT‑4.1, GPT‑4o, and in some tiers GPT‑5 mini) don’t consume premium requests. Claude and Gemini variants typically do, and they draw down your allowance based on multipliers. If you hit the allowance and paid usage is enabled, overages cost $0.04/request times the multiplier. (docs.github.com)

Will disabling paid usage break Copilot for my team?

No. Completions and chat with included models continue to work. What stops is usage that requires premium requests beyond the monthly allowance. Many orgs run with paid usage disabled and only turn it on for specific users or time‑boxed efforts. (github.blog)

How do I prevent one team from burning the entire budget?

Use the Bundled premium requests budget to cap aggregate spend and, where available, apply per‑member budgets. Pair that with role‑based access to expensive models and a lightweight approval process for high‑multiplier sessions. (docs.github.com)

A practical rollout in 7 days

Here’s a pragmatic playbook we’ve used with clients to stabilize costs without stalling momentum:

Day 1: Baseline

Export the last 60 days of premium request usage. Segment requests by model and feature (chat vs. agent vs. code review). Identify your top 10% consumers and the repos where they work. (github.blog)

Day 2: Policy hardening

Set “Premium request paid usage” to Disabled at the org level. Create a short‑term exception group for pilot users who can request paid usage via a ticket. (docs.github.com)

Day 3: Model tiers

Publish a model access matrix: Included models for all; 1× models for seniors and PR owners; 10× models only via pilot group with a 30‑minute limit. Include examples of when to switch models. (docs.github.com)
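If you want the matrix to be machine‑readable (so it can drive IDE defaults, a Slack bot, or periodic audits), a minimal sketch might look like this. The role names and tier groupings are examples, not GitHub settings.

```python
# Example access matrix -- roles and groupings are illustrative, not GitHub policy objects.
ACCESS_MATRIX = {
    "included": {"roles": {"everyone"}, "examples": ["GPT-4.1", "GPT-4o"]},
    "1x":       {"roles": {"senior", "pr-owner"}, "examples": ["Claude Sonnet 4", "Gemini 2.5 Pro"]},
    "10x":      {"roles": {"pilot"}, "examples": ["Claude Opus"], "session_limit_minutes": 30},
}

def may_use(role: str, tier: str) -> bool:
    allowed = ACCESS_MATRIX[tier]["roles"]
    return "everyone" in allowed or role in allowed

print(may_use("senior", "1x"))   # True
print(may_use("senior", "10x"))  # False -- requires the pilot group
```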

Day 4: Seat right‑sizing

Map Business vs. Enterprise seats to the real workload: reviewers and maintainers often benefit from the 1,000/user Enterprise allowance; occasional chat users can sit on Business or Pro+.

Day 5: Budgets and alerts

Set a Bundled premium requests budget equal to 20–30% of last month’s total usage. Enable weekly alerts to platform engineering and finance. If you expect a spike (e.g., refactor sprint), pre‑approve a temporary increase. (docs.github.com)
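One way to translate that rule of thumb into a number, assuming the budget is expressed in dollars of overage at the $0.04 rate:

```python
OVERAGE_RATE_USD = 0.04

def suggested_budget_usd(last_month_premium_requests: float, fraction: float = 0.25) -> float:
    """Cap paid overage at roughly 20-30% of last month's total premium-request
    volume, converted to dollars at the overage rate. 'fraction' is the knob."""
    return last_month_premium_requests * fraction * OVERAGE_RATE_USD

# If the org consumed 40,000 premium requests last month:
print(suggested_budget_usd(40_000))        # 400.0 -> a $400 monthly cap
print(suggested_budget_usd(40_000, 0.30))  # 480.0
```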

Day 6: Developer workflow tweaks

Encourage defaulting to included models for exploratory chat. For code review, scope prompts narrowly and avoid running long, multi‑file analyses with high‑multiplier models unless necessary. Teach devs to watch the usage indicator in the IDE. (github.blog)

Day 7: Review outcomes

Compare trendlines: request volume, cost per PR, PR cycle time. If velocity holds, keep policies. If quality or speed dip, loosen selectively (e.g., open 1× models on critical repos while keeping 10× locked). (github.blog)

Data points worth knowing

• Allowances reset on the first of each month, so plan heavy reviews just after the reset.
• Pro includes 300 and Pro+ includes 1,500 monthly premium requests; Business and Enterprise commonly include 300 and 1,000 per user.
• Overages bill at $0.04 per request, before model multipliers are applied.
• Claude Opus counts as 10 requests per prompt; lightweight models can be as low as 0.25–0.33×.
• The policy that governs paid usage is enabled by default in many orgs; verify yours. (github.com)

Risks, edge cases, and gotchas

Multiple enterprises and orgs: users with licenses from more than one billing entity must choose “Usage billed to” correctly or premium requests won’t apply as intended. Spark and other tools may have fixed rates that chew through allowances faster than chat. And of course, the model catalog is a moving target—what’s 1× today might shift, so bake documentation refreshes into your monthly ritual. (docs.github.com)

Security teams should also revisit policies for model access. If your code review prompts include sensitive context, ensure your data handling posture matches your compliance stance. For broader hardening guidance, see our take on recent framework issues like React Server Components vulnerabilities and what fast‑moving teams did to contain blast radius.

Let’s get practical: a simple spend model

Start with last month’s request count per user. Multiply by the average model rate you intend to allow (for many teams, a blended 0.7–1.2× is realistic if you limit 10× use). Add 10–20% for experimentation. Compare the total against your bundled budget and the per‑user allowance across plans. This bottom‑up model usually beats a seat‑only forecast, especially when code review usage spikes during release crunches. (docs.github.com)
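Here is a minimal sketch of that bottom‑up model in Python. The per‑user counts, blended multiplier, and allowance are placeholders to replace with your own export data.

```python
# Bottom-up spend forecast: raw requests x blended multiplier x experimentation buffer,
# then compare against the per-user allowance to estimate billable overage.
OVERAGE_RATE_USD = 0.04

def forecast(requests_per_user: dict[str, float],
             blended_multiplier: float = 1.0,
             experimentation_buffer: float = 0.15,
             allowance_per_user: int = 300) -> dict[str, float]:
    projected = {
        user: raw * blended_multiplier * (1 + experimentation_buffer)
        for user, raw in requests_per_user.items()
    }
    overage = sum(max(0.0, v - allowance_per_user) for v in projected.values())
    return {
        "projected_premium_requests": sum(projected.values()),
        "projected_overage_requests": overage,
        "projected_overage_usd": overage * OVERAGE_RATE_USD,
    }

# Example: three developers with last month's raw request counts
print(forecast({"alice": 650, "bob": 280, "carol": 1200}, blended_multiplier=0.9))
```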

Where this is headed

GitHub is clearly moving toward granular, SKU‑level controls per Copilot feature, which is good news for governance and bad news for set‑and‑forget budgets. Expect more premium‑eligible tools and more model choice. That’s powerful—if you treat model selection like any other production knob with cost, latency, and quality trade‑offs. (github.blog)

Team reviewing model access policies and budgets on a screen

How we can help

We’ve been rolling out usage policies, budgets, and developer enablement for AI coding tools with product teams and platform groups. If you want a tight plan that preserves velocity, start by skimming our piece on measuring Copilot impact with the right metrics, then browse what we do for engineering leaders. If you need hands‑on help to set policies and budgets, see our services and drop us a note via contacts.

What to do next

1) Audit your current policies and budgets today.
2) Set your default stance: disable paid usage or cap it tightly.
3) Publish a model matrix and coach developers on multipliers.
4) Right‑size plans by role; reserve high‑multiplier models for short, high‑value work.
5) Review usage weekly in December while your teams adapt.

Editorial chart showing reduced AI overage costs after policy changes
Written by Viktoria Sulzhyk · BYBOWU
