Design ★ 17,195

capacity-planner

Use when an ops leader (Director of CX, Head of Support, VP Ops, Head of BizOps, Head of IT ops, Head of Finance ops) is sizing ops capacity, building a headcount plan, modeling utilization risk, planning Q3 capacity or annual support capacity, or designing CS coverage — and needs Erlang-C queueing math, P90 demand sizing, shrinkage-adjusted FTE, manager-trigger thresholds, and a quarterly hiring

cd ~/.claude/skills
git clone https://github.com/alirezarezvani/claude-skills.git claude-skills

capacity-planner

Sizing tool for ops teams that handle queued work — Support, CX, Customer Success, BizOps, IT ops, Finance ops. Built on Erlang-C queueing theory, Little’s Law, and the operational-leadership canon (Fournier, Larson, Cleveland, Reinertsen). Deterministic, stdlib-only, no LLM calls.

Purpose

You are an ops leader sized 15 → 35 with no idea how the 35-person org will actually behave at peak load. Or you are at 88% utilization and SLA is starting to slip. Or you have a hiring budget approved and need to sequence it across four quarters without burning out the existing team. This skill answers those questions with arithmetic, not vibes.

It produces three artifacts:

  1. Capacity sizing at 70/80/90% utilization against P50/P90/P99 demand, with P(SLA breach) at each point and a SAFE/WATCH/AT_RISK/CRITICAL risk band.
  2. Utilization health at the per-member traffic-light level plus a team verdict (HEALTHY/SQUEEZED/OVERLOADED/UNBALANCED).
  3. 12-month quarterly hiring plan accounting for ramp curves, attrition, QoQ demand growth, and span-of-control manager triggers.

When to use

  • Annual ops capacity planning (October-November for the following fiscal year).
  • Quarterly re-sizing if demand changed >15% or attrition spiked.
  • Pre-budget defense — the math that justifies the headcount ask to your CFO.
  • Diagnostic when an ops team is missing SLA and you need to know whether it’s a sizing problem, a process problem, or a bottleneck problem.
  • M&A / new-segment launch modeling — sizing a new team or combined org.

Workflow

  1. Intake demand. Pull P50/P90/P99 daily ticket/case volume from your work system (Zendesk, Intercom, JSM, ServiceNow, Salesforce). If you only have averages, stop and pull the distribution. Single- point demand estimates are the most expensive anti-pattern in ops.
  2. Model throughput. Run capacity_modeler.py with your demand, AHT, SLA target, current FTE, and shrinkage. Use --profile for your function (support / cx / bizops / finance-ops / it-ops). Read the 80%-utilization row — that’s your sizing point.
  3. Flag utilization risk. Run utilization_analyzer.py against your current team’s actual utilization data. Anyone >85% sustained is a throughput-collapse risk per Reinertsen. Spread >30 percentage points across team means UNBALANCED — fix that before hiring.
  4. Sequence hiring. Run hiring_sequencer.py with current FTE, target EOY, ramp time, attrition, and growth. It will front-load hires (Q1 35%, Q4 15%), apply ramp curves, and trigger a manager hire when span of control crosses 7 ICs/manager.
  5. Walk the Forcing-question library (see below). One question at a time. Do not skip ahead. Answers must be written down before you commit the plan.

Scripts

  • scripts/capacity_modeler.py — Erlang-C sizing with shrinkage adjustment and P50/P90/P99 breach probabilities. --profile for industry defaults.
  • scripts/utilization_analyzer.py — per-member traffic-light + team-level health verdict with variance detection.
  • scripts/hiring_sequencer.py — 12-month quarterly plan with ramp, attrition, growth, max-hires-per-quarter constraint, and manager-trigger logic.

All three accept --input <path> (JSON), --output {markdown,json}, --sample (built-in example), and --help. Stdlib only.

References

  • references/queueing_theory_canon.md — Erlang, Little, Hopp & Spearman, Reinertsen, Kingman, Cleveland, ITIL, Armony et al. (8 sources). The math.
  • references/ops_workforce_planning_canon.md — Fournier, Larson, Google SRE Workbook, Frei, Lawler, Bersin, Gartner, Grove (8 sources). The people factors.
  • references/capacity_anti_patterns.md — 11 named anti-patterns with cited sources, tool guards, and the meta-discipline that Lencioni + Goldratt + Christensen impose. (8+ named sources.)

Assets

  • assets/capacity_brief_template.md — 20-minute fill-out template with JSON skeletons for all three tools and an output checklist.

Assumptions

This skill assumes:

  • Work is queued (tickets, cases, work items) — not project-style. If your team’s work isn’t queued, this is the wrong skill.
  • Demand has a stationary-enough distribution within a quarter. Step-changes (new product launch, M&A, regulatory shift) require re-running mid-quarter.
  • You have at least 90 days of historical demand data to compute P50/P90/P99. If not, generate the distribution from your sales / user-base forecast first.
  • Service is single-class within a queue. If you have hard priority tiers (P1/P2/P3 with class-specific SLAs), model each as a separate queue and sum.
  • Channels are modeled coherently. Multi-channel teams use the appropriate --profile with built-in shrinkage premium.

Anti-patterns

See references/capacity_anti_patterns.md for the full taxonomy with sources. Top eight:

  1. Plan-to-100%-utilization (Reinertsen Principle 12)
  2. Treat-ramp-as-instant (Larson)
  3. Ignore-attrition-in-12-month-plan (Bersin)
  4. Hire-ICs-forever-with-no-manager-trigger (Fournier)
  5. Size-to-P50-demand-only (Cleveland)
  6. No-shrinkage-adjustment (Cleveland, SRE Workbook)
  7. Single-channel-model-for-multi-channel-work (Gartner, Kingman)
  8. No-surge-plan-for-P99-events (Hopp & Spearman, Reinertsen)

Distinct from

  • c-level-advisor/vpe-advisor measures engineering throughput via DORA 4 metrics, story points, deployment frequency, and cycle time bottlenecks. It is for engineering teams shipping code. This skill is for ops teams handling tickets/cases. Different unit of work, different math (Erlang-C vs. DORA), different bottleneck (queueing-blind staffing vs. WIP + lead time).
  • c-level-advisor/chro-advisor does strategic workforce planning (1-5 year capability portfolios, talent supply, leadership succession). This skill does operational 0-12 month capacity sizing against demand. Per Lawler: conflating them gets you hired into the wrong jobs.
  • project-management/* tracks delivery throughput on projects (Jira velocity, sprint capacity). This skill sizes around steady- state queued work.
  • Sibling process-mapper finds the bottleneck. This skill sizes the team around a known bottleneck. Order of operations: process-mapper first → capacity-planner second. Hiring around the wrong constraint wastes the hires.
  • business-growth/cs-coverage (if it exists) sizes Customer Success coverage by ARR/CSM ratio and segment. This skill sizes by queued work volume (tickets, cases, escalations). For a CS team that handles both relationship work AND a ticket queue, run both.

Forcing-question library (Matt Pocock grill discipline)

Discipline: walk these one at a time. Do not skip ahead. Answers must be written down. If you can’t answer one, that is your next investigation.

Q1 — “What is your bottleneck, and have you confirmed it empirically?”

Recommended answer: a named, measured stage in the workflow with queue-time data showing where work waits. Not a vibe. Not “escalations take too long”. An actual measured queue.

Why it’s the first question: Goldratt (The Goal, 1984) — every system has exactly one binding constraint at a time. Sizing around the wrong constraint wastes hires entirely. If you do not know your bottleneck, run process-mapper BEFORE this skill.

Canon: Eli Goldratt, The Goal (1984); Reinertsen, Principles of Product Development Flow (2009).

Q2 — “What service trade-off are you accepting?”

Recommended answer: a written, explicit choice — fast vs. empathetic, broad vs. deep, low-cost vs. high-quality. Frances Frei is unambiguous: you cannot win all four. The team that tries wins zero.

Why it matters: AHT, SLA, and shrinkage inputs are the operational expression of this trade-off. If they don’t agree (e.g., you set AHT for “empathy” but SLA for “speed”), the plan is internally inconsistent.

Canon: Frances Frei & Anne Morriss, Uncommon Service (HBR Press, 2012).

Q3 — “What’s your demand P90, and what’s the gap to your P99?”

Recommended answer: two specific numbers from the last 90 days of data, with the calendar context of each (e.g., “P90 was 480 tickets/day on normal Tuesdays; P99 was 720 on the day after the November release”). A team sized to P50 misses SLA half the time. A team sized to P99 overstaffs by 30-50%. P90 is the right operating sizing point per Cleveland.

Canon: Brad Cleveland, Call Center Management on Fast Forward (4th ed., 2019); A.K. Erlang, The Theory of Probabilities and Telephone Conversations (1909).

Q4 — “At your planned utilization, what is P(SLA breach) at P90 and at P99?”

Recommended answer: two probabilities, computed (not guessed) from Erlang-C with your specific N, AHT, and SLA target. If P(breach at P90)

10% you are understaffed at the sizing point. If P(breach at P99) > 50% you have no surge plan and the next peak event will be visible to the CEO.

Canon: Erlang (1909); Hopp & Spearman, Factory Physics (3rd ed., 2008), VUT equation.

Q5 — “Have you budgeted replacement hires for the attrition you’ll see this year?”

Recommended answer: yes, with a specific number. At 30% annual attrition (Bersin BPO midpoint), a 20-FTE team loses ~6 people this year. If your “add 5 net” plan is actually a “hire 11” plan, the recruiting volume changes drastically. Anti-pattern #3.

Canon: Bersin/Deloitte talent benchmarks (2015-2023); Edward Lawler, Strategic Workforce Planning (USC CEO, 2008).

Q6 — “When does span of control trigger a manager hire, and who is the candidate?”

Recommended answer: a specific quarter (from hiring_sequencer.py) and at least one identified candidate (internal lead or external hire). Past 7 ICs/manager, 1:1s degrade, feedback cycles slip, attrition climbs. Past 10 you have a coverage crisis. Hire the manager BEFORE crossing 10, not after.

Canon: Camille Fournier, The Manager’s Path (O’Reilly, 2017), ch. 5; Andy Grove, High Output Management (1983).

Q7 — “What is your surge plan for the P99 day?”

Recommended answer: an explicit, documented plan — overflow tier, BPO contracted capacity, on-call rotation, executive escalation tree, OR a written degradation contract that says “on P99 days we extend SLA to X minutes and notify customers proactively”. If the answer is “we’ll figure it out”, the P99 day is a fire visible to the board.

Canon: Hopp & Spearman, Factory Physics (2008); Reinertsen (2009) on capacity-margin discipline.


Walk these seven in order. One at a time. Write the answers down. The plan you submit is only as defensible as your answers to these seven questions.

Similar skills

algorithmic-art Design

Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations.

anthropics/skills ★ 146,722
brand-guidelines Design

Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply.

anthropics/skills ★ 146,722
claude-api Design

Build, debug, and optimize Claude API / Anthropic SDK apps. Apps built with this skill should include prompt caching. Also handles migrating existing Claude API code between Claude model versions (4.5 → 4.6, 4.6 → 4.7, retired-model replacements). TRIGGER when: code imports `anthropic`/`@anthropic-ai/sdk`; user asks for the Claude API, Anthropic SDK, or Managed Agents; user adds/modifies/tunes a C

anthropics/skills ★ 146,722
frontend-design Design

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, artifacts, posters, or applications (examples include websites, landing pages, dashboards, React components, HTML/CSS layouts, or when styling/beautifying any web UI). Generates creative, polished code and UI design that avoids generic AI aesthetics.

anthropics/skills ★ 146,722
mcp-builder Design

Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).

anthropics/skills ★ 146,722
slack-gif-creator Design

Knowledge and utilities for creating animated GIFs optimized for Slack. Provides constraints, validation tools, and animation concepts. Use when users request animated GIFs for Slack like "make me a GIF of X doing Y for Slack.

anthropics/skills ★ 146,722
More in Design →