Run a team at 80% of its capacity and it will deliver faster than the same team run at 100%. That is not a motivational slogan. It is arithmetic, and it has been arithmetic since a mathematician named John Kingman worked out the formula for waiting lines in 1961.
Most managers do the opposite. When a person has open time on their calendar, it reads as waste. So we fill it. Another project, another ticket, one more account. The logic feels airtight: idle time is money left on the table, so load everyone to the brim. Then delivery dates start slipping, everything sits in a queue, and nobody can explain why a team that is clearly working flat out keeps missing.
I watched this happen for years running IT operations for a regional telecom in Saskatchewan. Our network team was booked solid. Every engineer had a full plate, utilization on the dashboards looked fantastic to my boss, and change requests took six weeks to clear. When I later did fractional COO work for smaller companies, I saw the identical pattern in marketing teams, finance teams, support queues. The busiest teams were almost always the slowest. It took me an embarrassingly long time to understand that the two facts were connected, and that the busyness was causing the slowness.
The math nobody teaches new managers
Here is the relationship, stripped to its bones. As a team’s utilization climbs, the time work spends waiting in the queue does not rise gently. It rises along a curve that keeps steepening, and it goes nearly vertical as utilization approaches 100%. Every extra percentage point of utilization adds more delay than the point before it.
The clearest way I have seen it explained comes from the psychological safety researchers who reduced Kingman’s formula to a proxy you can do in your head: expected wait time scales with the percentage of time you are busy divided by the percentage of time you are free. The result is a relative multiplier, not a number of minutes.
Run the numbers and the curve jumps out at you:
- At 50% utilized, the proxy is 50 divided by 50, which is 1.
- At 80% utilized, it is 80 divided by 20, which is 4.
- At 90% utilized, it is 90 divided by 10, which is 9.
- At 99% utilized, it is 99 divided by 1, which is 99.
- At 100%, you are dividing by zero, and the wait is effectively infinite.
Look at the jump from 80% to 90%. You added ten points of utilization and the wait time more than doubled. That is the whole problem in one line. The gain you think you are getting by filling that last slice of someone’s week is dwarfed by the delay you create everywhere else.
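If you want to play with the curve yourself, here is a minimal Python sketch of that busy-over-free proxy. The function name `wait_proxy` is my own invention, and the result is a relative multiplier rather than minutes or days.

```python
import math


def wait_proxy(utilization_pct: float) -> float:
    """Relative wait: percent of time busy divided by percent of time free."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    free_pct = 100 - utilization_pct
    if free_pct == 0:
        return math.inf  # no free time at all: the queue never drains
    return utilization_pct / free_pct


# Print the multiplier at a few utilization levels to see the curve bend.
for pct in (50, 80, 90, 95, 99, 100):
    print(f"{pct:>3}% busy -> wait multiplier {wait_proxy(pct):g}")
```

Try values between 90 and 99 to see how little extra utilization it takes to multiply the wait again.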
The engineers who study this in factories put real minutes on it. In one worked example from lean operations, a workstation running at 80% utilization, with normal variation in how long jobs take, produced an average wait of about 22 minutes per job. Push that station to 90% with the same jobs and the same variation, and the utilization term in Kingman’s formula goes from 4 to 9, which would put the wait near 50 minutes. The station is not broken. It is not slow. It is simply too full to absorb the ordinary bumps in the work.
Why full teams feel slow even when everyone is fast
The reason the curve bends is variability, and knowledge work is drowning in it. A support ticket that usually takes twenty minutes occasionally takes three hours. A “quick review” turns into a rewrite. Someone gets sick, a client escalates, a dependency you were counting on slips. Kingman’s formula multiplies the utilization term by a variability term, which means the more unpredictable your work is, the more cruelly high utilization punishes you.
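For readers who want the full formula rather than the proxy: Kingman's approximation for a single-server queue multiplies three things together, namely the utilization term (busy over free), the average of the squared coefficients of variation of arrivals and of job lengths, and the mean job length. The sketch below is one way to write it down. The function and parameter names are mine, and the 20-minute ticket and the variability values are made-up inputs for illustration.

```python
def kingman_wait(utilization, cv_arrival, cv_service, mean_service_time):
    """Kingman's approximation of the mean time a job waits in the queue.

    utilization        fraction of time the server is busy, 0 <= u < 1
    cv_arrival         coefficient of variation of time between arrivals
    cv_service         coefficient of variation of job length
    mean_service_time  average job length, in any time unit
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    utilization_term = utilization / (1 - utilization)
    variability_term = (cv_arrival ** 2 + cv_service ** 2) / 2
    return utilization_term * variability_term * mean_service_time


# Hypothetical 20-minute tickets at 90% utilization, from calm to chaotic work.
for cv in (0.5, 1.0, 1.5):
    wait = kingman_wait(0.9, cv_arrival=cv, cv_service=cv, mean_service_time=20)
    print(f"cv={cv}: average wait of about {wait:.0f} minutes")
```

Holding utilization fixed and changing only the variability inputs shows how much of the wait comes from unpredictability rather than from load.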
When a team has slack, that variation gets absorbed. The unexpected three-hour ticket lands, someone picks it up, and the queue barely notices. When a team is at 100%, there is no one free to absorb it. The ticket waits. The thing behind it waits. The thing behind that waits. One bad afternoon ripples through the whole pipeline for days. This is exactly the dynamic I wrote about in cross-team dependencies, except here the drag is coming from inside your own team, hidden by a dashboard that says everyone is productively occupied.
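You can watch the ripple happen in a toy simulation. The sketch below models one worker with jobs arriving at random and taking a random amount of time. Both are drawn from exponential distributions, which is my simplifying assumption and the textbook case in which the busy-over-free proxy holds on average. Real teams are messier, so treat it as an illustration and not as a model of your shop. All names are hypothetical.

```python
import random


def simulate_mean_wait(utilization, n_jobs=100_000, mean_service=1.0, seed=42):
    """Average time jobs wait before work starts, for a single worker.

    Each job's wait is the previous job's wait plus the previous job's
    length, minus the gap until this job arrived, and never below zero.
    """
    rng = random.Random(seed)
    arrival_rate = utilization / mean_service  # busier means jobs arrive closer together
    wait = 0.0
    total_wait = 0.0
    for _ in range(n_jobs):
        total_wait += wait
        service = rng.expovariate(1.0 / mean_service)  # how long this job takes
        gap = rng.expovariate(arrival_rate)            # time until the next job arrives
        wait = max(0.0, wait + service - gap)
    return total_wait / n_jobs


for u in (0.5, 0.8, 0.9):
    print(f"{u:.0%} utilized: mean wait of {simulate_mean_wait(u):.1f} job-lengths")
```

With the average job length set to 1, the simulated waits should land in the neighborhood of the proxy values, and the exact figures will shift with the seed and the number of jobs.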
There is a second cost that the queueing math does not even capture: the tax of switching. A person managing eight active pieces of work is not doing eight things. They are spending a chunk of every day remembering where they left off, re-reading threads, and context-switching. Asana’s research on how knowledge workers spend their time found that roughly 60% of the average workday goes to coordination and “work about work” rather than the skilled work people were hired for. Load the schedule to the brim and you are not adding output. You are adding overhead. I dug into the personal side of this in cognitive load for managers; the team-level version is the same trap scaled up.
What the delivery research says
This is not just factory theory. The team behind the DORA research program, which has studied software delivery across thousands of organizations, reaches the same conclusion from the other direction. Their finding on limiting work in progress is blunt: when teams try to multitask across too many assignments, tasks take longer and people burn out. Constraining the amount of work in flight reduces lead times and improves throughput. Less work started at once means more work finished, sooner.
That sentence deserves rereading because it violates every manager’s gut instinct. Starting less gets you more. It works because unfinished work is not value. A project that is 90% done delivers nothing. Five projects each 90% done deliver nothing five times over, while consuming five people’s attention. Finish two of them and you have shipped two things while the other three wait their turn, and those three move faster because the people on them are no longer splitting their heads five ways.
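The arithmetic behind "starting less gets you more" fits in a few lines. The sketch below assumes a team with a fixed pool of capacity, equal-sized projects, and effort split evenly across whatever is in flight. Those are my simplifications, and the function name and numbers are hypothetical. It ignores switching costs entirely, so if anything it understates the benefit of a lower limit.

```python
def finish_times(n_projects, weeks_each, wip_limit):
    """Week in which each project finishes when at most wip_limit run at once.

    The team completes one project-week of work per calendar week, shared
    evenly among the projects in the current batch.
    """
    if wip_limit < 1:
        raise ValueError("wip_limit must be at least 1")
    times = []
    clock = 0.0
    remaining = n_projects
    while remaining > 0:
        batch = min(wip_limit, remaining)
        clock += batch * weeks_each      # the batch shares capacity, so it finishes together
        times.extend([clock] * batch)
        remaining -= batch
    return times


for limit in (5, 2, 1):
    times = finish_times(n_projects=5, weeks_each=4, wip_limit=limit)
    average = sum(times) / len(times)
    print(f"WIP limit {limit}: finish weeks {times}, average {average:.1f}")
```

The last project lands in the same week under every limit. What changes is how early everything else ships.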
What to actually do about it
You do not need to calculate coefficients of variation to manage this. You need a few habits.
Plan to roughly 80%, not 100%. When you build the quarter or the sprint, leave a fifth of the team’s capacity genuinely open. Not “open until something comes up,” which is code for zero. Actually open. That slack is what absorbs the escalations, the sick days, and the work you cannot yet see. Managers resist this because 80% looks like you are paying for idleness. You are not. You are paying for speed and for the ability to respond when it matters. Full teams have no ability to respond, which is why they feel brittle.
Limit how much is in flight, not just how much is assigned. There is a difference between the work on someone’s list and the work they are actively touching this week. Cap the second number. A useful rule of thumb from flow practice: the number of items actively “in progress” should not exceed the number of people who can move them. If you have four people who can do the work, do not have twelve things half-started. This is where the throughput actually lives, and it connects directly to how you handle work intake at the front door.
Watch queue time, not activity. The dashboards that got me into trouble measured how busy people were. The number that predicts whether you hit your dates is how long work waits before someone starts it. Track the age of items in your queue. When things start aging, the answer is almost never “work harder.” It is “start fewer things” or “protect more slack.”
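Tracking queue age does not need a special tool. Here is a minimal sketch that assumes you can export your work items with a date queued and a date started. The field names `queued_on` and `started_on`, the function name, and the sample items are all hypothetical, so adapt them to whatever your ticketing system produces.

```python
from datetime import date


def waiting_ages(items, today):
    """Return (age_in_days, title) for every item nobody has started, oldest first."""
    ages = [
        ((today - item["queued_on"]).days, item["title"])
        for item in items
        if item["started_on"] is None  # only work that is still waiting
    ]
    return sorted(ages, reverse=True)


# Made-up sample queue for illustration.
queue = [
    {"title": "Firewall change", "queued_on": date(2024, 3, 1), "started_on": None},
    {"title": "Router upgrade", "queued_on": date(2024, 3, 18), "started_on": date(2024, 3, 20)},
    {"title": "VPN access request", "queued_on": date(2024, 3, 11), "started_on": None},
]

for age, title in waiting_ages(queue, today=date(2024, 3, 25)):
    print(f"{age:>3} days waiting: {title}")
```

Reviewing the top of that list weekly shows aging work long before a utilization dashboard would.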
Treat a permanently maxed-out person as a risk, not a hero. The engineer who is always at 100% is not your most reliable resource. They are your most fragile one, because they have no capacity to absorb a surprise, and surprises are guaranteed. I have made the mistake of rewarding the always-full person and quietly wondering why work routed through them kept slipping. The slipping was the utilization. Related failure mode: when that maxed-out person is you, you have become the bottleneck, and the whole team waits on your queue.
Attack variability where you can. You cannot make knowledge work perfectly predictable, but you can reduce the swings. Clearer intake, better-defined tasks, fewer surprise reprioritizations, and cutting rework all shrink the variation term in the equation. Lower variation lets you safely run at higher utilization for the same wait time. It is the one lever that lets you have a little more of both.
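To see how much room that lever buys, you can turn Kingman's approximation around and ask how full a team can run for a given acceptable wait. The sketch below rearranges the formula algebraically. The function name and the example inputs are my own, and it inherits every simplification of the approximation itself, so read the output as a direction and not a target.

```python
def max_utilization(target_wait, mean_service_time, cv_arrival, cv_service):
    """Highest utilization that keeps the Kingman-approximated wait at target_wait.

    Rearranged from: wait = u / (1 - u) * variability * mean_service_time
    """
    if target_wait <= 0 or mean_service_time <= 0:
        raise ValueError("target_wait and mean_service_time must be positive")
    variability = (cv_arrival ** 2 + cv_service ** 2) / 2
    return target_wait / (target_wait + variability * mean_service_time)


# Hypothetical: 20-minute jobs and an acceptable average wait of 80 minutes.
for cv in (1.0, 0.75, 0.5):
    u = max_utilization(80, 20, cv_arrival=cv, cv_service=cv)
    print(f"cv={cv}: can run at about {u:.0%} utilization")
```

Lowering the variability inputs raises the utilization you can afford for the same wait, which is exactly the trade described above.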
The uncomfortable reframe
The instinct to fill every calendar comes from a good place. You are responsible for the payroll and you want to see it put to work. But utilization is not output, and a team with no slack is not efficient. It is a team that has traded away its ability to deliver on time and its capacity to handle anything unexpected, in exchange for a dashboard that looks fully utilized.
The best-run operations I have been part of always looked slightly underloaded from the outside. There was room to breathe, and work moved through them fast and clean. The worst-run ones looked heroic. Everyone was slammed, nobody was idle, and nothing shipped when it was supposed to. Once you have seen the curve behind that, you cannot unsee it. The full team is the slow team, and the fix is to give it back some room.