Don't just limit WIP - optimize it

Don't just limit WIP - optimize it

 m

You might be not so advanced with your Kanban. Perhaps, you only visualize your work and your team doesn't bother with WiP limits. Visualization is useful as long as it is fresh, in that one receives signals from a well-designed board about where the flow of the delivery of value is getting impeded. 

The addition of WiP limits drives a dramatic improvement in the performance of teams. One caveat is that better results get attained if work in progress is not allowed to age. Another caveat is that setting too low a WiP limit is something to watch, by monitoring statistical performance as well as "how it feels"; having too low a WiP limit is rarely an issue for knowledge work though. Why does limiting WiP reduce cycle time? Because cycle time reduces when "wait time" and "task switching" time are squeezed out, driving has a better correlation between cycle time and estimated effort. We begin to realize that maybe estimation is not needed, as completing items means more

There are examples that go to 30 days cycle time for 2-3 days of work, and at this point the repsonse is usually if only we had more team members. The only intervention we make is to apply WiP limits that create realistic tension to collaborate to finish work
There are examples that go to 30 days cycle time for 2-3 days of work from London, UK and Hilversum, NL, and at this point, the response is usually "if only we had more team members". The only intervention we make is to apply WiP limits that create realistic tension to collaborate to finish work. This is because of Little's Law, proven in 1961 and revisited 50 years later, and still true today.

Little's Law, when stated in terms of departures instead of arrivals, requires some assumptions (see Daniel Vacanti's video at https://vimeo.com/52683659).

Little's Law when stated in terms of departures (note underlying assumptions in Daniel's video)
Little's Law when stated in terms of departures (note underlying assumptions in Daniel's video). Lead Time = End to End Customer Cycle Time or Selected to Done for out simulations)

In this video (which I cannot recommend highly enough along with his Titanic video at https://vimeo.com/239539858), Daniel refers to Little's Flaw, that is assuming Little's Law will work when the underlying assumptions to make it work are not satisfied. The primary assumption to my mind is not allowing ageing of work in progress. That is, once it starts you should aim to finish the item as soon as possible without ageing more than other items in progress. 

Teams eliminated Little's Flaw by having explicit policies in their workflow including watching/preventing the ageing of work in progress. These "Little's Flaw killer-policies" allowed Little's Law came fully into effect. Our teams became much more predictable, even with very complicated work, with dependencies, and (shock horror:)) ordering features with cost of delay (features are Product Backlog Item bundles in this case). The trick to avoid ageing, is regardless of how the Product Backlog Items (PBIs) were ordered, once they get selected (customer commitment) we should finish them as quickly as possible.

​  Cumulative flow diagram from May 2018 in Cincinnati - slope of top line (blue) into Selected (Sprint Backlog) is average arrival rate, Scrum batch replenishment is obvious; the slope of bottom (green) line is the average delivery rate.  ​
Cumulative Flow Diagram(CFD) shows the amount of work arriving, at various stages of in progress, and done. This CFD is for a team in Cincinnati, from one of 90 or so simulations that pretty much went the same way, although this team was quicker to optimize than most others. Sprint length was 6 days. Notice the batch replenishment every few days, ok as long as the Sprint Goal is not compromised, although smoother replenishment would be better. The slope of top blue line is average arrival rate into "selected" (sprint backlog). Slope of top red line is average arrival rate into "build". Slope of top green line is average departure rate, not totally in line with arrival rate. I was happier that slope of top red line was towards the end roughly tracking the slope of the top green line from day 13 onwards, indicating a movement to "stop starting and start finishing" behaviour. 
Cycle Time Scatter Plot - notice the dramatic improvement in cycle time predictablility
Cycle Time Scatter Plot for Product Backlog Items - this is the usual pattern - notice the dramatic improvement in cycle time predictability. We can see evidence of flow debt (metaphorically borrowing work time from Peter to pay Paul sacrificing Peter's cycle time) earlier in the chart, debt which was repaid by longer cycle times later. It takes a number of "days" to bleed a clogged system before cycle times find a new norm. This was a better example than most simulations, with less work-in-progress to bleed than usual. 
Extra graphs
Notice the typical pattern for these Featureban simulations - initial honeymoon period without historical work causing distractions, team still ends up backing up with an excessive amount started (and often impeded) work,  after WiP goes down, cycle times reduce, and then a lag to improved delivery rate (would have continued to increase in this case to 3 to 4 per day if I had let the simulation continue like in previous simulations). Notice also the tightening of cycle time around 2 days on the distribution histogram. Monte Carlo using random number ranges from this simulation would have produced good results as at least one item was delivered per measured time period (days). A blog post on Monte Carlo will come later.

I used Mike Burrows' Featureban (https://www.agendashift.com/featureban) and Andy Carmichael's additional design, with some tuning. "Daily", team members flipped coins. Heads or Tails, heads was a good day, tails not so good, and people learned to benefit from one of the rules for tails that one could move an item from selected into build. On a given day a PBI could move into selected and build with heads or tails, and then with two heads, could move through Build and Test into Done. In this case, we know that each PBI takes two to three days if there is no waiting. We seemed to get reliable with 2 day cycle time towards the end and it seemed pointless to continue as it was obvious from that point we would stay consistent as long as we did not change policies.

There were about 90 advanced Featureban simulations in the US and across the EU, and the evidence is clear (within the constraints of this series of experiments). A limit of 6 work in progress items for say 9 developers leads to more collaboration. I noticed that a WiP limit of 2/3 to 3/4 of the number of development team members was the "sweet spot". When work in progress was not allowed to age by consciously selecting the oldest-in-progress item first (or even for example "if any item is older than 4 days let's move it"), the dramatic reduction in cycle time was causal, not just correlated. When WiP limits created more pull, teams focused on the rightmost part of the Kanban board first so pull signals would come from right to left, with no work getting pushed into the 1st in process column (Build) until the team is ready.

I had a team once who removed the WiP limits at this point and immediately cycle times increased in an identical trend to before. I had only a few teams with "fake WiP limits", limits that were so high they were not creating tension to finish work, and those teams had the same trends as push systems, cycle times that were getting longer and longer.

Mike Burrows' Featureban, also recording day moves to columns for more interesting graphs

Some teams understood that a really frugal Product Owner can appear to avoid the need for teams to have the discipline to limit their own WiP. You can see that even in this case in Cincinnati with frugal under-commitment that cycle times still got longer and longer. I saw only one case (in Leeds, UK) where the product owner was so mean with the work that the Development Team did not see the need for WiP limits (and the need for that discipline seemed to have been taken care of by the Product Owner only dropping work in when she was confident the team would move it into build that day). However, there are a couple of issues with this:

  • The Development Team is not self-managing in that it is not selecting work in consultation with the Product Owner, to support the Sprint Goal negotiated between the Product Owner and the Development Team. That is the Product Owner is acting more like Project Manager 2.0, and there is no project manager role in Scrum.
  • It is more like just-in-time replenishment to the extent that the Development Team might have come up with a better suggested order of PBIs for the Product Owner, particularly in a multiple teams Scrum scenario. Refer to Barry Overeem's post Myth 5 - In Scrum, the Product Backlog is prioritized (http://www.barryovereem.com/myth-5-in-scrum-the-product-backlog-is-prioritized/). The Development Team's ability to decide how to do the work where there are dependencies, or the need to acquire more knowledge/skills first, is impeded somewhat.

Meanwhile, check out Daniel Vacanti's case study for Kanban at Siemens Health Services at https://www.infoq.com/articles/kanban-siemens-health-services, where reduction of end to end customer cycle time for complex work was validated.

You will miss a trick if you do not optimize to flow of work via WiP limits. Don't be fooled by temporarily rising throughput without WiP limits. WiP is a leading indicator of cycle time. Cycle time can be a leading indicator for throughput but there can also be a tradeoff between cycle time and throughput. For example, when you deliberately incur "flow debt" (borrowing work time from other items to prioritize an item(s) in progress, thus causing relative ageing and disruption the assumptions for Little's Law - see Daniel Vacanti's Little Flaw video at https://vimeo.com/52683659).

Another tradeoff example that Andy Carmichael gave me is when a system is overburdened reducing WIP increases throughput (by reducing overhead tasks and failure demand) and reduces cycle time. When a system is underutilized reducing WIP will *reduce* throughput but have no effect on cycle time. In between, there's a choice to keep cycle time low by reducing WIP, or allow a higher WIP so processes are not starved of work. In theory, this should increase throughput.  

Visualize your work for sure, but go the extra mile to optimize the flow of value. It leads to a calmer and most sustainable work environment, and I'd be surprised if it doesn't improve quality.

I am a co-author of Kanban Guide at https://kanbanguides.org. If you’re interested in learning more about Kanban in another context, visit KanbanGuides.org.