<?xml version="1.0" encoding="UTF-8"?>







<rss version="2.0">
<channel>
  <title>Lasse&#039;s weblog</title>
  <link>http://radio.javaranch.com/lasse/</link>
  <description>Blurts on the Art of Software Development</description>
  <language>en</language>
  <copyright>Lasse Koskela</copyright>
  <lastBuildDate>Tue, 22 Apr 2008 04:04:57 GMT</lastBuildDate>
  <generator>Pebble</generator>
  <docs>http://backend.userland.com/rss</docs>
  <image>
    <url>http://pebble.sourceforge.net/common/images/powered-by-pebble.gif</url>
    <title>Lasse&#039;s weblog</title>
    <link>http://radio.javaranch.com/lasse/</link>
  </image>
  
  <item>
    <title>Planning Poker 101</title>
    <link>http://radio.javaranch.com/lasse/2008/04/22/1208837097457.html</link>
    
      
        <description>
          &lt;p&gt;

&lt;/p&gt;&lt;p&gt;
&lt;a name=&#034;entry_section_disclaimer&#034;&gt;&lt;b&gt;The Disclaimer&lt;/b&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
Ever heard of planning poker? If you haven&#039;t, continue reading. If you know what planning poker is and don&#039;t feel like reading yet another intro, jump over to &lt;i&gt;The Beef&lt;/i&gt;. No, wait. I think you should read the whole thing. If for no other reason than to see where you and I differ in opinion or style. You might even be inspired to post a comment or a trackback or whatever folks nowadays do in the blogosphere.
&lt;/p&gt;

&lt;p&gt;
&lt;a name=&#034;entry_section_intro&#034;&gt;&lt;b&gt;The Intro&lt;/b&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
Planning poker is a group estimation technique popularized by &lt;a href=&#034;http://www.mountaingoatsoftware.com/&#034;&gt;Mike Cohn&lt;/a&gt;. And it&#039;s focused around playing cards - hence the name. It&#039;s not really poker but you often see teams using a deck of playing cards just like the ones you use for ripping off your mates in &lt;a href=&#034;http://en.wikipedia.org/wiki/Texas_hold_&#039;em&#034;&gt;Texas Hold&#039;em&lt;/a&gt;. At gunpoint, I might identify one similarity between planning poker and hold&#039;em, though. You don&#039;t know what cards the other players are holding.
&lt;/p&gt;
&lt;p&gt;
Ok, enough with the mystery. Here&#039;s how it works.
&lt;/p&gt;
&lt;p&gt;
You&#039;re in a planning meeting for the next iteration. You have a bunch of features to implement and you need to figure out which or how many of them you can implement in your two-week iteration. There are plenty of things you and your team need to estimate and a limited amount of time to spend on it. Let&#039;s say you&#039;re using &lt;a href=&#034;http://radio.javaranch.com/lasse/2008/04/17/1208381586654.html&#034;&gt;story points&lt;/a&gt;. So, you pull out a deck of cards.
&lt;/p&gt;
&lt;p&gt;
You deal out sets of six cards (A,2,3,5,8,K) to each team member - everyone gets an Ace and a King and cards of rank 2, 3, 5 and 8. Next, you name the feature you&#039;re estimating and maybe read out loud some details such as a short description, the acceptance criteria, or something like that. A brief discussion ensues where team members ask clarifying questions about the feature. Then, one by one, the team members place their hand on the table, holding one of their cards face down. Once everyone has their hand (and card) on the table, you count to three and turn the cards over. The rank your card has expresses your estimate for whatever you&#039;re estimating. The Ace is a &#034;1&#034;, the King is &#034;too big to estimate&#034;. Everything else is just that - the rank.
&lt;/p&gt;
&lt;p&gt;
Some teams find that plain old playing cards cramp their style and order specially designed planning poker cards from &lt;a href=&#034;http://www.mountaingoatsoftware.com/products/cards&#034;&gt;Mike Cohn&lt;/a&gt;, &lt;a href=&#034;http://www.crisp.se/planningpoker/&#034;&gt;Crisp&lt;/a&gt;, &lt;a href=&#034;http://www.nordija.com/en/PlanningPoker.html&#034;&gt;Nordija&lt;/a&gt; or &lt;a href=&#034;http://www.planningpokercards.com/&#034;&gt;PlanningPokerCards.com&lt;/a&gt;. Or, if you&#039;re working with &lt;a href=&#034;http://www.ri.fi/en&#034;&gt;us&lt;/a&gt;, just ask and we&#039;ll bring along a couple of decks of our cards. Our cards are obviously better looking than the rest but they all work quite nicely.
&lt;/p&gt;
&lt;p&gt;
As you look around the table at your team members&#039; cards, you see one of three things: a consensus, numbers all over the range, or something in between. If there&#039;s a consensus you write down the estimate - you all agree so there&#039;s no point in going through the motions. In any other case, the people who gave the highest and lowest estimates explain the rationale behind them - why did they give that particular estimate? A discussion ensues and, after some 53 seconds of exchanging thoughts, people start putting their hands on the table again - with a card face down, indicating that they&#039;re ready to have another go at estimating whatever you were just estimating. The odds are, you now either get consensus or don&#039;t. (Yes, I did figure that out on my own.) If you get consensus, write down the estimate and move on. If you don&#039;t, have another go at arguing for the high/low estimate and re-estimate. If you still don&#039;t get consensus (or near consensus), you either take the highest estimate or the average. It doesn&#039;t really matter all that much which approach you take - just decide the protocol with the team before the planning session.
&lt;/p&gt;
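&lt;p&gt;
The round described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the returned dictionary, and the handling of a King are hypothetical choices, not part of any planning poker tool.
&lt;/p&gt;

&lt;pre&gt;
```python
from statistics import mean

# Card ranks from the post: the Ace counts as 1, the King means "too big to estimate".
CARD_VALUES = {"A": 1, "2": 2, "3": 3, "5": 5, "8": 8}
TOO_BIG = "K"


def reveal(cards):
    """Summarize one round of face-up cards, given as {player: card}."""
    if TOO_BIG in cards.values():
        # Assumption: any King sends the item back for more discussion.
        return {"status": "too big", "estimate": None, "explain": []}
    values = {player: CARD_VALUES[card] for player, card in cards.items()}
    low, high = min(values.values()), max(values.values())
    if low == high:
        return {"status": "consensus", "estimate": low, "explain": []}
    # No consensus: the lowest and highest voters explain their reasoning.
    explain = sorted(p for p, v in values.items() if v in (low, high))
    return {"status": "discuss", "estimate": None, "explain": explain}


def fallback(cards, policy="highest"):
    """Last resort after repeated rounds; expects numeric cards only (no King)."""
    values = [CARD_VALUES[card] for card in cards.values()]
    return max(values) if policy == "highest" else mean(values)


round_one = reveal({"Jack": "8", "Bill": "5", "Steve": "2", "Melinda": "5"})
print(round_one["status"], round_one["explain"])
```
&lt;/pre&gt;

&lt;p&gt;
Note that the fallback function is deliberately separate from the reveal step: the team calls it only after the discussion rounds have failed to produce consensus, using whichever policy was agreed before the session.
&lt;/p&gt;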
&lt;p&gt;
There. You&#039;ve got all the items estimated and there&#039;s still 12 minutes before someone takes over the meeting room. And, lo and behold, those estimates included not just the input of Jack and Bill who are always keen to voice their opinion about anything and everything but also that of Steve and Melinda who usually keep their mouths shut while Jack and Bill argue about effort estimates. This planning poker stuff isn&#039;t just fast and lightweight. It&#039;s also helping our team commit to its promises simply by involving the whole team in the estimation process!
&lt;/p&gt;

&lt;p&gt;
That&#039;s planning poker. Simple. Fast. Effective. However, when people get introduced to planning poker and go about their first planning session using it, something happens that I really dislike. It has to do with what you do when you don&#039;t immediately get consensus. Let me explain.
&lt;/p&gt;

&lt;p&gt;
&lt;a name=&#034;entry_section_beef&#034;&gt;&lt;b&gt;The Beef [with Average and Median]&lt;/b&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;
My beef is with people intuitively beginning to use an &#034;average&#034; or &#034;median&#034; algorithm for resolving differences between team members&#039; estimates when they first see those cards coming up without a consensus.
&lt;/p&gt;
&lt;p&gt;
Thinking ahead, trying to find a &#034;solution&#034; while still receiving information, is human nature. You can see it in, for example, a software developer who starts imagining the system architecture after reading just the first two pages of the requirements specification.
&lt;/p&gt;
&lt;p&gt;
This, I understand. I do it all the time.
&lt;/p&gt;
&lt;p&gt;
But &lt;i&gt;why&lt;/i&gt; average and &lt;i&gt;why&lt;/i&gt; median? Why do we take these as some kind of a &#034;safe bet&#034;, trusting that the truth must be close to the majority? Isn&#039;t it possible that the one who&#039;s in the minority knows more than the others?
&lt;/p&gt;
&lt;p&gt;
Using an average or median algorithm for resolving differences (rather than trying to find a consensus through discussion) is like imagining the system architecture based on the first section of the spec. It&#039;s simply too soon! Just like the requirements specification always hides something you didn&#039;t quite take into consideration with your immediate vision of the architecture, the few minority votes behind those poker cards always include some very useful information that&#039;s worth revealing.
&lt;/p&gt;
&lt;p&gt;
Really? &lt;i&gt;Always?&lt;/i&gt; Isn&#039;t it the case that every now and then the minority vote was due to lack of knowledge about the domain, the code base, or an oversight of some kind?
&lt;/p&gt;
&lt;p&gt;
Yes.
&lt;/p&gt;
&lt;p&gt;
And that&#039;s why you should bring it up.
&lt;/p&gt;
&lt;p&gt;
It&#039;s not just about getting the best possible estimate with the available knowledge. It&#039;s also about sharing that knowledge and improving those estimates in the future.
&lt;/p&gt;
&lt;p&gt;
So bring up those arguments and try to get that consensus before resorting to average or median as the last word.
&lt;/p&gt;
        </description>
      
      
    
    
    
    <comments>http://radio.javaranch.com/lasse/2008/04/22/1208837097457.html#comments</comments>
    <guid isPermaLink="true">http://radio.javaranch.com/lasse/2008/04/22/1208837097457.html</guid>
    <pubDate>Tue, 22 Apr 2008 04:04:57 GMT</pubDate>
  </item>
  
  <item>
    <title>On Story Points</title>
    <link>http://radio.javaranch.com/lasse/2008/04/17/1208381586654.html</link>
    
      
        <description>
          &lt;p&gt;
I&#039;ve mentioned this before but most of my work these days revolves around helping organizations and software development teams adopt agile methods or otherwise restructure their operations. Specifically, a lot of our clients (like most teams at &lt;a href=&#034;http://www.ri.fi/en&#034;&gt;Reaktor&lt;/a&gt;) are using or looking at &lt;a href=&#034;http://www.ri.fi/web/en/technology-and-research/scrum&#034;&gt;Scrum&lt;/a&gt; so I get to see a lot of teams take their first steps in that direction. And I get asked similar questions over and over again. While there&#039;s really no single &#034;correct&#034; answer to most of those questions, some of them I tend to respond to with an almost identical answer or suggestion.
&lt;/p&gt;
&lt;p&gt;
Earlier today I found an old index card with a note sketched vertically along the side saying, &lt;i&gt;&#034;[empty tick box] story points for product backlog&#034;&lt;/i&gt;, with the word &#034;product&#034; underlined. I don&#039;t really remember what I was trying to say to myself with that note (that happens a lot with notes I&#039;ve made more than a few weeks ago) but I interpret it to mean that the team I had been consulting at that time had decided to use &lt;i&gt;story points&lt;/i&gt; for estimating tasks for their sprint backlogs - and I wanted to bring forth some thoughts about that at the next suitable moment.
&lt;/p&gt;
&lt;p&gt;
Now, the use of story points has been the topic of many of those recurring questions I get asked so I decided to (gasp) blog about it. I haven&#039;t done this in a while so it&#039;s kind of exciting. Anyway, here&#039;s a short introduction to story points and, related to that, what I often suggest to teams I coach or consult with.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;STORY POINTS - THE &lt;span style=&#034;text-decoration:line-through;&#034;&gt;DIRECTOR&lt;/span&gt;LASSE&#039;S CUT&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Scene 1: Relative estimation&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Story points are an &lt;i&gt;abstract unit of effort&lt;/i&gt; (or &lt;i&gt;complexity&lt;/i&gt; but I&#039;ll stick with effort for now). They&#039;re not hours. They&#039;re not days. They&#039;re not quarters-of-a-day. They have no defined relationship with any time unit you can think of. What they do have in common with time units is the scale - it&#039;s linear and proportional. In other words, the distance between &#034;1 story point&#034; and &#034;2 story points&#034; is the same as the distance between &#034;2&#034; and &#034;3&#034;, between &#034;3&#034; and &#034;4&#034;, and so forth. The measures are relative and can be compared with each other. For example, 2 story points is twice the size of 1 story point.
&lt;/p&gt;
&lt;p&gt;
So we know that story points are an abstract unit of effort. What do we &lt;i&gt;measure&lt;/i&gt; in story points, then? We use story points for estimating the effort of implementing &lt;i&gt;user stories&lt;/i&gt; (or whatever format you prefer for expressing requirements or desired features for your product). In other words, we assign story points (or just &#034;points&#034;) to features based on how much effort we estimate them to require. And, remember, story points are not a measure of time but effort - not even a measure of effort in time. A measure of effort. Period.
&lt;/p&gt;
&lt;p&gt;
The advantage of estimating in story points (compared to, say, &lt;i&gt;effective engineering hours&lt;/i&gt; or calendar days) is speed. With story points we&#039;re estimating relatively, comparing items of work to each other and placing them into virtual buckets representing our chosen scale. Items of size &#034;1&#034; go with other items of size &#034;1&#034;. Items of size &#034;3&#034; go with other items of size &#034;3&#034;, and so forth. When we encounter a story that&#039;s almost twice as much work as an item in the 3-point bucket, we throw that in the 5-point bucket. It&#039;s all relative. And that&#039;s what makes it fast - we (humans) are good at comparing things. We&#039;re not that good at estimating how long it will take to make that point-of-sale system support an additional type of campaign pricing scheme. With relative estimation, we can plow through many more backlog items with equal accuracy (and I mean accuracy, not precision) compared to breaking them all up into smaller tasks, estimating them, adding them up, and slapping some buffer on top.
&lt;/p&gt;
&lt;p&gt;
So where does this take us? We have a technique for quickly generating estimates by comparing the items relatively to each other but those estimates are presented in abstract units of effort that have no relationship to the Gregorian calendar or time - let alone the project&#039;s deadline. How do we know how long it will take to build the product or finish the project?
&lt;/p&gt;
&lt;p&gt;
Well, only the almighty god would know the answer - on a good day - but we do have the means for producing a useful prediction. It&#039;s called an experiment. We simply start working on a couple of those items and see how much we could accomplish in a given period of time - how many story points worth of features did we implement?
&lt;/p&gt;
&lt;p&gt;
If we completed 18 story points in the first week and the product backlog has a total of 440 story points remaining, we can quite confidently say that the project will finish within 4-18 months. After the second week, with 14 more story points completed, we can quite confidently say that the project will finish within 8-16 months. After the third week, with 15 more story points completed, we can quite confidently say that the project will finish within 7-12 months. After the fourth week, we&#039;re again that much more confident that our velocity - the pace at which we complete features - is representative of the rest of the project and yields a reliable prediction for the overall completion date.
&lt;/p&gt;
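&lt;p&gt;
The arithmetic behind such a prediction can be sketched as follows. This is a minimal illustration, not a forecasting method: it computes only the naive point projection from the running average velocity and makes no attempt to reproduce the ranges quoted above, which express the uncertainty around it. The function name is hypothetical, and treating the 440 points as the total remaining after the first week is an assumption.
&lt;/p&gt;

&lt;pre&gt;
```python
def weeks_remaining(points_per_week, remaining_points):
    """Naive projection: remaining work divided by average weekly velocity."""
    velocity = sum(points_per_week) / len(points_per_week)
    return remaining_points / velocity


# Figures from the text: 18, 14 and 15 points completed in weeks one to three.
completed = [18, 14, 15]
# Assumption: 440 points remain once the first week is done.
remaining = 440

for week, done in enumerate(completed, start=1):
    if week != 1:
        remaining -= done
    estimate = weeks_remaining(completed[:week], remaining)
    print(f"after week {week}: roughly {estimate:.1f} weeks to go")
```
&lt;/pre&gt;

&lt;p&gt;
Each additional week adds one more data point to the average, which is why confidence in the projection grows as the project proceeds.
&lt;/p&gt;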
&lt;p&gt;
We won&#039;t be 100% certain before &lt;i&gt;all&lt;/i&gt; features have been delivered. And that goes for any other estimation techniques, too.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Scene 2: Calibrating the scale&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Now this is all fine and dandy but what if we need to make a prediction about the completion date without the luxury of working on the project first for a few weeks? That&#039;s where &lt;i&gt;calibration&lt;/i&gt; comes into the picture.
&lt;/p&gt;
&lt;p&gt;
Calibration, referring to calibrating the story point scale, is an activity where a small number of items are analyzed more closely - the traditional work breakdown structure approach - in an effort to produce a time-based estimate, for example, in effective engineering hours (also referred to as &lt;i&gt;ideal engineering hours&lt;/i&gt;) or calendar days.
&lt;/p&gt;
&lt;p&gt;
For example, you could take items that are roughly 10 hours, 20 hours, and 30 hours of effort in effective engineering hours. Then you would assign these items a story point-based estimate, effectively &lt;i&gt;anchoring&lt;/i&gt; your story point scale to time. For instance, those three items you picked could be assigned 3, 5, and 8 story points, respectively. It&#039;s not a mathematically solid translation but it&#039;s close enough.
&lt;/p&gt;
&lt;p&gt;
Now, if you need to make that prediction about the completion date, you estimate the product backlog relatively in story points, add it all up, and translate back to calendar time with whatever formula you used for calibration, not forgetting to translate from effective engineering hours to calendar time. What you should forget at this point is the formula – it has served its purpose and is now just mental baggage, seducing our feeble minds to continue translating back and forth between hours and points.
&lt;/p&gt;
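&lt;p&gt;
For a one-off translation like this, the arithmetic might look like the sketch below. The anchor items come from the calibration example above; the averaging formula, the team size, and the number of effective hours per day are illustrative assumptions, and the function names are hypothetical.
&lt;/p&gt;

&lt;pre&gt;
```python
# Calibration anchors from the text: (effective engineering hours, story points).
ANCHORS = [(10, 3), (20, 5), (30, 8)]


def hours_per_point(anchors):
    """Average effective hours per story point across the anchor items."""
    total_hours = sum(hours for hours, _ in anchors)
    total_points = sum(points for _, points in anchors)
    return total_hours / total_points


def working_days(backlog_points, team_size, effective_hours_per_day):
    """Translate a backlog total in points to working days for the whole team."""
    effort_hours = backlog_points * hours_per_point(ANCHORS)
    return effort_hours / (team_size * effective_hours_per_day)


# Hypothetical inputs: a 440-point backlog, four people, five effective hours a day.
print(working_days(440, team_size=4, effective_hours_per_day=5))
```
&lt;/pre&gt;

&lt;p&gt;
Run it once for the prediction and then put it away, for the reasons given above.
&lt;/p&gt;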
&lt;p&gt;
But doesn&#039;t this mean that the story points do relate to time after all? Where&#039;s the catch?
&lt;/p&gt;
&lt;p&gt;
Yes, there &lt;i&gt;is&lt;/i&gt; a relation between story points and time. I apologize for claiming otherwise. I would like to point out, however, that I did it with good intentions. You see, whenever we deal with story points we should think in terms of story points and relative estimates, not making the mental translation back and forth between hours and points. When we start doing that, we start losing the advantage of relative estimation. The &lt;i&gt;neurolinguistic programming&lt;/i&gt; we do with &#034;points&#034; versus &#034;hours&#034; is needed in order to help us get to (and stay in) the relative estimation mindset.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Scene 3: Scales come in different shapes&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
In the calibration example above, I suggested that the scale was set with 3, 5 and 8 story points for items of an estimated 10, 20, and 30 effective engineering hours, respectively. Why that mapping? Wouldn&#039;t it be easier to just keep the numbers and rename the unit to &#034;story points&#034;?
&lt;/p&gt;
&lt;p&gt;
First of all, that whole NLP stuff isn&#039;t just fluff. &#034;Just&#034; calling them story points doesn&#039;t do the trick. It&#039;s not enough to turn that mental switch from hours to points. Some kind of a (preferably non-trivial) translation is necessary. Second, the larger the numbers get the more difficult it becomes for our relative comparison to work effectively. We&#039;re really good at telling whether item A is twice or three times the size of item B. We suck at telling whether item C is 9 or 42 times the size of item D. We need a small scale to make this work. As a rule of thumb, &lt;i&gt;anything with two digits is already too big&lt;/i&gt;.
&lt;/p&gt;
&lt;p&gt;
The mapping from 10 to 3, 20 to 5, and 30 to 8 was not purely coincidental either. I could&#039;ve just divided everything by four and gotten 2.5, 5 and 7.5 – nice and small numbers between 1 and 10. Floating-point numbers are too complicated, though, so that wouldn&#039;t work. I also could&#039;ve divided by four and rounded to, say, 2, 5, and 8. But I didn&#039;t. Why?
&lt;/p&gt;
&lt;p&gt;
First of all, that would&#039;ve been quite all right. I just happen to like the practice of making all story point estimates fit into the &lt;i&gt;Fibonacci sequence-based scale of 1, 2, 3, 5 and 8&lt;/i&gt;. It might seem like a small difference but I&#039;ve found that a more limited scale like this (compared to the full scale from 1-8) helps me think in relative terms when I can&#039;t assign a &#034;4&#034; or a &#034;6&#034; to a story. Besides, the increasing gap between the available values quite nicely represents the increase in uncertainty when estimating bigger items. Many other teams and consultants have found this scale useful, too. It&#039;s not something I&#039;d fight for but I like it and recommend it.
&lt;/p&gt;
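&lt;p&gt;
One way to arrive at the 3, 5 and 8 mapping mechanically is to divide the hours by four and snap the result to the nearest value on the Fibonacci-based scale. The sketch below does that; the function name and the rule that ties go to the larger value are assumptions made for illustration, and the post does not claim the mapping was derived this way.
&lt;/p&gt;

&lt;pre&gt;
```python
FIBONACCI_SCALE = [1, 2, 3, 5, 8]


def to_story_points(hours, divisor=4, scale=FIBONACCI_SCALE):
    """Scale hours down, then snap to the nearest value on the scale."""
    raw = hours / divisor
    # Assumption: ties go to the larger value, erring on the side of caution.
    return min(scale, key=lambda value: (abs(value - raw), -value))


# The three calibration items from the text: 10, 20 and 30 hours.
print([to_story_points(hours) for hours in (10, 20, 30)])
```
&lt;/pre&gt;

&lt;p&gt;
Anything larger than the top of the scale simply snaps to 8 here, which is a sign that the item should be split rather than estimated.
&lt;/p&gt;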
&lt;p&gt;
My second favorite would be something like a scale limited to 1, 2, 4 and 8. This would be simpler but for some reason I&#039;m not sure I&#039;m fully aware of, I prefer the Fibonacci sequence. It probably has to do with what my colleague &lt;a href=&#034;http://kijoe.blogspot.com/&#034;&gt;Jukka&lt;/a&gt; pointed out – the Fibonacci sequence has slightly smaller gaps and, given a backlog item to estimate, it&#039;s that much less likely to fall in between the available values compared to the 1-2-4-8 scale. In other words, the gaps of the Fibonacci sequence just feel right for me.
&lt;/p&gt;
&lt;p&gt;
&lt;b&gt;Scene 4: The note&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
Remember that old note on an index card I mentioned in the beginning of this rather lengthy blog post? It related to a team using story points for estimating their sprint backlog and I had something to say about that practice. Here&#039;s that something.
&lt;/p&gt;
&lt;p&gt;
Most consultants and practitioners recommend using story points for estimating the product backlog and effective engineering hours for estimating the sprint backlog. Myself included. Why is that? As I mentioned earlier, one of the major advantages of estimating relatively in story points is that we can do it quickly for even a large list of backlog items because we&#039;re not entrenching ourselves in the nitty-gritty details of what a given backlog item entails in terms of implementation - the rough estimates in the product backlog are quite sufficient because all of those variances cancel each other out in the long run.
&lt;/p&gt;
&lt;p&gt;
For a sprint backlog of, say, two weeks of work, that canceling out doesn&#039;t happen quite as effectively as it does for several months of work. In other words, when we&#039;re estimating the sprint backlog - trying to figure out whether we can deliver a given set of backlog items in those two weeks or not - we benefit from more detail, more analysis, more effort put into the estimation. While the story point estimates for our product backlog items indicate how much we &lt;i&gt;tend to&lt;/i&gt; deliver in two weeks, time-based estimates increase the &lt;i&gt;certainty&lt;/i&gt; of that tendency. For many teams, this increase feels significant enough to justify the higher effort.
&lt;/p&gt;
&lt;p&gt;
Some teams, however, are doing great by using story points for estimating their sprint backlog, too. Having broken down backlog items into tasks and established (anchored) a scale for technical tasks, they routinely run through the sprint backlog, throwing tasks into buckets according to their &#034;task point&#034; scale. That scale is typically different from the one used for the product backlog, mind you. Using the same scale would practically push the story point values beyond the comfort zone of single-digit estimates and could possibly bias the estimates given for technical tasks. We (humans) are good at making the world fit a pattern. That&#039;s also true for making task estimates add up to story point estimates so I recommend sticking to a different scale. 
&lt;/p&gt;
&lt;p&gt;
While I do prefer and recommend keeping story points and relative estimation in the realm of the product backlog, I have seen relative, point-based estimation work quite well for sprint backlogs, too. Just ensure that the points are small enough – most tasks should be doable in less than a day. I know it can work and if it does, it&#039;s quite a breeze to do sprint planning (assuming that there&#039;s no bottleneck in, say, access to information about the problem domain or in our understanding of the code base). Still, I acknowledge that it&#039;s more intuitive and probably - on average - more accurate to estimate the technical tasks of a sprint backlog in terms of engineering hours.
&lt;/p&gt;
&lt;p&gt;
There. I think I&#039;ve said everything I wanted to. Except the things I forgot already. Well, that&#039;s what the blog comments are for...
&lt;/p&gt;
        </description>
      
      
    
    
    
    <comments>http://radio.javaranch.com/lasse/2008/04/17/1208381586654.html#comments</comments>
    <guid isPermaLink="true">http://radio.javaranch.com/lasse/2008/04/17/1208381586654.html</guid>
    <pubDate>Wed, 16 Apr 2008 21:33:06 GMT</pubDate>
  </item>
  
  </channel>
</rss>
