Points vs Hours

by Remco 8. December 2010 21:14

As a travelling developer, I often get exposed to different estimation practices.  These have ranged everywhere from using the most basic ‘finger in the wind’ tactics to calculations so complex that trying to use them without spreadsheets or software would be impossible.

It shouldn’t be a surprise that developers are so interested in estimation practices.  Estimates are incredibly important.  It isn’t unusual for the most brilliantly engineered piece of software to be considered a catastrophic failure – just because the estimate went wrong.  Or for the most disgusting piece of trash to be considered a glorious achievement – just because the estimate was right.

Almost every developer has an opinion about which estimation practices work best for them.  I’m definitely no exception to this – and I often find myself falling into ‘philosophical discussions’ about which approaches are better.  By far the most common arguments I find myself in revolve around comparisons between time based estimates and relative sizing – also known as ‘points vs hours’.

Personally, I am very much in favour of estimates created using relative sizing.  As I find myself often repeating my reasons for preferring this method, I’ve listed them here.  I hope someone will find this useful.

 

Relative sizing is more accurate due to better accounting for variables

Any kind of time based estimate is affected by a number of different variables.  Examples of some variables that can impact time estimates are as follows:

- Ability/experience of individual developers
- Development team's understanding of the business domain
- Strength and quality of communication between the development team and the business
- Technical debt
- Motivation
- Stability/clarity of software requirements
- System and domain complexity
- Cooperation level of outside dependencies (e.g. other technical/operations teams involved in the project)
- Specification of the development team's hardware
- Amount of available caffeine

... And above all else:

- Task size

The interesting thing about estimates is that except for task size, most of these other variables have the potential to remain relatively constant over the life of a project.  While variables such as technical debt can increase (given neglect), or decrease (given close attention), it is often quite safe to assume that a complex domain will remain complex, unstable requirements will continue to be unstable, and the coffee supply will never be quite what we need it to be!  Variables that do change will often do so in a predictable way or will otherwise have only a small impact in the grand scheme of an estimate (at least, when compared to Task size).

Therefore, to simplify our estimation, we can use the following formula:

Effort = Size * Everything else

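Since 'Everything else' is essentially an aggregate of all things impacting the speed of our development, we can fold it into a single figure: Velocity, the amount of Size a team completes per unit of time.  A higher Velocity means less Effort for the same Size, so we can safely rephrase our formula as:

Effort = Size / Velocity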

By separating size from velocity, it becomes possible to track both figures separately and account for their uncertainty using different methods.  Unlike Size, Velocity has a tendency to be far more consistent over the life of a project.  This means it is possible to take the recorded Velocity from a previous sprint, and use it to calibrate estimates for future sprints.  This can remove a large amount of judgement and human error from what would otherwise be a very unstable mental calculation.
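As a rough illustration of this calibration, the sketch below derives Velocity from a finished sprint and uses it to project the time needed for upcoming work.  It assumes Velocity is measured in points completed per working day (as in the worked example later in this post), so projected time is size divided by velocity.  The function names and figures are made up for the example.

```python
def velocity(points_completed: float, days_worked: float) -> float:
    """Points completed per working day in a finished sprint."""
    if days_worked <= 0:
        raise ValueError("days_worked must be positive")
    return points_completed / days_worked


def projected_days(size_in_points: float, points_per_day: float) -> float:
    """Project working days for a given size using a known velocity."""
    if points_per_day <= 0:
        raise ValueError("points_per_day must be positive")
    return size_in_points / points_per_day


# Hypothetical figures: last sprint delivered 50 points in 10 working days.
last_sprint_velocity = velocity(points_completed=50, days_worked=10)

# Upcoming work sized (relative to past tasks) at 30 points.
print(projected_days(30, last_sprint_velocity))  # 6.0
```

The only judgement call left to the estimator is the relative size; the speed comes from a recorded figure.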

When attempting to perform estimates directly using hours, it's important to consider that this formula still applies to the estimation process.  It's just hidden away.  When trying to assign an estimate to a task, the developer will simply be making a judgement call based on their experience of working within the environment (or similar environments), then applying this to the size of the task to produce a result.  I'm not saying that this result is necessarily wrong, but it is mentally much more difficult to produce and can be very hard to repeat consistently - especially when different people are responsible for different sets of estimates.

The act of transferring Velocity between sprints should never be completely free from adjustment by judgement.  Often a team will have an accurate collective opinion about when they expect their Velocity to change.  Involving the team in this adjustment will help them avoid individually considering Velocity when they are estimating Size.
 

Relative sizing allows analysis of overall productivity of the entire team

Because estimation in points allows Velocity to remain separate, it can be used as a reliable performance indicator.  When estimating in hours, teams that run into difficulty with technical debt or unstable requirements partway through a project can often remain blind to statistics indicating that their performance has become impeded, as they are all busy unconsciously recalibrating their hourly estimates with their perceived Velocity change.  While the team is almost certainly aware of the impediment, it can be very difficult to quantify or explain to project stakeholders.  This can be a source of incredible frustration should the impediment be outside the control of the team.

Productivity metrics can also be very useful for examining the impact of introducing new team members onto a project.  All projects have a critical point at which adding more people will only reduce Velocity rather than increase it.  Visibility of this is very important when making management decisions on how to best accelerate development.


Relative sizing provides a uniform way to compare productivity between different projects, helping to identify opportunities and benchmarks

Where organisations can create consistency in their estimation practices across projects, it opens up a wealth of useful statistical information that can be used for the calibration of future estimates where many of the variables would otherwise be unknown.  A consultancy pitching for a project against competitors will have a major advantage should they be able to accurately estimate the amount of effort required to complete a project up front.  Trying to create these estimates based purely on the judgement of individuals within the consultancy can be an extremely stressful and potentially disastrous exercise.  Statistical information on velocity of previously completed projects can provide an effective double-checking mechanism that can sound alarm bells long before a project is even started.

Furthermore, comparison of Velocity between project teams can be used to help identify the impact of tools and techniques that may improve the performance of the whole organisation.


Relative sizing allows estimation practices to be consistently reproduced by other team members who do not have extensive experience with project complications

Creating accurate estimates where velocity is left entirely to judgement requires a great deal of experience within the development and business environment.  Developers who do this with consistency usually have developed a 'gut feel' for how long tasks tend to take.  This experience is very difficult to replace or replicate, and can take many months to transfer.

Where velocity has been tracked and accounted for, the act of estimating time simply requires a relative comparison of upcoming work vs work that has been completed.  It is far easier for a new developer on a team to perform accurate relative comparisons of tasks than it is to estimate the work speed of the entire team.  The velocity itself is simply a number that can be passed from one person to the next.

In my experience, it is normal for developers to be able to share individually consistent and accurate estimates within two weeks of joining a project - where the team's velocity is known.


Relative sizing reduces deterioration of estimates over time (due to velocity shifting)

As the project environment changes, the projected working speed of a team needs to be regularly adjusted.  When velocity is left entirely to judgement, it is very easy for the 'gut feel' estimates of team members to deviate from the real world environment in which work is being done.  While any experienced team member will eventually realise that there has been a change in velocity, it can often take several missed objectives or deadlines before this truly becomes clear.

By tracking velocity separately from the size of work done, changes in the working environment are made much clearer much earlier.  These issues are also more easily raised with business owners, giving a development team more bargaining power.

Also, it is common for estimates to be constructed well in advance of the work being done (often weeks, or even months).  These estimates will degrade much more slowly if they are recorded using relative sizing, as a team's velocity is always likely to change over time.
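To illustrate why a size recorded in points ages better than one recorded in hours, here is a small sketch.  The figures, the 8 hour working day, and the function name are all assumptions for the example: a task is estimated months ahead while the team runs at 5 points per day, and the work starts after velocity has dropped to 4 points per day.

```python
HOURS_PER_DAY = 8  # assumed length of a working day


def hours_for(points: float, points_per_day: float) -> float:
    """Convert a relative size into hours using the velocity in force today."""
    return points / points_per_day * HOURS_PER_DAY


task_points = 10

# Estimate recorded directly in hours at planning time (velocity was 5/day).
hours_recorded_at_planning = hours_for(task_points, points_per_day=5)

# Months later velocity is 4/day; the points are unchanged, so re-project.
hours_reprojected_today = hours_for(task_points, points_per_day=4)

print(hours_recorded_at_planning)  # 16.0
print(hours_reprojected_today)     # 20.0
```

The stored hour figure is now 4 hours short, whereas the stored point figure only needs the current velocity applied to it.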



Relative sizing allows for easier handling of unplanned work that is raised during a sprint

While methodologies such as Scrum attempt to be very strict with locking down a sprint's scope in advance of work being started, any experienced developer knows that the real world is very different.  We live in a world where requirements are constantly changing and mid-sprint impediments can appear with little to no warning.  Unplanned work is a major risk in any project, and it needs to be properly accounted for.

When estimating time to complete a task without accounting for velocity, it often becomes the habit of a team to try and account for unplanned work by 'padding' effort estimates to leave space for the unexpected.  This padding is completely subject to individual judgement and can carry with it a large degree of inaccuracy.

Even the riskiest project will have statistical patterns of unplanned work.  Over time, these patterns can be tracked, providing developers with useful information with which to construct future estimates.  It is not unusual for a planned sprint to have as much as 50% of its work appear midway and be completely unplanned.  However, it is also not unusual for the next 3 sprints following this one to have the same percentage of unplanned work.

Accounting for unplanned work separately from planned work, and ensuring that the estimates against tasks are performed using a relative sizing system, creates a flexible and accurate way to fill a sprint's capacity without trying to align all planned work to a standard 40 hour working week.

For example, let's say we have a single developer working a standard 80 hours over a two week sprint.  At the beginning of the sprint, the developer identifies 4 tasks that will need to be completed during the sprint.  Each task is estimated at 20 hours.

Partway through the first week, the product owner applies pressure to the developer to include an additional task estimated at 10 hours of extra work.  Having anticipated that some unexpected work would appear during the sprint, the developer includes the unplanned task in the sprint and hopes that the blunt estimates given for the planned work will be enough to cover for the extra load.

Partway through the second week, the product owner introduces another unplanned task estimated at 10 hours.  While uneasy about overloading the sprint, the developer accepts the unplanned task, warning the product owner that this may have an impact on existing commitments.  Without any solid figures to clarify the impact on the planned work, no long term plans are adjusted.

Finally, all tasks are completed on time - at the expense of 10 hours of overtime.


This could be compared to a situation as follows:


The same developer works a standard 80 hour sprint over two weeks.  Based on the previous sprint done, the developer knows that he has an individual output velocity of 5 'points' per day of effort.  This would give a total working capacity of 50 points over the entire sprint.  The developer also knows that from the previous sprint, 10 points of the work done came from unplanned tasks.  With this knowledge, he is able to confidently commit himself to 40 points worth of planned tasks.

Partway through the first week of the sprint, the product owner introduces an unspecified but not entirely unexpected change to the sprint - introducing an additional task with a size of 10 points.  The developer is able to confidently commit to the new task and continues with the sprint.

Partway through the second week of the sprint, the product owner attempts to introduce another task with a size of 5 points.  The developer knows that the sprint is already at full capacity, and is able to provide solid information to the product owner to back his position.  In doing so, the developer can force the product owner to choose between the new task introduced and the other tasks in the sprint, with full understanding of the impact.

Had the developer lacked solid information on his velocity or on unplanned vs planned work, he would have been unable to confidently hold his position when negotiating with the product owner.  He would likely have either worked overtime or unreasonably kept work out of the sprint that could otherwise have been completed on time.
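The second scenario can be sketched in a few lines of code.  This is only an illustration of the arithmetic, not a prescribed tool: the SprintBudget class and its method names are hypothetical, and it assumes the unplanned reserve is simply carried over from the previous sprint.

```python
from dataclasses import dataclass


@dataclass
class SprintBudget:
    """Tracks planned and unplanned point budgets for one sprint."""
    points_per_day: float
    days: int
    unplanned_reserve: float  # points set aside, based on the previous sprint
    planned_committed: float = 0.0
    unplanned_accepted: float = 0.0

    @property
    def capacity(self) -> float:
        return self.points_per_day * self.days

    @property
    def planned_limit(self) -> float:
        return self.capacity - self.unplanned_reserve

    def commit_planned(self, points: float) -> bool:
        """Accept planned work only while it fits under the planned limit."""
        if self.planned_committed + points > self.planned_limit:
            return False
        self.planned_committed += points
        return True

    def accept_unplanned(self, points: float) -> bool:
        """Accept mid-sprint work only while the reserve can absorb it."""
        if self.unplanned_accepted + points > self.unplanned_reserve:
            return False
        self.unplanned_accepted += points
        return True


sprint = SprintBudget(points_per_day=5, days=10, unplanned_reserve=10)
print(sprint.capacity)              # 50
print(sprint.commit_planned(40))    # True
print(sprint.accept_unplanned(10))  # True  (first week's extra task)
print(sprint.accept_unplanned(5))   # False (second week's extra task)
```

The False result is the "solid information" the developer takes to the product owner: the new 5 point task only fits if something else of at least that size leaves the sprint.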


Productivity is not consistent between developers - estimating in hours makes it appear so

One developer is not equal to another.  All developers have different areas where their knowledge or experience is stronger than others.  When asking developers for estimates in time, they will often provide an estimate that seems realistic for them doing the work themselves.

This can be discouraging (and damaging to commitments) for other developers who may feel the task would take longer were they to pick it up themselves.  This problem tends to be magnified where one developer (such as a team leader) is responsible for creating estimates, and others are responsible for doing the work.

By encouraging an estimation practice that is based on relative sizing rather than time, developers are far more able to disassociate their own work speed from the estimate.  The relatively sized estimates can then be calibrated to produce a time based commitment by using the team's entire velocity, rather than the velocity of an individual.
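A small sketch of that calibration step follows.  The developer names, the per-developer figures, and the 90 point backlog are invented for illustration; the point is only that the time commitment is derived from the team's combined velocity rather than from any one person's pace.

```python
import math

# Hypothetical points completed per developer over the last 10-day sprint.
points_last_sprint = {"alice": 60, "bob": 40, "carol": 20}
SPRINT_DAYS = 10

team_points_per_day = sum(points_last_sprint.values()) / SPRINT_DAYS  # 12.0


def days_for_backlog(backlog_points: float, points_per_day: float) -> int:
    """Whole working days needed for a backlog at the given velocity."""
    return math.ceil(backlog_points / points_per_day)


backlog = 90  # sized by relative comparison, not by any one person's speed

print(days_for_backlog(backlog, team_points_per_day))  # 8
# Projecting the same backlog as if all three developers worked at the
# fastest individual's pace (6 points per day each, 18 in total) would
# suggest 5 days, a commitment the team as a whole could not meet.
```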

Avoiding estimates in time can also help to avoid situations where developers feel their time is being spent unproductively because the hours they spend on a task do not match the estimate.  It is harder to feel the day has been constructive after spending 8 hours on a 4 hour task than after spending an entire day on a 3 point task when your projected velocity averages 6 points per day.  Estimations for time spent solving problems are usually only accurate when grouped together as a whole - estimating directly in time tends to discourage this way of thinking.


Hours are political, 'points' are statistics

Where possible, it can be hugely advantageous to have project stakeholders understand how relative sizing and velocity calculations work.  This can help them to avoid falling into the "Mythical Man Month" trap of believing that time spent = productivity.

Clients that view a sprint backlog as a set of relatively sized tasks will be far less likely to believe that the newest developer on a team will have the same productive output as the team's longest standing member.  They'll also be much more understanding of the impact of overtime on a team's velocity (the longer you work, the slower you go!).

By projecting a team's estimates directly in time without accounting for velocity, you are effectively signing direct commitments with your project stakeholders purely through your own judgement.  Everyone knows what an hour is.  Everyone has expectations on how many hours fit within a week.

I'm not saying that you should avoid making commitments in hours.  There is nothing wrong with making commitments in hours, as long as the estimates themselves weren't made in hours.

Realistically, many clients will quickly become suspicious (or even aggressive) if you start giving them every estimate in points.  This is quite easy to understand given that a 'point' only has meaning to people that have been exposed to the inner workings of your team for a certain length of time.  Corporate budgets often need to justify the cost of large projects well in advance, and often this can only be done using time-based costing measurements.  Therefore I would suggest that early in a project, it is better to estimate internally within the team (using relative sizing), then publish your commitments to project stakeholders as time.  As trust is built with the client and their understanding of your team's internal working practices improves, you may have more flexibility about how you communicate your estimates.

 

About Me

I'm Remco Mulder, the developer of NCrunch and a code monkey at heart.  I've spent the last decade consulting around Auckland and London, and I currently live in New Zealand.  Interests include writing code, writing tests for code, and writing more code!