Developing an Agile Planning and Tracking Scorecard

Agile changes the nature of planning and tracking. The term agile is used to refer to a variety of software development frameworks (like scrum, XP, crystal methods) which all share approaches to scoping work and managing the delivery of working features.

Using scrum as an example (Figure 1), it is clear that the available development capacity in resources and schedule is assessed and considered fixed at the outset. That assessment guides the scope of features promised for a delivery cycle. This is in contrast to plan-driven waterfall development (at least in the extreme), where requirements scope is locked down up-front, guiding the prediction of resources and schedule (Figure 2).

Figure 1: Illustration of Scrum Planning and Delivery Cycle

Figure 2: Comparing Predictive (Waterfall) and Agile (Adaptive) Methods
	Plan-Driven	Agile
Fix Up Front	Scope	Cost, Schedule
Negotiate and Manage	Cost, Schedule	Scope

Product Backlog: The Agile Front End – An important part of agile planning is the maintenance of a set of features worth implementing. Prioritizing this set based on prospective value and risk (to the customer and/or the emerging system) guides a team in planning an incremental release cycle. Agile emphasizes ongoing, flexible planning over a large rigid plan. Managing in such an environment involves finding ways to make that flexibility more of an asset than a liability. A simple scorecard can help organize and guide that work.

A Simple Planning and Tracking Scorecard – In keeping with the spirit of seeing and removing waste, a scorecard cannot impose more overhead than it pays back in efficiency gains over time. The card illustrated (Figure 3) can be varied as needed to tailor its use and payback to particular development situations. The points made here, though, should help in assessing the prospective fit and deciding which customizations make sense.

Figure 3: Planning and Tracking Scorecard

Section I: Development and Environment Productivity Factors

Figure 4: Ideal Time Versus Calendar Time

Work days per month simply accounts for weekends and holidays. Ideal time rate (sometimes called “ideal time”) tracks the proportion of time that team members get to focus and work on “what they were hired to do – on the project at hand.” Interruptions, other work, meetings on other topics, etc., all reduce ideal time to just a few hours a day for most people (Figure 4). Note that this number is governed more by management and the work environment than team members themselves. As people are paid in person-months of calendar time, this work rate is one handle on cost efficiency.
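The relationship between calendar time and ideal time can be sketched in a few lines. This is a minimal illustration, not part of the scorecard itself; the function name and the sample values (21 work days, 3 focused hours out of an 8-hour day) are assumptions chosen only for the example.

```python
def ideal_days_per_month(work_days_per_month: float, ideal_time_rate: float) -> float:
    """Ideal days one person contributes in a calendar month.

    ideal_time_rate is the fraction of paid time spent on the project
    at hand (0.0 to 1.0).
    """
    if not 0.0 <= ideal_time_rate <= 1.0:
        raise ValueError("ideal_time_rate must be between 0 and 1")
    return work_days_per_month * ideal_time_rate


# Hypothetical values: 21 work days in the month, 3 focused hours per 8-hour day.
rate = 3 / 8
print(ideal_days_per_month(21, rate))  # 7.875
```

With these assumed numbers, a person paid for 21 days delivers fewer than 8 ideal days, which is why the rate is a useful handle on cost efficiency.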

Sizing or Scoping Productivity

The top-level goal of quickly delivering working features suggests a measure of features delivered, but there is, of course, more to that than meets the eye. Counting features would be simple, but would treat all features as being created equal in complexity, risk and required skill and effort. Some notion of sizing or scoping is needed in order to give credit where it is due. Without touching off an age-old debate about software sizing, perhaps it can be agreed that scoping is a more general, and perhaps more useful, notion than sizing. Size implies measures of a finished work product, like source lines of code (SLOC), which is a narrow property of already-written code, in languages where lines are available to count. Scope is something that can be assessed with emerging requirements, and it applies to work and work products that do not reduce to line counting.

Story point scoping is based on what an average developer could accomplish in one ideal day. At the bite-sized level of a single requirement or feature, it is not too hard to get a reasonable estimate of scope in story points (SP). Precision in the absolute is not necessary, as long as an unbiased estimate, relative to other familiar work, is possible. Any estimating errors in the aggregate tend to wash out. Story points per ideal day indicates a team or organization’s planned performance in relation to the average. Adjusting this factor by plus or minus 25 percent or so from the average (0.75 to 1.25) may help the planning for a particular case. Some pros and cons of prospective sizing or scoping methods are outlined in Table 1.

The bottom line is, the activity of scoping can provide:

A reality check on requirements understanding. (If we can’t scope it, do we really understand it?)
A running check on scope change. (Features added or removed show up clearly in size-change logs.)
A basis for tracking productivity improvements. (An improving development environment will deliver more valuable size per unit of time and effort. Without size tracking, no one gets credit for the reduced time and effort – the reward is usually more work.)

Table 1: Some Approaches to Sizing or Scoping
Sizing or Scoping Method	Approach	Pros and Cons
Source Lines of Code (SLOC)	Count lines produced in source code. Counting gets tricky when new code is added to an existing code base.	Easy to automate, but relies on a lines-oriented language and tracks a property that may have little to do with delivered value. Long history may provide useful benchmarks, even for teams working without lines of code.
Function Points	Assess the requirements from the perspectives of inputs, outputs, inquiries and other properties that gauge the complexity and scale of the work required.	Requires training, calibration and perhaps tailoring to specific application domains. Not inherently suited for all kinds of software (e.g., real-time systems). Broad history provides baselines and benchmarks.
Story Points	While definitions may vary, a common one is that a story point is what an archetypal developer can accomplish in about one day of uninterrupted time – focusing on what they were hired to do (an ideal day).	New method, small base of experience and benchmarks. Simple to understand and to do.

Section II: Iteration Parameters

Team size, iterations and iteration length assess available resources and schedule. These factors, together with the base productivity and ideal time rates in the scorecard cells above, dictate the maximum capacity available for a project being scoped. As that capacity is expressed in the same units as the requirements scoping (story points here), it is easy to begin considering which and how many features to sign up for. A percent capacity promised factor may make sense to back off a bit from the maximum, allowing for reality and unforeseen demands. The feature backlog at the end of that scorecard section represents the capacity-based, reasonable promise suggested by this planning section. Planning a small piece of work like this, based on what a team can actually see itself accomplishing, is in contrast to the “big bang” project that tries to understand and promise the world and then stretches resources to deliver.
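The capacity arithmetic described in this section can be sketched as follows. This is one plausible reading of how the scorecard cells combine, not the author's definitive formula: the class name, field names, the greedy backlog selection and all sample numbers are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class IterationPlan:
    team_size: int
    iterations: int
    work_days_per_iteration: float
    ideal_time_rate: float            # fraction of calendar time, 0.0 to 1.0
    sp_per_ideal_day: float           # 1.0 = average; roughly 0.75 to 1.25
    percent_capacity_promised: float  # fraction of maximum to promise

    def max_capacity_sp(self) -> float:
        """Maximum story points the team could deliver over all iterations."""
        ideal_days = (self.team_size * self.iterations
                      * self.work_days_per_iteration * self.ideal_time_rate)
        return ideal_days * self.sp_per_ideal_day

    def promised_sp(self) -> float:
        """Capacity after backing off from the maximum."""
        return self.max_capacity_sp() * self.percent_capacity_promised


def select_backlog(features, capacity_sp):
    """Walk a prioritized list of (name, story_points) pairs and keep each
    feature that still fits in the remaining capacity (assumed greedy rule)."""
    chosen, used = [], 0.0
    for name, sp in features:
        if used + sp <= capacity_sp:
            chosen.append(name)
            used += sp
    return chosen, used


# Hypothetical plan: 5 people, 3 iterations of 20 work days, half of time ideal.
plan = IterationPlan(5, 3, 20, 0.5, 1.0, 0.8)
backlog = [("login", 50), ("search", 60), ("reports", 20), ("export", 10)]
print(select_backlog(backlog, plan.promised_sp()))
```

A team might prefer to stop at the first feature that does not fit rather than skip it and continue; that choice depends on how strictly the backlog priority order is to be honored.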

Section III: Execution Metrics

Duration (months), ideal days committed, effort (person-months) and iterations track the planned and actual values for those important project cost and cycle time measures. Working features delivered (SP) measures forecast and actual delivered work product output. Here the value of story points as a handle on delivered scope becomes apparent: were a company just to claim done-ness or count features, it would not truly understand how much was accomplished.

About Defects – While agile approaches seek to reduce the incidence and impacts of defects, defects may still be a big component of wasted time and cost. Figure 5 illustrates an assessment of time intervals versus the way they were spent. As rework, which captures time spent finding and fixing defects, is a significant component of waste for this group, a scorecard section tracking those times makes sense. If significant waste had been driven by other factors, those would, of course, call for scorecard visibility.

Defining Defect Terms – To be clear about the use of terms, “defects” are anything unacceptably missing, wrong or extra that is handed off for further work or released to internal or external customers. As Figure 6 illustrates, things missing, wrong or extra are surely discovered and fixed in the course of one or more iterations, before the work product is declared done. Those are called “mistakes” and are generally not tracked.

Issues found during the review of a finished (but not handed-off) work product are called “errors” to contrast their smaller cost and impact with defects, which pass through review and become embedded in future iterations and/or a release.

In the scorecard, the count and repair time of pre-release defects and errors track the number of significant issues and the time spent finding and fixing them. The released defects: count and repair time cells do the same for released defects.

Section IV: Productivity Measures

Ideal time rate uses the iteration experience to report the way that ideal time actually worked out as a percent of calendar time. Decreases in this number can signal waste at the global level that will dampen the best development efforts in the future.

Measuring Delivered Features Rate – A scope measure like story points or function points is a good start on measuring the rate at which working features are delivered – but still not enough to tell the whole story. Depending on the perspective of the scorecard user, three related measures of delivered feature productivity can be called for:

Velocity (SP per iteration) is a measure that customers care about. Their bottom line is often related to what they get over a certain interval (caring little about how many people it took or how much it cost).
SP per person-month is a measure that the business cares about – tracking the cost associated with delivered functionality.
SP per ideal day is a measure that developers care about – removing the waste imposed by their work environment and highlighting their productivity during the time they allot to do what they do best.
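The three delivered-feature rates above divide the same story point total by different denominators. A minimal sketch follows, with a hypothetical function name and sample figures that are not taken from the scorecard:

```python
def delivered_feature_rates(sp_delivered: float, iterations: int,
                            person_months: float, ideal_days: float) -> dict:
    """Return the three delivered-feature rates for one reporting period."""
    return {
        # What customers see: output per delivery interval.
        "velocity_sp_per_iteration": sp_delivered / iterations,
        # What the business sees: output per unit of paid effort.
        "sp_per_person_month": sp_delivered / person_months,
        # What developers see: output per unit of focused time.
        "sp_per_ideal_day": sp_delivered / ideal_days,
    }


# Hypothetical project: 120 SP over 3 iterations, 15 person-months, 150 ideal days.
rates = delivered_feature_rates(120, 3, 15, 150)
print(rates["velocity_sp_per_iteration"])  # 40.0
print(rates["sp_per_person_month"])        # 8.0
print(rates["sp_per_ideal_day"])           # 0.8
```

Keeping all three side by side shows, for example, when velocity rises only because more people were added, since SP per person-month would then stay flat or fall.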

Released defects per SP tracks the rate of defect escape in terms that can be meaningful to the development team and the customer. It is a goodness-of-execution measure for the team (with a fair normalization for the degree of work that was involved) and a goodness-of-experience measure for the customer.

Defect find-and-fix cost (percent of project cost) can be a useful business measure, as it brings waste back into the financial view. This number is a stark reminder of the bottom-line costs of poor quality connected with the project. Of course, two things work together to lower this percentage: reducing the rate of defect insertion in the first place, and finding and fixing those errors that do get into the work closer to their point of origin.
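The two quality measures can be computed from the defect cells in Section III. The sketch below assumes that project cost is approximated by total effort in days, which may not hold where costs other than labor are significant; function names and sample values are hypothetical.

```python
def released_defects_per_sp(released_defect_count: int, sp_delivered: float) -> float:
    """Defects that escaped to release, normalized by delivered scope."""
    return released_defect_count / sp_delivered


def defect_cost_percent(pre_release_repair_days: float,
                        released_repair_days: float,
                        total_effort_days: float) -> float:
    """Find-and-fix time as a percent of total effort.

    Assumption: effort in days stands in for project cost.
    """
    repair_days = pre_release_repair_days + released_repair_days
    return 100 * repair_days / total_effort_days


# Hypothetical project: 6 released defects on 120 SP; 20 + 10 repair days
# out of 300 total effort days.
print(released_defects_per_sp(6, 120))   # 0.05
print(defect_cost_percent(20, 10, 300))  # 10.0
```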

Checking the Fit with Individual Situations

Every company’s view of agile may be different enough to call for different scorecard content and use. The hope is that the issues considered here have illustrated some useful ways that agile projects can use available measures to guide planning and ongoing closed-loop learning about the impacts and lasting value of process improvements to be made.

The author acknowledges the help of Ken Schwaber’s book Agile Project Management with Scrum and the book Agile Software Development with Scrum by Mr. Schwaber and Mike Beedle.
