Write actionable backlog items
It’s easy, when backlogging your own product’s issues, to simply document the existence of the issue: “The app sometimes throws the error
Cannot access property .includes of undefined.” It’s easy, when customers have the ability to open backlog items against your product, for them to do the same.
An actionable backlog item has a handful of important qualities. The problem with the above example is that you and your team have nowhere to start. The issue isn’t reproducible; the root cause is unknown and undiscoverable. What happens when your team opens this item and simply cannot reproduce it? They’ve wasted their time, and the item goes nowhere. Further, there is no definition of done: It’s entirely possible that some edge case gets addressed somewhere in the app, but what if that wasn’t the error impacting the customer? The error still exists in production, impacting customers, but no longer sits in the backlog because the engineering team believes it to have been addressed.
I was recently asked to share the issue template I use when creating tasks, but I want to break it down a bit so that it can be better harnessed to your team’s needs.
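As a rough sketch (the exact wording is illustrative; each section is unpacked in the rest of this post), such a template might look like:

```
Problem statement:
  <the high-level problem to solve, not the requested implementation>

Current behavior:
  <what the product does today>

Expected behavior:
  <what the product should do instead>

Acceptance criteria (1–4):
  Given <a situation>
  When <an action occurs>
  Then <expect an outcome>

Monitoring:
  - What metrics or logs could have caught this before a customer report?
  - What new metrics or logs should accompany this change?

Implementation details (optional; subject matter experts):
  <file locations, utilities, dependencies>
```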
Each item should include a problem statement — a high-level problem that needs to be solved. This basic description prevents your team from over-indexing on the wrong thing. A customer may request, “I want the page to scroll to the bottom of the table by default.” Easy enough to implement, right? But what’s really being solved here? If the problem is actually just that they want to see the last table items first, then the ideal solution may be to allow the table to be sorted descending. Having a problem statement allows product owners, developers, and designers to discuss implementations and features with other customers before over-indexing on one customer’s needs. A jump-to-end table would satisfy this customer but upset others who do not want to see the last items first. A sortable table would satisfy everyone’s needs.
In INVEST, this is your customer value.
Current and expected behavior
By describing the current behavior and expected behavior, we offer clarity toward what needs to change. These short synopses allow the task to be negotiated before completion. In the previous example, the expected behavior is for the table to jump to the end. However, in our discussion of the task, we have pivoted: The expected behavior is now for the table to be sortable. In INVEST, this represents that the task is negotiable.
You may even find the definition of the current behavior changing as the task is discussed. An issue may be “When I click the submit button, I get a banner saying an error occurred,” with the expectation being “An error should not occur.” After investigating, you find that the error is accurate — the submitted form contains errors, and the API cannot handle the data. The current behavior, displaying an error to the customer, is not wrong and should not change. The problem here is that the error is too vague. The current behavior is that the error message surfaced to the customer does not provide actionable steps for them to unblock themselves. The expected behavior is for the error provided to be actionable: What form fields need to change and how?
These quick task summaries allow fast task refactors and clarity for operators.
Your acceptance criteria define when the task can be deemed complete. For example, in the previous examples, the task is complete when the error is no longer thrown under reproduction conditions, when the table is sortable, or when the API error banner provides actionable steps for unblocking the customer.
I strongly recommend using behavior-driven structure when filling out acceptance criteria.
Given a situation
When an action occurs
Then expect an outcome
Given I have filled out the form with "test" in the position field
When I click submit
Then an error banner should state that my position must be numeric.
These behavior-driven definitions offer the most clarity for the definition of done. When demoing the project, these are the action items. When unit, integration, or end-to-end testing the product, these are the steps the automated framework should take. In INVEST, this represents testable. While the expected behavior should not be too ambiguous, there should be effectively no ambiguity remaining after the acceptance criteria are written. Once the acceptance criteria are met, there should be complete team and customer consensus that the task is done.
Since these two sections can be hard to distinguish, I will offer a real-world example that occurred recently on my team.
In one product, we query the first 1,000 records of an API and display them in a table. Since this truncates after the 1,000th item, we are only displaying partial data. Thus, sorting the table does not necessarily give accurate results. For example, when querying the data sorted by resource name but sorting the table by memory usage, the customer would see the highest-memory-usage resources from the first 1,000 alphabetical results. What the customer wants to see is the highest-memory-usage resources from all results. This requires that we re-query the API.
The current behavior of the product is that the customer can only see highest-impact resources that occur in the first 1,000 alphabetically. The expected behavior of the product is that the customer should be able to see the highest-impact resources out of all resources they own by re-querying the resources as needed.
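In code, the expected behavior amounts to re-querying on sort rather than sorting the truncated page. A minimal sketch, where `query_api` is a hypothetical stand-in for the real API client and the backend does the sorting over the full dataset:

```python
# Hypothetical API client: the backend sorts the FULL dataset,
# then returns the first `limit` records.
def query_api(sort_by: str, limit: int = 1000) -> list[dict]:
    # Stand-in data: 5,000 resources with pseudo-random memory usage.
    data = [{"name": f"res-{i:04d}", "memory": (i * 37) % 5000} for i in range(5000)]
    return sorted(data, key=lambda r: r[sort_by], reverse=(sort_by == "memory"))[:limit]


def on_column_sort(column: str) -> list[dict]:
    # Re-query the API instead of sorting the already-truncated page client-side.
    return query_api(sort_by=column)


top = on_column_sort("memory")
```

The key design point is that the sort interaction triggers a new query; sorting only the 1,000 records already in the browser can never surface the true top results.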
This works. It was correctly defined. It was actionable. The engineers acted on it and delivered. This task did not use the behavior-driven acceptance criteria, though, so when we came to demo the project, we found a second sorting mechanism on the table. Instead of sorting the table by column to harness this functionality (which still sorted only the first 1,000 alphabetical results), there was an additional sorting column drop-down that would re-query the API based on the selected column name in the drop-down. It worked. It solved the customer need. It provided value. Scrum really delivered here. We were able to deliver this quickly to customers, and we were able to respond to this change quickly. Notably, though, with this implementation, sorting the table columns directly was still useless. There is no customer value in sorting only the first 1,000 alphabetical resources. The implemented drop-down solved the problem of sorting, but it was not a solution that removed the pre-existing, useless functionality.
This task would have benefited from:
Given I am viewing the list of resources
When I click the sort icon on the memory usage column
Then I expect to see highest memory usage resources of my account.
This acceptance criterion offered a significantly more actionable definition than the expected behavior. In fact, the goal of behavior-driven acceptance criteria is to be so well-defined that they can be automated. Automated testing not only validates that the task has been completed, but prevents it from regressing due to future changes.
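As an illustration of that automatability, the earlier form-validation criterion maps almost line-for-line onto a test. A sketch, where `submit_form` is a hypothetical stand-in for the app’s submit handler:

```python
# Hypothetical submit handler standing in for the real app code.
def submit_form(form: dict) -> dict:
    if not str(form.get("position", "")).isdigit():
        return {"banner": "Position must be numeric."}
    return {"banner": None}


def test_position_must_be_numeric():
    # Given I have filled out the form with "test" in the position field
    form = {"position": "test"}
    # When I click submit
    result = submit_form(form)
    # Then an error banner should state that my position must be numeric
    assert result["banner"] == "Position must be numeric."


test_position_must_be_numeric()
```

Each Given/When/Then line becomes a comment above the code that exercises it, so the acceptance criteria double as the test plan.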
Guide your team and customers to leave at least one acceptance criterion, but not more than four. If your task has more than four acceptance criteria, it should be broken down into multiple backlog items. In INVEST, this keeps your items small.
You won’t read this in a scrum book, but coming from a monitoring-oriented organization, my team and I know the value of business metrics.
There are two monitoring-related questions that you should answer for each backlog item:
- What metrics or logs could you have used to catch this issue before customers reported it? Are you monitoring for errors effectively? Is your team’s process for monitoring errors and customer experience lacking the metrics or logs needed to have caught this automatically? For example, my team emits metrics for every error and logs relevant messages useful for root causing and debugging. We either alarm on those metrics or monitor them weekly to identify outliers, running queries against our logs to identify customer impact (top priority tasks) and root causes. With this system, we expect to catch errors before customers report them. If we have a backlog item for a customer report, what can we do while addressing it to make sure we catch future problems like this before customers have to bring it to our attention?
- What new metrics or logs would be valuable to accompany this task? Frequently, what we see here are customer behavior metrics. Is this feature being used at all? How frequently? In what capacity? These metrics are valuable when planning product improvements and redesigns. My team has been bitten before by deprecating features we assumed were not used. Our redesigns, despite being vetted by existing customers, happened to not be vetted by customers using these features. When launching to a significantly wider audience, we realized just how wrong we were. We no longer rely on these assumptions or small sample sizes. We now have hard data on how frequently our features are used. You may also make a note to track errors in new features. When new APIs are added, this section serves as a reminder to the operating engineer to track errors on that API.
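Wiring both answers into a handler can be lightweight. A sketch, where `emit_metric` is a hypothetical stand-in for a real metrics client (StatsD, CloudWatch, etc.) and the metric names are invented for illustration:

```python
import logging

logger = logging.getLogger("example-api")


def emit_metric(name: str, value: int = 1) -> None:
    # Hypothetical stand-in for a real metrics client.
    print(f"METRIC {name}={value}")


def handle_request(payload: dict) -> dict:
    emit_metric("api.request")  # usage metric: is this feature used at all?
    try:
        return {"status": "ok", "total": payload["amount"] * payload["quantity"]}
    except (KeyError, TypeError) as err:
        emit_metric("api.error")  # alarm on this, or review it weekly
        # Log enough context to root-cause without a customer report.
        logger.error("request failed: %r payload_keys=%s", err, sorted(payload))
        return {"status": "error"}


handle_request({"amount": 3})  # missing "quantity": counted and logged
```

The error metric answers the first question (catch it before the customer reports it); the request metric answers the second (is this feature used, and how often).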
Lastly, if the ticket is being defined by a subject matter expert, they may discuss any implementation details here. File locations, utilities, dependencies, etc. may be relevant to the operator and valuable to include in the ticket, but unlike the previous sections, these details are only useful to the operator and are not something customers or most stakeholders can provide when creating items. As such, we save them for last, but offer the ability to include them so as to speed up development by decreasing ambiguity and research and allowing tasks to be completed in parallel — not only by the specialist subject matter expert.
If you have any questions, feedback, or relevant great advice, please leave it in the comments below.