At a Glance
A benchmark study measures and compares usability metrics against a baseline study. Such studies are typically run on a regular basis (monthly, quarterly, yearly) and evaluate how a user's experience with your product changes over time.
Most benchmark studies focus on basic usability requirements, such as task success, errors, time on task, and subjective user ratings (satisfaction, “ease of use,” confidence in being able to complete the task).
How Many Contributors Should I Use in a Benchmark Study?
Benchmark studies typically have larger sample sizes than the average qualitative (or formative) usability study. But what is the recommended sample size for benchmark and baseline studies?
The answer is the UX industry’s standby response: it depends. A handful of factors can dictate how many contributors to use, including whether you are comparing products, designs, or even one benchmark to another, or whether you are investigating usability issues with, for example, a site or an app.
How Do Benchmark Studies Help Me?
Benchmarking lets you track the progress (or lack thereof) of the changes and revisions that make up a typical multistage redesign. The metrics from each successive study tell you how successful those changes and revisions have been.
A benchmark study is considered a “summative” assessment, so greater attention is paid to creating tasks that can be reused multiple times, independent of the kind of changes or improvements made to a site, app, or design.
When Should I Conduct a Benchmark Study?
Benchmark studies are particularly valuable when launched before any redesign efforts have begun. To accurately benchmark a design or customer flow, you need to start with a baseline. As the first of your benchmarking studies, the baseline is what you compare the results of the ensuing studies against to determine how much change occurred and what kind.
A caveat: before launching it to a larger group, be sure to pilot the baseline study with at least one individual in each contributor group. Doing so validates the structure and usefulness of the baseline; presenting an unsound, poorly structured baseline to a larger group means that you’ll likely have to start all over again with a new baseline study.
Once you have successfully completed a baseline benchmark study, you can then replicate and launch the study again after design changes have been implemented. The differences in metrics from the baseline to the new design will show how well the new design is working.
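The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the session records, the field order, and the function names `summarize` and `compare` are all hypothetical, and the numbers are made-up sample data rather than results from any real study.

```python
from statistics import mean

# Hypothetical session records, one per contributor:
# (task_completed, seconds_on_task, ease_rating_on_a_1_to_7_scale)
baseline_sessions = [
    (True, 95.0, 4),
    (False, 140.0, 2),
    (True, 110.0, 5),
    (True, 88.0, 5),
]
redesign_sessions = [
    (True, 70.0, 6),
    (True, 82.0, 5),
    (True, 120.0, 3),
    (True, 64.0, 6),
]


def summarize(sessions):
    """Return success rate, mean time on task, and mean ease rating for one round."""
    return {
        "success_rate": mean(1.0 if done else 0.0 for done, _, _ in sessions),
        "mean_time_s": mean(seconds for _, seconds, _ in sessions),
        "mean_ease": mean(rating for _, _, rating in sessions),
    }


def compare(baseline, followup):
    """Return the follow-up value minus the baseline value for each metric."""
    base = summarize(baseline)
    new = summarize(followup)
    return {name: new[name] - base[name] for name in base}


deltas = compare(baseline_sessions, redesign_sessions)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

A positive change in success rate or ease rating, or a negative change in time on task, suggests the new design is performing better than the baseline. With samples this small the differences are only directional, so treat them as a prompt for closer analysis rather than as proof.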
Best Practices for Running a Benchmark Study
Because a single benchmark study involves multiple stages, writing an effective benchmark demands a well-thought-out structure. Here are some best practices that will help you write and conduct an effective benchmark study:
- Limit the number of tasks and reuse them: Settle on a few key tasks and write clear, concise task instructions that can be used repeatedly, even if the tested design changes several times over the course of a benchmark study.
- Bundle each qualitative task with quantitative metrics questions, and reuse the same metrics questions across different tasks.
- Create an analysis plan. Doing so makes it easier to replicate your test. Decide how you will quality-check your sessions. If you plan to watch videos, decide which ones, and how you will find those clips and videos later.
- Do a strategic qualitative analysis as part of the analysis plan. Decide whether and how you are going to analyze contributor videos in addition to undertaking a quantitative analysis of their responses.
- Get the baseline test right. The baseline is the initial test against which all subsequent rounds of benchmark tests are measured. Getting this initial test right—settling on reusable metrics and writing reusable tasks—is critical for not having to restart the benchmark from scratch at a later date.
- Share your findings with multiple stakeholders. Getting the right people to review your findings will make it easier to conclude what new tests to run and what tweaks need to be made to the product/service being benchmarked.