An Introduction to Agile Development, Part 2: Getting Started with Scrum

TL;DR: To get started with Scrum, grab one of the popular tools (e.g., Jira), implement all the Scrum artifacts, and fill your gaps with some experienced people. It is not all that important to have the right task estimates, sprint lengths, etc. from the beginning, or to follow the methodology dogmatically. The crucial part is to have a team that commits to the process, reflects critically, is eager to learn, and improves incrementally.

It took us a little while to get the second part of our series about agile development done, but Andreas and I had some pretty good reasons for it (or at least we hope so). While I moved back to Germany from the U.S. and started a new position as a UX Manager, Andreas is now not only the Head of Research at DATEV, but also started as a full professor of Web Engineering at Anhalt University of Applied Sciences.

But anyway, we somehow managed to also talk a little bit about Scrum. In this part of the series, we want to specifically focus on how to get started with it. There are many tutorials and introductions describing Scrum, such as Atlassian’s brief introduction to the topic. However, some questions that both Andreas and I have encountered over and over are often poorly addressed or not addressed at all. Let’s start at the very beginning:

Max: How do you get started with Scrum, i.e., when you want to implement it from scratch?

Andreas: When starting from scratch, in my opinion you should take one of the usual Scrum-supporting tools (like Jira, Meistertask, Sprintly, or Team Foundation Server) and begin by implementing the basic Scrum artifacts: establishing the Scrum roles, a backlog, the meetings (or “ceremonies”, in Jira’s words), etc. Based on my experience, it is important to follow the Scrum methodology closely, but not dogmatically, at the beginning of the introductory phase. If the budget is available, fill the gaps in your organization now (e.g., with an experienced Scrum master). Additionally, I suggest making the Scrum methodology explicit by establishing a social contract in the team as a starting point of the transition process, and thereby securing the team members’ commitment. The social contract should cover the characteristics of the team, so that everyone knows how to behave within the Scrum process, e.g., how to estimate task efforts (Scrum poker would be one possibility). In particular, the Scrum master is responsible for permanently training the team and refocusing the process if required.

Max: I’ve experienced situations in which the Scrum master and the product owner were the same person. Do you think this is generally a bad idea?

Andreas: It is imperative that the two roles can be executed well with respect to an impact-driven product (the product owner’s responsibility) as well as an efficient process and a happy team (the Scrum master’s responsibilities). One has to be aware that these goals often compete, which requires a team setting where neither the product goals nor the team values and principles are sacrificed. Of course, it is hard to ensure a balance between these goals if both jobs are done by the same person. While it is the product owner’s job to get new features on the road as quickly as possible, the Scrum master has to defend the process and the scope of the sprint. However, similar problems might occur when appointing inexperienced people, or one experienced and one less experienced person. My point here is that there are many situations in which an agile team might be influenced in a bad way, and not only because one person is in charge of the Scrum master and product owner roles at the same time.

Hence, in my experience, the key to a successful agile team is a valid setting following agile principles. Appointing two experienced people capable of executing the roles as defined might be the perfect setting. Yet, one really experienced person doing the two jobs with some additional backup by the team can lead to success as well, but should not be a permanent situation.

Max: Once you’ve committed the team to the agile process, how do you determine the initial sprint length?

Andreas: For this, I suggest first analyzing the given goal. In many cases, Scrum causes problems when the sprint length is not adjusted well. For instance, new requirements may be brought up in short iterations and thus frequently break the sprint, or a team may not yet be capable of applying the vertical-slicing methodology properly, which often leads to unfinished sprints. Hence, start with a best guess, then observe the rough length of the cycles in which new requirements appear and adjust the sprint length accordingly.
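This “best guess, then adjust” heuristic can be sketched in a few lines of code. The function and the example dates below are made up for illustration and are, of course, no substitute for actually observing the team:

```python
from datetime import date
from statistics import median

def suggest_sprint_length(requirement_dates: list[date]) -> int:
    """Suggest a sprint length (in days) from the observed cadence at
    which new requirements appeared. Purely illustrative heuristic:
    take the median gap between arrivals, with one week as a floor."""
    dates = sorted(requirement_dates)
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    return max(7, round(median(gaps)))

# Example: new requirements arrived roughly every two weeks.
dates = [date(2019, 1, 2), date(2019, 1, 16),
         date(2019, 1, 31), date(2019, 2, 13)]
print(suggest_sprint_length(dates))  # 14
```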

Max: You’ve also mentioned task efforts before. Many teams seem to struggle in this regard. Do you have any suggestions for them?

Andreas: Very true! I’ve also observed that teams often struggle to define a useful estimation approach. I suggest starting with T-shirt sizes. A common best practice is that, together with the team, the Scrum master defines the minimum size as “S” (e.g., 1 day), then “M” (e.g., double of “S”), “L”, and “XL”. After some time and experience, the team will recognize that this approach might not be suitable for their complex tasks and look more into, e.g., Fibonacci-based story points and other more advanced complexity measures. From time to time, the Scrum master needs to carefully adjust, thus leading the team to a method which is appropriate, efficient, and, last but not least, accepted by the management. From this point of view, an objective suggestion might not be possible, since this strongly depends on the team setting and product situation. However, a common mistake is to have no agile-experienced person(s) in place. More than once, I have observed teams failing to improve (in reasonable time spans) due to a missing experienced “guide” leading the team towards a productive setting.
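As a purely illustrative sketch of the T-shirt approach described above (the baseline of 1 day and the doubling rule are just the example values from the text, not a Scrum standard):

```python
# Illustrative only: the baseline of 1 day for "S" and the doubling
# rule are example values; each team agrees on its own mapping.
BASELINE_DAYS = 1

TSHIRT_DAYS = {
    "S": BASELINE_DAYS,       # minimum size agreed with the team
    "M": BASELINE_DAYS * 2,   # double of "S"
    "L": BASELINE_DAYS * 4,
    "XL": BASELINE_DAYS * 8,
}

def sprint_load(estimates: list[str]) -> int:
    """Sum the rough effort (in days) of a list of T-shirt estimates."""
    return sum(TSHIRT_DAYS[size] for size in estimates)

print(sprint_load(["S", "M", "M", "L"]))  # 9
```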

Max: Following up on this, in which cases and under which circumstances would it be justifiable to work without task estimates?

Andreas: Well, there can, of course, be situations in which effort estimations are too difficult, inefficient, or simply frustrate the team. For instance, if no resource planning is possible due to external reasons, synchronization with many other teams is necessary, autonomy of the teams is not ensured in general, or, particularly, transparency and trust are not established within the team or company, then working on task estimates is a waste of time, since they will always fail. Hence, to ensure that the team is adopting the agile mindset, it is legitimate to pause the estimation of tasks, in my opinion.

However, I strongly disagree with the sometimes expressed opinion that task estimates are not worthwhile to work on or improve. For an ambitious team, it is crucial to be self-aware regarding the impact of their work and its challenges (e.g., with respect to knowledge transfer) and to be able to justify particular tasks (see, for example, our discussion about technical goals in the first part of the interview).

Max: Getting back to the role of the Scrum master: Often, they are recruited from people who were not active in software development. What does a Scrum master who is (or was) not a software engineer themselves have to pay specific attention to?

Andreas: Well, again, this is a controversial issue. I think we can agree that software development is a very specific field driven by strong individual skills and team efforts. People with a background in software development have mostly experienced the specifics of this field, are prepared for upcoming situations, and simply speak the language of software engineers. Hence, they might be accepted more easily by the team members, which is crucial for a Scrum master due to the lateral leadership relation in Scrum teams …

Max: … but this does not exclude people who were not active as software engineers?

Andreas: No, absolutely not! Actually, it is sometimes better to have a Scrum master who teams up with the product owner and helps steer the process more towards product-related impact rather than focusing mostly on the technology, which happens, not always, but sometimes. However, with no background in software development, one has to be really careful about common mistakes. For instance, by assessing difficult tasks as easy (or the other way around), the Scrum master might discredit themselves in the eyes of the software engineers, who are usually very proud of their skills and their work.

Therefore, my tip for Scrum masters without a software development background who plan to work with software engineers is to observe a software development team for several sprints without taking a particular role, to learn the language and the specifics of the field, and maybe even to try some tasks themselves to experience the complexity of computing in general. This way, hopefully, the foundations can be laid for earning trust and being accepted as a valid contributor to the team goals.

Max: In comparison: What do Scrum masters who are software engineers themselves have to pay specific attention to?

Andreas: In my opinion, former software engineers have a particular advantage if, and only if, they embrace their role as a Scrum master and use their software development experience to tackle the challenges of their team. In many cases, I’ve experienced that a former software engineer had fewer problems steering the process towards iterative delivery. However, the risk of micro-discussions is high, since software engineers often have a dedication to technical issues even if they are not responsible for implementing them themselves. But if, for example, retrospectives are wasted on technical discussions, the team will not evolve with respect to process quality.

Hence, Scrum masters having worked as software engineers should really pay attention to the main characteristics of the Scrum master role: be focused on the agile mindset, improve the working environment, identify and clear obstacles, etc. Particularly, the relationship with the product owner needs to be carefully developed to establish the necessary positive atmosphere in the team. As product owners are often (unintentionally) considered as outsiders by the software engineers, it is very important to mediate this conflict by leaving some of the software engineering background behind.

Max: Is there a threshold in terms of the number of people below which applying Scrum doesn’t make sense?

Andreas: The creation of and working with Scrum artifacts requires some time. In my opinion, you need at least five people in a team; but if the Scrum master and product owner are exclusively assigned to one team, then more software engineers might be required, so that the delivery speed is high enough and the product owner and Scrum master are actually “utilized”. Additionally, it depends on the maturity of the team members and the company. Particularly, if the culture of the company does not follow the agile principles, even proven scalable agile frameworks (like SAFe or LeSS) have a high risk of failing. In such a non-agile environment, I would suggest having a core team focusing on the demanded functionality and additional supporters at the intersection with other teams focusing on removing obstacles.

In general, I would like to turn the question around: Do you need a Scrum process for, e.g., organizing just three software engineers efficiently? My pragmatic solution would be to ensure that they work together using direct communication. If that doesn’t work, I suggest temporarily hiring an agile coach to help them find a suitable way of collaborating, but without aiming directly towards a Scrum process.

Max: Can Scrum work without retrospective meetings? If yes, under which circumstances?

Andreas: A core principle of the agile methodology is to adapt and improve; retrospective meetings are the corresponding expression in Scrum language. Additionally, retrospectives establish, for example, an official timespan for controlled experiments that might lead to improvements, among many other positive aspects. So, on the one hand, retrospective meetings are very useful, if not necessary, for successfully implementing Scrum. On the other hand, if there is a team whose members trust each other, are willing to experiment, challenge each other during implementation, give permanent honest feedback, are transparent to outsiders, etc., then you don’t necessarily have to establish a retrospective meeting at the end of each sprint. In short: if the team is very mature and permanently working towards improvement, then the time span between retrospective meetings can be increased. Yet, skipping them altogether is not such a good idea, in my experience. There is always something that needs discussion and is difficult to resolve during daily work.


This concludes our deep-dive into the specifics of Scrum, in particular, the roles of the Scrum master and product owner, task estimates, and retrospectives. The key takeaway here is probably that it is not so important to adhere to a strictly predefined process. Instead, aiming for a positive environment—with a team and leaders that are able and willing to adapt and continuously strive for improvement—will lead to efficient and happy teams.

In the third and final part of our interview, we will have a closer look at agile methodologies in the context of start-ups and the future of agile development in general. Thank you for reading and stay tuned!

An Introduction to Agile Development, Part 1: Scrum vs. Kanban

TL;DR: Two of the most popular agile methodologies are Scrum and Kanban. They mainly differ in the handling of prioritization (once per sprint vs. continuous) and deadlines (yes vs. no). Yet, independent of the approach, the key to success is to ensure focus and continuous delivery by: 1) creating small enough, manageable tasks, 2) prioritizing based on impact, and 3) finding a good trade-off between technical and business goals.

Many teams in software development and beyond are using agile processes to manage their tasks. However, working in an agile environment raises a variety of questions. Sometimes this is due to inexperienced team members, sometimes due to management deciding to use agile methods without having understood them.

Today, I have the pleasure to discuss agile development and two of the most popular agile methodologies—Scrum and Kanban—with Dr. Andreas Both, the Head of Architecture, Web Technologies, and IT Research at DATEV. Before that, Andreas and I worked together at Unister, Germany’s biggest e-commerce company at that time, where he supervised my Ph.D. thesis. In this series of articles, our goal is to make agile development processes more understandable to people new to the topic and to dive a little deeper into the specifics of each methodology.

Max: To kick off the interview, we are going to start with one of the questions that is posed most often when people get started with agile methodologies and that I had to answer on way more than one occasion: What is the difference between Scrum and Kanban?

Andreas: In my honest opinion, the core difference is the time taken for prioritization and iterations. While in Scrum the team prioritizes once per iteration and, during the sprint, mostly only pulls tasks, in Kanban the tasks are re-prioritized continuously. Hence, a task might be the next to be pulled from the backlog even though it was raised by the product owner just seconds ago. Based on my experience, Scrum teams feel more freedom and less pressure due to the fact that in most companies the sprint is a protected area where external disruptions are minimized. Hence, development teams have more options to organize and plan their work, which might lead to higher happiness.

Max: Are there also differences in handling deadlines between Scrum and Kanban?

Andreas: A Scrum team delivers by the end of the sprint at the latest. Kanban teams do not need an actual deadline, but if it takes very long to finish a task, this might indicate a problem with slicing the task at hand into small enough, manageable subtasks. However, this problem also appears in Scrum teams. So, the differences with respect to deadlines are not significant if the process lead takes care. I think the core capability is to maintain a continuous delivery process.

Max: What is the best way or process to prioritize tasks?

Andreas: Based on my experience, I’d say an impact-driven approach has the best chances of resulting in a successful project. It provides the opportunity to fail fast if the project is not executable. However, a mature team is needed in most cases to ensure success with this approach. For instance, the capability of slicing large tasks into smaller ones is required, so that manageable tasks are created and can be prioritized. Within inexperienced teams, you often hear statements like “We can’t slice it into smaller tasks,” which frequently leads to huge, saving-the-world meta-tasks. Of course, there is no single best process for task prioritization, only ones suitable for the current team’s or product owner’s maturity level and the product context. For instance, a simple approach might also help to establish good acceptance of the product by the users. Finally, living the methodology is crucial: a product owner should never be in the situation of not being able to provide a justification or methodology for their prioritizations.

Max: So, from this, I conclude the process is not crucial in your opinion as long as the prioritization ensures focus. A different issue is that while establishing a backlog many teams struggle to prioritize requests of different kinds. How do you handle the conflicts between business goals and technical goals?

Andreas: I think there is no short answer to this question. When starting a project, I suggest favoring the business needs over long-term maintainability. This is due to the typical observation that requirements are unclear at this point in time; early ideas about a useful architecture etc. often do not hold and lead to wasted investments. At the same time, you have to permanently communicate to the (product) management that these early results, and the velocity at which they have been created, cannot be considered regular and long-term. After some time, teams tend to claim a certain amount of time to work on reducing technical debt. It is a question of the maturity of the organization whether the development team can actually invest time to achieve technical goals or not. Unfortunately, we had to fight hard for this and lost many times.

Max: Isn’t there a risk that this leads to unsatisfied developers?

Andreas: Yes, there is a risk for sure. Unfortunately, reality shows that it is really hard to argue against killer arguments from strategic product management like “If we are not finishing this feature first, then the competitors will win.” Therefore, the Scrum master and the product owner have to work seriously on communicating the global priorities of feature completeness, maintainability, adaptability, etc.

Max: Very true. But what is your suggestion for a fresh product owner for prioritizing also non-business goals, such as the elimination of technical debt?

Andreas: In my experience, the key to success is to transform technical goals into business goals by finding related business KPIs. Let’s assume that the development team is annoyed because there is no time to refactor the source code and implement unit tests for a certain component. While doing some analytics, one might discover that the same component often contains bugs which need to be fixed instantly during the sprint. Hence, the missing tests are a reason for missing the deadlines for completing new features, so the business goal effectively encapsulates the technical goal. If the process lead and the product owner work closely together, then these justifications might be easy to find. However, if you do not find a reason, you should talk frankly to the development team about the fact that it will not happen … and sometimes you just need to show courage, frankly inform your product management about the plan of aiming for a technical goal, and just do it without justification.
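To illustrate how such a justification might be found in practice, here is a tiny, purely hypothetical sketch: it ranks components by bug count in a made-up issue-tracker export, so that the worst offenders become candidates for test and refactoring investments backed by a business KPI:

```python
from collections import Counter

# Hypothetical issue-tracker export: (component, issue_type) pairs.
issues = [
    ("checkout", "bug"), ("checkout", "bug"), ("checkout", "feature"),
    ("search", "bug"), ("profile", "feature"), ("checkout", "bug"),
]

def bug_hotspots(issues, top=3):
    """Rank components by bug count. A component that keeps producing
    in-sprint bug fixes is a KPI-backed argument for tests/refactoring."""
    bugs = Counter(component for component, kind in issues if kind == "bug")
    return bugs.most_common(top)

print(bug_hotspots(issues))  # [('checkout', 3), ('search', 1)]
```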

Max: What are typical examples of features that should be realized using Scrum on the one hand and Kanban on the other?

Andreas: In my opinion, Scrum is more suitable for development teams targeting product goals where they can predict the requirements of the business domain (at least a little bit). If this is not the case, then blockers within the Scrum sprint are more likely, which reduces the predictability of results at the end of a sprint. On the other hand, Kanban can be applied more easily with teams working in less predictable fields (e.g., research teams) or teams that are triggered by external interrupts (e.g., IT operations teams). For larger teams, there are results from different domains indicating that Scrum does better. However, if you have such a large project that several teams are required, then your teams really should be experienced in agile methodology anyway.

This concludes the first part of Andreas’s and my little introduction to agile development, where we mainly addressed general issues such as deadlines, prioritization, communication, and the differences between business and technical goals. Also, we’ve had a closer look at Kanban and how it is different from Scrum. In the next part of the interview, we will be paying more attention to the specifics of the latter. Stay tuned!

What is ›Usability‹?

Earlier this year, I submitted a research paper about a concept called usability-based split testing¹ to a web engineering conference (Speicher et al., 2014). My evaluation involved a questionnaire that asked for ratings of different usability aspects of web interfaces, such as informativeness, readability, etc. So obviously, I use the word “usability” in that paper a lot; however, without having thought about its exact connotation in the context of my research before. Of course, I was aware of the differences compared to user experience, but I just assumed that the questionnaire used and the description of my analyses would make clear what my paper understands as usability.

Then came the reviews and one reviewer noted:

“There is a weak characterization of what Usability is in the context of Web Interface Quality, quality models and views. Usability in this paper is a key word. However, it is weakly defined and modeled w.r.t. quality.”

This confused me at first, since I thought it was pretty clear what usability is and that my paper was well understandable in this respect. In particular, I thought usability had already been defined and characterized before, so why did this reviewer demand that I characterize it again? Figuratively, they asked me: “When you talk about usability, what is that ›usability‹?”

A definition of usability

As I could not just ignore the review, I did some more research on definitions of usability. I remembered that Nielsen defined usability to comprise five quality components—Learnability, Efficiency, Memorability, Errors, and Satisfaction. Moreover, I had already made use of the definition given in ISO 9241–11 for developing the usability questionnaire used in my evaluation: 

“The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”

For designing the questionnaire, I had only focused on reflecting the mentioned high-level factors of usability (effectiveness, efficiency, and satisfaction) in the contained items. However, the rest of the definition is no less interesting. Particularly, it contains the phrases

  1. “a product”;
  2. “specified users”;
  3. “specified goals”; and
  4. “specified context of use”.

As can be seen, the word “specified” is used three times—and also “a product” is a rather vague description here.

This makes it clear that usability is a difficult-to-grasp concept and even the ISO definition gives ample scope for different interpretations. Also, in his paper on the System Usability Scale, Brooke (1996) refers to ISO 9241–11 and notes that “Usability does not exist in any absolute sense; it can only be defined with reference to particular contexts.” Thus, one has to explicitly specify the four vague phrases mentioned above to characterize the exact manifestation of usability they are referring to. Despite my initial skepticism, that reviewer was absolutely right!

Levels of usability

As the reviewer explicitly referred to “Web Interface Quality”, we also have to take ISO/IEC 9126 into account. That standard is concerned with software engineering and product quality and defines three different levels of quality metrics: 

  • Internal metrics: Metrics that do not rely on software execution (i.e., they are a static measure)
  • External metrics: Metrics that are applicable to running software
  • Quality in use metrics: Metrics that are only available when the final product is used in real conditions

As usability clearly is one aspect of product quality, these metrics can be transferred into the context of usability evaluation. In analogy, this gives us three levels of usability: Internal usability, external usability, and usability in use.

This means that if we want to evaluate usability, we first have to state which of the above levels we are investigating. The first one might be assessed with a static code analysis, as for example carried out by accessibility tools. The second might be assessed in terms of an expert going through a rendered interface without actually using the product. Finally, usability in use is commonly assessed with user studies, either on a live website, or in a more controlled setting.

Bringing it all together

Once we have decided on one of the above levels of usability, we have to give further detail on the four vague phrases contained in ISO 9241–11. Mathematically speaking, we have to find values for the variables product, users, goals, and context of use, which are sets of characteristics. Together with the level of usability, this gives us a quintuple defined by the following cross product:

level of usability × product × users × goals × context of use.

We already know the possible values for level of usability:

level of usability ∈ { internal usability, external usability, usability in use },

so what are the possible values for the remaining variables contained in the “quintuple of usability”?


The first one is rather straightforward. Product is the actual product you are evaluating, or at least the type thereof. Particularly, web interface usability is different from desktop software or mobile app usability. Also, it is important to state whether one evaluates only a part of an application (e.g., a single webpage contained in a larger web app), or the application as a whole. Therefore: 

product ⊆ { desktop application, mobile application, web application, online shop, WordPress blog, individual web page, … }. 

Since product is a subset of the potential values, it is possible to use any number of them for a precise characterization of the variable, for instance, product = { mobile application, WordPress blog } if you are evaluating the mobile version of your blog. This should not be thought of as a strict formalism, but is rather intended as a convenient way to express the combined attributes of the variable. However, not all values can be meaningfully combined (e.g., desktop application and WordPress blog). The same holds for the remaining variables explained in the following.


Next comes the variable users, which relates to the target group of your product (if evaluating in a real-world setting) or the participants involved in a controlled usability evaluation (such as a lab study). Distinguishing between these is highly important, as different kinds of users might perceive a product completely differently. Also, real users are more likely to be unbiased compared to participants in a usability study.

users ⊆ { visually impaired users, female users, users aged 19–49, test participants, inexperienced users, experienced users, novice users, frequent users, … }.

In particular, when evaluating usability in a study with participants, this variable should contain all demographic characteristics of that group. Yet, when using methods such as expert inspections, users should not contain “usability experts,” as your interface is most probably not exclusively designed for that very specific group. Rather, it contains the characteristics of the target group the expert has in mind when performing, for instance, a cognitive walkthrough. This is due to the fact that usability experts are usually well-trained in simulating a user with specific attributes.


The next one is a bit tricky, as goals are not simply the tasks a specified user shall accomplish (such as completing a checkout process). Rather, there are two types of goals according to Hassenzahl (2008): do-goals and be-goals. 

Do-goals refer to pragmatic usability, which means “the product’s perceived ability to support the achievement of [tasks]” (Hassenzahl, 2008), as for example the aforementioned completion of a checkout process.

In contrast, be-goals refer to hedonic usability, which “calls for a focus on the Self” (Hassenzahl, 2008). To give just one example, the ISO 9241–11 definition contains “satisfaction” as one component of usability. Therefore, “feeling satisfied” is a be-goal that can be achieved by users. The achievement of be-goals is not necessarily connected to the achievement of corresponding do-goals (Hassenzahl, 2008). In particular, a user can be satisfied even if they failed to accomplish certain tasks, and vice versa.

Thus, it is necessary to take these differences into account when defining the specific goals to be achieved by a user. The variable goals can be specified either by the concrete tasks the user shall achieve or by Hassenzahl’s more general notions if no specific tasks are defined:

goals ⊆ { do-goals, be-goals, completed checkout process, writing a blog post, feeling satisfied, having fun, … }.

Context of use

Last comes the variable context of use, which describes the setting in which you want to evaluate the usability of your product. It can be something rather general, such as “real world” or “lab study” to indicate a potential bias of the users involved, something device-related (desktop PC vs. touch device), or some other, more specific information about the context. In general, your setting/context should be described as precisely as possible.

context of use ⊆ { real world, lab study, expert inspection, desktop PC, mobile phone, tablet PC, at day, at night, at home, at work, user is walking, user is sitting, … }.

Case study

For testing a research prototype in the context of my industrial Ph.D. thesis, we evaluated a novel search engine results page (SERP) designed for use with desktop PCs (Speicher et al., 2014). The test was carried out as a remote asynchronous user study, with participants being recruited via internal mailing lists of the cooperating company. They were asked to find a birthday present for a good friend that costs no more than €50, which is a semi-open task (i.e., a do-goal). According to our above formalization of usability, the precise type of usability assessed in that evaluation is therefore given by the following (for the sake of readability, the quintuple is given in list form):

  • level of usability = usability in use
  • product = {web application, SERP}
  • users = {company employees, novice users, experienced searchers (several times a day), average age ≈ 31, 62% male, 38% female}
  • goals = {formulate search query, comprehend presented information, identify relevant piece(s) of information}
  • context of use = {desktop PC, HD screen, at work, remote asynchronous user study}
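As a side note, the quintuple above is also easy to capture in code. The following sketch (the class name and representation are my own invention, not part of either ISO standard) models it as an immutable record with four set-valued variables:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UsabilityContext:
    """The "quintuple of usability": one level plus four set-valued
    variables, as derived above from ISO 9241-11 and ISO/IEC 9126."""
    level: str            # internal usability, external usability, usability in use
    product: frozenset
    users: frozenset
    goals: frozenset
    context_of_use: frozenset

# The case-study values from the list above.
serp_study = UsabilityContext(
    level="usability in use",
    product=frozenset({"web application", "SERP"}),
    users=frozenset({"company employees", "novice users",
                     "experienced searchers", "average age ≈ 31"}),
    goals=frozenset({"formulate search query",
                     "comprehend presented information",
                     "identify relevant piece(s) of information"}),
    context_of_use=frozenset({"desktop PC", "HD screen", "at work",
                              "remote asynchronous user study"}),
)
print(serp_study.level)  # usability in use
```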

If the same SERP is instead inspected by a team of usability experts based on screenshots, the assessed type of usability changes accordingly. In particular, users changes to the actual target group of the web application, as defined by the cooperating company and explained to the experts beforehand. Also, goals must be reformulated to what the experts pay attention to (only certain aspects of a system can be assessed through screenshots). Overall, the assessed type of usability is then expressed by the following:

  • level of usability = external usability
  • product = {web application, SERP}
  • users = {German-speaking Internet users, any level of searching experience, age 14–69}
  • goals = {identify relevant piece(s) of information, be satisfied with presentation of results, feel pleased by visual aesthetics}
  • context of use = {desktop PC, screen width ≥ 1225 px, expert inspection}


Usability is a term that spans a wide variety of potential manifestations. For example, usability evaluated in a real-world setting with real users might be a totally different kind of usability than usability evaluated in a controlled lab study, even with the same product. Therefore, a given set of characteristics must be specified; otherwise, the notion of “usability” is meaningless due to its high degree of ambiguity. It is necessary to provide specific information on five variables that have been identified based on ISO 9241–11 and ISO/IEC 9126: level of usability, product, users, goals, and context of use. Although I have introduced a mathematical-seeming formalism for characterizing the precise type of usability one is assessing, it is not necessary to provide that information in the form of a quintuple. Rather, my primary objective is to raise awareness of the need for careful specifications of usability, as many reports on usability evaluations, including the original version of my research paper (Speicher et al., 2014), lack a complete description of what they understand as ›usability‹.

(This article has also been published on Medium and as a technical report.)

¹ “Usability-based split testing” means comparing two variations of the same web interface based on a quantitative usability score (e.g., usability of interface A = 97%, usability of interface B = 42%). The split test can be carried out as a user study or under real-world conditions.


John Brooke. SUS: A “quick and dirty” usability scale. In Usability Evaluation in Industry. Taylor and Francis, 1996. 

Marc Hassenzahl. User Experience (UX): Towards an experiential perspective on product quality. In Proc. IHM, 2008.

Maximilian Speicher, Andreas Both, and Martin Gaedke. Ensuring Web Interface Quality through Usability-based Split Testing. In Proc. ICWE, 2014.


Special thanks go to Jürgen Cito, Sebastian Nuck, Sascha Nitsch & Tim Church, who provided feedback on drafts of this article 🙂