How to Infer Usability from User Interactions. My Poster Presented at #ICWE2014

WaPPU poster presented @ ICWE 2014

The corresponding publications are:

  • Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “Ensuring Web Interface Quality through Usability-based Split Testing”. In Proc. ICWE.
  • Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “WaPPU: Usability-based A/B Testing”. In Proc. ICWE (Demos).

For more information about WaPPU, please see this previous post. Special thanks go to Fred Funke, who helped with designing the poster!


First Screencast Published in VSR Media Center

The demo video about usability-based A/B testing I created for the 2014 International Conference on Web Engineering is now featured in the media center of the VSR research group at Chemnitz University of Technology. The chair of VSR is Prof. Dr.-Ing. Martin Gaedke, who is the primary advisor of my PhD thesis.

The video above demonstrates the use of the WaPPU* service, which implements the novel principle of usability-based A/B testing. The underlying concept is that on one variation of an interface (A), we train a model from collected user interactions and an automatically presented usability questionnaire. Then, the other variation (B) involved in the A/B test uses this model to infer its usability from interactions alone.

Say, on interface A we perform a click within a particular element (#content) and then rate the site’s usability as good using the questionnaire. We reload the page, click outside that particular element and give a bad usability rating. The WaPPU service automatically trains a model that—simply speaking—knows the following:

                 click    --- usability = good
                /
element #content
                \
                 no click --- usability = bad

This model is instantly available to interface B. So if we now visit B and click outside of #content, WaPPU automatically infers a bad usability rating. The ratings of both variations of the investigated interface are available in real time in a dashboard provided by our tool. This dashboard also features a traffic light that indicates whether one interface is significantly better or worse than the other, based on a Mann–Whitney U test.
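The mechanism described above can be sketched in a few lines of Python. This is an illustrative toy, not the actual WaPPU code: a majority-vote rule is learned from interface A's labeled interactions, applied to interface B's unlabeled clicks, and a Mann–Whitney U statistic compares the resulting ratings. All function names and the sample data are assumptions made for this sketch.

```python
# Toy sketch of usability-based A/B testing (not the real WaPPU service).
from collections import Counter

def train(samples):
    """samples: list of (clicked_in_content: bool, rating: str).
    Learns a majority rating for each feature value."""
    votes = {}
    for clicked, rating in samples:
        votes.setdefault(clicked, Counter())[rating] += 1
    return {clicked: counts.most_common(1)[0][0]
            for clicked, counts in votes.items()}

def infer(model, clicked):
    """Predict a usability rating from the interaction feature alone."""
    return model[clicked]

# Interface A: interactions plus questionnaire answers (training data).
a_samples = [(True, "good"), (True, "good"), (False, "bad"), (False, "bad")]
model = train(a_samples)

# Interface B: interactions only; ratings are inferred from the model.
b_clicks = (False, False, True)
b = [1 if infer(model, c) == "good" else 0 for c in b_clicks]

# Mann-Whitney U statistic over the numeric ratings (good=1, bad=0),
# the basis for the dashboard's traffic light (ties count 0.5).
a = [1, 1, 0, 0]
u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
print(infer(model, False), u)  # a click outside #content -> "bad"
```

In the real service the feature space is much richer (cursor traces, scrolling, dwell time) and the models are trained with an off-the-shelf machine learning toolkit, but the train-on-A, infer-on-B structure is the same.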

* “Was that Page Pleasant to Use?”

Usability-based Split Testing or How to infer web interface usability from user interactions

The continuous evaluation of an e-commerce company’s web applications is crucial for ensuring customer satisfaction and loyalty. Such evaluations are usually performed as split tests, i.e., the comparison of two slightly different versions of the same webpage with respect to a target metric. Metrics that stakeholders are typically interested in include completed checkout processes, submitted registration forms and visited landing pages. To give just one example, a dating website could present 50% of its users with a blonde woman on the cover page while the other half sees a dark-haired one. It is then possible to choose the “better” front page based on the number of registrations it generated—if you pay attention to the underlying statistics [1].
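The caveat about the underlying statistics can be made concrete with a standard two-proportion z-test (one common way to check such a result; the post itself does not prescribe a particular test). The visitor and registration counts below are invented for illustration.

```python
# Hypothetical A/B registration counts; a two-proportion z-test checks
# whether the difference in conversion rate is statistically significant.
from math import sqrt, erf

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# e.g. 120 of 2400 visitors register on front page A, 90 of 2400 on B
z, p = two_proportion_z(120, 2400, 90, 2400)
print(round(z, 2), round(p, 3))  # declare a winner only if p < 0.05
```

With fewer visitors the same conversion rates would not reach significance, which is exactly the trap the linked article warns about.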

While metrics of this type very well reflect how much money you make, you can’t make well-founded statements about usability based on such numbers (“you don’t know why you get the measured results”) [2]. Thus, in the long term, a far better solution is to provide your customers with a site they love to use instead of confusing them in such a way that they accidentally buy your products, isn’t it? This calls for the introduction of usability as a target metric in split tests.

The WaPPU dashboard

We have developed WaPPU, the prototype of a usability-based split testing service. The underlying principle is to track interactions (mouse, scrolling, etc.) in both versions of the tested interface. One version additionally asks users for an explicit rating of its usability, using a previously developed questionnaire [3]. WaPPU then takes all of these data and automatically trains models (based on existing machine learning techniques [4]) that are instantly used to predict the usability of the other interface from user interactions alone. This makes it possible to compare the interfaces based on their usability as perceived by users, e.g., “interface A has a usability of 85%, interface B of only 57%”.

The feasibility of our approach has been evaluated in a split test involving a real-world search engine results page. We were able to train the above-mentioned models, from which we also derived general heuristics for search results pages, such as “better readability is indicated by a lower page dwell time” or “less confusion is indicated by less scrolling”.

Usability-based Split Testing paper @ ICWE2014

We have described our novel approach and the corresponding evaluation in a full research paper [5] and an accompanying demo paper [6]. Both will be presented at the 2014 International Conference on Web Engineering (ICWE). The conference proceedings will be published by Springer and the final versions of our papers will be available at link.springer.com.

[1] http://www.sitepoint.com/winning-ab-test-results-misleading/
[2] http://www.nngroup.com/articles/putting-ab-testing-in-its-place/
[3] Maximilian Speicher, Andreas Both and Martin Gaedke (2013). “Towards Metric-based Usability Evaluation of Online Web Interfaces”. In Mensch & Computer Workshopband.
[4] http://www.cs.waikato.ac.nz/ml/weka/
[5] Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “Ensuring Web Interface Quality through Usability-based Split Testing”. In Proc. ICWE.
[6] Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “WaPPU: Usability-based A/B Testing”. In Proc. ICWE (Demos).

4 Submissions Accepted at the International Conference on Web Engineering (ICWE)

At the end of February, I submitted four contributions to the 14th International Conference on Web Engineering: two full papers, one demo and one poster. All four submissions were accepted and will be presented at the conference, which will be held in Toulouse (see map below) from July 1 to July 4. In the following, I give a quick overview of the accepted papers. A more detailed explanation of my current research will be the subject of one or two separate articles.

  • Maximilian Speicher, Sebastian Nuck, Andreas Both, Martin Gaedke: “StreamMyRelevance! Prediction of Result Relevance from Real-Time Interactions and its Application to Hotel Search” — This full paper is based on Sebastian Nuck’s Master’s thesis. He developed a system that processes user interactions collected on search results pages in real time and predicts the relevance of individual search results from them.
  • Maximilian Speicher, Andreas Both, Martin Gaedke: “Ensuring Web Interface Quality through Usability-based Split Testing” — This full paper proposes a new approach to split testing that is based on the actual usability of the investigated web interface rather than pure conversion maximization. We have trained models for predicting usability from user interactions and from these have also derived additional interaction-based heuristics for comparing search results pages.
  • Maximilian Speicher, Andreas Both, Martin Gaedke: “WaPPU: Usability-based A/B Testing” — This demo accompanies our paper about Usability-based Split Testing. The WaPPU tool builds upon this new concept and demonstrates how usability can be predicted from user interactions using automatically learned models.
  • Maximilian Speicher: “Paving the Path to Content-centric and Device-agnostic Web Design” — This poster is based on one of my previous posts. It provides a review of motherfuckingwebsite.com, which satirically claims to be a perfect website. Based on current research, we suggest improvements to the site that follow a strictly content-centric and device-agnostic approach.

My PhD research is supervised by Prof. Dr.-Ing. Martin Gaedke (VSR Research Group, Chemnitz U of Technology) and Dr. Andreas Both (R&D, Unister GmbH) and funded by the ESF and the Free State of Saxony.