REFOCUS: Current & Future Search Interface Requirements for German-speaking Users

When looking at current research, there is plenty of existing work inquiring into how users use search engines¹ and what future search interfaces could look like². Yet, an investigation of users’ perceptions and expectations of current and future search interfaces is still missing.

Therefore, at this year’s International Conference on WWW/Internet (ICWI ’16) my co-author Martin Gaedke presented our paper “REFOCUS: Current & Future Search Interface Requirements for German-speaking Users”, which we wrote together with Andreas Both. To give you an idea of what our work aims at, I’m going to provide a step-by-step explanation of the research paper’s title.

REFOCUS. An acronym for Requirements for Current & Future Search Interfaces.

Search Interface Requirements. From an exploratory study with both qualitative and quantitative questions we have derived a set comprising 11 requirements for search interfaces. The initial set of requirements was validated by 12 dedicated experts.

Current. The requirements shall be valid for current search interfaces. According to the experts’ reviews, this applies to eight of the requirements.

Future. Also, the set of requirements shall inform the design and development of future search interfaces. According to the experts’ reviews, this applies to ten of the requirements. Supporting the design of future search interfaces is particularly important given the wide variety of novel Internet-capable devices, such as cutting-edge video game consoles.

German-speaking Users. Due to the demographics of our participants, the set of requirements can be considered to be valid for German-speaking Internet users. 87.3% of the participants were German while 96.6% lived in a German-speaking country at the time of the survey.

If this sounds interesting to you, please go check out our research paper at ResearchGate or arXiv. The original publication will be available via the IADIS Digital Library.

1 For instance, http://www.pewinternet.org/2012/03/09/search-engine-use-2012/ (accessed November 8, 2016).
2 For instance, Hearst, M. A. ‘Natural’ Search User Interfaces. In Commun. ACM 54(11), 2011.

The Search Interaction Optimization Toolkit – The Essence of my PhD Thesis

[Figure: Logo of the SIO Toolkit.]

My PhD thesis introduces a novel methodology named Search Interaction Optimization (SIO) for designing, evaluating and optimizing search engine results pages (so-called SERPs). As a proof of concept of this new methodology, I’ve developed a corresponding SIO toolkit, which comprises a total of seven components¹ (most of which have already been introduced in previous posts):

  1. Inuit, a new instrument for usability evaluation;
  2. WaPPU, a tool for Usability-based Split Testing;
  3. a catalog of best practices for creating more usable SERPs, which together with WaPPU and a special add-on forms
  4. S.O.S., a tool for automatically evaluating and optimizing SERPs;
  5. TellMyRelevance! (TMR), a novel pipeline that predicts the relevance of search results from client-side interactions;
  6. StreamMyRelevance! (SMR), a streaming-based version of TMR that works in real-time rather than batch-wise; and
  7. a set of requirements for current & future search interfaces, which has been derived from an empirical study with German-speaking users.

[Figure: Logo of the SIO Methodology.]

Describing the design and development of the above components and evaluating their effectiveness and feasibility makes up a major part of my thesis. Now, I’ve finally managed to organize all of them as GitHub repos², which I make available through a new website I have created specifically for my PhD project: http://www.maxspeicher.com/phdthesis/. In particular, on that site you can filter the components depending on whether you want to design, evaluate and/or optimize a SERP. It also lists all of the related publications, including links to the corresponding full texts (via ResearchGate). In case you are actually interested in all that fancy research stuff³, have fun browsing, reading & playing around! 🙂

1 The logo of the SIO toolkit features only six tiles because S.O.S. and the catalog of best practices are treated as one component there.
2 Because my PhD project was carried out in cooperation with Unister GmbH (Leipzig), it’s unfortunately not possible for me to provide the source code of all components via GitHub, as some of it contains company secrets.
3 Which I doubt. 😉

Usability-based Split Testing or How to infer web interface usability from user interactions

The continuous evaluation of an e-commerce company’s web applications is crucial for ensuring customer satisfaction and loyalty. Such evaluations are usually performed as split tests, i.e., the comparison of two slightly different versions of the same webpage with respect to a target metric. Usually, metrics that stakeholders are interested in include completed checkout processes, submitted registration forms or visited landing pages. To give just one example, a dating website could present 50% of its users with a blonde woman on the cover page while the other half see a dark-haired one. It is then possible to choose the “better” front page based on the number of registrations it generated, provided you pay attention to the underlying statistics¹.
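
To make "paying attention to the underlying statistics" a bit more concrete, here is a minimal sketch of one common check, a two-sided two-proportion z-test on registration counts. The function name and all numbers are made up for illustration, and a real split test may well call for a different test or for corrections that this sketch ignores.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a / conv_b: number of registrations for version A / B
    n_a / n_b:       number of visitors who saw version A / B
    Returns (z, p_value).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("test is undefined when nobody or everybody converts")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical numbers: the same 10% vs. 15% rates at two sample sizes.
    for visitors in (100, 1000):
        z, p = two_proportion_z_test(visitors // 10, visitors,
                                     visitors * 15 // 100, visitors)
        verdict = "significant" if p < 0.05 else "not significant"
        print(f"{visitors} visitors per version: z={z:.2f}, p={p:.4f} ({verdict})")
```

The same difference in registration rates can be pure noise with a few hundred visitors and a solid result with a few thousand, which is why the raw counts alone are not enough to pick a winner.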

While metrics of this type reflect very well how much money you make, you can’t make well-founded statements about usability based on such numbers (“you don’t know why you get the measured results”)². Thus, in the long term, isn’t it a far better solution to provide your customers with a site they love to use than to confuse them so much that they accidentally buy your products? This calls for the introduction of usability as a target metric in split tests.

[Figure: The WaPPU dashboard.]

We have developed WaPPU, the prototype of a usability-based split testing service. The underlying principle is to track interactions (mouse, scrolling etc.) in both versions of the tested interface. One of the two versions additionally asks users for an explicit rating of its usability by means of a previously developed questionnaire³. WaPPU then takes all of these data and automatically trains models (based on existing machine learning techniques⁴) that are instantly used to predict the usability of the other interface from user interactions alone. This makes it possible to compare the interfaces based on their usability as perceived by users, e.g., “interface A has a usability of 85%, interface B of only 57%”.
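
The principle can be sketched in a few lines. This is not WaPPU's actual code: the k-nearest-neighbours regressor is just a simple stand-in for "existing machine learning techniques", and the feature names, the scaling to the range 0 to 1 and all data points are assumptions made for illustration.

```python
import math
from statistics import mean


def knn_predict(training_rows, features, k=3):
    """Predict a usability rating as the mean rating of the k most similar sessions.

    training_rows: list of (feature_tuple, rating) pairs from the interface
                   that showed the questionnaire
    features:      feature tuple of a session without a rating
    """
    nearest = sorted(training_rows,
                     key=lambda row: math.dist(row[0], features))[:k]
    return mean(rating for _, rating in nearest)


def predicted_usability(training_rows, sessions, k=3):
    """Average predicted rating over all sessions of the unrated interface."""
    return mean(knn_predict(training_rows, s, k) for s in sessions)


# Hypothetical, already normalized features: (page dwell time, scroll distance).
# Ratings are questionnaire results scaled to 0..1.
interface_a = [
    ((0.2, 0.1), 0.9),
    ((0.3, 0.2), 0.8),
    ((0.5, 0.5), 0.6),
    ((0.8, 0.9), 0.3),
    ((0.9, 0.7), 0.4),
]
# Interface B was only tracked, never rated.
interface_b = [(0.85, 0.8), (0.7, 0.75)]

if __name__ == "__main__":
    usability_a = mean(rating for _, rating in interface_a)
    usability_b = predicted_usability(interface_a, interface_b, k=2)
    print(f"interface A: {usability_a:.0%} (rated), "
          f"interface B: {usability_b:.0%} (predicted)")
```

The important design point is that only one interface needs to bother users with a questionnaire; the other one is judged from tracked interactions alone.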

The feasibility of our approach has been evaluated in a split test involving a real-world search engine results page. We were able to train the above-mentioned models, from which we also derived general heuristics for search results pages, such as “better readability is indicated by a lower page dwell time” or “less confusion is indicated by less scrolling”.

[Figure: Usability-based Split Testing paper @ ICWE2014.]

We have described our novel approach and the corresponding evaluation in a full research paper⁵ and an accompanying demo paper⁶. Both will be presented at the 2014 International Conference on Web Engineering (ICWE). The conference proceedings will be published by Springer and the final versions of our papers will be available at link.springer.com.

1 http://www.sitepoint.com/winning-ab-test-results-misleading/
2 http://www.nngroup.com/articles/putting-ab-testing-in-its-place/
3 Maximilian Speicher, Andreas Both and Martin Gaedke (2013). “Towards Metric-based Usability Evaluation of Online Web Interfaces”. In Mensch & Computer Workshopband.
4 http://www.cs.waikato.ac.nz/ml/weka/
5 Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “Ensuring Web Interface Quality through Usability-based Split Testing”. In Proc. ICWE.
6 Maximilian Speicher, Andreas Both and Martin Gaedke (2014). “WaPPU: Usability-based A/B Testing”. In Proc. ICWE (Demos).

StreamMyRelevance! Predicting Search Result Relevance from Streams of Interactions

[Figure: SMR paper @ ICWE2014.]

Guessing the relevance of delivered search results is one of the biggest issues for today’s search engines. The particular problem is that it’s difficult to obtain explicit statements from users about whether they found what they were searching for. Clicks are commonly used to guess relevance (using so-called “click models”), but they are far from being a perfect indicator. In particular, a user might click a search result but then return to the results page because the visited webpage was useless. Also, it’s possible that no clicks happen at all if the desired piece of information is already shown on the results page (e.g., in an info box).
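
A tiny sketch illustrates both problems. It compares a naive click-through rate with a stricter signal that only counts clicks followed by a long stay on the target page. The 30-second threshold, the function name and the log format are assumptions for this example, not part of any click model mentioned here.

```python
from collections import defaultdict

# Assumption: a click only counts as "satisfied" if the user stayed this long.
SATISFIED_DWELL_SECONDS = 30


def relevance_signals(click_log, impressions):
    """Return {result_id: (click_through_rate, satisfied_click_rate)}.

    click_log:   iterable of (result_id, dwell_seconds), one entry per click
    impressions: dict mapping result_id to how often the result was shown
    """
    clicks = defaultdict(int)
    satisfied = defaultdict(int)
    for result_id, dwell_seconds in click_log:
        clicks[result_id] += 1
        if dwell_seconds >= SATISFIED_DWELL_SECONDS:
            satisfied[result_id] += 1
    return {
        result_id: (clicks[result_id] / shown, satisfied[result_id] / shown)
        for result_id, shown in impressions.items()
        if shown > 0
    }


# Hypothetical log: "hotel-a" attracts clicks but users bounce right back,
# "hotel-b" gets fewer clicks but users stay, "info-box" is never clicked
# because the answer is already visible on the results page.
impressions = {"hotel-a": 10, "hotel-b": 10, "info-box": 10}
click_log = [
    ("hotel-a", 4), ("hotel-a", 3), ("hotel-a", 5), ("hotel-a", 2),
    ("hotel-b", 120), ("hotel-b", 95),
]

if __name__ == "__main__":
    for result_id, (ctr, satisfied_rate) in relevance_signals(
            click_log, impressions).items():
        print(f"{result_id}: CTR={ctr:.0%}, satisfied clicks={satisfied_rate:.0%}")
```

Judged by clicks alone, "hotel-a" looks like the most relevant result, and the info box looks irrelevant even though it may have answered the query without a single click. Dwell time fixes the first case but not the second.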

To tackle the above shortcoming, we have investigated the suitability of implicit feedback in the form of mouse cursor interactions for predicting the relevance of search results. For this, we developed StreamMyRelevance!, a system that receives streams of interactions and relevance judgments and trains statistical models from these in near real-time. The models can then be used to infer relevance from interactions in the future. The relevance judgments we’re using to train our models can either be implicit (e.g., a completed booking process in the case of hotel search) or explicit (e.g., statements by paid quality raters/crowdworkers).
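
To give a rough idea of what "training from a stream" means, here is a minimal sketch that updates a model one event at a time instead of collecting a batch first. The online logistic regression is my stand-in for illustration only and says nothing about the models StreamMyRelevance! actually trains. The two features and the synthetic stream are hypothetical as well.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression trained by stochastic gradient descent, one event at a time."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        """Estimated probability that the result is relevant."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, relevant):
        """Single gradient step on one (interaction features, judgment) pair."""
        error = self.predict_proba(features) - relevant
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error


def synthetic_stream(n_events, seed=0):
    """Hypothetical event stream yielding ((hover_share, cursor_travel), relevant).

    Assumption for the demo: results the cursor hovers over for more than
    half of the time are the ones that end in a booking.
    """
    rng = random.Random(seed)
    for _ in range(n_events):
        hover_share = rng.random()
        cursor_travel = rng.random()  # pure noise in this toy setup
        yield (hover_share, cursor_travel), 1 if hover_share > 0.5 else 0


def train_on_stream(model, stream):
    """Consume the stream once; the model is usable after every single event."""
    for features, relevant in stream:
        model.update(features, relevant)
    return model


if __name__ == "__main__":
    model = train_on_stream(OnlineLogisticRegression(2), synthetic_stream(2000))
    for features in [(0.9, 0.5), (0.1, 0.5)]:
        print(features, f"-> P(relevant) = {model.predict_proba(features):.2f}")
```

Because every event triggers only a constant amount of work, a model of this kind can keep up with incoming interactions and never needs the full history in memory.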

An analysis of a large amount of real-world interaction data from two e-commerce portals showed that StreamMyRelevance! is able to train good models that tend to perform better than a state-of-the-art click model solution¹ that is successfully used in industry. Our results particularly underpin the benefit of using interaction data other than clicks for guessing the relevance of search results.

We have summarized the design and evaluation of our system in a full research paper² that will be presented at the 2014 International Conference on Web Engineering (ICWE). The conference proceedings will be published by Springer and the final version of our paper will be available at link.springer.com. Special thanks go to Sebastian Nuck, who helped with the development and evaluation of StreamMyRelevance! in the context of his Master’s Thesis at Leipzig University of Applied Sciences.

1 Chao Liu, Fan Guo, and Christos Faloutsos (2009). “BBM: Bayesian Browsing Model from Petabyte-Scale Data”. In Proc. KDD.
2 Maximilian Speicher, Sebastian Nuck, Andreas Both and Martin Gaedke (2014). “StreamMyRelevance! Prediction of Result Relevance from Real-Time Interactions and its Application to Hotel Search”. In Proc. ICWE.

4 Submissions accepted at International Conference on Web Engineering (ICWE)

At the end of February, I submitted four contributions to the 14th International Conference on Web Engineering: two full papers, one demo and one poster. All four were accepted and will be presented at the conference, which is to be held in Toulouse from July 1 to July 4. In the following, I’ll give a quick overview of the accepted papers. A more detailed explanation of my current research will be the subject of one or two separate articles.

  • Maximilian Speicher, Sebastian Nuck, Andreas Both, Martin Gaedke: “StreamMyRelevance! Prediction of Result Relevance from Real-Time Interactions and its Application to Hotel Search”. This full paper is based on Sebastian Nuck’s Master’s thesis. He developed a system that processes user interactions collected on search results pages in real-time and predicts the relevance of individual search results from them.
  • Maximilian Speicher, Andreas Both, Martin Gaedke: “Ensuring Web Interface Quality through Usability-based Split Testing”. This full paper proposes a new approach to split testing that is based on the actual usability of the investigated web interface rather than pure conversion maximization. We have trained models for predicting usability from user interactions and from these have also derived additional interaction-based heuristics for comparing search results pages.
  • Maximilian Speicher, Andreas Both, Martin Gaedke: “WaPPU: Usability-based A/B Testing”. This demo accompanies our paper about Usability-based Split Testing. The WaPPU tool builds upon this new concept and demonstrates how usability can be predicted from user interactions using automatically learned models.
  • Maximilian Speicher: “Paving the Path to Content-centric and Device-agnostic Web Design”. This poster is based on one of my previous posts. It provides a review of motherfuckingwebsite.com, which satirically claims to be a perfect website. Based on current research, we suggest improvements to the site that follow a strictly content-centric and device-agnostic approach.

My PhD research is supervised by Prof. Dr.-Ing. Martin Gaedke (VSR Research Group, Chemnitz University of Technology) and Dr. Andreas Both (R&D, Unister GmbH) and funded by the ESF and the Free State of Saxony.