Five days ago, on a train traveling home for Christmas, I was thinking about my personal highlights of 2019. While a lot of good things happened in the past 12 months (and I’m not going to talk about private matters here), from a professional point of view, there’s a clear winner: Giving a talk about mixed reality at the ACM Conference on Human Factors in Computing Systems (a.k.a. CHI) in Glasgow.
The talk was based on research I conducted together with friends from the University of Michigan (where I was a post-doc from 2017‒18), Michael Nebeling and Brian Hall. We had noticed that a lot of people we talked to had differing and partly competing understandings of what mixed reality (or MR) is. For instance, some relied on the original definition by Milgram and Kishino from 1994, which defines MR as a continuum (see below), while others adhered to a newer notion pushed by Microsoft, which also applies to experiences that are clearly VR.
Hence, we concluded that—even though it might seem the question What is Mixed Reality? should have a relatively simple answer—it would be worthwhile to discover and investigate all the different notions of mixed reality that are out there. And we were right: the situation wasn’t as simple as you’d think.
What did we find?
As we hypothesized, there is indeed not a single, “best” definition of mixed reality. Instead, we found six distinct and widely used working definitions:
MR according to Milgram et al.’s continuum (see above)
MR as a synonym for AR
MR as a type of collaboration (interaction between AR and VR users that are potentially physically separated)
MR as a combination of AR & VR (a system combining distinct AR and VR parts)
MR as an alignment of environments (e.g., synchronization between a physical and virtual environment)
MR as a “stronger” version of AR (e.g., HoloLens)
These can be classified based on a conceptual framework (some would call it a taxonomy) with seven dimensions:
number of environments
number of users
level of immersion (e.g., not immersive ‒ partly immersive ‒ fully immersive)
level of virtuality (e.g., not virtual ‒ partly virtual ‒ fully virtual)
degree of interaction (e.g., implicit ‒ explicit)
input (e.g., motion, location)
output (e.g., visual, audio)
I have also distilled our findings into an infographic.
How did we do it?
To discover the six working definitions as well as the seven dimensions of the conceptual framework, we conducted expert interviews that were augmented (clever wordplay, huh?) by an extensive literature review. First, we interviewed a total of ten experts working on augmented and/or virtual reality, from both academia and industry (occupations ranged from professor to R&D executive to CEO of an AR company). These interviews yielded a preliminary set of four working definitions. Subsequently, we reviewed a total of 68 sources, mainly from the CHI, CHI PLAY, UIST, and ISMAR conferences from 2014‒18 (inclusive). These confirmed the four preliminary notions, while we also discovered two more that were added to the set.
Ultimately, we derived the conceptual framework by identifying the minimum number of dimensions that still allowed us to classify all of the working definitions unambiguously.
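The idea of finding a minimal set of dimensions that still keeps all definitions distinguishable can be sketched as a small brute-force search. This is a purely illustrative Python sketch with made-up toy data, not the actual procedure or data from our paper:

```python
from itertools import combinations

def minimal_distinguishing_dims(items, dims):
    """Smallest subset of dims under which all items remain pairwise distinct."""
    for size in range(1, len(dims) + 1):
        for subset in combinations(dims, size):
            # Project each item onto the candidate dimensions
            projections = {tuple(item[d] for d in subset) for item in items}
            if len(projections) == len(items):
                return subset  # all items still unambiguous
    return tuple(dims)

# Toy example: three made-up "definitions" described by three dimensions
defs = [
    {"environments": 1, "users": 1, "virtuality": "partial"},
    {"environments": 2, "users": 1, "virtuality": "partial"},
    {"environments": 1, "users": 2, "virtuality": "full"},
]
print(minimal_distinguishing_dims(defs, ["environments", "users", "virtuality"]))
# → ('environments', 'users')
```

No single dimension separates all three toy entries, but the pair (environments, users) does, so the search stops there.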
Example: Pokémon GO
To give just one example (from our paper), let’s have a look at how Pokémon GO would fit into the conceptual framework. First of all, the viral game constitutes MR according to notion № 4: a combination of AR and VR in a single system.
It comprises one environment since everything happens on the same device.
It can be played by one user on that device.
The level of immersion lies between not immersive and partly immersive.
The level of virtuality lies between partly virtual (the game’s AR view) and fully virtual (the game’s map view).
Interaction is implicit (the player moves in the real world, all explicit interaction happens via a HUD).
It uses the user’s geolocation as input and provides visual and auditory output.
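To make the classification concrete, the seven dimensions can be written down as a simple data structure. This is a minimal Python sketch of my own; the class name, field names, and value encodings are shorthand I chose for illustration, not notation from our paper:

```python
from dataclasses import dataclass

@dataclass
class MRClassification:
    """One system classified along the framework's seven dimensions."""
    environments: int   # number of environments
    users: int          # number of users
    immersion: str      # e.g., "none", "partial", "full", or a range
    virtuality: str     # e.g., "none", "partial", "full", or a range
    interaction: str    # "implicit" or "explicit"
    inputs: tuple       # e.g., ("motion", "location")
    outputs: tuple      # e.g., ("visual", "audio")

# Pokémon GO as classified in the text above
pokemon_go = MRClassification(
    environments=1,
    users=1,
    immersion="none-to-partial",
    virtuality="partial-to-full",
    interaction="implicit",
    inputs=("location",),
    outputs=("visual", "audio"),
)
print(pokemon_go.interaction)  # → implicit
```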
Now, why is this important? Mixed reality is a trending topic. Many people are talking about it nowadays and the number of papers, research artifacts, hardware, and apps is steadily increasing. MR has the potential to become omnipresent in our everyday lives. Therefore, it is important to put one’s words into context. With our research, we hope to provide researchers, students, and professionals with a tool that lets them better communicate what they mean when talking about MR, and to reduce misunderstandings in a rapidly evolving field. We are also proud that our paper received an 🏅 Honorable Mention Award, which stresses the importance of the question at hand.
What is Web 3.0? That’s a good question! And I’m pretty sure I won’t be able to answer it in this essay. Yet, I’ll try my very best to get closer to an answer. There exist several definitions of Web 3.0, none of which can be considered definite. A very general one describes it as “an extension of Web 2.0,” which is of limited helpfulness. Also, I’ve heard some call the Semantic Web “Web 3.0,” while Nova Spivack as well as Tim Berners-Lee see it only as one part of Web 3.0. Interestingly, what has been neglected in most discussions about Web 3.0 so far is augmented (AR) and virtual reality (VR), or 3D in general. It seems like this could be worth a closer look. Although both AR and VR have been connected to Web 3.0 separately, they should rather be seen as integral parts of the overall concept, in addition to the Semantic Web. In the following, I describe why 3D — and AR/VR in particular — is beyond the Web 2.0; why current trends in web technology show that we are entering the Web 3.0 at high speed right now; and what will change for us, the designers, developers, architects etc.
Where are we coming from?
To be able to put Web 3.0 in relation to what we’ve seen so far, let’s have a brief look at the beginnings first.
As web technologies evolved, websites became less static, looking more and more like desktop applications. Soon, users could connect via social networks (Facebook’s like button is ubiquitous nowadays) and watch videos online. YouTube videos, tweets, and the like became discrete content entities (i.e., detached from a particular webpage) that could now be easily embedded anywhere. For instance, WordPress by default features specific shortcodes for these two. Data, rather than the underlying technology, became the center of the web (cf. “What Is Web 2.0” by Tim O’Reilly), which in particular led to an increasing number of mash-ups. Through templating, e.g., by using WordPress, it became increasingly easy for everyone to create a sophisticated website. Also, the proliferation of mobile and small-screen devices with touch screens led to the advent of responsive and adaptive websites as well as completely new kinds of interaction and corresponding user interfaces. Rather than by technologies, the Web 2.0 was and is defined by social interactions, new types of (mashable) content, and a stronger focus on user experience, among other things (cf. “Social Web” by Anja Ebersbach et al.). Yet, content was as flat as before. That’s the web today’s average user knows.
Now that we’ve seen where we come from, let’s elaborate on why 3D is a major part of Web 3.0.
Virtual and augmented reality
Neither VR nor AR is the Web 3.0 (as has been stated by some). Still, they are an important part of the bigger picture. Since Google introduced their Cardboard at the I/O 2014, consuming VR has become affordable and feasible for average users. Another similar device heavily pushed right now is the Gear VR. Yet, despite the introduction of 360° video support by YouTube and Facebook, as of today, corresponding content is still rather limited compared to the overall number of websites. This will change with the growing popularity of devices such as 360° cameras, which allow you to capture 360° videos and photospheres (like in Google Street View) with just one click. Such 360° images can then be combined into, e.g., virtual tours using dedicated web platforms such as Roundme, YouVisit, and HoloBuilder. In this way, the average user can also create their own VR content that can be consumed by anyone, in particular through their Cardboards or other head-mounted displays (HMDs). Hence, the amount of available VR content will grow rapidly in the near future.
I personally like to refer to the type of VR content created from 360° images and consumed through HMDs as “Holos,” so let’s stick to that naming convention for now. Just like YouTube videos and tweets, Holos are discrete content entities. That is, technically speaking, all of them are simply iframes, but they denote completely different kinds of content on a higher level of abstraction. Particularly, unlike plain YouTube videos and tweets, Holos add a third spatial dimension to the web content that is consumed by the user. That is, they move the web from 2D to 3D, the enabling technologies being WebGL, Three.js, and Unity. Another example of this evolution is Sketchfab, which brings high-end 3D models to the web and has been described as “the YouTube for 3D content.” Contrary to VR, AR has not yet reached the same status regarding affordability and feasibility for average users. This is due to the fact that AR can’t simply be created and consumed in a web browser. Currently, AR applications are of more interest in Industry 4.0 contexts. However, I’m sure that once VR has hit the mainstream, the complexity of AR will decrease and develop in the same direction. Already now, platforms like HoloBuilder offer the possibility to also create AR content in the browser, which can then be consumed using a dedicated Android or iOS app.
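Since a Holo is, technically, just an iframe, embedding one anywhere on the web could look roughly like the snippet this little Python helper produces. The URL scheme is invented for the example and is not a real endpoint:

```python
def holo_embed(project_id, width=640, height=360):
    """Build an iframe snippet for a (hypothetical) Holo URL."""
    # Placeholder URL scheme, made up for illustration
    src = f"https://example.com/holo/{project_id}"
    return (f'<iframe src="{src}" width="{width}" height="{height}" '
            f'allowfullscreen></iframe>')

print(holo_embed("apartment-tour"))
```

Just like a YouTube embed code, the resulting snippet is self-contained and can be pasted into any webpage, which is what makes Holos discrete content entities.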
With the introduction of the third dimension in web content, the necessary interactions also change significantly. So far, we’ve had traditional interaction using mouse and keyboard and the touch interaction we know from smartphones and tablet PCs. Now, when consuming Web 3.0 content through our Cardboard, we face a novel, hands-free kind of interaction since we cannot touch the screen of the inserted phone. Instead, “clickable” objects need to be activated using, e.g., some kind of crosshair that is controlled via head movements (notice the little dot right below the sharks in the picture above). Another scenario (of the seemingly thousands that can be thought of) could be content consumed through a Gear VR while controlling it with a smart watch. Also, smart glasses and voice recognition — and more natural user interfaces in general — will become a thing. This calls for completely new and probably radical approaches towards usability, UX, interface, and interaction design that move further and further away from what average users were used to 15 or even only five years ago. All of this will aim at providing an experience that’s as immersive as possible for the user.
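The crosshair idea boils down to simple vector math: an object counts as “gazed at” when the angle between the view direction (controlled by head movement) and the direction to the object falls below some threshold. Here is a purely illustrative Python sketch of that check; the 5° threshold and function names are my own choices, not taken from any real VR framework:

```python
import math

def angle_between(v1, v2):
    """Angle in radians between two 3D vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to avoid domain errors from floating-point rounding
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

def gaze_hit(view_dir, obj_pos, camera_pos=(0, 0, 0), threshold_deg=5.0):
    """True if the crosshair (center of view) rests on the object."""
    to_obj = tuple(o - c for o, c in zip(obj_pos, camera_pos))
    return math.degrees(angle_between(view_dir, to_obj)) <= threshold_deg

# Looking straight ahead (negative z, as in typical WebGL conventions):
print(gaze_hit((0, 0, -1), (0.0, 0.1, -10.0)))  # → True (object ~0.6° off-center)
print(gaze_hit((0, 0, -1), (3.0, 0.0, -10.0)))  # → False (object ~16.7° off-center)
```

In a real application, such a hit test would be combined with a dwell timer, so that resting the crosshair on an object for, say, two seconds triggers the “click.”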
Finally, what I also consider to already be a part of Web 3.0 is Google’s Material Design language. This is because, just like AR and VR, it aims at extending Web 2.0 beyond the second dimension. Although the outcome is clearly not 3D content in the sense of AR and VR as described above, Material Design puts a strong focus on layers and shadows. Hence, it introduces what I like to call 2½D.
Where are we going?
To summarize, the specific properties of and differences between Web 1.0, 2.0, and 3.0 are given in the following rough overview1:

|                       | Web 1.0                  | Web 2.0                                                | Web 3.0                                       |
|-----------------------|--------------------------|--------------------------------------------------------|-----------------------------------------------|
| Devices               | Desktop PC               | Smartphone, tablet PC                                  | Smart glasses, Google Cardboard, Gear VR      |
| Interaction           | Mouse, keyboard          | Touch                                                  | Hands-free, head movement, voice, smart watch |
| Technologies          | HTML, CSS                | HTML5, CSS3, Ajax, jQuery, Node.js                     | WebGL, Three.js, Unity, Material Design       |
| Content               | Webpages, text, images   | YouTube videos, tweets, (blog) posts etc.              | Photospheres, 360° videos, 3D models, Holos   |
| Defined by / focus on | Technology, static pages | Data, social interaction, mash-ups, UX, responsiveness | 3D content, immersion, natural interaction    |
AR and VR—or 3D in general—will become the predominant kind of content created and consumed by users, taking the place of the plain content we’ve been used to so far. For instance, think of the personal portfolio of a painter. In the Web 1.0, it was a hand-crafted website created with Microsoft FrontPage. In the Web 2.0, it’s a WordPress page featuring a premium theme specifically designed as a showcase for paintings. Also, the painter has a dedicated Facebook page to connect with their fans. In the Web 3.0, the personal portfolio will be a walkthrough of a virtual 3D arts gallery, with the paintings virtually hanging on the walls. That walkthrough can be consumed using a web browser, either on a PC, on a tablet, on a smartphone, or through Google Cardboard. Therefore, everyone involved in creating websites and web applications will face new challenges: from presenting information in 3D to designing completely novel kinds of interactions to having to consider a wide variety of VR devices and so on. The very underlying look and feel of the web—for both creators and consumers—will change drastically.
In analogy to the two-dimensional Web 2.0, Web 3.0 is the perfect metaphor for the three-dimensional web that is currently evolving. Besides the development towards interconnectedness, IoT, linked data, and the Semantic Web, the fact that we are moving away from the webpage paradigm (cf. “Atomic Design” by Brad Frost) and into the third dimension is one of the major indicators that we are on the verge of experiencing the Web 3.0. And I for my part find it really exciting.
1 This table raises no claims to completeness. Particularly, for the sake of simplicity, I omit the properties of Web 3.0 not connected to AR and VR.
At this year’s INFORMATIK conference held by the GI in Cottbus, I had the chance to present a research paper (full text here) about HoloBuilder—officially titled “Enabling Industry 4.0 with holobuilder”1—that I wrote together with my colleagues Kristina Tenhaft, Simon Heinen and Harry Handorf. In our paper, we examine HoloBuilder from a research rather than a marketing perspective by explaining and demonstrating how it acts as an enabler for Industry 4.0.
The paper was presented in the session named “Industry 4.0: Computer Science Forms New Production Systems”, which featured a selection of renowned experts for Industry 4.0—including Prof. Dr.-Ing. Peter Liggesmeyer of TU Kaiserslautern, Prof. Dr. Jürgen Jasperneite of OWL University and Prof. Dr.-Ing. Jörg Wollert of Aachen University of Applied Sciences, among others. The presenters set a particular focus on topics such as Internet of Things, smart factories, wireless communication and OPC UA, with which our presentation fitted in seamlessly—as will be explained in the following. The feedback we received was consistently positive.
Industry 4.0 was the original use case of our platform, i.e., the use case based on which the first prototypes had been created. From those, the current form of HoloBuilder evolved. The term Industry 4.0 was first coined in the context of the High-Tech Strategy 2020 of the German government. Basically, the smart factory, in which people, machines and products are ubiquitously interconnected, is at the center of Industry 4.0.2 Particular focus is moreover on cyber-physical systems, which merge the virtual and the real world.
HoloBuilder & Industry 4.0
From the technical perspective, implementing Industry 4.0 to a high degree means realizing the smart factory including cyber-physical systems. For this, two prime concepts to consider are Augmented Reality and machine-to-machine communication. Augmented Reality (AR) adds virtual objects to the real world in a see-through scenario, e.g., with smart glasses or a tablet PC. On the one hand, AR provides a “fusion of the physical and the virtual world”3 and thus forms a framework for cyber-physical systems while on the other hand it facilitates efficient human–machine interfaces. Yet, AR alone cannot realize a smart factory, because it only caters for displaying objects, which is a form of one-way communication. Hence, AR needs to be complemented with capabilities for machine-to-machine communication (M2M).
To enable the implementation of Industry 4.0, HoloBuilder has been designed as a platform that makes it possible for everyone concerned to create and consume arbitrary AR content. This is a particular advantage over other AR solutions, which require specific skills for creating the desired content, among other things. In contrast, HoloBuilder facilitates end-user design, which enables, e.g., engineers and mechanics without programming skills to create AR applications in the context of Industry 4.0. To also cater for M2M, the platform additionally incorporates OPC UA, a standardized communication protocol. In this way, information provided by a machine (e.g., its current temperature) can be presented in terms of virtual objects in an AR scenario. Moreover, by manipulating such virtual objects, the user can also give commands to the machine via OPC UA. This makes it possible to, e.g., display a virtual button that can switch a machine on or off.
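This bidirectional flow can be sketched roughly as follows. Note that this is a hypothetical Python illustration: `MachineNode` and `ARButton` are invented stand-ins for this post, not the actual OPC UA or HoloBuilder API.

```python
class MachineNode:
    """Hypothetical stand-in for a machine exposed via an OPC UA-like server."""
    def __init__(self):
        self.temperature = 42.0
        self.running = False

    def read(self, attribute):
        return getattr(self, attribute)

    def write(self, attribute, value):
        setattr(self, attribute, value)

class ARButton:
    """A virtual on/off button overlaid on the machine in the AR view."""
    def __init__(self, node):
        self.node = node

    def label(self):
        # Machine → AR: display live machine data as a virtual object
        return f"{self.node.read('temperature'):.1f} °C"

    def press(self):
        # AR → machine: the virtual object sends a command back
        self.node.write("running", not self.node.read("running"))

machine = MachineNode()
button = ARButton(machine)
print(button.label())   # → 42.0 °C
button.press()
print(machine.running)  # → True
```

The point of the sketch is the direction of the arrows: reading turns machine state into AR content, while pressing the virtual button writes a command back to the machine, which is exactly the two-way communication that distinguishes this setup from display-only AR.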
Hermann et al.4 define six design principles for Industry 4.0, upon which we build to show HoloBuilder’s potential for being an enabler of Industry 4.0:
Interoperability,
Virtualization,
Decentralization,
Real-Time Capability,
Service Orientation and
Modularity.
To summarize the above, Augmented Reality and machine-to-machine communication are two core principles to be considered when implementing Industry 4.0 in terms of a smart factory with cyber-physical systems. HoloBuilder, a platform for end-user design of arbitrary AR content, provides support for both. Our platform moreover fulfills all of the six design principles for Industry 4.0, which underpins HoloBuilder’s potential as an enabler.
Our paper has been published in the proceedings of the 2015 INFORMATIK conference and is also available via ResearchGate (including full text).
1 At the time the paper was accepted, we still had the company-internal convention to write HoloBuilder in lowercase letters, which has changed by now.
2 http://www.plattform-i40.de/
3 Kagermann, Henning: Chancen von Industrie 4.0 nutzen [Taking the Chances of Industry 4.0]. In (Bauernhansl, Thomas; ten Hompel, Michael; Vogel-Heuser, Birgit, eds.): Industrie 4.0 in Produktion, Automatisierung und Logistik [Industry 4.0 in Production, Automation and Logistics], pp. 603–614. Springer, 2014.
4 Hermann, Mario; Pentek, Tobias; Otto, Boris: Design Principles for Industrie 4.0 Scenarios: A Literature Review. Working Paper No. 01/2015, Audi Stiftungslehrstuhl Supply Net Order Management, TU Dortmund, 2015.
For about 5 weeks now, I’ve been working in my new job at bitstars (an augmented/virtual reality start-up based in Aachen) and so far have been mainly involved in the development of our new platform HoloBuilder.
What is HoloBuilder?
HoloBuilder lets you create what we call 3D presentations, or 360° presentations if photospheres are involved (see example below). That is, the user can create a set of “slides”, but unlike in, say, Microsoft PowerPoint, these “slides” are three-dimensional (which makes them more like rooms). Such a room can be filled with arbitrary 2D and 3D objects, different kinds of texts, and even 360° photos. A straightforward use case would be to virtually furnish your new apartment: take a 360° photo of every room, add 3D models of your desired furniture, and finally interlink the rooms, thus creating a virtual tour of the apartment. This is similar to the presentation linked below, which has been created by one of our student assistants.
We need your help!
As we are a relatively young start-up, we do not have a large user base yet, but need any feedback we can get. Therefore, at this point, I’d like to ask all of you to check out the alpha version of HoloBuilder and share your impressions with me. Try it out, send me your presentations, think of possible use cases or explain why you can’t think of any, tell me why you would or would not use HoloBuilder … Just send me anything that comes to your mind while using or just looking at our platform. Every little piece of information is extremely valuable for us and will support us in developing a great product.
Go to holobuilder.com or simply click the screenshot below to open the example project. You can use the comments section on this page for your feedback or send a tweet to @maxspeicher.