The idea of enhancing human perception through computer-mediated reality dates back to the
1960s [1]. Since then, the concept has undergone numerous transformations, producing a
proliferation of terms that can be confusing [2]. Extended Reality (XR) serves
as a broad term that covers a range of immersive technologies such as virtual reality (VR),
augmented reality (AR), and mixed reality (MR) along with several input mechanisms for
interactions. XR denotes a spectrum of experiences that blur the boundaries between the physical
world and virtual environments [3]. This spectrum was conceptualized by Milgram within the
reality–virtuality continuum, which encompasses all possible variations of real and virtual
objects [4,5]. Figure 1 illustrates the relationship between different XR technologies, depicting
the transition from the real environment to the virtual environment.
Figure 1. Relation between XR technologies, input mechanisms, and environment
While VR immerses the user in fully virtual environments, AR overlays relevant virtual elements onto the observed reality. The XR spectrum also includes MR, Augmented Virtuality (AV), and Diminished Reality, each offering distinct experiences [6,7,8]. The prevailing trend, however, indicates a shift from VR headsets to MR headsets, exemplified by recent devices such as the Meta Quest 3 and Meta Quest Pro.
AR is a technology designed to enhance the user's visual field by overlaying information relevant to the current task [9,10]. It exhibits several key properties: the integration of real and augmented objects, real-time interactivity, and accurate alignment of real and augmented elements [11]. These dynamics complement human associative information processing and memory, facilitating a seamless transition between reality and augmentation [12]. AR finds extensive applications in remote guidance, instructional visualization for complex assembly, disruptive tasks, and training within existing environments [13]. AR can be divided into three types: hand-held devices (smartphones or tablets), head-worn devices (glasses), and spatial systems (projectors/holograms) [14]. AR essentially enriches reality with computer-generated content, commonly accessed through smartphones, tablets, or glasses [15]. This digital content overlays the user's real-world view, providing a blend of real and virtual experiences that can include 3D stereoscopic or 2D imagery [17]. The essence of AR lies in its ability to deliver real-time information or data within the context of the user's immediate surroundings, enhancing situational awareness and facilitating informed decision-making [15]. Table 1 showcases the most recent AR devices, detailing their specifications and offering a thorough comparative analysis.
Table 1. Specifications of latest AR devices
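The registration property above can be made concrete with a minimal sketch of a per-frame AR loop: estimate the camera pose, transform the virtual content into the camera view, and project it onto the image for compositing. The function names and the identity-pose tracker below are illustrative placeholders, not the API of any particular AR framework.

```python
import numpy as np

def estimate_camera_pose(frame):
    """Placeholder tracker (visual-inertial odometry would go here).
    Returns a 4x4 camera-to-world transform; identity for this sketch."""
    return np.eye(4)

def project(points_world, pose, intrinsics):
    """Project 3D world points into image pixels with a pinhole model."""
    world_to_cam = np.linalg.inv(pose)
    pts_cam = (world_to_cam @ np.c_[points_world, np.ones(len(points_world))].T)[:3]
    pts_img = intrinsics @ pts_cam
    return (pts_img[:2] / pts_img[2]).T

def ar_frame(frame, virtual_points, intrinsics):
    """One conceptual AR iteration: track, align, overlay."""
    pose = estimate_camera_pose(frame)                # real-time tracking
    return project(virtual_points, pose, intrinsics)  # registration step

# A virtual point 2 m in front of the camera lands at the image centre.
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1.0]])
print(ar_frame(None, np.array([[0.0, 0.0, 2.0]]), K))  # [[320. 240.]]
```

A renderer would then draw the projected content over the live camera image, completing the real-virtual blend described above.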
AV represents a higher level of virtual experience compared to augmented reality, primarily used for visualizing new products and procedures such as picking processes. In AV, a significant portion of the elements involved are synthetic in nature [9,24]. MR is characterized by the blending of both real and virtual environments, where each augments the other [9,25]. MR closely resembles AR but offers a more immersive experience through increased interaction between virtual and real-world elements, enhancing the realism for users [26]. Unlike AR, in MR, users not only see computer-generated content integrated with reality but also interact with it, creating a seamless fusion between virtual and physical interactions. This is exemplified in interactive heads-up displays from science fiction, where users can touch virtual content and perceive its scale as if it were real [15]. To achieve the MR experience, specialized headsets equipped with integrated computers, translucent glass, and sensors are required. These systems map the real-world environment in real-time using integrated sensors, enabling virtual objects to interact with the physical environment and users seamlessly. In essence, MR offers a more immersive and interactive version of AR [2]. Table 2 features notable examples of MR headsets widely utilized in various reported MR applications, along with their specifications and a comparative analysis.
Table 2. Specifications of widely used and latest MR headsets
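The real-time environment mapping described above can be sketched as follows: the headset's spatial-mapping system is assumed to report reconstructed surfaces as planes, and a virtual object is anchored by projecting it onto the nearest one. The `SurfacePlane` class and helper below are illustrative assumptions, not the interface of any specific MR headset.

```python
import numpy as np

class SurfacePlane:
    """A mapped surface reported by a hypothetical spatial-mapping
    system: a point on the plane plus its unit normal."""
    def __init__(self, point, normal):
        self.point = np.asarray(point, dtype=float)
        self.normal = np.asarray(normal, dtype=float)

    def signed_distance(self, p):
        return float(np.dot(p - self.point, self.normal))

def place_on_nearest_plane(obj_pos, planes):
    """Anchor a virtual object to the closest mapped surface by
    projecting its position onto that plane."""
    obj_pos = np.asarray(obj_pos, dtype=float)
    nearest = min(planes, key=lambda pl: abs(pl.signed_distance(obj_pos)))
    return obj_pos - nearest.signed_distance(obj_pos) * nearest.normal

# Example: a floor and a table top reconstructed from sensor data.
floor = SurfacePlane(point=[0, 0, 0], normal=[0, 1, 0])
table = SurfacePlane(point=[0, 0.75, 0], normal=[0, 1, 0])
print(place_on_nearest_plane([0.2, 0.8, -1.0], [floor, table]))
# -> [ 0.2   0.75 -1.  ]  (the object snaps onto the table top)
```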
VR refers to the utilization of real-time digital computers, along with specialized hardware and software, to create simulations of alternate worlds or environments that are convincingly realistic to users [34,35]. Positioned at the far end of the reality–virtuality continuum, VR systems generate entirely computer-generated content, immersing users fully in virtual environments, devoid of interaction with the real world. The immersive nature and high level of presence in VR systems offer significant flexibility for exploring "what-if" scenarios [22]. VR immerses users in a synthetic environment that replicates real-world properties using various technologies such as high-resolution displays, high refresh rates, head-mounted displays, stereo headphones, and motion-tracking systems [36,37].
In VR, users become completely absorbed in a virtual world, unable to perceive the real world around them. This technology enables users to step into a three-dimensional world that they can explore, interact with, and move around in as if it were real [38,39]. VR finds applications across different phases of manufacturing, commonly used for product development, marketing, training, ergonomics [40], and visualizing digital factories during the design phase, whether for greenfield or brownfield applications [41]. When using VR, users don glasses, earphones, and other devices to immerse themselves in a simulated environment, shutting out the real world entirely.
Everything within VR environments, including the user's avatar, is virtual content. These environments can consist of 360-degree photos, 360-degree videos, or n-D models, capturing scenarios or situations for users to experience firsthand. Interactivity is a key feature of VR, allowing users to interact with objects, props, or the environment itself. Typically, VR experiences are delivered through enclosed head-mounted displays with surround audio, providing a fully immersive sensory experience [42,43]. VR systems typically come in three setups [22].
Table 3 presents prominent examples of VR headsets extensively utilized in various reported VR
applications. The table includes their specifications and provides a comprehensive comparative
analysis.
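The head-tracked immersion described above rests on a simple per-frame contract that HMD-based systems fulfil: read the tracked head pose, derive one view per eye offset by half the interpupillary distance (IPD), and render both views at the display's refresh rate. The sketch below illustrates that loop with placeholder names; it is not the API of any particular VR runtime.

```python
import numpy as np

IPD = 0.063  # interpupillary distance in metres (a typical adult value)

def eye_poses(head_pose):
    """Derive left/right eye poses from a 4x4 head transform by
    shifting half the IPD along the head's local x-axis."""
    right_axis = head_pose[:3, 0]
    poses = []
    for offset in (-IPD / 2, +IPD / 2):
        eye = head_pose.copy()
        eye[:3, 3] += offset * right_axis
        poses.append(eye)
    return poses

def render_frame(head_pose, draw_scene):
    """One iteration of the stereo loop: one rendered view per eye."""
    return [draw_scene(eye) for eye in eye_poses(head_pose)]

# Stand-in renderer that just reports each eye's position.
views = render_frame(np.eye(4), lambda eye: eye[:3, 3].tolist())
print(views)  # [[-0.0315, 0.0, 0.0], [0.0315, 0.0, 0.0]]
```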
The term "Haptics" was introduced by Revesz in 1950, inspired by observations of blind individuals' performance. It denotes an unconventional sensory experience distinct from traditional touch and kinesthesis. Specifically, haptics refers to "active touch" rather than passive touching [51]. Human haptic perception encompasses both kinesthetic and cutaneous (tactile) feedback. Kinesthetic feedback involves sensing the position and movement of one's body, mediated by receptors located in the skin, joints, skeletal muscles, and tendons [52]. Conversely, cutaneous feedback relates to stimuli detected by low-threshold mechanoreceptors beneath the skin within the contact area [53]. Devices designed to stimulate kinesthesia are typically grounded, bulky, mechanically complex, and expensive, with a limited workspace. Traditionally, these kinesthetic devices can provide distinct forces or torques to move the user's hand or resist motion, offering high-quality interactions but facing constraints in terms of cost and portability [54]. To circumvent the limitations associated with grounded kinesthetic devices, haptic feedback can be delivered through cutaneous devices. Although cutaneous feedback can theoretically be provided for the entire body, it is predominantly delivered through fingertips, which are commonly utilized for grasping and manipulation and are rich in mechanoreceptors [55]. Research has demonstrated that, to some extent, it is feasible to compensate for the absence of kinesthesia using the modulated cutaneous force technique without experiencing significant performance degradation [56,57].
Despite its significance, the haptic sense remains relatively underexplored compared to sight and hearing [58]. The prevailing trend in haptic device development involves bringing the base of these devices closer to the area of stimulation. This shift entails transitioning from grounded devices, which cannot be worn on parts of the user's body, to handheld devices, and progressing further towards the development of partially and fully wearable devices [59]. Accordingly, haptic systems could be classified by wearability level into four categories, as depicted in Figure 5.
Grounded haptic devices, also referred to as "tabletop" haptic devices, are those that cannot be worn on any part of the user's body due to their size and/or functional features, such as the presence of air reservoirs or compressors. As a result, the workspace of such devices is typically limited. Grounded haptic systems can be further categorized into two types: graspable and touchable devices (as depicted in Figure 5a). Because grounded devices are not as constrained in terms of size and weight compared to hand-held and wearable devices, they often employ pneumatic actuation, utilizing bulky reservoirs and pumps, or magnetic actuation, which involves platforms and large electric coils [40].
Hand-held devices, as outlined in Figure 5b, are devices that can be held in the hands without the need for straps. Compared to grounded devices, they are typically lighter, impose fewer constraints on movement, and offer a larger workspace. However, they cannot be worn, thus limiting complete freedom of movement. Hand-held devices can render kinesthetic feedback, tactile feedback, or both simultaneously. These controllers significantly enhance the user experience while held in the hand during operation. Traditional controllers commonly provide vibrotactile feedback to emphasize certain events occurring on the screen [40].
The most common types of wearable haptic devices are haptic gloves and exoskeleton systems, often referred to simply as exoskeletons. The primary distinction between them is that not all haptic gloves have an exoskeletal structure and, conversely, not all exoskeletal systems are designed in the form of gloves. These devices are primarily intended to provide kinesthetic haptic feedback while being worn directly on the user's body [42]. Table 4 showcases notable examples of haptic gloves widely utilized in various reported XR applications. The table includes detailed specifications for each glove and offers a comprehensive comparative analysis.
Haptic suits are wearable devices tailored to provide haptic feedback to the upper body as users interact with different elements in the virtual environment [60,61,62]. Most commercially available haptic suits employ vibration actuators strategically positioned throughout the torso and arms, operating in predetermined sequences. This configuration enables users to experience haptic feedback resembling various sensations, such as the impact of a bullet or a fist [63]. Haptic suits have been employed in movies [64], military training simulations [65,66], and games [43,44,67].
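The predetermined actuator sequences mentioned above can be sketched as timed patterns over a grid of vibration motors. The 4x5 chest layout and the "impact" pattern below are illustrative assumptions, not the SDK of any commercial suit.

```python
import time

def play_pattern(pattern, fire):
    """Run a timed sequence; each step is (delay_s, actuators, intensity)."""
    for delay, actuators, intensity in pattern:
        time.sleep(delay)
        for cell in actuators:
            fire(cell, intensity)

# A bullet-like impact: one point at full power, then a fading ripple
# over neighbouring actuators, addressed as (row, column) on the torso.
impact = [
    (0.00, [(2, 2)], 1.0),                          # point of impact
    (0.05, [(1, 2), (3, 2), (2, 1), (2, 3)], 0.5),  # first ring
    (0.05, [(0, 2), (2, 0), (2, 4)], 0.2),          # fading outer ring
]

# Stand-in driver that logs; a real suit would pulse the motor here.
play_pattern(impact, lambda cell, level: print(f"actuator {cell} at {level:.0%}"))
```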
In the realm of advanced medical technology, the 5D concept revolutionizes patient care by seamlessly integrating multiple dimensions with the power of extended reality (XR) capabilities. First, we consider the 3D aspect: the spatial coordinates X, Y, and Z, capturing precise location and depth within the operating room (OR). This forms the foundation for detailed anatomical mapping and spatial orientation.
Moving into the 4D domain, we add the critical element of time, enabling real-time movements through volumetric video streaming. This allows surgeons to visualize and interact with dynamic, three-dimensional images of patient anatomy as they operate, providing a live, immersive view of the surgical site. The incorporation of XR, including augmented reality (AR) and virtual reality (VR), enhances this experience by overlaying vital information directly onto the surgeon's field of view, ensuring they have immediate access to critical data without diverting their attention.
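The 4D idea, 3D geometry plus time, can be made concrete with a minimal data model: a stream of timestamped volumetric frames from which the viewer selects the frame matching the playback clock. The `VolumetricFrame` structure below is an illustrative assumption, not a specific volumetric streaming format.

```python
from dataclasses import dataclass, field
import bisect

@dataclass
class VolumetricFrame:
    """One captured 3D snapshot: a timestamp (s) and its geometry payload."""
    t: float
    points: list  # e.g., a point cloud or mesh; kept abstract here

@dataclass
class VolumetricStream:
    frames: list = field(default_factory=list)

    def add(self, frame):
        self.frames.append(frame)  # assumes frames arrive in capture order

    def at(self, t):
        """Return the latest frame at or before playback time t."""
        i = bisect.bisect_right([f.t for f in self.frames], t)
        return self.frames[i - 1] if i else None

stream = VolumetricStream()
for t in (0.00, 0.04, 0.08):  # a 25 fps capture
    stream.add(VolumetricFrame(t=t, points=[]))
print(stream.at(0.05).t)  # -> 0.04: the frame shown 50 ms into playback
```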
The 5D paradigm further enhances this by incorporating context—an amalgamation of information, assets, and AI-analytics. This contextual layer includes comprehensive patient data, AI-driven insights, and additional resources such as medical histories, imaging results, and surgical plans. XR capabilities enable these data points to be seamlessly integrated and visualized within the surgeon’s XR headset or AR display. For instance, AR can project a patient’s medical history and real-time analytics onto a virtual screen in the surgeon’s view, allowing for quick reference and enhanced decision-making.
AI-analytics play a crucial role in this 5D environment by providing predictive analytics and real-time decision support. These AI tools analyze patient data, monitor vital signs, and predict potential complications, offering suggestions and alerts that are contextually relevant to the ongoing procedure. Surgeons can receive these insights directly through their XR devices, ensuring they are constantly informed and able to make data-driven decisions on the fly.
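A minimal sketch of such contextually relevant alerting: incoming vital-sign samples are checked against procedure-specific bounds, and only out-of-range readings are surfaced in the surgeon's XR view. The field names and thresholds below are illustrative assumptions, not clinical guidance.

```python
# Bounds a hypothetical surgical-planning step might supply.
BOUNDS = {
    "heart_rate_bpm": (50, 120),
    "spo2_pct": (92, 100),
    "map_mmhg": (65, 110),  # mean arterial pressure
}

def check_vitals(sample, bounds=BOUNDS):
    """Return an alert string for each vital sign outside its range."""
    alerts = []
    for name, value in sample.items():
        lo, hi = bounds[name]
        if not lo <= value <= hi:
            alerts.append(f"{name} = {value} outside [{lo}, {hi}]")
    return alerts

# Only the out-of-range heart rate would be pushed to the headset display.
print(check_vitals({"heart_rate_bpm": 134, "spo2_pct": 97, "map_mmhg": 80}))
```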
The integration of these five dimensions with XR capabilities facilitates an unprecedented level of precision and efficiency in surgical procedures. Surgeons are equipped with a comprehensive, immersive view of the patient's anatomy, real-time movements, and contextual information, all enhanced by AI-analytics. This holistic, data-enriched approach not only optimizes patient outcomes but also transforms the surgical experience, making it more intuitive and information-rich. Through the combination of 5D concepts and XR technology, the future of surgery is poised to be more precise, informed, and effective than ever before.
References