The Johns Hopkins University
- Offer Profile
- The Computational Interaction
and Robotics Lab is a research laboratory in the Department of Computer
Science at Johns Hopkins University. We are also associated with the NSF
Engineering Research Center for Computer-Integrated Surgical Systems and
Technology (ERC-CISST) and the Laboratory for Computational Sensing and
Robotics (LCSR). We are interested in understanding problems involving dynamic, spatial interaction at the intersection of vision, robotics, and human-computer interaction.
Product Portfolio
MAPS - Manipulating
and Perceiving Simultaneously
- The goal of this project is to develop a system,
consisting of a robotic hand equipped with tactile sensors, capable of
autonomously exploring an environment and identifying objects that have been
encountered before, while manipulating the unknown objects as necessary. The
ability to explore an unknown object using solely haptic information
requires expansion of the state of the art both in object recognition and in
manipulation, in addition to the application of simultaneous localization
and mapping techniques to the haptic domain. Our approach focuses first on
the adaptation of feature-based object recognition methods, from the
computer vision domain, to haptic object recognition.
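As a rough sketch of what feature-based recognition can look like when transplanted to the haptic domain (not the lab's actual pipeline), the following snippet matches tactile contact descriptors against a database of previously explored objects by nearest-neighbor voting; the descriptor dimensionality and the object names are invented for illustration.

    import numpy as np

    def recognize_object(query_descriptors, object_db, threshold=0.5):
        """Vote for the known object whose stored tactile descriptors best
        match the query's. Hypothetical layout: each descriptor is a
        fixed-length feature vector computed from one tactile contact."""
        votes = {name: 0 for name in object_db}
        for q in query_descriptors:
            best_name, best_dist = None, np.inf
            for name, descs in object_db.items():
                # Distance to the closest stored descriptor of this object.
                d = np.min(np.linalg.norm(descs - q, axis=1))
                if d < best_dist:
                    best_name, best_dist = name, d
            if best_dist < threshold:          # discard weak matches
                votes[best_name] += 1
        return max(votes, key=votes.get) if any(votes.values()) else None

    # Toy usage with random 8-D descriptors for two invented objects.
    rng = np.random.default_rng(0)
    db = {"mug": rng.normal(0, 1, (20, 8)), "ball": rng.normal(3, 1, (20, 8))}
    query = db["mug"][:5] + rng.normal(0, 0.05, (5, 8))  # noisy re-contacts
    print(recognize_object(query, db, threshold=1.0))    # -> mug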
The Schunk Anthropomorphic Hand (SAH), mounted on the Barrett WAM arm and equipped with tactile sensors from Pressure Profile Systems, is our platform for haptic manipulation; however, extensive work is also being conducted in a tactile-sensing simulator that we are developing (shown in a screenshot below).
- Picture on right courtesy of Schunk
Language of Surgery
- Surgical training and evaluation have traditionally been slow, interactive processes in which interns and junior residents perform operations under the supervision of a faculty surgeon. This method of training lacks any objective means of quantifying and assessing surgical skill. Economic pressures to reduce the cost of training surgeons and national limits on resident work hours have created a need for efficient methods to supplement traditional training paradigms. While surgical simulators aim to provide such training, they have limited impact as a training tool since they are generally operation-specific and cannot be broadly applied. Robot-assisted minimally invasive surgical systems, such as Intuitive Surgical's da Vinci, introduce new challenges to this paradigm due to their steep learning curves. However, their ability to record quantitative
motion and video data opens up the possibility of creating descriptive,
mathematical models to recognize and analyze surgical training and
performance. These models can then be used to help evaluate and train
surgeons, produce quantitative measures of surgical proficiency,
automatically annotate surgical recordings, and provide data for a variety
of other applications in medical informatics.
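As a minimal illustration of the kind of model involved (not the project's actual design), the sketch below trains a linear discriminant classifier to label fixed-length windows of recorded tool kinematics with gesture classes; the window size, feature layout, and random stand-in data are all assumptions.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Hypothetical setup: each sample is a flattened window of tool-tip
    # kinematics (T frames x D signals) recorded from the robot, labeled
    # with a surgical gesture class by an expert.
    T, D = 30, 6                        # assumed window length and signal count
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, T * D))   # stand-in for real kinematic windows
    y = rng.integers(0, 3, size=200)    # stand-in labels for three gestures

    clf = LinearDiscriminantAnalysis().fit(X, y)

    # A newly recorded window is assigned the most likely gesture class.
    new_window = rng.normal(size=(1, T * D))
    print("predicted gesture:", clf.predict(new_window)[0])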
Context-Aware Surgical Assistance for Virtual Mentoring
- Minimally invasive surgery (MIS) is a technique
whereby instruments are inserted into the body via small incisions (or in
some cases natural orifices), and surgery is carried out under video
guidance. While presenting great advantages for the patient, MIS presents
numerous challenges for the surgeon due to the restricted field of view
presented by the endoscope, the tool motion constraints imposed by the
insertion point, and the loss of haptic feedback. One means of overcoming
some of these limitations is to present the surgeon with a registered
three-dimensional overlay of information tied to pre-operative or
intraoperative volumetric data. This data can provide guidance and feedback
on the location of subsurface structures not apparent in endoscopic video
data.
Our objective is to develop algorithms for highly capable context-aware
surgical assistant (CASA) robotic systems that are able to maintain a
dynamically updated model of the surgical field and ongoing surgical
procedure for the purposes of assistance, evaluation, and mentoring.
Information that must be maintained includes a real-time representation of the patient's anatomy and physiology, the relationship of surgical instruments to
the patient’s anatomy, and the progress of the procedure relative to the
surgical plan. In the case of telesurgical systems, one key challenge is the
real-time fusion of preoperative models and images with stereo video and
real-time motion of surgical instruments. Our approach dynamically registers pre-operative volume data to surfaces extracted from video, permitting the surgeon to view pre-operative high-resolution CT or MRI, or intra-operative ultrasound scans, directly on tissue surfaces. As a result, operative targets and pre-operative plans become clearly apparent.
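One common way to perform such a registration, shown here only as an illustrative sketch, is rigid ICP between points sampled from the pre-operative model and a surface reconstructed from stereo video; this minimal NumPy/SciPy version omits the robustness and deformation handling a real system would need.

    import numpy as np
    from scipy.spatial import cKDTree

    def icp(source, target, iters=30):
        """Minimal rigid ICP: align 'source' points (e.g. from a
        pre-operative model) to 'target' points (e.g. a surface
        extracted from stereo endoscopic video)."""
        src = source.copy()
        R_total, t_total = np.eye(3), np.zeros(3)
        tree = cKDTree(target)
        for _ in range(iters):
            # 1. Match each source point to its nearest target point.
            _, idx = tree.query(src)
            matched = target[idx]
            # 2. Best rigid transform for these matches (Procrustes/SVD).
            mu_s, mu_m = src.mean(0), matched.mean(0)
            H = (src - mu_s).T @ (matched - mu_m)
            U, _, Vt = np.linalg.svd(H)
            R = Vt.T @ U.T
            if np.linalg.det(R) < 0:        # avoid reflections
                Vt[-1] *= -1
                R = Vt.T @ U.T
            t = mu_m - R @ mu_s
            src = src @ R.T + t
            R_total, t_total = R @ R_total, R @ t_total + t
        return R_total, t_total

    # Usage: recover the offset of a jittered copy of a surface patch.
    pts = np.random.default_rng(4).random((200, 3))
    R_est, t_est = icp(pts + 0.01, pts)
    print(t_est)    # approximately [-0.01, -0.01, -0.01]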
We are developing an integrated demonstration of the feasibility of
performing such a fusion within the context of a CASA robotic system and we
will demonstrate simple capabilities to use the fused information to monitor
interactions of the surgical instruments with the target anatomy and detect
changes in the anatomy.
Our project is currently targeting minimally invasive surgery using the da
Vinci surgical robot. The da Vinci system provides the surgeon with a
stereoscopic view of the surgical field. The surgeon operates by moving two
master manipulators which are linked to two (or more) patient-side
manipulators. Thus, the surgeon is able to use his or her natural 3D
hand-eye coordination skills to perform delicate manipulations in extremely
confined areas of the body.
For the CASA project, the robotic surgery system presents several
advantages. First, it provides stereo (as opposed to the more common
monocular) data from the surgical field. Second, through the da Vinci API,
we are able to acquire motion data from the master and slave manipulators,
thus providing complete information on the motion of the surgical tools.
Finally, we are also provided with motion information on the observing
camera itself, making it possible to anticipate and compensate for
ego-motion.
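For instance, with the camera pose available from the kinematics, points observed in the moving camera frame can be re-expressed in a fixed frame so that ego-motion does not masquerade as scene motion. The sketch below assumes a particular pose convention (rotation and translation mapping camera to world coordinates) purely for illustration.

    import numpy as np

    def camera_to_world(p_cam, R_wc, t_wc):
        """Map a 3D point observed in the camera frame into a fixed world
        frame using the camera pose reported by the robot's kinematics,
        so camera ego-motion is not mistaken for tissue or tool motion.
        The pose convention here is an assumption."""
        return R_wc @ p_cam + t_wc

    # If the camera moves but the anatomy is static, the world-frame
    # coordinates of a tracked point stay (approximately) constant:
    R1, t1 = np.eye(3), np.zeros(3)
    theta = 0.1                                   # camera rotated slightly
    R2 = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
    t2 = np.array([0.01, 0.0, 0.0])
    p_world = np.array([0.0, 0.0, 0.1])           # fixed anatomical point
    p_cam1 = R1.T @ (p_world - t1)                # as seen before motion
    p_cam2 = R2.T @ (p_world - t2)                # as seen after motion
    print(camera_to_world(p_cam1, R1, t1), camera_to_world(p_cam2, R2, t2))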
Algorithms are developed and evaluated on phantom data and data acquired using the Intuitive Surgical da Vinci system.
Micro-Surgical Assistant Workstation
- Providing enhanced information and physical abilities
to surgeons
The Micro-Surgical Assistant Workstation is a system designed to aid and augment a human surgeon's performance of micro-surgical tasks. A micro-surgical task is simply a surgical task performed on such a small scale that it must be done while looking through a microscope.
There are two main components to the workstation; the first aims to give the
surgeon better knowledge about the field of operation, and the second aims
to give the surgeon a better ability to execute very precise motions with
surgical tools even when working at a microscopic scale.
One example of the first component would be providing the surgeon with
information about the surgical environment which would not normally be
available, such as overlaying images from another imaging modality (e.g., CT, MRI, OCT, fundus imaging), which would be registered in real time to images captured
through the operating microscope.
The current example of the second component is the Steady Hand Robot, which
is a cooperative manipulator designed to improve the fine motor control of a
surgeon. The surgeon and the robot both hold the same tool; when the surgeon
exerts a force on the tool, the robot detects that force, and moves the tool
correspondingly. Having the robot in the loop allows us to do force scaling, effectively slowing down the motion of the tool; this allows for more precise positioning than can be achieved freehand. Additionally, the robot provides greater stability: a tool released by the surgeon stays in place rather than falling.
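A minimal sketch of this kind of cooperative control is an admittance law: commanded velocity is proportional to the sensed handle force, with a small gain providing the force scaling and a deadband keeping a released tool stationary. The gains and rates below are illustrative, not the Steady Hand Robot's actual parameters.

    import numpy as np

    def steady_hand_step(tool_pos, f_sensed, gain=0.002, dt=0.001,
                         deadband=0.05):
        """One control cycle of a simple admittance law for cooperative
        manipulation: commanded velocity is proportional to the force the
        surgeon applies to the shared tool. A small gain scales motion
        down for precision; the deadband rejects sensor noise so a
        released tool stays put. All parameters are illustrative."""
        f = np.where(np.abs(f_sensed) < deadband, 0.0, f_sensed)
        v_cmd = gain * f              # force scaling: gentle force -> slow motion
        return tool_pos + v_cmd * dt  # integrate to the next tool position

    pos = np.zeros(3)
    push = np.array([1.0, 0.0, 0.0])  # surgeon pushes 1 N along x
    for _ in range(1000):             # one second of 1 kHz control
        pos = steady_hand_step(pos, push)
    print(pos)                        # ~2 mm of slow, scaled motion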
The current model on which the Micro-Surgical Assistant Workstation is being
developed and evaluated is a retinal-surgery model, and several retinal
surgery tasks have been attempted with the robot.
Capsule Endoscopy
- Capsule endoscopy (CE) has recently emerged as a
valuable imaging technology for the gastrointestinal (GI) tract, especially
the small bowel and the esophagus. With this technology, it has become
possible to directly evaluate the gut mucosa of patients with a variety of
conditions, such as obscure gastrointestinal bleeding, Celiac disease and
Crohn's disease.
Although the use of capsule endoscopy is growing rapidly, the evaluation of capsule endoscopic imagery presents numerous practical challenges. In a
typical case, the capsule acquires 50,000 or more images over an eight-hour
period. The quality of these images is highly variable due to the
uncontrolled motion of the capsule itself as it moves through the GI tract,
the complexity of the structures being imaged, and inherent limitations of
the disposable imager. In practice, relatively few (often less than 1000) of
the study images contain significant diagnostic content. As a result, it is challenging to create an effective, repeatable means for evaluating capsule endoscopic sequences.
In this NIH-funded effort, we are interested in
creating a tool for semi-automated, objective, quantitative assessment of
pathologic findings in capsule endoscopic data, and in particular,
quantitative assessment of lesions that appear in Crohn's disease of the
small bowel. We are developing statistical learning methods for performing
lesion classification and assessment in a manner consistent with a trained
expert.
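As a hedged sketch of this statistical-learning setup (with random stand-in data in place of real image features and expert labels), one could cross-validate a classifier against expert severity ratings like this:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Hypothetical stand-in for the real pipeline: each CE image is
    # reduced to a feature vector (e.g. color and texture statistics),
    # and expert severity ratings supervise a classifier.
    rng = np.random.default_rng(2)
    features = rng.normal(size=(500, 32))    # 500 images, 32 features
    severity = rng.integers(0, 3, size=500)  # 0=normal, 1=mild, 2=severe

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    scores = cross_val_score(clf, features, severity, cv=5)
    print("cross-validated agreement with the expert: "
          f"{scores.mean():.2f} +/- {scores.std():.2f}")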
To evaluate our methods, we have collected a substantial database of images
of Crohn's lesions together with an expert assessment of several attributes
indicative of lesion severity. In addition, our database also contains a
large number of other GI anomalies seen in CE. Recent publications on our publications page describe both statistical methods for assessment and ways of reducing the complexity of CE evaluation by reducing the number of images a clinician must examine.
- Sample images from our capsule endoscopy database. From
left to right: a severe lesion, normal villi, lesion not related to Crohn's,
and a mild Crohn's lesion.
Computer Vision for Sinus Surgery - Registering
Endoscopic Video to Preoperative 3D Models
- The focus of this project is to study the problem of
registering an endoscopic video sequence to a preoperative CT scan with
applications to sinus surgery. The main goal is to accurately track the tip of the endoscope in real time using vision techniques, and thus to determine tool-tip locations within the sinus passage. The advantages of real-time visual tracking are substantial: current tool-tracking systems for sinus surgery are often cumbersome, since they can require modification of the tool handles, demand bulky additional machinery that occupies valuable operating-room floor space, and cost large sums of money that detract from the budgets for other surgical tools. This project is being pursued under a grant from the NIH.
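A simplified version of the underlying pose-estimation step, assuming known 2D-3D correspondences between CT landmarks and detections in the video frame, can be written with OpenCV's PnP solver; the landmark coordinates, camera intrinsics, and pose below are synthetic.

    import numpy as np
    import cv2

    # Hypothetical 3D landmarks from the CT model (mm) and an assumed
    # endoscope pose; in practice the 2D points would come from feature
    # detection in the endoscopic video.
    object_pts = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0],
                           [10, 10, 5], [5, 5, 10], [2, 8, 3]], float)
    K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
    rvec_true = np.array([0.1, -0.2, 0.05])
    tvec_true = np.array([-5.0, -5.0, 60.0])
    image_pts, _ = cv2.projectPoints(object_pts, rvec_true, tvec_true, K, None)

    # Recover the endoscope pose from the 2D-3D correspondences.
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    print("camera position in CT frame (mm):", (-R.T @ tvec).ravel())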
Real-time Video Mosaicking with Adaptive Parametrized
Warping
- Image registration or alignment for video mosaicking has many research and practical applications. Our motivation comes primarily from the medical field, where we seek to overcome the fundamental field-of-view and resolution tradeoffs that arise ubiquitously in endoscopic surgery.
There are two general approaches to computing the visual motion between successive images, the critical issue for registration. Direct approaches use all the available image pixels to compute an image-based error, which is then optimized. Feature-based approaches instead detect certain image features first and then estimate the correspondences between feature pairs in different camera views. The latter often has the advantage of a larger range of convergence, although at the cost of a prior feature-detection and correspondence stage.
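A minimal sketch of the feature-based family, using off-the-shelf OpenCV components (ORB features, brute-force matching, and RANSAC homography estimation) rather than our adaptive parametrized warping, might look like this:

    import numpy as np
    import cv2

    def register_pair(img1, img2):
        """Feature-based registration step of a mosaicking pipeline:
        detect ORB features in both frames, match them, and estimate
        the homography mapping img2 into img1's coordinate frame."""
        orb = cv2.ORB_create(1000)
        k1, d1 = orb.detectAndCompute(img1, None)
        k2, d2 = orb.detectAndCompute(img2, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)
        src = np.float32([k2[m.queryIdx].pt for m in matches[:100]])
        dst = np.float32([k1[m.trainIdx].pt for m in matches[:100]])
        # RANSAC discards mismatched feature pairs.
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        return H

    # Usage, assuming two overlapping frames on disk (paths hypothetical):
    # img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
    # img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
    # H = register_pair(img1, img2)
    # mosaic = cv2.warpPerspective(img2, H, (img1.shape[1] * 2, img1.shape[0]))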
Human Machine Collaborative Systems
- During micro-scale manipulation tasks such as
microsurgery, the human operator performs at the limits of his or her motor
control. The high accuracy required while performing such procedures makes
the task both mentally and physically demanding. As part of the Engineering
Research Center for Computer-Integrated Surgical Systems and Technology (ERC-CISST),
Professor Allison Okamura in Mechanical Engineering and researchers in
Computer Science (Gregory Hager and Russell Taylor) are designing systems
that amplify or assist human physical capabilities when performing tasks
that require learned skills, judgment, and dexterity. We refer to these
systems as Human-Machine Collaborative Systems (HMCS), as they generally
seek to combine human judgment and experience with physical and sensory
augmentation using advanced display and robotic devices to improve the
overall performance (Figure 1).
Human-machine systems offer the operator a direct interaction with the
environment through a single robot (Cooperative manipulation) or remotely
through multiple robots (Telemanipulation). In addition, the robots can
guide the user using "Virtual Fixtures", which are control methods designed
to encourage the user to move in certain directions and prevent tool entry
into undesired regions. The target applications of HMCS include
micro-surgical procedures such as retinal and sinus surgery (Figure 2) and
industrial micro-assembly.
- Figure 1: The JHU Eye Robot, in which the surgeon and robot cooperate to perform retinal surgery, is a Human-Machine Collaborative System.
- Figure 2: The robot guides the human to perform certain
motions or stay away from delicate tissues, using reconstructions of the
retinal surface.
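A guidance virtual fixture can be sketched as a simple velocity filter that passes motion along a preferred direction while attenuating motion off it; the formulation below is a standard one, with an illustrative compliance gain.

    import numpy as np

    def virtual_fixture(v_user, d, k_off=0.1):
        """Guidance virtual fixture as a velocity filter: motion along
        the preferred direction d passes through, motion off that
        direction is attenuated by k_off (0 = hard constraint,
        1 = no guidance). Gains here are illustrative."""
        d = d / np.linalg.norm(d)
        P = np.outer(d, d)            # projector onto the preferred direction
        return P @ v_user + k_off * (np.eye(3) - P) @ v_user

    v = np.array([1.0, 1.0, 0.0])     # user pushes diagonally
    print(virtual_fixture(v, d=np.array([1.0, 0.0, 0.0])))
    # -> [1.0, 0.1, 0.0]: guided strongly along x, discouraged off-axis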
Motion Compensation in Cardiovascular MR Imaging
- Robust MR imaging of the coronary arteries is
challenging due to the complex motion of these arteries induced by both
respiratory and cardiac motion. Current approaches have had limited success
due to the motion artifacts introduced by the variability in cardiac and
respiratory motion during data acquisition. The effects of motion
variability can be significantly reduced if the motion of the coronary
artery can be estimated and used to guide MR image acquisition. We thus propose a method that acquires high-speed, low-resolution images in specific orientations and extracts coronary motion to guide the high-resolution MR image acquisition. To show the feasibility of this approach, we present and validate a multiple-template tracking method that allows reliable and accurate tracking of the left coronary artery (LCA) in low-resolution, real-time MR image sequences in different orientations. We have also demonstrated, using MR simulations, that accounting
for cardiac variability improves overall image quality.
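A stripped-down version of multiple-template tracking, shown here only as a sketch with synthetic data, matches every stored template by normalized cross-correlation each frame and keeps the best hit:

    import numpy as np
    import cv2

    def track_templates(frame, templates):
        """One step of a multiple-template tracker: each stored template
        (e.g. the artery's appearance at a different cardiac phase) is
        matched by normalized cross-correlation, and the best-scoring
        template gives the current position. A sketch, not the actual
        method."""
        best = (-1.0, None, None)
        for i, tmpl in enumerate(templates):
            scores = cv2.matchTemplate(frame, tmpl, cv2.TM_CCOEFF_NORMED)
            _, max_val, _, max_loc = cv2.minMaxLoc(scores)
            if max_val > best[0]:
                best = (max_val, max_loc, i)
        score, (x, y), idx = best
        return (x, y), idx, score

    # Usage on a synthetic frame containing one of two templates:
    rng = np.random.default_rng(3)
    frame = (rng.random((128, 128)) * 255).astype(np.uint8)
    t0 = frame[40:56, 60:76].copy()                 # pretend phase-0 template
    t1 = (rng.random((16, 16)) * 255).astype(np.uint8)
    print(track_templates(frame, [t0, t1]))         # finds t0 at (60, 40)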
Multi-Modality Retinal Image Registration
- Optical coherence tomography (OCT) is a non-invasive imaging modality analogous to ultrasound but based on light rather than sound. Registration of pre-operative OCT images to the more familiar and easily acquired intra-operative fundus images allows precise localization of pathologies that might otherwise be invisible, enabling a wider array of interventions.
Vision-Based Human Computer Interaction
- The Visual Interaction Cues (VICs) project is focused
on developing new techniques for vision-based human computer interaction (HCI).
The VICs paradigm is a methodology for vision-based interaction operating on
the fundamental premise that, in general vision-based HCI settings, global
user modeling and tracking are not necessary. For example, when a person
presses the number-keys while making a telephone call, the telephone
maintains no notion of the user. Instead, it only recognizes the action of
pressing a key. In contrast, typical methods for vision-based HCI attempt to
perform global user tracking to model the interaction. In the telephone
example, such methods would require a precise tracker for the articulated
motion of the hand. However, such techniques are computationally expensive,
prone to error and the re-initialization problem, prohibit the inclusion of
an arbitrary number of users, and often require a complex gesture-language
the user must learn. In the VICs paradigm, we make the observation that
analyzing the local region around an interface component (the telephone key,
for example) will yield sufficient information to recognize user actions.
The principled techniques of the VICs paradigm are applicable in general HCI
settings as well as advanced simulation and virtual reality. We are actively
investigating 2D, 2.5D, and 3D environments; we've developed a new HCI
platform called the 4D Touchpad (figure below) where vision-based methods
can complement the conventional mouse and keyboard.
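To make the local-region premise above concrete, a toy VICs-style detector might watch only the pixels around one interface component and flag an action when enough of them change against a background reference; the thresholds and image sizes below are invented.

    import numpy as np

    def button_pressed(frame, background, roi, thresh=25, frac=0.3):
        """VICs-style local analysis (illustrative): instead of tracking
        the whole user, watch only the small region around one interface
        component and report an action when enough foreground appears
        there. roi = (x, y, w, h) in pixels; thresholds are made up."""
        x, y, w, h = roi
        patch = frame[y:y + h, x:x + w].astype(int)
        ref = background[y:y + h, x:x + w].astype(int)
        foreground = np.abs(patch - ref) > thresh   # changed pixels
        return foreground.mean() > frac             # enough change = press

    bg = np.full((240, 320), 128, np.uint8)         # empty-scene reference
    frame = bg.copy()
    frame[100:140, 50:90] = 30                      # a "hand" over the key
    print(button_pressed(frame, bg, roi=(50, 100, 40, 40)))   # -> True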
In the VICs project, we study both low-level image analysis techniques and
high-level gesture language modeling. In low-level image analysis, we use deterministic cues (color, shape, motion, etc.), machine learning methods (e.g., neural networks), and dynamic models (e.g., Hidden Markov Models) to capture the spatio-temporal characteristics of various hand gestures. We have constructed a high-level language model that integrates a set of low-level gestures into a single, coherent probabilistic framework. In the language model, every low-level gesture is called a Gesture Word, and each complete action is a sequence of these words called a Gesture Sentence.
- Adaptive Background Modeling
- Intelligent Buttons
- Left: The 4D Touchpad - A new platform for the
development of next-generation interfaces.
It facilitates unencumbered interaction with an architecture that provides
core functionality for general vision-based interfaces.
Right: The new 4DT system. Interaction components are rendered over a flat monitor.
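Returning to the language model above, a minimal stand-in for the probabilistic framework is Viterbi decoding over Gesture Words: per-frame scores from the low-level gesture classifiers are combined with word-transition probabilities to recover the most likely Gesture Sentence. The matrices below are illustrative, not learned parameters.

    import numpy as np

    def viterbi(log_emissions, log_trans, log_prior):
        """Decode the most likely sequence of Gesture Words given
        per-frame (log) emission scores. The scores would come from the
        low-level classifiers; here they are made up."""
        T, N = log_emissions.shape
        delta = log_prior + log_emissions[0]
        back = np.zeros((T, N), dtype=int)
        for t in range(1, T):
            cand = delta[:, None] + log_trans       # N x N transition scores
            back[t] = np.argmax(cand, axis=0)
            delta = cand[back[t], np.arange(N)] + log_emissions[t]
        path = [int(np.argmax(delta))]
        for t in range(T - 1, 0, -1):
            path.append(back[t, path[-1]])
        return path[::-1]                           # the Gesture Sentence

    # Two gesture words, five frames of (made-up) classifier log-scores:
    em = np.log(np.array([[.9, .1], [.8, .2], [.2, .8], [.1, .9], [.15, .85]]))
    tr = np.log(np.array([[.9, .1], [.1, .9]]))     # words tend to persist
    print(viterbi(em, tr, np.log(np.array([.5, .5]))))   # -> [0, 0, 1, 1, 1]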
Articulated Object Tracking
- Many objects encountered in the real world can be
described as kinematic chains of parts with roughly uniform appearance
characteristics. We developed a GPU-accelerated method for tracking such
objects in single- or multi-channel (e.g., stereo) video streams across diverse domains. In brief, the method models the appearance of the various object parts, renders a 3D model of the target object geometry from each view, and measures the consistency of the resulting image with an appearance-class probability map derived from the video images. It has been demonstrated in both surgical and generic settings.
- Sample images, tracking in different domains
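The core render-and-compare measurement can be sketched as follows, with a tiny synthetic scene standing in for the GPU-rendered model and the learned appearance maps:

    import numpy as np

    def pose_score(render_labels, prob_maps, eps=1e-6):
        """Consistency score at the heart of a render-and-compare
        tracker: 'render_labels' assigns each pixel the object part (or
        background) predicted by rendering the 3D model at a candidate
        pose, and 'prob_maps[c]' holds the per-pixel probability of
        appearance class c estimated from the video. Higher
        log-likelihood = better pose. The data layout is a simplified
        stand-in for the GPU version."""
        h, w = render_labels.shape
        rows, cols = np.mgrid[0:h, 0:w]
        probs = prob_maps[render_labels, rows, cols]
        return np.log(probs + eps).sum()

    # Toy example: a 2-class scene (background=0, tool part=1).
    prob_maps = np.zeros((2, 4, 4))
    prob_maps[1, 1:3, 1:3] = 0.9        # video says "tool" in the center
    prob_maps[0] = 1.0 - prob_maps[1]
    good_pose = np.zeros((4, 4), int); good_pose[1:3, 1:3] = 1
    bad_pose = np.zeros((4, 4), int); bad_pose[0:2, 0:2] = 1
    print(pose_score(good_pose, prob_maps) > pose_score(bad_pose, prob_maps))
    # -> True: the pose whose rendering matches the appearance map wins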