University of Southern California


Research fields

o       Computer Vision (motion, dynamics, and recognition)

o       Computer Graphics (geometry/texture modeling and rendering)

o       Interactive Media (fusion, authoring and retrieval)

o       Immersive Technologies (virtual and augmented reality)

o       Human-Computer Interaction (mobile and pervasive computing)


Projects

 

Motion and Dynamics

 

 

Dynamic Scene Analysis and Understanding

 

 

o       Robust Image Recognition and Matching

Developing high-performance image matching and recognition techniques for multimedia applications.  The key problems to be addressed are invariant feature extraction, description, and fast matching from live video or archived image sources.  Targeted applications are mobile computing (e.g. handheld devices for entertainment, advertising, and dynamic media sharing) and multimedia data searching and retrieval [more info].

o       Detection and Tracking of Dynamic Events and Objects

Developing algorithms for the automatic analysis of video imagery to detect and track dynamic events and objects moving through the scene, together with methods for estimating the poses of targets from single or multiple images [more info].

o       2D Image Motion Estimation

Tracking natural scene features in video streams acquired by stationary or moving cameras.  The tracked features can be used for recovering camera pose, image stabilization, image registration, and the creation of image mosaics from video sequences.  We developed a robust tracking approach, and the software has been licensed to Rhythm & Hues, a major special-effects production firm in Santa Monica, CA.  The software, named “Fastrack” by the firm, has been used to create special effects in films such as “X-Men 2”, “Daredevil”, and “Dr. Seuss’ The Cat in the Hat”.  It can track hundreds of features from one frame to the next with sub-pixel accuracy in only a few seconds on a standard personal computer, and it processes roughly 40 percent of movie shots without requiring extensive user input, which is highly appreciated by the firm’s effects artists [more info].

o       Automatic Image Mosaicing

Automatic creation of high-quality image mosaics from image or video sequences based on robust image motion estimation, with no assumptions made about the 3D camera motion or the scene structure [more info].

o       Natural Scenes Analysis Using Wavelet and Fractal Models

This work studies new methodologies for natural scene analysis based on wavelet and fractal theory.  The goal is to create models and approaches that handle natural scene images efficiently, supporting the representation of natural scenes, the extraction of semantic information, and pattern analysis [more info].

An introduction talk

 

 


 

 

 

Motion Estimation and Navigation

 

 

o       Landmark-based Camera Pose Tracking and Estimation

Automatic estimation of the camera's 6DOF pose using artificial landmarks.  Methods for detecting, recognizing, and tracking landmarks in real time [more info].

o       Markerless Pose Tracking for Unprepared Environments (Structure from Motion)

This work addresses the case where neither camera motion nor scene structure information is available.  The approach tracks naturally occurring features to recover relative camera motion and scene structure (structure from motion) [more info].

o       Wide-area Tracking Using Panoramic (omni-directional) Imaging Sensor

Camera pose tracking from multiple video streams acquired by an omni-directional imaging system that provides a full 360-degree horizontal field of view [more info].

o       Point and Line Feature Auto-calibration

An approach that simultaneously estimates the 6D pose of a camera and the 3D parameters of tracked features (points or lines) in the scene.  An initial camera pose estimate is computed from a set of known calibrated features.  Other features, either intentional fiducials (IF) or natural features (NF), at initially unknown positions, are tracked in the images produced as the camera moves.  The 3D positions of these IF or NF are estimated (automatically calibrated), and the position estimates are used, in turn, to estimate the pose of the camera.  This computation iterates and converges to produce both the 6D camera pose and the 3D IF or NF positions over a sequence of images [more info].

o       Model-based Tracking and Visual Navigation

Automatic estimation of sensor pose and motion using scene knowledge, including 3D scene models (buildings, varied utility signs, facility tags or labels, etc.), naturally occurring features, and auto-calibrated scene knowledge (runtime-calibrated features constrained to the model database) [more info].

o       Hybrid Vision/INS/GPS

Hybrid tracking technologies attempt to compensate for the shortcomings of any single technology by using multiple measurements to produce robust results.  We focus on robust pose estimation that integrates computer vision, inertial, and GPS sensors for unprepared outdoor environments, and on methods for fusing these complementary data sources [more info].
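As a rough illustration of why fusing sensors helps, the sketch below runs a scalar Kalman filter that dead-reckons position from a biased inertial velocity and corrects it with noisy GPS fixes.  This is a generic textbook filter, not the group's method; the function names, noise variances, bias, and simulated data are all assumptions made for the example.

```python
def fuse_step(x, p, velocity, dt, q, z=None, r=1.0):
    """One predict/update cycle of a scalar Kalman filter (illustrative only).

    x, p      : position estimate and its variance
    velocity  : velocity reported by the inertial sensor (may be biased)
    q         : process noise variance added per step (assumed value)
    z, r      : optional GPS position fix and its variance (assumed value)
    """
    # Predict: dead-reckon with the inertial velocity; uncertainty grows.
    x = x + velocity * dt
    p = p + q
    # Update: pull the prediction toward the GPS fix when one is available.
    if z is not None:
        k = p / (p + r)  # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
    return x, p


def simulate(steps=50, dt=1.0, bias=0.2):
    """Compare inertial-only dead reckoning with inertial + GPS fusion."""
    true_pos, dead_reckoned, fused, p = 0.0, 0.0, 0.0, 1.0
    for i in range(steps):
        true_pos += 1.0 * dt                     # true speed: 1 unit per step
        measured_velocity = 1.0 + bias           # inertial reading with a constant bias
        dead_reckoned += measured_velocity * dt  # inertial only: error accumulates
        gps = true_pos + (0.5 if i % 2 == 0 else -0.5)  # bounded, alternating GPS error
        fused, p = fuse_step(fused, p, measured_velocity, dt, q=0.1, z=gps, r=1.0)
    return true_pos, dead_reckoned, fused


if __name__ == "__main__":
    truth, inertial_only, fused = simulate()
    print("inertial-only error:", inertial_only - truth)
    print("fused error:", fused - truth)
```

In this toy setup the inertial-only estimate drifts without bound, while the fused estimate stays close to the truth because each GPS fix limits the accumulated drift.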

o       Optimal Estimation and Filtering

Motion estimation is a typical nonlinear estimation, or probabilistic inference, problem.  The optimal solution to the nonlinear estimation problem is given by recursive Bayesian estimation.  For real-world problems, however, the optimal Bayesian recursion is intractable and approximate solutions must be used.  The Extended Kalman Filter (EKF) is the most widely used estimation approach due to its simplicity and tractability.  The EKF, however, is a sub-optimal implementation of the recursive Bayesian estimation framework applied to Gaussian random variables, which can seriously degrade pose estimation accuracy or even lead to a divergent solution.  This is where we propose to focus our attention to make the most significant impact [more info].
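For reference, the recursive Bayesian estimator mentioned above is usually written as a two-step recursion.  The notation here is an assumption introduced for illustration (it does not appear in the project description): x_k is the state (e.g. the pose) at time k, and z_{1:k} are the measurements up to time k.

```latex
% Prediction (Chapman-Kolmogorov): propagate the previous posterior through the motion model
p(x_k \mid z_{1:k-1}) = \int p(x_k \mid x_{k-1}) \, p(x_{k-1} \mid z_{1:k-1}) \, dx_{k-1}

% Update (Bayes' rule): weight the prediction by the likelihood of the new measurement
p(x_k \mid z_{1:k}) = \frac{p(z_k \mid x_k) \, p(x_k \mid z_{1:k-1})}{p(z_k \mid z_{1:k-1})},
\qquad
p(z_k \mid z_{1:k-1}) = \int p(z_k \mid x_k) \, p(x_k \mid z_{1:k-1}) \, dx_k
```

For general nonlinear motion and measurement models these integrals have no closed form.  The EKF sidesteps them by representing each density as a Gaussian and linearizing the models to first order around the current estimate, which is the source of the sub-optimality described above.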

A summary talk

 

 


 

 

 

Recognition and Human Motion

 

 

o       Face Detection and Tracking

Real-time detection of multiple faces and estimation of head pose for collaborative workspaces [more info].

o       Automatic Recognition of Human Face and Expression

The importance of developing computer-based face recognition systems lies not only in the cognitive aspects but also in the practical applications.  Most existing systems operate only on frontal or profile views of the face.  The aim of this research is to develop automatic face recognition techniques that work under less constrained imaging conditions (varying pose and expression).  We investigate methods including multiple-view-based approaches, synergetic neural networks, and principal component analysis to deal with the complex problems of varying pose and expression [more info].

o       Real-time Landmark Detection and Recognition

Accurate landmark detection and recognition are crucial for a real-time pose tracking system.  Extending the face recognition work above, we developed a principal component analysis (PCA) based method that robustly detects and recognizes specially designed black-and-white square landmarks.  An alphanumeric or symbol region embedded in each landmark’s design allows unique fiducial recognition from sets of 50-100 different symbols.  The method runs at 28 fps and tolerates viewpoint variations of up to 70 degrees in depth [more info].
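The general idea behind PCA-based recognition can be sketched as follows: learn the dominant direction of variation from training patterns, project each pattern onto it, and label a new pattern by its nearest template in that reduced space.  This is a deliberately tiny, generic illustration using a single principal axis found by power iteration; it is not the group's landmark recognizer, and the 4-value "patterns", labels, and function names are hypothetical.

```python
import math


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def top_component(samples, iterations=200):
    """Mean and leading principal axis of the samples, via power iteration."""
    mu = mean_vector(samples)
    centered = [[x - m for x, m in zip(s, mu)] for s in samples]
    dim = len(mu)
    # Non-uniform start vector; it must not be orthogonal to the leading axis.
    v = [float(j + 1) for j in range(dim)]
    for _ in range(iterations):
        # w = C v, with C = sum_i c_i c_i^T (unnormalised covariance)
        w = [0.0] * dim
        for c in centered:
            dot = sum(ci * vi for ci, vi in zip(c, v))
            for j in range(dim):
                w[j] += dot * c[j]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mu, v


def project(sample, mu, axis):
    """Coordinate of a sample along the principal axis."""
    return sum((x - m) * a for x, m, a in zip(sample, mu, axis))


def classify(sample, mu, axis, templates):
    """Label of the nearest template in the 1-D PCA subspace."""
    coord = project(sample, mu, axis)
    return min(templates, key=lambda label: abs(coord - templates[label]))


if __name__ == "__main__":
    # Hypothetical 4-pixel patterns: "A" is bright on the left, "B" on the right.
    training = [[1, 1, 0, 0], [0.9, 1, 0.1, 0], [0, 0, 1, 1], [0, 0.1, 1, 0.9]]
    mu, axis = top_component(training)
    templates = {"A": project(training[0], mu, axis),
                 "B": project(training[2], mu, axis)}
    print(classify([0.8, 0.9, 0.2, 0.1], mu, axis, templates))
```

A practical recognizer would keep several principal axes and work on full image patches, but the project-then-match structure is the same.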

o       Computer Operation via Human Face Orientation

We are developing a passive system for tracking and locating a human head by determining the gaze direction of the face in images.  When a head is captured by a video camera, the face area and some facial features are extracted automatically.  To determine the pose of the head, a vision-based approach estimates and tracks the gaze direction of the face.  The gaze direction gives a good estimate of the normal of the face plane, so it can be used to determine the head pose and location with respect to the camera [more info].
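To make the link between the face-plane normal and head pose concrete, the sketch below converts a normal vector into yaw and pitch angles.  The coordinate convention is an assumption chosen for the example (camera x to the right, y up, z pointing into the scene, so a face looking straight at the camera has normal (0, 0, -1)); the project description does not specify one, and the function name is hypothetical.

```python
import math


def head_angles(normal):
    """Yaw and pitch (degrees) of a face whose plane normal is given in camera coordinates.

    Assumed convention: x right, y up, z into the scene; a face looking
    straight at the camera has normal (0, 0, -1).
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("normal must be non-zero")
    nx, ny, nz = nx / length, ny / length, nz / length
    yaw = math.degrees(math.atan2(nx, -nz))                   # left/right head turn
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, ny))))  # up/down head tilt
    return yaw, pitch


if __name__ == "__main__":
    print(head_angles((0.0, 0.0, -1.0)))  # facing the camera
    print(head_angles((math.sin(math.radians(30)), 0.0, -math.cos(math.radians(30)))))
```

Roll about the gaze axis cannot be recovered from the normal alone; it would need an additional cue such as the line joining the eyes.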

 

 


 

 

 

Geometry and Appearance

 

 

Scene Reconstruction and Modeling

 

 

o       Large-scale Scene Modeling

Approaches to rapidly create large-scale urban site models from LiDAR and imagery, extract significant urban features (buildings, streetscapes), and refine the reconstructed models [more info].

o       Rapid Modeling of Dynamic Objects

Rapid modeling of dynamic events and objects (people, vehicles) from video images, allowing the objects to be visualized in a 3D world [more info].

o       3D Scene Reconstruction From Stereo

Modeling 3D scenes from stereo imagery.  We developed stereo matching approaches for different scenarios (man-made and natural scenes) and produced a task-oriented stereo vision system for the automatic measurement and reconstruction of SEM (scanning electron microscope) imagery and for digital photogrammetry (SPOT imagery).  Two approaches apply wavelets to stereo matching: wavelet zero-crossing matching and a wavelet phase-based stereo matcher [more info].
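For readers unfamiliar with stereo matching, the sketch below shows the basic operation on a single rectified scanline: search for the horizontal shift (disparity) that best aligns a small window, then convert disparity to depth.  It uses plain sum-of-absolute-differences matching, which is a simple stand-in and not the wavelet-based matchers described above; the function names, window size, focal length, and baseline are assumptions for illustration.

```python
def sad(left, right, x, d, half):
    """Sum of absolute differences between the window centred at x in the left
    scanline and the window centred at x - d in the right scanline."""
    return sum(abs(left[x + k] - right[x - d + k]) for k in range(-half, half + 1))


def disparity_at(left, right, x, max_disp, half=1):
    """Best integer disparity for pixel x on a rectified scanline pair."""
    if x - half < 0 or x + half >= len(left):
        raise ValueError("matching window falls outside the scanline")
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - half < 0:  # shifted window would leave the right image
            break
        cost = sad(left, right, x, d, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(d, focal_px, baseline_m):
    """Depth for a rectified pair: Z = f * B / d (standard pinhole relation)."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / d


if __name__ == "__main__":
    right = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
    left = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]  # same intensity pattern, shifted by 2 pixels
    d = disparity_at(left, right, x=5, max_disp=4)
    print(d, depth_from_disparity(d, focal_px=800.0, baseline_m=0.1))
```

Intensity-window matching like this struggles in textureless or repetitive regions, which is one motivation for matching on richer features such as wavelet zero-crossings or phase.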

A technical talk

 

 


 

 

 

Dynamic Texture

 

 

o       Real-time Texturing from Video

Producing model textures from real-time video streams captured by stationary or moving cameras.  By using live video as the texture source, we can not only create an accurate, photo-realistic appearance for the rendered scene but also support dynamic spatio-temporal updates in the texture model, database, and rendering system [more info].

 

 


 

 

 

Data Fusion and Comprehension

 

 

o       Dynamic Fusion and Visualization of Imagery and 3D Models

A technique that combines all manner of images, video, 3D models, and data in a coherent visualization that supports varied media types and layers of abstraction.  Our work focuses on a novel approach, called the augmented virtual environment (AVE), that fuses dynamic imagery with 3D models.  The AVE provides a unique way to visualize and comprehend multiple streams of temporal data or images.  Models are used as a 3D substrate for the visualization of temporal imagery, providing improved comprehension of scene activities.  Dynamic multi-texture projections enable real-time updating and “painting” of scenes to reflect the most recent visual scene data.  The dynamic controls, including viewpoint as well as image inclusion, blending, and projection parameters, allow interactive real-time visualization of events occurring over wide areas such as a campus, airport, security infrastructure, military base, or battlefield [more info].
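The core geometric step in projecting imagery onto a 3D model can be sketched with a pinhole camera: each model point is projected into the video frame, and the resulting pixel position gives the texture coordinate to sample.  This is a minimal sketch under stated assumptions, not the AVE implementation; the intrinsics (fx, fy, cx, cy), the frame size, and the function name are hypothetical, and occlusion handling is omitted.

```python
def project_to_texture(point_cam, fx, fy, cx, cy, width, height):
    """Map a 3D point in camera coordinates to normalised (u, v) texture coordinates.

    Returns None when the point is behind the camera or outside the frame.
    Visibility (occlusion) testing is deliberately left out of this sketch.
    """
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera
    px = fx * x / z + cx  # pinhole projection to pixel coordinates
    py = fy * y / z + cy
    if not (0 <= px < width and 0 <= py < height):
        return None  # falls outside the video frame
    return px / width, py / height


if __name__ == "__main__":
    # Hypothetical 640x480 camera with the principal point at the image centre.
    print(project_to_texture((0.0, 0.0, 5.0), 500.0, 500.0, 320.0, 240.0, 640, 480))
    print(project_to_texture((0.0, 0.0, -1.0), 500.0, 500.0, 320.0, 240.0, 640, 480))
```

When several cameras see the same surface, each one yields its own texture coordinates in this way, and the renderer must decide how to blend or prioritize them.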

 

 


 

 

 

Graphics, Immersive Reality, HCI

 

 

Augmented/Virtual Reality and HCI

 

 

o       Mobile AR System and Collaboration

Developing mobile AR that overlays information annotations on real images to help on-site personnel perform a variety of complex tasks.  The effort targets the development of a practical mobile AR system that provides accurate position-aware computing and information assistance to users throughout a facility or field [more info].

o       Multi-sensor Fusion for Outdoor Augmented Reality

This project develops multi-sensor fusion technology that specifically targets outdoor augmented reality.  The aim is to produce technology and systems that support and integrate with systems such as NRL’s BARS (Battlefield Augmented Reality System).  The effort targets the development of a complete system that integrates current or near-term technologies with the needs of applications [more info].

o       Augmented Reality for Space Flight

This project deals with the production of annotated video (augmented reality) and its use in NASA’s training and operations applications.  With the assistance of Dr. Anthony Majoros at the Boeing Company, we propose to develop and construct a prototype AR authoring system and evaluate its utility and human performance benefits in terms of learning, recall, problem-solving, and time to complete tasks [more info].

o       Geospatial Registration of Information for Dismounted Soldiers (GRIDS)

GRIDS is an augmented reality system that offers dismounted soldiers an intuitive, natural way to understand electronic information.  The fundamental goal of GRIDS is to deliver geospatially registered information in an untethered and unstructured environment [more info].

o       Virtual Guidance for Aerospace Target Recognition and Visual Navigation

This work is part of a telerobotics vision system, an integrated operation platform for research on telerobotic vision, simulation, and manipulation.  Our main work addresses the following problems: environment modeling, dynamic sensor calibration, and virtual view generation to aid visual navigation.

AR technical talk

 

 


 

 

 

Rendering and Visualization

 

 

o       4-D Distributed Modeling and Visualization

Developing the methodologies and testbed required for dynamic, time-critical modeling and visualization systems, using augmented reality, visualization, and 4D dynamic models of the environment.  These incorporate fast, robust, automatic, and accurate modeling of visual data, leading to improved tools and techniques as well as compact representations for time-space visualization of real-world scenery [more info].

o       Volume Rendering for 3D Virtual Colonoscopy

The motivation of this research is to employ advanced visualization techniques for imaging and exploring the mucosal surface of the human colon.  A prototype system has been developed that includes interactive navigation, fast rendering, and segmentation of the colon.  It allows the user to perform both planned and guided navigation inside the colon, using the 3D image as a virtual environment: “a virtual explorer” [more info].

 

 


 

 

 

Systems and Applications

 

 

Surveillance and Situational Awareness

 

 

o       V-Sentinel: A Novel System for Wide-area Situational Awareness

Developing robust and intelligent systems for wide-area situational awareness is vital to many applications, including national security, transportation management, environmental monitoring, catastrophe response, and tactical decision-making and military operations in battlefield environments.  We employ our AVE framework to fuse and present the aggregate sensing information so that it is easily browsed, interpreted, and comprehended, supporting situational awareness and decision-making [more info].

Overview talk

 

 


 

 

 

Scene Reconstruction and Modeling

 

 

o       V-Urban3D: A System for Rapid Creation of Large-scale 3D Urban Models

Developing a state-of-the-art software solution that facilitates the rapid and reliable creation of large-scale 3D urban models from range sensing data.  Based on the latest computer vision, graphics, and modeling technologies, V-Urban3D offers many advantages over existing systems, including true 3D urban site reconstruction, rapid extraction and modeling of buildings with a variety of irregular shapes and rooftops, and accurate geometric model refinement, all performed in an integrated environment [more info].