4-D Distributed Modeling and Visualization 

 

Results

 


  Portable tracking and data acquisition system

The hybrid tracking technology we developed tracks the 6DOF pose (position and orientation) of sensors in real time in open outdoor environments. We consider the case where images, video, or other data arrive from different sensor platforms (still, moving, aerial). Only if the tracked positions and orientations of these sensors are known can their images all be projected onto the scene model, presenting the observer with a single coherent and evolving view of the complete scene. Based on our hybrid approach, we built a portable tracking package that can be used for data acquisition and augmented reality overlays.

The system is a completely self-contained, portable tracking package consisting of a high-resolution stereo camera head, a differential GPS receiver, a 3DOF gyro sensor, and a laptop computer. The stereo head is equipped with two high-resolution digital cameras connected to the laptop through a FireWire (IEEE 1394) interface. The dual-camera configuration serves several purposes: one channel (left) of the acquired video streams is used for video texture projection and vision tracking, while both stereo streams feed a real-time stereo reconstruction package for detailed façade reconstruction. The integrated GPS and gyro sensors are used for tracking the 6DOF (degree-of-freedom) pose. We developed a data fusion approach to fuse and synchronize these data streams, which arrive with different timing. Our approach compensates for the shortcomings of each sensing technology by using multiple measurements to create continuous and reliable tracking data.
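As a rough illustration of what synchronizing streams with different timing can involve, the sketch below resamples a slow position stream at video frame timestamps by linear interpolation. This is an assumption made for illustration only, not the fusion method used in the system; the function names, the sample rates, and the data are hypothetical.

```python
from bisect import bisect_left

def interpolate_position(samples, t):
    """Linearly interpolate a (time, (x, y, z)) series at time t.

    `samples` must be sorted by strictly increasing time. Times outside
    the sampled range are clamped to the nearest sample, not extrapolated.
    """
    times = [s[0] for s in samples]
    i = bisect_left(times, t)
    if i == 0:
        return samples[0][1]
    if i == len(samples):
        return samples[-1][1]
    (t0, p0), (t1, p1) = samples[i - 1], samples[i]
    w = (t - t0) / (t1 - t0)
    return tuple(a + w * (b - a) for a, b in zip(p0, p1))

def align_to_frames(position_samples, frame_times):
    """Attach an interpolated position to every video frame timestamp."""
    return [(t, interpolate_position(position_samples, t)) for t in frame_times]

# Hypothetical data: 1 Hz position fixes and three video frame timestamps.
gps = [(0.0, (0.0, 0.0, 10.0)), (1.0, (2.0, 0.0, 10.0)), (2.0, (4.0, 1.0, 10.0))]
frames = [0.5, 1.0, 1.25]
aligned = align_to_frames(gps, frames)
```

Each entry of `aligned` pairs a frame time with a position estimate, so every video frame carries a pose sample even though the position sensor reports far less often than the camera.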

  6 DOF pose tracking using panoramic (omni-directional) imaging sensor

Currently, most vision-based pose tracking methods require a priori knowledge about the environment. Calibrating the environment often relies on several pre-calibrated landmarks placed in the workspace to recover the 3D structure of the environment. Actively controlling and modifying an outdoor environment in this way is unrealistic, however, which makes those methods impractical for outdoor applications. We addressed this problem with a new omni-directional imaging system (which provides a full 360-degree horizontal view) and an RRF (Recursive Rotation Factorization) based motion estimation method we developed. We have tested our system in both indoor and outdoor environments over a wide tracking range. Compared with GPS measurements, the estimated position accuracy is about 30 centimeters over a tracking range of up to 60 meters. Detailed information can be found at http://deimos.usc.edu/~jonlee

 

  Markerless motion tracking for unprepared environments

We address the case where neither camera motion nor structure information is available. The algorithm uses naturally occurring features (point and region tracking) to reconstruct estimates of camera motion and scene structure (structure from motion). The closed-loop architecture lets the system discriminate between good and poor estimates, which maximizes the quality of the final motion estimate. The estimated relative pose can be used directly for augmented reality overlays, and the structure estimates allow smooth tracking and can also improve and refine scene models. The framework is designed to allow further sensor fusion (GPS, gyroscopes) for absolute pose reconstruction.

 

       

 

        

 

  LiDAR data tessellation and model reconstruction

In cooperation with Airbornal Inc, we acquired a LiDAR model of the entire USC campus and the surrounding Coliseum Park.  This 3D model data, which is accurate to sub-meter in ground position and to centimeters in height, serves as the base model onto which we paint images and videos acquired from our roving tracked camera platform.  Because the LiDAR data arrived as an unorganized 3D point cloud defined in the sensor or world coordinate system, we processed the raw data with grid re-sampling, hole filling, and geometry registration to reconstruct a continuous 3D surface model.  With the raw point cloud as input, our system automatically performs all the necessary processing and outputs the reconstructed 3D model in VRML format. The following figures show snapshots of applying the system to our USC campus LiDAR dataset. The left image is the range image reconstructed from the unorganized 3D point cloud, and the right one shows the reconstructed 3D model.
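The grid re-sampling and hole-filling steps can be pictured with a small sketch. It is a simplified illustration under stated assumptions, not the processing pipeline itself: each grid cell keeps the highest point that falls into it, and each empty cell is filled in a single pass with the mean of its filled neighbours. The function names, the cell size, and the sample points are hypothetical.

```python
def resample_to_grid(points, cell, nx, ny, x0=0.0, y0=0.0):
    """Bin (x, y, z) points into an ny-by-nx height grid.

    Each cell keeps the highest z that falls into it; empty cells are None.
    Points outside the grid extent are ignored.
    """
    grid = [[None] * nx for _ in range(ny)]
    for x, y, z in points:
        col = int((x - x0) // cell)
        row = int((y - y0) // cell)
        if 0 <= col < nx and 0 <= row < ny:
            if grid[row][col] is None or z > grid[row][col]:
                grid[row][col] = z
    return grid

def fill_holes(grid):
    """Fill each empty cell with the mean of its non-empty 8-neighbours.

    Single pass over the original grid; a cell with no filled
    neighbours stays None.
    """
    ny, nx = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(ny):
        for c in range(nx):
            if grid[r][c] is not None:
                continue
            neigh = [grid[rr][cc]
                     for rr in range(max(0, r - 1), min(ny, r + 2))
                     for cc in range(max(0, c - 1), min(nx, c + 2))
                     if grid[rr][cc] is not None]
            if neigh:
                out[r][c] = sum(neigh) / len(neigh)
    return out

# Hypothetical unorganized points on a 2 x 2 grid with 1 m cells.
cloud = [(0.2, 0.3, 5.0), (0.7, 0.1, 7.0), (1.5, 0.5, 6.0), (0.5, 1.5, 4.0)]
height_grid = resample_to_grid(cloud, cell=1.0, nx=2, ny=2)
filled = fill_holes(height_grid)
```

Keeping the maximum height per cell favours roof surfaces over ground returns; a real pipeline might prefer a median or a distance-weighted average, and large voids would need more than one filling pass.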

 

  Real-time video texture projection

We developed an approach for real-time video texture projection. Given calibrated camera parameters, we can dynamically “paint” the acquired video or images onto the geometric model in real time. In conventional texture mapping, the texture for each polygon is described by a fixed corresponding polygon in an image. Because the correspondences between model and texture are pre-computed and stay fixed, a new texture image cannot be applied without preprocessing.  In contrast, texture projection mimics the projection process of the real imaging sensor, generating the projected image in much the same way as photo reprinting. The transformations between model and texture are computed and updated dynamically from the projective relationship. Texture images are generated by a virtual projector with known imaging parameters. Moving the model or the sensor changes the mapping function and the image for a polygon, and also changes the visibility and occlusion relationships, which makes the technology well suited for dynamic visualization and comprehension of data from multiple sensors. Our system can project real-time video or image files onto 3D geometric models and produce visualizations from arbitrary viewpoints. During a visualization session, users can dynamically control the viewpoint, image inclusion, blending, and projection parameters.
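The core of projective texturing is that texture coordinates are recomputed from the projector pose rather than stored with the model. The sketch below assumes an ideal pinhole projector with no lens distortion and ignores occlusion; the function name and all parameter values are hypothetical and are not taken from the system described here.

```python
def project_vertex(vertex, R, C, f, cx, cy, width, height):
    """Project a 3D world vertex into normalized texture coordinates.

    R is a 3x3 world-to-camera rotation (list of rows), C the projector
    position, f the focal length in pixels, (cx, cy) the principal point,
    and width/height the image size in pixels. Returns (s, t) in [0, 1],
    or None if the vertex is behind the projector or outside the image.
    """
    d = [vertex[i] - C[i] for i in range(3)]
    xc, yc, zc = (sum(R[r][i] * d[i] for i in range(3)) for r in range(3))
    if zc <= 0:
        return None  # behind the virtual projector
    u = f * xc / zc + cx
    v = f * yc / zc + cy
    if not (0 <= u <= width and 0 <= v <= height):
        return None  # outside the projected image
    return (u / width, v / height)

# Hypothetical projector: identity rotation, 640 x 480 image, f = 500 px.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
params = dict(f=500.0, cx=320.0, cy=240.0, width=640, height=480)

vertex = (0.0, 0.0, 10.0)
st_before = project_vertex(vertex, identity, (0.0, 0.0, 0.0), **params)
st_after = project_vertex(vertex, identity, (1.0, 0.0, 0.0), **params)
st_behind = project_vertex((0.0, 0.0, -5.0), identity, (0.0, 0.0, 0.0), **params)
```

Moving the projector from the origin to x = 1 shifts the texture coordinate of the same vertex, which is the behaviour that fixed, pre-computed texture mapping cannot reproduce. A complete implementation would also test visibility, for example with a depth map rendered from the projector.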

The figure illustrates two snapshots of our results. The left image shows an aerial view of a texture image projected onto a 3D LiDAR model (campus of Purdue University), and the right one shows a façade view of video texture projected onto the USC LiDAR building model.

Our system also supports multiple texture projectors, which simultaneously visualize many images projected onto the same model. This feature enables the comprehension of data from multiple sensors. The following figure illustrates the result of two sensors projecting onto one model. The view from the first projector (sensor-1) provides useful “footprint” information about global layout and building placement, while the view from the second sensor (sensor-2) can provide details of a building of interest.
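When several projectors cover the same surface point, their colors have to be combined in some way. The blending rule is not specified here, so the sketch below assumes a simple weighted average as one common choice; the function name, weights, and colors are hypothetical.

```python
def blend_projected_colors(samples):
    """Blend colors that several projectors cast onto one surface point.

    `samples` is a list of (rgb, weight) pairs, one per projector. An rgb
    of None means that projector does not cover the point. Returns the
    weighted average color, or None if no projector covers the point.
    """
    covered = [(rgb, w) for rgb, w in samples if rgb is not None and w > 0]
    if not covered:
        return None
    total = sum(w for _, w in covered)
    return tuple(sum(rgb[i] * w for rgb, w in covered) / total for i in range(3))

# Hypothetical point seen by sensor-1 and sensor-2 but not by a third sensor.
color = blend_projected_colors([
    ((200, 100, 0), 1.0),    # sensor-1: wide aerial view, low weight
    ((100, 100, 100), 3.0),  # sensor-2: close-up view, high weight
    (None, 2.0),             # third sensor does not see this point
])
```

Giving the close-up sensor a larger weight lets its detail dominate where both views overlap, while the wide view still fills the rest of the model.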

 

  6DOF Auto-calibration technology

We extended our point-based auto-calibration technology to line features. The new algorithm can automatically estimate the 3D information of line structures and the camera pose simultaneously. We can use both kinds of features for computing and refining camera pose and for refining model structure based on auto-calibration of line/edge features. First, auto-calibration of the tracked features (points and lines) provides the scale factor data needed to create the 6th DOF that is lacking from vision.  It also provides absolute pose data for stabilizing the multi-sensor data fusion. Because all the vision-tracked features have variable certainty in their 2D and 3D (auto-calibrated) positions, adaptive calibration and threshold methods are needed to maintain robust tracking over longer periods. Second, auto-calibration of structure features (points, lines, and edges) can provide continual estimates of the 3D position coordinates of feature structures. The tracked feature positions are iteratively refined until the residual error reaches a minimum.  Combining auto-calibration with image analysis, we are able to refine the dominant features of models acquired from LiDAR or other sensors. Detailed information can be found at http://www-scf.usc.edu/~bjiang

 

  Camera pose stabilization with vision sensor

A key challenge for creating the dynamic texture projection (and augmented reality) is to maintain accurate registration between the geometric graphic models and the real video textures.  As users or cameras move their viewpoints, the observed image elements must retain their alignment with the 3D positions and orientations of model objects.  This alignment depends completely on accurate tracking of the viewing pose (position and orientation), relative to either the environment or the observed objects. The tracked viewing pose defines the virtual projector used to project image texture onto the 3D geometric models, so pose tracking accuracy directly determines the visually-perceived accuracy of augmented reality alignment and registration.

Inertial sensors can be used for orientation tracking and GPS for position. These sensors are completely self-contained and can be packaged for tracking in larger working areas. However, their accuracy is not adequate for our applications. Their signal sensing range, as well as man-made and natural sources of interference, also limits their use. We overcome this problem by using a vision tracker to stabilize the tracked camera pose.  The following two videos illustrate the results of using only GPS/INS tracking and of vision-stabilized tracking, respectively.

 

  

Only GPS/INS Tracking (Avi Movie)                                Vision stabilization (Avi Movie)

 

  AR visualization environment

This application demonstrates the AR visualization system configured in a laboratory setting as a base station. The system consists of a wide-view display screen (and also a head-mounted display), stereo video projectors, and user interaction devices that give the user a high-performance AR visualization environment. The base station communicates and exchanges data with multiple mobile or manned sensors that provide the data streams (video/image/graphics/tracking) to be visualized. Sensors can be static or dynamic, as demonstrated above (head-worn cameras, panoramic cameras, aerial camera platforms, etc.), and their data streams can reach the base station live in real time or come from an archive. The tracking sensors provide the pose and geo-spatial information of the tracked sensors/users needed to update the model view. A 3D graphics rendering engine, including projective video texture mapping, is used for realistic visualization. We have created several large-scale 3D models (e.g., the LiDAR model of the USC campus) in an open format (i.e., VRML) as scenarios to demonstrate 4D visualization with dynamic video projection texturing.

(Avi movie)


 

 

  Model refinement and feature extraction

LiDAR offers a fast and effective way to acquire models of large sections of an urban environment. A LiDAR system permits an airplane to quickly collect a height field with centimeter resolution in height and sub-meter resolution in ground position. This data provides useful “footprint” information about urban features and building placement. However, because of limited resolution and sensing noise, details on the buildings are missing, and occlusions from landscaping and overhangs lead to data voids in many areas of interest. The model needs to be refined.

We are developing techniques to automatically or semi-automatically refine the acquired LiDAR model and extract significant urban features (buildings, streetscapes) from it. First, the LiDAR data gives us a clear footprint of urban feature placement along with height information. This global information can be used to determine the geo-location of features and isolate them from their background. Once the features are segmented, predefined geometric primitives are iteratively fitted to the height field, and the best-fitting primitive models the extracted urban feature. Second, texture projection also gives us an effective way to refine the model. Given the texture images projected onto the model, we are able to identify errors in the model structure and correct them. Because we can dynamically project real-time video or image files onto 3D geometric models and produce visualizations from arbitrary viewpoints, we can verify and refine the models from different angles to cover the complete model surface. Finally, we are applying our auto-calibration technology to model refinement. Starting from a few reference landmarks, our method can automatically estimate the 3D information of feature (point and/or line) structures and the camera pose simultaneously. Combining the estimated 3D structure with image projection, we are able to refine the dominant features of the model, especially façade data.
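Fitting a primitive to a segmented patch of the height field can be illustrated with the simplest case, a planar roof. The sketch below is an assumption made for illustration: it fits z = a*x + b*y + c by ordinary least squares and reports the RMS residual, which could serve as the score for comparing candidate primitives. The actual primitives and fitting procedure are not specified here; the function names and the sample points are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples.

    Solves the 3x3 normal equations with Cramer's rule.
    """
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("degenerate samples (e.g. collinear): plane is not unique")
    coeffs = []
    for k in range(3):
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][k] = rhs[r]  # replace column k with the right-hand side
        coeffs.append(det3(Ak) / d)
    return tuple(coeffs)

def rms_residual(points, plane):
    """Root-mean-square height error of the samples against the plane."""
    a, b, c = plane
    return math.sqrt(sum((z - (a * x + b * y + c)) ** 2 for x, y, z in points)
                     / len(points))

# Hypothetical roof patch sloping along x: z = 0.5 * x + 10.
roof = [(0.0, 0.0, 10.0), (2.0, 0.0, 11.0), (0.0, 2.0, 10.0), (2.0, 2.0, 11.0)]
plane = fit_plane(roof)
error = rms_residual(roof, plane)
```

With several candidate primitives (flat roof, sloped roof, and so on), the same residual would be computed for each, and the candidate with the lowest value kept as the model of the feature.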

The following figures illustrate preliminary results of applying our fitting approach to semi-automatically refine part of our USC campus LiDAR dataset. The left image is the original LiDAR model, and the right one shows the extracted building models.

 

  

Original LiDAR Model                                 Refined Model

 



 

 

Question to Suya You