|
|
The following materials are produced from our
research. All rights reserved. We acknowledge the project sponsors
NSF, DARPA, ONR, NASA, NGA, US Army, and IMSC/USC for their support and facilities.
|
|
2D Image Motion Estimation and Tracking
Our technique is a closed-loop
architecture inspired by the use of feedback for correcting errors
in non-linear control systems. The architecture integrates three main motion
analysis functions (feature detection, tracking, and verification) in
a closed-loop, cooperative manner. The process acts as a
"selection-hypothesis-verification-correction" strategy that
makes it possible to discriminate between good and poor estimation
features, which maximizes the quality of the final motion estimation.
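The verification and correction steps can be illustrated with a minimal sketch. It assumes a pure-translation motion hypothesis estimated as the median feature displacement and a fixed pixel tolerance; both choices, and all function names, are illustrative assumptions rather than the method actually used in this work.

```python
from statistics import median

def estimate_translation(matches):
    """Hypothesis step: global translation as the median feature displacement."""
    dx = median(q[0] - p[0] for p, q in matches)
    dy = median(q[1] - p[1] for p, q in matches)
    return dx, dy

def closed_loop_select(matches, tol=1.5, max_iters=10):
    """Verify features against the motion hypothesis and drop poor ones.

    matches: list of ((x, y), (x2, y2)) feature positions in two frames.
    tol: assumed residual tolerance in pixels.
    Returns (motion, kept_matches).
    """
    kept = list(matches)
    for _ in range(max_iters):
        dx, dy = estimate_translation(kept)               # hypothesis

        def residual(m):
            (x, y), (u, v) = m
            return ((u - x - dx) ** 2 + (v - y - dy) ** 2) ** 0.5

        good = [m for m in kept if residual(m) <= tol]    # verification
        if len(good) == len(kept) or len(good) < 2:
            break                                         # nothing left to correct
        kept = good                                       # correction: re-estimate
    return estimate_translation(kept), kept

matches = [((0, 0), (2, 1)), ((10, 0), (12, 1)), ((0, 10), (2, 11)),
           ((10, 10), (12.2, 11.1)), ((5, 5), (20, -3))]  # last one is a bad track
motion, kept = closed_loop_select(matches)
```

In this example the badly tracked feature is rejected in the first pass and the motion is re-estimated from the remaining four.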
|
|

|

|
|

|
Indoor man-made scene
Captured while walking and panning the camera across the scene.
(Mpeg movie, about 300K)
|
|

|
Outdoor golf course
This scene was captured from a moving vehicle while
panning the camera.
(Mpeg movie, about 300K)
|
|

|
“FasTrack”
Behind the Blockbusters--Special Effects Tool Locks Characters onto Film
|
|
Video Annotation
We apply the above 2D tracking
technique to the annotation of natural scene images. The original approach has been adapted to
track several annotation points in an image stream. Once those annotation points have been
tracked successfully, graphically generated annotation labels are overlaid
on the real images to indicate the meaning of the parts of interest.
|
|

NASA training scene-1 (QuickTime
movie, 330K)
|

NASA training scene-2. (QuickTime
movie, 330K)
|
|
Experts select key frames and
link text and annotations to structural features in the image. The tracking system then automatically
keeps the annotations linked to the features as the camera moves in the
following frames to show additional structure or views that clarify the
context and extent of the problem.
|
|
Augmenting Image with Web Links
as AR Media Player (ARMP)
We can consider the ARMP system a
“media player” in the way that a cassette tape player is a player of
media encoded in the cassette tape standard. Similarly, an AR Media Player requires
media that it can interpret and display. We have developed a framework
based on web formats and our tracking techniques. The system provides a graphical user
interface to the underlying media analysis and augmented reality
application algorithms. Our current implementation supports prerecorded
image files and online video sources, and its annotations include text,
2D labels, and audio. Additional features
of the system include the capability to link to web-format AR media and
automatic indexing of that data based on the cursor’s position
relative to objects in the scene.
|
|

|
Control panel of the AR Media Player
Based on a common cassette tape player control metaphor, the interface
can load image sequences; analyze and track features and pose; edit or
compose annotations; and store or play back defined AR files.
|
|

|
AR Media for space crew training
As the user's cursor touches items in the scene or the text annotations, a web
page (with audio) automatically opens beside the video window, displaying
additional or related information.
|
|

|
AR Media for online shopping
While viewing a pre-produced video, users can select interesting items with
their cursor. The video producers can also include predefined text labels
attached to the objects of special interest. As the cursor moves over the
scene, information is revealed about the objects the cursor points to.
|
|
Automatic Mosaic Creation Based
on Image Motion Estimation
We address the automatic
creation of high-quality image mosaics from image/video sequences based on
image motion estimation, where no assumption is made about the 3D camera
motion or the scene structure. The key innovations include: (1)
accurate image alignment using the robust motion estimation approach
described above; and (2) consideration of several motion models for
inter-frame compensation under a common projective geometry framework.
Experiments show that the method works robustly in natural environments and
can produce high-quality image mosaics even in the presence of camera
translation and moving foreground objects.
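As a sketch of what a common projective framework can look like, translation, affine, and projective motion can all be written as 3x3 matrices acting on homogeneous image points, so the same warping and chaining code serves every model. The parameterization and function names below are assumptions made for illustration; the page does not specify which models were used.

```python
def apply_homography(H, point):
    """Map an image point through a 3x3 projective transform H (row-major lists)."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

def translation(tx, ty):
    """2-parameter model: image shift only."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def affine(a, b, c, d, tx, ty):
    """6-parameter model: linear deformation plus shift."""
    return [[a, b, tx], [c, d, ty], [0.0, 0.0, 1.0]]

def projective(a, b, c, d, tx, ty, g, h):
    """8-parameter model: the last row adds perspective distortion."""
    return [[a, b, tx], [c, d, ty], [g, h, 1.0]]

def compose(H2, H1):
    """Matrix product H2 * H1: apply H1 first, then H2 (chains inter-frame motions)."""
    return [[sum(H2[i][k] * H1[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Chain frame 0 -> 1 and frame 1 -> 2 to place frame 0 in the mosaic of frame 2.
H02 = compose(translation(1.0, 0.0), translation(2.0, 5.0))
corner_in_mosaic = apply_homography(H02, (0.0, 0.0))
```

Because every model is a special case of the projective matrix, a mosaic builder can switch models per frame pair without changing the warping code.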
|
|
|
|
Facial Feature Tracking
This clip shows the result of applying the
proposed closed-loop tracking method to human facial features. An
affine model is used to compensate for the effect of
rotation in depth.
|
|
|
|
|
Facial feature tracking under
varying pose (Movie, 1.3M)
|
|
|
Real-time Face Detection
Our approach is based on a statistical
learning framework. Whenever people come within the camera's field of view,
the system automatically detects the presence of a face and obtains
its location among the segmented regions. To improve the performance and
robustness of the system, several strategies have been adopted, including ICA feature extraction, SVM recognition, and pyramid
matching, allowing us to automatically detect human faces of various
sizes in real time.
|
|

|
|
Face Pose Estimation
We present a real-time face detection, face pose estimation, and tracking
technique for collaborative workspaces. Foreground regions in each frame are
extracted by a simple background subtraction method. Among these regions,
candidate face regions are estimated by an analysis based on sparse run-length
coding. A real-time detection system based on hybrid ICA-SVM is then used
to detect faces among the candidate regions and track them over
time. An estimate of the participants' head pose identifies the
focus of attention during the collaborative work. The head pose is computed
by approximating the shape of the head with a 3D cylinder. 2D velocities are
mapped onto the 3D cylinder to update and track the pose of detected
faces. Combining face detection and tracking with the motion
estimation algorithm yields a more stable system applicable to head
pose estimation in a perceptual user interface. The proposed system
produces head pose information at an interactive rate of 10 Hz.
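The first two stages of this pipeline can be sketched as follows. The fixed intensity threshold and the (start, length) run representation are assumptions made for the example; how the runs are analyzed to select face candidates is not described here and is left out.

```python
def foreground_mask(frame, background, threshold=25):
    """Mark pixels whose absolute difference from the background exceeds a threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

def run_lengths(mask_row):
    """Encode one binary row as (start_column, length) runs of foreground pixels."""
    runs, start = [], None
    for col, on in enumerate(mask_row):
        if on and start is None:
            start = col                          # a run begins
        elif not on and start is not None:
            runs.append((start, col - start))    # a run ends
            start = None
    if start is not None:                        # run reaches the row end
        runs.append((start, len(mask_row) - start))
    return runs

background = [[10] * 8]                          # grayscale images as nested lists
frame = [[10, 10, 200, 210, 10, 10, 90, 10]]
mask = foreground_mask(frame, background)
runs = [run_lengths(row) for row in mask]
```

The run list is much shorter than the pixel row, which is what makes a later candidate-region analysis cheap enough for real-time use.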
|
|
 
|
|
Globally Optimum Multiple
Object Tracking
Robust and accurate tracking of multiple objects is a key challenge in
video analysis and understanding.
Tracking algorithms generally suffer from one or more of the
following problems. First, an object can
be incorrectly interpreted as one of the other objects in the scene. Second, interactions between objects,
such as occlusions, may cause tracking errors. Third, globally optimum tracking is hard
to achieve since the combinatorial assignment problem is NP-complete. We present a modified Multiple-Hypothesis
Tracking (MHT) algorithm for globally optimum tracking of moving
objects. The system defines five
states for tracked objects: appear, disappear, track, split, and merge.
These states cover all the interactions of object pairs. After the detection of objects in the
current frame, a resemblance matrix is computed for every object pair. We convert the two-dimensional resemblance
matrix into a three-dimensional state-likelihood structure and use an MHT
technique to solve the state-assignment problem in 3D. This prevents incorrect assignments due
to local minima in the assignment process.
Moreover, the method models occlusion cases with the split and merge
states. Finally, the method
approximates a globally optimum state assignment in polynomial time.
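The local-minimum problem can be seen on a plain two-dimensional resemblance matrix. The sketch below compares a greedy per-track assignment with an exhaustive global search over a square matrix. It is only an illustration of why a global criterion matters: it omits the five object states and is not the modified MHT algorithm itself, and the exhaustive search is factorial in the number of objects.

```python
from itertools import permutations

def greedy_assignment(resemblance):
    """Each track in turn takes its best remaining detection (locally optimal)."""
    free = set(range(len(resemblance[0])))
    result = []
    for row in resemblance:
        best = max(free, key=lambda j: row[j])
        result.append(best)
        free.remove(best)
    return result

def global_assignment(resemblance):
    """Exhaustive search for the one-to-one assignment with maximum total resemblance."""
    n = len(resemblance)
    return list(max(permutations(range(n)),
                    key=lambda perm: sum(resemblance[i][perm[i]] for i in range(n))))

def total(resemblance, assignment):
    """Sum of resemblance scores for an assignment (track i -> detection j)."""
    return sum(resemblance[i][j] for i, j in enumerate(assignment))

# Rows are existing tracks, columns are detections in the current frame.
resemblance = [[0.90, 0.80],
               [0.85, 0.10]]
greedy = greedy_assignment(resemblance)
best = global_assignment(resemblance)
```

Here the greedy choice gives the first track its favourite detection and leaves the second track with a very poor match, while the global search swaps the two and obtains a higher total.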
|
|

|

|
|
Real-time Landmark Detection
and Recognition
Accurate
landmark detection and recognition are crucial for a real-time AR system.
We developed a principal component analysis (PCA) based algorithm that can
robustly detect and recognize specially designed B/W square landmarks in
real time. It achieves 28 frames/sec on a PC with a 450 MHz CPU and tolerates
viewpoint changes of up to 70 degrees in depth.
|
|

Tracking a 3X3 landmark array with some unique
characters printed inside (Mpeg movie, 1.8M)
|

|
|
Pose Estimation with Landmark
Tracking
We
use the B/W landmarks as fiducials for camera pose
tracking. Once the fiducials have been detected
and recognized, their feature coordinates (four corners and center) are
used for camera pose estimation. This clip shows a virtual dolphin tightly
augmented onto the desk as if it were a real part of the scene.
|
|

|
Virtual dolphin augmented on the desk
(Mpeg
movie, 1.8M)
|
|
Markerless Pose Estimation in Unprepared
Environments
We
address the case where neither camera motion nor structure information is
available. The algorithm tracks naturally
occurring features (points and regions) to reconstruct camera motion
and scene structure estimates (structure from motion). The closed-loop architecture makes it
possible to discriminate between good and poor estimates, which
maximizes the quality of the final motion estimation. The estimated relative pose can
be used directly, for example for image overlays, and the structure estimates
allow smooth tracking and can also refine scene models. The framework is designed to allow further
sensor fusion (GPS, gyroscopes) for absolute pose reconstruction.
|
|

Hello Buddy (Mpeg movie)
|

Virtual dolphin
(Mpeg movie)
|
|
Wide-area Tracking Using a Panoramic (Omni-directional) Imaging Sensor
Currently,
most vision-based pose tracking methods require a priori knowledge about
the environment. Calibration of the environment often relies on several
pre-calibrated landmarks placed in the workspace to collect the 3D structure of the
environment. However, actively controlling and modifying an outdoor
environment in this way is unrealistic, which makes those methods
impractical for outdoor applications. We addressed this problem by using a
new omni-directional imaging system (which provides a full 360-degree
horizontal view) and an RRF (Recursive Rotation Factorization) based
motion estimation method that we developed.
We have tested our system in both indoor and outdoor environments
with a wide tracking range. Compared
with GPS measurements, the estimated position accuracy is about
thirty centimeters over a tracking range of up to 60 meters.
|
|

|
|
Inertial Sensor
Recently,
there has been considerable interest in applying inertial sensors to
motion tracking. An inertial system is
completely self-contained and sourceless, and can be
sampled at a very high rate, which makes it well suited to sensing rapid motion. We
developed a sensor module that contains a CCD video camera and three orthogonal
rate gyroscopes. The video camera provides a 30 Hz video stream, and the
three gyroscopes are sampled at 1 kHz via a 16-bit A/D converter. Several
low-level libraries have been developed to drive the A/D converter and
gyroscopes.
|
|

|
USC Gyro-Video sensor
The sensors are tightly
enclosed in a foam block that provides shock protection and a stable
temperature environment for the sensors. The video camera provides a 30 Hz
video stream, and the three gyroscopes provide attitude tracking.
|
|
Orientation Motion
Stabilization with Hybrid Vision and Inertial Sensor
Making
navigation work in unconstrained environments, especially unprepared
outdoor environments, is the most challenging task, because we have less
control over the environment and far fewer resources available. We developed a system that combines a
natural feature vision system with a gyro sensor to provide accurate 3DOF
orientation tracking in outdoor environments. The fusion system is based on
the SFM (structure from motion) algorithm, in which approximate feature
motion is derived from the inertial data, and vision feature tracking
corrects and refines these estimates in the image domain. Furthermore, the
inertial data also aids the vision tracking by reducing the
search space and providing tolerance to interruptions.
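The predict-then-correct pattern can be sketched for a single rotation angle. This is a deliberately simplified illustration: the real system corrects in the image domain within an SFM framework, whereas the sketch below blends angles directly with a fixed gain. The gain value, the function names, and the pinhole approximation used to predict the feature shift are all assumptions.

```python
import math

def predicted_pixel_shift(focal_px, delta_angle):
    """Approximate image shift (pixels) of a feature near the image centre
    when the camera pans by delta_angle radians (pinhole model)."""
    return focal_px * math.tan(delta_angle)

def fuse_orientation(angle, gyro_rates, dt, vision_angle=None, gain=0.2):
    """One video-frame update of a single orientation angle (radians).

    gyro_rates: angular rates (rad/s) sampled between two video frames.
    dt: gyro sampling interval in seconds.
    vision_angle: angle measured by feature tracking, or None if tracking failed.
    """
    # Prediction: integrate the gyro rates over the frame interval.
    predicted = angle + sum(rate * dt for rate in gyro_rates)
    if vision_angle is None:
        return predicted              # gyro only: rides through vision dropouts
    # Correction: pull the prediction toward the vision measurement.
    return predicted + gain * (vision_angle - predicted)

# About 33 gyro samples at 1 kHz fall between two 30 Hz video frames.
rates = [0.5] * 33
gyro_only = fuse_orientation(0.0, rates, 0.001)
corrected = fuse_orientation(0.0, rates, 0.001, vision_angle=0.02, gain=0.5)
search_centre_shift = predicted_pixel_shift(500.0, gyro_only)
```

The predicted pixel shift is where the vision tracker would start looking for each feature, which is how the inertial data narrows the search space.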
|
|

|

|
|
Virtual Sand Table
We
developed an AR annotation system based on the hybrid motion tracker to
illustrate its utility in AR and visual navigation applications. The system,
called “Virtual Sand Table”, simulates the sand
tables that are widely used in architecture and military applications.
|
|

|
Virtual digital map
When the target board is
in view, the camera pose is tracked and a virtual digital map with
annotations is overlaid on the board. The camera can be moved with
arbitrary 6DOF motion while viewing the board, and the virtual model is
displayed as if it were tightly attached to the board. The user can also interact
with the scene using a landmark or laser pointer.
(Mpeg movie, about 3.3M)
|
|

|
Virtual terrain model
The system also tolerates
occlusion and the camera rotating out of view. In this case, there may be no
vision measurement available temporarily, but the gyro correction channel can still
keep the system tracking.
(Mpeg movie, about 3.6M)
|
|
Portable Tracking System
The system is a complete,
self-contained portable tracking package consisting of a high-resolution stereo
camera head, a differential GPS receiver, a 3DOF gyro sensor, and a laptop
computer. The stereo head is equipped with two high-resolution digital cameras
that connect to the laptop computer through a FireWire interface.
The dual-camera configuration serves multiple purposes: one channel (left)
of the acquired video streams is used for vision processing and
tracking, while both stereo streams are fed into a real-time
stereo reconstruction package for detailed façade reconstruction. The
integrated GPS and gyro sensors are used for tracking 6DOF pose. We developed a data fusion approach to fuse
and synchronize these differently timed data streams. Our approach is to
compensate for the shortcomings of each sensing technology by using
multiple measurements to create continuous and reliable tracking data.
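One common way to synchronize streams with different timing is to resample the slower or irregular stream at the video frame timestamps. The sketch below does this by linear interpolation of a scalar signal. It is an assumed, generic approach shown for illustration, not a description of the fusion method used in this package, and the sample values are made up.

```python
from bisect import bisect_left

def interpolate_at(timestamps, values, t):
    """Linearly interpolate a scalar sensor stream at time t.

    timestamps must be sorted in ascending order; times outside the
    recorded range are clamped to the first or last sample.
    """
    if t <= timestamps[0]:
        return values[0]
    if t >= timestamps[-1]:
        return values[-1]
    i = bisect_left(timestamps, t)            # first sample at or after t
    t0, t1 = timestamps[i - 1], timestamps[i]
    w = (t - t0) / (t1 - t0)
    return values[i - 1] + w * (values[i] - values[i - 1])

def resample_to_frames(frame_times, timestamps, values):
    """Sample a sensor stream at each video frame timestamp."""
    return [interpolate_at(timestamps, values, t) for t in frame_times]

gps_times = [0.0, 1.0, 2.0]                   # hypothetical 1 Hz position fixes
gps_east = [0.0, 10.0, 30.0]                  # metres, made-up values
frame_times = [0.5, 1.0, 1.5]                 # video frames to align with
aligned = resample_to_frames(frame_times, gps_times, gps_east)
```

After resampling, every video frame carries one value from each sensor, so the fusion step can treat the streams as if they had been captured together.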
|
|

|
|
Integrating Model based Vision
and INS for 6DOF Pose Estimation
We present a real-time hybrid tracking system that integrates
gyroscopes and line-based vision tracking.
Gyroscope measurements are used to predict pose orientation and
image line feature correspondences.
Gyroscope drift is corrected by vision tracking. System robustness is achieved by using a
heuristic control system to evaluate measurement quality and select
measurements accordingly.
Experiments show that the system achieves robust, accurate, and
real-time performance for outdoor navigation.
|
|

|
|
6DOF Auto-calibration
Technology
We extended our point-based auto-calibration
technology to line features. The new algorithm can automatically estimate the 3D
information of line structures and the camera pose simultaneously. We use both
kinds of features to compute and refine the camera pose and to refine the model structure
based on auto-calibration of line/edge features. First, auto-calibration of
the tracked features (points and lines) provides the necessary scale factor
data to create the 6th DOF that is lacking from vision. It also provides absolute pose data for
stabilizing the multi-sensor data fusion.
Since all the vision-tracked features have variable certainty in
terms of their 2D and 3D (auto-calibrated) positions, adaptive calibration
and threshold methods are needed to maintain robust tracking over longer
periods. Second, the
auto-calibration of structure features (points, lines, and edges) can provide
continual estimates of the 3D position coordinates of feature
structures. The tracked feature
positions are iteratively refined until the residual error reaches a minimum.
|
|
Sensor
Motion Stabilization with Model–based Vision
Inertial sensors can be used for
orientation tracking and GPS for position. Those sensors are completely
self-contained and can be packaged for tracking in larger working areas.
However, their accuracy is not adequate for our applications. Their
signal sensing range, as well as man-made and natural sources of
interference, also limits their use. We overcome this problem by using a
model-based vision tracker to stabilize the tracked camera pose. The following two videos illustrate the
results of using only GPS/INS tracking and vision-stabilized tracking,
respectively.
|
|

Only GPS/INS tracking (AVI movie)
|

Vision stabilization (AVI movie)
|
|
3D Scene Reconstruction
From Stereo
We reconstruct 3D scene models
from stereo imagery. We developed
several stereo approaches suitable for different scenarios (man-made and
natural scenes). We also produced a
task-oriented stereo vision system for automatic 3D measurement and
reconstruction from SEM (Scanning Electron Microscope) imagery and
satellite imagery. We introduced
wavelet-based approaches to stereo matching: a wavelet zero-crossing
algorithm and a wavelet phase-matching stereo algorithm.
|
|

|
|
Large-scale
Scene Modeling from LiDAR and Imagery
Airborne LiDAR offers a fast
and effective way to acquire models of large sections of an urban
environment. This data provides useful “footprint” information
about urban features and building placements. However, due to resolution
limitations and sensing noise, building details are missing, and
occlusions from landscaping and overhangs lead to data voids in many areas
of interest. The model needs to be refined. We developed techniques and
built a modeling system that can model a variety of complex building
structures with irregular shapes and surfaces. Our approach applies several
morphological filters to the LiDAR
range data, together with texture and color from aerial imagery, to segment the
targeted objects from the background. To
turn the extracted 3D mesh data into constrained CG models, we
present a primitive-based approach.
Based on the shape of the building rooftop, we classify a building
section into one of several groups, and for each group we define
appropriate geometry primitives, including the standard CG primitives and
high-order surface primitives, which are fitted to the building’s mesh data to
represent the complete building structure.
We have tested the system with a range of datasets, and the technique
has been transferred to the ARMY TEC Laboratory.
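To give a flavour of how a morphological filter can separate buildings from the ground in range data, the following sketch applies a grey-scale opening (erosion followed by dilation) to a one-dimensional height profile and flags samples that rise well above the opened profile. The window size, the height threshold, and the restriction to one dimension are assumptions for illustration; the specific filters used in the modeling system, and the use of aerial texture and color, are not reproduced here.

```python
def erode(heights, half_width):
    """Grey-scale erosion: minimum height within a sliding window."""
    n = len(heights)
    return [min(heights[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]

def dilate(heights, half_width):
    """Grey-scale dilation: maximum height within a sliding window."""
    n = len(heights)
    return [max(heights[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]

def segment_buildings(heights, half_width=2, min_height=3.0):
    """Flag samples rising at least min_height above the opened (ground) profile.

    The window (2 * half_width + 1 samples) must be wider than the widest
    building, otherwise the opening keeps the roof as part of the ground.
    """
    ground = dilate(erode(heights, half_width), half_width)   # morphological opening
    return [h - g >= min_height for h, g in zip(heights, ground)]

# Metres above datum along one scan line: flat ground with a 3-sample-wide building.
profile = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 0.0, 0.0, 0.0, 0.0]
is_building = segment_buildings(profile)
```

The opening removes any raised structure narrower than the window, so the difference between the original and opened profiles isolates the building.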
|
|

USC campus and
surrounding areas
|
|

LA Natural
History Museum
|
|

Perth City,
Australia
|
|

Carson City,
California
|
|

|
|

|
|
Rapid Modeling of Dynamic
Objects from Images
An intuitive and easy-to-use 3D modeling system has become more crucial with
the rapid growth of computer graphics in our daily lives. Image-based
modeling (IBM) has been a popular alternative to pure 3D modelers since its
introduction in the late 1990s.
However, IBM techniques are inherently very slow and rarely user
friendly. Most IBM techniques require very extensive manual input,
multiple images, or both. We developed an IBM technique that gives a high level
of detail after 1-2 minutes of manipulation by a novice user, using only
a single, uncalibrated image. Our system modifies a generic part-based model
of the object under investigation. User inputs are entered via a simple
interface and converted into modifications to the whole 3D model. We
demonstrate the effectiveness of our modeler by modeling several vehicles,
such as SUVs, sedan/hatchback/coupe cars, minivans, and trucks.
|
|
Free-Viewpoint Video
Creating 3D models from single uncalibrated
images is useful for many applications. In this research, we focus on
free-viewpoint video, i.e. vehicle tracking, pose estimation, and
visualization. In particular, video
taken from a single fixed camera can be used to create a virtual
environment. We apply the above
modeling system to model the vehicle in one of the frames of the
video. At this frame, the
vehicle’s pose with respect to the camera is known. This information
is carried to the neighboring frames by first tracking several points on
the vehicle and then updating the pose to create a free-viewpoint
video. The resulting scene is composed of the moving vehicle and the
piecewise-planar scene model.
|
|
Fusion of
Dynamic Imagery and 3D Scene Models
The rapid and reliable creation of realistic
three-dimensional environment models is vital to many applications in engineering,
mission planning, training simulations, entertainment, tactical decision
making, and military operations in battlefield environments. In many cases, the value of the generated
model is increased if both the geometric information and the appearance of
the model are accurate and realistic analogues of the real world. While most existing systems support
high-quality image texture mapping, they are limited to static imagery
databases that must be created prior to use. The static
images are usually derived from fixed cameras at known or computed
transformations relative to the modeled objects, and do not contain
the source information necessary to perform detailed scene
analysis. The creation and
management of such image databases is also time consuming, since it includes
image capture and the creation of mapping functions for each segmented
image and model patch. Such a static
database makes it cumbersome to introduce additional imagery and information
sources for analysis and does not permit rapid updating when new imagery,
such as live video and sensor information, becomes available. It is therefore limiting
for applications requiring a dynamic and up-to-date picture of the
environment. To cope with these
limitations, we developed a video texture projection technology that mimics
the dynamic projection process of a real imaging sensor to generate the
projected image in the same way as photo reprinting. In this case, the corresponding transformations
between models and textures are computed and updated dynamically based on the
relationships of projective projection.
Texture images are generated by virtual projectors with known
imaging parameters. Moving the model
or sensor changes the mapping function as well as the visibility
and occlusion relationships, which makes the technology well suited for
dynamic visualization and comprehension of data from multiple sensors. Our approach can fuse real-time video or
imagery files onto 3D geometric models and produce visualizations from
arbitrary viewpoints.
|
|

|
|

|
|
Real-time Video Painting for
large-scale Environments
With the rapid development of modeling and remote sensing technologies,
it is becoming increasingly feasible to model a large-scale
environment. However, the
acquisition of static textures for such a large-scale environment is still
a challenging task, often demanding tedious and time-consuming manual
interaction. We present a system
for real-time video painting, which not only acquires textures
automatically from multiple images or video sequences, but also updates the
texture data in real time to capture the most up-to-date imagery of the
environment. The video streams can
be acquired from stationary or moving cameras, e.g., a handheld camcorder,
and the texture mapping onto a 3D model is computed in real-time. Unlike the traditional texture mapping
process, in which regions of each texture image are a priori associated
with patches of the geometric model, our approach dynamically creates the
associations between the model and texture image as a result of image
projection during the rendering process.
This allows our method to automate the texture mapping process and
update the textures in real time.
These capabilities are not feasible with the traditional texture
mapping method.
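The idea of creating the model-to-texture association by projection at render time can be sketched with a pinhole camera. For each model point, the texture coordinate is simply where that point falls in the current video frame. The camera model (looking along +Z, principal point at the image centre) and all names are assumptions for illustration; a complete implementation would also need a visibility test so that occluded surfaces do not receive texture.

```python
def project_to_texture(point, focal, width, height):
    """Project a 3D point in camera coordinates to normalized texture coordinates.

    Pinhole camera looking down +Z with the principal point at the image centre.
    focal, width, and height are in pixels.
    Returns (u, v) in [0, 1], or None if the point is behind the camera
    or falls outside the image.
    """
    x, y, z = point
    if z <= 0:
        return None                       # behind the camera
    px = focal * x / z + width / 2.0      # perspective division
    py = focal * y / z + height / 2.0
    if not (0 <= px <= width and 0 <= py <= height):
        return None                       # not covered by this video frame
    return px / width, py / height

# Recomputed every frame: moving the camera or the model changes the mapping.
centre_uv = project_to_texture((0.0, 0.0, 5.0), 500.0, 640, 480)
side_uv = project_to_texture((1.0, 0.0, 5.0), 500.0, 640, 480)
```

Because the coordinates are derived from the current camera pose rather than stored with the model, new video frames can be painted onto the geometry without any manual association step.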
|
|

|

|
|

Simulation
Results
|
|

Real Data from USC Campus
|
|
Augmented Reality Tracking and
Authoring System
A complete software system that
integrates our latest tracking technologies provides users with an integrated
working environment to develop, author, test, and evaluate their own applications. The system encapsulates and integrates
varied media (video, audio, graphics, text, URL) into one
“message” that allows users to easily acquire, track, edit, and author
the media stream on a frame-based timeline. Its features include a user-friendly interactive
interface, a variety of extensible function buttons, frame timeline
editing functions, and different task control and information windows to
clearly distinguish the appropriate operational modes. A software algorithm indexes Internet or
other database information based on a user’s cursor motion over
tracked points or regions in a video sequence. Video sequences are
annotated during an authoring phase.
The annotations are URLs or similar data that serve as meaningful data
indices during the real-time playback or interaction phase. During playback, the user’s cursor selects
objects in the video scenes by proximity or clicking, triggering sounds,
speech, or one or more information or Internet browser windows displaying graphics,
video, or text. This additional
information may also overlay the video scene as an augmented reality. The authoring software application is
interactive and includes real-time and non-real-time functions. The playback software executes in real
time while allowing user interaction.
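Selection by proximity can be sketched as a nearest-point lookup over the positions of the tracked annotation points in the current frame. The data layout, the pixel radius, and the example.com links below are hypothetical and chosen only for illustration.

```python
import math

def pick_annotation(cursor, tracked_points, radius=20.0):
    """Return the data index (e.g. a URL) of the tracked point nearest the cursor.

    tracked_points: dict mapping (x, y) image positions in the current frame
    to the link authored for that point.
    Returns None if no tracked point lies within radius pixels.
    """
    best_link, best_dist = None, radius
    for (x, y), link in tracked_points.items():
        dist = math.hypot(cursor[0] - x, cursor[1] - y)
        if dist <= best_dist:
            best_link, best_dist = link, dist
    return best_link

# Positions are updated by the tracker every frame; the links are placeholders.
tracked = {
    (100.0, 100.0): "https://example.com/valve",
    (300.0, 200.0): "https://example.com/panel",
}
hit = pick_annotation((105.0, 98.0), tracked)
miss = pick_annotation((200.0, 150.0), tracked)
```

Since the lookup uses the tracked positions rather than fixed screen regions, the links stay attached to the objects as the camera moves.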
|
|

|
Augmented
Reality Tracking and Authoring Station
The system encapsulates and integrates varied media (video, audio,
graphics, text, URL) into one “message” that allows users to
easily acquire, track, edit, and author the media stream in an integrated working
environment to develop, author, test, and evaluate their own applications.
Its features include a user-friendly interactive interface, a variety of
extensible function buttons, frame timeline editing functions, and
different task control and information windows to clearly distinguish
the appropriate operational modes.
A demo version with some example files can be downloaded.
|
|
MobiPortrait: Automatic Portraits with Mobile Computing
We present work on mobile device technology with advanced
network and multi-user service support.
Using a generic communication framework, we can connect different
mobile devices to servers and provide users with a variety of
services. We demonstrate a specific
application, called MobiPortrait, as an example
in which mobile users can request an analysis and processing of images
captured with their handheld device.
The main goal is to offer users a variety of services on mobile
devices that are nominally only available on stationary machines.
|
|

|

PDA renderings of the captured image and corresponding portrait image
|
|
Mobile AR Using PDA Device
This project targets the development of mobile AR technology that
provides accurate position-aware computing and information assistance to
users, augmenting the user's perception of the real 3D world with additional
enhancements to guide people effectively in the real world. The AR metaphor of displaying information
in the spatial context of the real world could have a range of applications
in many areas, for example monitoring at a distance, hazardous confinement, and
remote consultation for inspection or maintenance crews in remote or
dangerous environments. One of the
key requirements for an AR system is a tracking system that determines the
user's viewpoint accurately. As the
user moves his or her head and viewpoint, the computer-generated objects must
remain aligned with the 3D locations and orientations of real objects. In the mobile AR scenario, the system has
to be capable of tracking the user’s pose in an open environment. Our proposed system includes a mobile PDA
computer, a portable GPS, and a video camera.
The PDA with GPS/camera guides the user by showing simple text or a 2D map on
the PDA screen (for fast navigation), and at several locations the system
can link (via a wireless network) to URLs that give the user more complex
information such as video or a 3D model.
|
|

|

|
|
A Video-based Augmented Reality
Golf Simulator
Recent advances in augmented reality technology have opened up tremendous
scope for its application and for further research towards its deployment in
solving a host of problems spanning multiple domains. We propose the use of the technology in
virtual golf gaming and show how it can be adapted to
the specific needs of different applications. The various challenges involved,
the proposed solutions, and the results obtained are described.
|
|

System
configuration
|

The equipment: a golf ball, a club, calibrated fiducials,
and an HMD with a miniature camera
|
|

|

|
|
Interactive Volume Rendering
for Virtual Colonoscopy
3D virtual colonoscopy has recently
been proposed as a noninvasive alternative procedure for the visualization
of the human colon. Surface
rendering is sufficient for implementing such a procedure to obtain an
overview of the interior surface of the colon at interactive rendering
speeds. Unfortunately, physicians
cannot use it to explore tissues beneath the surface to differentiate
between benign and malignant structures.
In this study, we present a direct volume rendering approach based
on perspective ray casting as a supplement to the surface navigation. To accelerate rendering,
surface-assisted techniques are used to adapt the resampling
rates by skipping the empty space inside the colon. In addition, a parallel version of the
algorithm has been implemented on a shared-memory
multiprocessing architecture.
Experiments have been conducted on both simulated and patient data
sets.
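The effect of empty-space skipping on a single ray can be sketched as follows. The ray is composited front to back, starting at the first sample on the colon surface (which a surface rendering pass could supply) instead of at the eye. The linear opacity transfer function, the early-termination cutoff, and the parameter names are assumptions for illustration and are not taken from the study.

```python
def cast_ray(samples, first_hit=0, opacity_scale=0.1, cutoff=0.99):
    """Front-to-back compositing of scalar samples along one ray.

    samples: densities in [0, 1] at equally spaced positions along the ray.
    first_hit: index of the first sample at the colon surface; earlier
    samples lie in empty space and are skipped without being evaluated.
    Returns (accumulated_intensity, accumulated_opacity).
    """
    intensity, alpha = 0.0, 0.0
    for density in samples[first_hit:]:
        a = min(1.0, density * opacity_scale)    # hypothetical transfer function
        intensity += (1.0 - alpha) * a * density
        alpha += (1.0 - alpha) * a
        if alpha >= cutoff:                      # early ray termination
            break
    return intensity, alpha

# Three empty samples inside the colon lumen, then tissue.
ray = [0.0, 0.0, 0.0, 1.0, 1.0]
full = cast_ray(ray, first_hit=0, opacity_scale=0.5)
skipped = cast_ray(ray, first_hit=3, opacity_scale=0.5)
```

Both calls return the same result because empty samples contribute nothing, but the second evaluates fewer samples, which is where the speed-up comes from.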
An example of the result from a real patient data set is given in the
following figure. The left-column images show the surface-based rendering,
and the right column illustrates the result of the proposed volume rendering approach.
|
|

|

|
|

Surface Rendering
|

Volume Rendering
|
|
AVE – Dynamic Fusion of
Multiple Sensor Data for Wide-area Situational Awareness
Developing robust and
intelligent systems for wide-area situational awareness is vital to many
applications including national security, transportation management,
environment monitoring, catastrophe response, and tactical decision-making
and military operations in battlefield environments. The systems exhibiting robust
intelligence will be able to rapidly detect, model, assess, and respond
intelligently to the situations of environments so that suitable
conclusions and decisions can be made and applied.
Enabling wide-area situational awareness
requires the fundamental capability of rapid and accurate exploitation,
interpretation, and presentation of the data derived from different sensor
modalities and resources. To provide
an accurate and comprehensive picture of wide-area scenarios, a large,
distributed multi-information network has to be used to cooperatively
interpret the entire scene.
Today’s sensing and information technologies have reached a
stage where multimodal sensors and data sources are becoming prevalent in
commercial or military establishments, promising to have significant impact
on a broad range of these applications.
However, new problems arise from the widespread use and
proliferation of diverse information sources. Most significant is the human cognitive
ability (or lack thereof) to successfully fuse and comprehend the
information that the diverse data modalities can provide. Many applications such as situational
awareness now have to confront the vital problem of dealing with the
explosion of sensing information.
This project addresses the
problem of processing, fusing, and presenting data from a number of sensor
sources in a way that leverages the human brain’s ability to
comprehend and understand the 3D world.
We introduce the concept of Augmented
Virtual Environment (AVE) as the framework for incorporating the
proposed techniques and algorithms.
The AVE is a novel and comprehensive approach to data fusion,
analysis, and presentation that incorporates and presents all the sensors,
abstract data, objects, and scene models within a common context to
produce a concise, coherent, and non-conflicting representation for
time-space interpretation of real world activity. The AVE framework is particularly suited
to addressing the above difficult problems posed by multiple sensing
sources. This approach is inspired
by the flexibility and generality of human intelligence, leveraging the
human brain’s cognitive ability to perceive and comprehend complex
information of the 3D real world.
|
|

Visualization as separate streams provides no integration of
information and no high-level scene comprehension, and obstructs
collaboration. This traditional
approach, such as a room of monitors each showing a single data stream from a
sensor, cannot scale as the number of sensors grows. People are easily overwhelmed by the
cognitive task of integrating separately presented information, and switching across
a large number of displays can become extremely confusing while following a
specific event of interest in the scene.
|
|

An AVE presentation provides users with a comprehensive
spatial-temporal view of an environment.
Users can easily browse the data from any sensor in a single image,
and freely move their viewpoint from an aerial view that visualizes an
entire region of the environment to a specific area of interest.
|
|

Dynamic objects and events are tracked and presented in a 3D context, which can
greatly improve scene comprehension and situational awareness.
|
|

Users can freely move their
viewpoint from a “god’s-eye” view that visualizes an
entire region of an environment to a specific area of interest. From any viewpoint, users observe
multiple dynamic data streams from fixed or moving aerial or ground-level
sensors projected onto the model, painting real-time views of the actual
events and activities occurring in the real world. Users can also request timely
access to and presentation of the registered sensor data, properly sequenced
and merged with other data, to create an integrated view of the mission
space.
|
|


The AVE technology has a broad impact on a wide range of civilian,
law enforcement, and defense applications, as well as education and
training applications.
|
|

|
|