Interactive Avatar Control
Real-time control of three-dimensional avatars (controllable, responsive animated characters) is an important problem in the context of computer games and virtual environments. Avatar animation and control is difficult, however, because a large repertoire of avatar behaviors must be made available, and the user must be able to select from this set of behaviors, possibly with a low-dimensional input device. One appealing approach to obtaining a rich set of avatar behaviors is to collect an extended, unlabeled sequence of motion data appropriate to the application. In this project, we explore efficient methods to exploit such a motion database for interactive avatar control.
In our system, the motion database initially consists of a number of motion clips containing many motion frames. The motion database is preprocessed to add variety and flexibility by creating connecting transitions where good matches in poses, velocities, and contact state of the character exist. The motion frames are then clustered into groups for efficient searching and for presentation in the interfaces. A unique aspect of our approach is that the original motion data and the generalization (clusters) of that data are closely linked; each frame of the original motion data is associated with a tree of clusters that captures the set of actions that can be performed by the avatar from that specific frame. The resulting cluster forest allows us to take advantage of the power of clusters to generalize the motion data without losing the actual connectivity and detail that can be derived from that data. This two-layer data structure (motion graph + cluster forest) can be efficiently searched at run time to find appropriate paths to behaviors and locations specified by the user.
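The transition-creation step can be pictured with a short sketch. It is only an illustration under stated assumptions, not the system's actual implementation: the `Frame` fields, the weighted Euclidean distance, the `velocity_weight` and `threshold` parameters, and the rule that playback may jump from frame i to frame j whenever the two frames match are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Frame:
    pose: tuple       # joint angles, flattened (hypothetical encoding)
    velocity: tuple   # joint velocities, same layout as pose
    contact: tuple    # e.g. (left_foot_on_ground, right_foot_on_ground)


def frame_distance(a, b, velocity_weight=0.1):
    """Weighted Euclidean distance over pose and velocity (an assumed metric)."""
    pose_term = sum((p - q) ** 2 for p, q in zip(a.pose, b.pose))
    vel_term = sum((p - q) ** 2 for p, q in zip(a.velocity, b.velocity))
    return sqrt(pose_term + velocity_weight * vel_term)


def build_motion_graph(frames, threshold, velocity_weight=0.1):
    """Return {i: [j, ...]}: frames that playback may continue to from frame i.

    For simplicity, all frames are treated as one continuous clip.
    """
    graph = {i: [] for i in range(len(frames))}
    # Each frame can always continue to its natural successor.
    for i in range(len(frames) - 1):
        graph[i].append(i + 1)
    # Add a transition wherever contact state matches exactly and the
    # pose/velocity distance falls below the threshold.
    for i, a in enumerate(frames):
        for j, b in enumerate(frames):
            if j == i or j == i + 1:
                continue
            if a.contact != b.contact:
                continue
            if frame_distance(a, b, velocity_weight) < threshold:
                graph[i].append(j)
    return graph
```

The all-pairs comparison is quadratic in the number of frames, which is acceptable for a preprocessing pass in a sketch like this; a large database would presumably need a faster matching scheme.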
Developing an intuitive interface for avatar control is challenging because of the high dimensionality of the avatar's motion and the real-time constraints. We explored three interfaces that give the user intuitive control of the avatar's motion: sketch, choice, and performance interfaces.
Sketch Interface
In the maze example, we recorded a subject walking in an empty environment to create a motion database, and then used that motion to control the avatar in a virtual environment with obstacles. The user specifies a path through the environment by sketching on the terrain, and the database is searched to find motion sequences that follow the path and avoid the obstacles.
Similarly, we recorded motion on small sections of rough terrain, and used that motion to allow an avatar to navigate an extended rough-terrain environment.
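The path-following search can be illustrated with a small sketch. This is a simplified stand-in, not the search the system actually performs: it assumes a hypothetical motion graph whose nodes are whole clips, a fixed ground-plane `displacement` per clip (orientation is ignored), circular obstacles given as `(x, y, radius)`, and a greedy choice with a short lookahead toward each sketched waypoint.

```python
from math import dist


def blocked(pos, obstacles):
    """True if pos lies inside any circular obstacle (x, y, radius)."""
    return any(dist(pos, (ox, oy)) < r for ox, oy, r in obstacles)


def best_cost(graph, displacement, node, pos, target, obstacles, depth):
    """Smallest distance to target reachable within `depth` more clips."""
    cost = dist(pos, target)
    if depth == 0:
        return cost
    for nxt in graph[node]:
        dx, dy = displacement[nxt]
        new_pos = (pos[0] + dx, pos[1] + dy)
        if blocked(new_pos, obstacles):
            continue
        cost = min(cost, best_cost(graph, displacement, nxt, new_pos,
                                   target, obstacles, depth - 1))
    return cost


def follow_path(graph, displacement, node, pos, waypoints, obstacles,
                depth=2, reach=0.1, max_steps=100):
    """Greedily pick clips that approach each sketched waypoint in turn.

    Returns (list of clips played, final position).
    """
    played = []
    for target in waypoints:
        for _ in range(max_steps):
            if dist(pos, target) <= reach:
                break
            candidates = []
            for nxt in graph[node]:
                dx, dy = displacement[nxt]
                new_pos = (pos[0] + dx, pos[1] + dy)
                if blocked(new_pos, obstacles):
                    continue  # never step into an obstacle
                score = best_cost(graph, displacement, nxt, new_pos,
                                  target, obstacles, depth - 1)
                candidates.append((score, nxt, new_pos))
            if not candidates:
                return played, pos  # stuck: every successor is blocked
            _, node, pos = min(candidates, key=lambda c: c[0])
            played.append(node)
    return played, pos
```

For example, with two hypothetical clips `"east"` and `"north"` that each move the avatar one unit, sketching the waypoints `(2, 0)` then `(2, 2)` from the origin yields two `"east"` clips followed by two `"north"` clips. A greedy lookahead of this kind can get stuck behind large obstacles, which is one reason a real system would search more thoroughly.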
Choice Interface
In choice interfaces, the user is continuously presented with a set of possible options (directions, locations, or behaviors) from which to choose. The user can scroll through the possible options to select one at any time. As the avatar moves, the display changes so that the choices remain appropriate to the context.
The display should be uncluttered to avoid confusing the user. In practice, this means that roughly three or four actions should be presented to the user at any given time. We use the cluster forest to obtain a small set of actions for display that are typically well-dispersed.
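To make "a small, well-dispersed set of actions" concrete, here is a minimal sketch. It does not reproduce the cluster-forest lookup; it swaps in greedy farthest-point selection over hypothetical per-action feature vectors (for instance, the avatar's position at the end of each candidate action), which is one simple way to obtain a spread-out subset.

```python
from math import dist


def pick_dispersed(features, k=3):
    """Return indices of up to k mutually distant feature vectors.

    Greedy farthest-point selection: start from the first candidate, then
    repeatedly add the candidate farthest from everything chosen so far.
    """
    if not features:
        return []
    chosen = [0]
    while len(chosen) < min(k, len(features)):
        remaining = [i for i in range(len(features)) if i not in chosen]
        best = max(
            remaining,
            key=lambda i: min(dist(features[i], features[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen


# Hypothetical end positions of five candidate actions.
end_positions = [(0, 0), (0.1, 0), (6, 0), (0, 5), (0.2, 0.1)]
options = pick_dispersed(end_positions, k=3)
```

Near-duplicate candidates, such as the second and fifth entries above, are skipped in favor of options that differ visibly from those already chosen, which keeps the display uncluttered.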
Performance (vision-based) Interface
In performance interfaces, the user acts out the desired motion in front of a video camera and the avatar duplicates the motion by selecting a sequence of motions from the database. The video data is processed to produce a silhouette, which is then matched to the silhouettes generated from the database in order to find a matching motion. When the appropriate motion is not in the database, the closest motion is identified and selected.
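The silhouette-matching step amounts to a nearest-neighbor lookup, sketched below. The sketch makes several assumptions: silhouettes are equal-sized binary masks stored as nested lists, the distance is the fraction of disagreeing pixels, and the database is a hypothetical dictionary from motion name to rendered mask. The features actually used for matching may well differ.

```python
def silhouette_distance(a, b):
    """Fraction of pixels on which two equal-sized binary masks disagree."""
    total = 0
    differing = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if pa != pb:
                differing += 1
    return differing / total


def closest_motion(observed, database):
    """Return (name, distance) of the database silhouette nearest to observed.

    Because the minimum is always returned, a motion that is missing from
    the database is replaced by the closest available one.
    """
    scored = ((name, silhouette_distance(observed, mask))
              for name, mask in database.items())
    return min(scored, key=lambda item: item[1])


# Tiny 2x3 masks standing in for rendered silhouettes (hypothetical data).
database = {
    "stand": [[0, 1, 0], [0, 1, 0]],
    "arms_out": [[1, 1, 1], [0, 1, 0]],
}
observed = [[1, 1, 1], [0, 0, 0]]
name, distance = closest_motion(observed, database)
```

Here the observed mask differs from `"arms_out"` in one pixel and from `"stand"` in three, so `"arms_out"` is selected even though it is not an exact match.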
Jehee Lee, Jinxiang Chai, Paul Reitsma, Jessica Hodgins, and Nancy Pollard, Interactive Control of Avatars Animated with Human Motion Data, ACM Transactions on Graphics (SIGGRAPH 2002), volume 21, number 3, 491-500, July 2002.
Full video with audio (AVI, 46.6 MB)
Sketch interface (AVI, 3.2 MB, no audio)
Choice interface (AVI, 8.2 MB, no audio)
Space and time windows for choice interface (AVI, 9.5 MB, no audio)
Vision interface (AVI, 5.1 MB, no audio)
The Microsoft MPEG-4 video codec is available at divx-digest.
Jehee Lee (Seoul National University)
Jinxiang Chai (Carnegie Mellon University)
Paul S. A. Reitsma (Brown University)
Jessica K. Hodgins (Carnegie Mellon University)
Nancy S. Pollard (Carnegie Mellon University)
[Last modified : Feb 11, 2003]