Synthesis of expressive gesture

How can we convey expressive content to users in a shared Mixed Reality environment?
What kind of techniques can we utilize to synthesize expressive gestures?

To answer these questions, research on expressive content synthesis focuses on methods for generating expressive multimodal gestures.
Experiments explore techniques borrowed from theatre and cinema, aiming at output systems able to produce, in the mixed reality environment, the expressive situations required by the mapping strategies.
Naturally, this research addresses the communication of expressive content both in individual output channels (e.g., communication of expressive content through sound, music, movement, visual media) and from a multimodal perspective, i.e., the synthesis of expressive content obtained by combining several non-verbal communication channels.

On the audio side, research on expressive synthesis concentrated on:

  1. Music performance. The score is given in a MIDI-based format; its performance can be controlled in real time through high-level gestural input.

  2. Vocal modifications. The sound of a singer can be modified in real time using the GEM vocal processor hardware as well as other (time-frequency based) modules integrated in EyesWeb. The gestural input can be mapped to parameters such as the pitch of an added voice.

  3. Spatialisation. The position and movement of a sound in the room (in 3D) can be controlled by gestural input.

  4. Sound processing. Control of musical and sonic parameters of pre-recorded audio. A sound processing framework based on sinusoidal modeling has been realized; it provides musically relevant sound transformations, in particular time stretching, pitch shifting, and timbre control. Gestural control of the sound processing engine, integrated in EyesWeb, has been experimented with.
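The gesture-to-performance control in item 1 can be sketched as a simple parameter mapping. The cue name (a normalized movement-energy value), the base values, and the scaling ranges below are illustrative assumptions, not the project's actual mapping:

```python
def gesture_to_performance(energy, base_tempo_bpm=100.0, base_velocity=64):
    """Map a normalized gesture-energy cue in [0, 1] to expressive
    MIDI performance parameters (tempo in BPM, note velocity)."""
    energy = max(0.0, min(1.0, energy))                   # clamp cue to [0, 1]
    tempo = base_tempo_bpm * (0.8 + 0.4 * energy)         # +/-20% tempo deviation
    velocity = int(round(base_velocity + 48 * (energy - 0.5)))  # louder when energetic
    velocity = max(1, min(127, velocity))                 # valid MIDI velocity range
    return tempo, velocity
```

A neutral gesture (energy 0.5) leaves tempo and velocity at their base values; a highly energetic gesture pushes the performance faster and louder, which is one plausible realization of "high-level gestural input" driving a MIDI score.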
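For the spatialisation in item 3, a minimal sketch of deriving output gains from a gesture-controlled sound position, assuming an equal-power pan law with 1/distance attenuation and reducing the problem to stereo (the system described above handles full 3D positioning):

```python
import math

def spatialise(x, y):
    """Map a 2D sound position (listener at the origin, x in [-1, 1]
    from left to right) to a pair of stereo gains."""
    pan = max(-1.0, min(1.0, x))                 # -1 = hard left, +1 = hard right
    theta = (pan + 1.0) * math.pi / 4.0          # angle in [0, pi/2]
    left, right = math.cos(theta), math.sin(theta)  # equal-power pan law
    dist = max(1.0, math.hypot(x, y))            # avoid blow-up near the listener
    return left / dist, right / dist
```

A centered source yields equal gains whose powers sum to one; moving the gesture position away from the listener attenuates both channels, giving a crude but audible sense of movement in the room.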
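The sinusoidal-modeling approach in item 4 can be illustrated with a toy single-partial analysis/resynthesis pass: estimate the strongest sinusoidal component of a frame via a DFT peak search, then resynthesize it at a scaled frequency (pitch shifting). The frame length, sample rate, and single-peak picking are simplifying assumptions; a real engine tracks many partials across frames:

```python
import math

SR = 8000   # sample rate in Hz (assumed for the sketch)
N = 128     # analysis frame length in samples

def dominant_frequency(frame):
    """Estimate the strongest sinusoidal component of one frame
    by searching for the largest-magnitude DFT bin."""
    best_k, best_mag = 0, 0.0
    for k in range(1, N // 2):
        re = sum(frame[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
        im = sum(frame[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
        mag = re * re + im * im
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * SR / N

def pitch_shift(frame, ratio):
    """Resynthesize the dominant partial at a frequency scaled by `ratio`."""
    f = dominant_frequency(frame)
    return [math.sin(2 * math.pi * f * ratio * n / SR) for n in range(N)]
```

Because the model works on partial frequencies and amplitudes rather than raw samples, the same representation supports time stretching (replaying partial trajectories at a different rate) and timbre control (reweighting partial amplitudes), which is what makes it musically relevant for gestural control.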

On the visual side, the work consisted in developing a collection of EyesWeb blocks and patches for the real-time generation and processing of visual content, driven by extracted expressive audio and motion cues. The project also developed a 3D “underwater world” populated by several marine animals, such as sharks, dolphins, and stingrays, together with other sea elements such as seaweed and conches. The creatures are able to move around, and the world contains elements that make the environment more realistic, e.g., water bubbles and swaying seaweed; plankton and other floating objects can be added to the scene as well. Aspects of the environment (e.g., the density of water bubbles, the number of floating objects) and the expression of the avatars/characters can be controlled in real time by parameters coming from the analysis of gesture in music and human movement. The underwater world was employed in the interactive game "Ghost in the Cave" (Stockholm, August 2003).
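The cue-to-scene mapping described above can be sketched as a simple parameter map from normalized analysis cues to environment controls. The cue names (quantity of motion, loudness), parameter names, and ranges are illustrative assumptions, not the project's actual parameter set:

```python
def scene_parameters(quantity_of_motion, loudness):
    """Map normalized expressive cues in [0, 1] to real-time
    parameters of the underwater world."""
    qom = max(0.0, min(1.0, quantity_of_motion))
    loud = max(0.0, min(1.0, loudness))
    return {
        "bubble_density": round(5 + 45 * loud),   # more bubbles when louder
        "floating_objects": round(20 * qom),      # busier scene with more motion
        "swim_speed": 0.2 + 0.8 * qom,            # creatures move faster
    }
```

The point of such a mapping layer is that the visual patches only see a small dictionary of scene parameters, so the analysis side (audio or movement cues) can be swapped or combined without touching the rendering.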

As a concrete output, the research work produced a collection of software modules for the analysis and synthesis of expressive gestures, integrated in or connected to the MEGA System Environment. These outputs have been employed in a series of multimedia performances.