RWC multimodal database for interactions by integration of spoken language and visual information

24 December 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 4, 2171-2174
https://doi.org/10.1109/icslp.1996.607234

Abstract

The paper describes the design policy and prototype data collection of the RWC (Real World Computing Program) multimodal database. The database is intended for research and development on the integration of spoken language and visual information for human-computer interaction. The interactions are assumed to use image recognition, image synthesis, speech recognition, and speech synthesis. Visual information also includes non-verbal communication, such as interactions using hand gestures and facial expressions between a human and a human-like CG (computer graphics) agent with a face and hands. Based on experiments on interactions in these modes, specifications of the database are discussed from the viewpoint of controlling the variability and cost of data collection.

Author(s)

Hayamizu, S. (Electrotech. Lab., Tsukuba, Japan); Hasegawa, O.; Itou, K.; Sakaue, K.; Tanaka, K.; Nagaya, S.; Nakazawa, M.; Endoh, T.; Togawa, F.; Sakamoto, K.; Yamamoto, K.

This publication has 2 references indexed in Scilit:

Speech dialogue with facial displays
Published by Association for Computational Linguistics (ACL), 1994
“Put-that-there”
ACM SIGGRAPH Computer Graphics, 1980

Cited by 4 articles